Semantics. A podcast is, and always has been, typically a long-format show akin to a talk show: something you can listen to without needing to watch it. It is not audio-exclusive. Many radio shows have video feeds, but that does not prevent them from being called radio shows.
I don’t think it’s unreasonable to call something a “video podcast” when it is an actual podcast that also happens to have a video recording available online. Sometimes I like to watch the video versions of podcasts to see the speakers’ facial expressions. “Video podcast” seems like a natural shortening of “video of a podcast”. I think the important part is that the content is first and foremost a podcast, meant to be listened to. As soon as it can no longer be followed as audio only, for example because it starts relying on visuals that can only be seen in the video, it is no longer a podcast.