It might sound funny, but I actually specialize in VR, and I could criticize it in many ways. In the case of this picture, it's comparable to people who are excited about GMOs in principle but end up against them because of how capitalism manages to fuck it up.
Honey, I thought you'd never ask. Here's my two bits in lay terms:
If I had to give one quick answer, it would be memory latency. Memory capacity and computational power have grown immensely over the years, but the time it takes to retrieve a bunch of data from memory hasn't improved at anywhere near the same rate.
Some quick math shows that the speed of light must be an issue: at a few gigahertz, a signal can only travel a handful of centimeters per clock cycle. The solution to that is to make devices smaller, such as the SoCs (systems on a chip) that we've been seeing more of over the past few years.
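For anyone who wants to redo that quick math, here is a rough sketch. The 3 GHz clock and the 0.5 velocity factor (signals in real wiring travel slower than light in a vacuum) are illustrative assumptions on my part, not measurements of any particular chip.

```python
# Back-of-the-envelope: how far can a signal travel in one clock cycle?
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre


def distance_per_cycle_cm(clock_hz, velocity_factor=1.0):
    """Distance covered in one clock period, in centimeters.

    velocity_factor is the fraction of c at which the signal actually
    propagates (1.0 = vacuum; lower values are an assumption for real wiring).
    """
    seconds_per_cycle = 1.0 / clock_hz
    return SPEED_OF_LIGHT_M_PER_S * velocity_factor * seconds_per_cycle * 100


if __name__ == "__main__":
    clock_hz = 3e9  # assumed 3 GHz clock
    print(f"vacuum:      {distance_per_cycle_cm(clock_hz):.2f} cm per cycle")
    print(f"0.5c wiring: {distance_per_cycle_cm(clock_hz, 0.5):.2f} cm per cycle")
```

Keep in mind that a memory request is a round trip, so the usable distance is half of that again, before counting any time the memory itself needs to respond.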
In less technical words: The postal service is darn slow. Only a few days ago you figured out you needed something small to continue your work, and since then you've been waiting and idling. The roads are fantastic, it's just that there's a speed limit. The solution is to take all the villages and condense them into a city, shortening the distances.
There's a lot more to it than that. This is just one of the issues, at the hardware level alone, and just one of the solutions.
Yeah, that's pretty typical for a lot of computing these days. People are talking about exotic things like in-memory processing as a way forward because of that.
Is that the whole thing, or is there something more specific to VR? You can make a smartphone no problem, but portable goggles end up with an ungodly short battery life.
Battery life is actually one of the costs of accessing a lot of memory. Textures and even meshes use a lot of bandwidth. A typical way to reduce that is a depth pre-pass: do a depth-only draw first, then a second draw that actually samples textures, but only for the pixels that end up visible. That won't help on all devices, though, because many of them have their own special way of handling this: they split the screen into a grid of buckets (tiles) and depth sort the triangles within each one.
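To make the depth pre-pass idea concrete, here is a toy sketch that counts texture samples with and without it. It is not how a real GPU works: the "scene" is just three full-screen rectangles drawn back to front (the worst case), every name in it is made up for illustration, and it assumes no two rectangles share the same depth.

```python
# Toy "rasterizer": axis-aligned rectangles on a small pixel grid.
# Each rect is (x0, y0, x1, y1, depth); a smaller depth is closer to the camera.

def covered_pixels(rect):
    x0, y0, x1, y1, _depth = rect
    return [(x, y) for y in range(y0, y1) for x in range(x0, x1)]


def single_pass(rects):
    """Depth-test and shade in one pass; returns the texture sample count."""
    depth_buffer = {}
    samples = 0
    for rect in rects:
        z = rect[4]
        for pixel in covered_pixels(rect):
            if z < depth_buffer.get(pixel, float("inf")):
                depth_buffer[pixel] = z
                samples += 1  # shaded now, but may be overdrawn later
    return samples


def with_depth_prepass(rects):
    """Pass 1 writes depth only; pass 2 shades only the visible surface."""
    depth_buffer = {}
    for rect in rects:  # pass 1: no texture access at all
        z = rect[4]
        for pixel in covered_pixels(rect):
            if z < depth_buffer.get(pixel, float("inf")):
                depth_buffer[pixel] = z
    samples = 0
    for rect in rects:  # pass 2: sample textures where depth matches
        z = rect[4]
        for pixel in covered_pixels(rect):
            if depth_buffer[pixel] == z:
                samples += 1
    return samples


# Three full-screen 8x8 layers submitted back to front.
scene = [(0, 0, 8, 8, 3.0), (0, 0, 8, 8, 2.0), (0, 0, 8, 8, 1.0)]

if __name__ == "__main__":
    print("single pass samples:", single_pass(scene))
    print("pre-pass samples:   ", with_depth_prepass(scene))
```

The trade-off is that the geometry gets processed twice, which is part of why devices that already sort out visibility per tile gain little from it.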
A unique issue for VR is that you have to render for two eyes and at a high frame rate. A typical mobile game might target 30 fps instead of the usual 60 when running on battery. In contrast, if a VR game ran at 60 fps, you'd get nauseated pretty easily. A low-end device will run at 100 fps, and in an overly simplified sense that means you're actually doing 200 fps because of the two eyes.
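Spelled out as arithmetic, using the frame rates mentioned above. This is the same overly simplified view: it treats each eye as a full separate render, which real engines partly avoid by sharing work between the two views.

```python
def frame_budget_ms(fps):
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / fps


def eye_renders_per_second(fps, eyes=2):
    """Simplified count of per-eye images rendered each second."""
    return fps * eyes


if __name__ == "__main__":
    for label, fps, eyes in [
        ("mobile game on battery", 30, 1),
        ("typical game", 60, 1),
        ("low-end VR", 100, 2),
    ]:
        print(
            f"{label}: {frame_budget_ms(fps):.1f} ms per frame, "
            f"{eye_renders_per_second(fps, eyes)} renders per second"
        )
```

So compared with the 30 fps mobile game, the VR device has roughly a third of the time per frame and has to fill it twice.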
On top of that, you have to consider the tracking cameras. I don't know much about those, but it's safe to assume they need to send a lot of data around.