The Stage for VR Metaverse
One of the key technologies expected to usher in a fully immersive metaverse is the VR or AR headset. However, it's not clear if there is any compelling reason to believe that a trend towards VR and the metaverse is inevitable, other than novelty. For any technology to achieve widespread adoption, it must provide clear economic benefits that are orders of magnitude better than the current state of affairs. In addition to providing compelling benefits, technologies that achieve widespread adoption often do so through network effects, where the value of the technology increases as more people join the network.
Creating VR worlds is incredibly costly, akin to game development on steroids, and demands an entirely new pipeline. Many optimizations and rendering techniques used in traditional games don't apply to VR. Additionally, 3D assets for these worlds are expensive and slow to create, requiring hundreds if not thousands of man-hours, with no clear path to ROI. For example, despite investing billions of dollars, Meta has little to show for it.
VR devices are also constrained by computing power and battery life, making them bulky novelties that fall short of the transformative experience people expect.
In the past year, AI has demonstrated clear 10x time-saving and productivity benefits in image and text generation. Its potential to enhance other industries is tantalizing.
The continuing impact of AI on socio-economic conditions will be surprising and counter-intuitive to many:
AI will out-compete humans in many white-collar jobs and depress the labour value of these positions towards the cost of computing, a form of what is known as technological unemployment.
ROI on generative AI in the short term will be low. AI models that generate images, text, or music can be readily copied and redeployed, offering no moat and inviting intense competition on the same capabilities.
Open-source AI will progress rapidly, but capturing value will be challenging due to the abundance of free alternatives.
Digital media asset prices will plummet.
While this may seem negative for the digital economy, it will be a boon for VR and the metaverse. The cost and speed of digital asset creation will be so low that grassroots efforts will become feasible. VR world creation will no longer be the exclusive domain of large companies with hundreds of artists. We'll see a resurgence of game studios experimenting with AI to create expansive sandbox worlds, reminiscent of the shareware games boom of the late 90s.
However, the economic value that can be generated and captured by AI remains an open question. As pure digital media creation has little value to capture, AI must connect with a source of scarcity: the physical world. This may occur through devices, self-driving cars, and ultimately, robotics. The holy grail of AI productivity in the physical world will be robotics capable of generalizing across many tasks. These robots will likely be human-scaled for better interaction. Their productivity will be where most market value is added in the physical world, but autonomous AI robotics is also the most difficult challenge to solve. A flood of capital will be deployed to facilitate this AI-human-robot development and integration, setting the stage for a viable VR metaverse.
Robotics, AI Alignment, and Games
AI alignment and continuous learning will be the primary drivers for the shift towards a compute-centric virtual economy, namely the metaverse. AI alignment refers to designing AI systems whose goals and behaviours align with human values.
Several forces will contribute:
The most lucrative field will be robotics, where AI control systems facilitate physical manufacturing and all aspects of physical labour.
AI alignment in robotics will be paramount, with games and simulations serving as the primary sources of data collection.
Lowering the cost of compute and increasing the efficiency of simulations will become a primary motivation, driving the integration of closed virtual economies.
The virtualization of the world in all aspects, as previously discussed, will eventually make the virtual economy the primary venue of value exchange, while the physical world remains the value generator and the economy is anchored to compute.
AI alignment for LLMs is an active, emerging field of research. The basic idea is to impart human-like value alignment to AI behaviours. While most of the research today is focused on NLP tasks, once AI applications broaden to involve physical machinery, value alignment becomes paramount. It's all fun and games until somebody loses an eye, or a life, literally speaking.
The definition of "human value" quickly becomes nebulous the more one thinks about it. It's more than simply our words, and cannot be completely captured by writing alone. Consider a glance, a fleeting feeling of vulnerability, the appreciation of nature, or the love for a newborn. These are sensations that are not found in Common Crawl training data, but are a critical part of the human condition.
Then there is also the issue of iteration speed and data availability. Real-world events happen in real time, so gathering enough data on rarely occurring situations in the tail end of the event distribution is often too slow. Self-driving car failures are full of such examples. In some cases, there just isn't enough data in the world for a particular situation.
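The data-scarcity argument above can be made concrete with a back-of-the-envelope calculation. The sketch below is illustrative only: the function name, the event rarity, the fleet size, and the simulation speedup are all assumed numbers chosen for the example, not figures from any real system.

```python
def hours_to_collect(
    target_events: int,
    mean_hours_between_events: float,
    speedup: float = 1.0,
    parallel_instances: int = 1,
) -> float:
    """Expected wall-clock hours needed to observe `target_events` rare events.

    Assumes events occur independently at a constant average rate, and that
    each instance (a vehicle or a simulator) experiences `speedup` hours of
    scenario time per wall-clock hour.
    """
    if target_events < 0 or mean_hours_between_events <= 0:
        raise ValueError("target_events must be >= 0 and the mean interval > 0")
    if speedup <= 0 or parallel_instances < 1:
        raise ValueError("speedup must be > 0 and parallel_instances >= 1")
    scenario_hours_needed = target_events * mean_hours_between_events
    scenario_hours_per_wall_hour = speedup * parallel_instances
    return scenario_hours_needed / scenario_hours_per_wall_hour


# Hypothetical tail event: seen once per 100,000 hours of driving on average.
# Goal: 1,000 examples of it.
real_world = hours_to_collect(1_000, 100_000.0, speedup=1.0, parallel_instances=100)
simulated = hours_to_collect(1_000, 100_000.0, speedup=10.0, parallel_instances=10_000)

print(f"real fleet of 100 vehicles: {real_world / 8760:.1f} years")
print(f"10,000 simulators at 10x:   {simulated / 24:.1f} days")
```

Under these assumed numbers, the real-world fleet needs over a century of wall-clock time while the simulated fleet needs weeks. The calculation says nothing about whether simulated events are as informative as real ones, which is a separate question.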
The solution to capturing human value in AI is games. Simulated self-play was crucial to AlphaZero's success, and accelerated StarCraft play by AlphaStar demonstrated that games can serve as simulations for learning in open-ended strategic problems. These simulations, or games, can serve as proxies for real-world interactions and scenarios, providing a complex set of data modalities and behavioural varieties.
What better way is there to gather such data than to create a simulated world that people can participate and play in, generating valuable behavioural data across all types of scenarios? This is already being worked on in various ways. Google and OpenAI simulate robotic control environments for training. Nvidia has built a highly realistic driving simulation for self-driving training, and is now working on Nvidia Omniverse to provide 1:1 3D models of real-world objects for simulated AI training. A few years ago, Kindred started working on VR-controlled robots to gather human control data for AI training as well. Simulated data for robotic AI training is an unavoidable step. Traditional robot simulations, however, model purely mechanistic motion. To capture human value, a much more complex set of data modalities and behavioural varieties is needed.
VR/AR and Virtualization of Devices
Paradoxically or serendipitously, the best simulated environments featuring complex human behaviour are massively multiplayer online games (MMOs) and sandbox games, involving millions of players and encompassing the full spectrum of human behaviour.
Current games are limited by 2D user interfaces and the modalities of user input available. VR headsets overcome these limitations with head and motion tracking, and new devices like the Apple Vision Pro now offer real-time eye-tracking. Soon, sensors capturing an even broader range of biometric data will become available. These sensors will serve not only to make games more enjoyable but also to capture human emotional signals for better AI control systems.
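To make the idea of multi-modal VR input concrete, here is a minimal sketch of what a single telemetry record might look like. Everything in it is hypothetical: the `TelemetrySample` type, its field names, and the `to_feature_vector` helper are invented for illustration and do not correspond to any headset SDK.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]
Quaternion = Tuple[float, float, float, float]


@dataclass(frozen=True)
class TelemetrySample:
    """One hypothetical frame of VR input captured during play."""
    timestamp_s: float
    head_position: Vec3
    head_rotation: Quaternion
    left_hand_position: Vec3
    right_hand_position: Vec3
    gaze_direction: Optional[Vec3] = None    # None if the device has no eye tracking
    heart_rate_bpm: Optional[float] = None   # None if no biosensor is attached


def to_feature_vector(sample: TelemetrySample) -> List[float]:
    """Flatten a sample into a fixed-length list of 19 floats.

    Missing optional channels are zero-filled and paired with a 0/1 mask
    so a downstream model can tell "absent" apart from a genuine zero.
    The timestamp is left out because it indexes the sample rather than
    describing behaviour.
    """
    has_gaze = sample.gaze_direction is not None
    has_heart_rate = sample.heart_rate_bpm is not None
    gaze = sample.gaze_direction if has_gaze else (0.0, 0.0, 0.0)
    heart_rate = sample.heart_rate_bpm if has_heart_rate else 0.0
    return [
        *sample.head_position,
        *sample.head_rotation,
        *sample.left_hand_position,
        *sample.right_hand_position,
        *gaze,
        1.0 if has_gaze else 0.0,
        heart_rate,
        1.0 if has_heart_rate else 0.0,
    ]


sample = TelemetrySample(
    timestamp_s=12.5,
    head_position=(0.0, 1.7, 0.0),
    head_rotation=(0.0, 0.0, 0.0, 1.0),
    left_hand_position=(-0.3, 1.2, 0.4),
    right_hand_position=(0.3, 1.2, 0.4),
    gaze_direction=(0.0, 0.0, 1.0),
)
features = to_feature_vector(sample)
```

The mask flags are a design choice for this sketch: devices differ in which sensors they carry, so a fixed-length vector with explicit presence bits lets samples from different hardware share one format.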
Screen-based VR headsets are likely to dominate the VR/AR market. Enhancing a projection of the world on a screen with AI image processing is simpler and cheaper for AR purposes than trying to overcome physical limits on lens design.
This aligns with the trend of virtualization, which refers to the creation of a digital representation of something that can run on different computing hardware. Not only has 3D printing facilitated the design and prototyping of VR headsets, but virtual devices and AI controllers are also replacing physical ones. Their accuracy and capacity can only improve. The size of VR devices will be constrained mainly by screens, GPUs, and batteries, while devices like TVs, monitors, keyboards, laptops, and phones are virtualized within the VR device. The trend is driven by the cost savings of owning a single VR device instead of multiple physical ones.
As remote working becomes normalized in workplace culture, the drive for cost savings and the need for better remote meetings will increase the demand for workplace VR gear. At the same time, a wider variety of AI-enabled sandbox VR games will spur market demand for VR headsets for gaming. This will happen within the next 3 to 5 years, as the young generation that grew up with Oculus Quest and Roblox enters early adulthood and becomes a new generation of hackers and developers experimenting with AI and VR.
As these separate VR environments and worlds develop, the value of capturing VR telemetry for AI development will become evident. The remaining challenge will be connecting these digital worlds into a viable global economy, a topic I will explore in the next discussion.