Estimating the Need for Greater Computing Power
In the hardware and networking categories, I reviewed just a portion of the incremental data that will be generated, sent, and received as part of the Metaverse — such as haptics, facial scanning, and live environment scans. The full scope will be orders of magnitude larger.
For example, Nvidia’s founder and CEO, Jensen Huang, sees the next step for immersive simulations as something greater than more realistic explosions or street races. Instead, it’s the application of the “laws of particle physics, of gravity, of electromagnetism, of electromagnetic waves, [including] light and radio waves … of pressure and sound”. And just as the virtual world is augmented, so too will be the ‘real’ one. Every year, more sensors, cameras, and IoT chips will be integrated into the physical world around us, many of which will be connected in real time to virtual simulacra that can interact back. Meanwhile, our personal devices will serve as our passports to, and part-time generators of, many of these experiences. In short, much of the world around us will be continuously interconnected and online. Including us.
In totality, the Metaverse will have the greatest ongoing computational requirements in human history. And compute is, and is likely to remain, incredibly scarce. To quote Chris Dixon, general partner at Andreessen Horowitz, “Every good computing resource in the world, in history, has had demand outstrip supply … it’s true of CPU power. It’s true of GPU power.” As a result, the availability and development of computing power will constrain and define the Metaverse (even though end-users won’t realize this). It doesn’t matter how much data you can receive, or how quickly, or why, if it can’t be used.
Consider today’s most popular Metaverse-like experiences, such as Fortnite or Roblox. While these platforms succeed through incredible creative achievements, it’s important to recognize that their underlying ideas are far from new — they’re just newly possible. Developers have long imagined experiences with dozens of live players (if not hundreds or thousands) in a single, shared simulation, as well as virtual environments bound by nothing but imagination.
It was only by the mid-2010s that millions of consumer-grade devices could process a game with 100 real players in a single match, and that enough affordable, server-side hardware was available and capable of synchronizing this information in near real-time. Once this technical barrier was broken, the games industry was quickly overtaken by games focused on rich UGC and high numbers of concurrent users (Free Fire, PUBG, Fortnite, Call of Duty: Warzone, Roblox, Minecraft). And these games then quickly expanded into the sorts of media experiences that were previously ‘IRL Only’ (e.g. the Travis Scott concert in Fortnite, or Lil Nas X’s in Roblox).
Yet even four years after the battle royale genre emerged, a number of tricks are still needed to ensure it works. For example, most players are never really together. Instead, they’re scattered across a large map. This means that, while the server needs to track what every player is doing, each player’s device doesn’t need to render them or track/process their actions. And when Fortnite does bring players together into a more confined space for a social event, such as a concert, it reduces the number of participants to 50, and limits what they can do versus the standard game modes. And for users with less-powerful processors, more compromises are made. Devices a few years old will choose not to load the custom outfits of other players (as they have no gameplay consequence) and instead just represent them as stock characters. Notably, Free Fire, which is mobile-only and mostly played on low-to-mid-range Androids in emerging markets, is capped at 50 for the main battle royale mode.
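The “players are scattered” trick is a form of interest management: the server simulates every player, but each client only receives, and renders, the players near it. A minimal sketch of the idea follows; the function name, the radius, and the player-position scheme are all illustrative assumptions, not any engine’s actual implementation:

```python
import math

def visible_players(me, players, radius=250.0):
    """Return only the players close enough to `me` to need rendering.

    The server tracks all players, but each client receives just this
    subset, so per-device render cost scales with local density rather
    than with total match size.
    """
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    return [p for p in players if p is not me and dist(me["pos"], p["pos"]) <= radius]

# 100 players scattered across a 3 km x 3 km map (positions are a toy
# deterministic spread): each client typically renders only a handful.
players = [{"id": i, "pos": ((i * 937) % 3000, (i * 613) % 3000)} for i in range(100)]
me = players[0]
nearby = visible_players(me, players)
assert len(nearby) < len(players) - 1
```

This is why a concert that pulls everyone into one confined space is so much harder than a standard match: the subset each device must render approaches the full player count, and the trick stops helping.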
“It makes me wonder where the future evolutions of these types of games will go that we can’t possibly build today. Our peak is 10.7 million players in Fortnite — but that’s 100,000 hundred-player sessions. Can we eventually put them all together in this shared world? And what would that experience look like? There are whole new genres that cannot even be invented yet because of the ever upward trend of technology.” - Tim Sweeney (2019)
This will slowly be solved, of course. Call of Duty: Warzone offered 150-player matches in 2020 (though only on higher-powered consoles and PCs; Call of Duty Mobile is limited to 100). Roblox has also allowed 200 players in its relatively lower-fidelity worlds, with up to 700 possible in beta testing (and 1,000+ promised). Improbable has done public tests with 4,000. But ‘more concurrent users’ isn’t the sole demand on our computing devices. We want our characters in Fortnite to have more customizable items than just an outfit and a backpack. What about separate shoes and hats? The ability to participate inside a virtual concert, rather than just attend one from a largely uninteractive, roped-off area? To return to an earlier example, fewer than 1% of desktop or laptop Macs and PCs can even play Microsoft Flight Simulator on its lowest-fidelity settings. Even Microsoft’s next-generation Xbox consoles, the Series S and Series X, which were released two months after the title, don’t yet support it (though they will later this year).
This doesn’t mean all Metaverse-focused experiences will require rich, instantaneous processing (think of a skeuomorphic conference room), let alone all of the time (immersive experiences are better in higher fidelity, but being able to access them from more devices is better than only being able to access them from your best device).
But human history shows that additional computing power always leads to advances — which is exactly why the demand for compute has always exceeded its supply. To this end, Jensen Huang’s desire to emulate God’s divine design might seem excessive and impractical, but dismissing it requires predicting, and writing off, the innovations that could come from it. Who would have thought that enabling 100-player battle royales would change the world?
Where to Locate and Build up Compute
There are a few different schools of thought when it comes to addressing our ever-expanding need for compute and its relative scarcity. One is to concentrate as much simulation-processing as possible in the cloud, rather than on local computing devices. Google Stadia and Amazon Luna, for example, process all video gameplay in the cloud, then push the entire rendered experience to a user’s device as a video stream. The only thing a client device needs to do is play this video and send inputs (e.g. move left, press X). Proponents of this approach like to highlight the logic of powering our homes via power grids and industrial power plants, not private, home-specific generators. The cloud-based model allows consumers to swap their consumer-grade, infrequently upgraded, and retailer-marked-up computers for enterprise-grade “computationally ridiculous” (to quote Jeff Bezos) machines that are more cost-efficient per unit of processing power and more easily replaced. This means that whether you have a $1,500 iPhone, or an old WiFi-enabled fridge with a video screen, you could in theory play Cyberpunk 2077 in all its fully rendered glory.
Another thesis suggests that we’re better off betting on advances in local compute, rather than remote supercomputers that must then contend with unreliable networks (see Section #2). Cloud-based rendering and video streaming is a compelling idea, but it also substantially increases the amount of low-latency data that needs to be delivered. As mentioned earlier, gaming content targets a minimum of 60 frames per second (more than twice the standard for video) and as many as 90–120 frames, ideally with 2K to 4K definition. Delivering this reliably, to everyone who wants to participate in the Metaverse, at the same time, and with low latency… is really hard. This is where the power generator analogy starts to break down; we don’t struggle to get the power we need on a daily basis, nor as quickly as needed.
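A back-of-envelope calculation shows why delivering this is so hard. The resolutions and frame rates below come from the targets mentioned above; the compression-ratio comment is a rough, hedged assumption about modern codecs, not a vendor figure:

```python
def raw_video_bandwidth_gbps(width, height, fps, bits_per_pixel=24):
    """Uncompressed video bandwidth in gigabits per second."""
    return width * height * bits_per_pixel * fps / 1e9

# 4K at the 120 fps upper bound discussed above, before any codec:
raw_4k_120 = raw_video_bandwidth_gbps(3840, 2160, 120)  # ~23.9 Gbps raw
# 2K (1440p) at the 60 fps gaming minimum:
raw_2k_60 = raw_video_bandwidth_gbps(2560, 1440, 60)    # ~5.3 Gbps raw

# Modern codecs (H.264/HEVC/AV1) shrink this by roughly two to three
# orders of magnitude, but even a 50-100 Mbps stream per concurrent
# user, sustained and at low latency, adds up quickly at scale.
```

Note that this is per user, continuously, for as long as the session lasts — unlike file downloads, none of it can be deferred to off-peak hours.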
And even at ultra-low latency, it makes little sense to stream (versus locally process) AR data given the speed at which a camera moves and new input data is received (camera input arrives literally at the speed of light, from only a few feet away). Given the intensive computational requirements of AR, it’s therefore likely our core personal/mobile devices will be able to do a ‘good enough’ job at most real-time rendering.
Thus far, remote compute hasn’t proven to be much more efficient for rendering, either. This is because cloud-based GPUs don’t generate generic rendering ‘power’. Instead, they are locked instances: a single GPU, remote or local, supports rendering for only a single user. No one has yet figured out how to split a GPU’s rendering power across multiple users (effectively, cost-efficiently, and at modern expectations for resolution and framerate) the same way a power plant splits electricity across multiple homes, or a CPU server can support input, location, and synchronization data for a hundred players in a battle royale.
As a result, cloud-rendering servers typically face utilization issues due to the need to plan for peak demand. A cloud-gaming service might require 75,000 dedicated servers for the Cleveland area at 8PM Sunday night, but only 4,000 at 4AM Monday. As a consumer, you can purchase a $400 GPU and let it sit offline as much as you want, but data-center economics are oriented around maximizing utilization.
This is why AWS gives customers a reduced rate if they rent servers from Amazon in advance (‘reserved instances’). Customers are guaranteed access for the next year, because they’ve paid for the server, and Amazon is pocketing the difference between its cost and the customer’s price (AWS’s cheapest Linux GPU reserved instance, equivalent to a PS4, costs over $2,000 for one year). If a customer wants to access servers when they need them (‘spot instances’), they might find they’re not available, or only lower-end GPUs are available, or only GPUs in another region are, which means greater latency.
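The economics sharpen when expressed per hour of actual play. The $2,000/year reserved-instance and $400 consumer-GPU figures come from the text above; the 500 hours of play per year and the three-year GPU lifespan are illustrative assumptions:

```python
def cost_per_played_hour(annual_cost, hours_played_per_year):
    """Effective cost per hour of actual use."""
    return annual_cost / hours_played_per_year

# Assumption: a fairly heavy gamer plays ~500 hours per year.
hours = 500

# Reserved cloud GPU (~PS4-class, per the text): over $2,000/year.
cloud = cost_per_played_hour(2000, hours)     # $4.00 per hour played

# $400 consumer GPU, assumed to last ~3 years.
local = cost_per_played_hour(400 / 3, hours)  # ~$0.27 per hour played
```

The gap closes only if the cloud GPU is kept busy serving other users whenever this one is idle — which is precisely the peak-demand utilization problem described above.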
If this model takes off, prices will improve (‘AWS’s margin for reserved consumer instances is my opportunity’), but renting high-end GPUs with low utilization and corporate markup will always be costly. Data centers also create considerable heat, which requires costly energy to cool, and the shift from cloud-streaming data to high-resolution, high-frame-rate content means substantially higher bandwidth costs, too. Both of these expenses are additive compared to local computing.
Most importantly, consumer processors improve much faster than networks as they’re far more frequently replaced and aren’t literally fighting the speed of light. This growth doesn’t mitigate all network challenges, but it suggests that we’re better off asking client-side devices to perform more computations than sending heavy video streams to these devices. This may change over time, but Sweeney’s Law looks likely to hold for the foreseeable future.
Edge compute is often highlighted as a key infrastructure strategy for the Metaverse. Specifically, this model involves deploying supercomputers at key network nodes between consumers and farther-away central servers. Edge computing is compatible with, and additive to, both of the above schools of thought, as it helps end-users supplement their local compute while also minimizing network-based latency and network-congestion risk.
The applied value of this approach remains uncertain. Microsoft’s xCloud, for example, operates from standard Azure datacenters rather than edge nodes. This is likely due to the aforementioned cloud-service utilization issue — the more edge centers you operate, the worse the utilization issues. Most of the consumer services that use edge computing, such as Netflix, really just use it as an edge hard drive that stores files closer to the user.
Cloudflare’s Founder and CEO Matthew Prince has argued that the opportunity for edge computing is in compliance. As the internet grows more fragmented due to government regulations requiring local processing of user data, companies will have no choice but to locate that data’s storage and handling closer to the user. This is likely to be the same in the Metaverse; government requirements (whether GDPR or the CCPA) are only likely to grow more onerous over time, as has already long been the case in China and Russia.
And while Google is a big believer in edge computing, Apple believes the real ‘edge’ compute node of the future will be the increasingly powerful mobile phones in our pockets, as they will carry most of the burden for the other devices around us, such as watches and smart glasses.
But even if we improve the computing power of consumer devices, move more enterprise computing power closer to said consumers, and build out more centralized infrastructure, we’re still likely to fall short.
Here’s an example that shocked me earlier this year. From December of 2020 to March of 2021, Genvid Technologies (Disclosure: portfolio company) operated its first major ‘MILE’ (or Massively Interactive Live Event) on Facebook Watch. This MILE, Rival Peak, was a sort of virtualized American Idol x LOST x Big Brother, a 13-week, 24/7 simulation of 13 AI contestants trapped in a fictionalized Pacific Northwest. While no characters were individually controlled, and no one viewer was an individual character, tens of thousands of concurrent viewers were able to affect the simulation in real-time — solving puzzles to aid contestants, choosing what they could do, and even influencing who survived and was booted off. Rival Peak could never have operated on a consumer device (its high CCU worked despite latency because it wasn’t built around low-latency interactions). In fact, it barely operated on AWS. With eight environments (production, backup, staging, QA and development), each of which was supported by over a dozen GPUs and hundreds of other CPUs, Rival Peak once ran out of GPU servers on AWS, and, during testing, routinely exhausted available spot servers.
Because there were no specific players (let alone a ‘player one’), Rival Peak doesn’t fit the instinctive definition of the Metaverse. However, the operation of a persistent and unending virtual world that supports unlimited interactions, each with lasting consequences, is as close to the end-state Metaverse as any other. And even in its nascent form, and without requiring meaningful consumer-side processing, it was running out of compute.
Just imagine what’s required for Nvidia’s vision of an interconnected mirrorworld. Or the sort of simulation required to map the entire geometry of a city, and then adjust everything from traffic lights to 5G radio waves in order to optimize the flow of people and information in real-time. Just for next year’s MILE (not yet announced), Genvid will require 200% more GPUs and CPUs.
The insatiable need for processing — ideally located as close as possible to the user, but even near industrial server farms — invariably inspires notions of decentralized computing. With so many powerful and often inactive devices in the homes and hands of consumers, it feels inevitable that we’d develop systems to effectively utilize them. Culturally, at least, this idea is already well understood. Anyone who installs solar panels at their home can sell excess power to their local grid (and, indirectly, their neighbor). Elon Musk touts a future where your Tesla earns you rent as a self-driving car when not in use, rather than just being parked in your garage for 99% of its life.
“There was this funny item on my to-do list in 1998 when we shipped the first Unreal game. It was to enable game servers to talk to each other so we can just have an unbounded number of players in a single game session — and it seems to still be on our wish list. The question of whether you can build one game that many millions of players can play, all in one shared world, together, that’s a really interesting challenge for the game industry now.” - Tim Sweeney (2019)
In fact, as early as the 1990s, programs emerged for distributed computing using everyday consumer hardware. Examples include Berkeley’s SETI@HOME, wherein consumers would volunteer use of their home computers to power the search for alien life. But more recent blockchain concepts, including smart contracts and tokens, provide an economic model for this sharing. In this conception, owners of underutilized CPUs and GPUs would be ‘paid’ in some cryptocurrency for the use of their processing capabilities, perhaps by users located ‘near’ them in network topology. There might even be a live auction for access to these resources, with either those holding ‘jobs’ bidding for access or those with capacity bidding on jobs.
One example of this mechanism is OTOY’s Render Network. As the first unbiased raytracer that completely utilized the GPU, Octane Render pioneered turnaround times that made it possible to modify scenes in real-time. But for its users — which included effects studios, artists, animators, designers, architects, and engineers — to take advantage of these breakthrough capabilities, they needed access to powerful real-time processing capabilities. OTOY hit on the idea of tapping a network of idle GPUs by creating the Ethereum-based RNDR network and token. As an alternative to pricey cloud providers, customers send rendering tasks to a network of computers, paying their owners using the token. All of the negotiation and contracting between parties is handled by the protocol within seconds, with neither side knowing the other’s identity or the specifics of the task being performed.
“You come to the realization that the blockchain is really a general mechanism for running programs, storing data, and verifiably carrying out transactions. It’s a superset of everything that exists in computing. We’ll eventually come to look at it as a computer that’s distributed and runs a billion times faster than the computer we have on our desktops, because it’s the combination of everyone’s computer.” - Tim Sweeney (2017)
Could such a marketplace provide some of the massive amounts of processing capacity that will be required by the Metaverse? Imagine, as you navigate immersive spaces, your account continuously bidding out the necessary computing tasks to mobile devices held but unused by people near you, perhaps people walking down the street next to you, in order to render or animate the experiences you encounter. Of course, later, when you’re not using your own devices, you would be earning tokens as they return the favor. Proponents of this crypto-exchange concept see it as an inevitable feature of all future microchips. Every computer, no matter how small, would be designed to always be auctioning off any spare cycles. Billions of dynamically arrayed processors will power the deep compute cycles of even the largest industrial customers and provide the ultimate and infinite computing mesh that enables the Metaverse.
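The matching step of such a marketplace can be sketched as a simple auction: pick the cheapest offer that satisfies the buyer’s latency bound. This is a deliberately toy model — the device names, prices, and latency figures are invented, and real networks such as RNDR also handle verification, escrow, and payment, none of which appears here:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Offer:
    device_id: str
    price_per_unit: float  # tokens per unit of compute (hypothetical)
    latency_ms: float      # network distance to the buyer

def match_job(offers, max_latency_ms=30.0) -> Optional[Offer]:
    """Pick the cheapest offer that meets the buyer's latency bound.

    A stand-in for the live auction described above: nearby devices can
    win despite higher prices, because distant ones fail the bound.
    """
    eligible = [o for o in offers if o.latency_ms <= max_latency_ms]
    return min(eligible, key=lambda o: o.price_per_unit, default=None)

offers = [
    Offer("phone-in-pocket-nearby", price_per_unit=0.8, latency_ms=5.0),
    Offer("idle-desktop-same-city", price_per_unit=0.5, latency_ms=20.0),
    Offer("distant-datacenter", price_per_unit=0.2, latency_ms=80.0),
]
winner = match_job(offers)
assert winner.device_id == "idle-desktop-same-city"
```

Note how the cheapest seller (the distant datacenter) loses the auction: for latency-sensitive rendering, network topology is part of the price.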
“Blockchain’s going to be here for a long time and it’s going to be a fundamental new form of computing.” - Jensen Huang (2018)
This is Part IV of the nine-part ‘METAVERSE PRIMER’.
Matthew Ball (@ballmatthew)