Wang Sheng: Metaverse Success Depends on New Graph-based AI Paradigm

For this article, we invited Mr. Wang Sheng, a partner at Innoangel Fund, to share his ideas about the metaverse and the new paradigm of graph-based AI.

According to Wang Sheng, the metaverse is designed to build new internet applications and content that are 3D and immersive, utilizing ongoing improvements in graphics, artificial intelligence, and computing power. Without advancing technical paradigms like the new 2D/3D graph-based AIs, there would be no metaverse.

New paradigms trigger scientific revolutions

The concept of the paradigm was proposed by the famous philosopher of science Thomas S. Kuhn and developed in his book "The Structure of Scientific Revolutions."

A paradigm is a set of hypotheses, theories, guidelines, or methods accepted by a professional community. At a particular stage, a paradigm is considered relatively close to the truth and offers the best available answer to a scientific problem. Paradigms also undergo continuous iteration. When a paradigm can no longer explain observed phenomena, a paradigm crisis ensues and a new paradigm emerges, which leads to a scientific revolution.

Metaverse and Web 3

If we use the paradigm concept to examine the metaverse and Web 3, we find that the two are different paradigms, each with its own community, goals, methods, theories, and systems.

Essentially, the metaverse is a brand-new kind of 3D immersive content and application scenario built on existing Internet applications, drawing on graphics, artificial intelligence, and massively increased computing power. This is why Microsoft, Google, and Apple are the leading promoters of the metaverse. Basically, it is an upgrade to existing Internet technology and business scenarios.

Web 3, in contrast, is a new generation of network systems with blockchain technology and cryptocurrency as its underlying layer. It reconstructs the global network into a decentralized structure and transfers the ownership, governance, and benefit rights of the assets generated on the Web to the users. Web 3 is thus intended to bring down the incumbent Internet giants and redistribute their assets. This is a significant difference between the two.

At present, the metaverse is a hot topic in China, with various projects emerging. Because Web 3 involves blockchain and cryptocurrency, however, few Web 3 projects are to be seen there at the moment.

Progress of civilization paradigm

From slash-and-burn to agricultural civilizations and then to industrialization, the progression of the civilization paradigm cannot be separated from a few recognized factors: energy, productivity, communication, and transportation.

For example, industrial civilization placed great emphasis on machinery. In terms of energy, steam was used to drive machines. In terms of productivity, the spinning jenny and other water-driven automated textile equipment spearheaded the Industrial Revolution even before the improved steam engine was invented.

Moreover, the enormous influence of communications and transportation on commerce and the economy cannot be overlooked. Communication growth added to the scale of organizations and contributed to the formation of business cooperation and corporate organizations. In addition, transportation has been integral to the distribution of goods and services.

Within the metaverse framework, these same elements hold great promise and will likely spur the fourth industrial revolution. In terms of energy, the goals of reaching a carbon peak, achieving carbon neutrality, and using clean energy will significantly change the source, utilization, and scale of energy use. Productivity will be enhanced through the development of AI, whereas previous efforts focused on enhancing human capabilities. Communication will draw attention because of the revolutionary change in media that the metaverse brings. Transportation will remain a focus as intelligent transportation systems rise and spread.

We are on the verge of entering a fourth industrial revolution within a short period. Technological advancements and human paradigms have already changed in many fields, and these factors together will propel us into a new metaverse civilization.

The metaverse is a media transformation at its core

Underpinning the metaverse is a media revolution. Media is a significant element of civilization, and changes in media promote value creation.

A review of the history of media change, from newspapers, radio, and television to the Internet, mobile devices, and now virtual and augmented reality, shows that when emerging technologies drive media change, they generate a great deal of economic value, new businesses, and new economic behaviors.

The ultimate form of media

The next step is to consider the logic of media evolution. With the rapid evolution of new technological tools, information density and accessibility will be essential.

Information density determines how much information a medium delivers in a limited amount of time, and hence the quality of the medium. SMS, radio, and video clearly convey very different amounts of information. The question is, what is the highest quality of information for us? It is the real world itself: the mountains, the rivers, and the scenes we encounter at work or at leisure constitute the greatest density of information for human beings. From this perspective, we can see why VR and immersive experiences are so significant.

The term accessibility (media efficiency) refers to the cost of acquiring a piece of content, including production, distribution, circulation, access, time, and cognitive costs. The extreme level of accessibility, assuming there will not be any brain-computer interfaces in the future, is augmented reality. Generally, technology drives media development. Media evolves from low information density to high, which leads to enhanced media efficiencies.

In the figure below, we can observe the general trend of media evolution. Within the next few years, its ideal state will be the combination of VR and AR, also known as MR or XR, which will offer us extremely high information quality and the best information efficiency available.

The media of the metaverse

A paradigm shift in the media will affect the other paradigms of human society. The media is a carrier of information, but in a broader sense it is also the space, the environment, and the channel through which we deal with other subjects (people, things, and knowledge). The media is valuable because it is the channel through which all economic activities (transactions) are carried out.

The media of the metaverse should be 3D and immersive, driven by technological innovation. Over the past few decades, the PC and mobile Internet have been efficient because they abstracted the media. The metaverse is a medium that pushes the Internet toward figuration, the opposite of abstraction, to the extreme, simulating the human world to enhance economic activity and human behavior.

Pillars of Web 3

Web 3 is composed of two pillars: consensus and social networks.

Logic, language, and sociality contribute to group consensus and strengthen community bonds. Several studies have suggested that the number of social relationships an individual can effectively maintain has risen from about 150 to about 300, primarily because of the Internet and mobile Internet. The efficiency of social activity has clearly improved since social networks appeared. Consensus generates great value: at their peak last year, cryptocurrencies reached a combined market capitalization of $3 trillion.

The structure of Web 3 resembles an efficient, high-trust social network model, or even a small-world network model. Human networks have evolved from the most primitive and inefficient cave model of the past agricultural era to the more complex large-world model we have today. In the future small-world model, technology will make these networks more efficient and effective.
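To make the small-world idea concrete, here is a minimal sketch, assuming "small-world" is meant in the usual network-science sense, where a few long-range links sharply shorten the average path between members. The 20-node ring, the two shortcut links, and all function names are illustrative assumptions, not taken from Wang Sheng's talk.

```python
from collections import deque

def ring_lattice(n):
    """Each node is linked only to its two nearest neighbors (purely local ties)."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}

def add_shortcut(graph, a, b):
    """Add one long-range, undirected link between nodes a and b."""
    graph[a].add(b)
    graph[b].add(a)

def average_path_length(graph):
    """Mean shortest-path length over all ordered node pairs, using BFS."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

# Hypothetical example: 20 members, first with local ties only,
# then with two long-range shortcuts added.
local_only = ring_lattice(20)
with_shortcuts = ring_lattice(20)
for a, b in [(0, 10), (5, 15)]:
    add_shortcut(with_shortcuts, a, b)

print(average_path_length(local_only))
print(average_path_length(with_shortcuts))
```

In this toy network, two extra links are enough to shorten the average number of hops between members, which is the efficiency gain the small-world model refers to.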

New paradigms of metaverse technologies

The following is a description of the current paradigms of metaverse technologies.

1. The new paradigm of graph-based AI. From 2014 onwards, Convolutional Neural Networks, or CNNs, have become the gold standard for AI, and Deep Learning is the prevailing paradigm. These advancements have significantly enhanced efficiency over previous paradigms. Since then, we have seen CNN, LSTM, BERT, Transformer, and GPT models become increasingly sophisticated and widely used.

It wasn't that long ago that OpenAI released DALL-E 2, a text-to-image generator. When given a text such as: "An astronaut lounging in a tropical resort in space + in a vaporwave style," DALL-E 2 will produce the following images:

This is not a composite of existing photos but an image the machine generates automatically after recognizing the semantics of the text. The system can comprehend individual objects and the connections between them, which is why this paradigm has moved beyond the older approaches that many developers followed in the past. Graphics is shifting from hand-written graphics algorithms to AI, which will go through paradigm iterations of its own.

2. The new paradigm of 3D graph-based AI. For many years, the entire 3D pipeline has relied on tools such as Maya, 3ds Max, C4D, and Blender to model, rig, drive, and animate objects. The traditional production pipeline was built on rasterization, ray tracing, meshes, and so on. Now we need to pay attention to 3D GANs and neural radiance fields (NeRFs), which may have the potential to transform traditional 3D graphics pipelines.

There is a possibility that AI will replace the geometric paradigm, but the geometric paradigm will not necessarily disappear completely. Both paradigms may coexist in the future. Furthermore, the geometric paradigm may also be able to complement the artificial intelligence paradigm in the near future.

NVIDIA has published a new paper on Instant NeRF. The left side of the figure below shows a fairly complex neural radiance field trained on only 34 images. The rendering quality is excellent, and training takes a few minutes at most. The right side shows an example of model training: such a model can be trained in a few seconds and then used for rendering.
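For readers unfamiliar with how a radiance field produces an image, the following is a minimal sketch of the volume-rendering step commonly used with NeRFs: sample points along a camera ray, query the field for density and color at each point, and composite the results. The `toy_field` function is a hand-written, hypothetical stand-in for a trained network, and nothing here reproduces NVIDIA's Instant NeRF implementation.

```python
import math

def toy_field(x, y, z):
    """Hypothetical stand-in for a trained radiance field: returns (density, rgb).

    Models a dense red sphere of radius 1 centered at the origin.
    """
    inside = x * x + y * y + z * z <= 1.0
    density = 10.0 if inside else 0.0
    return density, (1.0, 0.0, 0.0)

def render_ray(origin, direction, near=0.0, far=4.0, n_samples=64):
    """Composite a color along one ray with the standard alpha-compositing sum."""
    delta = (far - near) / n_samples          # spacing between samples
    transmittance = 1.0                       # fraction of light not yet absorbed
    color = [0.0, 0.0, 0.0]
    for i in range(n_samples):
        t = near + (i + 0.5) * delta          # sample at the midpoint of each bin
        point = [o + t * d for o, d in zip(origin, direction)]
        density, rgb = toy_field(*point)
        alpha = 1.0 - math.exp(-density * delta)   # opacity of this segment
        weight = transmittance * alpha
        for k in range(3):
            color[k] += weight * rgb[k]
        transmittance *= 1.0 - alpha
    return tuple(color), transmittance

# A ray through the sphere and a ray that misses it.
hit_color, hit_t = render_ray((0.0, 0.0, -2.0), (0.0, 0.0, 1.0))
miss_color, miss_t = render_ray((0.0, 5.0, -2.0), (0.0, 0.0, 1.0))
print(hit_color, hit_t)
print(miss_color, miss_t)
```

In a real NeRF, the field is a neural network fitted so that rays rendered this way match the training photographs; the geometry is never modeled by hand, which is the contrast with the mesh-based pipeline described above.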


The success of the metaverse depends on the implementation of new technical paradigms, because the previous ones are becoming outdated. According to NVIDIA, for the metaverse to feel real, computing power must increase a millionfold.

The conventional pipeline is therefore not sustainable in the long run, and we have to embrace a new way of doing things. Just as Deep Learning dominates AI today, we think that in the future the fundamental approach to AI will be a graph-based paradigm that holds sway over the entire metaverse.

