Integration of DePIN and embodied intelligence: technical challenges and future prospects

Introduction

On February 27, Messari hosted a podcast on “Building Decentralized AI” and invited Michael Cho, co-founder of FrodoBots Lab. They discussed the challenges and opportunities that Decentralized Physical Infrastructure Networks (DePIN) present for robotics. Although the field is still in its infancy, it has great potential and could fundamentally change how AI-powered robots operate in the real world. Unlike traditional AI, which relies on large amounts of Internet data, DePIN robotics faces more complex problems: data collection, hardware limitations, evaluation bottlenecks, and the sustainability of its economic models.

In this article, we break down the main points of that discussion: what problems DePIN robotics has run into, what the main obstacles to scaling decentralized robots are, and why DePIN has advantages over centralized approaches. Finally, we explore the future of DePIN robotics and ask whether a “ChatGPT moment” for robots is near.

Where are the bottlenecks in DePIN robotics?

When Michael Cho first started working on FrodoBots, the biggest headache was the cost of robotics hardware. Robots on the market were prohibitively expensive, which made real-world AI applications hard to promote. His initial solution was to build a low-cost autonomous robot for only $500, aiming to undercut most existing projects on price.

But as he and his team dug into R&D, Michael realized that cost was not the real bottleneck. The challenges DePIN faces in robotics are far more complex than “expensive or not.” As FrodoBots Lab continued to make progress, multiple bottlenecks gradually emerged. To achieve large-scale deployment, the following obstacles must be overcome.

Bottleneck 1: Data

Unlike large “online” AI models trained on vast amounts of Internet data, embodied AI must interact with the real world to develop intelligence. The problem is that no comparably large-scale dataset of real-world interaction exists yet, and there is no consensus on how to collect one. Data collection for embodied AI falls into three categories:

▎The first category is human teleoperation data: data generated when a human manually controls a robot. This data is high quality, capturing both video streams and action labels (what the operator sees and how they respond). It is the most effective way to train AI to mimic human behavior, but it is expensive and labor-intensive.

▎The second category is synthetic (simulated) data, which is useful in specialized domains such as training robots to move over rough terrain. But for highly variable tasks such as cooking, simulation falls short. Imagine training a robot to fry an egg: the type of pan, the oil temperature, and slight changes in room conditions all affect the result, and a virtual environment can hardly cover every scenario.

▎The third category is video learning: letting an AI model learn by observing real-world videos. This approach has potential, but it lacks the direct physical feedback that embodied intelligence requires.
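To make the first category concrete, the pairing of “what the operator saw” with “what the operator did” can be captured in a simple record format. The sketch below is a hypothetical schema (all class and field names are illustrative, not from FrodoBots), assuming a wheeled robot controlled with linear and angular velocity commands:

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical schema for one teleoperation sample: the camera frame the
# operator saw, paired with the control action they took at that moment.
@dataclass
class TeleopSample:
    timestamp_ms: int
    frame_id: str        # reference to a stored video frame
    linear_vel: float    # commanded forward velocity, m/s
    angular_vel: float   # commanded turn rate, rad/s

# An episode groups the samples for one task attempt with an outcome
# label, which is what makes the data usable for imitation learning
# and for evaluation.
@dataclass
class TeleopEpisode:
    robot_id: str
    task: str                                 # e.g. "last-mile delivery"
    samples: List[TeleopSample] = field(default_factory=list)
    success: bool = False

# Build a tiny episode: two control commands followed by an outcome.
episode = TeleopEpisode(robot_id="frodo-042", task="sidewalk navigation")
episode.samples.append(TeleopSample(0, "frame_0000", 0.8, 0.0))
episode.samples.append(TeleopSample(100, "frame_0001", 0.8, -0.2))
episode.success = True
```

The expensive part is not the schema but filling it: every sample requires a paid human operator at the controls, which is why this category of data is high quality yet hard to scale.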

Bottleneck 2: Level of autonomy

Michael mentioned that when he first tested FrodoBots in the real world, the robots were used primarily for last-mile delivery. On paper the results looked good: the robots completed 90% of delivery tasks successfully. But in practice, a 10% failure rate is unacceptable. A robot that fails one delivery in ten is simply not marketable. It is the same with autonomous driving: a system can log 10,000 successful trips, but a single failure is enough to destroy consumer confidence.

For robots to be truly useful, the success rate must approach 99.99% or higher. The problem is that each additional “nine” of reliability demands enormous time and effort, and many people underestimate the difficulty of this last step.
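A small calculation shows why per-task reliability compounds so brutally over a fleet. Assuming each task succeeds independently with a fixed rate, the chance of at least one failure over a run of tasks is:

```python
# Probability of at least one failure across n independent tasks,
# given a fixed per-task success rate.
def p_any_failure(success_rate: float, n: int) -> float:
    return 1.0 - success_rate ** n

# Compare the 90% rate from early delivery tests with the ~99.99%
# reliability the text argues is needed, over 100 tasks.
for rate in (0.90, 0.99, 0.9999):
    print(f"{rate:.4%} per task -> "
          f"{p_any_failure(rate, 100):.1%} chance of a failure in 100 runs")
```

At 90% per task, a failure in a 100-delivery day is a near certainty; even 99% still fails most days; only around four nines does a failure-free day become the norm. This is the gap hiding behind “the results were actually very good.”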

Michael recalls that when he sat in a prototype of Google’s self-driving car in 2015, he felt full self-driving was just around the corner. Ten years later, we are still debating when Level 5 autonomy will arrive. Progress in robotics is not linear: each step forward is exponentially harder than the last. Closing that final 1% reliability gap may take years or even decades.

Bottleneck 3: Hardware – AI alone cannot solve the robot problem

Taking a step back: even if AI models are powerful, current robots are not ready for true autonomy. The most easily overlooked hardware gap is the lack of touch sensing. Even the best current technology, such as Meta’s tactile-sensing research, is a far cry from the sensitivity of a human fingertip. Humans interact with the world through both vision and touch, while robots know almost nothing about texture, grip, and pressure feedback.

There is also the problem of occlusion: when an object is partially hidden, a robot struggles to recognize and interact with it, while a human can intuitively understand the object without seeing all of it.

Beyond perception, the actuators themselves have drawbacks. Most humanoid robots mount motors directly on the joints, making them bulky and potentially dangerous. By contrast, the human tendon-driven structure allows smoother, safer movement, which is why current humanoid robots appear stiff and inflexible. Companies like Apptronik are developing more nature-inspired actuator designs, but these innovations will take time to mature.

Bottleneck 4: Why is hardware scaling so difficult?

Unlike traditional AI models that rely only on computing power, intelligent robotics requires deploying physical devices in the real world, and that poses a major capital challenge. Building robots is expensive, and only the richest companies can afford large-scale experiments. Even the cheapest capable humanoid robots still cost tens of thousands of dollars, making widespread adoption unrealistic for now.

Bottleneck 5: Evaluation effectiveness

This is the “invisible” bottleneck. A large online AI model like ChatGPT can be tested almost instantly: within hours of a new language model’s release, researchers and ordinary users around the world can draw conclusions about its performance. Evaluating physical AI, by contrast, requires deploying it in the real world, and that takes time.

Tesla’s Full Self-Driving (FSD) program is a good example. If a Tesla has traveled a million miles without an accident, does that mean it has achieved Level 5 autonomy? What about 10 million miles? The problem with machine intelligence is that the only way to validate it is to see where it ultimately fails, which requires direct, large-scale, long-term deployment.
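There is a standard way to quantify how little a clean track record actually proves. The “rule of three” from statistics says that if n independent trials produce zero failures, an approximate 95% upper confidence bound on the true failure rate is 3/n (the exact form is below). The mileage figures here are the article's hypothetical numbers, not Tesla data:

```python
# If n independent trials show zero failures, the exact 95% upper
# confidence bound on the per-trial failure rate is 1 - alpha**(1/n),
# which is approximately 3/n for alpha = 0.05 ("rule of three").
def failure_rate_upper_bound(n_trials: int, alpha: float = 0.05) -> float:
    return 1.0 - alpha ** (1.0 / n_trials)

# One million accident-free miles only bounds the per-mile failure
# rate at roughly 3 in a million; ten million miles, 3 in ten million.
print(failure_rate_upper_bound(1_000_000))   # ~3.0e-06
print(failure_rate_upper_bound(10_000_000))  # ~3.0e-07
```

In other words, absence of failures never proves a failure rate of zero; it only shrinks the upper bound, and shrinking it another order of magnitude requires another order of magnitude of real-world deployment. That is exactly why evaluation, not training, is the slow loop for physical AI.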

Bottleneck 6: Human Resources

Another underappreciated challenge is that human labor remains indispensable to embodied AI development. AI alone is not enough: robots need human operators to provide training data, maintenance teams to keep them running, and core researchers and developers to keep improving the models. Unlike AI models that can be trained entirely in the cloud, robots require constant human intervention, a major challenge that DePIN must address.

The Future: When will the ChatGPT moment for robots come?

Some believe a ChatGPT moment for robots is coming; Michael is skeptical. Given the hardware, data, and evaluation challenges, he believes general-purpose robot intelligence is still a long way from mass adoption. However, the progress of DePIN robotics gives cause for hope. Robotics development should be decentralized rather than controlled by a few large companies, and a decentralized network can spread the capital burden across many participants. Instead of relying on one large company to fund thousands of robots, individuals who can contribute hardware or labor join a shared network.

First, DePIN speeds up data collection and evaluation. Instead of waiting for one company to deploy a limited number of robots, a decentralized network can run many robots in parallel and collect data at a much larger scale. In the recent AI vs. Human robot competition in Abu Dhabi, researchers from institutions such as DeepMind and the University of Texas at Austin tested their AI models against human players. Although humans still had the upper hand, the researchers were excited by the unique dataset collected from the robots’ real-world interactions, which indirectly demonstrates the value of networks connecting distributed robot deployments. The research community’s enthusiasm also shows that even if full autonomy remains a long-term goal, DePIN robots already deliver tangible value across the pipeline, from data collection and training to real-world deployment and validation.

Second, AI-assisted hardware design, such as chip optimization and AI-driven materials engineering, could significantly shorten the timeline. As a concrete example, FrodoBots Lab has worked with partner organizations to secure two NVIDIA H100 GPU servers, each containing eight H100 chips. These give researchers the computational power to process the real-world data collected from robot deployments and improve their AI models; without such resources, even the most valuable datasets cannot be fully exploited. By tapping DePIN’s decentralized computing infrastructure, a robot network can let researchers around the world train and evaluate models without owning capital-intensive GPUs themselves. If DePIN successfully crowdsources both data and hardware advances, the future of robotics could arrive sooner than expected.

Additionally, AI agents like Sam, a mobile KOL (key opinion leader) robot associated with a meme token, are opening new revenue models for decentralized robot networks. Sam operates independently, livestreaming 24/7 across multiple cities while its token rises in value. The model demonstrates how DePIN-powered robots could financially sustain themselves through decentralized ownership and token incentives. In the future, such agents could use tokens to pay human operators for assistance, rent additional robotic assets, or bid on real-world missions, creating an economic loop that benefits both AI development and DePIN participants.
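The economic loop described above, an agent spending tokens on human help and earning them back from missions, can be sketched as a toy in-memory ledger. Everything here is illustrative (class names, recipients, and amounts are invented, and a real system would settle on-chain rather than in a Python object):

```python
# Toy simulation of the token loop: an autonomous agent holds a balance,
# pays a human operator for an assist, and earns tokens from a mission.
class AgentWallet:
    def __init__(self, balance: float):
        self.balance = balance
        self.log = []  # (counterparty, amount, reason) entries

    def pay(self, recipient: str, amount: float, reason: str) -> bool:
        """Spend tokens; refuse the transfer if funds are insufficient."""
        if amount > self.balance:
            return False
        self.balance -= amount
        self.log.append((recipient, amount, reason))
        return True

    def earn(self, amount: float, reason: str) -> None:
        """Credit mission rewards back to the agent."""
        self.balance += amount
        self.log.append(("self", amount, reason))

# One turn of the loop: buy a remote assist, then complete a paid mission.
wallet = AgentWallet(balance=100.0)
wallet.pay("human_operator_7", 5.0, "remote assist on blocked sidewalk")
wallet.earn(12.0, "completed delivery mission")
print(wallet.balance)  # 107.0
```

The point of the sketch is the closed cycle: as long as mission revenue exceeds what the agent pays out for human assistance and rentals, the balance grows and the network's human participants are compensated from the same flow.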

Final summary

The development of robot AI depends not only on algorithms, but also on hardware upgrades, data accumulation, financial support, and human participation. In the past, the robotics industry was limited by high costs and the dominance of large companies, which slowed the pace of innovation. The creation of DePIN robot networks means that, with the power of decentralized networks, robot data collection, computing resources, and capital investment can be coordinated on a global scale. This not only accelerates AI training and hardware optimization but also lowers the barrier to entry, allowing more researchers, entrepreneurs, and individual users to participate. We hope the robotics industry will no longer depend on a few technology giants, but will instead be driven jointly by a global community toward a truly open and sustainable technology ecosystem.
