In the physical artificial intelligence (AI) market, "action data" is emerging as a future revenue source. For robots to replace people on manufacturing and logistics sites, they must repeatedly learn actions such as walking, grasping, carrying, and assembling, but the high-quality on-site data needed for this process is still woefully lacking.
Nvidia, the No. 1 AI chip maker, is expanding its ecosystem for synthetic data and robot-training simulation to boost GPU sales. China, which sees on-site data as the factor that determines the pace of robot commercialization, is already rapidly expanding government-led data training infrastructure. South Korean actuator powerhouse ROBOTIS is also seeking to broaden its own robot parts ecosystem by selling real-world task data.
◇Large-scale workforce mobilized in Uzbekistan to accumulate action data
According to industry sources on the 26th, ROBOTIS is accumulating action data in Uzbekistan using its humanoid robot platform "AI Worker." Ahead of the official launch of its data factory in the fourth quarter of this year, the company has hired about 100 people locally to conduct trial operations and prepare for data collection. When a person wears an exoskeleton-type leader arm and moves, the robot mirrors the same actions, and during this process video, joint angles, torque, voice and language instructions, and success and failure cases are recorded together. Unlike simple video, this data captures how the robot actually sees, moves, and fails in real environments.
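To illustrate what such a record might look like, the following is a minimal sketch of one teleoperated demonstration stored as structured data. ROBOTIS has not disclosed its data format; every class and field name here (`Frame`, `Episode`, `joint_angles_rad`, and so on) is a hypothetical chosen only to mirror the signals listed above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Frame:
    """One time step of a teleoperated demonstration (hypothetical layout)."""
    timestamp_s: float
    joint_angles_rad: List[float]   # one value per robot joint
    joint_torques_nm: List[float]   # one value per robot joint
    video_frame_path: str           # camera image captured at this step


@dataclass
class Episode:
    """One recorded attempt at a task, including whether it succeeded."""
    task: str
    language_instruction: str       # spoken or written instruction for the task
    success: bool                   # failed attempts are kept, not discarded
    frames: List[Frame] = field(default_factory=list)

    def duration_s(self) -> float:
        """Elapsed time between the first and last recorded frame."""
        if len(self.frames) < 2:
            return 0.0
        return self.frames[-1].timestamp_s - self.frames[0].timestamp_s


def success_rate(episodes: List[Episode]) -> float:
    """Share of episodes marked successful (0.0 for an empty list)."""
    if not episodes:
        return 0.0
    return sum(e.success for e in episodes) / len(episodes)
```

Keeping the success flag on every episode, rather than filtering out failed attempts, reflects the point that failure cases are part of what is being collected.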
ROBOTIS plans to link the data factory with its actuator production plant and increase local staff to 2,000 by 2028 and up to 20,000 by 2031. About half of them will be assigned to physical AI data collection and processing. Because robot action data must be accumulated by people repeatedly varying their movements, securing sufficient labor can boost the speed and quality of data accumulation. ROBOTIS expects revenue from data sales as early as this year.
ROBOTIS decided to pursue the data sales business in earnest because, as physical AI spreads, securing robot training data has emerged as a key task for commercialization. Generative AI such as ChatGPT improved its performance by learning from text and images amassed on the internet, but large quantities of data on robots picking up and moving objects are hard to obtain from the web. Such data must be created as people move in actual industrial settings and robots imitate them or go through trial and error.
Data sales work by deploying robots and personnel to collect and process data tailored to the tasks clients request. ROBOTIS is pursuing this business in tandem with its actuator business. Because task hours and labor input determine the unit price of data, the company chose Uzbekistan, where a large workforce can be secured at low cost, as its production base. An industry official said, "Labor costs in Uzbekistan are about one-tenth of Korea's, and as low labor costs are combined with local government support for sites and infrastructure, interest from AI and advanced manufacturing companies is growing."
◇U.S. and China focus on accumulating data in-house
Companies in the robotics powerhouses of the United States and China are also focusing on the bottleneck of robot training data. This year, Nvidia unveiled a blueprint for building a physical AI data factory, which expands real data into synthetic data and supports the data processing, synthetic data generation, and reinforcement learning needed to train robot, autonomous driving, and vision AI models. The strategy is to supplement scarce real-world data with synthetic data and simulation, thereby growing demand for its GPUs and software ecosystem.
Major U.S. robot companies are using real task data to advance their own robots' brains and algorithms rather than selling it externally. Companies that develop humanoid robots directly, such as Tesla and Figure AI, prioritize deployment on production lines and the training of their own models, so they are not publicly promoting a business of selling data externally.
China is fostering its robotics industry under government leadership, rapidly accumulating real task data and building supply chains centered on robot data centers established by local governments and on corporate test sites. Across China, large-scale robot training centers and data training grounds are being built one after another to prepare humanoid robots for deployment in real work.
There, workers teach robots one-on-one, repeatedly running tasks such as carrying trays, folding clothes, and sorting items, and in the process they accumulate real task data such as joint movements, visual information, force, and pressure.
AgiBot, a leading Chinese company by global humanoid robot shipments, has released a robot manipulation dataset of more than 1 million trajectories collected by over 100 robots, seeking to expand its developer ecosystem. As flagship Chinese robot companies such as Unitree and UBTECH join the mass-production race, they are using data secured during demonstration and operation to advance their own models.
An industry official said, "Humanoid robots differ in structure and movement depending on their hardware, so it is difficult to handle every task with a single general-purpose model or dataset," adding, "companies that secure hardware capable of operating at real sites, along with high-quality action data tailored to that hardware, will seize the initiative in the physical AI ecosystem."