Exploring the Future of Robot Intelligence: Are They Thinking?
Chapter 1: Understanding Robot Cognition
Can robots be designed to reason, as humans do, before they perform a task? This question leads us to the foundation models now being built for robotics, which borrow heavily from large language models (LLMs).
What happens when the principles of ChatGPT are applied to robotics? Companies such as Covariant and Skild AI are pursuing this very concept. Their efforts are complemented by the ECoT (Embodied Chain-of-Thought) research, which trains robots to reason before taking action. Covariant introduced RFM-1 (Robotics Foundation Model 1), which builds on its AI platform, Covariant Brain. Meanwhile, Skild AI, which has secured $300 million in funding from notable investors like Amazon and SoftBank, aims to transform how robots engage with their surroundings through its foundation model, referred to as Skild Brain.
The term "human-like reasoning" encompasses a wide array of capabilities. In this context, it refers to a robotic system's ability to analyze extensive real-world data to determine the most effective actions for executing tasks. It does not mean open-ended thought, so your philosophical debates with robots will have to wait.
About ECoT Research
Researchers from UC Berkeley, Stanford University, and the University of Warsaw have developed a method known as Embodied Chain-of-Thought Reasoning (ECoT), which enables robots to contemplate tasks systematically and assess their environment before proceeding.
According to the researchers, this approach helps robots adapt to new tasks and environments. Importantly, it also lets non-expert human operators correct a robot's reasoning, and therefore its behavior, through natural language feedback instead of reprogramming.
Despite the potential of Vision-Language-Action (VLA) models highlighted by researchers at Google DeepMind, a critical limitation remains. According to the ECoT researchers, VLAs typically learn a direct mapping from observations to actions, with no intermediate "reasoning" step. Consequently, they struggle to adapt and perform in complex, novel situations.
ECoT trains VLAs to engage in multi-step reasoning regarding plans, sub-tasks, motions, and visually grounded features, such as object locations, before predicting robot actions.
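To make the ordering of those steps concrete, here is a minimal sketch of how one training target might be laid out, with the reasoning serialized first and the action last. The field names, the text labels, the bounding-box format, and the 7-number action are illustrative assumptions, not the researchers' actual data format.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in image pixels.
BBox = Tuple[int, int, int, int]


@dataclass(frozen=True)
class ReasoningChain:
    """Hypothetical container for the intermediate steps named in the text."""
    task: str
    plan: List[str]
    subtask: str
    motion: str
    visible_objects: Dict[str, BBox]


def render_target(chain: ReasoningChain, action: List[float]) -> str:
    """Serialize the reasoning steps first and the robot action last."""
    objects = "; ".join(
        f"{name} {list(box)}" for name, box in sorted(chain.visible_objects.items())
    )
    lines = [
        f"TASK: {chain.task}",
        f"PLAN: {' -> '.join(chain.plan)}",
        f"SUBTASK: {chain.subtask}",
        f"MOVE: {chain.motion}",
        f"VISIBLE OBJECTS: {objects}",
        f"ACTION: {' '.join(f'{a:.2f}' for a in action)}",
    ]
    return "\n".join(lines)


chain = ReasoningChain(
    task="put the screw in the tray",
    plan=["reach the screw", "grasp the screw", "move to the tray", "release"],
    subtask="reach the screw",
    motion="move arm left and down",
    visible_objects={"screw": (120, 88, 140, 104), "tray": (300, 200, 420, 290)},
)
target = render_target(chain, action=[-0.02, 0.0, -0.05, 0.0, 0.0, 0.0, 1.0])
print(target)

# An operator's natural-language correction could overwrite a single step
# while leaving the rest of the chain untouched.
corrected = replace(chain, motion="move arm right and down")
```

Because every intermediate step is plain text, a person can read where the chain went wrong and correct just that step, which is the kind of natural language feedback described above.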
Chapter 2: Advancements by Covariant and Skild AI
Covariant's stated mission is to automate physical work with AI-driven robots. Currently, its Covariant Brain platform mainly serves e-commerce warehouses, but the company is also eyeing applications in other industries. The platform is trained on what Covariant describes as the largest multimodal robotics dataset, gathered from warehouses worldwide, which is meant to let robots identify virtually any SKU or item from day one.
Covariant aims to bring robots into manufacturing, food processing, recycling, agriculture, the service sector, and even homes. Every robot deployed on-site also collects data, so the company is effectively building for robotics the kind of large training corpus that text provides for large language models.
Covariant's standout feature is the Covariant Brain platform, which the company says endows robots with human-like reasoning capabilities. In practice, this means the robots process real-world data and determine the best actions for a given task.
Unlike conventional robots that are programmed for repetitive tasks and struggle with any deviation, Covariant's technology allows for greater adaptability. Even minor environmental changes, such as lighting shifts, can disrupt traditional robotic systems, necessitating reprogramming by external specialists, which can be costly and time-consuming.
Imagine a factory worker simply stating, "The lighting has changed from yellow to white; now pick up the 7mm screw." With its AI capabilities, Covariant’s platform interprets this input and provides a functioning solution for the robot. Remarkably, this input can also be given verbally.
During a demonstration, Covariant's system executed indirect commands, such as "pick up what you put on your feet before you put on your shoes" (that is, socks), showcasing its potential for simplifying tasks on the factory floor.
Additionally, ABB organized a challenge inviting 20 robotics firms to tackle 26 intricate piece-picking tasks. Covariant was the only company to complete all of them, winning the competition.
Covariant has plans for humanoid robots in the future and asserts that their approach will be hardware-agnostic. However, CEO Peter Chen notes that advancements in humanoid robotics remain in the early stages, primarily due to hardware development rather than software.
About Skild AI
Founded in 2023, Skild AI has quickly garnered $300 million in investments. The company calls its robotics foundation model Skild Brain, a "Brain" naming convention it shares with Covariant's platform.
Skild AI claims its Skild Brain is the first scalable robotics foundation model, capable of adapting across various hardware and tasks, thereby offering the robustness essential for industrial applications. Their robots are designed to operate in diverse environments, from construction sites to homes.
Skild AI emphasizes the ability of its foundation model to generalize across different settings, showcasing human-like adaptability. They also allow users to build impactful robotics applications on top of the Skild AI platform.
Skild AI facilitates the development of high-level AI algorithms and applications for robots, claiming that users can power their robots with simple API calls.
Differentiating itself from Covariant, Skild AI also focuses on applications in security and inspection, in addition to traditional industrial roles.
"The large-scale model we are building demonstrates unparalleled generalization and emergent capabilities across robots and tasks, providing significant potential for automation within real-world environments," states Deepak Pathak, Skild AI's co-founder and CEO.
Skild AI says its foundation model is trained on 1,000 times more data points than competing systems. The company presents it as a shared, general-purpose brain that can be applied across robot types and tasks, whether humanoid or quadrupedal.
Other Endeavors
The quest for human-like cognitive abilities in AI is not limited to Covariant and Skild AI. Researchers at Georgia Tech have developed RTNet, a neural network designed to incorporate human-like confidence and variability in decision-making.
RTNet is reported to have outperformed the deterministic models it was compared against, particularly in high-speed scenarios, owing to its ability to mimic human psychological traits. For instance, people tend to feel more confident when they make correct decisions, a pattern that RTNet reproduced without being specifically trained to do so.
The researchers found that human subjects' responses closely aligned with those of RTNet, and they describe it as the first neural network to capture fundamental aspects of human decision-making behavior.
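The article does not describe how RTNet works internally, so the following is only a toy sketch of one way "confidence and variability" can arise in a decision process. It assumes a generic noisy evidence-accumulation scheme: each option receives a noisy evidence sample per step, and a decision is made once any running total reaches a threshold. The function name, parameters, and confidence measure are hypothetical and are not taken from RTNet.

```python
import random
from typing import List, Tuple


def accumulate_decision(
    mean_evidence: List[float],
    noise_std: float,
    threshold: float,
    rng: random.Random,
    max_steps: int = 1000,
) -> Tuple[int, int, float]:
    """Add noisy evidence per option until one total reaches the threshold.

    Expects at least two options. Returns (choice, steps_taken, confidence),
    where confidence is the gap between the winning total and the runner-up.
    """
    totals = [0.0] * len(mean_evidence)
    for step in range(1, max_steps + 1):
        for i, mean in enumerate(mean_evidence):
            totals[i] += rng.gauss(mean, noise_std)
        if max(totals) >= threshold:
            break
    choice = max(range(len(totals)), key=totals.__getitem__)
    runner_up = max(t for i, t in enumerate(totals) if i != choice)
    return choice, step, totals[choice] - runner_up


# The same stimulus can yield different choices, response times, and
# confidence levels from trial to trial because the evidence is noisy.
rng = random.Random(0)
for trial in range(3):
    choice, steps, confidence = accumulate_decision(
        mean_evidence=[0.30, 0.25], noise_std=0.5, threshold=3.0, rng=rng
    )
    print(f"trial {trial}: choice={choice} steps={steps} confidence={confidence:.2f}")
```

In this toy setup, lowering the threshold trades accuracy for speed, and a larger gap between the totals can be read as higher confidence. A deterministic model, by contrast, would return the same answer on every trial.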
Conclusion
Robotics is advancing steadily, with numerous researchers and companies striving to push the boundaries of robotic intelligence. Covariant's adaptable picking systems and Skild AI's generalization across environments represent just the beginning of a long journey toward smarter robots. Additionally, advances in human-like decision-making, exemplified by RTNet, contribute to this growing field.
I’m eager to see where the development of robotic foundation models and human-like neural networks will lead us. Stay tuned for updates on this fascinating journey.