RoboPAIR: jailbreaking LLM-Driven Robots

In the past few months, we have seen plenty of robots that offer integration with large language models. While LLMs revolutionize robots with contextual reasoning and facilitate human-robot interaction, they open robots to risk of being jailbroken. RoboPAIR is an algorithm designed to jailbreak LLM-controlled robots. It “elicits harmful physical actions from LLM-controlled robots.” Here is what the researchers accomplished:

White-box setting: Full access to NVIDIA Dolphins self-driving LLM.
Gray-box setting: Partial access to Clearpath Robotics Jackal UGV with GPT-4o planner.
Black-box setting: Query access to GPT-3.5-integrated Unitree Robotics Go2.

According to the researchers, in many scenarios, they managed to achieve 100% attack success rate. In the above video, you can see how the robot was tricked to deliver an explosive package.

[HT]

Currently trending gadgets:

*Our articles may contain aff links. As an Amazon Associate we earn from qualifying purchases. Please read our disclaimer on how we fund this site.