In the past few months, we have seen plenty of robots that offer integration with large language models. While LLMs revolutionize robots with contextual reasoning and facilitate human-robot interaction, they open robots to risk of being jailbroken. RoboPAIR is an algorithm designed to jailbreak LLM-controlled robots. It “elicits harmful physical actions from LLM-controlled robots.” Here is what the researchers accomplished:
According to the researchers, in many scenarios, they managed to achieve 100% attack success rate. In the above video, you can see how the robot was tricked to deliver an explosive package.
[HT]