HSAC-LLM (Active request)
The robot proactively sends a voice avoidance request after detecting a collision risk.
It asks the pedestrian to move toward the side closer to them.
After learning that the pedestrian intends to move to the left, the robot promptly shifts to the opposite side to avert a collision and announces its next intended move.
Robot navigation is an important research field with applications in various domains. However, traditional approaches often prioritize efficiency and obstacle avoidance, neglecting a nuanced understanding of human behavior or intent in shared spaces. With the rise of service robots, there is increasing emphasis on endowing robots with the capability to navigate and interact in complex real-world environments, and socially aware navigation has recently become a key research area. However, existing work either predicts pedestrian movements or simply emits alert signals to pedestrians, falling short of facilitating genuine interaction between humans and robots. In this paper, we introduce the Hybrid Soft Actor-Critic with Large Language Model (HSAC-LLM), a novel model for socially aware robot navigation that seamlessly integrates deep reinforcement learning with large language models and concurrently handles the robot's continuous and discrete actions. When a collision risk with a pedestrian is detected, the robot initiates a conversation, either by emitting speech or by receiving the pedestrian's speech. After the dialogue concludes, the robot adjusts its motion based on the conversation's outcome. Experimental results in 2D simulation and Gazebo environments demonstrate that HSAC-LLM not only enables efficient interaction with humans but also outperforms state-of-the-art DRL algorithms in navigation and obstacle avoidance. We believe this innovative paradigm opens new avenues for effective and socially aware human-robot interaction in dynamic environments.
Raw observations from the environment, such as radar scans and pedestrian position and speed, are processed by Img_PreNet, built from convolutional and fully connected layers. The pedestrian's vocal messages are handled by Lang_PreNet, built with an LLM, and converted into action-state vectors. Finally, State_PreNet, a single fully connected layer, combines the previously processed data with the robot's position information into a 512-dimensional state vector used by the HSAC-LLM algorithm.
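The preprocessing pipeline above can be sketched as follows. This is a minimal, untrained illustration, not the paper's implementation: the layer sizes before the final 512-dimensional state, the 360-beam radar scan, the pedestrian state layout, and the action-code set are all assumptions for the sake of a runnable example.

```python
import numpy as np

rng = np.random.default_rng(0)

def fc(x, out_dim):
    """A single fully connected layer with random (untrained) weights and ReLU."""
    w = rng.standard_normal((x.shape[-1], out_dim)) * 0.01
    return np.maximum(x @ w, 0.0)

# Img_PreNet (sketched here as one FC layer): radar scan plus pedestrian
# position/speed -> intermediate feature (384-d is an assumed size)
radar = rng.standard_normal(360)        # e.g. one 360-beam lidar/radar scan
ped_state = rng.standard_normal(4)      # pedestrian (x, y, vx, vy), assumed layout
img_feat = fc(np.concatenate([radar, ped_state]), 384)

# Lang_PreNet: the LLM maps the pedestrian's utterance to an action-state
# vector; here a one-hot over an illustrative code set stands in for it
ACTION_CODES = {"move_left": 0, "move_right": 1, "stop": 2, "none": 3}
lang_feat = np.eye(len(ACTION_CODES))[ACTION_CODES["move_left"]]

# State_PreNet: one fully connected layer fuses everything with the robot pose
robot_pose = rng.standard_normal(3)     # (x, y, heading)
state = fc(np.concatenate([img_feat, lang_feat, robot_pose]), 512)
print(state.shape)  # (512,)
```

The only detail fixed by the text is the fusion step: State_PreNet is a single fully connected layer producing the 512-dimensional state consumed by HSAC-LLM.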
The robot's behavior involves both continuous and discrete elements. We categorize the robot's motion controls, its angular and linear velocities, as continuous actions, while post-interaction decisions are discrete actions. HSAC is used to model both kinds of actions jointly.
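A hybrid action head of this kind can be sketched as a Gaussian branch over the motion command and a categorical branch over the post-interaction decision. The weights are random and the decision names are illustrative; a trained HSAC policy would learn both branches.

```python
import numpy as np

rng = np.random.default_rng(1)
DECISIONS = ["shift_left", "shift_right", "stop", "keep_course"]  # assumed set

def hybrid_policy(state):
    """Illustrative hybrid action head over a shared state vector."""
    # Continuous branch: squashed Gaussian over (linear_v, angular_v),
    # as in standard SAC
    w_mu = rng.standard_normal((state.size, 2)) * 0.01
    mu, log_std = state @ w_mu, np.full(2, -1.0)
    cont_action = np.tanh(mu + np.exp(log_std) * rng.standard_normal(2))

    # Discrete branch: categorical over post-interaction decisions
    w_d = rng.standard_normal((state.size, len(DECISIONS))) * 0.01
    logits = state @ w_d
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    disc_action = DECISIONS[int(rng.choice(len(DECISIONS), p=probs))]
    return cont_action, disc_action

state = rng.standard_normal(512)
(v, w), decision = hybrid_policy(state)
```

The tanh squashing keeps the sampled velocities bounded, which is the usual SAC convention for continuous control.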
An LLM is introduced to serve as a conduit between discrete action codes and intuitive voice messages. Given prompts that encapsulate the contextual information and environment dynamics, together with dialogue examples, the LLM performs a two-way translation between action codes and the corresponding voice messages of both the pedestrian and the robot.
Proactive request made by the robot
Robot: "Could you move to your right please?"
Pedestrian: "Sure, I will do that!"
Proactive request made by the pedestrian
Pedestrian: "On your left!"
Robot: "I will shift to my right."
Proactive request made by the pedestrian
Pedestrian: "Moving towards your right side!"
Robot: "I shift to the left then."
Proactive request made by the pedestrian
Pedestrian: "On your right!"
Robot: "I will move to my left."
Proactive request made by the robot
Robot: "Could you please stop so I can pass?"
Pedestrian: "Sure, go ahead!"
Proactive request made by the pedestrian
Pedestrian: "Could you let me pass first?"
Robot: "Of course, I will stop."
Proactive request made by the pedestrian
Pedestrian: "I am already late for my work!"
Robot: "Ok I will stop."
Proactive request made by the pedestrian
Pedestrian: "I am in a hurry!"
Robot: "Ok, I will stop to let you pass."
@article{hsacllm2024,
  author  = {Congcong Wen and Yifan Liu and Wenyu Han and Geeta Chandra Raju Bethala and Zheng Peng and Yu-Shen Liu and Yi Fang},
  title   = {Exploring Socially Robot Navigation via LLM-Based Human Interaction},
  journal = {},
  year    = {2023},
}