Efficient Human-Robot Interaction via Deep Perception and Flexible Motion Planning
Abstract
Human-Robot Interaction (HRI) is an emerging field propelled by advances in artificial intelligence, yet achieving seamless human understanding and responsive robot control remains a significant challenge. This paper introduces an efficient, integrated HRI system for a custom-built dual-arm robot, presenting two primary contributions. First, we propose a deep-learning-based perception system that interprets human states: a Multi-task Cascaded Convolutional Neural Network (MTCNN) for robust face detection, a Deep Convolutional Neural Network (DCNN) for facial emotion recognition, and a Long Short-Term Memory (LSTM) network for dynamic gesture recognition from image sequences. Second, we detail a flexible dual-arm robot control system built on the Robot Operating System (ROS) that employs the Rapidly-exploring Random Tree (RRT) algorithm for efficient path planning, enabling the robot to translate recognized human cues into corresponding actions. Evaluations on benchmarks, including the WIDER FACE and FER2013 datasets, validate the perception models, and both simulation and physical experiments demonstrate high accuracy in perception and control. The results highlight the framework's effectiveness in producing fluid, responsive interactions in complex HRI scenarios.
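For concreteness, the sketch below illustrates the sequence-level stage of a perception pipeline like the one the abstract describes: an LSTM that classifies a dynamic gesture from per-frame feature vectors. This is a minimal illustration rather than the paper's implementation; the feature dimension, hidden size, gesture count, and the upstream MTCNN/DCNN stages are assumed placeholders.

```python
# Illustrative sketch of the gesture-recognition stage (assumptions throughout):
# per-frame embeddings from an upstream face/emotion CNN feed an LSTM classifier.
import torch
import torch.nn as nn

class GestureLSTM(nn.Module):
    """LSTM classifier over a sequence of per-frame feature vectors."""
    def __init__(self, feat_dim=128, hidden=64, n_gestures=5):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_gestures)

    def forward(self, seq):               # seq: (batch, frames, feat_dim)
        _, (h_n, _) = self.lstm(seq)      # h_n: (num_layers, batch, hidden)
        return self.head(h_n[-1])         # logits from the final hidden state

# Stand-ins for the upstream stages (hypothetical): an MTCNN would detect the
# face in each frame, and a DCNN would embed the crop into a 128-d vector.
model = GestureLSTM()
frames = torch.randn(1, 16, 128)          # 16 frames of 128-d features
gesture_id = model(frames).argmax(dim=-1)
print(gesture_id)
```

Using only the final hidden state for classification is one common design choice for sequence labeling of short clips; attention pooling over all time steps is a frequent alternative.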