Moving Person Detection based on modified YOLOv5

Keywords: CBAM attention mechanism, Bidirectional Feature Pyramid Network, occlusion, dynamic SLAM, YOLO detector, person detection

Abstract

Visual dynamic SLAM (Simultaneous Localization and Mapping) is a fundamental technology for intelligent mobile systems, with applications in robotics, augmented reality, and self-driving cars. This paper presents an approach to improving visual dynamic SLAM by integrating the YOLOv5 object detection framework with the CBAM (Convolutional Block Attention Module) attention mechanism and a BiFPN (Bidirectional Feature Pyramid Network) structure. Feature points that fall inside the bounding box of a detected dynamic object are removed in the tracking thread, so that only static feature points are used to estimate the camera pose. The primary focus is on improving the detection of dynamic objects, particularly persons, under challenging conditions such as occlusion and small object size. The proposed approach enhances person detection, mitigates the difficulties posed by small and occluded objects, and improves the overall performance of the system in dynamic environments. These findings indicate that the method is effective and has potential for real-world use in the application domains listed above.
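The removal of dynamic feature points described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `static_mask`, the `(x1, y1, x2, y2)` pixel box format, and the choice to treat points on a box edge as dynamic are all assumptions made for illustration. In a real pipeline the boxes would come from the person detector and the keypoints from the SLAM front end.

```python
import numpy as np


def static_mask(keypoints, boxes):
    """Return a boolean mask that is True for keypoints outside every box.

    keypoints: array-like of shape (N, 2), pixel coordinates (x, y).
    boxes:     array-like of shape (M, 4), assumed format (x1, y1, x2, y2).
    Both the name and the box format are hypothetical, chosen for this sketch.
    """
    keypoints = np.asarray(keypoints, dtype=float).reshape(-1, 2)
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)

    # With no detections, every keypoint is treated as static.
    if boxes.shape[0] == 0:
        return np.ones(len(keypoints), dtype=bool)

    x = keypoints[:, 0:1]  # shape (N, 1), broadcasts against (M,)
    y = keypoints[:, 1:2]

    # inside[i, j] is True when keypoint i lies in box j (edges included).
    inside = (
        (x >= boxes[:, 0]) & (x <= boxes[:, 2])
        & (y >= boxes[:, 1]) & (y <= boxes[:, 3])
    )

    # A keypoint is static only if it lies in none of the boxes.
    return ~inside.any(axis=1)


# Example: one detected person box and four keypoints.
keypoints = np.array([[10, 10], [50, 50], [120, 80], [200, 200]])
person_boxes = np.array([[40, 40, 130, 100]])
mask = static_mask(keypoints, person_boxes)
static_keypoints = keypoints[mask]  # only these would feed pose estimation
```

Discarding every point inside a box is a coarse filter, since static background pixels visible within the box are dropped as well; the sketch accepts that trade-off for simplicity.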

Published
2024-01-22