Continuous control algorithms for conveyer belt routing based on multi-agent deep reinforcement learning

Yaroslav Sergeevich Zhurba; Andrei Alexandrovich Filchenkov; Arthur Alexandrovich Azarov; Anatoly Abramovich Shalyto

doi:10.31799/1684-8853-2022-6-10-19

Zhurba Yaroslav, Saint-Petersburg National Research University of Information Technologies, Mechanics and Optics, https://orcid.org/0000-0003-3281-9216
Filchenkov Andrei, Saint-Petersburg National Research University of Information Technologies, Mechanics and Optics, https://orcid.org/0000-0002-1133-8432
Azarov Arthur, Saint-Petersburg National Research University of Information Technologies, Mechanics and Optics, https://orcid.org/0000-0003-3240-597X
Shalyto Anatoly, Saint-Petersburg National Research University of Information Technologies, Mechanics and Optics, https://orcid.org/0000-0002-2723-2077

Keywords:

routing, multi-agent learning, reinforcement learning, conveyor belt

Abstract

Introduction: We consider the problem of routing piece cargo through a conveyor system. When moving cargo pieces, it is necessary to minimize not only the transportation time but also the energy spent on it. Purpose: To develop a routing algorithm that adapts to changes in the topology of the routing graph and can optimize both delivery time and energy consumption. Results: We propose an algorithm based on multi-agent deep reinforcement learning that places agents at the vertices of the conveyor network graph and uses a new state value function. The algorithm has two tunable parameters: the length of the path along which the state value function is calculated, and the learning coefficient. Parameter selection showed that the optimal values are 2 and 1, respectively. An experimental study of the algorithm on a simulation model showed that it reduces the number of collisions between moving objects to zero, gives stable results for both optimized metrics, and consumes less energy than the baseline method. Practical relevance: The proposed algorithm can be used to reduce delivery time and energy consumption when managing conveyor systems.
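
As a rough illustration of the idea described in the Results, the Python sketch below places one agent at each vertex of a small conveyor graph and scores routes by a cost accumulated along a lookahead path of fixed length. Everything in it is an assumption made for illustration: the toy graph, the combined time-plus-energy cost, the names `edge_cost`, `path_value`, `choose_next` and `update`, and the update rule. The abstract does not specify the actual state value function or the network architecture, and the per-vertex table used here merely stands in for a deep network.

```python
# Hypothetical toy conveyor graph: vertex -> {neighbour: (time, energy)}.
GRAPH = {
    "A": {"B": (1.0, 2.0), "C": (2.0, 0.5)},
    "B": {"D": (1.0, 2.0)},
    "C": {"D": (1.0, 0.5)},
    "D": {},  # delivery point, no outgoing sections
}


def edge_cost(time, energy, energy_weight=1.0):
    """Assumed scalar cost combining transport time and energy."""
    return time + energy_weight * energy


def path_value(graph, values, vertex, depth):
    """Cheapest cost over paths of at most `depth` edges from `vertex`,
    plus the current estimate stored at the vertex where the path ends."""
    if depth == 0 or not graph[vertex]:
        return values[vertex]
    return min(
        edge_cost(t, e) + path_value(graph, values, nxt, depth - 1)
        for nxt, (t, e) in graph[vertex].items()
    )


def choose_next(graph, values, vertex, depth):
    """The agent at `vertex` picks the neighbour with the lowest lookahead cost."""
    return min(
        graph[vertex],
        key=lambda nxt: edge_cost(*graph[vertex][nxt])
        + path_value(graph, values, nxt, depth - 1),
    )


def update(graph, values, vertex, depth, alpha):
    """Move the stored estimate towards the lookahead target by a factor alpha."""
    target = path_value(graph, values, vertex, depth)
    values[vertex] += alpha * (target - values[vertex])


# One estimate per vertex (one agent per vertex), initialised to zero.
values = {v: 0.0 for v in GRAPH}
PATH_LENGTH = 2   # path length reported as optimal in the abstract
ALPHA = 1.0       # learning coefficient reported as optimal in the abstract

update(GRAPH, values, "A", PATH_LENGTH, ALPHA)
next_vertex = choose_next(GRAPH, values, "A", PATH_LENGTH)
```

In this toy graph the route through `C` is slower on its first section but cheaper overall once energy is counted, so the agent at `A` prefers it. With a learning coefficient of 1, the stored estimate is replaced outright by the lookahead target.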

Information and control systems

Continuous control algorithms for conveyer belt routing based on multi-agent deep reinforcement learning
