Studies in Computational Intelligence, volume 856, pages 62-70
Hierarchical actor-critic with hindsight for mobile robot with continuous state space
Staroverov Aleksey, Panov Aleksandr I
Publication type: Book Chapter
Publication date: 2019-09-04
Quartile SCImago: Q4
Quartile WOS
—
Impact factor: —
ISSN: 1860949X, 18609503
Abstract
Hierarchies are used in reinforcement learning to increase learning speed in sparse-reward tasks. In such tasks, the main problem is the time the initial policy needs to reach the goal during the first steps. Hierarchies can split a problem into a set of subproblems that can be solved in less time. To implement this idea, Hierarchical Reinforcement Learning (HRL) algorithms need to learn the multiple levels of a hierarchy in parallel, so that these smaller subproblems can be solved at the same time. Most well-known HRL algorithms that can learn multi-level hierarchies cannot efficiently learn the levels of policies simultaneously, especially in environments with continuous state and action spaces. To address this problem, we analyzed the newest existing framework, Hierarchical Actor-Critic with Hindsight (HAC), tested it in a simulated mobile robot environment, and determined the optimal configuration of parameters and ways to encode information about the environment states.
Citations by journals
- Lecture Notes in Networks and Systems: 1 publication, 50%
- IEEE Transactions on Intelligent Transportation Systems: 1 publication, 50%
Citations by publishers
- Springer Nature: 1 publication, 50%
- IEEE: 1 publication, 50%
- We do not take into account publications without a DOI.
- Statistics are recalculated only for publications connected to researchers, organizations, and labs registered on the platform.
- Statistics are recalculated weekly.
Citations per year
- 2021: 1
- 2022: 0
- 2023: 1
GOST
Staroverov A., Panov A. I. Hierarchical actor-critic with hindsight for mobile robot with continuous state space // Studies in Computational Intelligence. 2019. Vol. 856. pp. 62-70.
RIS
TY - GENERIC
DO - 10.1007/978-3-030-30425-6_6
UR - https://doi.org/10.1007%2F978-3-030-30425-6_6
TI - Hierarchical actor-critic with hindsight for mobile robot with continuous state space
T2 - Studies in Computational Intelligence
AU - Staroverov, Aleksey
AU - Panov, Aleksandr I
PY - 2019
DA - 2019/09/04 00:00:00
PB - Springer Nature
SP - 62-70
VL - 856
SN - 1860-949X
SN - 1860-9503
ER -
BibTex
@incollection{2019_Staroverov,
author = {Aleksey Staroverov and Aleksandr I Panov},
title = {Hierarchical actor-critic with hindsight for mobile robot with continuous state space},
series = {Studies in Computational Intelligence},
publisher = {Springer Nature},
year = {2019},
volume = {856},
pages = {62--70},
month = {sep},
doi = {10.1007/978-3-030-30425-6_6}
}