Dynamic Coverage Meets Regret: Unifying Two Control Performance Measures for Mobile Agents in Spatiotemporally Varying Environments.
CDC
@inproceedings{DBLP:conf/cdc/HaydonMKPCMV21,
author = {Ben Haydon and
Kirti D. Mishra and
Patrick Keyantuo and
Dimitra Panagou and
Fotini K. Chow and
Scott J. Moura and
Chris Vermillion},
title = {Dynamic Coverage Meets Regret: Unifying Two Control Performance Measures
for Mobile Agents in Spatiotemporally Varying Environments},
booktitle = {2021 60th {IEEE} Conference on Decision and Control (CDC), Austin,
TX, USA, December 14-17, 2021},
pages = {521--526},
publisher = {{IEEE}},
year = {2021},
url = {https://doi.org/10.1109/CDC45484.2021.9682826},
doi = {10.1109/CDC45484.2021.9682826},
timestamp = {Tue, 17 May 2022 15:53:17 +0200},
biburl = {https://dblp.org/rec/conf/cdc/HaydonMKPCMV21.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Abstract
Numerous mobile robotic applications require agents to persistently explore and exploit spatiotemporally varying, partially observable environments. Ultimately, the mathematical notion of regret, which quite simply represents the instantaneous or time-averaged difference between the optimal reward and realized reward, serves as a meaningful measure of how well the agents have exploited the environment. However, while numerous theoretical regret bounds have been derived within the machine learning community, restrictions on the manner in which the environment evolves preclude their application to persistent missions. On the other hand, meaningful theoretical properties can be derived for the related concept of dynamic coverage, which serves as an exploration measurement but does not have an immediately intuitive connection with regret. In this paper, we demonstrate a clear correlation between an appropriately defined measure of dynamic coverage and regret, then go on to derive performance bounds on dynamic coverage as a function of the environmental parameters. We evaluate the correlation for several variants of an airborne wind energy system, for which the objective is to adjust the operating altitude in order to maximize power output in a spatiotemporally evolving wind field.
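The abstract describes regret as the instantaneous or time-averaged difference between the optimal reward and the realized reward. A minimal sketch of that arithmetic follows. The function names and the sample reward values are hypothetical and chosen only for illustration; they are not taken from the paper, which may define the reward and the averaging window differently.

```python
from typing import List, Sequence


def instantaneous_regret(optimal: Sequence[float],
                         realized: Sequence[float]) -> List[float]:
    """Per-step regret: optimal reward minus realized reward at each step."""
    if len(optimal) != len(realized):
        raise ValueError("reward sequences must have the same length")
    return [o - r for o, r in zip(optimal, realized)]


def time_averaged_regret(optimal: Sequence[float],
                         realized: Sequence[float]) -> float:
    """Mean of the per-step regret over the whole horizon."""
    regrets = instantaneous_regret(optimal, realized)
    if not regrets:
        raise ValueError("reward sequences must be non-empty")
    return sum(regrets) / len(regrets)


# Hypothetical rewards (e.g. power output at each time step); not real data.
optimal_reward = [10.0, 12.0, 11.0, 9.0]
realized_reward = [8.0, 12.0, 10.0, 6.0]

per_step = instantaneous_regret(optimal_reward, realized_reward)
average = time_averaged_regret(optimal_reward, realized_reward)
print(per_step)  # [2.0, 0.0, 1.0, 3.0]
print(average)   # 1.5
```

In this sketch a regret of zero at a given step means the agent matched the best achievable reward at that step, so lower time-averaged regret indicates better exploitation of the environment.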