Intentionally-underestimated Value Function at Terminal State for Temporal-difference Learning with Mis-designed Reward. Kobayashi, T. 2023. (submitted for publication)
@misc{kobayashi2023intentionallyunderestimated,
      title={Intentionally-underestimated Value Function at Terminal State for Temporal-difference Learning with Mis-designed Reward},
      author={Taisuke Kobayashi},
      year={2023},
      eprint={2308.12772},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2308.12772},
      note={(submitted for publication)},
      youtube={https://youtu.be/AxXr8uFOe7M},
}

%RSJ seminar 2023/07/06
