Human as AI mentor: Enhanced human-in-the-loop reinforcement learning for safe and efficient autonomous driving

Zilin Huang; Zihao Sheng; Chengyuan Ma; Sikai Chen

doi:10.1016/j.commtr.2024.100127

Research Article | Open Access

Department of Civil and Environmental Engineering, University of Wisconsin-Madison, Madison, WI, 53706, USA

Abstract

Despite significant progress in autonomous vehicles (AVs), the development of driving policies that ensure both the safety of AVs and traffic flow efficiency has not yet been fully explored. In this paper, we propose an enhanced human-in-the-loop reinforcement learning method, termed the Human as AI mentor-based deep reinforcement learning (HAIM-DRL) framework, which facilitates safe and efficient autonomous driving in mixed traffic platoons. Drawing inspiration from the human learning process, we first introduce an innovative learning paradigm that effectively injects human intelligence into AI, termed Human as AI mentor (HAIM). In this paradigm, the human expert serves as a mentor to the AI agent. While the agent is allowed to explore uncertain environments sufficiently, the human expert can take control in dangerous situations and demonstrate correct actions to avoid potential accidents. In addition, the agent can be guided to minimize traffic flow disturbance, thereby optimizing traffic flow efficiency. In detail, HAIM-DRL uses data collected from free exploration and from partial human demonstrations as its two training sources. Remarkably, we circumvent the intricate process of manually designing reward functions; instead, we directly derive proxy state-action values from partial human demonstrations to guide the agent’s policy learning. Additionally, we employ a minimal intervention technique to reduce the human mentor’s cognitive load. Comparative results show that HAIM-DRL outperforms traditional methods in driving safety, sampling efficiency, mitigation of traffic flow disturbance, and generalizability to unseen traffic scenarios.
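The takeover mechanism described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function and class names (`collect_episode`, `Buffers`, `toy_agent`, `toy_mentor`, `toy_step`), the one-dimensional "gap to the lead vehicle" state, and the takeover threshold are all assumptions made for illustration. The sketch only shows how transitions might be routed into two training sources, free exploration and partial human demonstrations, without any hand-designed reward. The derivation of proxy state-action values and the minimal intervention technique are not shown.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# A transition is (state, applied_action, next_state); no reward is stored.
Transition = Tuple[float, float, float]


@dataclass
class Buffers:
    """Two training sources: free exploration and human demonstrations."""
    exploration: List[Transition] = field(default_factory=list)
    demonstrations: List[Transition] = field(default_factory=list)


def collect_episode(
    agent_policy: Callable[[float], float],
    mentor_policy: Callable[[float, float], Optional[float]],
    step_fn: Callable[[float, float], float],
    initial_state: float,
    horizon: int,
) -> Buffers:
    """Roll out one episode, letting the mentor override the agent.

    mentor_policy returns None when the mentor does not intervene,
    otherwise the action the mentor demonstrates.
    """
    buffers = Buffers()
    state = initial_state
    for _ in range(horizon):
        agent_action = agent_policy(state)
        mentor_action = mentor_policy(state, agent_action)
        took_over = mentor_action is not None
        applied = mentor_action if took_over else agent_action
        next_state = step_fn(state, applied)
        target = buffers.demonstrations if took_over else buffers.exploration
        target.append((state, applied, next_state))
        state = next_state
    return buffers


# Hypothetical toy setup: the state is the gap to the lead vehicle.
def toy_agent(gap: float) -> float:
    return -1.0  # always closes the gap by one unit


def toy_mentor(gap: float, agent_action: float) -> Optional[float]:
    # Assumed danger rule: take over and open the gap when it drops below 2.
    return 1.0 if gap < 2.0 else None


def toy_step(gap: float, action: float) -> float:
    return gap + action


if __name__ == "__main__":
    result = collect_episode(toy_agent, toy_mentor, toy_step, 4.0, 6)
    print(len(result.exploration), len(result.demonstrations))
```

In this toy rollout the mentor intervenes only when the gap falls below the assumed threshold, so most steps remain free exploration while the dangerous steps are recorded as demonstrations.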

Keywords

Human as AI mentor paradigm; Autonomous driving; Deep reinforcement learning; Human-in-the-loop learning; Driving policy; Mixed traffic platoon

Communications in Transportation Research

Volume 4, Issue 2, June 2024

Article number: 100127

DOI: 10.1016/j.commtr.2024.100127

Cite this article:

Huang Z, Sheng Z, Ma C, et al. Human as AI mentor: Enhanced human-in-the-loop reinforcement learning for safe and efficient autonomous driving. Communications in Transportation Research, 2024, 4(2): 100127. https://doi.org/10.1016/j.commtr.2024.100127

Received: 20 November 2023

Revised: 06 January 2024

Accepted: 07 January 2024

Published: 08 May 2024

This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).