Yuping Luo, Huazhe Xu, Tengyu Ma.
Learning Self-Correctable Policies and Value Functions from Demonstrations with Negative Sampling,
International Conference on Learning Representations (ICLR), 2020.
Yang Gao*, Huazhe Xu*, Fisher Yu, Sergey Levine, Trevor Darrell.
Reinforcement Learning from Imperfect Demonstrations,
NeurIPS 2018 Deep RL Symposium (NeurIPS Symposium), 2018.
Haoran Tang*, Dennis Lee*, Jeffrey O Zhang, Huazhe Xu, Trevor Darrell, Pieter Abbeel.
Modular Architecture for StarCraft II with Deep Reinforcement Learning,
The 14th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE), 2018.