I am a first-year Ph.D. student in AI at UC Berkeley, in Prof. Trevor Darrell's group. I was previously an undergraduate student in the Department of Electronic Engineering, School of Information Science and Technology, Tsinghua University. I studied as an exchange student in the Electrical and Computer Engineering Department, University of Toronto, in Fall 2014.
I have worked in the Center for Intelligent Image and Document Information Processing, advised by Prof. Qi Tian (UTSA) and Prof. Liangrui Peng, since 2014. In Fall 2014, I worked in the Machine Learning Group at the University of Toronto, co-advised by Prof. Sanja Fidler and Prof. Raquel Urtasun. In Summer 2015, I worked at the Berkeley Vision and Learning Center (BVLC) and the International Computer Science Institute (ICSI) as a research assistant, advised by Prof. Trevor Darrell and Prof. Jiashi Feng.
My research interests lie in computer vision, deep learning, machine learning and their applications.
Aug. 2012 - Jul. 2016 (Expected), Department of Electronic Engineering, Tsinghua University,
Bachelor of Engineering, GPA: 93/100, ranking: 5/238.
Average of Math and Math-Related Courses: 95.4/100.
Aug. 2014 - Dec. 2014, School of Electrical and Computer Engineering, University of Toronto,
Exchange Student, GPA: 4.0/4.0.
Jul. 2015 - Sept. 2015, Department of Electrical and Computer Engineering, University of California, Berkeley,
Weakly Supervised Deep Scene Parsing with Attributes,
Huazhe Xu, Jiashi Feng, Trevor Darrell,
Jul 2015 - Nov 2015, UC Berkeley,
In this work, we consider the challenging problem of automatically parsing scene images with only weakly supervised information. The core technique we develop to solve this problem is a novel deep convolutional network architecture built on the fully convolutional network (FCN) model, termed the category-attribute mutually boosted FCN (CAM-FCN). We formulate the problem of training CAM-FCN under non-linear global constraints as an alternating optimization problem by decoupling the update procedures of the two sub-FCNs. Extensive experiments on benchmark datasets demonstrate that CAM-FCN outperforms well-established baselines on the weakly supervised scene parsing task.
Natural Language Object Retrieval with RNN,
Ronghang Hu, Huazhe Xu, Marcus Rohrbach, Jiashi Feng, Kate Saenko and Trevor Darrell,
Jul 2015 - Nov 2015, UC Berkeley,
We address the task of natural language object retrieval: localizing a target object within a given image based on a natural language query about the object. Natural language object retrieval differs from the text-based image retrieval task in that it involves spatial information about objects within the scene and global scene context. To address this issue, we propose a novel Spatial Context Recurrent ConvNet (SCRC) model as a scoring function on candidate boxes for object retrieval, integrating spatial configurations and global scene-level contextual information into the network.
Automobile Visual Taste Ranking,
Huazhe Xu, Sanja Fidler, Raquel Urtasun,
Fall 2015, University of Toronto,
This project explores human visual taste in automobiles. We find that most recommendation systems lack personalization and customized service. We combine visual image features and a well-established survey questionnaire with RankSVM and CNN techniques to detect customers' personalized visual preferences.
R. Hu, H. Xu, M. Rohrbach, J. Feng, K. Saenko, T. Darrell, Natural Language Object Retrieval. In CVPR 2016 (oral). [pdf]
H. Xu*, Y. Gao*, F. Yu, T. Darrell, End-to-end Learning of Driving Models from Large-scale Video Datasets. In CVPR 2017 (oral). [pdf]
Tian Xie, Qian Han, Huazhe Xu, Zihao Qi, Wenqian Shen. A Low-Complexity Linear Precoding Scheme Based on SOR Method for Massive MIMO Systems. 2015 IEEE 81st Vehicular Technology Conference (VTC Spring), pp. 1-5, 11-14 May 2015. [pdf]
Comprehensive Excellent Scholarship, Tsinghua Univ, 2013
Academic Excellent Scholarship, Tsinghua Univ, 2014
Academic Excellent Scholarship, Tsinghua Univ, 2015
Our paper 'End-to-end Learning of Driving Models from Large-scale Video Datasets' was featured on 'New_era' WeChat Media.
Served as a reviewer for IEEE Transactions on Multimedia.