Bolei Zhou

Assistant Professor
Department of Information Engineering, The Chinese University of Hong Kong
Office: Room 717, Ho Sin-Hang Engineering Building
Email:
CV | Google Scholar | Github | Linkedin | Zhihu


Research

  • My research is on computer vision and machine learning, with a particular interest in visual scene understanding and interpretable AI systems.

News

All Publications
Representative Publications
2018
Semantic Understanding of Scenes through ADE20K Dataset.
Bolei Zhou, Hang Zhao, Xavier Puig, Tete Xiao, Sanja Fidler, Adela Barriuso and Antonio Torralba.
International Journal of Computer Vision (IJCV), 2018.
[PDF][Dataset][Pretrained Models][Benchmark Page][Demo]
Revisiting the Importance of Individual Units in CNNs via Ablation.
Bolei Zhou, Yiyou Sun, David Bau, and Antonio Torralba
arXiv:1806.02891, 2018.
[arXiv]
Temporal Relational Reasoning in Videos.
Bolei Zhou, Alex Andonian, Aude Oliva, and Antonio Torralba
European Conference on Computer Vision (ECCV), 2018.
[PDF][arXiv][Webpage][Demo Video][Code][MIT News]
Interpretable Basis Decomposition for Visual Explanation.
Bolei Zhou*, Yiyou Sun*, David Bau*, Antonio Torralba.
European Conference on Computer Vision (ECCV), 2018.
[PDF][Code(soon)]
Unified Perceptual Parsing for Scene Understanding.
Tete Xiao*, Yingcheng Liu*, Bolei Zhou*, Yuning Jiang, Jian Sun
European Conference on Computer Vision (ECCV), 2018.
[PDF][Code & Data]
Single Image Intrinsic Decomposition without a Single Intrinsic Image.
Wei-Chiu Ma, Hang Chu, Bolei Zhou, Raquel Urtasun, Antonio Torralba.
European Conference on Computer Vision (ECCV), 2018.
[PDF(soon)]
Factorizable Net: An Efficient Subgraph based Framework for Scene Graph Generation.
Yikang Li, Wanli Ouyang, Bolei Zhou, Yawen Cui, Jianping Shi, Xiaogang Wang.
European Conference on Computer Vision (ECCV), 2018.
[PDF]
Real-Time Object Pose Estimation with Pose Interpreter Networks.
Jimmy Wu, Bolei Zhou, Rebecca Russell, Vincent Kee, Syler Wagner, Mitchell Hebert, Antonio Torralba, and David M.S. Johnson
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018.
[PDF][Code][Video]
Interpretable Representation Learning for Visual Intelligence.
Bolei Zhou
PhD thesis submitted to MIT EECS, May 17, 2018.
Committee: Antonio Torralba, Aude Oliva, Bill Freeman.
[PDF][Defense Talk]
DeepMiner: Discovering Interpretable Representations for Mammogram Classification and Explanation.
Jimmy Wu, Bolei Zhou, Diondra Peck, Scott Hsieh, Vandana Dialani, Vasilis Syrgkanis, Lester Mackey, and Genevieve Patterson
arXiv:1805.12323, 2018.
[arXiv]
Expert identification of visual primitives used by CNNs during mammogram classification.
Jimmy Wu, Diondra Peck, Scott Hsieh, Vandana Dialani, Constance D. Lehman, Bolei Zhou, Vasilis Syrgkanis, Lester Mackey, and Genevieve Patterson
SPIE Medical Imaging, 2018.
[PDF]
Visual Question Generation as Dual Task of Visual Question Answering.
Yikang Li, Nan Duan, Bolei Zhou, Xiao Chu, Wanli Ouyang, and Xiaogang Wang
Computer Vision and Pattern Recognition (CVPR), 2018, spotlight.
[arXiv][Webpage][Code]
Recurrent Residual Module for Fast Inference in Videos.
Bowen Pan, Wuwei Lin, Xiaolin Fang, Chaoqin Huang, Bolei Zhou, Cewu Lu
Computer Vision and Pattern Recognition (CVPR), 2018.
[arXiv]
Moments in Time Dataset: one million videos for event understanding.
Mathew Monfort, Bolei Zhou, Sarah Adel Bargal, Alex Andonian, Tom Yan, Kandan Ramakrishnan, Lisa Brown, Quanfu Fan, Dan Gutfreund, Carl Vondrick, Aude Oliva.
Under revision at TPAMI, arXiv:1801.03150, 2018.
[Tech Report][Website][Code+Model]
Interpreting Deep Visual Representations via Network Dissection.
Bolei Zhou*, David Bau*, Aude Oliva, and Antonio Torralba.
IEEE Transactions on Pattern Analysis and Machine Intelligence, June 2018. * indicates equal contribution.
[arXiv][Webpage][Code]
2017
Places: A 10 Million Image Database for Scene Recognition.
Bolei Zhou, Agata Lapedriza, Aditya Khosla, Aude Oliva, and Antonio Torralba.
IEEE Transactions on Pattern Analysis and Machine Intelligence, July 2017.
[PDF][Places2 Dataset][Challenge Page][Places365 CNN models][Demo]
Scene Graph Generation from Objects, Phrases and Region Captions.
Yikang Li, Wanli Ouyang, Bolei Zhou, Kun Wang, and Xiaogang Wang
International Conference on Computer Vision (ICCV), 2017.
[PDF][Code]
Open Vocabulary Scene Parsing.
Hang Zhao, Xavier Puig, Bolei Zhou, Sanja Fidler, and Antonio Torralba
International Conference on Computer Vision (ICCV), 2017.
(arXiv:1703.08769).
[PDF][arXiv][Webpage]
Scene Parsing through ADE20K Dataset.
Bolei Zhou, Hang Zhao, Xavier Puig, Sanja Fidler, Adela Barriuso and Antonio Torralba.
Computer Vision and Pattern Recognition (CVPR), 2017.
[PDF][Dataset][Benchmark Page][Challenge Page][Toolkit&Code][Demo]
Network Dissection: Quantifying Interpretability of Deep Visual Representations.
David Bau*, Bolei Zhou*, Aditya Khosla, Aude Oliva, and Antonio Torralba.
Computer Vision and Pattern Recognition (CVPR), 2017, oral. * indicates equal contribution.
[PDF][arXiv][webpage][code][Talk Video]
Person Search with Natural Language Description.
Shuang Li, Tong Xiao, Hongsheng Li, Bolei Zhou, Dayu Yue, and Xiaogang Wang.
Computer Vision and Pattern Recognition (CVPR), 2017.
[PDF][Dataset]
SegICP: Integrated Deep Semantic Segmentation and Pose Estimation.
J. Wong, V. Kee, T. Le, S. Wagner, G. Mariottini, A. Schneider, L. Hamilton, R. Chipalkatty, M. Hebert, D. Johnson, J. Wu, B. Zhou, and A. Torralba.
IEEE International Conference on Intelligent Robots and Systems (IROS), 2017, oral (arXiv:1703.01661).
[PDF]
2016
Learning Deep Features for Discriminative Localization.
Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba
Computer Vision and Pattern Recognition (CVPR), 2016.
[PDF] [arXiv][Project Page][Video of CNN shifting its attention]
Optimization as Estimation with Gaussian Processes in Bandit Settings.
Zi Wang, Bolei Zhou, Stefanie Jegelka
Artificial Intelligence and Statistics (AISTATS'16) as oral, 2016.
[PDF][Project][Code]
2015
Understanding Intra-Class Knowledge inside CNN.
Donglai Wei, Bolei Zhou, Antonio Torralba, William Freeman
arXiv:1507.02379, 2015.
[PDF][Page][Code]
Simple Baseline for Visual Question Answering.
Bolei Zhou, Yuandong Tian, Sainbayar Sukhbaatar, Arthur Szlam, Rob Fergus
arXiv:1512.02167, 2015.
[PDF][Demo][Code]
Object Detectors Emerge in Deep Scene CNNs.
Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba
International Conference on Learning Representations (ICLR) as oral, 2015.
[PDF][Project Page][More Visualization][Code]
ConceptLearner: Discovering Visual Concepts from Weakly Labeled Image Collections.
Bolei Zhou, Vignesh Jagadeesh, and Robinson Piramuthu
Computer Vision and Pattern Recognition (CVPR), 2015.
[PDF][Project Page & Demo]
2014
Learning Deep Features for Scene Recognition using Places Database.
Bolei Zhou, Agata Lapedriza, Jianxiong Xiao, Antonio Torralba, and Aude Oliva
Advances in Neural Information Processing Systems 27 (NIPS) spotlight, 2014.
[PDF][Project Page][Demo]
Recognizing City Identity via Attribute Analysis of Geo-tagged Images.
Bolei Zhou, Liu Liu, Aude Oliva and Antonio Torralba
Proceedings of the 13th European Conference on Computer Vision (ECCV), 2014.
[PDF][Project Page]
C-IMAGE: City Cognitive Mapping through Geo-tagged Photos.
Liu Liu, Bolei Zhou, Jinhua Zhao, Brent D. Ryan
GeoJournal, Springer, 2016.
[PDF]
Measuring Crowd Collectiveness.
Bolei Zhou, Xiaoou Tang, Hepeng Zhang and Xiaogang Wang
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2014.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) oral, 2013.
[PDF(CVPR)][PDF(TPAMI)][Project Page]
2013 and earlier
Learning Collective Crowd Behaviors with Dynamic Pedestrian-Agents.
Bolei Zhou, Xiaoou Tang and Xiaogang Wang.
International Journal of Computer Vision (IJCV), 2014.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) oral, 2012.
[PDF(CVPR)] [PDF(IJCV)][Project Page]
Coherent Filtering: Detecting Coherent Motions from Crowd Clutters.
Bolei Zhou, Xiaoou Tang and Xiaogang Wang.
Proceedings of the 12th European Conference on Computer Vision (ECCV), 2012.
[PDF] [Project Page]
Random Field Topic Model for Semantic Region Analysis in Crowded Scenes from Tracklets.
Bolei Zhou, Xiaogang Wang and Xiaoou Tang.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011.
[PDF][Project Page]

Students

Honors

Media coverage

Datasets & Benchmarks

Open-source software

Professional activities

Talks

Personal interests