Bolei Zhou

Assistant Professor
Department of Information Engineering, The Chinese University of Hong Kong
Office: Room 717, Ho Sin-Hang Engineering Building
Email:
CV | Google Scholar | GitHub | Twitter | Zhihu


Research Highlight

News

Preprint/Recent work

All Publications
Selected Publications
2019
Policy Continuation with Hindsight Inverse Dynamics.
Hao Sun, Zhizhong Li, Xiaotong Liu, Dahua Lin, Bolei Zhou.
Advances in Neural Information Processing Systems (NeurIPS), 2019, Spotlight
[PDF][Webpage]
Seeing What a GAN Cannot Generate.
David Bau, Jun-Yan Zhu, Jonas Wulff, William Peebles, Hendrik Strobelt, Bolei Zhou, Antonio Torralba.
International Conference on Computer Vision (ICCV), 2019, Oral
[PDF][Webpage]
A Graph-Based Framework to Bridge Movies and Synopses.
Yu Xiong, Qingqiu Huang, Lingfen Guo, Hang Zhou, Bolei Zhou, Dahua Lin.
International Conference on Computer Vision (ICCV), 2019, Oral
[PDF]
Reasoning About Human-Object Interactions Through Dual Attention Networks.
Tete Xiao, Quanfu Fan, Dan Gutfreund, Mathew Monfort, Aude Oliva, Bolei Zhou.
International Conference on Computer Vision (ICCV), 2019
[PDF][Webpage]
Semantic Photo Manipulation with a Generative Image Prior.
David Bau, Hendrik Strobelt, William Peebles, Jonas Wulff, Bolei Zhou, Jun-Yan Zhu, Antonio Torralba.
ACM Transactions on Graphics (TOG), SIGGRAPH 2019
[PDF][Webpage][Live Demo][MIT News]
Deep Flow-Guided Video Inpainting.
Rui Xu, Xiaoxiao Li, Bolei Zhou, Chen Change Loy.
Computer Vision and Pattern Recognition (CVPR), 2019
[PDF][Webpage][Code]
DrivingStereo: A large-scale dataset for stereo matching in autonomous driving scenarios.
Guorun Yang*, Xiao Song*, Chaoqin Huang, Zhidong Deng, Jianping Shi, Bolei Zhou.
Computer Vision and Pattern Recognition (CVPR), 2019
[PDF][Dataset]
GAN Dissection: Visualizing and Understanding Generative Adversarial Networks.
David Bau, Jun-Yan Zhu, Hendrik Strobelt, Bolei Zhou, Joshua B. Tenenbaum, William T. Freeman, Antonio Torralba.
International Conference on Learning Representations (ICLR), 2019.
[PDF][Webpage][Code]
Discovering place-informative scenes and objects using social media photos.
Fan Zhang, Bolei Zhou, Carlo Ratti, Yu Liu.
Royal Society Open Science, 2019
[PDF]
Measuring human perceptions of a large-scale urban region using machine learning.
Fan Zhang, Bolei Zhou, Liu Liu, Yu Liu, Helene Fung, Hui Lin, Carlo Ratti.
Landscape and Urban Planning, 2018
[PDF]
Moments in Time Dataset: one million videos for event understanding.
Mathew Monfort, Alex Andonian, Bolei Zhou, Sarah Adel Bargal, Tom Yan, Kandan Ramakrishnan, Lisa Brown, Quanfu Fan, Dan Gutfreund, Carl Vondrick, Aude Oliva.
IEEE Transactions on Pattern Analysis and Machine Intelligence, March 2019.
[PDF][Website][Code+Model]
2018
Semantic Understanding of Scenes through ADE20K Dataset.
Bolei Zhou, Hang Zhao, Xavier Puig, Tete Xiao, Sanja Fidler, Adela Barriuso and Antonio Torralba.
International Journal of Computer Vision (IJCV), 2018.
[PDF][Dataset][Pretrained Models][Benchmark Page][Demo]
Revisiting the Importance of Individual Units in CNNs via Ablation.
Bolei Zhou, Yiyou Sun, David Bau, and Antonio Torralba
arXiv:1806.02891, 2018.
[arXiv]
Temporal Relational Reasoning in Videos.
Bolei Zhou, Alex Andonian, Aude Oliva, and Antonio Torralba
European Conference on Computer Vision (ECCV), 2018.
[PDF][arXiv][Webpage][Demo Video][Code][MIT News]
Interpretable Basis Decomposition for Visual Explanation.
Bolei Zhou*, Yiyou Sun*, David Bau*, Antonio Torralba.
European Conference on Computer Vision (ECCV), 2018.
[PDF][Code]
Unified Perceptual Parsing for Scene Understanding.
Tete Xiao*, Yingcheng Liu*, Bolei Zhou*, Yuning Jiang, Jian Sun
European Conference on Computer Vision (ECCV), 2018.
[PDF][Code & Data]
Single Image Intrinsic Decomposition without a Single Intrinsic Image.
Wei-Chiu Ma, Hang Chu, Bolei Zhou, Raquel Urtasun, Antonio Torralba.
European Conference on Computer Vision (ECCV), 2018.
[PDF]
Factorizable Net: An Efficient Subgraph based Framework for Scene Graph Generation.
Yikang Li, Wanli Ouyang, Bolei Zhou, Yawen Cui, Jianping Shi, Xiaogang Wang.
European Conference on Computer Vision (ECCV), 2018.
[PDF]
Real-Time Object Pose Estimation with Pose Interpreter Networks.
Jimmy Wu, Bolei Zhou, Rebecca Russell, Vincent Kee, Syler Wagner, Mitchell Hebert, Antonio Torralba, and David M.S. Johnson
International Conference on Intelligent Robots and Systems (IROS), 2018.
[PDF][Code][Video]
Interpretable Representation Learning for Visual Intelligence.
Bolei Zhou
PhD thesis submitted to MIT EECS, May 17, 2018.
Committee: Antonio Torralba, Aude Oliva, Bill Freeman.
[PDF][Defense Talk]
DeepMiner: Discovering Interpretable Representations for Mammogram Classification and Explanation.
Jimmy Wu, Bolei Zhou, Diondra Peck, Scott Hsieh, Vandana Dialani, Vasilis Syrgkanis, Lester Mackey, and Genevieve Patterson
arXiv:1805.12323, 2018.
[arXiv]
Expert identification of visual primitives used by CNNs during mammogram classification.
Jimmy Wu, Diondra Peck, Scott Hsieh, Vandana Dialani, Constance D. Lehman, Bolei Zhou, Vasilis Syrgkanis, Lester Mackey, and Genevieve Patterson
SPIE Medical Imaging, 2018.
[PDF]
Visual Question Generation as Dual Task of Visual Question Answering.
Yikang Li, Nan Duan, Bolei Zhou, Xiao Chu, Wanli Ouyang, and Xiaogang Wang
Computer Vision and Pattern Recognition (CVPR), 2018, spotlight.
[arXiv][Webpage][Code]
Recurrent Residual Module for Fast Inference in Videos.
Bowen Pan, Wuwei Lin, Xiaolin Fang, Chaoqin Huang, Bolei Zhou, Cewu Lu
Computer Vision and Pattern Recognition (CVPR), 2018.
[arXiv]
Interpreting Deep Visual Representations via Network Dissection.
Bolei Zhou*, David Bau*, Aude Oliva, and Antonio Torralba.
IEEE Transactions on Pattern Analysis and Machine Intelligence, June 2018. *-indicates equal contributions
[arXiv][Webpage][Code]
2017
Places: A 10 Million Image Database for Scene Recognition.
Bolei Zhou, Agata Lapedriza, Aditya Khosla, Aude Oliva, and Antonio Torralba.
IEEE Transactions on Pattern Analysis and Machine Intelligence, July 2017.
[PDF][Places2 Dataset][Challenge Page][Places365 CNN models][Demo]
Scene Graph Generation from Objects, Phrases and Region Captions.
Yikang Li, Wanli Ouyang, Bolei Zhou, Kun Wang, and Xiaogang Wang
International Conference on Computer Vision (ICCV), 2017.
[PDF][Code]
Open Vocabulary Scene Parsing.
Hang Zhao, Xavier Puig, Bolei Zhou, Sanja Fidler, and Antonio Torralba
International Conference on Computer Vision (ICCV), 2017.
(arXiv:1703.08769).
[PDF][arXiv][Webpage]
Scene Parsing through ADE20K Dataset.
Bolei Zhou, Hang Zhao, Xavier Puig, Sanja Fidler, Adela Barriuso and Antonio Torralba.
Computer Vision and Pattern Recognition (CVPR), 2017.
[PDF][Dataset][Benchmark Page][Challenge Page][Toolkit&Code][Demo]
Network Dissection: Quantifying Interpretability of Deep Visual Representations.
David Bau*, Bolei Zhou*, Aditya Khosla, Aude Oliva, and Antonio Torralba.
Computer Vision and Pattern Recognition (CVPR), 2017, as oral. *-indicates equal contribution.
[PDF][arXiv][webpage][code][Talk Video]
Person Search with Natural Language Description.
Shuang Li, Tong Xiao, Hongsheng Li, Bolei Zhou, Dayu Yue, and Xiaogang Wang.
Computer Vision and Pattern Recognition (CVPR), 2017.
[PDF][Dataset]
SegICP: Integrated Deep Semantic Segmentation and Pose Estimation.
J. Wong, V. Kee, T. Le, S. Wagner, G. Mariottini, A. Schneider, L. Hamilton, R. Chipalkatty, M. Hebert, D. Johnson, J. Wu, B. Zhou, and A. Torralba.
IEEE International Conference on Intelligent Robots and Systems (IROS'17) as Oral (arXiv:1703.01661)
[PDF]
2016
Learning Deep Features for Discriminative Localization.
Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba
Computer Vision and Pattern Recognition (CVPR), 2016.
[PDF] [arXiv][Project Page][Video of CNN shifting its attention]
Optimization as Estimation with Gaussian Processes in Bandit Settings.
Zi Wang, Bolei Zhou, Stefanie Jegelka
Artificial Intelligence and Statistics (AISTATS'16) as oral, 2016.
[PDF][Project][Code]
2015
Understanding Intra-Class Knowledge inside CNN.
Donglai Wei, Bolei Zhou, Antonio Torralba, William Freeman
arXiv:1507.02379, 2015.
[PDF][Page][Code]
Simple Baseline for Visual Question Answering.
Bolei Zhou, Yuandong Tian, Sainbayar Sukhbaatar, Arthur Szlam, Rob Fergus
arXiv:1512.02167, 2015.
[PDF][Demo][Code]
Object Detectors Emerge in Deep Scene CNNs.
Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba
International Conference on Learning Representations (ICLR) as oral, 2015.
[PDF][Project Page][More Visualization][Code]
ConceptLearner: Discovering Visual Concepts from Weakly Labeled Image Collections.
Bolei Zhou, Vignesh Jagadeesh, and Robinson Piramuthu
Computer Vision and Pattern Recognition (CVPR), 2015.
[PDF][Project Page & Demo]
2014
Learning Deep Features for Scene Recognition using Places Database.
Bolei Zhou, Agata Lapedriza, Jianxiong Xiao, Antonio Torralba, and Aude Oliva
Advances in Neural Information Processing Systems 27 (NIPS) spotlight, 2014.
[PDF][Project Page][Demo]
Recognizing City Identity via Attribute Analysis of Geo-tagged Images.
Bolei Zhou, Liu Liu, Aude Oliva and Antonio Torralba
Proceedings of 13th European Conference on Computer Vision (ECCV), 2014.
[PDF][Project Page]
C-IMAGE: City Cognitive Mapping through Geo-tagged Photos.
Liu Liu, Bolei Zhou, Jinhua Zhao, Brent D. Ryan
GeoJournal, Springer, 2016.
[PDF]
Measuring Crowd Collectiveness.
Bolei Zhou, Xiaoou Tang, Hepeng Zhang and Xiaogang Wang
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2014.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) oral, 2013.
[PDF(CVPR)][PDF(TPAMI)][Project Page]
2013 and earlier
Learning Collective Crowd Behaviors with Dynamic Pedestrian-Agents.
Bolei Zhou, Xiaoou Tang and Xiaogang Wang.
International Journal of Computer Vision (IJCV), 2014.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) oral, 2012.
[PDF(CVPR)] [PDF(IJCV)][Project Page]
Coherent Filtering: Detecting Coherent Motions from Crowd Clutters.
Bolei Zhou, Xiaoou Tang and Xiaogang Wang.
In Proceedings of 12th European Conference on Computer Vision (ECCV), 2012.
[PDF] [Project Page]
Random Field Topic Model for Semantic Region Analysis in Crowded Scenes from Tracklets.
Bolei Zhou, Xiaogang Wang and Xiaoou Tang.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011.
[PDF][Project Page]

Students

Teaching

Honors

Talks

Media coverage

  • VentureBeat: MIT CSAIL designs AI that can track objects over time.
  • MIT News: Helping computers fill in the gaps between video frames.
  • Quartz: Track AI decisions back to single neurons.
  • MIT News: Peering into neural networks.
  • TechCrunch: A fully automated way to peer inside neural networks.
  • MIT CSAIL News: Scene parsing and scene classification challenges.
  • TechCrunch and MIT News: Object detectors emerge in CNNs.

Datasets & Benchmarks

Professional activities