All Publications
Selected Publications
Understanding the Role of Individual Units in a Deep Neural Network
David Bau, Jun-Yan Zhu, Hendrik Strobelt, Agata Lapedriza, Bolei Zhou, and Antonio Torralba.
Proceedings of the National Academy of Sciences (PNAS), 2020. [PDF]
Closed-Form Factorization of Latent Semantics in GANs
Yujun Shen, Bolei Zhou
Non-local Policy Optimization via Diversity-regularized Collaborative Exploration
Zhenghao Peng, Hao Sun, Bolei Zhou
Semantic Hierarchy Emerges in Deep Generative Representations for Scene Synthesis
Ceyuan Yang*, Yujun Shen*, Bolei Zhou
Cross-view Semantic Segmentation for Sensing Surroundings
Bowen Pan, Jiankai Sun, Ho Yin Tiga Leung, Alex Andonian, Bolei Zhou
IEEE Robotics and Automation Letters (RA-L) and IROS 2020
Novel Policy Seeking with Constrained Optimization
Hao Sun, Zhenghao Peng, Bo Dai, Jian Guo, Dahua Lin, Bolei Zhou
In-Domain GAN Inversion for Real Image Editing
Jiapeng Zhu, Yujun Shen, Deli Zhao, Bolei Zhou
ECCV 2020
A Unified Framework for Shot Type Classification Based on Subject Centric Lens
Anyi Rao, Jiaze Wang, Linning Xu, Xuekun Jiang, Qingqiu Huang, Bolei Zhou, Dahua Lin
ECCV 2020
Evolutionary Stochastic Policy Distillation
Hao Sun, Xinyu Pan, Bo Dai, Dahua Lin, Bolei Zhou
Temporal Pyramid Network for Action Recognition
Ceyuan Yang*, Yinghao Xu*, Jianping Shi, Bo Dai, Bolei Zhou
CVPR 2020
Interpreting Latent Space of GANs for Semantic Face Editing
Yujun Shen, Jinjin Gu, Xiaoou Tang, Bolei Zhou
CVPR 2020
Image Processing Using Multi-Code GAN Prior
Jinjin Gu, Yujun Shen, Bolei Zhou
CVPR 2020
TPNet: Trajectory Proposal Network for Motion Prediction
Liangji Fang, Qinhong Jiang, Jianping Shi, Bolei Zhou.
CVPR 2020
A Local-to-Global Approach to Multi-modal Movie Scene Segmentation
Anyi Rao, Linning Xu, Yu Xiong, Guodong Xu, Qingqiu Huang, Bolei Zhou, Dahua Lin
CVPR 2020
Video Motion Retargeting via Invariance-Driven Unsupervised Representation Disentanglement
Zhuoqian Yang, Wentao Zhu, Wayne Wu, Chen Qian, Qiang Zhou, Bolei Zhou, Chen Change Loy
CVPR 2020
Every Frame Counts: Joint Learning of Video Segmentation and Optical Flow
Mingyu Ding, Zhe Wang, Bolei Zhou, Jianping Shi, Zhiwu Lu, Ping Luo
AAAI 2020
Policy Continuation with Hindsight Inverse Dynamics
Hao Sun, Zhizhong Li, Xiaotong Liu, Dahua Lin, Bolei Zhou
NeurIPS 2019, Spotlight
Seeing What a GAN Cannot Generate
David Bau, Jun-Yan Zhu, Jonas Wulff, William Peebles, Hendrik Strobelt, Bolei Zhou, Antonio Torralba
ICCV 2019, Oral
A Graph-Based Framework to Bridge Movies and Synopses
Yu Xiong, Qingqiu Huang, Lingfen Guo, Hang Zhou, Bolei Zhou, Dahua Lin.
ICCV 2019, Oral
Reasoning About Human-Object Interactions Through Dual Attention Networks
Tete Xiao, Quanfu Fan, Dan Gutfreund, Mathew Monfort, Aude Oliva, Bolei Zhou
ICCV 2019
Semantic Photo Manipulation with a Generative Image Prior
David Bau, Hendrik Strobelt, William Peebles, Jonas Wulff, Bolei Zhou, Jun-Yan Zhu, Antonio Torralba
[PDF][Webpage][Live Demo][MIT News]
Deep Flow-Guided Video Inpainting
Rui Xu, Xiaoxiao Li, Bolei Zhou, Chen Change Loy
CVPR 2019
DrivingStereo: A large-scale dataset for stereo matching in autonomous driving scenarios.
Guorun Yang*, Xiao Song*, Chaoqin Huang, Zhidong Deng, Jianping Shi, Bolei Zhou.
CVPR 2019
GAN Dissection: Visualizing and Understanding Generative Adversarial Networks.
David Bau, Jun-Yan Zhu, Hendrik Strobelt, Bolei Zhou, Joshua B. Tenenbaum, William T. Freeman, Antonio Torralba.
ICLR 2019.
Discovering place-informative scenes and objects using social media photos.
Fan Zhang, Bolei Zhou, Carlo Ratti, Yu Liu.
Royal Society Open Science, 2019
Measuring human perceptions of a large-scale urban region using machine learning.
Fan Zhang, Bolei Zhou, Liu Liu, Yu Liu, Helene Fung, Hui Lin, Carlo Ratti.
Landscape and Urban Planning, 2018
Moments in Time Dataset: one million videos for event understanding.
Mathew Monfort, Alex Andonian, Bolei Zhou, Sarah Adel Bargal, Tom Yan, Kandan Ramakrishnan, Lisa Brown, Quanfu Fan, Dan Gutfreund, Carl Vondrick, Aude Oliva.
IEEE Transactions on Pattern Analysis and Machine Intelligence, March 2019.
Semantic Understanding of Scenes through ADE20K Dataset.
Bolei Zhou, Hang Zhao, Xavier Puig, Tete Xiao, Sanja Fidler, Adela Barriuso and Antonio Torralba.
International Journal of Computer Vision (IJCV), 2018.
[PDF][Dataset][Pretrained Models][Benchmark Page][Demo]
Revisiting the Importance of Individual Units in CNNs via Ablation.
Bolei Zhou, Yiyou Sun, David Bau, and Antonio Torralba
arXiv:1806.02891, 2018.
Temporal Relational Reasoning in Videos.
Bolei Zhou, Alex Andonian, Aude Oliva, and Antonio Torralba
ECCV 2018.
[PDF][arXiv][Webpage][Demo Video][Code][MIT News]
Interpretable Basis Decomposition for Visual Explanation.
Bolei Zhou*, Yiyou Sun*, David Bau*, Antonio Torralba.
ECCV 2018.
Unified Perceptual Parsing for Scene Understanding.
Tete Xiao*, Yingcheng Liu*, Bolei Zhou*, Yuning Jiang, Jian Sun
ECCV 2018.
[PDF][Code & Data]
Single Image Intrinsic Decomposition without a Single Intrinsic Image.
Wei-Chiu Ma, Hang Chu, Bolei Zhou, Raquel Urtasun, Antonio Torralba.
ECCV 2018.
Factorizable Net: An Efficient Subgraph based Framework for Scene Graph Generation.
Yikang Li, Wanli Ouyang, Bolei Zhou, Yawen Cui, Jianping Shi, Xiaogang Wang.
ECCV 2018.
Real-Time Object Pose Estimation with Pose Interpreter Networks.
Jimmy Wu, Bolei Zhou, Rebecca Russell, Vincent Kee, Syler Wagner, Mitchell Hebert, Antonio Torralba, and David M.S. Johnson
IROS 2018.
Interpretable Representation Learning for Visual Intelligence.
Bolei Zhou
PhD thesis submitted to MIT EECS, May 17, 2018.
Committee: Antonio Torralba, Aude Oliva, Bill Freeman.
[PDF][Defense Talk]
DeepMiner: Discovering Interpretable Representations for Mammogram Classification and Explanation.
Jimmy Wu, Bolei Zhou, Diondra Peck, Scott Hsieh, Vandana Dialani, Vasilis Syrgkanis, Lester Mackey, and Genevieve Patterson
arXiv:1805.12323, 2018.
Expert identification of visual primitives used by CNNs during mammogram classification.
Jimmy Wu, Diondra Peck, Scott Hsieh, Vandana Dialani, Constance D. Lehman, Bolei Zhou, Vasilis Syrgkanis, Lester Mackey, and Genevieve Patterson
SPIE Medical Imaging, 2018.
Visual Question Generation as Dual Task of Visual Question Answering.
Yikang Li, Nan Duan, Bolei Zhou, Xiao Chu, Wanli Ouyang, and Xiaogang Wang
CVPR 2018, Spotlight.
Recurrent Residual Module for Fast Inference in Videos.
Bowen Pan, Wuwei Lin, Xiaolin Fang, Chaoqin Huang, Bolei Zhou, Cewu Lu
CVPR 2018.
Interpreting Deep Visual Representations via Network Dissection.
Bolei Zhou*, David Bau*, Aude Oliva, and Antonio Torralba.
IEEE Transactions on Pattern Analysis and Machine Intelligence, June 2018. * indicates equal contribution.
Places: A 10 Million Image Database for Scene Recognition.
Bolei Zhou, Agata Lapedriza, Aditya Khosla, Aude Oliva, and Antonio Torralba.
IEEE Transactions on Pattern Analysis and Machine Intelligence, July 2017.
[PDF][Places2 Dataset][Challenge Page][Places365 CNN models][Demo]
Scene Graph Generation from Objects, Phrases and Region Captions.
Yikang Li, Wanli Ouyang, Bolei Zhou, Kun Wang, and Xiaogang Wang
ICCV 2017.
Open Vocabulary Scene Parsing.
Hang Zhao, Xavier Puig, Bolei Zhou, Sanja Fidler, and Antonio Torralba
ICCV 2017.
Scene Parsing through ADE20K Dataset.
Bolei Zhou, Hang Zhao, Xavier Puig, Sanja Fidler, Adela Barriuso and Antonio Torralba.
CVPR 2017.
[PDF][Dataset][Benchmark Page][Challenge Page][Toolkit&Code][Demo]
Network Dissection: Quantifying Interpretability of Deep Visual Representations.
David Bau*, Bolei Zhou*, Aditya Khosla, Aude Oliva, and Antonio Torralba.
CVPR 2017, Oral. * indicates equal contribution.
[PDF][arXiv][webpage][code][Talk Video]
Person Search with Natural Language Description.
Shuang Li, Tong Xiao, Hongsheng Li, Bolei Zhou, Dayu Yue, and Xiaogang Wang.
CVPR 2017.
SegICP: Integrated Deep Semantic Segmentation and Pose Estimation.
J. Wong, V. Kee, T. Le, S. Wagner, G. Mariottini, A. Schneider, L. Hamilton, R. Chipalkatty, M. Hebert, D. Johnson, J. Wu, B. Zhou, and A. Torralba.
IROS 2017, Oral
Learning Deep Features for Discriminative Localization.
Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba
CVPR 2016.
[PDF] [arXiv][Project Page][Video of CNN shifting its attention]
Optimization as Estimation with Gaussian Processes in Bandit Settings.
Zi Wang, Bolei Zhou, Stephanie Jegelka
AISTATS 2016, Oral.
Understanding Intra-Class Knowledge inside CNN.
Donglai Wei, Bolei Zhou, Antonio Torralba, William Freeman
arXiv:1507.02379, 2015.
Simple Baseline for Visual Question Answering.
Bolei Zhou, Yuandong Tian, Sainbayar Sukhbaatar, Arthur Szlam, Rob Fergus
arXiv:1512.02167, 2015.
Object Detectors Emerge in Deep Scene CNNs.
Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba
ICLR 2015, Oral.
[PDF][Project Page][More Visualization][Code]
ConceptLearner: Discovering Visual Concepts from Weakly Labeled Image Collections.
Bolei Zhou, Vignesh Jagadeesh, and Robinson Piramuthu
CVPR 2015.
[PDF][Project Page & Demo]
Learning Deep Features for Scene Recognition using Places Database.
Bolei Zhou, Agata Lapedriza, Jianxiong Xiao, Antonio Torralba, and Aude Oliva
NIPS 2014, Spotlight
[PDF][Project Page][Demo]
Recognizing City Identity via Attribute Analysis of Geo-tagged Images.
Bolei Zhou, Liu Liu, Aude Oliva and Antonio Torralba
ECCV 2014
[PDF][Project Page]
C-IMAGE: City Cognitive Mapping through Geo-tagged Photos.
Liu Liu, Bolei Zhou, Jinhua Zhao, Brent D. Ryan
GeoJournal, Springer, 2016.
Measuring Crowd Collectiveness.
Bolei Zhou, Xiaoou Tang, Hepeng Zhang and Xiaogang Wang
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2014.
CVPR 2013, Oral
[PDF(CVPR)][PDF(TPAMI)][Project Page]
2013 and earlier
Learning Collective Crowd Behaviors with Dynamic Pedestrian-Agents.
Bolei Zhou, Xiaoou Tang and Xiaogang Wang.
International Journal of Computer Vision (IJCV), 2014.
CVPR 2012, Oral
[PDF(CVPR)] [PDF(IJCV)][Project Page]
Coherent Filtering: Detecting Coherent Motions from Crowd Clutters.
Bolei Zhou, Xiaoou Tang and Xiaogang Wang.
ECCV 2012.
[PDF] [Project Page]
Random Field Topic Model for Semantic Region Analysis in Crowded Scenes from Tracklets.
Bolei Zhou, Xiaogang Wang and Xiaoou Tang.
CVPR 2011
[PDF][Project Page]