Wenhao Wu (吴文灏)

[Google Scholar]    [GitHub]    [LinkedIn]    [Zhihu]

 

Senior R&D Engineer @ Baidu Inc.

Personal Email : whwu.ucas (at) gmail.com
School Email : wuwenhao17 (at) ucas.edu.cn

About Me

Currently, I am working at Baidu VIS as a Senior R&D Engineer. Previously, I received my Master's degree from the University of Chinese Academy of Sciences (UCAS), where I studied in the MMLab led by Prof. Yu Qiao under the supervision of Prof. Shifeng Chen. I received my B.Eng. degree from Central South University (CSU). I also had a great time at SenseTime Research, Baidu Research, iQIYI, Samsung Research China and SIAT-MMLab.

My research interests mainly lie in the areas of computer vision and machine learning. In particular, my current focus is on video understanding (e.g., action recognition, temporal action detection), unsupervised/self-supervised representation learning, dynamic neural networks and cross-modal learning (e.g., combining vision and language).

Please drop me an email if you are interested in these topics or in an internship at Baidu.

News

  • 2022.07: Three papers accepted by ECCV2022.
  • 2022.06: One paper accepted for Oral presentation at ACMMM2022.
  • 2022.03: Two papers accepted by CVPR2022.
  • 2021.12: One paper accepted by AAAI2022.
  • 2021.07: One paper accepted by ICCV2021.
  • 2021.07: Two papers accepted by ACMMM2021.
  • 2021.04: One paper accepted by IJCAI2021.
  • 2021.04: We ranked first in the Traffic Anomaly Detection Track of the CVPR 2021 AI CITY CHALLENGE.
  • 2020.12: One paper accepted by AAAI2021.
  • 2020.07: One paper accepted by ECCV2020.
  • 2020.05: One paper accepted for Oral presentation at the CVPR2020 EDLCV workshop.
  • 2019.07: My first paper was accepted for Oral presentation at ICCV2019.
  • 2017.09: Recommended for admission to the University of Chinese Academy of Sciences to pursue an MSc degree.
  • 2017.06: Graduated from Central South University with the Outstanding Graduate honor.
  • 2016.10: Joined MMLAB of the Chinese University of Hong Kong at Shenzhen as a research intern. Started doing research on computer vision.

Industrial Experience

  • Jul. 2020 - Present, Senior R&D Engineer, Baidu VIS, hosted by Errui Ding
  • Jan. 2020 - Feb. 2020, Research Intern, SenseTime Research - BigVideo Team, hosted by Kai Chen
  • Oct. 2018 - Jan. 2020, Research Intern, Baidu Research & Baidu VIS, hosted by Shilei Wen and Errui Ding
  • Jun. 2018 - Oct. 2018, Research Intern, iQIYI - Video Analysis Group, hosted by Qiyue Liu
  • Mar. 2018 - Jun. 2018, Research Intern, Samsung Research China - Machine Learning Lab, hosted by Zhenbo Luo

Publications [ Full List ]

(*Equal Contribution   *Correspondence)

Video/Image Recognition


Transferring Textual Knowledge for Visual Recognition
Wenhao Wu, Zhun Sun, Wanli Ouyang
Technical Report
[ PDF ] [ Code ]
We revisit the classifier with the textual embeddings, and achieve SOTA performance on Full-supervision/Few-shot/Zero-shot recognition.
MVFNet: Multi-View Fusion Network for Efficient Video Recognition
Wenhao Wu, Dongliang He, Tianwei Lin, Fu Li, Chuang Gan, Errui Ding
The AAAI Conference on Artificial Intelligence (AAAI), 2021
[21% acceptance rate] [ PDF ] [ Poster ] [ Slides ] [ Code ] [ Bibtex ]
  @inproceedings{wu2021mvfnet,
  title={Mvfnet: Multi-view fusion network for efficient video recognition},
  author={Wu, Wenhao and He, Dongliang and Lin, Tianwei and Li, Fu and Gan, Chuang and Ding, Errui},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={35},
  number={4},
  pages={2943--2951},
  year={2021}
  }
      
An efficient architecture for video recognition based on 2D CNN.
DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning
Wenhao Wu*, Yuxiang Zhao*, Yanwu Xu, Xiao Tan, Dongliang He, Zhikang Zou, Jin Ye, Yingying Li, Mingde Yao, Zichao Dong, Yifeng Shi
ACM International Conference on Multimedia (ACMMM), 2021
[ PDF ] [ Poster ] [ Slides ] [ Code ]
An efficient plug-and-play module for effective video-level representation learning.
Attention-Driven Dynamic Graph Convolutional Network for Multi-Label Image Recognition
Jin Ye, Junjun He, Xiaojiang Peng, Wenhao Wu, Yu Qiao
European Conference on Computer Vision (ECCV), 2020
[Poster] [ PDF ] [ Code ] [ Bibtex ]
@inproceedings{ADD-GCN,
    title={Attention-Driven Dynamic Graph Convolutional Network for Multi-label Image Recognition},
    author={Ye, Jin and He, Junjun and Peng, Xiaojiang and Wu, Wenhao and Qiao, Yu},
    booktitle={Proceedings of ECCV 2020},
    pages={649--665},
    year={2020}
}
    

Cross-Modal Learning


Transferring Textual Knowledge for Visual Recognition
Wenhao Wu, Zhun Sun, Wanli Ouyang
Technical Report
[ PDF ] [ Code ]
We revisit the classifier with the textual embeddings, and achieve SOTA performance on Full-supervision/Few-shot/Zero-shot recognition.
CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval
Haoran Wang, Dongliang He, Wenhao Wu, Boyang Xia, Min Yang, Fu Li, Yunlong Yu, Zhong Ji, Errui Ding, Jingdong Wang
European Conference on Computer Vision (ECCV), 2022
[19% acceptance rate] [ PDF ] [ Code ]

Dynamic Video Inference for Efficient Recognition


Temporal Saliency Query Network for Efficient Video Recognition
Boyang Xia*, Zhihao Wang*, Wenhao Wu*, Haoran Wang, Jungong Han
European Conference on Computer Vision (ECCV), 2022
[19% acceptance rate] [ PDF ] [ Project ]
TSQNet, the first work to model temporal sampling as a query-response task.
NSNet: Non-saliency Suppression Sampler for Efficient Video Recognition
Boyang Xia*, Wenhao Wu**, Haoran Wang, Rui Su, Dongliang He, Haosen Yang, Xiaoran Fan, Wanli Ouyang
European Conference on Computer Vision (ECCV), 2022
[19% acceptance rate] [ PDF ] [ Project ]
A sampler that runs 4x faster in practice than SOTA methods.
Dynamic Inference: A New Approach Toward Efficient Video Action Recognition
Wenhao Wu, Dongliang He, Xiao Tan, Shifeng Chen, Yi Yang, Shilei Wen
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) - Joint Workshop on Efficient Deep Learning in Computer Vision (EDLCV), 2020
[Oral] [ PDF ] [ Slides ] [ Bibtex ]
@inproceedings{wu2020dynamic,
    title={Dynamic Inference: A New Approach Toward Efficient Video Action Recognition},
    author={Wu, Wenhao and He, Dongliang and Tan, Xiao and Chen, Shifeng 
      and Yang, Yi and Wen, Shilei},
    booktitle={Proceedings of CVPR Workshops},
    pages={676--677},
    year={2020}
}
    
Multi-Agent Reinforcement Learning Based Frame Sampling for Effective Untrimmed Video Recognition
Wenhao Wu, Dongliang He, Xiao Tan, Shifeng Chen, Shilei Wen
IEEE International Conference on Computer Vision (ICCV), 2019
[Oral, 4.3% acceptance rate] [ PDF ] [ Poster ] [ Slides ] [ Bibtex ]
@inproceedings{wu2019multi,
    title={Multi-Agent Reinforcement Learning Based Frame Sampling for Effective Untrimmed
       Video Recognition},
    author={Wu, Wenhao and He, Dongliang and Tan, Xiao and Chen, Shifeng and Wen, Shilei},
    booktitle={Proceedings of the IEEE International Conference on Computer Vision},
    pages={6222--6231},
    year={2019}
}

Self-supervised Video Representation Learning


MaMiCo: Macro-to-Micro Semantic Correspondence for Self-supervised Video Representation Learning
Bo Fang*, Wenhao Wu*, Chang Liu*, Yu Zhou, Dongliang He, Weiping Wang
ACM International Conference on Multimedia (ACMMM), 2022
[Oral, 5.0% acceptance rate] [ PDF ] [ Project ]
MaMiCo, a self-supervised Macro-to-Micro Semantic Correspondence learning framework for video representation learning.
ASCNet: Self-supervised Video Representation Learning with Appearance-Speed Consistency
Deng Huang*, Wenhao Wu*, Weiwen Hu, Xu Liu, Dongliang He, Zhihua Wu, Xiangmiao Wu, Mingkui Tan, Errui Ding
IEEE International Conference on Computer Vision (ICCV), 2021
[ PDF ] [ Poster ] [ Slides ] [ Video ] [ Code ]
An effective self-supervised video representation learning framework.

Temporal Action/Anomaly Detection


Temporal Action Proposal Generation with Background Constraint
Haosen Yang*, Wenhao Wu*, Lining Wang, Sheng Jin, Boyang Xia, Hongxun Yao, Hujie Huang
The AAAI Conference on Artificial Intelligence (AAAI), 2022
[15% acceptance rate] [ PDF ] [ Code ]
BCNet, a general framework for effective Temporal Action Proposal Generation.
Temporal Action Proposal Generation with Transformers
Lining Wang*, Haosen Yang*, Wenhao Wu*, Hongxun Yao, Hujie Huang
Technical Report
[ PDF ] [ Code ]
Transformers for temporal action proposal generation, achieving SOTA performance.
Weakly-Supervised Spatio-Temporal Anomaly Detection in Surveillance Video
Jie Wu, Wei Zhang, Guanbin Li, Wenhao Wu, Xiao Tan, Yingying Li, Errui Ding, Liang Lin
International Joint Conference on Artificial Intelligence (IJCAI), 2021
[ PDF ] [ Code ] [ Bibtex ]
@inproceedings{wu2021weakly,
title={Weakly-Supervised Spatio-Temporal Anomaly Detection in Surveillance Video},
author={Wu, Jie and Zhang, Wei and Li, Guanbin and Wu, Wenhao and Tan, Xiao and Li, Yingying and Ding, Errui and Lin, Liang},
booktitle={Proceedings of IJCAI},
year={2021}
}
We introduce a novel task: Weakly-Supervised Spatio-Temporal Anomaly Detection.
Good Practices and A Strong Baseline for Traffic Anomaly Detection
Yuxiang Zhao*, Wenhao Wu*, Yue He, Yingying Li, Xiao Tan, Shifeng Chen
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) - 5th AI City Challenge (AICity), 2021
[ PDF ] [ Code ] [ Bibtex ]
@inproceedings{zhao2021good,
title={Good Practices and A Strong Baseline for Traffic Anomaly Detection},
author={Zhao, Yuxiang and Wu, Wenhao and He, Yue and Li, Yingying and Tan, Xiao and Chen, Shifeng},
booktitle={Proceedings of CVPR Workshops},
year={2021}
}
  
Winner of the AI City Challenge for traffic anomaly detection.

Low-level Vision


Discovering “Semantics” in Super-Resolution Networks
Yihao Liu, Anran Liu, Jinjin Gu, Zhipeng Zhang, Wenhao Wu, Yu Qiao, Chao Dong
Technical Report
[ PDF ] [ Code ]
Color2Style: Real-Time Exemplar-Based Image Colorization with Self-Reference Learning and Deep Feature Modulation
Hengyuan Zhao*, Wenhao Wu*, Yihao Liu*, Dongliang He
Technical Report
[ PDF ] [ Code ]

Contests

  • CVPR2021 AI CITY Challenge: 1st place in Traffic Anomaly Detection, 2021
  • NTIRE 2021 Challenge on Image Deblurring: Track 2 JPEG Artifacts, Runner-Up Award, 2021
  • The First-class Prize in the American Mathematical Contest in Modeling (MCM), 2016
  • The First-class Prize in National Undergraduate Mechanical Innovation Design Competition, 2016
  • The Second-class Prize in China Freescale Cup Intelligent Car Competition (South China Region), 2015
  • The Second-class Prize in Smart Car Racing Competition of Hunan Province, 2015

Awards

  • Excellent Student Cadre of University of Chinese Academy of Sciences, 2020
  • The Most Outstanding Intern in Baidu (only 2 recipients in the Department), 2019
  • Scholarship for Academic Excellence of Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, 2018
  • Excellent Undergraduate Student of Central South University, 2017
  • Outstanding Student of Central South University, 2016
  • National Endeavor Scholarship, 2016
  • Excellent Student Cadre of Central South University, 2015
  • Excellent League Member of Central South University, 2015
  • Scholarship for Academic Excellence of Central South University, 2014/2015/2016

Academic Activities

    Journal Reviewer

  • IEEE Transactions on Image Processing (TIP)
  • IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)

    Conference PC Member/Reviewer

  • Reviewer, European Conference on Computer Vision (ECCV), 2022
  • Reviewer, The Conference on Computer Vision and Pattern Recognition (CVPR), 2021, 2022
  • PC Member, International Joint Conference on Artificial Intelligence (IJCAI), 2021
  • PC Member, The AAAI Conference on Artificial Intelligence (AAAI), 2021, 2022
  • Member of IEEE, ACM and CVF

    Mentoring

    Yuxiang Zhao (SIAT, CAS), Hengyuan Zhao (NUS), Deng Huang (SCUT), Haosen Yang (HIT), Boyang Xia (ICT, CAS), Zhihao Wang (ICT, CAS), Bo Fang (IIE, CAS), Yuguo Wang (Duke University)

    Collaborators & Friends

    Xiao Tan (Baidu), Dongliang He (Baidu), Yanwu Xu (PITT), Tianwei Lin (Baidu), Jie Wu (Bytedance), Jin Ye (Shanghai Lab), Junjun He (Shanghai Lab), Chuang Gan (MIT-IBM Watson Lab), Yihao Liu (SIAT, CAS), Zhihong Pan (Baidu Research USA), Chang Liu (Tsinghua University), Zhun Sun (Baidu), Rui Su (Shanghai Lab), Xiaohan Wang (Zhejiang University), Peihao Chen (SCUT), Zhikang Zou (Baidu), Xiaoqing Ye (Baidu), Zhenbo Xu (Beihang University)

    Last Updated on 6th Oct, 2021
