Publications

Conferences: ICCV (4), CVPR (4), ECCV (4), AAAI (4), IJCAI (1), ACMMM (3) -- Oral x2, Highlight x1

Journals: IJCV (1)

(*Co-first Author, *Corresponding Author)


Preprints:


  1. DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM
    Yixuan Wu, Yizhou Wang, Shixiang Tang, Wenhao Wu, Tong He, Wanli Ouyang, Jian Wu, Philip Torr
    Technical Report, ArXiv:2403.12488   [ PDF ]
  2. GPT4Ego: Unleashing the Potential of Pre-trained Models for Zero-Shot Egocentric Action Recognition
    Guangzhao Dai, Xiangbo Shu, Wenhao Wu
    Technical Report, ArXiv:2401.10039   [ PDF ]
  3. GPT4Vis: What Can GPT-4 Do for Zero-shot Visual Recognition?
    Wenhao Wu, Huanjin Yao, Mengxi Zhang, Yuxin Song, Wanli Ouyang, Jingdong Wang
    Technical Report, ArXiv:2311.15732   [ PDF ] [ Code ]
  4. Side4Video: Spatial-Temporal Side Network for Memory-Efficient Image-to-Video Transfer Learning
    Huanjin Yao, Wenhao Wu**, Zhiheng Li
    Technical Report, ArXiv:2311.15769   [ PDF ] [ Code ]
  5. It Takes Two: Masked Appearance-Motion Modeling for Self-supervised Video Transformer Pre-training
    Yuxin Song, Min Yang, Wenhao Wu, Dongliang He, Fu Li, Jingdong Wang
    Technical Report, ArXiv:2210.05234   [ PDF ]
  6. Discovering “Semantics” in Super-Resolution Networks
    Yihao Liu, Anran Liu, Jinjin Gu, Zhipeng Zhang, Wenhao Wu, Yu Qiao, Chao Dong
    Technical Report, ArXiv:2108.00406   [ PDF ] [ Code ]
  7. Color2Style: Real-Time Exemplar-Based Image Colorization with Self-Reference Learning and Deep Feature Modulation
    Hengyuan Zhao*, Wenhao Wu*, Yihao Liu, Dongliang He
    Technical Report, ArXiv:2106.08017   [ PDF ] [ Code ]
  8. Temporal Action Proposal Generation with Transformers
    Lining Wang*, Haosen Yang*, Wenhao Wu*, Hongxun Yao, Hujie Huang
    Technical Report, ArXiv:2105.12043   [ PDF ]

Journal Papers:


    1. Transferring Vision-Language Models for Visual Recognition: A Classifier Perspective
      Wenhao Wu, Zhun Sun, Yuxin Song, Jingdong Wang, Wanli Ouyang
      International Journal of Computer Vision (IJCV), 2023.   Impact factor: 19.5   [ PDF ] [ Code ]
    2. Rethinking 3D cost aggregation in stereo matching
      Wanshui Gan, Wenhao Wu, Shifeng Chen, Yuxiang Zhao, Pak Kin Wong
      Pattern Recognition Letters, 2023   [ PDF ] [ Code ]

Conference Papers:


    1. What Can Simple Arithmetic Operations Do for Temporal Modeling?
      Wenhao Wu, Yuxin Song, Zhun Sun, Jingdong Wang, Chang Xu, Wanli Ouyang
      ICCV 2023   [ PDF ] [ Code ]
    2. UATVR: Uncertainty-Adaptive Text-Video Retrieval
      Bo Fang*, Wenhao Wu*, Chang Liu*, Yu Zhou, Yuxin Song, Weiping Wang, Xiangbo Shu, Xiangyang Ji, Jingdong Wang
      ICCV 2023   [ PDF ] [ Code ]
    3. Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
      Wenhao Wu, Haipeng Luo, Bo Fang, Jingdong Wang, Wanli Ouyang
      CVPR 2023    Highlight, 2.5% acceptance rate    [ PDF ] [ Code ]
    4. Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
      Wenhao Wu, Xiaohan Wang, Haipeng Luo, Jingdong Wang, Yi Yang, Wanli Ouyang
      CVPR 2023    [ PDF ] [ Code ]
    5. Revisiting Classifier: Transferring Vision-Language Models for Video Recognition
      Wenhao Wu, Zhun Sun, Wanli Ouyang
      AAAI 2023    [ PDF ] [ Code ] [ Poster ] [ Slides ] [ Video ]
    6. AdaCM: Adaptive ColorMLP for Real-Time Universal Photo-realistic Style Transfer
      Tianwei Lin, Honglin Lin, Fu Li, Dongliang He, Wenhao Wu, Meiling Wang, Xin Li, Yong Liu
      AAAI 2023    [ PDF ]
    7. Effective Invertible Arbitrary Image Rescaling
      Zhihong Pan, Baopu Li, Dongliang He, Wenhao Wu, Errui Ding
      WACV 2023    [ PDF ] [ Code ]
    8. NSNet: Non-saliency Suppression Sampler for Efficient Video Recognition
      Boyang Xia*, Wenhao Wu**, Haoran Wang, Rui Su, Dongliang He, Haosen Yang, Xiaoran Fan, Wanli Ouyang
      ECCV 2022    [ PDF ] [ Project ]
    9. Temporal Saliency Query Network for Efficient Video Recognition
      Boyang Xia*, Zhihao Wang*, Wenhao Wu*, Haoran Wang, Jungong Han
      ECCV 2022    [ PDF ] [ Project ]
    10. CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval
      Haoran Wang, Dongliang He, Wenhao Wu, Boyang Xia, Min Yang, Fu Li, Yunlong Yu, Zhong Ji, Errui Ding, Jingdong Wang
      ECCV 2022    [ PDF ] [ Code ]
    11. MaMiCo: Macro-to-Micro Semantic Correspondence for Self-supervised Video Representation Learning
      Bo Fang*, Wenhao Wu*, Chang Liu*, Yu Zhou, Dongliang He, Weiping Wang
      ACMMM 2022   Oral, 5.0% acceptance rate   [ PDF ] [ Code ]
    12. Maximum Spatial Perturbation Consistency for Unpaired Image-to-Image Translation
      Yanwu Xu, Shaoan Xie, Wenhao Wu, Kun Zhang, Mingming Gong, Kayhan Batmanghelich
      CVPR 2022    [ PDF ] [ Code ]
    13. Towards Bidirectional Arbitrary Image Rescaling: Joint Optimization and Cycle Idempotence
      Zhihong Pan, Baopu Li, Dongliang He, Mingde Yao, Wenhao Wu, Tianwei Lin, Xin Li, Errui Ding
      CVPR 2022    [ PDF ] [ Code ]
    14. Temporal Action Proposal Generation with Background Constraint
      Haosen Yang*, Wenhao Wu*, Lining Wang, Sheng Jin, Boyang Xia, Hongxun Yao, Hujie Huang
      AAAI 2022    [ PDF ] [ Code ]
    15. ASCNet: Self-supervised Video Representation Learning with Appearance-Speed Consistency
      Deng Huang*, Wenhao Wu*, Weiwen Hu, Xu Liu, Dongliang He, Zhihua Wu, Xiangmiao Wu, Mingkui Tan, Errui Ding
      ICCV 2021   [ PDF ] [ Poster ] [ Slides ] [ Video ] [ Code ]
    16. DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning
      Wenhao Wu, Yuxiang Zhao, Yanwu Xu, Xiao Tan, Dongliang He, Zhikang Zou, Jin Ye, Yingying Li, Mingde Yao, Zichao Dong, Yifeng Shi
      ACMMM 2021   [ PDF ] [ Poster ] [ Slides ] [ Code ]
    17. Coarse to Fine: Domain Adaptive Crowd Counting via Adversarial Scoring Network
      Zhikang Zou, Xiaoye Qu, Pan Zhou, Shuangjie Xu, Xiaoqing Ye, Wenhao Wu, Jin Ye
      ACMMM 2021   [ PDF ]
    18. Weakly-Supervised Spatio-Temporal Anomaly Detection in Surveillance Video
      Jie Wu, Wei Zhang, Guanbin Li, Wenhao Wu, Xiao Tan, Yingying Li, Errui Ding, Liang Lin
      IJCAI 2021   [ PDF ]
    19. Good Practices and A Strong Baseline for Traffic Anomaly Detection
      Yuxiang Zhao*, Wenhao Wu*, Yue He, Yingying Li, Xiao Tan, Shifeng Chen
      CVPR 2021   Workshop on AICity Challenge   Winner   [ PDF ]
    20. MVFNet: Multi-View Fusion Network for Efficient Video Recognition
      Wenhao Wu, Dongliang He, Tianwei Lin, Fu Li, Chuang Gan, Errui Ding
      AAAI 2021   [ PDF ] [ Poster ] [ Slides ] [ Code ]
    21. Attention-Driven Dynamic Graph Convolutional Network for Multi-Label Image Recognition
      Jin Ye, Junjun He, Xiaojiang Peng, Wenhao Wu, Yu Qiao
      ECCV 2020   [ PDF ] [ Code ]
    22. Dynamic Inference: A New Approach Toward Efficient Video Action Recognition
      Wenhao Wu, Dongliang He, Xiao Tan, Shifeng Chen, Yi Yang, Shilei Wen
      CVPR 2020   Workshop on Efficient Deep Learning in Computer Vision  Oral   [ PDF ] [ Slides ]
    23. Multi-Agent Reinforcement Learning Based Frame Sampling for Effective Untrimmed Video Recognition
      Wenhao Wu, Dongliang He, Xiao Tan, Shifeng Chen, Shilei Wen
      ICCV 2019  Oral, 4.3% acceptance rate   [ PDF ] [ Poster ] [ Slides ]