Wenhao Wang

I am a Ph.D. student at ReLER, AAII, University of Technology Sydney, supervised by Yi Yang. My research interests are visual copy detection, deep metric learning, and computer vision. Before ReLER, I was a Research Assistant at Baidu Research, co-supervised by Yifan Sun and Yi Yang. Prior to Baidu, I was a Remote Research Intern at the Inception Institute of Artificial Intelligence (IIAI) from 2020 to 2021, where I was supervised by Fang Zhao and Shengcai Liao. I received my bachelor's degree from Beihang University in 2021 with the Shenyuan Medal (Top 10 Undergraduate).

Email  /  Google Scholar  /  OpenReview  /  Github  /  Twitter

Selected Research Papers
VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models
Wenhao Wang, Yi Yang
arXiv, 2024
arXiv / Project / Hugging Face / Wisemodel / bibtex / Zhihu

Ranked in the top 10 of 119,507 on the Hugging Face Dataset Trending List on Mar. 15, 2024.

VidProM is the first dataset featuring 1.67 million unique text-to-video prompts and 6.69 million videos generated from 4 different state-of-the-art diffusion models. It inspires many exciting new research areas, such as Text-to-Video Prompt Engineering, Efficient Video Generation, Fake Video Detection, and Video Copy Detection for Diffusion Models.

TransHP: Image Classification with Hierarchical Prompting
Wenhao Wang, Yifan Sun, Wei Li, Yi Yang
NeurIPS, 2023
arXiv / Code / bibtex / poster

This paper explores a hierarchical prompting mechanism for the hierarchical image classification (HIC) task. Unlike prior HIC methods, our hierarchical prompting is the first to explicitly inject ancestor-class information as a tokenized hint that benefits descendant-class discrimination.

A Benchmark and Asymmetrical-Similarity Learning for Practical Image Copy Detection
Wenhao Wang, Yifan Sun, Yi Yang
AAAI, 2023 (Oral)
arXiv / Dataset&Code / bibtex / poster

We contribute a new image copy detection (ICD) dataset, Negative-Distractor for Edited Copy (NDEC), which emphasizes the seldom-noticed hard negative problem. We also propose a novel Asymmetrical-Similarity Learning (ASL) method for ICD.

Attentive WaveBlock: Complementarity-enhanced Mutual Networks for Unsupervised Domain Adaptation in Person Re-identification and Beyond
Wenhao Wang, Fang Zhao, Shengcai Liao, Ling Shao
TIP, 2022
arXiv / Code / bibtex

This paper proposes a novel lightweight module, the Attentive WaveBlock (AWB), which can be integrated into the dual networks of mutual learning to enhance their complementarity.

Learning Anchored Unsigned Distance Functions with Gradient Direction Alignment for Single-view Garment Reconstruction
Fang Zhao, Wenhao Wang, Shengcai Liao, Ling Shao
ICCV, 2021 (Oral)
arXiv / Code / bibtex

We propose a novel learnable Anchored Unsigned Distance Function (AnchorUDF) representation for 3D garment reconstruction from a single image.

DomainMix: Learning Generalizable Person Re-Identification Without Human Annotations
Wenhao Wang, Shengcai Liao, Fang Zhao, Cuicui Kang, Ling Shao
BMVC, 2021
arXiv / Code / bibtex

We propose a new person re-identification task, i.e., how to use a labeled synthetic dataset and an unlabeled real-world dataset to train a universal model. A DomainMix framework is introduced as a basic solution to the task.

Competitions
Meta AI Video Similarity Challenge: Descriptor Track
Wenhao Wang, Yifan Sun, Yi Yang
CVPR, 2023 (Rank 2)
Introduction / Solution / Code / Presentation

We propose Feature-Compatible Progressive Learning (FCPL), which trains various models that produce mutually-compatible features.

Meta AI Video Similarity Challenge: Matching Track
Wenhao Wang, Yifan Sun, Yi Yang
CVPR, 2023 (Rank 2)
Introduction / Solution / Code / Presentation

We use a Temporal Network (TN) to directly ensemble the features from the descriptor track.

FGVC9: eBay eProduct Visual Search Challenge
Wenhao Wang, Yifan Sun, Zongxin Yang, Yi Yang
CVPR, 2022 (Rank 1)
Introduction / Solution / Code / Certificate

The paper demonstrates the effectiveness of vision-language models in product retrieval tasks for the first time.

Facebook AI Image Similarity Challenge: Matching Track
Wenhao Wang, Yifan Sun, Weipu Zhang, Yi Yang
NeurIPS, 2021 (Rank 1)
Introduction / Solution / Code / Presentation

In this paper, a data-driven and local-verification approach is proposed.

Facebook AI Image Similarity Challenge: Descriptor Track
Wenhao Wang, Yifan Sun, Weipu Zhang, Yi Yang
NeurIPS, 2021 (Rank 3)
Introduction / Solution / Code / Presentation

In this paper, a bag of tricks and a strong baseline are proposed for image copy detection.

The 3rd Large-scale Video Object Segmentation Challenge: Video Object Segmentation Track
Zongxin Yang, Jian Zhang, Wenhao Wang, et al.
CVPR, 2021 (Rank 1)
Introduction / Solution / Code / Certificate

This paper investigates how to realize better and more efficient embedding learning to tackle the semi-supervised video object segmentation under challenging multi-object scenarios.

Professional Activities

Journal Reviewer for Transactions on Pattern Analysis and Machine Intelligence, International Journal of Computer Vision, Transactions on Image Processing, Transactions on Circuits and Systems for Video Technology, Knowledge-Based Systems, Transactions on Intelligent Transportation Systems, IEEE/CAA Journal of Automatica Sinica, Transactions on Big Data, Transactions on Artificial Intelligence, Journal of Visual Communication and Image Representation, and Neural Networks.

Conference Reviewer for ICLR, ICML, NeurIPS, CVPR, ICCV, ECCV, AAAI, and ACM MM.