|
VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models
Wenhao Wang,
Yi Yang
arXiv, 2024
arXiv /
Project /
Hugging Face /
Wisemodel /
bibtex /
Zhihu
✨ Top 10/119,507 in the Hugging Face Dataset Trending List on Mar. 15th, 2024.
VidProM is the first dataset featuring 1.67 million unique text-to-video prompts and 6.69 million videos generated from 4 different state-of-the-art diffusion models. It inspires many exciting new research areas, such as Text-to-Video Prompt Engineering, Efficient Video Generation, Fake Video Detection, and Video Copy Detection for Diffusion Models.
|
|
TransHP: Image Classification with Hierarchical Prompting
Wenhao Wang,
Yifan Sun,
Wei Li,
Yi Yang
NeurIPS, 2023
arXiv /
Code /
bibtex /
poster
This paper explores a hierarchical prompting mechanism for the hierarchical image classification (HIC) task. Unlike prior HIC methods, our hierarchical prompting is the first to explicitly inject ancestor-class information as a tokenized hint that benefits descendant-class discrimination.
|
|
A Benchmark and Asymmetrical-Similarity Learning for Practical Image Copy Detection
Wenhao Wang,
Yifan Sun,
Yi Yang
AAAI, 2023 (Oral)
arXiv /
Dataset&Code /
bibtex /
poster
We contribute a new ICD dataset, i.e., Negative-Distractor for Edited Copy (NDEC), with emphasis on the seldom-noticed hard negative problem. We propose a novel Asymmetric-Similarity Learning (ASL) method for ICD.
|
|
Attentive WaveBlock: Complementarity-enhanced Mutual Networks for Unsupervised Domain Adaptation in Person Re-identification and Beyond
Wenhao Wang,
Fang Zhao,
Shengcai Liao,
Ling Shao
TIP, 2022
arXiv /
Code /
bibtex
This paper proposes a novel lightweight module, the Attentive WaveBlock (AWB), which can be integrated into the dual networks of mutual learning to enhance their complementarity.
|
|
Learning Anchored Unsigned Distance Functions with Gradient Direction Alignment for Single-view Garment Reconstruction
Fang Zhao,
Wenhao Wang,
Shengcai Liao,
Ling Shao
ICCV, 2021 (Oral)
arXiv /
Code /
bibtex
We propose a novel learnable Anchored Unsigned Distance Function (AnchorUDF) representation for 3D garment reconstruction from a single image.
|
|
DomainMix: Learning Generalizable Person Re-Identification Without Human Annotations
Wenhao Wang,
Shengcai Liao,
Fang Zhao,
Cuicui Kang,
Ling Shao
BMVC, 2021
arXiv /
Code /
bibtex
We propose a new person re-identification task: how to use a labeled synthetic dataset and an unlabeled real-world dataset to train a universal model. We introduce the DomainMix framework as a basic solution to this task.
|
|
Meta AI Video Similarity Challenge: Descriptor Track
Wenhao Wang,
Yifan Sun,
Yi Yang
CVPR, 2023 (Rank 2)
Introduction /
Solution /
Code /
Presentation
We propose Feature-Compatible Progressive Learning (FCPL), which trains various models that produce mutually-compatible features.
|
|
Meta AI Video Similarity Challenge: Matching Track
Wenhao Wang,
Yifan Sun,
Yi Yang
CVPR, 2023 (Rank 2)
Introduction /
Solution /
Code /
Presentation
We use a Temporal Network (TN) to directly ensemble the features from the descriptor track.
|
|
FGVC9: eBay eProduct Visual Search Challenge
Wenhao Wang,
Yifan Sun,
Zongxin Yang,
Yi Yang
CVPR, 2022 (Rank 1)
Introduction /
Solution /
Code /
Certificate
The paper demonstrates the effectiveness of vision-language models in product retrieval tasks for the first time.
|
|
Facebook AI Image Similarity Challenge: Matching Track
Wenhao Wang,
Yifan Sun,
Weipu Zhang,
Yi Yang
NeurIPS, 2021 (Rank 1)
Introduction /
Solution /
Code /
Presentation
In this paper, a data-driven and local-verification approach is proposed.
|
|
Facebook AI Image Similarity Challenge: Descriptor Track
Wenhao Wang,
Yifan Sun,
Weipu Zhang,
Yi Yang
NeurIPS, 2021 (Rank 3)
Introduction /
Solution /
Code /
Presentation
In this paper, a bag of tricks and a strong baseline are proposed for image copy detection.
|
|
The 3rd Large-scale Video Object Segmentation Challenge: Video Object Segmentation Track
Zongxin Yang,
Jian Zhang,
Wenhao Wang,
et al.
CVPR, 2021 (Rank 1)
Introduction /
Solution /
Code /
Certificate
This paper investigates how to achieve better and more efficient embedding learning for semi-supervised video object segmentation in challenging multi-object scenarios.
|
Professional Activities
|
Journal Reviewer of Transactions on Pattern Analysis and Machine Intelligence, International Journal of Computer Vision, Transactions on Image Processing, Transactions on Circuits and Systems for Video Technology, Knowledge-Based Systems, Transactions on Intelligent Transportation Systems, IEEE/CAA Journal of Automatica Sinica, Transactions on Big Data, Transactions on Artificial Intelligence, Journal of Visual Communication and Image Representation, and Neural Networks.
Conference Reviewer of ICLR, ICML, NeurIPS, CVPR, ICCV, ECCV, AAAI, and ACM MM.
|
|