Code and Datasets

Featured

Codebases

StableSR

Exploiting diffusion prior for real-world image super-resolution.

MARCONet

Using generative structure prior for blind text image super-resolution.

VToonify

Generating controllable and high-quality artistic portrait videos.

CodeFormer

A blind face restoration algorithm for enhancing old photos and fixing AI-generated art.

K-Net

A unified, simple, and effective framework to address semantic segmentation, instance segmentation and panoptic segmentation.

Zero-DCE

Zero-Reference Deep Curve Estimation (Zero-DCE) formulates light enhancement as a task of image-specific curve estimation with a deep network. The method generalizes well to diverse lighting conditions.

MMSelfSup

MMSelfSup, which was upgraded from OpenSelfSup, supports popular and contemporary self-supervised learning methods such as MoCo, MoCo v2, SimCLR, SwAV, ODC, BYOL, and SimSiam.

MMHuman3D

MMHuman3D is an open source PyTorch-based codebase for the use of 3D human parametric models in computer vision and computer graphics.

MMDetection

MMDetection is an open source object detection toolbox that supports popular and contemporary detection frameworks, e.g., Faster RCNN, Mask RCNN, and RetinaNet. It is easy to extend and highly efficient.

MMDetection3D

MMDetection3D supports multi-modality/single-modality 3D detectors out of the box. It directly supports popular indoor and outdoor 3D detection datasets, including ScanNet, SUNRGB-D, Waymo, nuScenes, Lyft, and KITTI.

MMagic

MMagic combines the popular MMEditing and MMGeneration libraries.

OpenMMLab

As an open source project for academic research and industrial applications, OpenMMLab covers a wide range of libraries to facilitate research on various computer vision topics, e.g., classification, detection, segmentation and super-resolution. Join the OpenMMLab developer community to contribute, learn, and get your questions answered.

Featured

Datasets

MeViS

MeViS is a large-scale benchmark for video segmentation with motion expressions.

OmniObject3D

OmniObject3D is a large vocabulary 3D object dataset with massive high-quality real-scanned 3D objects to facilitate the development of 3D perception, reconstruction, and generation in the real world.

DNA-Rendering

DNA-Rendering presents a large-scale, high-fidelity repository of neural actor rendering represented by neural implicit fields of human actors.

AnimeRun

AnimeRun is derived from 3D movies with pixel-wise and region-wise correspondence labels to facilitate research in 2D animation visual correspondence.

Flare7K

Flare7K offers 5,000 scattering flare images and 2,000 reflective flare images for research in nighttime flare removal.

LOL-Blur

LOL-Blur contains 12,000 low-blur/normal-sharp pairs with diverse darkness and motion blurs in different scenarios.

StyleGAN-Human

SHHQ is a dataset of high-quality full-body human images at a resolution of 1024 × 512.

CelebV-HQ

CelebV-HQ contains 35,666 video clips involving 15,653 identities and 83 manually labeled facial attributes covering appearance, action, and emotion.

OmniBenchmark

OmniBenchmark is a diverse (21 semantic realm-wise datasets) and concise (realm-wise datasets have no concepts overlapping) benchmark for evaluating pre-trained model generalization across semantic super-concepts/realms.

Panoptic Scene Graph

The PSG dataset has 48,749 images with 133 object classes (80 objects and 53 stuff) and 56 predicate classes. It annotates inter-segment relations based on COCO panoptic segmentation.

GTA-Human

GTA-Human is a mega-scale and highly diverse 3D human dataset generated with the GTA-V game engine, featuring a rich set of subjects, actions, and scenarios.

CelebA-Dialog

CelebA-Dialog is a large-scale visual-language face dataset. Facial images are annotated with rich fine-grained labels. Each image comes with captions that describe its attributes and a sample user request.

ATD-12K

ATD-12K is a large-scale dataset that facilitates the training and evaluation of animation video interpolation methods. It contains 10,000 animation frame triplets and a test set of 2,000 triplets, collected from a variety of animation movies.

MessyTable

A challenging dataset that features a large number of scenes with messy tables captured from multiple camera views. Each scene in this dataset is highly complex, containing multiple object instances that could be identical, stacked, or occluded by other instances. The key challenge is to associate all instances given the RGB images from all views. The dataset has over 50K images with 1.2M bounding box annotations.

Webly-Reference Super-Resolution

The Webly-Reference SR dataset is a test dataset for evaluating reference-based super-resolution approaches. It covers diverse categories including outdoor scenes, indoor scenes, buildings, famous landmarks, animals, and plants.

Under-Display Camera Images

Synthetic and real images for research on under-display camera (UDC) restoration. UDC systems introduce a new class of complex image degradation problems, combining flare, haze, blur, and noise.

Multi-View Partial (MVP) Point Cloud Dataset

The dataset contains over 100,000 high-quality scans, obtained by rendering partial 3D shapes from 26 uniformly distributed camera poses for each 3D CAD model. For research on point cloud completion.

NTURGBD-Parsing-4K

A multi-modality human perception dataset that contains i) diverse poses and actions, ii) both RGB and depth images, and iii) fine-grained human part parsing annotations.

DeeperForensics

DeeperForensics is a large-scale face forgery detection dataset with 60,000 videos comprising a total of 17.6 million frames. Extensive perturbations are applied to obtain a more challenging benchmark of larger scale and higher diversity. All source videos in DeeperForensics are carefully collected, and fake videos are generated by a newly proposed end-to-end face swapping framework.

ForgeryNet

The dataset contains 2.9 million images and 221,247 videos for research on forgery detection. Manipulations are achieved using seven image-level approaches and eight video-level approaches.