CVPR 2025

Presentation Schedule

The team has a total of 20 papers (including 2 orals and 3 highlights) accepted to CVPR 2025.

Paper
Editing MatAnyone: Stable Video Matting with Consistent Memory Propagation P. Yang, S. Zhou, J. Zhao, Q. Tao, C. C. Loy in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2025 (CVPR) [arXiv] [Project Page] [Demo]
Editing, Segmentation EdgeTAM: On-Device Track Anything Model C. Zhou, C. Zhu, Y. Xiong, S. Suri, F. Xiao, L. Wu, R. Krishnamoorthi, B. Dai, C. C. Loy, V. Chandra, B. Soran in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2025 (CVPR) [arXiv] [Project Page]
Restoration SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration J. Wang, Z. Lin, M. Wei, Y. Zhao, C. Yang, F. Xiao, C. C. Loy, L. Jiang in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2025 (CVPR, Highlight) [arXiv] [Project Page]
Restoration Arbitrary-steps Image Super-resolution via Diffusion Inversion Z. Yue, K. Liao, C. C. Loy in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2025 (CVPR) [arXiv] [Project Page] [Demo]
Generation Alias-free Latent Diffusion Models: Improving Fractional Shift Equivariance of Diffusion Latent Space Y. Zhou, Z. Xiao, S. Yang, X. Pan in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2025 (CVPR, Oral) [arXiv] [Project Page]
Generation AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers J. Guan, K. Wang, Z. Xu, Q. Yang, Y. Sun, S. He, B. Liang, Y. Cao, Y. Li, H. Feng, E. Ding, J. Wang, Y. Zhao, H. Zhou, Z. Liu in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2025 (CVPR) [arXiv] [Project Page]
3D, Restoration 3DEnhancer: Consistent Multi-View Diffusion for 3D Enhancement Y. Luo, S. Zhou, Y. Lan, X. Pan, C. C. Loy in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2025 (CVPR) [arXiv] [Project Page]
3D, Editing DoF-Gaussian: Controllable Depth-of-Field for 3D Gaussian Splatting L. Shen, T. Liu, H. Sun, J. Li, Z. Cao, W. Li, C. C. Loy in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2025 (CVPR) [arXiv] [Project Page]
3D MEAT: Multiview Diffusion Model for Human Generation on Megapixels with Mesh Attention Y. Wang, F. Hong, S. Yang, L. Jiang, W. Wu, C. C. Loy in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2025 (CVPR) [arXiv] [Project Page]
3D SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE Y. Chen, Y. Lan, S. Zhou, T. Wang, X. Pan in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2025 (CVPR) [arXiv] [Project Page]
3D WildAvatar: Learning In-the-wild 3D Avatars from the Web Z. Huang, S. Hu, G. Wang, T. Liu, Y. Zang, Z. Cao, W. Li, Z. Liu in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2025 (CVPR) [arXiv] [Project Page]
3D GaussianCity: Generative Gaussian Splatting for Unbounded 3D City Generation H. Xie, Z. Chen, F. Hong, Z. Liu in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2025 (CVPR) [arXiv] [Project Page]
3D 3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion Z. Chen, J. Tang, Y. Dong, Z. Cao, F. Hong, Y. Lan, T. Wang, H. Xie, T. Wu, S. Saito, L. Pan, D. Lin, Z. Liu in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2025 (CVPR, Highlight) [arXiv] [Project Page]
3D Disco4D: Disentangled 4D Human Generation and Animation from a Single Image H. E. Pang, S. Liu, Z. Cai, L. Yang, T. Zhang, Z. Liu in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2025 (CVPR) [arXiv] [Project Page]
3D Neural LightRig: Unlocking Accurate Object Normal and Material Estimation with Multi-Light Diffusion Z. He, T. Wang, X. Huang, X. Pan, Z. Liu in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2025 (CVPR) [arXiv] [Project Page]
Multimodal, 3D SOLAMI: Social Vision-Language-Action Modeling for Immersive Interaction with 3D Autonomous Characters J. Jiang, W. Xiao, Z. Lin, H. Zhang, T. Ren, Y. Gao, Z. Lin, Z. Cai, L. Yang, Z. Liu in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2025 (CVPR) [arXiv] [Project Page]
Multimodal, Segmentation F-LMM: Grounding Frozen Large Multimodal Models S. Wu, S. Jin, W. Zhang, L. Xu, W. Liu, W. Li, C. C. Loy in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2025 (CVPR) [arXiv] [Project Page]
Multimodal EgoLife: Towards Egocentric Life Assistant J. Yang, S. Liu, H. Guo, Y. Dong, X. Zhang, S. Zhang, P. Wang, Z. Zhou, B. Xie, Z. Wang, B. Ouyang, Z. Lin, M. Cominelli, Z. Cai, B. Li, Y. Zhang, P. Zhang, F. Hong, J. Widmer, F. Gringoli, L. Yang, Z. Liu in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2025 (CVPR) [arXiv] [Project Page]
Multimodal Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models Y. Dong, Z. Liu, H.-L. Sun, J. Yang, W. Hu, Y. Rao, Z. Liu in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2025 (CVPR, Highlight) [arXiv] [Project Page]
Multimodal EgoLM: Multi-Modal Language Model of Egocentric Motions F. Hong, V. Guzov, H. J. Kim, Y. Ye, R. Newcombe, Z. Liu, L. Ma in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2025 (CVPR, Oral) [arXiv] [Project Page]