|
Research
My research interests lie at the intersection of multi-modal learning and
computer vision, with the long-term goal of empowering computational models to
better perceive and interact with the 3D visual world.
Currently, I'm working on:
- 3D shape understanding and 3D object detection for autonomous driving
- Multi-modal learning for 3D perception
- Spatial intelligence in VLMs
|
News
- [2026.02] Two first-author papers accepted by CVPR 2026!
- [2026.01] I will join TikTok as a Research Intern in summer 2026! See you in San Jose!
- [2025.07] One paper accepted by ICCV 2025!
- [2024.08] Joined MSU as a Ph.D. student!
- [2024.06] Honored with the Outstanding Graduate Thesis Award at XJTU!
- [2024.02] One first-author paper accepted by CVPR 2024!
- [2023.11] Honored with the National Scholarship!
|
Towards Intrinsic-Aware Monocular 3D Object Detection
Zhihao Zhang, Abhinav Kumar, Xiaoming Liu
CVPR 2026
Project Page / Code / Paper
|
Unleashing the Power of Chain-of-Prediction for Monocular 3D Object Detection
Zhihao Zhang, Abhinav Kumar, Girish Chandar Ganesan, Xiaoming Liu
CVPR 2026
Project Page / Code / Paper
|
CHARM3R: Towards Unseen Camera Height Robust Monocular 3D Detector
Abhinav Kumar, Yuliang Guo, Zhihao Zhang, Xinyu Huang, Liu Ren, Xiaoming Liu
ICCV 2025
Project Page / Code / Paper
|
TAMM: TriAdapter Multi-Modal Learning for 3D Shape Understanding
Zhihao Zhang*, Shengcao Cao*, Yu-Xiong Wang
CVPR 2024
Project Page / Code / Paper
|
Tile Classification Based Viewport Prediction with Multi-modal Fusion Transformer
Zhihao Zhang*, Yiwei Chen*, Weizhan Zhang, Caixia Yan, Qinghua Zheng, Qi Wang, Wangdu Chen
ACM MM 2023
Project Page / Code / Paper
|
(* denotes equal contribution)
|