ECCV-2024-Papers-Autonomous-Driving

We will keep adding related works to this repository. Please stay tuned!

We also invite you to our platform, Auto Driving Heart, for paper interpretation and sharing. If you would like to promote your work, please feel free to contact me.

1) End to End | 端到端自动驾驶

GenAD: Generative End-to-End Autonomous Driving

paper: https://arxiv.org/pdf/2402.11502
code: https://github.com/wzzheng/GenAD

CarFormer: Self-Driving with Learned Object-Centric Representations

paper: https://arxiv.org/pdf/2407.15843
code: https://github.com/Shamdan17/CarFormer

2) LLM Agent | 大语言模型智能体

DriveLM: Driving with Graph Visual Question Answering

paper: https://arxiv.org/pdf/2312.14150
code: https://github.com/OpenDriveLab/DriveLM

ELM: Embodied Understanding of Driving Scenarios

paper: https://arxiv.org/pdf/2403.04593
code: https://github.com/OpenDriveLab/ELM

Controllable Navigation Instruction Generation with Chain of Thought Prompting

paper: coming soon
code: https://github.com/refkxh/C-Instructor

Asynchronous Large Language Model Enhanced Planner for Autonomous Driving

paper: coming soon
code: coming soon

TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

paper: https://arxiv.org/pdf/2403.19589
code: https://github.com/jxbbb/TOD3Cap

Global-Local Collaborative Inference with LLM for Lidar-Based Open-Vocabulary Detection

paper: coming soon
code: https://github.com/GradiusTwinbee/GLIS

Dolphins: Multimodal Language Model for Driving

paper: https://arxiv.org/pdf/2312.00438
code: https://github.com/SaFoLab-WISC/Dolphins

3) SSC: Semantic Scene Completion | 语义场景补全

Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion

paper: https://arxiv.org/pdf/2407.02077
code: https://github.com/Arlo0o/HTCL

Pyramid Diffusion for Fine 3D Large Scene Generation

4) OCC: Occupancy Prediction | 占用感知

Fully Sparse 3D Occupancy Prediction

paper: https://arxiv.org/pdf/2312.17118
code: https://github.com/MCG-NJU/SparseOcc

GaussianFormer: Scene as Gaussians for Vision-Based 3D Semantic Occupancy Prediction

paper: https://arxiv.org/pdf/2405.17429
code: https://github.com/huang-yh/GaussianFormer

Occupancy as Set of Points

paper: https://arxiv.org/pdf/2407.04049
code: https://github.com/hustvl/osp

Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation

paper: https://arxiv.org/pdf/2407.13362
code: https://github.com/Wang-pengfei/GGSD

ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers

paper: https://arxiv.org/pdf/2405.04299
code: https://github.com/ViewFormerOcc/ViewFormer-Occ

nuCraft: Crafting High Resolution 3D Semantic Occupancy for Unified 3D Scene Understanding

paper: https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/00730.pdf
code: coming soon

5) World Model | 世界模型

OccWorld: 3D World Model for Autonomous Driving

paper: https://arxiv.org/pdf/2311.16038
code: https://github.com/wzzheng/OccWorld

Modelling Competitive Behaviors in Autonomous Driving Under Generative World Model

paper: coming soon
code: coming soon

DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving

paper: https://arxiv.org/pdf/2309.09777
code: https://github.com/JeffWang987/DriveDreamer

6) HD-Mapping

MapTracker: Tracking with Strided Memory Fusion for Consistent Vector HD Mapping

paper: https://arxiv.org/pdf/2403.15951
code: https://github.com/woodfrog/maptracker

ADMap: Anti-disturbance framework for reconstructing online vectorized HD map

paper: coming soon
code: https://github.com/hht1996ok/ADMap

Accelerating Online Mapping and Behavior Prediction via Direct BEV Feature Attention

paper: coming soon
code: https://github.com/alfredgu001324/MapBEVPrediction

Leveraging Enhanced Queries of Point Sets for Vectorized Map Construction

paper: https://arxiv.org/pdf/2402.17430
code: https://github.com/HXMap/MapQR

7) Foundation Model

PartGLEE: A Foundation Model for Recognizing and Parsing Any Objects

paper: coming soon
code: coming soon

8) Robust Perception

Rethinking Data Augmentation for Robust LiDAR Semantic Segmentation in Adverse Weather

R2-Bench: Benchmarking the Robustness of Referring Perception Models under Perturbations

paper: coming soon
code: https://github.com/lxa9867/r2bench

9) 3D Object Detection | 三维目标检测

Weakly Supervised 3D Object Detection via Multi-Level Visual Guidance

paper: https://arxiv.org/pdf/2312.07530
code: https://github.com/KuanchihHuang/VG-W3D

GraphBEV: Towards Robust BEV Feature Alignment for Multi-Modal 3D Object Detection

paper: https://arxiv.org/pdf/2403.11848
code: https://github.com/adept-thu/GraphBEV

RecurrentBEV: A Long-term Temporal Fusion Framework for Multi-view 3D Detection

paper: coming soon
code: https://github.com/lucifer443/RecurrentBEV

Ray Denoising: Depth-aware Hard Negative Sampling for Multi-view 3D Object Detection

paper: https://arxiv.org/pdf/2402.03634
code: https://github.com/LiewFeng/RayDN

MonoWAD: Weather-Adaptive Diffusion Model for Robust Monocular 3D Object Detection

paper: coming soon
code: https://github.com/VisualAIKHU/MonoWAD

DualBEV: CNN is All You Need in View Transformation

paper: https://arxiv.org/pdf/2403.05402
code: https://github.com/PeidongLi/DualBEV

OPEN: Object-wise Position Embedding for Multi-view 3D Object Detection

paper: coming soon
code: https://github.com/AlmoonYsl/OPEN

Make Your ViT-based Multi-view 3D Detectors Faster via Token Compression

paper: coming soon
code: coming soon

SEED: A Simple and Effective 3D DETR in Point Clouds

paper: coming soon
code: coming soon

Towards Stable 3D Object Detection

paper: https://arxiv.org/pdf/2407.04305
code: https://github.com/jbwang1997/StabilityIndex

FSD-BEV: Foreground Self-Distillation for Multi-view 3D Object Detection

paper: coming soon
code: https://github.com/CocoBoom/fsd-bev

HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras

paper: https://arxiv.org/pdf/2404.02517
code: https://github.com/VDIGPKU/HENet

RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception

paper: https://arxiv.org/pdf/2407.10876
code: https://github.com/jbji/RepVF

Approaching Outside: Scaling Unsupervised 3D Object Detection from 2D Scene

paper: https://arxiv.org/abs/2407.08569
code: https://github.com/Ruiyang-061X/LiSe

SimPB: A Single Model for 2D and 3D Object Detection from Multiple Cameras

paper: https://arxiv.org/pdf/2403.10353
code: https://github.com/nullmax-vision/SimPB

Interactive 3D Object Detection with Prompts

paper: https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/02556.pdf
code: coming soon

CSOT: Cross-Scan Object Transfer for Semi-Supervised LiDAR Object Detection

10) Domain Adaptation & Test-Time Adaptation

Enhancing Source-Free Domain Adaptive Object Detection with Low-Confidence Pseudo-Label Distillation

paper: coming soon
code: https://github.com/junia3/LPLD

Fully Test-Time Adaptation for Monocular 3D Object Detection

paper: coming soon
code: https://github.com/Hongbin98/MonoTTA

Progressive Classifier and Feature Extractor Adaptation for Unsupervised Domain Adaptation on Point Clouds

paper: https://arxiv.org/pdf/2303.01276
code: https://github.com/xiaoyao3302/PCFEA

CMD: A Cross Mechanism Domain Adaptation Dataset for 3D Object Detection

paper: coming soon
code: https://github.com/im-djh/CMD

11) Cooperative Perception | 协同感知

Plug and Play: A Representation Enhanced Domain Adapter for Collaborative Perception

paper: coming soon
code: https://github.com/luotianyou349/PnPDA

Align before Collaborate: Mitigating Feature Misalignment for Robust Multi-Agent Perception

paper: https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/00560.pdf
code: coming soon

12) World Model

Neural Volumetric World Models for Autonomous Driving

paper: https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/02571.pdf
code: coming soon

13) Scene Flow Estimation | 场景流估计

4D Contrastive Superflows are Dense 3D Representation Learners

paper: https://arxiv.org/pdf/2407.06190
code: https://github.com/Xiangxu-0103/SuperFlow

SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving

paper: https://arxiv.org/pdf/2407.01702
code: https://github.com/KTH-RPL/SeFlow

I Can't Believe It's Not Scene Flow!

14) Point Cloud Semantic Segmentation | 点云

T-CorresNet: Template Guided 3D Point Cloud Completion with Correspondence Pooling Query Generation Strategy

paper: coming soon
code: https://github.com/df-boy/T-CorresNet

Dual-level Adaptive Self-Labeling for Novel Class Discovery in Point Cloud Segmentation

paper: https://arxiv.org/pdf/2407.12489
code: https://github.com/RikkiXu/NCD_PC

SFPNet: Sparse Focal Point Network for Semantic Segmentation on General LiDAR Point Clouds

paper: https://arxiv.org/pdf/2407.11569
code: coming soon

RAPiD-Seg: Range-Aware Pointwise Distance Distribution Networks for 3D LiDAR Segmentation

paper: https://arxiv.org/pdf/2407.10159
code: coming soon

ItTakesTwo: Leveraging Peer Representations for Semi-supervised LiDAR Semantic Segmentation

paper: https://arxiv.org/pdf/2407.07171
code: coming soon

Rethinking Data Augmentation for Robust LiDAR Semantic Segmentation in Adverse Weather

MUSES: The Multi-Sensor Semantic Perception Dataset for Driving under Uncertainty

paper: https://arxiv.org/pdf/2401.12761
code: https://github.com/timbroed/MUSES

Train Till You Drop: Towards Stable and Robust Source-free Unsupervised 3D Domain Adaptation

T-MAE: Temporal Masked Autoencoders for Point Cloud Representation Learning

paper: https://arxiv.org/pdf/2312.10217
code: coming soon

15) Generative Model

RealGen: Retrieval Augmented Generation for Controllable Traffic Scenarios

paper: https://arxiv.org/pdf/2312.13303
code: https://realgen.github.io/

SLEDGE: Synthesizing Driving Environments with Generative Models and Rule-Based Traffic

paper: https://arxiv.org/pdf/2403.17933
code: https://github.com/autonomousvision/sledge

Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering

paper: https://arxiv.org/pdf/2408.09702
code: https://research.nvidia.com/labs/toronto-ai/DiPIR/

16) Optical Flow

SEA-RAFT: Simple, Efficient, Accurate RAFT for Optical Flow

paper: https://arxiv.org/pdf/2405.14793
code: https://github.com/princeton-vl/SEA-RAFT

17) Radar | 毫米波雷达

Sparse Beats Dense: Rethinking Supervision in Radar-Camera Depth Completion

paper: https://arxiv.org/pdf/2312.00844
code: https://github.com/megvii-research/Sparse-Beats-Dense

18) NeRF & Gaussian Splatting

Street Gaussians: Modeling Dynamic Urban Scenes with Gaussian Splatting

paper: https://arxiv.org/pdf/2401.01339
code: https://github.com/zju3dv/street_gaussians

MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images

paper: https://arxiv.org/pdf/2403.14627
code: https://github.com/donydchen/mvsplat

GScream: Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal

paper: https://arxiv.org/pdf/2404.13679
code: https://github.com/W-Ted/GScream

BeNeRF: Neural Radiance Fields from a Single Blurry Image and Event Stream

paper: coming soon
code: https://github.com/WU-CVGL/BeNeRF

PreSight: Enhancing Autonomous Vehicle Perception with City-Scale NeRF Priors

paper: https://arxiv.org/pdf/2403.09079
code: https://github.com/yuantianyuan01/PreSight

GaussianImage: 1000 FPS Image Representation and Compression by 2D Gaussian Splatting

paper: https://arxiv.org/pdf/2403.08551
code: https://github.com/Xinjie-Q/GaussianImage

SG-NeRF: Neural Surface Reconstruction with Scene Graph Optimization

paper: coming soon
code: https://github.com/Iris-cyy/SG-NeRF

Disentangled Generation and Aggregation for Robust Radiance Fields

paper: coming soon
code: https://github.com/GaoHchen/Robust-Triplane

RPBG: Towards Robust Neural Point-based Graphics in the Wild

paper: https://arxiv.org/pdf/2405.05663
code: coming soon

19) Object Tracking | 目标跟踪

Beyond MOT: Semantic Multi-Object Tracking

paper: coming soon
code: https://github.com/HengLan/SMOT

3D Single-object Tracking in Point Clouds with High Temporal Variation

paper: https://arxiv.org/pdf/2408.02049
code: coming soon

OneTrack: Demystifying the Conflict Between Detection and Tracking in End-to-End 3D Trackers

paper: https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/01174.pdf
code: coming soon

Walker: Self-supervised Multiple Object Tracking by Walking on Temporal Object Appearance Graphs

Boosting 3D Single Object Tracking with 2D Matching Distillation and 3D Pre-training

paper: https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/01900.pdf
code: coming soon

20) Lane Detection | 车道线检测

OMR: Occlusion-Aware Memory-Based Refinement for Video Lane Detection

paper: https://arxiv.org/pdf/2408.07486
code: https://github.com/dongkwonjin/OMR

RoadPainter: Points Are Ideal Navigators for Topology transformER

paper: https://arxiv.org/pdf/2407.15349
code: coming soon

21) Motion Prediction | 运动预测

22) Trajectory Prediction | 轨迹预测

VisionTrap: Vision-Augmented Trajectory Prediction Guided by Textual Descriptions

paper: https://arxiv.org/pdf/2407.12345
code: https://moonseokha.github.io/VisionTrap/

Risk-Aware Self-Consistent Imitation Learning for Trajectory Planning in Autonomous Driving

paper: https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/02087.pdf
code: coming soon

23) Depth Estimation | 深度估计

Leveraging Synthetic Data for Real-Domain High-Resolution Monocular Metric Depth Estimation

paper: coming soon
code: https://github.com/zhyever/PatchRefiner

ProDepth: Boosting Self-Supervised Multi-Frame Monocular Depth with Probabilistic Fusion

paper: https://arxiv.org/pdf/2407.09303
code: https://github.com/Sungmin-Woo/ProDepth

Diffusion Models for Monocular Depth Estimation: Overcoming Challenging Conditions

paper: https://arxiv.org/pdf/2407.16698
code: https://github.com/fabiotosi92/Diffusion4RobustDepth

Mono-ViFI: A Unified Learning Framework for Self-supervised Single- and Multi-frame Monocular Depth Estimation

paper: https://arxiv.org/pdf/2407.14126
code: https://github.com/LiuJF1226/Mono-ViFI

M2Depth: Self-supervised Two-Frame Multi-camera Metric Depth Estimation

paper: https://arxiv.org/pdf/2405.02004
code: https://heiheishuang.xyz/M2Depth/

24) Event Camera | 事件相机

25) Odometry

DVLO: Deep Visual-LiDAR Odometry with Local-to-Global Feature Fusion and Bi-Directional Structure Alignment

paper: coming soon
code: https://github.com/IRMVLab/DVLO

26) Normalized Object Coordinates

OmniNOCS: A unified NOCS dataset and model for 3D lifting of 2D objects

paper: https://arxiv.org/pdf/2407.08711
code: https://omninocs.github.io/

Postscript

This list of papers is primarily curated by Rujia Wang.

If you have any questions about the paper list, please do not hesitate to email me or the Auto Driving Heart Team, or open an issue on GitHub.
