Implementation of the SECOND paper for 3D Object Detection, with the following improvements:
- Full Lyft Dataset Integration
- Parallel Data Preparation with Ray
- Checkpoints during training
- Class upsampling, as in CBGS paper
- Mean IoU Computation, as in the Lyft Kaggle Competition
- Parallel Score Computation
- Debugged config usage (some config options were not truly connected to anything)
- Added pathlib Support
- Added Scripts for Evaluation, Training and Data Prep
- Handling of corrupted scenes in the Lyft Dataset
This repo is based on @traveller59's second.pytorch.
Using this code and configuration, I won 27th place in the 2019 Lyft 3D Object Detection Kaggle Competition.
git clone https://github.com/traveller59/second.pytorch.git
cd ./second.pytorch/second
wget "https://github.com/Kitware/CMake/releases/download/v3.15.4/cmake-3.15.4.tar.gz"
tar xf cmake-3.15.4.tar.gz
cd ./cmake-3.15.4/
./configure
make
make install
export PATH=/usr/local/bin:$PATH
cmake --version
conda create -n env_stereo python=3.6
conda activate env_stereo
conda install pytorch==1.0.0 torchvision==0.2.1 cuda100 -c pytorch
git clone https://github.com/Michalos88/spconv --recursive
sudo apt-get install libboost-all-dev
cd spconv/
python setup.py bdist_wheel
cd ./dist/
pip install spconv-1.1-cp36-cp36m-linux_x86_64.whl
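The exact wheel filename may differ depending on the spconv version you built; check the contents of dist/. As an optional sanity check that the wheel installed correctly:
python -c "import spconv; print('spconv imported successfully')"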
conda install scikit-image scipy numba pillow matplotlib
pip install fire tensorboardX protobuf opencv-python ray
If you want to use the NuScenes dataset, you need to install nuscenes-devkit. If you want to use the Lyft dataset, you need to install the Lyft devkit (see the example commands below).
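For example, both devkits can be installed from PyPI (package names assumed here; the Lyft devkit is published as lyft_dataset_sdk):
pip install nuscenes-devkit
pip install lyft_dataset_sdk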
You need to add the following environment variables for numba.cuda; you can add them to your ~/.bashrc:
export NUMBAPRO_CUDA_DRIVER=/usr/lib/x86_64-linux-gnu/libcuda.so
export NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so
export NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice
export PYTHONPATH=.
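After sourcing ~/.bashrc, an optional sanity check that numba can find the CUDA driver (this only verifies that CUDA is visible to numba in general, not the specific paths above):
python -c "from numba import cuda; print(cuda.is_available())"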
- KITTI Dataset preparation
Download KITTI dataset and create some directories first:
└── KITTI_DATASET_ROOT
    ├── training    <-- 7481 train data
    |   ├── image_2 <-- for visualization
    |   ├── calib
    |   ├── label_2
    |   ├── velodyne
    |   └── velodyne_reduced <-- empty directory
    └── testing     <-- 7518 test data
        ├── image_2 <-- for visualization
        ├── calib
        ├── velodyne
        └── velodyne_reduced <-- empty directory
Then run
python create_data.py kitti_data_prep --data_path=KITTI_DATASET_ROOT
- NuScenes Dataset preparation
Download NuScenes dataset:
└── NUSCENES_TRAINVAL_DATASET_ROOT
    ├── samples       <-- key frames
    ├── sweeps        <-- frames without annotation
    ├── maps          <-- unused
    └── v1.0-trainval <-- metadata and annotations
└── NUSCENES_TEST_DATASET_ROOT
    ├── samples       <-- key frames
    ├── sweeps        <-- frames without annotation
    ├── maps          <-- unused
    └── v1.0-test     <-- metadata
Then run
python create_data.py nuscenes_data_prep --data_path=NUSCENES_TRAINVAL_DATASET_ROOT --version="v1.0-trainval" --max_sweeps=10
python create_data.py nuscenes_data_prep --data_path=NUSCENES_TEST_DATASET_ROOT --version="v1.0-test" --max_sweeps=10 --dataset_name="NuscenesDataset"
- Lyft Dataset preparation
Download the Lyft dataset:
└── ../lyft_data/train
    ├── samples       <-- key frames
    ├── sweeps        <-- frames without annotation
    ├── maps          <-- unused
    └── v1.0-trainval <-- metadata and annotations
└── ../lyft_data/test
    ├── samples       <-- key frames
    ├── sweeps        <-- frames without annotation
    ├── maps          <-- unused
    └── v1.0-test     <-- metadata
Then run
python create_data.py lyft_data_prep --data_path=../lyft_data/train --version="v1.0-trainval"
python create_data.py nuscenes_data_prep --data_path=../lyft_data/test --version="v1.0-test" --dataset_name="LyftDataset"
I recommend using script.py for training and evaluation; see script.py for details.
python ./pytorch/train.py train --config_path=./configs/car.fhd.config --model_dir=/path/to/model_dir
Assume you have 4 GPUs and want to train with 3 GPUs:
CUDA_VISIBLE_DEVICES=0,1,3 python ./pytorch/train.py train --config_path=./configs/car.fhd.config --model_dir=/path/to/model_dir --multi_gpu=True
Note: batch_size and num_workers in the config file are per-GPU; when you use multi-GPU training, they are multiplied by the number of GPUs. Don't modify them manually.
You need to modify the total number of steps in the config file. For example, 50 epochs correspond to 15500 steps for car.lite.config on a single GPU; if you use 4 GPUs, divide steps and steps_per_eval by 4 (15500 / 4 ≈ 3875 steps).
To enable mixed-precision (FP16) training, modify the config file and set enable_mixed_precision to true.
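For example, you can locate the flag with grep and flip it in place (this assumes the stock second.pytorch config layout, where train_config contains an enable_mixed_precision field):
grep -n "enable_mixed_precision" ./configs/car.fhd.config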
- Make sure "/path/to/model_dir" doesn't exist if you want to train a new model. A new directory will be created if model_dir doesn't exist; otherwise the checkpoints in it will be read.
- Training uses batch_size=6 by default for a 1080Ti; reduce the batch size if your GPU has less memory.
- Currently only single-GPU training is supported, but training a model takes only about 20 hours (165 epochs) on a single 1080Ti, and only 50 epochs are needed to reach 78.3 AP on car moderate 3D (with super convergence) on the KITTI validation dataset.
python ./pytorch/train.py evaluate --config_path=./configs/car.fhd.config --model_dir=/path/to/model_dir --measure_time=True --batch_size=1
- Detection results will be saved as a result.pkl file in model_dir/eval_results/step_xxx, or in the official KITTI label format if you use --pickle_result=False.
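Since result.pkl is a plain pickle file, you can inspect it with a one-liner (a sketch only; it assumes the file unpickles to a list of per-example detections, and the exact structure of each entry depends on the dataset and config):
python -c "import pickle; r = pickle.load(open('/path/to/model_dir/eval_results/step_xxx/result.pkl', 'rb')); print(type(r), len(r))"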
You can download pretrained models from Google Drive. The car_fhd model corresponds to car.fhd.config. Note that this pretrained model was trained before a sparse convolution bug was fixed, so the evaluation results may be slightly worse.
You can use a prebuilt docker for testing:
docker pull scrin/second-pytorch
Then run:
nvidia-docker run -it --rm -v /media/yy/960evo/datasets/:/root/data -v $HOME/pretrained_models:/root/model --ipc=host second-pytorch:latest
python ./pytorch/train.py evaluate --config_path=./configs/car.config --model_dir=/root/model/car