This example implements FP8 quantization of popular model architectures, such as ResNet, on the ImageNet dataset. FP8 quantization is supported by the Intel Gaudi2 AI Accelerator.
To run on Intel Gaudi2, a Docker image with the Gaudi Software Stack is recommended. Refer to the following script for environment setup; more details can be found in the Gaudi Guide.
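To give a feel for what FP8 quantization does to individual values, here is a minimal pure-Python sketch of per-tensor scaling followed by rounding to a 4-bit-exponent, 3-bit-mantissa (E4M3) grid. It is an illustration only, not the code path used by this example: the names `round_to_e4m3` and `fake_quantize` are hypothetical, and the maximum value of 448 assumes the OCP E4M3 variant, which may differ from the range a given accelerator uses.

```python
import math

# Assumption: largest finite value of the OCP E4M3 format.
E4M3_MAX = 448.0


def round_to_e4m3(v: float) -> float:
    """Round v to the nearest value on an E4M3-style grid (3 mantissa bits)."""
    if v == 0.0:
        return 0.0
    sign = -1.0 if v < 0 else 1.0
    mag = min(abs(v), E4M3_MAX)   # saturate instead of overflowing
    _, exp = math.frexp(mag)      # mag = m * 2**exp with 0.5 <= m < 1
    exp = max(exp - 1, -6)        # -6 is the smallest normal exponent
    step = 2.0 ** (exp - 3)       # spacing of the grid at this exponent
    return sign * round(mag / step) * step


def fake_quantize(values):
    """Scale values so the largest maps to E4M3_MAX, round, then scale back."""
    amax = max(abs(v) for v in values)
    if amax == 0.0:
        return list(values), 1.0
    scale = amax / E4M3_MAX
    return [round_to_e4m3(v / scale) * scale for v in values], scale
```

For example, `round_to_e4m3(300.0)` lands on 288.0, because neighbouring representable values between 256 and 448 are 32 apart. The scale factor is what lets a tensor with a small dynamic range use the full FP8 grid.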
# Run a container with an interactive shell
docker run -it --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --net=host --ipc=host vault.habana.ai/gaudi-docker/1.17.0/ubuntu22.04/habanalabs/pytorch-installer-2.3.1:latest
- Install requirements
pip install -r requirements.txt
- Download the ImageNet dataset from http://www.image-net.org/
- Then, move and extract the training and validation images to labeled subfolders, using the following shell script
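As a rough illustration of the labeled-subfolder layout, the sketch below moves images into one subfolder per class. It is a minimal Python example under stated assumptions, not a script from this repository: the function name `sort_into_class_folders` is hypothetical, and it assumes you already have a mapping from image file name to class (synset) ID.

```python
import shutil
from pathlib import Path


def sort_into_class_folders(image_dir, labels):
    """Move each image into a subfolder named after its class label.

    `labels` maps an image file name to its class ID (an assumed input).
    Returns the number of files that were moved.
    """
    image_dir = Path(image_dir)
    moved = 0
    for name, label in labels.items():
        src = image_dir / name
        if not src.is_file():
            continue  # skip entries whose image is missing
        dest_dir = image_dir / label
        dest_dir.mkdir(exist_ok=True)
        shutil.move(str(src), str(dest_dir / name))
        moved += 1
    return moved
```

After such a step, the validation directory would contain paths of the form `val/<class_id>/<image>.JPEG`, which is the layout that folder-per-class data loaders expect.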
To quantize a model and validate its accuracy, run main.py with the desired model architecture and the path to the ImageNet dataset:
python main.py --pretrained -t -a resnet50 -b 30 /path/to/imagenet
or
bash run_quant.sh --input_model=resnet50 --dataset_location=/path/to/imagenet