Prep for release
warner-benjamin committed Oct 17, 2023
1 parent 50895ce commit b4e6270
Showing 12 changed files with 400 additions and 125 deletions.
75 changes: 48 additions & 27 deletions README.md
@@ -4,44 +4,61 @@

### Train fastai models faster (and other useful tools)

-![fastxtend accelerates fastai](nbs/images/imagenette_benchmark.png)
+![fastxtend accelerates
+fastai](https://github.com/warner-benjamin/fastxtend/blob/main/nbs/images/imagenette_benchmark.png?raw=true)

Train fastai models faster with fastxtend’s [fused
-optimizers](optimizer.fused.html), [Progressive
-Resizing](callback.progresize.html) callback, and integrated [FFCV
-DataLoader](ffcv.tutorial.html).
+optimizers](https://fastxtend.benjaminwarner.dev/optimizer.fused.html),
+[Progressive
+Resizing](https://fastxtend.benjaminwarner.dev/callback.progresize.html)
+callback, integrated [FFCV
+DataLoader](https://fastxtend.benjaminwarner.dev/ffcv.tutorial.html),
+and integrated [PyTorch
+Compile](https://fastxtend.benjaminwarner.dev/callback.compiler.html)
+support.

## Feature overview

**Train Models Faster**

-- Drop in [fused optimizers](optimizer.fused.html), which are 21 to 293
-percent faster then fastai native optimizers.
+- Drop in [fused
+optimizers](https://fastxtend.benjaminwarner.dev/optimizer.fused.html),
+which are 21 to 293 percent faster than fastai native optimizers.
- Up to 75% optimizer memory savings with integrated
[bitsandbytes](https://github.com/TimDettmers/bitsandbytes) [8-bit
-optimizers](optimizer.eightbit.html).
+optimizers](https://fastxtend.benjaminwarner.dev/optimizer.eightbit.html).
- Increase GPU throughput and decrease training time with the
-[Progressive Resizing](callback.progresize.html) callback.
-- Use the highly optimized [FFCV DataLoader](ffcv.tutorial.html), fully
-integrated with fastai.
+[Progressive
+Resizing](https://fastxtend.benjaminwarner.dev/callback.progresize.html)
+callback.
+- Use the highly optimized [FFCV
+DataLoader](https://fastxtend.benjaminwarner.dev/ffcv.tutorial.html),
+fully integrated with fastai.
- Integrated support for `torch.compile` via the
-[Compile](callback.compiler.html) callbacks.
+[Compile](https://fastxtend.benjaminwarner.dev/callback.compiler.html)
+callbacks.
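
For illustration of the training-speed features listed above (this sketch is not part of the commit's diff): a minimal example assuming the documented all-import pattern, that fastxtend's optimizer factories take a `foreach=True` flag, and that `ProgressiveResize` drops in as a standard fastai callback. Check the linked docs for exact names and arguments.

```python
# Hedged sketch, not from this commit: the flag and callback names are assumptions
# based on the linked fastxtend docs and may differ from the actual API.
from fastai.vision.all import *
from fastxtend.vision.all import *

# Small Imagenette dataset for a quick demonstration run
dls = ImageDataLoaders.from_folder(
    untar_data(URLs.IMAGENETTE_160), valid='val', item_tfms=Resize(160))

learn = Learner(
    dls, xresnet50(n_out=dls.c), metrics=accuracy,
    opt_func=adam(foreach=True),  # fused/for-each optimizer (assumed flag name)
    cbs=ProgressiveResize(),      # start small, grow image size during training
)
learn.fit_flat_cos(5, 3e-3)
```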

**General Features**

- Fused implementations of modern optimizers, such as
-[Adan](optimizer.adan.html), [Lion](optimizer.lion.html), &
-[StableAdam](optimizer.stableadam.html).
-- Hugging Face [Transformers compatibility](text.huggingface.html) with
-fastai
-- Flexible [metrics](metrics.html) which can log on train, valid, or
-both. Backwards compatible with fastai metrics.
-- Easily use [multiple losses](multiloss.html) and log each individual
-loss on train and valid.
-- [Multiple profilers](callback.profiler.html) for profiling training
-and identifying bottlenecks.
-- A fast [Exponential Moving Average](callback.ema.html) callback for
-smoother training.
+[Adan](https://fastxtend.benjaminwarner.dev/optimizer.adan.html),
+[Lion](https://fastxtend.benjaminwarner.dev/optimizer.lion.html), &
+[StableAdam](https://fastxtend.benjaminwarner.dev/optimizer.stableadam.html).
+- Hugging Face [Transformers
+compatibility](https://fastxtend.benjaminwarner.dev/text.huggingface.html)
+with fastai
+- Flexible [metrics](https://fastxtend.benjaminwarner.dev/metrics.html)
+which can log on train, valid, or both. Backwards compatible with
+fastai metrics.
+- Easily use [multiple
+losses](https://fastxtend.benjaminwarner.dev/multiloss.html) and log
+each individual loss on train and valid.
+- [Multiple
+profilers](https://fastxtend.benjaminwarner.dev/callback.profiler.html)
+for profiling training and identifying bottlenecks.
+- A fast [Exponential Moving
+Average](https://fastxtend.benjaminwarner.dev/callback.ema.html)
+callback for smoother training.
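
The multiple-losses bullet above describes behavior rather than a signature. The sketch below shows how it might plug into a `Learner`; `MultiLoss` and its keyword names are assumptions about the multiloss module rather than quotes from the docs, so verify before use.

```python
# Hedged sketch: MultiLoss and its keyword arguments below are assumptions about
# the fastxtend multiloss API, not taken from this commit; verify against the docs.
import torch.nn as nn
from fastai.vision.all import *
from fastxtend.vision.all import *

dls = ImageDataLoaders.from_folder(
    untar_data(URLs.IMAGENETTE_160), valid='val', item_tfms=Resize(160))

loss_func = MultiLoss(
    loss_funcs=[nn.CrossEntropyLoss, LabelSmoothingCrossEntropy],
    weights=[1.0, 0.5],             # combined as a weighted sum
    loss_names=['ce', 'smooth_ce']  # each component logged on train and valid
)

learn = Learner(dls, xresnet50(n_out=dls.c), loss_func=loss_func, metrics=accuracy)
learn.fit_flat_cos(5, 3e-3)
```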

**Vision**

@@ -52,11 +69,15 @@ DataLoader](ffcv.tutorial.html).
[`CutMixUp`](https://fastxtend.benjaminwarner.dev/callback.cutmixup.html#cutmixup)
or
[`CutMixUpAugment`](https://fastxtend.benjaminwarner.dev/callback.cutmixup.html#cutmixupaugment).
-- Additional [image augmentations](vision.augment.batch.html).
+- Additional [image
+augmentations](https://fastxtend.benjaminwarner.dev/vision.augment.batch.html).
- Support for running fastai [batch transforms on
-CPU](vision.data.html).
-- More [attention](vision.models.attention_modules.html) and
-[pooling](vision.models.pooling.html) modules
+CPU](https://fastxtend.benjaminwarner.dev/vision.data.html).
+- More
+[attention](https://fastxtend.benjaminwarner.dev/vision.models.attention_modules.html)
+and
+[pooling](https://fastxtend.benjaminwarner.dev/vision.models.pooling.html)
+modules
- A flexible implementation of fastai’s
[`XResNet`](https://fastxtend.benjaminwarner.dev/vision.models.xresnet.html#xresnet).
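
As a brief illustration of the vision callbacks above (again, not part of the diff): `CutMixUpAugment` is the name used in the README, while the zero-argument construction assumes its defaults are reasonable.

```python
# Hedged sketch: CutMixUpAugment is named in the README above; default arguments
# are assumed here, see the callback.cutmixup docs for the real options.
from fastai.vision.all import *
from fastxtend.vision.all import *

dls = ImageDataLoaders.from_folder(
    untar_data(URLs.IMAGENETTE_160), valid='val', item_tfms=Resize(160))

learn = Learner(dls, xresnet50(n_out=dls.c), metrics=accuracy,
                cbs=CutMixUpAugment())  # combines MixUp, CutMix, and batch augmentations
learn.fit_flat_cos(5, 3e-3)
```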

2 changes: 1 addition & 1 deletion fastxtend/__init__.py
@@ -1 +1 @@
__version__ = "0.1.5"
__version__ = "0.1.5.post1"
2 changes: 1 addition & 1 deletion fastxtend/callback/compiler.py
@@ -149,7 +149,7 @@ def _reset_compiled(self):

# %% ../../nbs/callback.compiler.ipynb 16
class DynamoExplainCallback(Callback):
"An experimental callback to find graph breaks with `torch.compile` (beta)"
"A callback to automate finding graph breaks with PyTorch Compile's Dynamo Explain"
order = MixedPrecision.order+1 # DynamoExplain occurs on the GPU before any training starts

def __init__(self,
2 changes: 1 addition & 1 deletion install.sh
@@ -1,5 +1,5 @@
conda create -n fastxtend python=3.11 "pytorch>=2.1" torchvision torchaudio \
pytorch-cuda=12.1 fastai nbdev pkg-config libjpeg-turbo opencv tqdm psutil \
-terminaltables numpy "numba>=0.57" librosa timm kornia rich typer wandb \
+terminaltables numpy "numba>=0.57" "librosa>=0.10.1" timm kornia rich typer wandb \
"transformers>=4.34" "tokenizers>=0.14" "datasets>=2.14" ipykernel ipywidgets \
"matplotlib<3.8" -c pytorch -c nvidia -c fastai -c huggingface -c conda-forge
2 changes: 1 addition & 1 deletion nbs/_quarto.yml
@@ -12,7 +12,7 @@ format:
toc-depth: 4
highlight-style: custom.theme
html-math-method: katex
-fontsize: 1.1rem
+fontsize: 1.15rem
grid:
sidebar-width: 275px
body-width: 1025px
4 changes: 2 additions & 2 deletions nbs/callback.compiler.ipynb
@@ -39,7 +39,7 @@
"\n",
"For more information on `torch.compile` please read *[PyTorch's getting started](https://pytorch.org/docs/master/compile/get-started.html)* guide. For troubleshooting `torch.compile` refer to this [PyTorch Nightly guide](https://pytorch.org/docs/master/compile/index.html#troubleshooting-and-gotchas).\n",
"\n",
"This module is not imported via any fastxtend all imports. You must import it separately after importing fastai and fastxtend:\n",
"This module is not imported via any fastxtend all imports. You must import it separately after importing fastai and fastxtend as it modifies model saving, loading, and training:\n",
"\n",
"```python\n",
"from fastxtend.callback import compiler\n",
@@ -305,7 +305,7 @@
"source": [
"#|export\n",
"class DynamoExplainCallback(Callback):\n",
" \"An experimental callback to find graph breaks with `torch.compile` (beta)\"\n",
" \"A callback to automate finding graph breaks with PyTorch Compile's Dynamo Explain\"\n",
" order = MixedPrecision.order+1 # DynamoExplain occurs on the GPU before any training starts\n",
"\n",
" def __init__(self,\n",
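
To tie the compiler-notebook changes above together, a short hedged sketch of the import pattern the notebook text describes: the explicit `from fastxtend.callback import compiler` import is quoted from the notebook, while constructing `DynamoExplainCallback` with no arguments is an assumption about its defaults.

```python
# The compiler module is not pulled in by fastxtend's all-imports (per the notebook
# text above), so import it explicitly after fastai and fastxtend.
from fastai.vision.all import *
from fastxtend.vision.all import *
from fastxtend.callback import compiler  # modifies model saving, loading, and training

dls = ImageDataLoaders.from_folder(
    untar_data(URLs.IMAGENETTE_160), valid='val', item_tfms=Resize(160))

learn = Learner(dls, xresnet50(n_out=dls.c), metrics=accuracy,
                # assumed defaults: reports torch.compile graph breaks before training starts
                cbs=compiler.DynamoExplainCallback())
learn.fit_one_cycle(1, 3e-3)
```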

0 comments on commit b4e6270
