Fix computation in normalization layers #876

AakashKumarNain · 2024-10-12T10:04:49Z

Irrespective of the dtype passed to the normalization layers, the calculations should be done in fp32 or a higher-precision dtype. Not doing so can cause training instabilities, which are all too common when training models in half precision. We could leave it to the end user to do this explicitly, but it is an easy thing to miss. This PR takes care of that.
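For illustration, the idea described above can be sketched in JAX roughly as follows. This is not the code from this PR: `layer_norm_upcast` is a hypothetical helper, and casting the result back to the input dtype is an assumption made for the example.

```python
import jax
import jax.numpy as jnp


def layer_norm_upcast(x, weight=None, bias=None, eps=1e-5):
    """Hypothetical helper: normalise `x` in at least float32."""
    orig_dtype = x.dtype
    # float16/bfloat16 promote to float32; float32 and float64 are unchanged.
    compute_dtype = jnp.promote_types(orig_dtype, jnp.float32)
    x_hi = x.astype(compute_dtype)
    # Statistics are computed in the higher-precision dtype.
    mean = jnp.mean(x_hi, keepdims=True)
    var = jnp.var(x_hi, keepdims=True)
    out = (x_hi - mean) * jax.lax.rsqrt(var + eps)
    # Explicit casts keep every binary op on matching dtypes.
    if weight is not None:
        out = out * weight.astype(compute_dtype)
    if bias is not None:
        out = out + bias.astype(compute_dtype)
    # Assumption for this sketch: return in the caller's original dtype.
    return out.astype(orig_dtype)


x = jnp.linspace(-3.0, 3.0, 8).astype(jnp.float16)
y = layer_norm_upcast(x)
```

Here the caller still gets a float16 array back, but the mean, variance, and rescaling never run in half precision.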

patrick-kidger · 2024-10-12T10:48:53Z

Looks like the current implementation doesn't pass under strict-dtype-promotion (see failing tests), but other than that this sounds good to me!
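As a rough sketch of what strict dtype promotion means in practice (the arrays below are made up for the example and are not taken from the failing tests): under `jax.numpy_dtype_promotion("strict")`, mixing a float16 array with a float32 array is rejected unless one side is cast explicitly.

```python
import jax
import jax.numpy as jnp

x16 = jnp.ones(4, dtype=jnp.float16)
w32 = jnp.ones(4, dtype=jnp.float32)

with jax.numpy_dtype_promotion("strict"):
    try:
        # Implicit float16 * float32 promotion is disallowed in strict mode.
        _ = x16 * w32
        implicit_ok = True
    except (TypeError, ValueError):
        # Caught broadly here; the exact exception class is not assumed.
        implicit_ok = False

    # An explicit cast makes both operands float32, which strict mode accepts.
    explicit = x16.astype(jnp.float32) * w32
```

So an upcasting implementation needs explicit `astype` calls wherever the inputs and the parameters may have different dtypes.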

AakashKumarNain · 2024-10-12T14:18:31Z

Thanks @patrick-kidger, I have fixed it. Please let me know if you have any more suggestions on this one.

equinox/nn/_normalisation.py

patrick-kidger · 2024-10-14T17:44:12Z

Alright, this LGTM! Final question before I merge this, do you have a reference for needing higher precision for these operations? (As much for me to read up on as anything else :D )

AakashKumarNain · 2024-10-15T02:36:04Z

I think this is pretty much enough: https://pytorch.org/docs/stable/amp.html#autocast-op-reference
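One illustrative failure mode, as a small sketch rather than anything taken from the linked page: squaring moderately large float16 values, as a variance computation does, overflows the float16 range (maximum about 65504), while the same operation in float32 does not. The value 300 is chosen only for the demonstration.

```python
import jax.numpy as jnp

x16 = jnp.full((4,), 300.0, dtype=jnp.float16)

# 300 * 300 = 90000 exceeds the largest finite float16 value, so this overflows.
sq16 = x16 * x16

# The same product is representable after upcasting to float32.
sq32 = x16.astype(jnp.float32) * x16.astype(jnp.float32)

overflowed = bool(jnp.all(jnp.isinf(sq16)))
finite_in_fp32 = bool(jnp.all(jnp.isfinite(sq32)))
```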

patrick-kidger · 2024-10-18T08:04:37Z

Alright, looks good to me! Thanks for spotting this, and for the reference. Merged :)

fix formatting

7ebf602

fix dtype promotion during weights creating itself

8c538a0

patrick-kidger reviewed Oct 12, 2024

equinox/nn/_normalisation.py (outdated, resolved)

AakashKumarNain added 2 commits October 13, 2024 13:45

ignore false pywright type warning

abb5703

remove redundant dtype check

034093e

AakashKumarNain requested a review from patrick-kidger October 14, 2024 08:36

patrick-kidger merged commit e2d7e38 into patrick-kidger:main Oct 18, 2024
2 checks passed
