Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add loss: Noise Constrastive Estimation #65

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

christophmluscher
Copy link
Contributor

Add NCE loss

implemented by @Gerstenberger

depends on #64

i6_models/losses/nce.py Outdated Show resolved Hide resolved
Co-authored-by: Albert Zeyer <[email protected]>
i6_models/losses/nce.py Outdated Show resolved Hide resolved
@Gerstenberger
Copy link

Added some small comments about documentation. Otherwise looks fine to me.

i6_models/parts/losses/nce.py Outdated Show resolved Hide resolved
i6_models/parts/losses/nce.py Outdated Show resolved Hide resolved
def forward(self, data: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
"""

:param data: the tensor for the input data, where batch and time are flattened resulting in [B x T, F] shape.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments here are more confusing than helpful. So it should be specified what data is (or even rename it), First I assumed it is logits, but then I guess this should be the input to the "output" layer of the provided model? This is very confusing and unclear. Then it should also be clear what is the input shape. When you say "where batch and time are flattened resulting in" this sounds like this is done internally, but what you mean it expects [B x T, F] as only valid input, correct?

:param model: model on which the NCE loss is to be applied. The model requires a member called `output`, which
is expected to be a linear layer with bias and weight members. The member `output` represents the output
layer of the model allowing access to the parameters during loss computation.
:param noise_distribution_sampler: for example `i6_model.samplers.LogUniformSampler`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
:param noise_distribution_sampler: for example `i6_model.samplers.LogUniformSampler`.
:param noise_distribution_sampler: for example `i6_models.parts.samplers.LogUniformSampler`.


:param num_samples: num of samples for the estimation, normally a value between 1000-4000.
2000 is a good starting point.
:param model: model on which the NCE loss is to be applied. The model requires a member called `output`, which
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not pass in these two weights explicitly?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants