
Nan logits when performing inference using ModernBERT #35574

Closed
yaswanthg15 opened this issue Jan 9, 2025 · 3 comments
yaswanthg15 commented Jan 9, 2025

System Info

transformers == 4.48.0.dev0
torch == 2.2.2

Description

I have fine-tuned a ModernBERT model for a multi-label classification task. When performing batched inference, the logits for every sample in a batch are NaN except for one sample. That is, in each batch, all logits other than one sample's are NaN.

Who can help?

@tomaarsen @ArthurZucker

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Simply calling model(**batch) produces the NaN logits, while with batch size 1 no issue exists.
Code:

import torch
from tqdm import tqdm

model.eval()
labels, predictions, confidences = [], [], []

with torch.no_grad():
    for batch in tqdm(val_loader):
        # Collect ground-truth labels, then drop them so they are not passed to the model
        labels.extend(batch['labels'].cpu().numpy())
        batch.pop('labels')
        batch = {k: v.to(device) for k, v in batch.items()}

        outputs = model(**batch)

        # Multi-label setup: independent sigmoid per label, thresholded at 0.5
        probs = torch.sigmoid(outputs.logits)
        preds = (probs > 0.5).int()
        confs = probs

        predictions.extend(preds.cpu().numpy())
        confidences.extend(confs.cpu().numpy())
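To pin down the reported pattern (all rows NaN except one), a small diagnostic helper can flag which rows of the logit tensor contain NaNs before thresholding. This is a hypothetical sketch, not from the original report; `nan_row_mask` and the fabricated logit tensor are illustrative only:

```python
import torch

def nan_row_mask(logits: torch.Tensor) -> torch.Tensor:
    """Return a boolean mask marking rows whose logits contain any NaN."""
    return torch.isnan(logits).any(dim=-1)

# Fabricated (batch=3, labels=4) logits reproducing the reported shape of the bug:
# one healthy row, the rest NaN.
logits = torch.tensor([
    [0.2, -1.3, 0.7, 0.0],
    [float("nan")] * 4,
    [float("nan")] * 4,
])
mask = nan_row_mask(logits)
print(mask.tolist())  # [False, True, True]
```

Logging `mask` per batch inside the inference loop would show whether the surviving sample is always at the same batch index, which can hint at a padding or attention-mask problem.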

Expected behavior

Expecting valid logit values instead of NaN.

yaswanthg15 added the bug label Jan 9, 2025
tomaarsen (Member) commented:

Could you add a snippet to reproduce? I've done lots of inference and I've never seen NaNs yet.

  • Tom Aarsen

yaswanthg15 (Author) commented:

@tomaarsen It is not possible for me to make the model public, but as for the issue, I am performing a simple inference loop in PyTorch.

ArthurZucker (Collaborator) commented:

In order to help, we would need to know:

  • the dtype
  • the attention implementation

This should help us, as a lot of NaNs can come from SDPA or casting.
