Everything works fine; it loads 7B into about 8GB VRAM. Great.
But when generating I get:
File "example.py", line 103, in main
results = generator.generate(
File "C:\Users\Shadow\Documents\LLama\llama-int8-main\llama\generation.py", line 60, in generate
next_token = torch.multinomial(
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
Any ideas what went wrong?
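One way to narrow this down: the error says the probabilities handed to `torch.multinomial` contain `inf`, `nan`, or a negative value, which usually means the logits were already non-finite before the softmax. Below is a rough, stdlib-only sketch of that failure mode, not the code from `generation.py`. The helper names `softmax` and `find_invalid` are made up for illustration. In the real script the equivalent check would presumably be something like `torch.isfinite(logits).all()` just before sampling.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a plain list of floats (illustrative helper)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    # Subtracting the max keeps exp() from overflowing on finite inputs.
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def find_invalid(probs):
    """Return (index, value) pairs that a multinomial sampler would reject."""
    return [
        (i, p)
        for i, p in enumerate(probs)
        if math.isnan(p) or math.isinf(p) or p < 0
    ]

# Finite logits give a valid distribution.
good = softmax([1.0, 2.0, 3.0])
print(find_invalid(good))

# A single inf logit turns the whole distribution into nan (inf - inf = nan).
bad = softmax([1.0, float("inf"), 3.0])
print(find_invalid(bad))
```

If a check like this fires on the logits, the problem is upstream of sampling (for example in the int8 matmul on that GPU) rather than in the sampling code itself.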
I get the same error when testing on a Tesla P40, but it runs successfully on an RTX A5000. Maybe it is because of the older card's lower compute capability?
I installed bitsandbytes following the guide for Windows, including the DLL from here.