Can't run on Linux #151
Comments
I'm also having a similar problem when trying to run the 70B model on Arch Linux:

llama-gpt-ui-1 | [INFO wait] Host [llama-gpt-api:8000] not yet available...
llama-gpt-api-1 | /usr/local/lib/python3.11/site-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_alias" has conflict with protected namespace "model_".
llama-gpt-api-1 |
llama-gpt-api-1 | You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ('settings_',)`.
llama-gpt-api-1 | warnings.warn(
llama-gpt-api-1 | llama.cpp: loading model from /models/llama-2-70b-chat.bin
llama-gpt-api-1 | llama_model_load_internal: warning: assuming 70B model based on GQA == 8
llama-gpt-api-1 | llama_model_load_internal: format = ggjt v3 (latest)
llama-gpt-api-1 | llama_model_load_internal: n_vocab = 32001
llama-gpt-api-1 | llama_model_load_internal: n_ctx = 4096
llama-gpt-api-1 | llama_model_load_internal: n_embd = 8192
llama-gpt-api-1 | llama_model_load_internal: n_mult = 7168
llama-gpt-api-1 | llama_model_load_internal: n_head = 64
llama-gpt-api-1 | llama_model_load_internal: n_head_kv = 8
llama-gpt-api-1 | llama_model_load_internal: n_layer = 80
llama-gpt-api-1 | llama_model_load_internal: n_rot = 128
llama-gpt-api-1 | llama_model_load_internal: n_gqa = 8
llama-gpt-api-1 | llama_model_load_internal: rnorm_eps = 5.0e-06
llama-gpt-api-1 | llama_model_load_internal: n_ff = 28672
llama-gpt-api-1 | llama_model_load_internal: freq_base = 10000.0
llama-gpt-api-1 | llama_model_load_internal: freq_scale = 1
llama-gpt-api-1 | llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama-gpt-api-1 | llama_model_load_internal: model size = 70B
llama-gpt-api-1 | llama_model_load_internal: ggml ctx size = 0.07 MB
llama-gpt-api-1 | error loading model: llama.cpp: tensor 'layers.26.ffn_norm.weight' is missing from model
llama-gpt-api-1 | llama_load_model_from_file: failed to load model
llama-gpt-api-1 | Traceback (most recent call last):
llama-gpt-api-1 | File "<frozen runpy>", line 198, in _run_module_as_main
llama-gpt-api-1 | File "<frozen runpy>", line 88, in _run_code
llama-gpt-api-1 | File "/app/llama_cpp/server/__main__.py", line 46, in <module>
llama-gpt-api-1 | app = create_app(settings=settings)
llama-gpt-api-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
llama-gpt-api-1 | File "/app/llama_cpp/server/app.py", line 317, in create_app
llama-gpt-api-1 | llama = llama_cpp.Llama(
llama-gpt-api-1 | ^^^^^^^^^^^^^^^^
llama-gpt-api-1 | File "/app/llama_cpp/llama.py", line 328, in __init__
llama-gpt-api-1 | assert self.model is not None
llama-gpt-api-1 | ^^^^^^^^^^^^^^^^^^^^^^
llama-gpt-api-1 | AssertionError
llama-gpt-api-1 exited with code 1
llama-gpt-ui-1 | [INFO wait] Host [llama-gpt-api:8000] not yet available...
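For what it's worth, the failure here is the "error loading model: llama.cpp: tensor 'layers.26.ffn_norm.weight' is missing from model" line (the pydantic protected-namespace warning above it is harmless and unrelated). A missing tensor at load time usually means the model file on disk is truncated or corrupt, not that anything is wrong with Linux itself. A quick sanity check, assuming the models/ directory and the llama-2-70b-chat.bin filename shown in the log above, is to compare the file's on-disk size against the roughly 37 GB expected for a complete Q4_0 70B file:

# Hedged check, using the path from the log above: a complete
# llama-2-70b-chat.bin (Q4_0) should be roughly 37 GB, so a file
# that is much smaller means the download was interrupted.
ls -lah models/llama-2-70b-chat.bin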
I'm unfortunately having the same issue.
Same for me. Any workaround?
Hey, is there a fix or a workaround? :)
I also ran into this, so I compared the 7B setup with the 70B one and noticed they didn't have the same code. Since the "tensor ... is missing from model" error means the model file on disk is incomplete, how I solved this was to delete the partial download and let it fetch again (see the sketch after this comment).
To watch the model download, I just opened a new tab and ran: watch ls -lah models/ — while I'm typing this, it's at 21 GB... waiting for the full 37 GB :)
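A minimal sketch of that workaround, assuming the docker-compose setup implied by the container names in the log (where the API container downloads a missing model file on startup) and that the partial file is models/llama-2-70b-chat.bin:

# Stop the stack, remove the incomplete model file, and bring the
# stack back up so the model is re-downloaded from scratch.
docker compose down
rm models/llama-2-70b-chat.bin    # assumed path of the partial download
docker compose up

# In a second terminal, watch the file grow until it reaches ~37 GB:
watch ls -lah models/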
Why do I get all those errors? I'm on Arch Linux.