Can't run on Linux #151

Open
ProgrammingLife opened this issue Feb 24, 2024 · 5 comments

@ProgrammingLife

ProgrammingLife commented Feb 24, 2024

Why do I get all these errors?
I'm on Arch Linux.

$ ./run.sh --model 7b
...
llama-gpt-api-1  | error loading model: llama.cpp: tensor 'layers.9.ffn_norm.weight' is missing from model
llama-gpt-api-1  | llama_load_model_from_file: failed to load model
llama-gpt-api-1  | Traceback (most recent call last):
llama-gpt-api-1  |   File "<frozen runpy>", line 198, in _run_module_as_main
llama-gpt-api-1  |   File "<frozen runpy>", line 88, in _run_code
llama-gpt-api-1  |   File "/app/llama_cpp/server/__main__.py", line 46, in <module>
llama-gpt-api-1  |     app = create_app(settings=settings)
llama-gpt-api-1  |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
llama-gpt-api-1  |   File "/app/llama_cpp/server/app.py", line 317, in create_app
llama-gpt-api-1  |     llama = llama_cpp.Llama(
llama-gpt-api-1  |             ^^^^^^^^^^^^^^^^
llama-gpt-api-1  |   File "/app/llama_cpp/llama.py", line 328, in __init__
llama-gpt-api-1  |     assert self.model is not None
llama-gpt-api-1  |            ^^^^^^^^^^^^^^^^^^^^^^
llama-gpt-api-1  | AssertionError
llama-gpt-ui-1   | [INFO  wait] Host [llama-gpt-api:8020] not yet available...
llama-gpt-api-1 exited with code 0
llama-gpt-api-1  | /usr/local/lib/python3.11/site-packages/setuptools/command/develop.py:40: EasyInstallDeprecationWarning: easy_install command is deprecated.
llama-gpt-api-1  | !!
llama-gpt-api-1  | 
llama-gpt-api-1  |         ********************************************************************************
llama-gpt-api-1  |         Please avoid running ``setup.py`` and ``easy_install``.
llama-gpt-api-1  |         Instead, use pypa/build, pypa/installer or other
llama-gpt-api-1  |         standards-based tools.
llama-gpt-api-1  | 
llama-gpt-api-1  |         See https://github.com/pypa/setuptools/issues/917 for details.
llama-gpt-api-1  |         ********************************************************************************
llama-gpt-api-1  | 
llama-gpt-api-1  | !!
llama-gpt-api-1  |   easy_install.initialize_options(self)
llama-gpt-api-1  | [0/1] Install the project...
llama-gpt-api-1  | -- Install configuration: "Release"
llama-gpt-api-1  | -- Up-to-date: /app/_skbuild/linux-x86_64-3.11/cmake-install/llama_cpp/libllama.so
llama-gpt-api-1  | copying _skbuild/linux-x86_64-3.11/cmake-install/llama_cpp/libllama.so -> llama_cpp/libllama.so
llama-gpt-api-1  | 
llama-gpt-api-1  | running develop
llama-gpt-api-1  | /usr/local/lib/python3.11/site-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
llama-gpt-api-1  | !!
llama-gpt-api-1  | 
llama-gpt-api-1  |         ********************************************************************************
llama-gpt-api-1  |         Please avoid running ``setup.py`` directly.
llama-gpt-api-1  |         Instead, use pypa/build, pypa/installer or other
llama-gpt-api-1  |         standards-based tools.
llama-gpt-api-1  | 
llama-gpt-api-1  |         See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
llama-gpt-api-1  |         ********************************************************************************
llama-gpt-api-1  | 
llama-gpt-api-1  | !!
llama-gpt-api-1  |   self.initialize_options()
llama-gpt-api-1  | running egg_info
llama-gpt-api-1  | writing llama_cpp_python.egg-info/PKG-INFO
llama-gpt-api-1  | writing dependency_links to llama_cpp_python.egg-info/dependency_links.txt
llama-gpt-api-1  | writing requirements to llama_cpp_python.egg-info/requires.txt
llama-gpt-api-1  | writing top-level names to llama_cpp_python.egg-info/top_level.txt
llama-gpt-api-1  | reading manifest file 'llama_cpp_python.egg-info/SOURCES.txt'
llama-gpt-api-1  | adding license file 'LICENSE.md'
llama-gpt-api-1  | writing manifest file 'llama_cpp_python.egg-info/SOURCES.txt'
llama-gpt-api-1  | running build_ext
llama-gpt-api-1  | Creating /usr/local/lib/python3.11/site-packages/llama-cpp-python.egg-link (link to .)
llama-gpt-api-1  | llama-cpp-python 0.1.78 is already the active version in easy-install.pth
llama-gpt-api-1  | 
llama-gpt-api-1  | Installed /app
llama-gpt-api-1  | Processing dependencies for llama-cpp-python==0.1.78
llama-gpt-api-1  | Searching for diskcache==5.6.1
llama-gpt-api-1  | Best match: diskcache 5.6.1
llama-gpt-api-1  | Processing diskcache-5.6.1-py3.11.egg
llama-gpt-api-1  | Adding diskcache 5.6.1 to easy-install.pth file
llama-gpt-api-1  | 
llama-gpt-api-1  | Using /usr/local/lib/python3.11/site-packages/diskcache-5.6.1-py3.11.egg
llama-gpt-api-1  | Searching for numpy==1.26.0b1
llama-gpt-api-1  | Best match: numpy 1.26.0b1
llama-gpt-api-1  | Processing numpy-1.26.0b1-py3.11-linux-x86_64.egg
llama-gpt-api-1  | Adding numpy 1.26.0b1 to easy-install.pth file
llama-gpt-api-1  | Installing f2py script to /usr/local/bin
llama-gpt-api-1  | 
llama-gpt-api-1  | Using /usr/local/lib/python3.11/site-packages/numpy-1.26.0b1-py3.11-linux-x86_64.egg
llama-gpt-api-1  | Searching for typing-extensions==4.7.1
llama-gpt-api-1  | Best match: typing-extensions 4.7.1
llama-gpt-api-1  | Adding typing-extensions 4.7.1 to easy-install.pth file
llama-gpt-api-1  | 
llama-gpt-api-1  | Using /usr/local/lib/python3.11/site-packages
llama-gpt-api-1  | Finished processing dependencies for llama-cpp-python==0.1.78
llama-gpt-api-1  | Initializing server with:
llama-gpt-api-1  | Batch size: 2096
llama-gpt-api-1  | Number of CPU threads: 20
llama-gpt-api-1  | Number of GPU layers: 0
llama-gpt-api-1  | Context window: 4096
llama-gpt-ui-1   | [INFO  wait] Host [llama-gpt-api:8020] not yet available...
llama-gpt-api-1  | /usr/local/lib/python3.11/site-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_alias" has conflict with protected namespace "model_".
llama-gpt-api-1  | 
llama-gpt-api-1  | You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ('settings_',)`.
llama-gpt-api-1  |   warnings.warn(
llama-gpt-api-1  | llama.cpp: loading model from /models/llama-2-7b-chat.bin
llama-gpt-api-1  | llama_model_load_internal: format     = ggjt v3 (latest)
llama-gpt-api-1  | llama_model_load_internal: n_vocab    = 32000
llama-gpt-api-1  | llama_model_load_internal: n_ctx      = 4096
llama-gpt-api-1  | llama_model_load_internal: n_embd     = 4096
llama-gpt-api-1  | llama_model_load_internal: n_mult     = 5504
llama-gpt-api-1  | llama_model_load_internal: n_head     = 32
llama-gpt-api-1  | llama_model_load_internal: n_head_kv  = 32
llama-gpt-api-1  | llama_model_load_internal: n_layer    = 32
llama-gpt-api-1  | llama_model_load_internal: n_rot      = 128
llama-gpt-api-1  | llama_model_load_internal: n_gqa      = 1
llama-gpt-api-1  | llama_model_load_internal: rnorm_eps  = 5.0e-06
llama-gpt-api-1  | llama_model_load_internal: n_ff       = 11008
llama-gpt-api-1  | llama_model_load_internal: freq_base  = 10000.0
llama-gpt-api-1  | llama_model_load_internal: freq_scale = 1
llama-gpt-api-1  | llama_model_load_internal: ftype      = 2 (mostly Q4_0)
llama-gpt-api-1  | llama_model_load_internal: model size = 7B
llama-gpt-api-1  | llama_model_load_internal: ggml ctx size =    0.03 MB
llama-gpt-api-1  | error loading model: llama.cpp: tensor 'layers.9.ffn_norm.weight' is missing from model
llama-gpt-api-1  | llama_load_model_from_file: failed to load model
llama-gpt-api-1  | Traceback (most recent call last):
llama-gpt-api-1  |   File "<frozen runpy>", line 198, in _run_module_as_main
llama-gpt-api-1  |   File "<frozen runpy>", line 88, in _run_code
llama-gpt-api-1  |   File "/app/llama_cpp/server/__main__.py", line 46, in <module>
llama-gpt-api-1  |     app = create_app(settings=settings)
llama-gpt-api-1  |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
llama-gpt-api-1  |   File "/app/llama_cpp/server/app.py", line 317, in create_app
llama-gpt-api-1  |     llama = llama_cpp.Llama(
llama-gpt-api-1  |             ^^^^^^^^^^^^^^^^
llama-gpt-api-1  |   File "/app/llama_cpp/llama.py", line 328, in __init__
llama-gpt-api-1  |     assert self.model is not None
llama-gpt-api-1  |            ^^^^^^^^^^^^^^^^^^^^^^
llama-gpt-api-1  | AssertionError
llama-gpt-api-1 exited with code 1
llama-gpt-ui-1   | [INFO  wait] Host [llama-gpt-api:8020] not yet available...
llama-gpt-api-1  | /usr/local/lib/python3.11/site-packages/setuptools/command/develop.py:40: EasyInstallDeprecationWarning: easy_install command is deprecated.
llama-gpt-api-1  | !!
llama-gpt-api-1  | 
llama-gpt-api-1  |         ********************************************************************************
llama-gpt-api-1  |         Please avoid running ``setup.py`` and ``easy_install``.
llama-gpt-api-1  |         Instead, use pypa/build, pypa/installer or other
llama-gpt-api-1  |         standards-based tools.
llama-gpt-api-1  | 
llama-gpt-api-1  |         See https://github.com/pypa/setuptools/issues/917 for details.
llama-gpt-api-1  |         ********************************************************************************
llama-gpt-api-1  | 
llama-gpt-api-1  | !!
llama-gpt-api-1  |   easy_install.initialize_options(self)
llama-gpt-api-1  | [0/1] Install the project...
llama-gpt-api-1  | -- Install configuration: "Release"
llama-gpt-api-1  | -- Up-to-date: /app/_skbuild/linux-x86_64-3.11/cmake-install/llama_cpp/libllama.so
llama-gpt-api-1  | copying _skbuild/linux-x86_64-3.11/cmake-install/llama_cpp/libllama.so -> llama_cpp/libllama.so
llama-gpt-api-1  | 
llama-gpt-api-1  | running develop
llama-gpt-api-1  | /usr/local/lib/python3.11/site-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
llama-gpt-api-1  | !!
llama-gpt-api-1  | 
llama-gpt-api-1  |         ********************************************************************************
llama-gpt-api-1  |         Please avoid running ``setup.py`` directly.
llama-gpt-api-1  |         Instead, use pypa/build, pypa/installer or other
llama-gpt-api-1  |         standards-based tools.
llama-gpt-api-1  | 
llama-gpt-api-1  |         See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
llama-gpt-api-1  |         ********************************************************************************
llama-gpt-api-1  | 
llama-gpt-api-1  | !!
llama-gpt-api-1  |   self.initialize_options()
llama-gpt-api-1  | running egg_info
@ExodosPavilion

I'm having a similar problem when trying to install the 70b model on Arch Linux.
At first I had the same issue as the one in #86, and then the system crashed because it maxed out memory usage.
But after a reboot and running the command again (./run.sh --model 70b), I get this on repeat:

llama-gpt-ui-1   | [INFO  wait] Host [llama-gpt-api:8000] not yet available...
llama-gpt-api-1  | /usr/local/lib/python3.11/site-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_alias" has conflict with protected namespace "model_".
llama-gpt-api-1  |
llama-gpt-api-1  | You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ('settings_',)`.
llama-gpt-api-1  |   warnings.warn(
llama-gpt-api-1  | llama.cpp: loading model from /models/llama-2-70b-chat.bin
llama-gpt-api-1  | llama_model_load_internal: warning: assuming 70B model based on GQA == 8
llama-gpt-api-1  | llama_model_load_internal: format     = ggjt v3 (latest)
llama-gpt-api-1  | llama_model_load_internal: n_vocab    = 32001
llama-gpt-api-1  | llama_model_load_internal: n_ctx      = 4096
llama-gpt-api-1  | llama_model_load_internal: n_embd     = 8192
llama-gpt-api-1  | llama_model_load_internal: n_mult     = 7168
llama-gpt-api-1  | llama_model_load_internal: n_head     = 64
llama-gpt-api-1  | llama_model_load_internal: n_head_kv  = 8
llama-gpt-api-1  | llama_model_load_internal: n_layer    = 80
llama-gpt-api-1  | llama_model_load_internal: n_rot      = 128
llama-gpt-api-1  | llama_model_load_internal: n_gqa      = 8
llama-gpt-api-1  | llama_model_load_internal: rnorm_eps  = 5.0e-06
llama-gpt-api-1  | llama_model_load_internal: n_ff       = 28672
llama-gpt-api-1  | llama_model_load_internal: freq_base  = 10000.0
llama-gpt-api-1  | llama_model_load_internal: freq_scale = 1
llama-gpt-api-1  | llama_model_load_internal: ftype      = 2 (mostly Q4_0)
llama-gpt-api-1  | llama_model_load_internal: model size = 70B
llama-gpt-api-1  | llama_model_load_internal: ggml ctx size =    0.07 MB
llama-gpt-api-1  | error loading model: llama.cpp: tensor 'layers.26.ffn_norm.weight' is missing from model
llama-gpt-api-1  | llama_load_model_from_file: failed to load model
llama-gpt-api-1  | Traceback (most recent call last):
llama-gpt-api-1  |   File "<frozen runpy>", line 198, in _run_module_as_main
llama-gpt-api-1  |   File "<frozen runpy>", line 88, in _run_code
llama-gpt-api-1  |   File "/app/llama_cpp/server/__main__.py", line 46, in <module>
llama-gpt-api-1  |     app = create_app(settings=settings)
llama-gpt-api-1  |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
llama-gpt-api-1  |   File "/app/llama_cpp/server/app.py", line 317, in create_app
llama-gpt-api-1  |     llama = llama_cpp.Llama(
llama-gpt-api-1  |             ^^^^^^^^^^^^^^^^
llama-gpt-api-1  |   File "/app/llama_cpp/llama.py", line 328, in __init__
llama-gpt-api-1  |     assert self.model is not None
llama-gpt-api-1  |            ^^^^^^^^^^^^^^^^^^^^^^
llama-gpt-api-1  | AssertionError
llama-gpt-api-1 exited with code 1
llama-gpt-ui-1   | [INFO  wait] Host [llama-gpt-api:8000] not yet available...

@RobIux

RobIux commented Jun 7, 2024

I'm unfortunately having the same issues

@wisnc

wisnc commented Jul 10, 2024

Same for me. Any workaround?

@ferrixx

ferrixx commented Aug 28, 2024

Hey, is there a fix or a workaround? :)
I have the same issue.
I'm using a VM with 16 GB RAM and 6 vCores on my AMD Ryzen 7 3700X at 4.2 GHz.
I tried it with ports 8000:3000 & 3001 and with ports 8020:3005 & 3004.
I downloaded the repo and ran run.sh with --model 13b first and then with --model 7b. After a few seconds this error appears:

llama-gpt-api-1  |
llama-gpt-api-1  | You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ('settings_',)`.
llama-gpt-api-1  |   warnings.warn(
llama-gpt-api-1  | llama.cpp: loading model from /models/llama-2-7b-chat.bin
llama-gpt-api-1  | llama_model_load_internal: format     = ggjt v3 (latest)
llama-gpt-api-1  | llama_model_load_internal: n_vocab    = 32000
llama-gpt-api-1  | llama_model_load_internal: n_ctx      = 4096
llama-gpt-api-1  | llama_model_load_internal: n_embd     = 4096
llama-gpt-api-1  | llama_model_load_internal: n_mult     = 5504
llama-gpt-api-1  | llama_model_load_internal: n_head     = 32
llama-gpt-api-1  | llama_model_load_internal: n_head_kv  = 32
llama-gpt-api-1  | llama_model_load_internal: n_layer    = 32
llama-gpt-api-1  | llama_model_load_internal: n_rot      = 128
llama-gpt-api-1  | llama_model_load_internal: n_gqa      = 1
llama-gpt-api-1  | llama_model_load_internal: rnorm_eps  = 5.0e-06
llama-gpt-api-1  | llama_model_load_internal: n_ff       = 11008
llama-gpt-api-1  | llama_model_load_internal: freq_base  = 10000.0
llama-gpt-api-1  | llama_model_load_internal: freq_scale = 1
llama-gpt-api-1  | llama_model_load_internal: ftype      = 2 (mostly Q4_0)
llama-gpt-api-1  | llama_model_load_internal: model size = 7B
llama-gpt-api-1  | llama_model_load_internal: ggml ctx size =    0.01 MB
llama-gpt-api-1  | error loading model: llama.cpp: tensor 'layers.3.ffn_norm.weight' is missing from model
llama-gpt-api-1  | llama_load_model_from_file: failed to load model
llama-gpt-api-1  | Traceback (most recent call last):
llama-gpt-api-1  |   File "<frozen runpy>", line 198, in _run_module_as_main
llama-gpt-api-1  |   File "<frozen runpy>", line 88, in _run_code
llama-gpt-api-1  |   File "/app/llama_cpp/server/__main__.py", line 46, in <module>
llama-gpt-api-1  |     app = create_app(settings=settings)
llama-gpt-api-1  |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
llama-gpt-api-1  |   File "/app/llama_cpp/server/app.py", line 317, in create_app
llama-gpt-api-1  |     llama = llama_cpp.Llama(
llama-gpt-api-1  |             ^^^^^^^^^^^^^^^^
llama-gpt-api-1  |   File "/app/llama_cpp/llama.py", line 328, in __init__
llama-gpt-api-1  |     assert self.model is not None
llama-gpt-api-1  |            ^^^^^^^^^^^^^^^^^^^^^^
llama-gpt-api-1  | AssertionError
llama-gpt-ui-1   | [INFO  wait] Host [llama-gpt-api:8020] not yet available...
llama-gpt-ui-1   | [INFO  wait] Host [llama-gpt-api:8020] not yet available...

@girls-whocode

I was also running into this, so I compared the 7b to the 70b and noticed they did not have the same code. Here is how I solved it:

  1. Delete the llama-gpt folder.
  2. Re-run: git clone https://github.com/getumbrel/llama-gpt.git
  3. Re-run with the 7b model: ./run.sh --model 7b
  4. Let everything come up and do a quick test (How far is the sun)
  5. Hit Control-C to stop it.
  6. Re-run with the 70b model: ./run.sh --model 70b
  7. No more errors, but it took a long time to download the 37 GB model.
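
The steps above can be sketched as a small dry-run helper that only builds and prints the commands, so they can be reviewed before anything is run. The names `workaround_commands` and `render` are made up for illustration, and the `cd llama-gpt` line assumes `git clone` creates its default `llama-gpt` directory. Steps 4 and 5 (the quick test and Control-C) stay manual.

```python
import shlex

# URL taken from step 2 above.
REPO_URL = "https://github.com/getumbrel/llama-gpt.git"


def workaround_commands(first_model="7b", target_model="70b", checkout_dir="llama-gpt"):
    """Return the commands for the steps above, in order, as argv lists.

    Hypothetical helper: nothing is executed here.
    """
    return [
        ["rm", "-rf", checkout_dir],            # step 1: delete the old folder
        ["git", "clone", REPO_URL],             # step 2: fresh clone
        ["cd", checkout_dir],                   # assumed: git's default clone directory
        ["./run.sh", "--model", first_model],   # step 3; test it, then stop with Control-C
        ["./run.sh", "--model", target_model],  # step 6
    ]


def render(commands):
    """Join each argv list into one shell-quoted line."""
    return "\n".join(shlex.join(argv) for argv in commands)
```

Calling `print(render(workaround_commands()))` prints one command per line. If the `models/` directory lives inside the deleted folder, step 1 presumably also discards any previously downloaded model files, which would be consistent with the fresh download in step 7.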

To watch the model download, I just opened a new tab and typed in: watch ls -lah models/

As I type this, it is at 21 GB... waiting for the 37 GB :)
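
For anyone without `watch`, a rough Python equivalent of the `watch ls -lah models/` command above might look like the sketch below. The function names (`human_size`, `model_sizes`, `watch_models`) are illustrative, and the default `models` path assumes the script is started from the same directory as the `watch` command.

```python
import os
import time


def human_size(num_bytes):
    """Format a byte count with binary units, e.g. 1536 -> '1.5K'."""
    size = float(num_bytes)
    for unit in ("B", "K", "M", "G", "T"):
        if size < 1024 or unit == "T":
            return f"{size:.1f}{unit}"
        size /= 1024


def model_sizes(models_dir="models"):
    """Return {file name: size in bytes} for regular files in models_dir."""
    sizes = {}
    with os.scandir(models_dir) as entries:
        for entry in entries:
            if entry.is_file():
                sizes[entry.name] = entry.stat().st_size
    return sizes


def watch_models(models_dir="models", interval=2.0, rounds=None):
    """Print file sizes every `interval` seconds; stop after `rounds` polls if given."""
    done = 0
    while rounds is None or done < rounds:
        for name, size in sorted(model_sizes(models_dir).items()):
            print(f"{human_size(size):>8}  {name}")
        print("-" * 20)
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
```

Running `watch_models()` from a second terminal polls until interrupted with Control-C. A model file that stops growing well short of the expected size (about 37 GB for 70b, per the comment above) would suggest the download has not finished.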
