def load_tokenizer(self):
    if self.tokenizer is None:
        import transformers
        name = _MOCK_TOKENIZER if _MOCK_TOKENIZER else (self.tokenizer_name or self.model_name)
        self.tokenizer = transformers.AutoTokenizer.from_pretrained(name)
Fails with
File "/home/toolkit/.local/lib/python3.12/site-packages/hydra/core/utils.py", line 260, in return_value
raise self._return_value
File "/home/toolkit/.local/lib/python3.12/site-packages/hydra/core/utils.py", line 186, in run_job
ret.return_value = task_function(task_cfg)
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/code/cat-mono-repo/llmd2-core/src/llmd2/tapeagents_tmp/ghreat/dev/run_user_simulator.py", line 91, in main
raise exception
File "/home/toolkit/code/cat-mono-repo/llmd2-core/src/llmd2/tapeagents_tmp/ghreat/dev/run_user_simulator.py", line 41, in run_user_simulator_agent
user_simulator_agent_tape = user_simulator_agent.run(user_simulator_agent_tape).get_final_tape()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/code/TapeAgents/tapeagents/agent.py", line 60, in get_final_tape
for event in self:
^^^^
File "/home/toolkit/code/TapeAgents/tapeagents/agent.py", line 364, in _run_implementation
for step in current_subagent.run_iteration(tape):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/code/TapeAgents/tapeagents/agent.py", line 344, in run_iteration
for step in self.generate_steps(tape, llm_stream):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/code/TapeAgents/tapeagents/agent.py", line 296, in generate_steps
for step in node.generate_steps(self, tape, llm_stream):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/code/cat-mono-repo/llmd2-core/src/llmd2/tapeagents_tmp/ghreat/user_simulator_agent.py", line 193, in generate_steps
user_utterance = llm_stream.get_text()
^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/code/TapeAgents/tapeagents/llms.py", line 65, in get_text
o = self.get_output()
^^^^^^^^^^^^^^^^^
File "/home/toolkit/code/TapeAgents/tapeagents/llms.py", line 59, in get_output
for event in self:
^^^^
File "/home/toolkit/code/TapeAgents/tapeagents/llms.py", line 56, in __next__
return next(self.generator)
^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/code/TapeAgents/tapeagents/llms.py", line 189, in _implementation
toks = self.count_tokens(prompt.messages)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/code/TapeAgents/tapeagents/llms.py", line 447, in count_tokens
self.load_tokenizer()
File "/home/toolkit/code/TapeAgents/tapeagents/llms.py", line 323, in load_tokenizer
self.tokenizer = transformers.AutoTokenizer.from_pretrained(name)
^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'transformers' has no attribute 'AutoTokenizer'
This is because dynamic imports and multiprocessing don't play well together, due to pickling.
Note: @ehsk and @NicolasAG have encountered a similar problem, so I think this should be fixed fairly quickly. We could move TrainableLLM to a separate file and switch to a static import, if keeping transformers out of the project-wide dependencies is important.
Thanks for the report, Gabriel! Transformers is already in the main requirements.txt, so it's not really optional. The import was made conditional because import transformers takes a crazy amount of time, 3-5 seconds, which considerably increased the startup time of almost all of our scripts, since we use import tapeagents.llms almost everywhere.
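For anyone who wants to check the import cost on their own machine, here is a minimal sketch of how it could be measured. The helper name timed_import is made up for illustration and is not part of tapeagents; the stdlib module json is only a stand-in so the snippet runs without extra dependencies.

```python
import importlib
import time


def timed_import(module_name: str):
    """Import a module by name and return (module, seconds_elapsed).

    Hypothetical helper for illustration only. A module that is already
    in sys.modules is returned from the cache, so only the first import
    in a fresh interpreter reflects the real cost.
    """
    start = time.perf_counter()
    module = importlib.import_module(module_name)
    elapsed = time.perf_counter() - start
    return module, elapsed


if __name__ == "__main__":
    # "json" is a stand-in; replace it with "transformers" to measure
    # the import discussed in this thread (requires it to be installed).
    _, seconds = timed_import("json")
    print(f"import took {seconds:.3f}s")
```

Running python -X importtime -c "import transformers" gives a per-module breakdown of the same cost, if more detail is needed.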
We can try https://github.com/huggingface/tokenizers instead of the whole transformers lib to avoid a load time penalty. PRs are welcome!
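A rough sketch of what the tokenizers-based route might look like, under some assumptions: the function names load_fast_tokenizer and count_text_tokens are hypothetical, the model repository must ship a tokenizer.json for Tokenizer.from_pretrained to work, and this counts tokens for a plain string only (it does not apply a chat template, which the current count_tokens on prompt.messages may rely on).

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def load_fast_tokenizer(name: str):
    """Load a tokenizer by Hugging Face Hub name, once per name.

    Hypothetical helper, not the tapeagents implementation. The import
    is kept inside the function so importing this module stays cheap;
    it assumes the `tokenizers` package is installed.
    """
    from tokenizers import Tokenizer

    return Tokenizer.from_pretrained(name)


def count_text_tokens(tokenizer, text: str) -> int:
    """Count tokens in a plain string.

    Works with any object whose encode(text) returns something with an
    `ids` list, which is what tokenizers.Tokenizer provides. By default
    the encoding includes the tokenizer's special tokens.
    """
    return len(tokenizer.encode(text).ids)
```

Usage would be something like count_text_tokens(load_fast_tokenizer(name), "some text"), where name is whatever self.tokenizer_name or self.model_name resolves to. Whether this also avoids the multiprocessing failure reported above has not been verified here.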