Commit

updated tokenizers submodule (#1559)
ilya-lavrenov authored Jan 16, 2025
1 parent 7765bc3 commit 36b88ad
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion thirdparty/openvino_tokenizers
Submodule openvino_tokenizers updated 81 files
+2 −2 .github/workflows/linux.yml
+5 −5 .github/workflows/mac.yml
+2 −2 .github/workflows/windows.yml
+2 −1 CMakeLists.txt
+18 −17 README.md
+1 −1 cmake/platforms.cmake
+5 −2 cmake/templates/__version__.py.in
+72 −0 cmake/version.cmake
+1 −1 js/README.md
+1 −1 pyproject.toml
+1 −1 python/openvino_tokenizers/__init__.py
+1 −1 python/openvino_tokenizers/__version__.py
+1 −1 python/openvino_tokenizers/cli.py
+1 −1 python/openvino_tokenizers/constants.py
+1 −1 python/openvino_tokenizers/convert_tokenizer.py
+12 −26 python/openvino_tokenizers/hf_parser.py
+1 −1 python/openvino_tokenizers/str_pack.py
+17 −1 python/openvino_tokenizers/tokenizer_pipeline.py
+1 −1 python/openvino_tokenizers/utils.py
+6 −6 src/CMakeLists.txt
+1 −1 src/bpe_tokenizer.cpp
+1 −1 src/bpe_tokenizer.hpp
+1 −1 src/byte_fallback.cpp
+1 −1 src/byte_fallback.hpp
+1 −1 src/bytes_to_chars.cpp
+1 −1 src/bytes_to_chars.hpp
+1 −1 src/case_fold.cpp
+1 −1 src/case_fold.hpp
+1 −1 src/chars_to_bytes.cpp
+1 −1 src/chars_to_bytes.hpp
+1 −1 src/charsmap_normalization.cpp
+1 −1 src/charsmap_normalization.hpp
+1 −1 src/combine_segments.cpp
+1 −1 src/combine_segments.hpp
+1 −1 src/equal_str.cpp
+1 −1 src/equal_str.hpp
+1 −1 src/fuze.cpp
+1 −1 src/fuze.hpp
+1 −1 src/normalize_unicode.cpp
+1 −1 src/normalize_unicode.hpp
+1 −1 src/ov_extension.cpp
+1 −1 src/ragged_tensor_pack.cpp
+1 −1 src/ragged_tensor_pack.hpp
+1 −1 src/ragged_to_dense.cpp
+1 −1 src/ragged_to_dense.hpp
+1 −1 src/ragged_to_ragged.cpp
+1 −1 src/ragged_to_ragged.hpp
+1 −1 src/ragged_to_sparse.cpp
+1 −1 src/ragged_to_sparse.hpp
+1 −1 src/regex_normalization.cpp
+1 −1 src/regex_normalization.hpp
+1 −1 src/regex_split.cpp
+1 −1 src/regex_split.hpp
+1 −1 src/sentence_piece.cpp
+1 −1 src/sentence_piece.hpp
+1 −1 src/special_tokens_split.cpp
+1 −1 src/special_tokens_split.hpp
+1 −1 src/string_tensor_pack.cpp
+1 −1 src/string_tensor_pack.hpp
+1 −1 src/string_tensor_unpack.cpp
+1 −1 src/string_tensor_unpack.hpp
+1 −1 src/string_to_hash_bucket.cpp
+1 −1 src/string_to_hash_bucket.hpp
+1 −1 src/tensorflow_translators.cpp
+1 −1 src/tensorflow_translators.hpp
+1 −1 src/tokenizer.hpp
+1 −1 src/trie_tokenizer.cpp
+1 −1 src/trie_tokenizer.hpp
+1 −1 src/utf8_validate.cpp
+1 −1 src/utf8_validate.hpp
+1 −1 src/utils.cpp
+1 −1 src/utils.hpp
+1 −1 src/vocab_decoder.cpp
+1 −1 src/vocab_decoder.hpp
+1 −1 src/vocab_encoder.cpp
+1 −1 src/vocab_encoder.hpp
+1 −1 src/wordpiece_tokenizer.cpp
+1 −1 src/wordpiece_tokenizer.hpp
+1 −1 tests/pass_rates.json
+18,150 −17,086 tests/stats.json
+32 −1 tests/tokenizers_test.py
