[Bug]: Set/Get String Tensor Data via C-API Does Not Work #26906
Comments
@peterchen-intel could you take a look, please?
Ok, I think
If there's a better way to access string data, do let me know. I'm still having problems setting a string tensor from the C API. I will try to update with another comment to document test cases if possible.
@rahulchaphalkar I don't think the current C APIs support string tensors, so we need new APIs to support them in the C API, such as creating, accessing, and freeing a string tensor.
Hi @rahulchaphalkar, Now it is possible to create We may consider to extend Best regards,
From the C API perspective, a string tensor is very different from the numerical tensors, so why not add dedicated C APIs to support string tensors, such as
Yep, we can do this. @rahulchaphalkar, what is the priority of this task to support string tensors in C? Who is the customer, and what is the user scenario? We can continue to discuss this by work email. Best regards,
Thanks for the details, both of you.
@rahulchaphalkar, this one issue is fine for discussion and to start development of this feature. Please create PR to Best regards,
@rahulchaphalkar It would be great if you could create a PR for it!
OpenVINO Version
2024.3.0 https://github.com/rahulchaphalkar/openvino/tree/add-extension
Operating System
Ubuntu 20.04 (LTS)
Device used for inference
CPU
Framework
None
Model used
Detokenizer.xml from TinyLlama-1.1B-Chat-v1.0
Issue description
The STRING element_type has been added to the C API, but in my testing with models that expect string tensors and output them, I see incorrect results. I have a test case below comparing a working C++ case and a failing C case. I have done some processing on the received string data, as you can see in the test case below, but I'm not able to get a valid string output.
Reference - https://docs.openvino.ai/2024/openvino-workflow/running-inference/string-tensors.html
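For context, the linked string-tensor documentation describes the elements of a string tensor as std::string objects on the C++ side, which would mean a C caller cannot read the tensor's data pointer as plain characters. The following is a minimal sketch of how a bridge layer could hand each element to C as a pointer-plus-length pair. It is plain C++ with no OpenVINO calls: `string_view_c` and `export_strings` are hypothetical names used only for illustration, not part of the OpenVINO C API, and the `std::string` array is an assumed stand-in for the tensor buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical C-friendly view of one string element: raw bytes plus length.
// The bytes are NOT copied; they stay valid only while the source string lives.
struct string_view_c {
    const char* data;
    std::size_t size;
};

// Assumption: `elements` points at `count` contiguous std::string objects,
// standing in for the buffer of a string tensor.
std::vector<string_view_c> export_strings(const std::string* elements,
                                          std::size_t count) {
    assert(elements != nullptr || count == 0);
    std::vector<string_view_c> views;
    views.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        // Take the character data from each object instead of
        // reinterpreting the object array itself as characters.
        views.push_back({elements[i].data(), elements[i].size()});
    }
    return views;
}
```

Carrying an explicit length alongside the pointer keeps strings with embedded NUL bytes intact, which a bare `const char*` would truncate.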
Step-by-step reproduction
Reproduction of getting string data from the output of a model:
I was working with TinyLlama-1.1B-Chat-v1.0, which I got from the recommended steps in the optimum-cli/gen.ai repos. I'm loading an extension in both cases. I have added support for loading extensions in the C API in my open PR, so you will need to use that for the C case below.
I am providing the detokenizer model with tokens extracted previously from the TinyLlama model.
C++ case prints this correct output
The C case prints some invalid UTF-8.
C++/Working Case
C/C-API/ Failing Case
Relevant log output
No response