Tag query/write inconsistent quotes with case sensitive tags and unhandled error #116

DataCerealz · 2025-01-13T14:17:38Z

Specifications

Client Version: 0.10.0
InfluxDB Version: 3
Platform: Cloud

Code sample to reproduce problem

Assume this snippet creates a downsampling query and writes the result back to a new downsampling bucket:

start_ts = ...
end_ts = ...

tags_to_preserve = ['currency', 'marketPlace']  # NOTE: the error is hidden in this line!

query = f"""
    SELECT 
        date_bin_wallclock(INTERVAL '1 hour', tz(time, 'Europe/Berlin')) AS time,
        SUM(price) as price,
        {', '.join(tags_to_preserve)}
    FROM
        "marketdata"
    WHERE
        time >= timestamp '{start_ts}' AND time <= timestamp '{end_ts}'
    GROUP BY 
        1, {', '.join(tags_to_preserve)}
    """

table = client_for_original_data_bucket.query(query=query, language="sql")
data_frame = table.to_pandas()
data_frame = data_frame.sort_values(by="time")

client_with_bucket_to_write_downsampled_data_to.write(
    record=data_frame,
    data_frame_measurement_name="marketdata",
    data_frame_timestamp_column="time",
    data_frame_tag_columns=tags_to_preserve,
)

Expected behavior

I would expect the above code snippet to work.

Actual behavior

The snippet fails with this error:

pyarrow.lib.ArrowInvalid: Flight returned invalid argument error, with message: Error while planning query: Schema error: No field named marketplace. Valid fields are marketdata.price, marketdata.currency, marketdata."marketPlace", marketdata.time.. gRPC client debug context: UNKNOWN

As you can see, the cause is the tag called marketPlace. The client ignores the case of the string and changes it to "marketplace" for the query, which is a tag that does not exist, so the query fails.
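To make the lowercasing concrete, here is a minimal sketch that emulates the identifier handling described above. It assumes the common SQL rule that unquoted identifiers are folded to lowercase while double-quoted identifiers keep their case. The wording of the error ("Error while planning query") suggests the folding may happen during query planning rather than in the client, but that reading is an assumption. `fold_identifier` is a hypothetical helper for illustration, not part of the client.

```python
def fold_identifier(name: str) -> str:
    """Emulate how an identifier in the query text is resolved.

    Assumption: unquoted identifiers are folded to lowercase, while
    double-quoted identifiers keep their exact case. This is an
    illustration only, not the actual InfluxDB implementation.
    """
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        return name[1:-1]  # quoted: case preserved, surrounding quotes dropped
    return name.lower()  # unquoted: folded to lowercase


for raw in ["currency", "marketPlace", '"marketPlace"']:
    print(f"{raw!r} resolves to {fold_identifier(raw)!r}")
```

Under that assumption, only the quoted form still matches the stored tag name marketPlace.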

Let's try to fix the above code like this:

...

tags_to_preserve = ['currency', '"marketPlace"']

...

By adding double quotes around the case sensitive tag, the query works!
Well...except it doesn't. Now we get another error:

Reason: Internal Server Error
HTTP response body: {"code":"internal error","message":"dml handler error: rejected write: Timeout expired (the operation was cancelled)"}

Apart from the error message being unhelpful, the cause lies in the resulting data frame.

If we run data_frame.columns it returns this: Index(['time', 'price', 'currency', 'marketPlace'], dtype='object')

See the issue? Now the last line of the code does not work anymore:

client_with_bucket_to_write_downsampled_data_to.write(
    record=data_frame,
    data_frame_measurement_name="marketdata",
    data_frame_timestamp_column="time",
    data_frame_tag_columns=tags_to_preserve,
)

because tags_to_preserve = ['currency', '"marketPlace"'] contains the column name "marketPlace" but only column marketPlace exists in the data frame.
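A mismatch like this can be caught before calling write. The following is a minimal sketch of such a check. `missing_tag_columns` is a hypothetical helper, not part of the influxdb3 Python client, and the column names are the ones reported by data_frame.columns above.

```python
from typing import Iterable


def missing_tag_columns(columns: Iterable[str], tag_columns: Iterable[str]) -> list[str]:
    """Return the requested tag columns that are not present in `columns`."""
    available = set(columns)
    return [tag for tag in tag_columns if tag not in available]


# Column names as reported by data_frame.columns in this issue.
columns = ["time", "price", "currency", "marketPlace"]
tags_to_preserve = ["currency", '"marketPlace"']

missing = missing_tag_columns(columns, tags_to_preserve)
if missing:
    print(f"Tag columns not found in data frame: {missing}")
```

In real code, `data_frame.columns` can be passed as the first argument, and raising a `ValueError` instead of printing would stop the script before the write request is sent.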

So the solution here is pretty funny:

1. Add quotes around every tag in the list of tags before executing the query.
2. Remove all those quotes again when passing them as column names to the client, because otherwise it won't be able to identify the columns.

This means the following adjustments need to be made to code that injects tags into a query string:

...
tags_to_preserve = ['currency', 'marketPlace']

# prepare tags to be inserted into query string
tags_to_preserve = [f'"{tag}"' for tag in tags_to_preserve]

...
# execute query
...

# prepare tags to be used as column accessors by influx client
tags_to_preserve = [tag.replace('"', '') for tag in tags_to_preserve]

...
# write data
...
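An alternative to quoting and then unquoting the same list is to keep the raw tag names untouched and quote them only while building the query string. The sketch below assumes that approach. `quote_identifier` and `build_downsampling_query` are hypothetical helpers, and the timestamps in the usage lines are placeholder values for illustration only.

```python
def quote_identifier(name: str) -> str:
    """Wrap a column name in double quotes for use in SQL.

    Embedded double quotes are doubled, the usual SQL escaping rule.
    """
    return '"' + name.replace('"', '""') + '"'


def build_downsampling_query(tags: list[str], start_ts: str, end_ts: str) -> str:
    """Build the hourly downsampling query from this issue with quoted tags."""
    quoted = ", ".join(quote_identifier(tag) for tag in tags)
    return f"""
    SELECT
        date_bin_wallclock(INTERVAL '1 hour', tz(time, 'Europe/Berlin')) AS time,
        SUM(price) AS price,
        {quoted}
    FROM
        "marketdata"
    WHERE
        time >= timestamp '{start_ts}' AND time <= timestamp '{end_ts}'
    GROUP BY
        1, {quoted}
    """


# Raw names: quoted only inside the query, reused unchanged for write().
tags_to_preserve = ["currency", "marketPlace"]

# Placeholder timestamps, not taken from the issue.
query = build_downsampling_query(
    tags_to_preserve, "2025-01-01T00:00:00Z", "2025-01-02T00:00:00Z"
)
print(query)
```

With this layout, `tags_to_preserve` can be passed directly as `data_frame_tag_columns`, since it never contains the quote characters.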

So overall we have two issues:

1. I was not aware that tags are case sensitive, and especially that the influxdb3 Python client ignores case except for tags that are wrapped in double quotes (is any of this documented anywhere? I couldn't find it). This behaviour is very annoying in cases like the one above, and I feel the client should be able to deal with this on its own (or at least be consistent about it so that subsequent actions remain compatible).
2. Supplying a tag column that does not exist results in an unhandled error.

Additional info

No response

DataCerealz added the bug label on Jan 13, 2025
