Superset using duckdb connection encounter DB engine error #31978

Open
3 tasks done
ivanonair opened this issue Jan 24, 2025 · 2 comments
Labels
dashboard:performance Related to Dashboard performance data:connect:duckdb Related to a specific database

Comments


ivanonair commented Jan 24, 2025

Bug description

When I use DuckDB to read data from MinIO, I often (but not always) encounter the following error:

"DB engine Error
This may be triggered by: Issue 1011 - Superset encountered an unexpected error."

or sometimes the error message is:

"upstream connect error or disconnect/reset before headers. reset reason: connection termination"

This seems to be related to having multiple filters on a dashboard. I observed that, to populate the dropdown menu of each filter, a separate SQL query is executed against the dataset. I suspect that this causes many queries to run concurrently when the dashboard is opened.

There is also an error message in the Superset server log:

duckdb.duckdb.InvalidInputException: Invalid Input Error: No open result set.

Without changing any settings, the error occurs roughly half of the time rather than on every load.

This makes me suspect a rate limit or a limit on the number of threads. In the DuckDB issue linked below, I read that using duckdb.connect(":default:") can cause errors, since running queries concurrently on the same connection is not supported. The error reported there is:

duckdb.duckdb.InvalidInputException: Invalid Input Error: Attempting to execute an unsuccessful or closed pending query result

duckdb issue link

However, that error message differs from the one I received, so I'm not sure whether it's related. Has anyone else using Superset + DuckDB encountered a similar issue? Thanks.

Screenshots/recordings

No response

Superset version

master / latest-dev

Python version

3.10

Node version

16

Browser

Chrome

Additional context

No response

Checklist

  • I have searched Superset docs and Slack and didn't find a solution to my problem.
  • I have searched the GitHub issue tracker and didn't find a similar bug report.
  • I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.

dosubot bot commented Jan 24, 2025

Hey there, @ivanonair! I'm here to help you with any bugs, questions, or contributions you have in mind. Let's tackle this issue together!

The issue you're encountering with DuckDB in Superset might be related to how concurrent queries are handled. DuckDB does not support running multiple queries concurrently on the same connection, which could lead to the errors you're seeing when multiple filters are applied on a dashboard. This is consistent with the error message about "No open result set" and the potential connection termination.

To address this, you might consider the following approaches:

  1. Connection Pooling: Ensure that each query is executed on a separate connection. This can help avoid conflicts that arise from concurrent query execution on a single connection.

  2. Rate Limiting: Check if there are any settings related to rate limiting or the number of concurrent connections allowed. Adjusting these settings might help manage the load better.

  3. Query Optimization: Review the queries generated by the filters to see if they can be optimized or combined to reduce the number of concurrent queries.

  4. Superset Configuration: Look into Superset's configuration options to see if there are any settings that can help manage database connections more effectively when using DuckDB.

These steps might help mitigate the issue by ensuring that queries are handled in a way that aligns with DuckDB's capabilities.
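If a single connection has to be shared, one way to approximate the suggestions above is to serialize access to it. The following is a minimal sketch under stated assumptions, not a Superset or DuckDB API: `SerializedConnection` and `worker` are hypothetical names, and `sqlite3` stands in for the real database so the example runs with the standard library only. The sketch assumes the wrapped connection's `execute()` returns an object with `fetchall()`.

```python
import sqlite3
import threading


class SerializedConnection:
    """Wrap one shared connection so only one query runs at a time."""

    def __init__(self, con):
        self._con = con
        self._lock = threading.Lock()

    def query(self, sql, params=()):
        # Hold the lock across execute AND fetch, so another thread cannot
        # start a new statement before this result set has been read.
        with self._lock:
            cur = self._con.execute(sql, params)
            return cur.fetchall()


# Stand-in shared connection (assumption: the real one would be the DuckDB connection).
raw = sqlite3.connect(":memory:", check_same_thread=False)
raw.execute("CREATE TABLE t (x INTEGER)")
raw.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(100)])
shared = SerializedConnection(raw)

results = []


def worker(limit):
    rows = shared.query("SELECT count(*) FROM t WHERE x < ?", (limit,))
    results.append((limit, rows[0][0]))


threads = [threading.Thread(target=worker, args=(n,)) for n in (10, 50, 100)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))
```

This trades throughput for safety: the filter queries run one after another instead of in parallel, which may be acceptable for small dropdown queries but would slow down dashboards with many heavy charts.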


@dosubot dosubot bot added dashboard:performance Related to Dashboard performance data:connect:duckdb Related to a specific database labels Jan 24, 2025
@villebro
Member

Ping @guenp as I know you're involved with DuckDB
