feat: lutra query runner #4134

aljazerzen · 2024-01-25T12:11:13Z

Initial take on #3825. Inspired by prql-query.

Adds a query runner named Lutra that is basically a wrapper for prqlc compile, sqlite3 run and print(pl.DataFrame).

Connection parameters are defined within the PRQL source with the @lutra.sqlite annotation.

Uses connectorx for executing the query and converting to Apache Arrow. Uses polars to print the dataframe.
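
As a rough illustration of the compile, run, print flow described above, here is a minimal Python sketch. It is not Lutra's implementation: it shells out to the `prqlc` CLI (assumed to be installed and on PATH) and swaps connectorx and polars for the standard-library `sqlite3` module and plain tab-separated output. The function names `compile_prql`, `run_sql`, and `format_table` are hypothetical.

```python
import sqlite3
import subprocess


def compile_prql(prql_source: str) -> str:
    """Compile PRQL to SQL by piping the source to `prqlc compile` on stdin.

    Assumption: the prqlc binary is available on PATH.
    """
    result = subprocess.run(
        ["prqlc", "compile"],
        input=prql_source,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if compilation fails
    )
    return result.stdout


def run_sql(db_file: str, sql: str):
    """Execute one SQL statement against an SQLite database.

    Returns a (column names, rows) pair.
    """
    conn = sqlite3.connect(db_file)
    try:
        cursor = conn.execute(sql)
        columns = [description[0] for description in cursor.description]
        return columns, cursor.fetchall()
    finally:
        conn.close()


def format_table(columns, rows) -> str:
    """Render the result as tab-separated text (a stand-in for the polars printout)."""
    lines = ["\t".join(columns)]
    lines.extend("\t".join(str(value) for value in row) for row in rows)
    return "\n".join(lines)
```

With `prqlc` installed, something like `print(format_table(*run_sql("chinook.db", compile_prql("from albums | take 5"))))` would chain the three steps, assuming the database has an `albums` table.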

Next step is to create Python bindings so the resulting dataframe can be used with other existing tooling.

hulxv · 2024-01-25T14:08:36Z

A few months ago, I built a tool called prqlite (I think you took a look at it). Is there any difference between it and Lutra?

eitsupi · 2024-01-25T14:13:27Z

Oops, this seems to depend on a very old version of polars.......

aljazerzen · 2024-01-25T14:25:16Z

> A few months ago, I built a tool called prqlite (I think you took a look at it). Is there any difference between it and Lutra?

There are. prqlite is primarily a TUI and it does that well. My intention with Lutra is to be able to call it from Python (and other languages) while providing it with only PRQL source.

aljazerzen · 2024-01-25T14:27:27Z

> Oops, this seems to depend on a very old version of polars

Yeah. It also won't compile for wasm32-unknown-unknown without a serious effort, as we'd have to compile polars and rusqlite too. That does not seem like a battle worth taking right now.

hulxv · 2024-01-25T14:28:47Z

> My intention with Lutra is to be able to call it from Python (and other languages) while providing it with only PRQL source.

Then, Lutra is something like a library or do I misunderstand it?

aljazerzen · 2024-01-25T14:34:22Z

> Then, Lutra is something like a library or do I misunderstand it?

Yes, but also with an option to invoke it from the command line.

max-sixty · 2024-01-26T01:41:58Z

> A few months ago, I built a tool called prqlite (I think you took a look at it). Is there any difference between it and Lutra?

@hulxv this is cool! Nice work. I don't think we knew about this...

(the goals seem to be slightly different to Lutra fwiw)

max-sixty · 2024-01-26T01:43:22Z

lutra/Taskfile.yml

+      - cmd: |
+          # remove trailing whitespace
+          rg '\s+$' --files-with-matches --glob '!*.{rs,snap}' . \
+          | xargs -I _ sh -c "echo Removing trailing whitespace from _ && sd '[\t ]+$' '' _"


(FYI if you're in the mood for new tools, I really like rargs rather than xargs for this sort of thing. I almost never use xargs now)

...even if I still maintain that pre-commit is better for these sorts of lints! 😄

max-sixty · 2024-01-26T01:48:31Z

lutra/Taskfile.yml

+          cargo \
+            llvm-cov --lcov --output-path lcov.info \
+            nextest \
+            {{.packages}}


FYI if you want to put nextest & {{.packages}} into the base task totally fine w me. Then we have a single task and can pass packages.

(but also zero stress if you prefer to have them separate)

I like to have a separate task that builds and tests only the thing that I'm currently developing.

What do you mean "pass packages"? To have the base task take packages as argument?

> What do you mean "pass packages"? To have the base task take packages as argument?

Currently this task has a {{.packages}} arg, so it's possible to pass a subset of packages

> I like to have a separate task that builds and tests only the thing that I'm currently developing.

Very much agree! TBC, this task is almost identical to the task in the parent task, but passes -p lutra to .packages. So they could be combined.

(I can make a PR later, maybe that's an easier way of explaining)

max-sixty · 2024-01-26T02:11:37Z

lutra/example-project/Project.prql

@@ -0,0 +1,19 @@
+
+## This module is configured to pull data from an SQLite chinook.db database
+@(lutra.sqlite {file="chinook.db"})


How do we want to think about coupling the target and the query?

I would think we want to use some form of composition, where we say "associate this chinook.db file with this module chinook" somewhere, but we don't couple their definitions, so it's possible to change.

At a more conceptual level, it could be similar to applying a functor in OCaml (or, somewhat, implementing a trait in Rust): specifying a mapping of conceptual queries/functions to concrete queries/functions. Another reference is dbt's approach for templating the source of a query, so it's possible to run the same queries on a production vs dev vs personal database / schema.

I don't have a really good idea for how to implement this, and obv fine to experiment with things. But I do think that a really good design would decouple the query and the target.

Yeah, I was thinking about not having actual connection params within the PRQL source, similar to how dbt does it.

I've opted for this approach because it is way more convenient and it does not actually rule out composition - the queries are not defined within the database module. This means that one could swap out the database module and run the queries on a different database.

This could be done by defining the database module in a separate file, which is not committed to VCS or is overridden in production.

Yes, we can think about this.

I think there's some nice design that's possible, where we map the relations to concrete tables. dbt do a nice job at the moment even if it needs a decent amount of jinja.

I'll meditate on it...

max-sixty

Exciting!

Maybe we put a readme of a couple of lines saying what it is and that it's still experimental?

max-sixty · 2024-02-04T04:17:30Z

FYI can merge despite the failures — one is a cargo-audit which is fine and we can ignore; the other is the drop in test coverage

aljazerzen · 2024-02-04T14:49:57Z

I'd like to replace connector-x with connector-arrow before merging. ConnectorX has stale dependencies and does a lot of things that are not needed for Lutra.

feat: lutra query runner

b81f7c4

aljazerzen force-pushed the feat-lutra branch from 8f1293f to 965f199 on January 25, 2024 12:56

revert the module resolution order

eb1213d

aljazerzen force-pushed the feat-lutra branch from ac9bf73 to eb1213d on January 25, 2024 13:00

fix

acb95f8

max-sixty reviewed Jan 26, 2024

max-sixty approved these changes Jan 26, 2024

aljazerzen commented Jan 26, 2024

eitsupi requested a review from max-sixty February 2, 2024 14:38

max-sixty added 2 commits February 2, 2024 14:29

Exclude wasm targets from cli build

ac7d77b

Merge branch 'main' into feat-lutra

559a89e

max-sixty approved these changes Feb 2, 2024

8044229

max-sixty force-pushed the feat-lutra branch from 1432cb2 to 8044229 on February 3, 2024 01:24

max-sixty added 2 commits February 2, 2024 17:37

Use prqlc

20ad2fc

Adjust dependency versions

9d6ee48

max-sixty mentioned this pull request Feb 3, 2024

[draft] Lutra CI #4169

Closed

max-sixty added 2 commits February 2, 2024 23:03

Merge branch 'main' into feat-lutra

8f30132

cherry-pick version changes

a909223

aljazerzen added 5 commits February 5, 2024 11:14

move to a subdir, for compatibility with Python bindings

ac49aad

readme

3a7ac10

replace connector-x with connector_arrow

5fcf9f1

fix

fd787ab

a few tests

01f3f2d

aljazerzen force-pushed the feat-lutra branch from 8a1cf65 to 01f3f2d on February 5, 2024 18:06

don't test on wasm

6b64607

aljazerzen enabled auto-merge (squash) February 5, 2024 21:42

aljazerzen merged commit 114fb19 into main Feb 5, 2024
79 of 81 checks passed

aljazerzen deleted the feat-lutra branch February 5, 2024 22:02

This was referenced Feb 6, 2024

feat: lutra python bindings #4174

Merged

Lutra roadmap #4177

Open
