Decouple lakehouse writes from transaction lifecycle #88

Open
dpxcc opened this issue Jan 17, 2025 · 1 comment

@dpxcc
Contributor

dpxcc commented Jan 17, 2025

What feature are you requesting?

Decouple lakehouse writes from transaction lifecycle

Why are you requesting this feature?

Each transaction currently creates a new Delta Lake log record, which is inefficient for workloads made up of many small INSERTs.

What is your proposed implementation for this feature?

  • Add a new command, e.g. mooncake.export_table(), to manually trigger writes to the lakehouse
  • Parquet files will continue to be uploaded to the lake when generated, as they are now
  • Add a new metadata table to track changes that have not yet been exported to the lakehouse
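
To make the proposal above concrete, here is a minimal Python sketch of the intended flow. It is an illustration under stated assumptions, not the project's implementation: `PendingChanges`, its `record` and `take` methods, the `(table, action, path)` row layout, and the in-memory `log` list are all hypothetical stand-ins. Only the command name `export_table` comes from the proposal, and the log record format is heavily simplified compared with a real Delta Lake commit.

```python
import json
from dataclasses import dataclass, field


@dataclass
class PendingChanges:
    """Hypothetical stand-in for the proposed metadata table of unexported changes."""
    rows: list = field(default_factory=list)  # each row: (table, action, parquet_path)

    def record(self, table, action, path):
        # Assumed to run at transaction commit, instead of writing a log record.
        self.rows.append((table, action, path))

    def take(self, table):
        # Remove and return every unexported change for one table.
        taken = [r for r in self.rows if r[0] == table]
        self.rows = [r for r in self.rows if r[0] != table]
        return taken


def export_table(pending, delta_log, table):
    """Fold all pending changes for `table` into a single log record (sketch)."""
    added, removed = [], []
    for _, action, path in pending.take(table):
        if action == "add":
            added.append(path)
        elif path in added:
            added.remove(path)  # added and removed before export: nothing to log
        else:
            removed.append(path)
    actions = [{"add": {"path": p}} for p in added]
    actions += [{"remove": {"path": p}} for p in removed]
    if not actions:
        return None  # nothing to export, so no log record is written
    delta_log.append("\n".join(json.dumps(a) for a in actions))
    return len(delta_log) - 1  # index of the new record in this toy log


# Three small "transactions" are tracked, then exported as one log record.
pending = PendingChanges()
log = []
for i in range(3):
    pending.record("t", "add", f"part-{i}.parquet")
version = export_table(pending, log, "t")
```

In this sketch the Parquet paths are recorded as soon as they exist, matching the second bullet, while the log only grows when `export_table` is called. How the real metadata table is stored and how export interacts with concurrent transactions is left open here.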
dpxcc added the feature label on Jan 17, 2025
dpxcc added this to the 0.2.0 milestone on Jan 17, 2025
dpxcc added the good first issue label on Jan 17, 2025
@dentiny
Contributor

dentiny commented Jan 19, 2025

Hi, I'm interested in this ticket. Could you please assign it to me?

If you think my progress is slow, feel free to ping me at any time and I will try to prioritize and expedite it. Thank you!
