Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support GCS #73

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

raghunandanbhat
Copy link

fixes #38

@dpxcc dpxcc requested a review from zhousun January 7, 2025 19:09
@dpxcc dpxcc force-pushed the main branch 2 times, most recently from 2af4b2b to 6a655ea Compare January 10, 2025 04:05
@raghunandanbhat raghunandanbhat force-pushed the gcs-support branch 2 times, most recently from 8ffabb2 to 03ea259 Compare January 13, 2025 15:29
@raghunandanbhat raghunandanbhat marked this pull request as ready for review January 13, 2025 16:33
@raghunandanbhat
Copy link
Author

@zhousun please review when you get a chance. thanks!

@zhousun
Copy link
Contributor

zhousun commented Jan 13, 2025

@raghunandanbhat thanks for the contribute. For delta-rs, I think it is possible to inline the secret as json instead of providing a path: https://docs.rs/object_store/latest/src/object_store/gcp/builder.rs.html#365

@raghunandanbhat
Copy link
Author

delta-rs now uses the service account json instead of the file path.

I believe it is better to provide the path when creating/inserting a secret into mooncake.secrets, as it is easier than typing the entire JSON.

@zhousun
Copy link
Contributor

zhousun commented Jan 15, 2025

If I understand correct, providing a path like this requires the files to be in server side. For example if you connect to a remote postgres, you cannot use a secret file in local path. That's why I believe having a way to direct type/copy the secret is important.

Copy link
Contributor

@zhousun zhousun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general lgtm, will approve once you resolve the other comment (service account json file won't work with server-client setup).
thx!

README.md Outdated
Comment on lines 86 to 96
#### GCS Buckets
- Add your GCS credentials
```sql
SELECT mooncake.create_secret('<name>', 'GCS', '<HMAC_key_id>', '<HMAC_secret>', '{"PATH":"path/to/service_account.json"}');
```
- Set default bucket after adding GCS credentials.
```sql
SET mooncake.default_bucket = 'gs://<gcs_bucket>';
```
>**Note**: delta-rs seems to accept only Service Account JSON credentials when creating delta tables in GCS.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: To keep readme more condense, we will move the bucket instruction to https://pgmooncake.com/docs/cloud-storage. :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure.
please make sure to update the sql query.

SELECT mooncake.create_secret('<name>', 'GCS', '<HMAC_key_id>', '<HMAC_secret>', '{"PATH":"path/to/service_account.json"}');

to

SELECT mooncake.create_secret('<name>', 'GCS', '<HMAC_key_id>', '<HMAC_secret>', '{"GCS_SECRET":"Service Account JSON"}');

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@raghunandanbhat
Copy link
Author

If I understand correct, providing a path like this requires the files to be in server side. For example if you connect to a remote postgres, you cannot use a secret file in local path. That's why I believe having a way to direct type/copy the secret is important.

what if we keep both options?

  • type/copy secrets for server-client setup
  • path based secrets for local setup

@zhousun
Copy link
Contributor

zhousun commented Jan 15, 2025

If I understand correct, providing a path like this requires the files to be in server side. For example if you connect to a remote postgres, you cannot use a secret file in local path. That's why I believe having a way to direct type/copy the secret is important.

what if we keep both options?

  • type/copy secrets for server-client setup
  • path based secrets for local setup

Yep that would work

rust_extensions/delta/Cargo.toml Outdated Show resolved Hide resolved
rust_extensions/delta/src/lib.rs Show resolved Hide resolved
rust_extensions/delta/src/lib.rs Outdated Show resolved Hide resolved
sql/pg_mooncake--0.1.0.sql Outdated Show resolved Hide resolved
src/columnstore/columnstore_metadata.cpp Outdated Show resolved Hide resolved
src/columnstore/columnstore_metadata.cpp Outdated Show resolved Hide resolved
rust_extensions/delta/src/lib.rs Outdated Show resolved Hide resolved
@@ -48,8 +55,10 @@ pub fn DeltaCreateTable(
runtime.block_on(async {
let mut storage_options: HashMap<String, String> =
serde_json::from_str(options.to_str()?).expect("invalid options");
// Write directly to S3 without locking is safe since Mooncake is the only writer
storage_options.insert("AWS_S3_ALLOW_UNSAFE_RENAME".to_string(), "true".to_string());
if get_storage_type(path) == StorageType::S3 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this again, I think it might be better to simply return the type from ColumnstoreMetadata::SecretsSearchDeltaOptions() and pass it down here

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it is much simpler. we can return tuple<string, string> from ColumnstoreMetadata::SecretsSearchDeltaOptions() and pass them separately to DeltaCreateTable()

src/columnstore/columnstore_metadata.cpp Outdated Show resolved Hide resolved
1. Support raw JSON secrets and path based secret for GCS
2. Bugfix to pick correct secret based on longest matching scope
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

support for gcs
3 participants