[AMORO-3301] Support OSS for iceberg in InternalCatalog #3306

Merged: 22 commits, Nov 21, 2024


shouwangyw
Contributor

Why are the changes needed?

Close #3301.

Brief change log

  • Add OSS storage type, support OSS for iceberg in InternalCatalog

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run test locally before making a pull request

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@github-actions bot added the labels module:mixed-spark, module:ams-server, type:build, module:ams-dashboard, module:common on Oct 29, 2024
@czy006
Contributor

czy006 commented Oct 29, 2024

Thank you for your contribution. As far as I know, Ali Cloud OSS object storage can already be accessed through the existing S3 protocol support. Is Iceberg storage support what is being added here? By the way, we should make OSS optional at compile time.

Contributor

@czy006 czy006 left a comment

It would be best to add screenshots or unit tests to verify that the functionality is correct.

@shouwangyw
Contributor Author

> Thank you for your contribution. As far as I know, the current S3 protocol can also use Ali Cloud OSS object storage. Is iceberg storage support added here? By the way, we should make OSS optional at compile time

I don't understand how Ali Cloud OSS object storage can be used through the S3 protocol.
This PR adds the Ali Cloud OSS related dependencies.

@shouwangyw
Contributor Author

OSS storage can be configured correctly through the front end:
[image]

I have completed the test in my local environment and was able to read the relevant data correctly:
[image]

@shouwangyw shouwangyw requested a review from czy006 November 1, 2024 05:41
@czy006
Contributor

czy006 commented Nov 5, 2024

image
you can use s3 protocol to read it

cc @shouwangyw
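
For readers wondering what "use the S3 protocol" could look like in practice, here is a minimal sketch of Iceberg catalog properties that point `S3FileIO` at an S3-compatible endpoint. The property keys (`io-impl`, `s3.endpoint`, `s3.access-key-id`, `s3.secret-access-key`, `s3.path-style-access`) are Iceberg's; the class name `OssViaS3Properties`, the example endpoint, and the choice of virtual-hosted-style addressing are assumptions made for illustration and are not taken from this PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Amoro or of this PR.
final class OssViaS3Properties {

    // Builds Iceberg catalog properties that direct S3FileIO to an S3-compatible endpoint.
    static Map<String, String> build(String endpoint, String accessKeyId, String secretAccessKey) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("io-impl", "org.apache.iceberg.aws.s3.S3FileIO");
        props.put("s3.endpoint", endpoint);
        props.put("s3.access-key-id", accessKeyId);
        props.put("s3.secret-access-key", secretAccessKey);
        // Assumption: the S3-compatible endpoint expects virtual-hosted-style requests.
        props.put("s3.path-style-access", "false");
        return props;
    }

    public static void main(String[] args) {
        // The endpoint and credentials below are placeholders.
        Map<String, String> props =
            build("https://oss-cn-hangzhou.aliyuncs.com", "<access-key-id>", "<access-key-secret>");
        // Print only the keys so that credentials never end up in logs.
        props.keySet().forEach(System.out::println);
    }
}
```

With this approach the warehouse location would use an `s3://` scheme rather than `oss://`, which is the part the author later calls unintuitive.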

@czy006
Contributor

czy006 commented Nov 5, 2024

CI is Failing

@shouwangyw
Contributor Author

shouwangyw commented Nov 5, 2024

> you can use s3 protocol to read it

cc @czy006

But this might not be intuitive

@shouwangyw
Contributor Author

> CI is Failing

I have fixed the CI.

@czy006
Contributor

czy006 commented Nov 6, 2024

> you can use s3 protocol to read it
> cc @czy006
>
> But this might not be intuitive

Of course, this is a compromise. Could we add a compilation switch so that the OSS part is compiled only when it is needed? As support for more object stores is added, compiling all of them by default will increase the size of Amoro's binary installation package.

@shouwangyw
Contributor Author

shouwangyw commented Nov 6, 2024

> you can use s3 protocol to read it
> cc @czy006
>
> But this might not be intuitive
>
> Of course this is a compromise. I would like to ask if we can have a compilation switch to compile the OSS part when we needed? As object storage support increases, compiling all by default will increase Amoro's binary installation package

Yes, I can control whether the Aliyun SDK is packaged through the dependency scope.

@github-actions bot added the type:docs label on Nov 6, 2024
Contributor

@czy006 czy006 left a comment

Thanks for your contribution. I left a few comments, and I suggest including documentation about this feature.

README.md Outdated
@@ -117,6 +117,7 @@ Amoro is built using Maven with JDK 8 and JDK 17(only for `amoro-format-mixed/am
* Build and skip tests: `mvn clean package -DskipTests`
* Build and skip dashboard: `mvn clean package -Pskip-dashboard-build`
* Build and disable disk storage, RocksDB will NOT be introduced to avoid memory overflow: `mvn clean package -DskipTests -Pno-extented-disk-storage`
* Build and disable aliyun sdk: `mvn clean package -DskipTests -Pno-aliyun-sdk`
Contributor

I recommend not compiling Aliyun OSS support by default. Users who want it can enable it with `-Paliyun-oss-sdk`.
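
If the Aliyun SDK becomes opt-in at build time, code that offers the OSS storage type may need to check at runtime whether the SDK was actually packaged. The following is a minimal sketch of such a check, assuming the SDK's `com.aliyun.oss.OSS` interface is a suitable class to probe; the class `OptionalOssSupport` and its method are hypothetical and do not come from this PR.

```java
// Hypothetical helper, not part of Amoro or of this PR.
final class OptionalOssSupport {

    // Assumption: probing the SDK's main client interface is enough to detect the SDK.
    static final String ALIYUN_SDK_CLASS = "com.aliyun.oss.OSS";

    // Returns true if the named class can be located on the classpath, without initializing it.
    static boolean isClassAvailable(String className) {
        try {
            Class.forName(className, false, OptionalOssSupport.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Aliyun OSS SDK on classpath: " + isClassAvailable(ALIYUN_SDK_CLASS));
    }
}
```

A check like this would let a build without the SDK fail with a clear message when an OSS catalog is configured, instead of a `NoClassDefFoundError` deep in the I/O layer.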

@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 22.54%. Comparing base (f13f3f3) to head (99e0eba).
Report is 7 commits behind head on master.

❗ There is a different number of reports uploaded between BASE (f13f3f3) and HEAD (99e0eba).

HEAD has 1 upload less than BASE

Flag   BASE (f13f3f3)   HEAD (99e0eba)
core   1                0
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #3306      +/-   ##
============================================
- Coverage     30.17%   22.54%   -7.63%     
+ Complexity     3840     2312    -1528     
============================================
  Files           580      397     -183     
  Lines         48020    38201    -9819     
  Branches       6207     5437     -770     
============================================
- Hits          14488     8611    -5877     
+ Misses        32542    28863    -3679     
+ Partials        990      727     -263     
Flag    Coverage Δ
core    ?
trino   22.54% <ø> (?)


@shouwangyw shouwangyw requested a review from czy006 November 7, 2024 13:11
Contributor

@zhoujinsong zhoujinsong left a comment

@shouwangyw Thanks for the contribution!
The PR looks good to me in general. I left a question; PTAL when you are free.

#

### get catalog config
GET http://localhost:1630/api/iceberg/rest/v1/config?warehouse=iceberg
Contributor

What is this new file used for?

Contributor Author

This file makes it easy to test the HTTP interfaces in a local environment. It can be run with the 'HTTP Client' plugin for IDEA.

Contributor

So is this used for local testing? I'm not sure if this file is necessary for other developers. If it is, we might need to add a README to explain how to use it.

Additionally, I'm not sure if using the IDEA plugin is a reasonable way to test the REST API. Perhaps we should integrate them into unit tests or integration tests.
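
As one possible shape for such a test, the request from the `.http` file could be issued from plain Java. This is only a sketch: the class `IcebergRestConfigCheck` and its methods are hypothetical, the base URL and warehouse name are copied from the file under review, and `fetchStatus` needs a locally running AMS, so only the URI construction could be asserted in a unit test without extra setup.

```java
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;

// Hypothetical helper, not part of Amoro or of this PR.
final class IcebergRestConfigCheck {

    // Builds the "get catalog config" URI used in the .http file.
    static URI configUri(String baseUrl, String warehouse) {
        try {
            // The (String, String) overload is used so that the code compiles on JDK 8.
            String encoded = URLEncoder.encode(warehouse, "UTF-8");
            return URI.create(baseUrl + "/api/iceberg/rest/v1/config?warehouse=" + encoded);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Sends a GET request and returns the HTTP status code.
    static int fetchStatus(URI uri) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        URI uri = configUri("http://localhost:1630", "iceberg");
        System.out.println(uri + " -> HTTP " + fetchStatus(uri));
    }
}
```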

Contributor Author

This may not be suitable for unit tests or integration tests, as it relies on the local environment. I added a README file to explain it.

Contributor

@zhoujinsong zhoujinsong left a comment

LGTM.

Thanks for the contribution!

@zhoujinsong zhoujinsong merged commit f88b7a6 into apache:master Nov 21, 2024
4 checks passed
Labels
module:ams-dashboard, module:ams-server, module:common, module:mixed-spark, type:build, type:docs
Development

Successfully merging this pull request may close these issues.

[Feature]: Support OSS for iceberg in InternalCatalog
4 participants