Close #LGVISIUM-101: Removed deprecated AWS Lambda script and removed the AWS Region from the environment variables #99

Conversation

@dcleres dcleres (Contributor) commented Oct 21, 2024

After working together with Oliver from Geowerkstatt, we were able to find a setup that makes it possible to integrate the data extraction API into the borehole application. This mainly involved figuring out which environment variables the API needs in order to connect to the right S3 bucket in the correct region.

As it is also very unlikely that we will deploy the API using AWS Lambda in the near future, I removed that code to avoid any confusion.

The README was also updated to document only the option of passing the AWS credentials as environment variables to the container, as this is how the container is used in production.
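For a quick start, here is a minimal `.env` sketch of the variables the container presumably expects after this change. `AWS_ENDPOINT` and `AWS_S3_BUCKET` come from the README snippet discussed below; the credential names are the standard ones read by the AWS SDKs, and the exact set the API expects is an assumption:

```
# Hypothetical .env for running the data extraction API container.
AWS_ENDPOINT=your_endpoint_url
AWS_S3_BUCKET=your_bucket_name
AWS_ACCESS_KEY_ID=your_access_key_id
AWS_SECRET_ACCESS_KEY=your_secret_access_key
# AWS_REGION is intentionally absent: this PR removes it.
```

With Docker, such a file can be passed to the container via `docker run --env-file .env ...`.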

@dcleres dcleres self-assigned this Oct 21, 2024
github-actions bot commented Oct 21, 2024

Coverage Report

| File | Stmts | Miss | Cover | Missing |
| --- | ---: | ---: | ---: | --- |
| **src/stratigraphy** | | | | |
| __init__.py | 8 | 1 | 88% | 11 |
| extract.py | 186 | 186 | 0% | 3–483 |
| get_files.py | 19 | 19 | 0% | 3–47 |
| main.py | 117 | 117 | 0% | 3–308 |
| **src/stratigraphy/data_extractor** | | | | |
| data_extractor.py | 57 | 3 | 95% | 33, 66, 103 |
| **src/stratigraphy/depthcolumn** | | | | |
| boundarydepthcolumnvalidator.py | 41 | 20 | 51% | 47, 57, 60, 81–84, 110–128, 140–149 |
| depthcolumn.py | 194 | 64 | 67% | 25, 29, 50, 56, 59–60, 84, 87, 94, 101, 109–110, 120, 137–153, 191, 228, 247–255, 266, 271, 278, 309, 314–321, 336–337, 380–422 |
| depthcolumnentry.py | 28 | 6 | 79% | 17, 21, 36, 39, 56, 65 |
| find_depth_columns.py | 106 | 19 | 82% | 42–43, 73, 86, 180–181, 225–245 |
| **src/stratigraphy/layer** | | | | |
| layer_identifier_column.py | 74 | 52 | 30% | 16–17, 20, 28, 43, 47, 51, 59–63, 66, 74, 91–96, 99, 112, 125–126, 148–158, 172–199 |
| **src/stratigraphy/lines** | | | | |
| geometric_line_utilities.py | 86 | 2 | 98% | 81, 131 |
| line.py | 51 | 4 | 92% | 25, 50, 60, 110 |
| linesquadtree.py | 46 | 1 | 98% | 75 |
| **src/stratigraphy/metadata** | | | | |
| coordinate_extraction.py | 108 | 5 | 95% | 30, 64, 94–95, 107 |
| **src/stratigraphy/text** | | | | |
| description_block_splitter.py | 70 | 2 | 97% | 24, 139 |
| extract_text.py | 29 | 3 | 90% | 19, 53–54 |
| find_description.py | 64 | 28 | 56% | 27–35, 50–63, 79–95, 172–175 |
| textblock.py | 80 | 9 | 89% | 28, 56, 64, 89, 101, 124, 145, 154, 183 |
| **src/stratigraphy/util** | | | | |
| dataclasses.py | 32 | 3 | 91% | 37–39 |
| interval.py | 104 | 55 | 47% | 29–32, 37–40, 46, 52, 56, 66–68, 107–153, 174, 180–196 |
| predictions.py | 107 | 107 | 0% | 3–282 |
| util.py | 39 | 17 | 56% | 41, 69–76, 90–92, 116–117, 129–133 |
| **TOTAL** | **1652** | **723** | **56%** | |

| Tests | Skipped | Failures | Errors | Time |
| ---: | ---: | ---: | ---: | ---: |
| 82 | 0 💤 | 0 ❌ | 0 🔥 | 6.162s ⏱️ |

@dcleres dcleres force-pushed the LGVISIUM-101-S3-connection-issues-when-the-API-is-deployed-in-the-borehole branch from 05f9c47 to 9a6b890 on October 21, 2024 08:24
@dcleres dcleres force-pushed the LGVISIUM-101-S3-connection-issues-when-the-API-is-deployed-in-the-borehole branch from 9a6b890 to cdc127c on October 21, 2024 08:40
@dcleres dcleres marked this pull request as ready for review October 21, 2024 08:41
README.md (review thread resolved)
Comment on lines 6 to 7:
AWS_ENDPOINT=your_endpoint_url
AWS_S3_BUCKET=your_bucket_name
Contributor commented:
Could you clarify why we need both AWS_ENDPOINT and AWS_S3_BUCKET? Is there any use case where these might refer to different buckets? Or if they always refer to the same bucket, then why can't we always connect using the endpoint URL (which contains the bucket name, if I recall correctly)?

Contributor Author (@dcleres) replied:
Geowerkstatt is using this structure for its environment variables, so I need to comply with it. When we developed the code, we passed all the information in the AWS_ENDPOINT variable, but Oliver told me that they prefer to do it this way. I think this indeed gives you the flexibility to access several S3 buckets from the same AWS account by adapting the bucket name.

If you have a strong opinion about having everything in AWS_ENDPOINT, I can ask Oliver whether it is possible for them to adapt their code.
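A minimal sketch of how the two variables are typically consumed separately (assuming boto3 and the variable names from the snippet above; this is illustrative, not the repository's actual code):

```python
import os

import boto3

# The endpoint URL identifies the S3-compatible service, while the
# bucket name selects one bucket on that service. Keeping them in
# separate variables lets one endpoint serve several buckets.
s3_client = boto3.client(
    "s3",
    endpoint_url=os.environ["AWS_ENDPOINT"],
    # AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY are picked up from
    # the environment by boto3 automatically.
)

# The bucket is passed per call rather than baked into the client,
# so the same client can read from any bucket on the endpoint.
s3_client.download_file(
    Bucket=os.environ["AWS_S3_BUCKET"],
    Key="example/borehole_profile.pdf",  # hypothetical object key
    Filename="/tmp/borehole_profile.pdf",
)
```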

@dcleres dcleres (Contributor Author) commented Oct 22, 2024

@stijnvermeeren-swisstopo I added the information provided by Oliver from Geowerkstatt to the README.

README.md (outdated review thread resolved)
@dcleres dcleres left a comment:

Thank you for adapting the README. Overall, I think it makes sense to remove some of the information regarding the environment variables, as it was taking up quite some space in the README. The rationale behind having all this information was that even people who are not very experienced with the concept of environment variables could get a quick start. With some of the info removed, I think that is still feasible, but it requires a bit of research. That is fine for me; I will keep it in mind for future README edits.

README.md (review thread resolved)
@dcleres dcleres merged commit ed163b0 into main Oct 22, 2024
3 checks passed
@stijnvermeeren-swisstopo stijnvermeeren-swisstopo deleted the LGVISIUM-101-S3-connection-issues-when-the-API-is-deployed-in-the-borehole branch October 23, 2024 06:19