Add ETL pipeline testing QS (e2e-python) #985

Merged
merged 47 commits into from
Feb 13, 2024
56ced84
Add files via upload
iaizarc Jul 25, 2023
7a086e4
Delete requirements.txt
iaizarc Jul 25, 2023
ebadf37
Create e2e-python
iaizarc Jul 25, 2023
677d6dc
Delete e2e-python
iaizarc Jul 25, 2023
897af12
Create README.md
iaizarc Jul 25, 2023
90f2c2f
Merge pull request #3 from opendevstack/master
roicarrera Jan 17, 2024
5e0f7c8
remove e2e-python folder
cristianhkr Jan 17, 2024
da44921
add e2e-python folder and content
cristianhkr Jan 17, 2024
cc4059b
use functional pre/post_requistes.py in demo tests
cristianhkr Jan 17, 2024
bf4a0e6
remove person expectation expect_table_row_count_to_equal
cristianhkr Jan 18, 2024
8f30f5d
1. Sample tests working
cristianhkr Jan 23, 2024
f7a3e15
remove escaping of testcase name
cristianhkr Jan 24, 2024
6a1221c
replace testpg by projectId in the variables.tf as a default value
cristianhkr Jan 24, 2024
7b2887a
rephrase the comments in the post and pre requisties.py
cristianhkr Jan 24, 2024
16a202b
Merge pull request #4 from iaizarc/add_e2e-python
cristianhkr Jan 24, 2024
3ad8d40
Update CHANGELOG.md
cristianhkr Jan 24, 2024
ca30f7b
Merge pull request #5 from iaizarc/add_changelog
cristianhkr Jan 24, 2024
9026ecc
Update CHANGELOG.md including ods changes
cristianhkr Jan 24, 2024
21257d4
Merge pull request #6 from iaizarc/udpate_changelogs
cristianhkr Jan 24, 2024
4fab27a
Update README.md
cristianhkr Jan 26, 2024
8d14340
Update README.md root
cristianhkr Jan 26, 2024
ce8f907
README.md root now is generic and intern README.md added + requirements
cristianhkr Jan 29, 2024
78dfc75
README.md root link updated
cristianhkr Jan 29, 2024
6c4c43a
README.md root updated
cristianhkr Jan 29, 2024
6529de4
update README.md
cristianhkr Jan 29, 2024
b969502
Merge pull request #7 from iaizarc/readme_update
roicarrera Jan 29, 2024
a6c84ea
update README.md
cristianhkr Jan 29, 2024
43d5671
update CHANGELOG.md
cristianhkr Jan 29, 2024
281a62b
add reference e2e-python to CHANGELOG.md
cristianhkr Jan 29, 2024
41e78cd
Merge pull request #8 from iaizarc/readme_update
roicarrera Jan 29, 2024
66660ea
added latest ods changes
roicarrera Jan 30, 2024
2fba61e
removed unneeded commented line in Jenkinsfile.template
roicarrera Jan 30, 2024
e2e572f
Merge branch 'master' of https://github.com/opendevstack/ods-quicksta…
roicarrera Feb 1, 2024
e6fe5fe
Merge pull request #9 from iaizarc/readme_update
roicarrera Feb 1, 2024
214ae2b
update README.md with use cases
cristianhkr Feb 5, 2024
94e105f
update README.md with use cases
cristianhkr Feb 5, 2024
5c708d6
Merge remote-tracking branch 'origin/readme_update' into readme_update
cristianhkr Feb 5, 2024
2c766f7
rename qs and create ods webpage
cristianhkr Feb 6, 2024
82c4273
duplicated code don't really know why
cristianhkr Feb 6, 2024
e8d133a
Merge pull request #10 from iaizarc/readme_update
cristianhkr Feb 6, 2024
fb51f56
update index and nav docs adding reference e2e-etl-python
cristianhkr Feb 6, 2024
9780365
Merge pull request #11 from iaizarc/readme_update
cristianhkr Feb 6, 2024
ee69f56
remove test folder
cristianhkr Feb 7, 2024
b825efc
change comment on the Makefile
cristianhkr Feb 8, 2024
dc4650d
Merge pull request #12 from iaizarc/readme_update
roicarrera Feb 8, 2024
670540a
add default value to AWS_REGION
cristianhkr Feb 12, 2024
afeffb9
Merge pull request #13 from iaizarc/readme_update
roicarrera Feb 12, 2024
2 changes: 1 addition & 1 deletion .github/dependabot.yml
Original file line number Diff line number Diff line change
@@ -10,4 +10,4 @@ updates:
interval: "weekly"
labels:
- "dependencies"
- "skip changelog"
- "skip changelog"
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -2,6 +2,7 @@

### Added
- Rust Quickstarter with Axum web framework simple boilerplate ([#980](https://github.com/opendevstack/ods-quickstarters/issues/980))
- Added ETL pipeline testing QS (e2e-python) ([#985](https://github.com/opendevstack/ods-quickstarters/pull/985))
- Update gateway-Nginx quickstarter ([#983](https://github.com/opendevstack/ods-quickstarters/pull/983))
- Added secret scanning in docker plain ([#963](https://github.com/opendevstack/ods-quickstarters/pull/963))
- Added Nodejs20 agent ([#962](https://github.com/opendevstack/ods-quickstarters/issues/962))
1 change: 1 addition & 0 deletions docs/modules/quickstarters/nav.adoc
@@ -13,6 +13,7 @@
** xref:quickstarters:ds-rshiny.adoc[Data Science RShiny app]
** xref:quickstarters:ds-streamlit.adoc[Data Science Streamlit app]
** xref:quickstarters:e2e-cypress.adoc[Cypress E2E testing]
** xref:quickstarters:e2e-etl-python.adoc[ETL Python E2E testing]
** xref:quickstarters:e2e-spock-geb.adoc[Spock, Geb and Unirest E2E testing]
** xref:quickstarters:inf-terraform-aws.adoc[INF Terraform AWS]
** xref:quickstarters:inf-terraform-azure.adoc[INF Terraform AZURE]
46 changes: 46 additions & 0 deletions docs/modules/quickstarters/pages/e2e-etl-python.adoc
@@ -0,0 +1,46 @@
= End-to-end tests with Great Expectations and Pytest (e2e-etl-python)

Quickstarter project for end-to-end testing of ETLs.

== Purpose of this quickstarter

This is a Python-based quickstarter for developing end-to-end tests for data pipelines.
To do that, it uses two testing technologies:

1. Great Expectations, for testing data transformations on data held in relational tables.
For example, you can test the schema of a database, the number of rows in a table, or that a specific column has no null values.
2. Pytest together with Boto, for testing ETL triggers, the notification system, the contents of S3 buckets, etc.
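
The kinds of table checks listed above (schema, row count, no null values) can be sketched in plain Python. This is a minimal illustration only: it uses the standard-library `sqlite3` module as a stand-in and does not use the Great Expectations API. The `person` table, its columns, and all helper names are hypothetical.

```python
import sqlite3


def table_columns(conn, table):
    """Return the column names of `table` in declaration order."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def row_count(conn, table):
    """Return the number of rows in `table`."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def null_count(conn, table, column):
    """Return how many rows have NULL in `column`."""
    query = f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
    return conn.execute(query).fetchone()[0]


def build_demo_db():
    """Create an in-memory database with a small, made-up `person` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO person VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
    return conn


def test_person_table():
    """Pytest-style test combining the three checks."""
    conn = build_demo_db()
    assert table_columns(conn, "person") == ["id", "name"]
    assert row_count(conn, "person") == 2
    assert null_count(conn, "person", "name") == 0
```

In a real project the same assertions would typically be expressed as Great Expectations expectations against the target database rather than hand-written SQL.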

== What files / architecture is generated?

----
├── Jenkinsfile - This file contains Jenkins stages.
├── README.md
├── environments
│ ├── dev.json - This file describes parameters for the development AWS environment.
│ ├── test.json - This file describes parameters for the test AWS environment.
│ └── prod.json - This file describes parameters for the production AWS environment.
├── tests - This folder contains the root for test-kitchen
│ ├── acceptance/great_expectations - This folder contains the Great Expectations tests.
│ └── acceptance/pytest - This folder contains the pytest tests.


----

== Frameworks used

* https://greatexpectations.io[Great-expectations]
* https://pytest.org[Pytest]


== Usage - how do you start after you provisioned this quickstarter

Check the README.md file at root level for further instructions after the quickstarter has been provisioned.


== Builder agent used

This quickstarter uses the https://github.com/opendevstack/ods-quickstarters/tree/master/common/jenkins-agents/terraform[terraform] Jenkins agent.

== Known limitations

Let us know if you find any, thanks!
1 change: 1 addition & 0 deletions docs/modules/quickstarters/pages/index.adoc
@@ -42,6 +42,7 @@ Quickstarters are used from the https://github.com/opendevstack/ods-provisioning
=== E2E Test Quickstarter
* xref::e2e-cypress.adoc[E2E test - Cypress]
* xref::e2e-spock-geb.adoc[E2E test - Spock / Geb]
* xref::e2e-etl-python.adoc[E2E test - ETL Python]

=== Infrastructure Terraform Quickstarter
* xref::inf-terraform-aws.adoc[AWS deployments utilizing terraform tooling]
48 changes: 48 additions & 0 deletions e2e-etl-python/Jenkinsfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
def odsNamespace = ''
def odsGitRef = ''
def odsImageTag = ''
def sharedLibraryRef = ''
def agentImageTag = ''

node {
odsNamespace = env.ODS_NAMESPACE ?: 'ods'
odsGitRef = env.ODS_GIT_REF ?: 'master'
odsImageTag = env.ODS_IMAGE_TAG ?: 'latest'
sharedLibraryRef = env.SHARED_LIBRARY_REF ?: odsImageTag
agentImageTag = env.AGENT_IMAGE_TAG ?: odsImageTag
}

library("ods-jenkins-shared-library@${sharedLibraryRef}")

odsQuickstarterPipeline(
imageStreamTag: "${odsNamespace}/jenkins-agent-base:${agentImageTag}",
) { context ->

odsQuickstarterStageCopyFiles(context)

odsQuickstarterStageRenderJenkinsfile(context)

odsQuickstarterStageRenderJenkinsfile(
context,
[source: 'dev.yml.template',
target: 'environments/dev.yml']
)

odsQuickstarterStageRenderJenkinsfile(
context,
[source: 'test.yml.template',
target: 'environments/test.yml']
)

odsQuickstarterStageRenderJenkinsfile(
context,
[source: 'prod.yml.template',
target: 'environments/prod.yml']
)

odsQuickstarterStageRenderJenkinsfile(
context,
[source: 'testing.yml.template',
target: 'environments/testing.yml']
)
}
183 changes: 183 additions & 0 deletions e2e-etl-python/Jenkinsfile.template
Original file line number Diff line number Diff line change
@@ -0,0 +1,183 @@
/* generated jenkins file used for building and deploying AWS-infrastructure in projects */

@Library('ods-jenkins-shared-library@@shared_library_ref@') _

node {
aws_region = env.AWS_REGION ?: 'eu-west-1'
dockerRegistry = env.DOCKER_REGISTRY
}

odsComponentPipeline(
podContainers: [
containerTemplate(
name: 'jnlp',
image: "${dockerRegistry}/ods/jenkins-agent-terraform-2306:@shared_library_ref@",
envVars: [
envVar(key: 'AWS_REGION', value: aws_region)
],
alwaysPullImage: true,
args: '${computer.jnlpmac} ${computer.name}'
)
],
branchToEnvironmentMapping: [
'*': 'dev',
// 'release/': 'test'
]
) { context ->
getEnvironment(context)
addVars2envJsonFile(context)
odsComponentStageInfrastructure(context, [cloudProvider: 'AWS'])

withEnv(["AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}",
"AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}"
])
{
stage ("AWS Testing Preparation"){
generateTerraformOutputsFile()
}

def outputNames = stageGetNamesFromOutputs()
def aws_pipelineName = outputNames.aws_codepipeline_name
def bitbuckets3_name = outputNames.bitbuckets3_name
def results3_name = outputNames.results3_name

stage ("Publish Bitbucket Code To AWS"){
publishBitbucketCodeToAWS(context, bitbuckets3_name)
}

stage ("Run Tests"){
awsCodePipelineTrigger(context, aws_pipelineName)
awsCodePipelineWaitForExecution(context, aws_pipelineName)
}

stage ("Test Results"){
retrieveReportsFromAWS(context, results3_name)
archiveArtifacts artifacts: "build/test-results/test/**", allowEmptyArchive: true
junit(testResults:'build/test-results/test/*.xml', allowEmptyResults: true)
stash(name: "acceptance-test-reports-junit-xml-${context.componentId}-${context.buildNumber}", includes: "build/test-results/test/acceptance*junit.xml", allowEmpty: true)
stash(name: "installation-test-reports-junit-xml-${context.componentId}-${context.buildNumber}", includes: "build/test-results/test/installation*junit.xml", allowEmpty: true)
stash(name: "integration-test-reports-junit-xml-${context.componentId}-${context.buildNumber}", includes: "build/test-results/test/integration*junit.xml", allowEmpty: true)
}
}

}

def getEnvironment(def context){
sh "echo Get Environment Variables"
AWS_ACCESS_KEY_ID = sh(returnStdout: true, script:"oc get secret aws-access-key-id-${context.environment} --namespace ${context.cdProject} --output jsonpath='{.data.secrettext}' | base64 -d")
AWS_SECRET_ACCESS_KEY = sh(returnStdout: true, script:"oc get secret aws-secret-access-key-${context.environment} --namespace ${context.cdProject} --output jsonpath='{.data.secrettext}' | base64 -d")

}


def generateTerraformOutputsFile() {
sh 'terraform output -json > terraform_outputs.json'
sh 'cat terraform_outputs.json'
}

def stageGetNamesFromOutputs() {
def outputNames = [:]
def terraformOutputJson = readJSON file: 'terraform_outputs.json'

outputNames.aws_codepipeline_name = terraformOutputJson.codepipeline_name.value
outputNames.bitbuckets3_name = terraformOutputJson.bitbucket_s3bucket_name.value
outputNames.results3_name = terraformOutputJson.e2e_results_bucket_name.value

return outputNames
}

def awsCodePipelineTrigger(def context, pipelineName) {
sh "aws codepipeline start-pipeline-execution --name ${pipelineName}"
}


def awsCodePipelineWaitForExecution(def context, pipelineName) {
def pipelineExecutionStatus = ''

while (true) {
pipelineExecutionStatus = ''
sleep(time: 40, unit: 'SECONDS')
def pipelineState = sh(
script: "aws codepipeline get-pipeline-state --name ${pipelineName} --query 'stageStates[*]' --output json",
returnStdout: true
).trim()

def pipelineStages = readJSON(text: pipelineState)

pipelineStages.each { stage ->
def stageName = stage.stageName
// latestExecution may be absent for a stage that has not run yet, so access it null-safely
def stageStatus = stage.latestExecution?.status
echo "Stage: ${stageName}, Status: ${stageStatus}"

if (stageStatus == 'InProgress') {
pipelineExecutionStatus = 'InProgress'
return
} else if (stageStatus == 'Failed') {
pipelineExecutionStatus = 'Failed'
echo "Pipeline execution failed at stage ${stageName}"
error("Pipeline execution failed at stage ${stageName}")
return
}
}

if (pipelineExecutionStatus == 'InProgress') {
continue
} else if (pipelineExecutionStatus == 'Failed') {
// stageName only exists inside the each-closure above, so it cannot be referenced here
echo 'Pipeline execution failed.'
break
} else {
echo 'Pipeline execution completed successfully.'
break
}
}
}



def publishBitbucketCodeToAWS(def context, bitbuckets3_name) {
def branch = context.gitBranch
def repository = context.componentId
zip zipFile: "${repository}-${branch}.zip", archive: false, dir: '.'
sh "aws s3 cp ${repository}-${branch}.zip s3://${bitbuckets3_name}/${repository}-${branch}.zip"
}

def retrieveReportsFromAWS(def context, results3_name) {
sh "aws s3 cp s3://${results3_name}/junit/acceptance_GX_junit.xml ./build/test-results/test/acceptance_GX_junit.xml"
sh "aws s3 cp s3://${results3_name}/junit/acceptance_pytest_junit.xml ./build/test-results/test/acceptance_pytest_junit.xml"
sh "aws s3 cp s3://${results3_name}/junit/installation_pytest_junit.xml ./build/test-results/test/installation_pytest_junit.xml"
sh "aws s3 cp s3://${results3_name}/junit/integration_pytest_junit.xml ./build/test-results/test/integration_pytest_junit.xml"

sh "aws s3 cp s3://${results3_name}/GX_test_results ./build/test-results/test/artifacts/acceptance/acceptance_GX_report --recursive"
sh "aws s3 cp s3://${results3_name}/GX_jsons ./build/test-results/test/artifacts/acceptance/GX_jsons --recursive"
sh "aws s3 cp s3://${results3_name}/pytest_results/acceptance/acceptance_allure_report_complete.html ./build/test-results/test/artifacts/acceptance/acceptance_pytest_report.html"
sh "aws s3 cp s3://${results3_name}/pytest_results/installation/installation_allure_report_complete.html ./build/test-results/test/artifacts/installation/installation_pytest_report.html"
sh "aws s3 cp s3://${results3_name}/pytest_results/integration/integration_allure_report_complete.html ./build/test-results/test/artifacts/integration/integration_pytest_report.html"

sh "ls build/test-results/test"
}

def addVars2envJsonFile(def context) {
echo "Starting addVars2envJsonFile"
def environment = context.environment
def projectId = context.projectId
def branch_name = context.gitBranch
def repository = context.componentId
def filePath = "./environments/${environment}.json"

def existingJson = readFile file: filePath
def existingData = readJSON text: existingJson

existingData.environment = environment
existingData.projectId = projectId
existingData.aws_region = aws_region
existingData.repository = repository
existingData.branch_name = branch_name

echo "Environment: ${existingData}"

def updatedJson = groovy.json.JsonOutput.toJson(existingData)
writeFile file: filePath, text: updatedJson

echo "Finishing addVars2envJsonFile"
}
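
The status handling in `awsCodePipelineWaitForExecution` above boils down to reducing the per-stage states to one overall result on each poll. A minimal Python sketch of that reduction follows, assuming the JSON shape returned by `aws codepipeline get-pipeline-state --query 'stageStates[*]'`. The sample payload, the stage names, the function name `overall_status`, and the choice to ignore stages without a `latestExecution` entry are all assumptions made for illustration; this is not part of the pull request.

```python
import json

# Hypothetical payload shaped like the stageStates list; stage names are made up.
SAMPLE = """
[
  {"stageName": "Source", "latestExecution": {"status": "Succeeded"}},
  {"stageName": "Test", "latestExecution": {"status": "InProgress"}},
  {"stageName": "Report"}
]
"""


def overall_status(stage_states):
    """Reduce per-stage states to 'Failed', 'InProgress' or 'Completed'."""
    # Stages without a latestExecution entry contribute None (assumption: skip them).
    statuses = [
        (stage.get("latestExecution") or {}).get("status")
        for stage in stage_states
    ]
    if "Failed" in statuses:
        return "Failed"  # any failed stage fails the whole run
    if "InProgress" in statuses:
        return "InProgress"  # keep polling
    return "Completed"


if __name__ == "__main__":
    print(overall_status(json.loads(SAMPLE)))
```

As in the Groovy loop, a failed stage takes precedence over one that is still running, and anything else is treated as a finished run.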

5 changes: 5 additions & 0 deletions e2e-etl-python/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
# e2e-etl-python Quickstarter (e2e-etl-python)

Documentation is located in our [official documentation](https://www.opendevstack.org/ods-documentation/opendevstack/latest/getting-started/index.html)

Please update documentation in the [antora page directory](https://github.com/opendevstack/ods-quickstarters/tree/master/docs/modules/quickstarters/pages)
7 changes: 7 additions & 0 deletions e2e-etl-python/dev.yml.template
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
region: eu-west-1

credentials:
key: @project_id@-cd-aws-access-key-id-dev
secret: @project_id@-cd-aws-secret-access-key-dev

account: "<your_aws_account_id>"
19 changes: 19 additions & 0 deletions e2e-etl-python/files/.editorconfig
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
# EditorConfig is awesome: http://EditorConfig.org

# top-most EditorConfig file
root = true

[*]
charset = utf-8
end_of_line = lf
indent_size = 2
indent_style = space
insert_final_newline = true
trim_trailing_whitespace = true

[*.md]
trim_trailing_whitespace = false ; trimming trailing whitespace may break Markdown

[Makefile]
tab_width = 2
indent_style = tab
20 changes: 20 additions & 0 deletions e2e-etl-python/files/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
.bundle
.kitchen
.terraform
.terraform.lock.hcl
.terraform-data.json
.vscode
.devcontainer/devcontainer.json
*.auto.tfvars*
inspec.lock
outputs.json
terraform.tfvars*
terraform.tfstate*
tfplan
vendor
test/integration/*/files/*.json
test/integration/*/files/*.yml
reports/install/*
!reports/install/.gitkeep
Pipfile.lock
.venv