Snowflake example #1481

Open: wants to merge 6 commits into base: master
4 changes: 4 additions & 0 deletions snowflake-py-pulumi-search-export/.gitignore
@@ -0,0 +1,4 @@
*.pyc
venv/
.vscode
.build
6 changes: 6 additions & 0 deletions snowflake-py-pulumi-search-export/Pulumi.yaml
@@ -0,0 +1,6 @@
name: snowflake-pulumi-search-export-py
runtime:
  name: python
  options:
    virtualenv: venv
description: A minimal Python Pulumi program
Contributor

Should this be something like "An example of exporting Pulumi Cloud data to Snowflake"?

Member Author

Yes it should.

81 changes: 81 additions & 0 deletions snowflake-py-pulumi-search-export/README.md
@@ -0,0 +1,81 @@
# Pulumi Cloud Data Export to Snowflake using AWS Lambda

This folder contains an example that extracts a [Pulumi Cloud Data Export](https://www.pulumi.com/docs/pulumi-cloud/cloud-rest-api/#data-export) and loads it into Snowflake. The exported data contains detailed information about all of your resources that are managed by Pulumi. After deploying and running this example, you can query your Pulumi Cloud data in Snowflake directly or join it with other data sets of your choosing (such as pricing) to build dashboards that give you visibility into your organization's cloud usage.

The infrastructure contains the following resources:

![Architecture diagram showing a Lambda function reading a CSV from the Pulumi API and writing it to an S3 bucket, and a Snowflake pipe reading the file into Snowflake](images/snowflake-pulumi-architecture.png)

- An S3 bucket that holds the data exported from Pulumi Cloud. (The exported data is in CSV format.)
- An AWS Lambda function that queries the Pulumi Cloud REST API to [export search data](https://www.pulumi.com/docs/pulumi-cloud/cloud-rest-api/#resource-search) and places the resulting file in the S3 bucket.
- Snowflake resources (database, schema, table) to hold the data along with [Snowpipe](https://docs.snowflake.com/en/user-guide/data-load-snowpipe-intro) resources that automatically import the data whenever a file is written to the S3 bucket. The Snowflake table is designed to be append-only while still allowing easy point-in-time queries.
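
The Lambda's job described above can be sketched roughly as follows. This is a minimal illustration, not the handler shipped in this example: the endpoint path, the environment variable names (`PULUMI_ORG`, `PULUMI_ACCESS_TOKEN`, `BUCKET_NAME`), and the S3 key layout are all assumptions made for the sketch, so check them against the Pulumi Cloud REST API documentation and the program in this folder.

```python
import datetime as dt
import os
import urllib.request

# Assumed base URL and endpoint shape for the resource search export.
PULUMI_API = "https://api.pulumi.com"


def build_export_request(org: str, token: str) -> urllib.request.Request:
    """Build the GET request for the CSV export of an organization's resources."""
    url = f"{PULUMI_API}/api/orgs/{org}/search/resources/export"
    return urllib.request.Request(url, headers={"Authorization": f"token {token}"})


def object_key(now: dt.datetime) -> str:
    """Name each export by its timestamp so every run adds a new file (hypothetical layout)."""
    return f"pulumi-export/{now:%Y/%m/%d}/resources-{now:%H%M%S}.csv"


def handler(event, context):
    """Hypothetical Lambda entry point: download the CSV and write it to S3."""
    import boto3  # imported lazily; provided by the AWS Lambda Python runtime

    request = build_export_request(
        os.environ["PULUMI_ORG"], os.environ["PULUMI_ACCESS_TOKEN"]
    )
    with urllib.request.urlopen(request) as response:
        body = response.read()

    key = object_key(dt.datetime.now(dt.timezone.utc))
    boto3.client("s3").put_object(Bucket=os.environ["BUCKET_NAME"], Key=key, Body=body)
    return {"key": key, "bytes": len(body)}
```

Writing each export under a new, timestamped key means the bucket only ever gains files, which fits a pipe that loads every newly written object.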

## Prerequisites

1. [Install Pulumi](https://www.pulumi.com/docs/get-started/install/)
1. [Configure AWS credentials](https://www.pulumi.com/registry/packages/aws/installation-configuration/#configuration)
1. [Configure Snowflake credentials](https://www.pulumi.com/registry/packages/snowflake/installation-configuration/#configuring-credentials)
1. [Install Python](https://www.pulumi.com/docs/languages-sdks/python/)

## Deploy the App

### Step 1: Initialize the Project

1. Install packages:

```bash
python3 -m venv venv
venv/bin/pip install -r requirements.txt
```

1. Create a new Pulumi stack:

```bash
pulumi stack init
```

1. Deploy the Pulumi stack:

```bash
pulumi up
```

Once the `pulumi up` command completes, invoke the Lambda, which pulls the data from the Pulumi Cloud API and places it in the S3 bucket.

### Step 2: Trigger the Lambda

Trigger the Lambda with the following command:

```bash
aws lambda invoke --function-name $(pulumi stack output lambdaArn) /dev/stdout
```

You should see output similar to the following:

```json
{
    "StatusCode": 200,
    "ExecutedVersion": "$LATEST"
}
```

After a few seconds, your data should be visible in your Snowflake database:

![Screenshot of Snowflake Worksheet showing querying of imported Pulumi data](images/snowflake-query.png)
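
Because the table is append-only, each export adds a full snapshot of rows rather than updating existing ones, so a point-in-time query selects the most recent snapshot at or before the moment of interest. The sketch below illustrates that query pattern using SQLite from the Python standard library as a stand-in for Snowflake. The table and column names (`resources`, `loaded_at`, `urn`, `type`) and the sample rows are hypothetical and may not match the schema this example creates.

```python
import sqlite3

# In-memory stand-in for the Snowflake table; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE resources (loaded_at TEXT, urn TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO resources VALUES (?, ?, ?)",
    [
        # First export: one resource.
        ("2024-01-01T00:00:00Z", "urn:a", "aws:s3/bucket:Bucket"),
        # Second export: the same resource plus a new one.
        ("2024-01-02T00:00:00Z", "urn:a", "aws:s3/bucket:Bucket"),
        ("2024-01-02T00:00:00Z", "urn:b", "aws:lambda/function:Function"),
    ],
)

# Select the rows of the latest snapshot loaded at or before the given time.
POINT_IN_TIME = """
SELECT urn FROM resources
WHERE loaded_at = (SELECT MAX(loaded_at) FROM resources WHERE loaded_at <= ?)
ORDER BY urn
"""


def resources_as_of(timestamp: str) -> list[str]:
    """Return the URNs that existed in the snapshot current at `timestamp`."""
    return [row[0] for row in conn.execute(POINT_IN_TIME, (timestamp,))]


print(resources_as_of("2024-01-01T12:00:00Z"))
print(resources_as_of("2024-01-03T00:00:00Z"))
```

The same `MAX(...) <= ?` subquery shape should carry over to Snowflake SQL, with the load timestamp column replaced by whatever this example's table actually uses.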

## Clean Up

Once you're finished experimenting, you can destroy your stack and remove it to avoid incurring any additional cost:

```bash
pulumi destroy
pulumi stack rm
```

## Summary

In this tutorial, you created a simple extract/load process that exports data from the Pulumi Cloud API to Snowflake. Now you can query this data in Snowflake and join it with other data sets to gain valuable insights into your organization's cloud usage!

## Next Steps

To enhance this architecture, you could [add a rule to run the Lambda on a schedule](https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-run-lambda-schedule.html).