Skip to content

hfoley/lakehouse

Repository files navigation

Building Lakehouse with Azure Synapse Analytics

In this workshop we'll show how we can setup a full data lake house architecture. We'll build out the components in Azure using PowerShell automated scripts, including metadata driven pipelines to saturate your data lake. We'll also build out external tables and views and connect Power BI to them.

1. Build components in Azure with PowerShell scripts 
2. Build and run metadata driven pipeline to build data lake
3. Build external table/view in Synapse Analytics with serverless SQL pool 
4. Connect to table with Power BI  

The architecture of the solution built in this workshop is diagrammed below.

alt text

alt text

Workshop Modules

The workshop is broken into the modules below. Complete 00 PreReqs prior to the workshop. We'll go through 01-03 together.

Asset List - These items will be created in your Azure subscription

At the completion of this workshop you'll build these assets in Azure.

1. Azure Resource Group
2. Azure SQL Server & Database - source table to extract data and metadata tables location 
3. Azure Data Lake Gen 2 - Synapse Analytics requires an ADLS Gen 2 account for system related usage
3. Azure Data Lake Gen 2 - Separate ADLS Gen 2 we'll use as our data lake and extracted parquet files 
4. Azure Synapse Workspace - workspace where pipelines and SQL serverless pool, external tables, and views will live
5. Azure Synapse - SQL Date Based Extract pipeline - extracts data from SQL Server tables specified (example uses Azure SQL DB created or specified) by a date range
6. Azure Synapse - SQL Date Not Date Based Extract pipeline - extracts data from SQL Server tables specified (example uses Azure SQL DB created or specified) by a specified value 

About

Data lakehouse workshop

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published