This project depends on the following two data sets:
- SNOWFLAKE_SAMPLE_DATA, which is available by default in all new Snowflake accounts
- Cybersyn Financial & Economic Essentials
  - Available for free in the Snowflake Data Marketplace
  - If your copy is named anything other than "CYBERSYN_FINANCIAL_ECONOMIC_ESSENTIALS", update the database name in sources.yml (a sketch of the relevant setting is shown below)
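For reference, a dbt source definition points at a database by name, so this is a one-line change. The following is only a hypothetical sketch of the shape of such an entry; the source and table names are placeholders rather than the ones used in this project, and only the `database:` line would need to change.

```yaml
version: 2

sources:
  - name: cybersyn                                     # placeholder source name
    database: CYBERSYN_FINANCIAL_ECONOMIC_ESSENTIALS   # change this if you installed the share under a different name
    tables:
      - name: stock_price_timeseries                   # placeholder table name
```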
- If you can install your own software, I generally recommend using Miniforge to create an isolated Python environment just for dbt. Miniforge is completely open source, does not require a license like Anaconda or Miniconda, and uses the free conda-forge repository for packages. A dbt-conda-env.yml file has been provided so you can set up this environment and switch to it with the following commands (a rough sketch of what such an environment file typically contains appears after the SSL note below). This is often much simpler than other flavors of Python.
```
conda env create -f dbt-conda-env.yml
conda activate dbt
```
- If you have SSL errors, you may need to install pip_system_certs into your base environment first. The dbt-conda-env.yml file already includes this for your child environment. The `--trusted-host` parameters below will allow you to bypass firewall issues.
```
conda activate base
python -m pip install pip -U --trusted-host pypi.org --trusted-host pypi.python.org --trusted-host files.pythonhosted.org
python -m pip install pip_system_certs -U --trusted-host pypi.org --trusted-host pypi.python.org --trusted-host files.pythonhosted.org
```
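The dbt-conda-env.yml shipped with the project is the authoritative version. Purely as an illustration of the format, a minimal conda environment file for dbt might look roughly like this; the Python version pin here is an assumption, and the pip_system_certs entry reflects the note above.

```yaml
# Hypothetical sketch only - use the dbt-conda-env.yml provided with the project
name: dbt
channels:
  - conda-forge
dependencies:
  - python=3.11
  - pip
  - pip:
      - dbt-core
      - dbt-snowflake
      - pip_system_certs
```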
- On Windows, install Python using your company's Self Service portal or download the open source installer from Python.org.
- First you will want to see if you have the correct version of Python in your path using a Windows Command Prompt:
```
where python
python --version
```
- It should return a path like `C:\Program Files\Python311`. You can use Python 3.8 or higher. If you can't run `python --version`, it is likely that you do not have the right version of Python in your PATH.
- If this is not the correct version of Python or the wrong location, you can update your PATH on Windows 10 & 11 using the following:
- Open Start Search, type “env”, and select “Edit the system environment variables”.
- Click the “Environment Variables…” button.
- In the “User Variables” section, locate “Path”, and click edit.
- In the “Edit environment variable” UI, click “New” to add the new path to your preferred version of Python.
- Use the "Move Up" button to make your new path the first entry
- Close and reopen any command prompts or VS Code to use the new PATH
- Use the `where python` and `python --version` commands again to verify that you are now using the correct version.
- Next make sure pip is up to date and that pip_system_certs is also installed. The `--trusted-host` parameters below will help you avoid firewall issues.
```
python -m pip install pip -U --trusted-host pypi.org --trusted-host pypi.python.org --trusted-host files.pythonhosted.org
python -m pip install pip_system_certs -U --trusted-host pypi.org --trusted-host pypi.python.org --trusted-host files.pythonhosted.org
```
- Next install virtualenv and create a new virtual environment called `dbt`:
```
python -m pip install --user virtualenv
python -m venv dbt
```
- Now you can activate your virtual environment, verify that the location of python has changed, and install dbt:
```
.\dbt\Scripts\activate
where python
python -m pip install -U dbt-core dbt-snowflake
dbt --version
```
- On Mac or Linux, install Python using your company's Self Service portal or download the open source installer from Python.org.
- First you will want to see if you have the correct version of Python in your path using a shell:
```
which python
python --version
```
- You can use Python 3.8 or higher. If you can't run `python --version`, it is likely that you do not have the right version of Python in your PATH.
- This command can be used to prepend your Python folder to the existing PATH. You may need to add it to your .bash_profile or other shell configuration file to make the change permanent.
export PATH=/my/path/to/python/:$PATH
- Next make sure pip is up to date and that pip_system_certs is also installed. The `--trusted-host` parameters below will help you avoid firewall issues.
```
python -m pip install pip -U --trusted-host pypi.org --trusted-host pypi.python.org --trusted-host files.pythonhosted.org
python -m pip install pip_system_certs -U --trusted-host pypi.org --trusted-host pypi.python.org --trusted-host files.pythonhosted.org
```
- Next install virtualenv and create a new virtual environment called `dbt`:
```
python -m pip install --user virtualenv
python -m venv dbt
```
- Now you can activate your virtual environment, verify that the location of python has changed, and install dbt:
```
source dbt/bin/activate
which python
python -m pip install -U dbt-core dbt-snowflake
dbt --version
```
- The `dbt init` command can step you through creating a `~/.dbt/profiles.yml` file and a starter project structure.
- Be aware that running this project will create a table and a view in the schema you specify.
- Also be aware that the sample project is designed to have a test fail when you first run `dbt build`. If you open the first model under /models/ you will see that it has a where clause you can uncomment to make the test pass (a hypothetical sketch of this pattern follows the commands below).
```
dbt init
dbt compile
dbt build
```
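The model, source, and column names below are placeholders rather than the project's actual first model; the sketch only illustrates the pattern of a model whose failing test is fixed by uncommenting a filter.

```sql
-- Hypothetical example of the "uncomment the where clause" pattern,
-- not the actual model shipped with this project.
select
    o_orderkey,
    o_orderstatus
from {{ source('tpch', 'orders') }}
-- Uncommenting the filter below removes the rows that make the test fail:
-- where o_orderstatus is not null
```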
- Most dbt users edit their dbt scripts with Microsoft's free editor, VSCode
- Download VSCode
- From the Extensions screen (icon looks like Tetris) you should install two extensions
- "Snowflake"
- "dbt Power User" (which will also add "vscode-dbt")
- In the Explorer, right click in the background and "Add folder to workspace" to add where your dbt project will be located.
- On Windows, you will want to change the default terminal to "Command Prompt". Under File -> Preferences -> Settings, search for "windows terminal" and scroll down to where it says the default is "null" and change that to "Command Prompt".
- You will want to set the default interpreter to your new "dbt" environment using these instructions from Microsoft.
- Create a target schema in Snowflake that you want to deploy your dbt demo into
- Add the Cybersyn Financial & Economic Essentials data set and Snowflake Sample Data to your account if necessary
- Update your ~/.dbt/profiles.yml file with your credentials and target DB/schema. There is a sample profiles.yml file you can copy to your ~/.dbt/ folder and update if you don't already have one (a rough sketch of the format appears after these steps).
- From the root folder, run `dbt deps` to download packages from the dbt hub.
- Run `dbt build --full-refresh` and troubleshoot any errors such as missing objects or permission issues.
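The sample profiles.yml shipped with the project is the authoritative reference. Purely as a rough sketch, a dbt-snowflake profile generally looks something like the following, where every value (profile name, account, user, role, warehouse, database, schema) is a placeholder to replace with your own; the profile name at the top must match the `profile:` entry in dbt_project.yml.

```yaml
# Hypothetical sketch - copy and adapt the sample profiles.yml provided with the project
my_dbt_demo:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: xy12345.us-east-1        # your Snowflake account identifier
      user: MY_USER
      password: "{{ env_var('SNOWFLAKE_PASSWORD') }}"
      role: MY_ROLE
      warehouse: MY_WAREHOUSE
      database: MY_TARGET_DATABASE
      schema: MY_TARGET_SCHEMA
      threads: 4
```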
- `dbt deps` - download 3rd party packages (necessary for this project before build)
- `dbt build` - both compile and then run all models & associated tests
- `dbt build --full-refresh` - have incremental models run as a full reload
- `dbt build --select modelname` - will only compile/run modelname
- `dbt build --select +modelname` - will compile/run modelname and all parents
- `dbt build --select modelname+` - will compile/run modelname and all children
- `dbt build --select +modelname+` - will compile/run modelname, and all parents and children
- `dbt build --select @modelname` - will compile/run modelname, all parents, all children, AND all parents of all children
- `dbt build --exclude modelname` - will compile/run all models except modelname
- `dbt compile` - compile all models but do not execute them
- `dbt run` - run all models (but not their tests)
- `dbt seed` - create or refresh small tables from .csv seed files
- `dbt clean` - clear your logs and compiled scripts (can fix issues)
- `dbt docs generate` - refresh the documentation for your project
- `dbt docs serve` - open this documentation in your browser

Additional commands and details are available in dbt's documentation.
- How to nest models:
  - DIM_ORDERS
  - DIM_CURRENT_YEAR_ORDERS
  - DIM_CURRENT_YEAR_OPEN_ORDERS
- Snowflake commands in a pre-hook (see the sketch below):
  - DIM_CALENDAR_DAY
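As an illustration of the technique only (not the project's actual DIM_CALENDAR_DAY code), a dbt model can ask Snowflake to run a command before the model builds via the pre_hook config:

```sql
-- Hypothetical sketch of a pre-hook; the real DIM_CALENDAR_DAY model may differ.
-- The pre_hook string can be any valid Snowflake command.
{{ config(
    materialized = 'table',
    pre_hook = "alter session set week_start = 1"
) }}

select current_date() as calendar_day
```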
- Materializations (a sketch of an incremental model follows this list):
  - LKP_EXCHANGE_RATES (table)
  - LKP_CUSTOMERS_WITH_ORDERS (ephemeral)
  - DIM_CUSTOMERS_SHARE (secure view)
  - FACT_ORDER_LINE (incremental fact)
  - DIM_CUSTOMERS, DIM__CUSTOMERS (incremental dim)
  - DIM_CUSTOMERS_TYPE2 (snapshot)
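For readers new to materializations, this is a minimal, hypothetical incremental model; the source, column, and key names are placeholders, not the project's FACT_ORDER_LINE.

```sql
-- Hypothetical sketch of an incremental materialization, not the project's actual model.
{{ config(
    materialized = 'incremental',
    unique_key = 'order_line_id'
) }}

select
    order_line_id,
    order_id,
    quantity,
    updated_at
from {{ source('tpch', 'order_lines') }}

{% if is_incremental() %}
  -- on incremental runs, only pick up rows newer than what is already loaded
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```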
- Source data quality tests:
  - sources.yml
- Model data quality tests (an example of the format follows this item):
  - schema.yml
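As a generic illustration of how such tests are declared, here is a hedged sketch; the model and column names are placeholders, not necessarily those used in this project's schema.yml.

```yaml
# Hypothetical sketch of model tests in a schema.yml file
version: 2

models:
  - name: dim_orders            # placeholder model name
    columns:
      - name: order_key
        tests:
          - unique
          - not_null
      - name: order_status
        tests:
          - accepted_values:
              values: ['F', 'O', 'P']
```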
- Features available in dbt_project.yml:
  - run-start/run-end hooks
  - logging before and after models
  - default materializations by folder path
  - Snowflake features - copy_grants, secure views, warehouse
  - schemas for models
- Macro examples (a generic sketch of dbt macro syntax follows this list):
  - snowflake_surrogate_key
  - copy_log_to_snowflake
  - create_masking_policies
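The macros above live in this project's macros/ folder. Purely to show the general shape of a dbt macro, here is a hypothetical surrogate-key helper; it is not the project's snowflake_surrogate_key implementation.

```sql
-- Hypothetical sketch of a dbt macro, not the project's actual snowflake_surrogate_key.
-- A model could call it as {{ example_surrogate_key(['o_custkey', 'o_orderdate']) }}.
{% macro example_surrogate_key(column_names) %}
    md5(
        {%- for col in column_names %}
        coalesce(cast({{ col }} as varchar), '')
        {%- if not loop.last %} || '|' || {% endif -%}
        {%- endfor %}
    )
{% endmacro %}
```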
- Jinja expressions:
  - Q1_FACT_PRICING_SUMMARY_REPORT_QUERY
  - Q2_MINIMUM_COST_SUPPLIER_QUERY
  - Q3_SHIPPING_PRIORITY_QUERY
  - Q4_ORDER_PRIORITY_CHECKING_QUERY
- Learn more about dbt in the docs
- Free on-demand training
- Additional Packages
- Create PK, UK, and FK in Snowflake using dbt Constraints
- Snowflake Guide - Accelerating Data Teams with dbt Core & Snowflake
- Snowflake Guide - Accelerating Data Teams with dbt Cloud & Snowflake
- Snowflake Guide - Data Engineering with Apache Airflow, Snowflake & dbt