This repository was archived by the owner on Dec 4, 2024. It is now read-only.

dbt-labs/coalesce-2022-python-snowflake

 
 


Archival Notice

This repository has been archived.

As a result, all of its historical issues and PRs have been closed.

Please do not clone this repo without understanding the risk in doing so:

  • It may have unaddressed security vulnerabilities
  • It may have unaddressed bugs

dbt Python models on Snowpark demo for Coalesce 2022

This repository contains a demo of dbt Python models on Snowflake via Snowpark for the Coalesce 2022 conference. It will not be actively maintained. See the repository it was forked from for a current version -- we will work to merge the Python models into the main branch there after Coalesce.

Cool gifs

What a cool DAG! Python and SQL side-by-side in dbt!

[Image: DAG]

Python models in dbt Cloud!

[Image: py_gif]

Get started

Follow these instructions to run the demo yourself.

Environment

If you're running in dbt Cloud, ensure that every environment that needs to run Python models is on dbt v1.3 or later.

To run locally, you need to update dbt-core and dbt-snowflake to 1.3 or later. We recommend creating a fresh venv and pip installing the packages. The exact steps may vary by your platform, but as an example with an environment named dbt_py:

$ python3 -m venv dbt_py
$ source dbt_py/bin/activate
$ (dbt_py) pip install --upgrade dbt-core dbt-snowflake
$ (dbt_py) which python3

You can run dbt --version to confirm that v1.3 or later is installed.
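If you would rather check the version from Python, the following is a minimal sketch. It assumes the package was installed under the distribution name dbt-core (as in the pip command above) and uses only the standard library; the helper names parse_version, supports_python_models, and installed_dbt_core_version are made up for this illustration.

```python
import re
from importlib.metadata import PackageNotFoundError, version

MIN_VERSION = (1, 3)  # Python models need dbt v1.3 or later


def parse_version(text):
    """Extract (major, minor) from a version string such as '1.3.0'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def supports_python_models(version_text):
    """True when the given dbt version is at least MIN_VERSION."""
    return parse_version(version_text) >= MIN_VERSION


def installed_dbt_core_version():
    """Return the installed dbt-core version string, or None if absent."""
    try:
        return version("dbt-core")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_dbt_core_version()
    if found is None:
        print("dbt-core is not installed in this environment")
    else:
        print(found, "OK" if supports_python_models(found) else "too old")
```

Run it with the venv activated so that it inspects the same environment that dbt uses.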

To deactivate:

$ (dbt_py) deactivate
$ which python3
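The which python3 checks above show which interpreter is on your path. As a rough equivalent from inside Python, this sketch reports whether the running interpreter belongs to a venv; the helper name in_virtualenv is invented for this example.

```python
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a venv:", in_virtualenv())
```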

You may want to create an additional venv for running the Snowpark Python package locally. Setup instructions are in Untitled.ipynb, ahead of the local prototypes for the Python models.

Source data

If you're a dbt Labs employee, you can skip this step -- the source data is already loaded into the Snowflake sandbox account.

Otherwise, run the snowflake.sql script through the Snowflake UI or locally via snowsql. This will create the ecommerce database and source tables from parquet files in S3. Modify the script as needed.

dbt deps

Run dbt deps.

Run or build

Run individual models or build the entire project!

$ (dbt_py) dbt run -s describe_py
$ (dbt_py) dbt build
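For orientation, here is a minimal sketch of what a dbt Python model on Snowpark can look like. It is not the repository's actual describe_py model: the upstream model name "orders" is hypothetical, and the body simply assumes you want summary statistics for that table.

```python
def model(dbt, session):
    # dbt calls this function for a Python model; `session` is the
    # Snowpark session, which this sketch does not need directly.
    dbt.config(materialized="table")

    # "orders" is a hypothetical upstream model name, used for illustration.
    orders = dbt.ref("orders")

    # Snowpark DataFrames provide describe() for summary statistics;
    # dbt materializes the returned DataFrame as the model's table.
    return orders.describe()
```

A file like this would live alongside the SQL models with a .py extension, and you would select it with dbt run -s just like any other model.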

And generate the docs!

$ (dbt_py) dbt docs generate && dbt docs serve

Challenges

See the challenges directory's README.md.

Contributing

We'd welcome contributions to this demo project. However, we will likely archive this repository sometime after Coalesce 2022. Consider contributing to the repository this one is forked from instead!
