In this tutorial, we'll learn how to write a pandas DataFrame to BigQuery and how to query BigQuery from within pandas.
- If you haven't already, you may sign up for the free GCP trial credit.
- Set up your project on GCP and enable the BigQuery API.
- (Optional, but recommended) Set up a Python virtual environment:
- Install pyenv-virtualenv (Instructions for macOS).
- Create and activate an environment:
pyenv virtualenv gcp_env
pyenv activate gcp_env
- Install pandas and the pandas wrapper for BigQuery:
pip install -r requirements.txt
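With the environment in place, the round trip this tutorial covers can be sketched roughly as follows. This is a minimal sketch, not the repository's own script: it assumes the "pandas wrapper for BigQuery" installed above is the `pandas-gbq` package, and the project ID, the dataset name `pandas_demo`, the table name `scores`, and the helper function names are all hypothetical placeholders.

```python
"""Sketch: write a DataFrame to BigQuery and read it back with pandas-gbq."""


def table_id(dataset: str, table: str) -> str:
    """Build the 'dataset.table' string used as the destination table."""
    return f"{dataset}.{table}"


def build_query(project_id: str, dataset: str, table: str) -> str:
    """Build a standard-SQL query against the fully qualified table."""
    return (
        f"SELECT name, score FROM `{project_id}.{dataset}.{table}` "
        "ORDER BY score DESC"
    )


def round_trip(project_id: str, dataset: str = "pandas_demo", table: str = "scores"):
    """Upload a tiny DataFrame, then query it back (hypothetical names throughout)."""
    # Imported here so the helpers above work without the GCP libraries installed.
    import pandas as pd
    import pandas_gbq

    df = pd.DataFrame({"name": ["alice", "bob"], "score": [1.5, 2.5]})

    # if_exists="replace" overwrites the table on every run of this sketch.
    pandas_gbq.to_gbq(
        df, table_id(dataset, table), project_id=project_id, if_exists="replace"
    )

    # Returns the query result as a new DataFrame.
    return pandas_gbq.read_gbq(
        build_query(project_id, dataset, table),
        project_id=project_id,
        dialect="standard",
    )
```

Calling `round_trip("your-project-id")` requires valid GCP credentials; on first use, pandas-gbq may prompt you to authenticate.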
- Delete the BigQuery dataset to avoid incurring charges to your account.
- If you created a Python virtual environment, deactivate it:
pyenv deactivate
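If you prefer to delete the dataset from Python rather than from the console, one possible approach is sketched below. It assumes the `google-cloud-bigquery` client library is available in your environment (it is a dependency of pandas-gbq); the helper name `delete_dataset` and the dataset ID in the usage comment are hypothetical.

```python
def delete_dataset(client, dataset_id: str) -> None:
    """Delete a dataset and every table in it.

    `client` is expected to be a google.cloud.bigquery.Client;
    `dataset_id` is a 'project.dataset' string.
    """
    client.delete_dataset(
        dataset_id,
        delete_contents=True,  # also remove the tables inside the dataset
        not_found_ok=True,     # do not raise if the dataset is already gone
    )


# Usage (needs GCP credentials; names are placeholders):
#   from google.cloud import bigquery
#   delete_dataset(bigquery.Client(project="your-project-id"),
#                  "your-project-id.pandas_demo")
```

Passing the client in as an argument keeps the helper easy to exercise without touching a real project.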
The dataflow and datalab directories within this repository have tutorials on how to write Dataflow results to BigQuery and how to write BigQuery queries within Datalab.