How to connect BigQuery and Python

Introduction

This article will guide you through connecting Python to the Google BigQuery API using the ZappySys ODBC API Driver. By leveraging the ZappySys ODBC BigQuery Connector, you can seamlessly access, manage, and analyze Google BigQuery data for enhanced reporting and analytics within Python.


Prerequisites

  • ODBC PowerPack: Download and install the ZappySys ODBC PowerPack from the Customer Download Area or the trial version.
  • Google Account: Required for accessing BigQuery.
  • pyodbc Module: A Python library for connecting to ODBC databases.

Steps

Step 1: Install pyodbc in Python

To connect Python to BigQuery, you need to install the pyodbc module:

  1. Ensure Python is installed: If not installed, download it from the official Python website.

  2. Install pyodbc: Open your terminal or command prompt and run the following command:

    python -m pip install pyodbc
    

    This installs the pyodbc library, essential for ODBC database connections in Python.

Why Install pyodbc?

  • Seamlessly connects Python to ODBC-supported databases like SQL Server, PostgreSQL, and more.
  • Facilitates querying, retrieving, and managing database records directly from Python.
  • Supports cross-platform compatibility (Windows, macOS, and Linux).

If pyodbc is not installed, running a script that depends on it will return the following error:

ModuleNotFoundError: No module named 'pyodbc'

Step 2: Create a New ODBC Driver

  1. Open ODBC Data Source Administrator by searching for “ODBC” on your system.
    Open ODBC Data Source

  2. Choose User DSN or System DSN. For a system-wide configuration, select System DSN and click Add.
    ZappySys ODBC Driver - Open UI

  3. Select ZappySys API Driver from the list, then choose BigQuery as the connector. Click Continue.


Step 3: Configure the BigQuery Connector

  1. Fill in the required fields:

    • Client ID
    • Client Secret
    • Project ID
    • Dataset ID
    • Scopes (separated by spaces)

    After entering the information, generate a token and test the connection.

  2. Preview the data:

    • Navigate to the Preview Tab, select a table, and view the results.
    • Once satisfied, click OK to save the configuration.

Step 4: Read Data in Python Using ODBC DSN

  1. Create a Python script (e.g., BigQueryExample.py) with the following code:

    import pyodbc
    
    # Connect to BigQuery using DSN
    conn = pyodbc.connect('DSN=API Driver - BigQuery')
    cursor = conn.cursor()
    
    # Execute query to fetch data
    cursor.execute("SELECT * FROM users")
    
    print("------------BigQuery & Python----------------")
    for row in cursor:
        print(row)
    
  2. Run the script to fetch data from BigQuery. The output will display the data fetched:


Step 5: Use a Full Connection String

  1. Open the ODBC Data Source Administrator, configure your DSN, and click Copy Settings to get the connection string.

  2. Replace the DSN in your Python script with the whole connection string:

    conn = pyodbc.connect('DRIVER={ZappySys API Driver};ServiceUrl=https://yourservice.provider.com/api/xxxx....;AuthName=Http;')
    

Video Tutorial

Conclusion

Using the ZappySys ODBC API Driver, you can integrate Python with Google BigQuery to automate tasks like retrieving, processing, and analyzing data. This workflow simplifies data handling and enhances productivity. For additional support, contact the ZappySys support team.


References