Most analysis is disposable — run a script, read the output, close the notebook. That works until you need results to persist, accumulate over time, or be accessible from somewhere other than your laptop. Supabase gives you a real PostgreSQL database on a free tier, accessible via a Python client. This project walks through setting one up and using it as a lightweight data store.

The free tier is generous enough for this project and for most personal data work. Limits apply and can change, so check supabase.com for current details before building anything production-critical on it.


Step 1 — Create a Supabase project. Go to supabase.com, create a free account, and start a new project. Once it's provisioned, navigate to Project Settings → API. You need two values: the Project URL and the anon public key. Store both in Colab Secrets before writing any code.

# Colab: open Secrets panel (🔑 left sidebar)
# Add: SUPABASE_URL  → your project URL (https://xxxx.supabase.co)
# Add: SUPABASE_KEY  → your anon public key
# Enable notebook access for both
from google.colab import userdata

SUPABASE_URL = userdata.get('SUPABASE_URL')
SUPABASE_KEY = userdata.get('SUPABASE_KEY')

Step 2 — Create your table. In the Supabase dashboard, go to the SQL Editor and run this to create the table you'll write to:

-- Run this in the Supabase SQL Editor (not in Colab)
CREATE TABLE events (
  id         BIGSERIAL PRIMARY KEY,
  source     TEXT NOT NULL,
  event_type TEXT NOT NULL,
  value      FLOAT,
  recorded_at TIMESTAMPTZ DEFAULT NOW()
);
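One caveat before wiring up the client: Supabase enforces Row Level Security per table, and whether a freshly created table has it enabled depends on how it was created. If inserts from the anon key get rejected, RLS is likely enabled with no policies. For a throwaway prototype you can add permissive policies; note the policy names below are arbitrary, and this intentionally leaves the table open to anyone holding the anon key, so don't do it for real data:

```sql
-- Run in the SQL Editor only if anon access is being blocked
ALTER TABLE events ENABLE ROW LEVEL SECURITY;
CREATE POLICY "anon insert" ON events FOR INSERT TO anon WITH CHECK (true);
CREATE POLICY "anon select" ON events FOR SELECT TO anon USING (true);
```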

Step 3 — Connect and write data from Colab. Install the Supabase Python client and insert some rows:

!pip install supabase
from supabase import create_client

db = create_client(SUPABASE_URL, SUPABASE_KEY)

rows = [
    {"source": "web",    "event_type": "signup",   "value": None},
    {"source": "mobile", "event_type": "purchase",  "value": 49.99},
    {"source": "web",    "event_type": "purchase",  "value": 19.99},
    {"source": "mobile", "event_type": "signup",    "value": None},
    {"source": "web",    "event_type": "purchase",  "value": 99.00},
]

result = db.table("events").insert(rows).execute()
print(f"Inserted {len(result.data)} rows")
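Each `insert(...)` call is one HTTP request, so for anything beyond a few hundred rows it is worth batching. A minimal sketch, where the `chunked` helper is ours rather than part of the client:

```python
def chunked(rows, size=500):
    """Yield successive slices of rows, each no longer than size."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]

# Hypothetical usage with the db client above:
#   for batch in chunked(rows, size=500):
#       db.table("events").insert(batch).execute()
```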

Step 4 — Query it back with filters. The Supabase Python client supports chainable filter methods that map directly to SQL WHERE clauses:

# All purchase events from web
purchases = (db.table("events")
               .select("*")
               .eq("event_type", "purchase")
               .eq("source", "web")
               .execute())

import pandas as pd
df = pd.DataFrame(purchases.data)
print(df[["source","event_type","value","recorded_at"]])

# Revenue from the filtered rows (web purchases only)
print(f"\nWeb purchase revenue: ${df['value'].sum():.2f}")
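Once the rows are in a DataFrame, the usual pandas tooling applies. For example, revenue per source; this sketch uses hand-written rows shaped like the inserts above, so it runs without a live Supabase project:

```python
import pandas as pd

# Sample rows in the same shape the events table returns
rows = [
    {"source": "web",    "event_type": "purchase", "value": 19.99},
    {"source": "web",    "event_type": "purchase", "value": 99.00},
    {"source": "mobile", "event_type": "purchase", "value": 49.99},
]
df = pd.DataFrame(rows)
revenue = df.groupby("source")["value"].sum().round(2)
print(revenue.to_dict())
```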

The data persists in Supabase between sessions — unlike a local DataFrame that vanishes when the notebook closes. This is the core difference between analytical scripts and data infrastructure. You've just built a lightweight pipeline that can accumulate data over time.

The interesting question isn't how to write data to a database — it's what you decide to store, at what granularity, and why. That's where data engineering starts to get hard.

