Quick Start

Installation

# For Pandas backend (default)
pip install frameright

# For Polars backend (optional; quotes keep shells like zsh from expanding the brackets)
pip install "frameright[polars]"

# For Narwhals backend (optional)
pip install "frameright[narwhals]"

FrameRight supports multiple backends. For type safety, import the backend-specific classes from the matching module (frameright.pandas, frameright.polars.eager, etc.).

Defining a Schema

Define your DataFrame schema as a Python class using Col[T] type hints:

Note

For the best editor experience, import backend-specific typing shims:

Pandas: from frameright.typing.pandas import Col
Polars eager: from frameright.typing.polars_eager import Col
Polars lazy: from frameright.typing.polars_lazy import Col

The generic import, from frameright.typing import Col, also works and preserves the inner type parameter T for schema annotations.

Important typing note: Pandas has mature type stubs, so type checkers can often treat attribute accessors like obj.amount as pd.Series[float]. Polars and Narwhals do not currently expose fully generic Series[T] / Expr[T] types upstream, so for those backends type checkers typically see plain pl.Series, pl.Expr, or nw.Series, and inner-type checking is best-effort today.

from typing import Optional

import pandas as pd

from frameright import Schema, Field
from frameright.typing import Col

class Customer(Schema):
    customer_id: Col[int] = Field(unique=True, nullable=False)
    """Unique customer identifier."""
    name: Col[str] = Field(min_length=1)
    """Customer's full name."""
    age: Col[int] = Field(ge=18, le=120)
    """Customer's age in years."""
    email: Col[str] = Field(regex=r'^[\w\.\-]+@[\w\.\-]+\.\w+$')
    """Contact email address."""
    lifetime_value: Optional[Col[float]]
    """Total spend (optional)."""

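To make the constraints concrete, the sketch below spells out roughly what each Field argument in the Customer schema asserts, using plain pandas on a small made-up DataFrame. It illustrates the meaning of the rules only; it is not how FrameRight or Pandera implement them, and it assumes the regex is anchored against the whole value.

```python
import pandas as pd

# Hypothetical sample data; the last row breaks several rules on purpose.
df = pd.DataFrame({
    "customer_id": [1, 2, 2],
    "name": ["Ada", "Grace", ""],
    "age": [36, 45, 17],
    "email": ["ada@example.com", "grace@example.com", "not-an-email"],
})

EMAIL_PATTERN = r"^[\w\.\-]+@[\w\.\-]+\.\w+$"

# One boolean column per rule: True where the row satisfies the constraint.
checks = pd.DataFrame({
    "customer_id unique": ~df["customer_id"].duplicated(keep=False),
    "name min_length=1": df["name"].str.len() >= 1,
    "age ge=18, le=120": df["age"].between(18, 120),
    "email regex": df["email"].str.match(EMAIL_PATTERN),
})

# Number of offending rows per rule.
failures = (~checks).sum()
print(failures)
```

A real validation run would report such failures as errors rather than counts.
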
Loading Data

With Pandas (using base Schema):

# From a pandas DataFrame
df = pd.DataFrame({...})
customers = Customer(df)

# Load from CSV and wrap
df = pd.read_csv("customers.csv")
customers = Customer(df)
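
For readers who want something concrete to experiment with, here is a small DataFrame whose columns line up with the Customer schema. The rows are invented for illustration, and the wrapping call is left as a comment because it needs FrameRight installed and the Customer class in scope.

```python
import pandas as pd

# Hypothetical sample rows; column names follow the Customer schema.
df = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Ada Lovelace", "Grace Hopper", "Alan Turing"],
    "age": [36, 45, 41],
    "email": ["ada@example.com", "grace@example.com", "alan@example.com"],
    "lifetime_value": [1200.0, 87.5, 310.5],
})

# Inspect the dtypes pandas inferred before handing the frame to a schema.
print(df.dtypes)

# With FrameRight installed and Customer defined as in the schema section:
# customers = Customer(df)
```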

With Polars (eager - recommended for interactive use):

import polars as pl
from frameright.polars.eager import Schema, Col, Field

# Define schema using polars eager module
class Customer(Schema):  # Uses Polars eager backend
    customer_id: Col[int] = Field(unique=True)
    name: Col[str]
    ...

# From a Polars DataFrame
df = pl.DataFrame({...})
customers = Customer(df)  # Uses Polars eager backend

All backends work the same way; each backend module exposes the same Schema, Col, and Field names:

# Pandas
from frameright.pandas import Schema, Col, Field

# Polars eager
from frameright.polars.eager import Schema, Col, Field

# Polars lazy
from frameright.polars.lazy import Schema, Col, Field

# Narwhals (backend-agnostic)
from frameright.narwhals.eager import Schema, Col, Field

# Explicitly specify the Polars backend when wrapping a DataFrame
df = pl.DataFrame({...})
customers = Customer(df, backend="polars")

Type-Safe Access

# IDE autocomplete works on all columns
print(customers.name)
print(customers.age.mean())

# Filter using the backend's native API (pandas boolean indexing shown here), then re-wrap
young_df = customers.fr_data[customers.age < 30]
young = Customer(young_df, validate=False)

Validation (Powered by Pandera)

FrameRight uses Pandera for runtime validation, giving you production-tested constraint checking with helpful error messages.

Validation runs automatically on construction:

customers = Customer(df)  # Validates schema and constraints

You can also run validation manually:

customers.fr_validate()

To skip validation (e.g. after filtering data that has already been validated):

customers = Customer(df, validate=False)
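
Skipping validation after a filter is reasonable because row filtering only removes rows: per-value rules (ranges, patterns) and uniqueness that held on the full frame still hold on any subset. The plain-pandas sketch below illustrates this on made-up data, with a hypothetical satisfies_rules helper standing in for schema validation. It does not cover steps that add or modify values, which can break constraints and deserve a fresh check.

```python
import pandas as pd

# Hypothetical frame that has already passed validation.
validated = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "age": [22, 35, 29, 61],
})


def satisfies_rules(frame: pd.DataFrame) -> bool:
    """Hypothetical stand-in for validation: unique ids, age within [18, 120]."""
    ids_unique = frame["customer_id"].is_unique
    ages_ok = frame["age"].between(18, 120).all()
    return bool(ids_unique and ages_ok)


# Row filtering keeps a subset of already-valid rows.
young = validated[validated["age"] < 30]

print(satisfies_rules(validated), satisfies_rules(young))
```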

Tip

In production pipelines, a common pattern is to validate at I/O boundaries (CSV reads, API inputs) and at team handoffs (function outputs), while skipping validation on intermediate steps for speed. You can always call obj.fr_validate() right before returning a Schema from a public API.
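
The boundary pattern from the tip can be sketched without FrameRight at all. In the example below, check_customers is a hypothetical validator standing in for schema construction or fr_validate(), and the function names and CSV content are invented for illustration.

```python
import io

import pandas as pd


def check_customers(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical validator standing in for schema validation."""
    if not frame["customer_id"].is_unique:
        raise ValueError("customer_id must be unique")
    if not frame["age"].between(18, 120).all():
        raise ValueError("age must be between 18 and 120")
    return frame


def load_customers(source) -> pd.DataFrame:
    # I/O boundary: validate as soon as data enters the pipeline.
    return check_customers(pd.read_csv(source))


def _high_value_only(frame: pd.DataFrame) -> pd.DataFrame:
    # Internal step: the input is already validated, so no re-check here.
    return frame[frame["lifetime_value"] >= 1000]


def high_value_customers(source) -> pd.DataFrame:
    # Public function: validate once more right before handing off the result.
    return check_customers(_high_value_only(load_customers(source)))


csv_text = "customer_id,age,lifetime_value\n1,36,1200.5\n2,45,310.0\n3,29,2500.0\n"
result = high_value_customers(io.StringIO(csv_text))
print(result)
```

Only the two public entry points pay the validation cost; the private helper trusts its caller.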

Benefits of Pandera integration:

Industry-standard validation library with extensive testing
Clear, actionable error messages with row/column context
Works with both Pandas and Polars backends
Extensible: use Pandera directly for custom checks on obj.fr_data

Type Coercion

When loading messy data (e.g. a CSV where every column was read as a string), pass coerce=True to have columns converted to their declared types:

messy_df = pd.read_csv("data.csv")
customers = Customer(messy_df, coerce=True)
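
As a rough picture of what coercion involves, the sketch below reads a CSV with every column forced to text and then casts each column to the dtype its Col[T] annotation would imply. This is plain pandas under stated assumptions: the CSV content and the target_dtypes mapping are invented, and FrameRight's actual coercion rules may differ.

```python
import io

import pandas as pd

csv_text = "customer_id,age,lifetime_value\n1,36,1200.50\n2,45,310\n"

# dtype=str keeps every column as text, mimicking an untyped export.
messy_df = pd.read_csv(io.StringIO(csv_text), dtype=str)

# Assumed target dtypes corresponding to Col[int], Col[int], Col[float].
target_dtypes = {
    "customer_id": "int64",
    "age": "int64",
    "lifetime_value": "float64",
}

# Cast the text columns to their declared types.
coerced = messy_df.astype(target_dtypes)

print(coerced.dtypes)
```

A value that cannot be parsed (for example "n/a" in the age column) makes astype raise, which is usually the behavior you want at an input boundary.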

Schema Introspection

for col in Customer.fr_schema_info():
    print(col["attribute"], col["type"], col["required"])
# customer_id  int  True
# ...
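
To show where such information can come from, here is a standard-library sketch that derives the same three fields from class annotations. The Col class and the schema_info function below are hypothetical stand-ins, not FrameRight's implementation, and the sketch assumes Optional[...] marks a column as not required.

```python
from typing import Generic, Optional, TypeVar, Union, get_args, get_origin, get_type_hints

T = TypeVar("T")


class Col(Generic[T]):
    """Hypothetical stand-in for a typed column marker."""


class Customer:
    customer_id: Col[int]
    name: Col[str]
    lifetime_value: Optional[Col[float]]


def schema_info(cls) -> list:
    """Return one dict per annotated column: attribute name, inner type, required flag."""
    rows = []
    for attr, hint in get_type_hints(cls).items():
        required = True
        if get_origin(hint) is Union:
            # Optional[X] is Union[X, None]; strip None and mark the column optional.
            non_none = [arg for arg in get_args(hint) if arg is not type(None)]
            required = len(non_none) == len(get_args(hint))
            hint = non_none[0]
        inner = get_args(hint)[0]  # the T in Col[T]
        rows.append({"attribute": attr, "type": inner.__name__, "required": required})
    return rows


info = schema_info(Customer)
for col in info:
    print(col["attribute"], col["type"], col["required"])
```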