Quick Start
===========

Installation
------------

.. code-block:: bash

    # For Pandas backend (default)
    pip install frameright

    # For Polars backend (optional)
    pip install frameright[polars]

    # For Narwhals backend (optional)
    pip install frameright[narwhals]

FrameRight supports multiple backends. Use backend-specific classes for type safety, by importing from the appropriate backend module (``frameright.pandas``, ``frameright.polars.eager``, etc.).

Defining a Schema
-----------------

Define your DataFrame schema as a Python class using ``Col[T]`` type hints:

.. note::

    For the best editor experience, import backend-specific typing shims:

    * Pandas: ``from frameright.typing.pandas import Col``
    * Polars eager: ``from frameright.typing.polars_eager import Col``
    * Polars lazy: ``from frameright.typing.polars_lazy import Col``

    The generic ``from frameright.typing import Col`` also works and preserves the inner type parameter ``T`` for schema annotations.

    **Important typing note:** Pandas has mature type stubs, so type checkers can often treat attribute accessors like ``obj.amount`` as ``pd.Series[float]``. Polars and Narwhals do not currently expose fully generic ``Series[T]`` / ``Expr[T]`` types upstream, so type checkers typically see ``pl.Series`` / ``pl.Expr`` / ``nw.Series`` (inner type is best-effort today).

.. code-block:: python

    from frameright import Schema, Field
    from frameright.typing import Col
    from typing import Optional
    import pandas as pd

    class Customer(Schema):
        customer_id: Col[int] = Field(unique=True, nullable=False)
        """Unique customer identifier."""

        name: Col[str] = Field(min_length=1)
        """Customer's full name."""

        age: Col[int] = Field(ge=18, le=120)
        """Customer's age in years."""

        email: Col[str] = Field(regex=r'^[\w\.\-]+@[\w\.\-]+\.\w+$')
        """Contact email address."""

        lifetime_value: Optional[Col[float]]
        """Total spend (optional)."""

Loading Data
------------

**With Pandas (using base Schema):**

.. code-block:: python

    # From a pandas DataFrame
    df = pd.DataFrame({...})
    customers = Customer(df)

    # Load from CSV and wrap
    df = pd.read_csv("customers.csv")
    customers = Customer(df)

**With Polars (eager - recommended for interactive use):**

.. code-block:: python

    import polars as pl
    from frameright.polars.eager import Schema, Col, Field

    # Define schema using polars eager module
    class Customer(Schema):  # Uses Polars eager backend
        customer_id: Col[int] = Field(unique=True)
        name: Col[str]
        ...

    # From a Polars DataFrame
    df = pl.DataFrame({...})
    customers = Customer(df)  # Uses Polars eager backend

**All backends work the same way:**

.. code-block:: python

    # Pandas
    from frameright.pandas import Schema, Col, Field

    # Polars eager
    from frameright.polars.eager import Schema, Col, Field

    # Polars lazy
    from frameright.polars.lazy import Schema, Col, Field

    # Narwhals (backend-agnostic)
    from frameright.narwhals.eager import Schema, Col, Field

    # Explicitly specify polars
    df = pl.DataFrame({...})
    customers = Customer(df, backend="polars")

Type-Safe Access
----------------

.. code-block:: python

    # IDE autocomplete works on all columns
    print(customers.name)
    print(customers.age.mean())

    # Filter using the backend's native API, then re-wrap
    young_df = customers.fr_data[customers.age < 30]
    young = Customer(young_df, validate=False)

Validation (Powered by Pandera)
--------------------------------

FrameRight uses **Pandera** for runtime validation, giving you production-tested constraint checking with helpful error messages.

Validation runs automatically on construction:

.. code-block:: python

    customers = Customer(df)  # Validates schema and constraints

You can also run validation manually:

.. code-block:: python

    customers.fr_validate()

To skip validation (e.g. after filtering):

.. code-block:: python

    customers = Customer(df, validate=False)

.. tip::

    In production pipelines, a common pattern is to validate at I/O boundaries (CSV reads, API inputs) and at team handoffs (function outputs), while skipping validation on intermediate steps for speed. You can always call ``obj.fr_validate()`` right before returning a Schema from a public API.

**Benefits of Pandera integration:**

* Industry-standard validation library with extensive testing
* Clear, actionable error messages with row/column context
* Works with both Pandas and Polars backends
* Extensible: use Pandera directly for custom checks on ``obj.fr_data``

Type Coercion
-------------

When loading messy data (e.g. CSV where everything is a string):

.. code-block:: python

    messy_df = pd.read_csv("data.csv")
    customers = Customer(messy_df, coerce=True)

Schema Introspection
--------------------

.. code-block:: python

    for col in Customer.fr_schema_info():
        print(col["attribute"], col["type"], col["required"])
    # customer_id int True
    # ...
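
To give a rough sense of how a class with ``Col[T]``-style annotations can be introspected at all, here is a minimal sketch using only the standard library. It is not FrameRight's implementation: ``ToyCol``, ``ToyCustomer``, and ``toy_schema_info`` are hypothetical names invented for this illustration, and the sketch assumes that an ``Optional[...]`` annotation is what marks a column as not required.

.. code-block:: python

    from typing import Generic, Optional, TypeVar, Union
    from typing import get_args, get_origin, get_type_hints

    T = TypeVar("T")


    class ToyCol(Generic[T]):
        """Hypothetical stand-in for a typed column annotation."""


    class ToyCustomer:
        """Hypothetical schema class; only the annotations matter here."""

        customer_id: ToyCol[int]
        name: ToyCol[str]
        lifetime_value: Optional[ToyCol[float]]


    def toy_schema_info(cls):
        """Return one dict per annotated attribute of ``cls``."""
        rows = []
        for attribute, hint in get_type_hints(cls).items():
            required = True
            # Optional[X] is Union[X, None]: unwrap it and mark the column optional.
            if get_origin(hint) is Union and type(None) in get_args(hint):
                required = False
                hint = next(a for a in get_args(hint) if a is not type(None))
            # ToyCol[int] carries its inner type as the first type argument.
            inner = get_args(hint)[0]
            rows.append(
                {"attribute": attribute, "type": inner.__name__, "required": required}
            )
        return rows


    for col in toy_schema_info(ToyCustomer):
        print(col["attribute"], col["type"], col["required"])

The loop prints one line per annotated attribute in the same three-column shape as the ``fr_schema_info()`` loop above. The real library presumably also records ``Field`` constraints and docstrings, which this sketch leaves out.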