Narwhals: Write DataFrame Code Once, Run It Anywhere in Python

The Python data ecosystem has a problem: too many DataFrame libraries, and no good way to make them talk to each other. You pick pandas for compatibility, Polars for speed, or cuDF for GPU acceleration — and then you’re locked in. Library authors face an even worse choice: support every DataFrame library, or alienate half their users.

Narwhals solves this. It’s a lightweight compatibility layer that lets you write DataFrame code once and run it on any supported backend. Think of it as the WSGI or DB-API of DataFrame operations — a standard interface over multiple implementations.

The project hit version 2.22.1 in mid-2026 and has been quietly becoming the go-to answer for “how do I make my library accept both pandas and Polars DataFrames?”

The Problem Narwhals Solves

If you’ve ever tried to write a data processing function that works with both pandas and Polars, you know the pain. The APIs look similar but diverge in dozens of small ways:

# pandas
df["col"].fillna(0)
df.rename(columns={"old": "new"})
df.groupby("key").agg({"val": "mean"})

# Polars
df.fill_null(0)
df.rename({"old": "new"})
df.group_by("key").agg(pl.col("val").mean())

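Three operations, three different spellings: a different method name, a different argument shape, or a different way of expressing the aggregation. Scale that to a real data pipeline with dozens of transformations, and you’re maintaining parallel codebases or forcing a single backend on your users.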

Narwhals gives you a unified API:

import narwhals as nw

def process(df_raw):
    df = nw.from_native(df_raw)
    return (
        # fill_null is an expression method, so apply it through with_columns
        df.with_columns(nw.col("val").fill_null(0))
        .rename({"old": "new"})
        .group_by("key")
        .agg(nw.col("val").mean())
    )

Pass in a pandas DataFrame, a Polars DataFrame, or any supported native type — the same code works on all of them.

Getting Started

Installation is straightforward:

pip install narwhals

You don’t need to install any specific DataFrame library as a dependency of Narwhals itself. The backend libraries are your responsibility — Narwhals just provides the translation layer. This keeps the package small and avoids dependency conflicts.

Here’s the basic pattern:

import narwhals as nw
import pandas as pd
import polars as pl

# Create data in different backends
pdf = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})
pldf = pl.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})

# Wrap them with Narwhals
nw_pd = nw.from_native(pdf)
nw_pl = nw.from_native(pldf)

# Same operations on both
result_pd = nw_pd.select(nw.col("a") + nw.col("b"))
result_pl = nw_pl.select(nw.col("a") + nw.col("b"))

# Convert back to native types
print(nw.to_native(result_pd))  # pandas DataFrame
print(nw.to_native(result_pl))  # Polars DataFrame

The from_native() function accepts pandas and Polars DataFrames, Polars LazyFrames, cuDF objects, and several other backends. Series are accepted too when you opt in with allow_series=True or series_only=True. The to_native() function converts back to whatever the original type was.

Building a Real Data Pipeline

Let’s build something practical — a data pipeline that cleans, transforms, and aggregates sales data. The key advantage: this pipeline works regardless of which DataFrame library your data comes from.

import narwhals as nw
from datetime import datetime

def sales_pipeline(df_raw):
    """Process raw sales data — works with pandas, Polars, or any Narwhals backend."""
    df = nw.from_native(df_raw)
    
    # Clean: drop rows with missing critical fields
    df = df.drop_nulls(subset=["order_id", "amount"])
    
    # Transform: calculate derived columns
    df = df.with_columns(
        (nw.col("amount") * 1.08).round(2).alias("amount_with_tax"),
        nw.col("date").str.to_datetime().alias("order_date"),
    )
    
    # Filter: keep only recent orders
    cutoff = datetime(2025, 1, 1)
    df = df.filter(nw.col("order_date") >= cutoff)
    
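    # Aggregate: summary by region (rounding happens after the aggregation
    # so that each agg expression stays a plain sum/mean/count)
    summary = (
        df.group_by("region")
        .agg(
            nw.col("amount").sum().alias("total_revenue"),
            nw.col("amount").mean().alias("avg_order"),
            nw.col("order_id").count().alias("order_count"),
        )
        .with_columns(nw.col("avg_order").round(2))
        .sort("total_revenue", descending=True)
    )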
    
    return nw.to_native(summary)

Run this with a pandas DataFrame and it returns a pandas DataFrame. Run it with a Polars DataFrame and it returns a Polars DataFrame. The logic stays identical.

Lazy Execution Support

One of Narwhals’ more powerful features is its support for lazy execution. If your backend supports lazy evaluation (Polars LazyFrame, DuckDB), Narwhals preserves that:

import narwhals as nw
import polars as pl

# Create a LazyFrame
lazy_df = pl.scan_csv("sales_data.csv")

# Wrap with Narwhals — lazy evaluation is preserved
nw_lazy = nw.from_native(lazy_df)

# Build your pipeline
result = (
    nw_lazy.filter(nw.col("status") == "completed")
    .group_by("category")
    .agg(nw.col("revenue").sum())
)

# Convert back — still lazy, nothing executed yet
native_result = nw.to_native(result)

# Now collect (execute)
final = native_result.collect()

This means you get the query optimization benefits of lazy backends without changing your Narwhals code. For large datasets, this can be the difference between minutes and seconds.

For Library Authors

This is where Narwhals really shines. If you’re building a data processing library, accepting any supported DataFrame through Narwhals, instead of requiring one specific DataFrame library, dramatically expands your user base:

import narwhals as nw

def my_library_function(df_raw):
    """Accept any DataFrame type via Narwhals."""
    df = nw.from_native(df_raw)

    # Your processing logic here
    # Works whether user passes pandas, Polars, cuDF, etc.
    result = df  # placeholder: replace with your transformations

    return nw.to_native(result)

Libraries such as Altair and Plotly already use Narwhals this way. It means users can pass whatever DataFrame type they’re already using: no conversion step, no extra memory overhead, no dependency conflicts.

Supported Backends

As of version 2.22.1, Narwhals supports:

pandas — the ubiquitous choice, with full Series and DataFrame support
Polars — both eager DataFrame and lazy LazyFrame
cuDF — GPU-accelerated DataFrames from NVIDIA’s RAPIDS ecosystem
Modin — distributed pandas-compatible DataFrames
Dask — parallel computing with DataFrame API
PyArrow — columnar data format with Table and ChunkedArray support
DuckDB — analytical SQL engine with DataFrame integration

The list continues to grow. Each new backend expands the reach of code written once.

When Not to Use Narwhals

Narwhals isn’t a silver bullet. There are situations where you should go direct to the backend:

Performance-critical hot paths — while Narwhals overhead is minimal, direct API calls are always faster. If you’re in a tight loop doing millions of operations, native calls may be worth the complexity.
Backend-specific features — Narwhals covers the common operations, but not every niche feature of every backend. If you need Polars’ specific join algorithms or pandas’ extensive time series functionality, you’ll need native code.
Simple projects with a single backend — if you’re already committed to one DataFrame library and don’t need flexibility, Narwhals adds an unnecessary abstraction layer.

For most data pipelines, ETL processes, and library code, the tradeoff is heavily in Narwhals’ favor.

Migration Strategy

If you have an existing pandas codebase and want to test Polars without a full rewrite, Narwhals gives you an incremental path:

Install Narwhals alongside your existing code
Wrap your input DataFrames with nw.from_native()
Replace pandas method calls with Narwhals equivalents, one section at a time
Test with both pandas and Polars inputs at each step
Convert back with nw.to_native() where needed for downstream code

You can migrate piecemeal — no big-bang rewrite required. And at any point, your code still works with the original backend.

The Bigger Picture

Narwhals is a sign that the Python data ecosystem is growing up. Instead of picking sides in a DataFrame war, people are building bridges. We’ve seen this pattern before: pip, poetry, and uv all coexist for packaging. WSGI and ASGI standardize how web servers talk to frameworks. DB-API gives you a common interface for databases. Narwhals does the same thing for DataFrames.

In 2026, the question isn’t “pandas or Polars?” anymore. It’s “write code that works with both.” Narwhals makes that practical.

I keep going back to how simple the API actually is. A couple of functions — from_native and to_native — and you get to write your data processing logic once. That’s it. No configuration files, no adapter classes, no runtime detection gymnastics. Just wrap, process, unwrap.

If you’re building data pipelines for users with different infrastructure constraints, or if you maintain a library and want to stop fielding “does this work with Polars?” issues, Narwhals is worth a look.

Sources: Narwhals PyPI, Narwhals GitHub, Narwhals Documentation
