Streamlit started as a way to turn Python scripts into shareable web apps in minutes. More than six years after its 2019 launch, it has become a default choice for data scientists who need to build internal tools, dashboards, and prototypes without touching JavaScript.
The 2026 picture is more nuanced. Streamlit is no longer just for quick prototypes. Teams are running it in production, embedding it in larger applications, and using it as the frontend for complex ML pipelines. Here is what actually matters when building Streamlit apps today.
The State of Streamlit in 2026
Streamlit’s core promise is unchanged: write Python, get a reactive web app. But the tooling around it has matured. The st.session_state API, introduced in 0.84 and stable throughout the 1.x releases, handles state management without the bugs that plagued early workarounds. Multipage apps, supported natively through a pages/ directory since 1.10, let you organize larger applications without third-party hacks.
The biggest shift is Streamlit’s integration with the broader Python data ecosystem. You can connect to DuckDB for fast local queries, use Polars for efficient DataFrame operations, and serve Plotly figures without serialization headaches. The once-fragile bridge between data processing and web rendering is now solid.
Performance has also improved. Fragments, introduced experimentally in 1.33 and stabilized as st.fragment in 1.37 (both in 2024), let you rerun only a decorated section of a page instead of the entire script on every interaction. For apps with dozens of widgets, this can sharply reduce the work done per interaction, because expensive code outside the fragment is not executed again.
What Works and What Doesn’t
Streamlit is best for internal tools. Data exploration dashboards, experiment tracking interfaces, model evaluation UIs — anything where the audience is technical and the goal is iteration speed over pixel-perfect design. You trade fine-grained layout control for not having to write HTML.
It is not the right choice for public-facing consumer applications. By default, Streamlit’s top-to-bottom render model means every widget interaction reruns the script, which can create visible latency and a clunky experience for non-technical users. If you need a polished consumer UI, reach for FastAPI with a React frontend, or a framework like Reflex or NiceGUI that gives you more control over the frontend.
The sweet spot is the internal tooling layer that sits between raw Jupyter notebooks and full-stack web applications. Data teams that adopt Streamlit for this layer report cutting tool delivery time from weeks to hours.
Session State: The Part Everyone Gets Wrong
st.session_state is the most important concept in Streamlit, and it trips up beginners more than anything else. Here is the mental model that makes it click:
Streamlit reruns your entire script on every interaction. Variables defined at the top level get recreated and lose their values. Session state is a dictionary that survives these reruns. Every piece of mutable state in your app — form inputs, selected tabs, filter values, cached computations — lives in session state.
import streamlit as st

# Initialize state once
if "messages" not in st.session_state:
    st.session_state.messages = []

# Add a message
user_input = st.chat_input("Type something")
if user_input:
    st.session_state.messages.append({"role": "user", "content": user_input})
    # Simulate a response
    st.session_state.messages.append(
        {"role": "assistant", "content": f"Echo: {user_input}"}
    )

# Display all messages; the list survives reruns
for msg in st.session_state.messages:
    with st.chat_message(msg["role"]):
        st.write(msg["content"])
A common mistake is reading a key such as st.session_state["df"] at the top of the script without the initialization guard. On the first run the key does not exist, so the access raises a KeyError and the app shows an error instead of rendering. Always check with if "key" not in st.session_state before reading a key you set yourself.
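One way to make the guard hard to forget is to centralize it. The sketch below is an illustration, not a Streamlit API: ensure_defaults is a hypothetical helper, and a plain dict stands in for st.session_state (which supports the same membership test and item assignment) so the snippet runs outside an app.

```python
from collections.abc import MutableMapping
from typing import Any


def ensure_defaults(state: MutableMapping, defaults: dict[str, Any]) -> None:
    """Set each key only when it is missing, so reruns keep existing values."""
    for key, value in defaults.items():
        if key not in state:
            state[key] = value


# A plain dict stands in for st.session_state in this sketch.
state: dict[str, Any] = {"messages": [{"role": "user", "content": "hi"}]}

# Simulate a rerun: the existing chat history must not be reset,
# while the missing "df" key gets its default.
ensure_defaults(state, {"messages": [], "df": None})
```

In an app, the call would be ensure_defaults(st.session_state, {...}) near the top of the script, before any code reads those keys.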
Caching With @st.cache_data and @st.cache_resource
Loading data from disk or an API on every rerun destroys performance. Streamlit’s caching decorators solve this:
import streamlit as st
import pandas as pd

@st.cache_data(ttl=3600)  # Cache for 1 hour
def load_sales_data():
    return pd.read_parquet("sales_2026.parquet")

@st.cache_resource  # Cache indefinitely, share across sessions
def get_db_connection():
    import duckdb
    return duckdb.connect("analytics.db")
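@st.cache_data is for data returned by functions, such as DataFrames, lists, and arrays. It pickles the result, keys the cache entry on a hash of the function’s code and arguments, and returns a fresh copy on each call, so one caller’s mutations do not leak to another. @st.cache_resource is for objects that cannot or should not be serialized, such as database connections, ML models, and API clients. It stores the object itself and hands the same instance to every rerun and every user session, so anything cached this way should be safe to share.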
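The practical difference between the two decorators is copy versus shared instance. The following plain-Python sketch imitates both behaviors with the standard library only; FakeConnection is a hypothetical stand-in for a real connection object, and nothing here calls Streamlit itself.

```python
import pickle
import threading

# cache_data-style storage: keep pickled bytes, hand out a fresh copy per call.
rows = [{"region": "emea", "sales": 10}]
stored = pickle.dumps(rows)

first = pickle.loads(stored)
first[0]["sales"] = 999        # one caller mutates its own copy
second = pickle.loads(stored)  # the next caller still sees the original data


class FakeConnection:
    """Hypothetical stand-in for a database connection that holds a lock."""

    def __init__(self) -> None:
        self.lock = threading.Lock()


# Objects holding locks or sockets cannot be pickled, which is why they
# belong in a resource-style cache instead.
try:
    pickle.dumps(FakeConnection())
    picklable = True
except TypeError:
    picklable = False

# cache_resource-style storage: every caller receives the very same object.
shared = FakeConnection()
caller_a, caller_b = shared, shared
```

Because a resource is shared rather than copied, mutating it in one session is visible in all of them.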
The ttl parameter is underused. Without it, cached data never expires, which causes stale results in production apps. Set a TTL that matches how often your data changes. For hourly-updated dashboards, 3600 seconds. For daily reports, 86400.
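To see what a TTL does to a cache entry, here is a deliberately simplified model. It is not how Streamlit implements caching; ttl_cache, load_sales, and the fake clock are all illustrative names, and the clock is injected only so that expiry can be shown without sleeping.

```python
import time
from functools import wraps


def ttl_cache(ttl_seconds, clock=time.monotonic):
    """Cache results per argument tuple and recompute once they are too old."""

    def decorator(func):
        store = {}  # maps args -> (time stored, value)

        @wraps(func)
        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and now - hit[0] < ttl_seconds:
                return hit[1]
            value = func(*args)
            store[args] = (now, value)
            return value

        return wrapper

    return decorator


fake_now = [0.0]  # mutable cell acting as a controllable clock
calls = []        # records every real execution of the function


@ttl_cache(ttl_seconds=3600, clock=lambda: fake_now[0])
def load_sales(region):
    calls.append(region)
    return f"rows for {region}"


load_sales("emea")    # miss: the function runs
load_sales("emea")    # hit: served from the cache
fake_now[0] = 3601.0  # just over an hour later
load_sales("emea")    # expired: the function runs again
```

The function body runs twice across the three calls, which is the behavior to aim for when the underlying data refreshes hourly.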
Multipage Apps: Beyond a Single Script
The pages/ directory pattern, supported natively since 1.10, is the cleanest way to organize larger Streamlit apps:
my_app/
├── main.py
└── pages/
    ├── 1_dashboard.py
    ├── 2_data_explorer.py
    └── 3_settings.py
Each file in pages/ becomes a separate page in the sidebar navigation. Numbers in filenames control sort order. All pages share the same st.session_state, which means you can set filters on one page and read them on another.
The downside is that pages are independent scripts. Moving complex shared logic into a utils.py module and importing it across pages is essential. Otherwise, you end up copy-pasting the same 200 lines of data loading code into every page file.
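A shared module might look like the sketch below. The function names, the filter_ key prefix, and the row layout are assumptions made for illustration, and a plain dict again stands in for st.session_state so the example runs on its own.

```python
# utils.py: helpers imported by main.py and every file in pages/
from collections.abc import MutableMapping
from typing import Any

FILTER_PREFIX = "filter_"  # assumed naming convention to avoid key collisions


def set_filter(state: MutableMapping, name: str, value: Any) -> None:
    """Store a filter value under a namespaced key."""
    state[FILTER_PREFIX + name] = value


def get_filter(state: MutableMapping, name: str, default: Any = None) -> Any:
    """Read a filter value, falling back to a default if no page has set it."""
    return state.get(FILTER_PREFIX + name, default)


def apply_region_filter(rows: list[dict], region: Any) -> list[dict]:
    """Return only the rows for one region, or all rows when no region is set."""
    if region is None:
        return rows
    return [row for row in rows if row["region"] == region]


# A plain dict stands in for st.session_state in this sketch.
shared_state: dict[str, Any] = {}
set_filter(shared_state, "region", "emea")   # e.g. in pages/3_settings.py
region = get_filter(shared_state, "region")  # e.g. in pages/1_dashboard.py

rows = [{"region": "emea", "sales": 10}, {"region": "apac", "sales": 7}]
visible = apply_region_filter(rows, region)
```

Each page would then import these helpers and pass st.session_state as the state argument, which keeps key names and filtering logic in one place.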
Integrating With ML Pipelines
Most Streamlit apps in 2026 serve as frontends for ML workflows. A typical pattern: the app loads a model from MLflow or a model registry, fetches prediction inputs from a feature store, and displays results with Plotly charts.
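import mlflow
import plotly.express as px
import streamlit as st

@st.cache_resource
def load_model():
    # URI format: models:/<registered model name>@<alias>
    return mlflow.pyfunc.load_model("models:/production@champion")

def fetch_features(customer_id):
    # Placeholder: look up this customer's features in your feature store
    raise NotImplementedError

def feature_importances(model):
    # Placeholder: pyfunc models have no built-in importance method, so build
    # a DataFrame with "feature" and "importance" columns from the underlying model
    raise NotImplementedError

model = load_model()
st.title("Churn Prediction Explorer")
customer_id = st.text_input("Customer ID")

if customer_id and st.button("Predict"):
    features = fetch_features(customer_id)
    result = model.predict(features)
    # Assumes the model returns a churn probability between 0 and 1
    st.metric("Churn Risk", f"{result[0]:.1%}")
    # Show feature importance
    importances = feature_importances(model)
    fig = px.bar(importances, x="importance", y="feature")
    st.plotly_chart(fig)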
The key detail: wrap model loading in @st.cache_resource. Loading a model on every rerun eats memory and adds seconds of latency. With caching, the model loads once and stays in memory until the server restarts.
Where Streamlit Falls Short and How to Work Around It
Authentication has long been Streamlit’s weak point. Recent releases add st.login() and st.user for OpenID Connect providers, but there is still no built-in role or permission system. A common alternative is to put Streamlit behind a reverse proxy like Nginx or Traefik with OAuth, or use Cloudflare Access for internal tools. Streamlit Cloud offers Google OAuth, but it is limited to single-provider login.
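For the reverse-proxy route, the app usually learns who the user is from a header the proxy adds after login. The sketch below assumes the proxy forwards the email in an X-Forwarded-Email header, which is an assumption to verify against your own proxy configuration; authenticated_email is a hypothetical helper.

```python
from collections.abc import Mapping
from typing import Optional

# Assumed header name; check what your proxy actually sends.
EMAIL_HEADER = "x-forwarded-email"


def authenticated_email(
    headers: Mapping[str, str], allowed_domain: str
) -> Optional[str]:
    """Return the forwarded email if it belongs to allowed_domain, else None."""
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {key.lower(): value for key, value in headers.items()}
    email = normalized.get(EMAIL_HEADER, "").strip().lower()
    local, _, domain = email.rpartition("@")
    if local and "@" not in local and domain == allowed_domain.lower():
        return email
    return None


# Example inputs; in an app the mapping would come from the incoming request.
ok = authenticated_email({"X-Forwarded-Email": "Ana@Example.com"}, "example.com")
missing = authenticated_email({}, "example.com")
outsider = authenticated_email({"x-forwarded-email": "eve@evil.test"}, "example.com")
```

In an app, the headers could be read from st.context.headers, with st.stop() called when the helper returns None. This is only safe when the app is reachable exclusively through the proxy, since a client that can reach Streamlit directly can set the header itself.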
Layout control is deliberately restricted. You get columns (st.columns) and containers, but no direct access to CSS grid or flexbox. If you need pixel-level control, build a custom component (typically with React), which Streamlit renders inside an iframe. The tradeoff is intentional: most data apps do not need CSS, and the people building them do not want to write it.
Logging and monitoring are not first-class. Streamlit does not expose application-level metrics for you. Use a tool like Prometheus with the prometheus_client Python library and expose a /metrics endpoint on a separate port. Run the app under a supervisor such as systemd, or in a Docker container with a restart policy, so restarts are clean.
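To show the shape of a metrics endpoint on a separate port, here is a sketch that uses only the standard library in place of prometheus_client. The counter names and helper functions are illustrative, and the output covers only simple counters in the Prometheus text format.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

_counters: dict[str, int] = {}
_lock = threading.Lock()


def increment(name: str, amount: int = 1) -> None:
    """Add to a named counter; safe to call from several threads."""
    with _lock:
        _counters[name] = _counters.get(name, 0) + amount


def render_metrics() -> str:
    """Render all counters in the Prometheus text exposition format."""
    with _lock:
        snapshot = sorted(_counters.items())
    lines = []
    for name, value in snapshot:
        lines.append(f"# TYPE {name} counter")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args) -> None:
        pass  # keep scrape requests out of the app log


def start_metrics_server(port: int) -> ThreadingHTTPServer:
    """Serve /metrics from a daemon thread on the given local port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Record a few events, as app code would after each prediction.
increment("predictions_total")
increment("predictions_total")
increment("prediction_errors_total")
```

In a Streamlit app, start_metrics_server should run once per process, for example from a function decorated with @st.cache_resource, because the script reruns on every interaction and a second attempt to bind the same port would fail.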
The Bottom Line
Streamlit in 2026 is the fastest way to turn a Python script into something colleagues can use. It solves a real problem that Jupyter notebooks and full web frameworks both fail to address: the gap between analysis and sharing. The tradeoffs — limited layout control, full-page reruns — are acceptable for internal tools and unacceptable for consumer products. Know which side of that line you are on before you start building.