Geospatial analysis in Python has come a long way from the days of GDAL command-line tools and shapefile headaches. In 2026, the ecosystem supports cloud-native formats, fast spatial joins, and interactive mapping without leaving your Jupyter notebook.
GeoPandas and the Parquet Revolution
GeoPandas remains the foundation of Python geospatial analysis. It extends pandas with a geometry column type and spatial operations, so you can read a GeoJSON file, spatially join it to a GeoParquet dataset, and plot the result, all with familiar pandas syntax.
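To make the spatial join concrete, here is a minimal sketch that builds two tiny GeoDataFrames in memory rather than reading files. The point names, zone names, and coordinates are invented for illustration; with real data you would load the inputs with `gpd.read_file(...)` and `gpd.read_parquet(...)` instead.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical sample data: three points and two rectangular zones (lon/lat).
points = gpd.GeoDataFrame(
    {"name": ["cafe", "library", "harbor"]},
    geometry=[Point(0.5, 0.5), Point(1.5, 0.5), Point(5.0, 5.0)],
    crs="EPSG:4326",
)
zones = gpd.GeoDataFrame(
    {"zone": ["west", "east"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
    crs="EPSG:4326",
)

# Keep each point that falls inside a zone and attach that zone's attributes.
# Points outside every zone ("harbor" here) are dropped by the inner join.
joined = gpd.sjoin(points, zones, how="inner", predicate="within")

# The result is still a DataFrame, so ordinary pandas idioms apply.
counts = joined.groupby("zone")["name"].count()
print(counts)
```

Both frames share a CRS here; when they do not, reprojecting one with `to_crs` before the join avoids silently wrong matches. Calling `joined.plot()` draws the result if matplotlib is installed.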
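GeoParquet is the format that changed everything. Unlike shapefiles, which split each dataset across several sidecar files, truncate column names to 10 characters, and cap each file at 2 GB, GeoParquet stores everything in a single compressed, columnar file without those restrictions. Because Parquet keeps statistics for each chunk of rows, query engines can apply predicate pushdown and fetch only the byte ranges that might match a filter, meaning you can query spatial data in S3 without downloading the whole file first.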
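The idea behind predicate pushdown can be sketched without any geospatial library. The snippet below is a simplified model, not how any particular engine is implemented: it assumes each row group advertises a bounding box in the file metadata, and the `RowGroup` class and the sample extents are hypothetical.

```python
from dataclasses import dataclass

BBox = tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


@dataclass(frozen=True)
class RowGroup:
    """Hypothetical stand-in for the statistics stored for one row group."""
    index: int
    bbox: BBox
    num_rows: int


def intersects(a: BBox, b: BBox) -> bool:
    """True when two boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def prune_row_groups(groups: list[RowGroup], query: BBox) -> list[RowGroup]:
    """Keep only the row groups whose extent could contain matching rows."""
    return [g for g in groups if intersects(g.bbox, query)]


# Invented metadata for a file with three row groups.
groups = [
    RowGroup(0, (-10.0, 35.0, 0.0, 45.0), 50_000),
    RowGroup(1, (0.0, 45.0, 10.0, 55.0), 50_000),
    RowGroup(2, (100.0, -10.0, 110.0, 0.0), 50_000),
]
query = (2.0, 48.0, 3.0, 49.0)

to_read = prune_row_groups(groups, query)
print([g.index for g in to_read])  # prints [1]
```

Only one of the three row groups has to be fetched, which is why a reader working over HTTP range requests can answer a small spatial query against a large remote file.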
The ecosystem shift from shapefiles to GeoParquet is nearly complete. Overture Maps, the open map data project founded by Amazon, Meta, Microsoft, and TomTom, distributes its data as GeoParquet. Most government open data portals now support GeoParquet export. The format is fast, compact, and cloud-native.
DuckDB Spatial
DuckDB’s spatial extension brings fast spatial queries to the in-process analytical database. On large tables, spatial joins that take minutes in GeoPandas can run in seconds with DuckDB. The syntax is standard SQL with spatial functions: ST_Within, ST_Intersects, ST_Distance, and much of the rest of the PostGIS vocabulary.
The combination of DuckDB and GeoPandas is powerful. Use DuckDB for the heavy spatial operations, such as filtering millions of points by a polygon boundary, computing distances, or finding nearest neighbors. Use GeoPandas for visualization and integration with the broader Python data ecosystem.
Interactive Mapping
Folium, based on Leaflet.js, remains the easiest way to create interactive maps in Jupyter notebooks. For more control, plotly’s scatter_map and choropleth_map functions (the MapLibre-based successors to the deprecated scatter_mapbox and choropleth_mapbox) produce publication-quality interactive maps with a few lines of code. Kepler.gl, with its Python binding through keplergl, handles large datasets with GPU-accelerated rendering.
Cloud-Native Geospatial
The STAC (SpatioTemporal Asset Catalog) specification has become the standard for cataloging geospatial data in cloud storage. pystac-client lets you search STAC APIs from Python. Combined with stackstac, which loads cloud-optimized GeoTIFFs lazily into xarray arrays backed by Dask, you can search, filter, and analyze satellite imagery directly from cloud storage without downloading terabytes of data.
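A search along these lines might be sketched as follows. The collection name `sentinel-2-l2a`, the bounding box, the dates, and the cloud-cover threshold are assumptions picked for the example, and the `eo:cloud_cover` filter only works against servers that implement the STAC query extension. The two helper functions are illustrative, not part of pystac-client.

```python
from datetime import date


def build_search_params(bbox, start, end, max_cloud_cover,
                        collection="sentinel-2-l2a"):
    """Assemble keyword arguments for a pystac-client search (illustrative helper)."""
    return {
        "collections": [collection],
        "bbox": list(bbox),  # [xmin, ymin, xmax, ymax] in lon/lat
        "datetime": f"{start.isoformat()}/{end.isoformat()}",
        "query": {"eo:cloud_cover": {"lt": max_cloud_cover}},
    }


def search_items(catalog_url, params, max_items=20):
    """Run the search against a STAC API; needs pystac-client and network access."""
    # Imported here so the parameter helper works without the package installed.
    from pystac_client import Client

    client = Client.open(catalog_url)
    search = client.search(max_items=max_items, **params)
    return list(search.items())


# Hypothetical area and period of interest.
params = build_search_params(
    bbox=(4.7, 52.2, 5.1, 52.5),
    start=date(2025, 6, 1),
    end=date(2025, 6, 30),
    max_cloud_cover=20,
)
print(params["datetime"])  # prints 2025-06-01/2025-06-30
```

Calling `search_items("https://earth-search.aws.element84.com/v1", params)` against Element 84's public Earth Search endpoint should return matching items, which can then be passed to `stackstac.stack(items)` to build a lazy data cube. Nothing is read from the imagery files until a computation is triggered.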
Geospatial analysis in Python has matured from a niche specialty to a mainstream data science capability. The tools are fast, the formats are cloud-native, and the barrier to entry is lower than ever.