Introduction
Geospatial data catalogs are the backbone of modern spatial data infrastructure (SDI). They enable organizations to discover, describe, and access geographic datasets — from satellite imagery and vector maps to sensor observations and 3D city models. Whether you’re building a municipal open data portal, managing an environmental monitoring network, or curating a research data repository, a self-hosted geospatial catalog server ensures your spatial assets are discoverable and interoperable.
In this guide, we compare three leading open-source geospatial catalog platforms: pygeoapi, pycsw, and GeoNetwork. Each implements OGC (Open Geospatial Consortium) standards but takes a different architectural approach to serving geospatial metadata.
Comparison Table
| Feature | pygeoapi | pycsw | GeoNetwork |
|---|---|---|---|
| Primary Role | OGC API server | CSW metadata catalog | Full geospatial catalog |
| GitHub Stars | 519 | 171 | 407 |
| Language | Python | Python | Java |
| OGC Standards | OGC API (Features, Records, and more), STAC | CSW 2.0.2/3.0, OAI-PMH | CSW, OGC API Records, WMS, WFS |
| Metadata Standards | STAC, OGC API Records | ISO 19115/19139, Dublin Core, FGDC | ISO 19115/19139, Dublin Core, FGDC |
| Web UI | OpenAPI/Swagger UI | No built-in UI | Full admin + search UI |
| Database Backend | Elasticsearch, PostgreSQL | SQLAlchemy (any RDBMS) | PostgreSQL, H2 |
| Harvesting | Via plugins | CSW harvesting | CSW, OAI-PMH, WFS, Z39.50 |
| Docker Support | Official image | Official image | Official + community images |
| License | MIT | MIT | GPLv2 |
pygeoapi: The Modern OGC API Server
pygeoapi is a Python-based OGC API server that implements the next generation of OGC standards, moving from the traditional XML-based web services (WMS, WFS, CSW) to RESTful JSON APIs. It serves geospatial data through OpenAPI-documented endpoints, which makes it accessible to web developers without specialized GIS knowledge.
Docker Deployment
Configuration Example
pygeoapi reads a single YAML configuration file that declares server settings and, for each collection, the data provider that backs it.
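As a rough illustration of how one collection is laid out, the sketch below mirrors a `resources` entry as a Python dictionary. The real configuration is written in YAML. The collection name `obs`, the CSV path, and the field names are hypothetical, and the key names follow pygeoapi's sample configuration as best recalled, so check them against the official documentation before relying on them.

```python
import json

# Hypothetical collection; the nesting mirrors a pygeoapi `resources` entry.
resources = {
    "obs": {
        "type": "collection",
        "title": "Observations",
        "description": "Sample point observations",
        "keywords": ["observations", "monitoring"],
        "extents": {
            "spatial": {
                "bbox": [-180, -90, 180, 90],
                "crs": "http://www.opengis.net/def/crs/OGC/1.3/CRS84",
            }
        },
        "providers": [
            {
                "type": "feature",
                "name": "CSV",
                "data": "data/obs.csv",  # hypothetical path
                "id_field": "id",
                "geometry": {"x_field": "long", "y_field": "lat"},
            }
        ],
    }
}


def missing_keys(resource):
    """Return the top-level keys this sketch expects but the resource lacks."""
    expected = ("type", "title", "description", "extents", "providers")
    return [key for key in expected if key not in resource]


# Print the structure so it can be compared with a YAML file side by side.
print(json.dumps(resources, indent=2))
print(missing_keys(resources["obs"]))
```

The `missing_keys` helper is only a sanity check for this sketch, not a substitute for pygeoapi's own configuration validation.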
pygeoapi’s strength is its modern architecture. It natively supports OGC API Features (replacing WFS), OGC API Records (replacing CSW), and STAC (SpatioTemporal Asset Catalog) for Earth observation data. The built-in OpenAPI documentation means every endpoint is self-describing, and the Python plugin system allows custom data providers for any backend.
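To make the RESTful style concrete, here is a minimal sketch that builds an OGC API Features items request. The base URL and the collection name `obs` are hypothetical, and the `f=json` format parameter is a pygeoapi convention rather than part of the core standard.

```python
from urllib.parse import urlencode


def items_url(base_url, collection_id, bbox=None, limit=10):
    """Build an OGC API Features items request for one collection."""
    params = {"f": "json", "limit": limit}
    if bbox is not None:
        # CRS84 order: min longitude, min latitude, max longitude, max latitude
        params["bbox"] = ",".join(str(value) for value in bbox)
    # Keep commas unescaped so the bbox stays readable in the URL.
    query = urlencode(params, safe=",")
    return f"{base_url.rstrip('/')}/collections/{collection_id}/items?{query}"


# Hypothetical local deployment and collection name.
url = items_url("http://localhost:5000", "obs", bbox=(-10, 40, 5, 55), limit=5)
print(url)
```

Any HTTP client can fetch the resulting URL; the response is a GeoJSON feature collection, so no GIS-specific client library is needed to read it.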
pycsw: The Lightweight CSW Specialist
pycsw is a minimalist, standards-compliant OGC CSW (Catalogue Service for the Web) server written in Python. It focuses on one job, serving geospatial metadata through the CSW 2.0.2 and 3.0 protocols, without the overhead of a full-featured catalog application.
Docker Deployment
pycsw is the reference implementation for OGC CSW and passes the OGC compliance test suite. It supports all major metadata standards — ISO 19115/19139, Dublin Core, and FGDC — and provides CSW harvesting for aggregating metadata from other catalogs.
Its lightweight design means it can run on minimal resources (512MB RAM is sufficient for most deployments) while handling thousands of metadata records. For organizations that only need standards-compliant metadata discovery without a heavy web UI, pycsw is the ideal choice.
GeoNetwork: The Full-Featured Spatial Data Catalog
GeoNetwork is the most mature and feature-rich open-source geospatial catalog. Originally developed by the UN Food and Agriculture Organization (FAO) and now maintained as an OSGeo project, it has been deployed by national mapping agencies, environmental ministries, and research institutions worldwide for over two decades.
Docker Deployment
GeoNetwork provides a comprehensive web interface for metadata editing, validation, and searching. It includes a map viewer for spatial extent visualization, user/group management with fine-grained permissions, and powerful harvesting capabilities that can pull metadata from CSW, OAI-PMH, WFS, Z39.50, and even WebDAV sources.
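Harvesting from a CSW source comes down to paging through search results until the server reports that nothing is left. The sketch below shows that loop in isolation. It is not GeoNetwork's implementation: `fetch_page` is a hypothetical callable standing in for the HTTP request, and the in-memory catalog exists only to make the example runnable.

```python
def harvest_all(fetch_page, page_size=50):
    """Collect every record from a paged source.

    `fetch_page(start, count)` is a hypothetical callable returning
    (records, next_record). A next_record of 0 means nothing is left,
    mirroring the nextRecord attribute of a CSW SearchResults element.
    """
    records = []
    start = 1  # CSW record positions are 1-based
    while start:
        page, next_record = fetch_page(start, page_size)
        records.extend(page)
        # Stop on an empty page or a cursor that does not advance,
        # so a misbehaving server cannot cause an endless loop.
        if not page or next_record <= start:
            break
        start = next_record
    return records


# In-memory stand-in for a remote catalog.
CATALOG = [f"record-{i}" for i in range(1, 8)]


def fake_fetch(start, count):
    page = CATALOG[start - 1:start - 1 + count]
    following = start + len(page)
    return page, (following if following <= len(CATALOG) else 0)


print(harvest_all(fake_fetch, page_size=3))
```

A production harvester would also deduplicate by record identifier and track modification dates so that repeated runs only fetch changes.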
The built-in metadata editor supports complex ISO 19139 templates with validation, making it suitable for organizations that need to produce standards-compliant metadata for INSPIRE (European SDI) or similar regulatory frameworks. GeoNetwork also includes a schema plugin system that allows custom metadata profiles.
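To give a feel for what record validation involves, the sketch below checks an ISO 19139 document for two commonly required elements, the file identifier and the dataset title. It is a simple completeness check under assumed requirements, not XML Schema or INSPIRE validation, and the sample record is invented for the example.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Paths are relative to the gmd:MD_Metadata root element.
REQUIRED = {
    "fileIdentifier": "gmd:fileIdentifier/gco:CharacterString",
    "title": (
        "gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/"
        "gmd:CI_Citation/gmd:title/gco:CharacterString"
    ),
}


def missing_fields(xml_text):
    """Return the names of required elements that are absent or empty."""
    root = ET.fromstring(xml_text)
    missing = []
    for name, path in REQUIRED.items():
        node = root.find(path, NS)
        if node is None or not (node.text or "").strip():
            missing.append(name)
    return missing


# Invented sample record with an empty title.
SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:fileIdentifier><gco:CharacterString>example-001</gco:CharacterString></gmd:fileIdentifier>
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:citation><gmd:CI_Citation>
        <gmd:title><gco:CharacterString></gco:CharacterString></gmd:title>
      </gmd:CI_Citation></gmd:citation>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
</gmd:MD_Metadata>"""

print(missing_fields(SAMPLE))
```

A check like this can run as a pre-publication gate in a script, while full schema and profile validation stays with the catalog's own tooling.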
Choosing the Right Geospatial Catalog
Your choice depends on your organization’s scale and requirements. pygeoapi is ideal for modern deployments that want to embrace OGC API standards and serve data directly through RESTful JSON endpoints. It’s the best fit if you’re building a new spatial data infrastructure from scratch with web-native technologies.
pycsw excels when you need a lightweight, standards-compliant CSW endpoint that “just works.” It’s perfect as a metadata aggregation node in a federated catalog network, or as a backend for custom catalog applications where you want maximum control over the frontend.
GeoNetwork is the choice for organizations that need a complete, turnkey catalog solution — especially those subject to regulatory metadata requirements (INSPIRE, ISO 19115). Its mature web interface, harvesting capabilities, and metadata editing tools make it the go-to for national and regional SDI deployments.
Why Self-Host Your Geospatial Catalog?
Geospatial data is often sensitive — precise locations of critical infrastructure, environmental monitoring data, and cadastral information have legal and security implications. Self-hosting ensures this data never leaves your control and can be governed by your organization’s data policies.
Open standards compliance (OGC) means your self-hosted catalog can interoperate with national and international SDI networks while maintaining autonomy. When government agencies require CSW endpoints for data sharing mandates, a self-hosted pycsw or GeoNetwork instance satisfies these requirements without exposing data to third-party cloud services.
Open-source geospatial catalogs have powered national spatial data infrastructures for over two decades. Deployments ranging from the European INSPIRE geoportal to the FAO’s spatial data infrastructure show that self-hosted open-source tools can meet demanding governmental and scientific requirements.
For environmental monitoring and research organizations, self-hosting enables long-term data preservation independent of vendor roadmaps. GeoNetwork deployments at institutions like the FAO and national geological surveys demonstrate that open-source catalogs can manage millions of metadata records reliably.
For related data infrastructure, see our open data portal comparison. If you’re working with spatial analysis pipelines, our scientific simulation guide covers complementary HPC workflows.
FAQ
Can GeoNetwork handle millions of metadata records?
Yes, GeoNetwork has been deployed at national scales with millions of records. The Elasticsearch backend provides fast full-text search across large catalogs, and PostgreSQL with PostGIS handles spatial queries efficiently. For deployments exceeding 5 million records, allocate 8GB+ RAM and use SSD storage for Elasticsearch indices.
Does pygeoapi replace GeoServer or MapServer?
pygeoapi complements rather than replaces map servers. While pygeoapi serves vector data through OGC API Features and raster metadata through STAC, it doesn’t render map tiles or handle complex spatial analysis. Run pygeoapi alongside GeoServer or MapServer — pygeoapi for modern RESTful data APIs, the map server for WMS/WMTS visualization.
How does pycsw compare to CKAN for geospatial metadata?
CKAN is a general-purpose data portal with geospatial extensions, while pycsw is purpose-built for OGC CSW metadata exchange. If you need a full data portal with dataset pages, user dashboards, and API management, CKAN is the better choice. If you need a standards-compliant CSW endpoint that can be harvested by national SDI networks, pycsw is the right tool.
Can I migrate from GeoNetwork 3.x to 4.x with existing metadata?
Yes, GeoNetwork 4 provides migration tools for upgrading from 3.x deployments. The upgrade path preserves all metadata records, user accounts, and harvesting configurations. Back up your database and data directory before migration, and test the upgrade on a staging instance first.
Which tool works best with QGIS and other desktop GIS?
All three work with QGIS. GeoNetwork and pycsw both expose CSW endpoints that can be searched from QGIS through its MetaSearch plugin. pygeoapi’s OGC API endpoints are directly consumable by QGIS 3.22+. For day-to-day desktop GIS use, any of the three supports discovery from within QGIS.