Examples

This page shows how to use the geospatial harmonization tools in this repository.


Geospatial Data Harmonization

The geospatial_harmonizer module helps harmonize multiple geospatial datasets by projecting them to a common CRS, clipping to a common extent, and creating visualizations.

Running the Colorado Example

The main example harmonizes four datasets for Colorado fire risk analysis:

python examples/colorado_fire_risk/colorado_harmonization.py

This downloads and processes:

  • FBFM40 Fire Behavior Fuel Models (raster) — Landfire 2024 Scott and Burgan 40-class model
  • MACAv2 Winter Precipitation (raster) — CCSM4 RCP8.5 Dec–Mar mean 2006–2099, streamed via OPeNDAP
  • MTBS Burned Area Boundaries (vector) — USGS fire perimeter data, kept as vector
  • Microsoft Building Footprints (vector, rasterized) — Colorado buildings rasterized to presence/absence at ~270 m

Output is saved to examples/colorado_fire_risk/output/.

Programmatic Usage

Import the harmonization functions and run a custom workflow:

from pathlib import Path
from src.geospatial_harmonizer import (
    DatasetSpec,
    ExampleWorkflow,
    run_harmonization_example,
)

# Define your target grid
TARGET_CRS = "EPSG:4326"
TARGET_EXTENT = (-109.05, 36.99, -102.04, 41.01)  # Colorado bounding box
TARGET_RESOLUTION = 0.00243  # ~270 m expressed in degrees of latitude
OUTPUT_DIR = Path("./my_harmonized_output")

# Define datasets to harmonize
DATASETS = [
    DatasetSpec(
        name="my_raster",
        url="https://example.com/data.tif",
        data_type="raster",
    ),
    DatasetSpec(
        name="my_vector",
        url="https://example.com/data.zip",
        data_type="vector",
        rasterize=True,
        burn_value=1,
    ),
]

# Run the workflow
workflow = ExampleWorkflow(
    name="my_workflow",
    datasets=DATASETS,
    target_crs=TARGET_CRS,
    target_extent=TARGET_EXTENT,
    target_resolution=TARGET_RESOLUTION,
    output_dir=OUTPUT_DIR,
    create_visualization=True,
    verbose=True,
)

output_files, interactive_map = run_harmonization_example(workflow)

Supported Data Types

  • Raster: GeoTIFF, COG, IMG (downloaded and harmonized)
  • NetCDF / OPeNDAP: Climate model outputs (MACAv2, ERA5 subsets) — set netcdf_variable and use a THREDDS dodsC URL to subset spatially before download
  • Vector: GeoJSON, Shapefile (optionally rasterized to match the raster grid)
  • Archives: ZIP files (automatically extracted)
  • STAC: Cloud-native collections (set is_stac=True)
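For STAC sources, the DatasetSpec fields stac_collection, stac_datetime, and stac_query (documented under Core Functions) resemble the parameters of a STAC API item search. The sketch below shows how such values could map onto a search request body. The helper name build_stac_search and the mapping itself are assumptions made for illustration, not the module's actual code.

```python
import json


def build_stac_search(collection, bbox, datetime_range=None, query=None, limit=10):
    """Build a JSON body for a STAC API item search (hypothetical helper).

    bbox is (min_lon, min_lat, max_lon, max_lat) in WGS84, which matches
    the EPSG:4326 target extent used elsewhere on this page.
    """
    body = {
        "collections": [collection],
        "bbox": list(bbox),
        "limit": limit,
    }
    if datetime_range:
        body["datetime"] = datetime_range  # ISO-8601 date or start/end range
    if query:
        body["query"] = query  # property filters (STAC query extension)
    return body


body = build_stac_search(
    collection="sentinel-2-l2a",
    bbox=(-109.05, 36.99, -102.04, 41.01),
    datetime_range="2023-06-01/2023-08-31",
    query={"eo:cloud_cover": {"lt": 20}},
)
print(json.dumps(body, indent=2))
```

A body like this would typically be sent as a POST to the catalog's search endpoint; which endpoint and client the module uses is not stated here.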

Output Structure

After running, the output directory contains:

output/
└── colorado_harmonized_output/
    ├── harmonized_fbfm40_fuel_models.tif
    ├── harmonized_pr_winter_rcp85_ccsm4.tif
    ├── harmonized_mtbs_burned_areas.geojson
    ├── harmonized_building_footprints.tif
    ├── harmonized_visualization.png
    └── harmonized_visualization.html

All harmonized rasters share:

  • Common CRS (EPSG:4326)
  • Common extent (Colorado bounding box)
  • Common resolution (~270 m / 0.00243°)
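The ~270 m figure is a north-south distance. In a geographic CRS such as EPSG:4326, cells of 0.00243° are narrower east-west because meridians converge toward the poles. The sketch below works through the arithmetic using a spherical approximation of roughly 111.32 km per degree; the function names and the 39°N reference latitude (roughly mid-Colorado) are assumptions chosen for illustration.

```python
import math

# Approximate length of one degree of latitude on a spherical Earth.
METERS_PER_DEGREE = 111_320.0


def meters_to_degrees(meters):
    """Convert a north-south distance in meters to degrees of latitude."""
    return meters / METERS_PER_DEGREE


def cell_size_meters(resolution_deg, latitude_deg):
    """Return the approximate (east-west, north-south) cell size in meters."""
    north_south = resolution_deg * METERS_PER_DEGREE
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return east_west, north_south


print(round(meters_to_degrees(270), 5))  # 0.00243

ew, ns = cell_size_meters(0.00243, 39.0)
print(round(ew), round(ns))  # 210 271
```

If equal-area or equal-distance cells matter for an analysis, a projected CRS is usually a better target than EPSG:4326.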

Core Functions

DatasetSpec

Dataclass describing a single dataset to harmonize. Key fields:

Field                      Description
name                       Short identifier used in output filenames
url                        Direct download URL, OPeNDAP endpoint, or STAC API root
data_type                  "raster" or "vector"
rasterize                  Rasterize vector to match the target grid
burn_value                 Value to burn when rasterizing (default 1)
resampling_method          "nearest" (categorical), "bilinear" (continuous), or "cubic"; auto-detected from dtype if not set
labels_url                 URL to a CSV with VALUE,LABEL columns for legend labels (e.g. FBFM40 fuel model names)
netcdf_variable            Variable name inside a NetCDF file (e.g. "pr", "tasmax")
netcdf_months              Months to average over, e.g. [12, 1, 2, 3] for Dec–Mar winter mean
secondary_url              Second OPeNDAP URL for derived variables (e.g. rhsmin for VPD)
secondary_netcdf_variable  Variable name in the secondary NetCDF
is_wcs                     Download from a WCS endpoint; set wcs_layer to the coverage name
is_wms                     Download from a WMS endpoint; set wms_layer to the layer name
is_stac                    Search a STAC catalog instead of downloading directly
stac_collection            STAC collection ID, e.g. "sentinel-2-l2a"
stac_asset                 Asset key to download, e.g. "B08", "visual"
stac_datetime              ISO-8601 date or range, e.g. "2023-06-01/2023-08-31"
stac_query                 Extra STAC filter properties, e.g. {"eo:cloud_cover": {"lt": 20}}
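To illustrate what netcdf_months selects, the sketch below averages a monthly series over a chosen set of calendar months. It operates on a plain list of (year, month, value) records for a single location; the module presumably does the equivalent per grid cell on the NetCDF time axis, but that is an assumption, and seasonal_mean is a hypothetical helper.

```python
from statistics import mean


def seasonal_mean(records, months):
    """Average the values whose calendar month is in `months`.

    records: iterable of (year, month, value) tuples.
    months:  e.g. [12, 1, 2, 3] for a December to March winter mean.
    """
    wanted = set(months)
    selected = [value for (_year, month, value) in records if month in wanted]
    if not selected:
        raise ValueError("no records fall in the requested months")
    return mean(selected)


# One synthetic year where each month's value equals its month number.
records = [(2006, month, float(month)) for month in range(1, 13)]
print(seasonal_mean(records, [12, 1, 2, 3]))  # 4.5
```

Note that this pools every matching month across all years in the series, so a season that wraps the year boundary needs no special handling.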

build_grid_spec(target_crs, target_extent, target_resolution)

Creates a GridSpec defining the target coordinate system, extent, and resolution.
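As a rough picture of what a grid specification has to pin down, the sketch below derives the pixel dimensions and a north-up affine transform from a CRS, extent, and resolution. SketchGrid and sketch_grid are hypothetical stand-ins, not the module's GridSpec, and rounding the dimensions up with ceil is an assumption.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchGrid:
    crs: str
    extent: tuple       # (min_x, min_y, max_x, max_y)
    resolution: float
    width: int          # number of columns
    height: int         # number of rows
    transform: tuple    # affine coefficients (a, b, c, d, e, f)


def sketch_grid(crs, extent, resolution):
    """Derive grid dimensions and a north-up affine transform."""
    min_x, min_y, max_x, max_y = extent
    width = math.ceil((max_x - min_x) / resolution)
    height = math.ceil((max_y - min_y) / resolution)
    # x = a*col + b*row + c ; y = d*col + e*row + f (origin at top-left corner)
    transform = (resolution, 0.0, min_x, 0.0, -resolution, max_y)
    return SketchGrid(crs, tuple(extent), resolution, width, height, transform)


grid = sketch_grid("EPSG:4326", (-109.05, 36.99, -102.04, 41.01), 0.00243)
print(grid.width, grid.height)  # 2885 1655
```

Every harmonized raster written against the same grid then shares these dimensions and this transform, which is what makes the layers directly comparable cell by cell.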

download_file(url, output_dir, verbose)

Downloads a file from a URL to the specified directory.
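A minimal standard-library sketch of this kind of download step is shown below. It is not the module's implementation: fetch_to_dir is a hypothetical name, and both deriving the filename from the URL path and skipping files that already exist are assumptions.

```python
import shutil
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def fetch_to_dir(url, output_dir, verbose=False):
    """Stream `url` into `output_dir` and return the local path."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    # Use the last path component of the URL as the local filename.
    filename = Path(urlparse(url).path).name or "download.bin"
    target = output_dir / filename

    if target.exists():
        if verbose:
            print(f"Using cached file: {target}")
        return target

    # Copy in chunks so large rasters are never held fully in memory.
    with urlopen(url) as response, open(target, "wb") as handle:
        shutil.copyfileobj(response, handle)
    if verbose:
        print(f"Downloaded {url} -> {target}")
    return target


# Example call (placeholder URL from the usage example on this page):
# fetch_to_dir("https://example.com/data.tif", "./my_harmonized_output", verbose=True)
```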

harmonize_raster(input_path, grid, output_path, verbose)

Reprojects and resamples a raster to match the target grid.
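The resampling half of this step can be illustrated without any geospatial libraries. The sketch below performs nearest-neighbor resampling between two north-up grids that already share a CRS, which is the method suited to categorical data such as fuel model classes. It deliberately omits reprojection, and resample_nearest is a hypothetical helper rather than the module's code.

```python
def resample_nearest(src, src_origin, src_res, dst_origin, dst_res, dst_shape, nodata=None):
    """Nearest-neighbor resample of a north-up grid (same CRS on both sides).

    src:        list of rows, row 0 is the northernmost
    *_origin:   (min_x, max_y) of the grid's top-left corner
    *_res:      square cell size in CRS units
    dst_shape:  (rows, cols) of the output grid
    """
    src_rows, src_cols = len(src), len(src[0])
    dst_rows, dst_cols = dst_shape
    out = []
    for r in range(dst_rows):
        # Center of the destination cell, mapped back to a source row.
        y = dst_origin[1] - (r + 0.5) * dst_res
        src_r = int((src_origin[1] - y) // src_res)
        row = []
        for c in range(dst_cols):
            x = dst_origin[0] + (c + 0.5) * dst_res
            src_c = int((x - src_origin[0]) // src_res)
            if 0 <= src_r < src_rows and 0 <= src_c < src_cols:
                row.append(src[src_r][src_c])
            else:
                row.append(nodata)  # destination cell lies outside the source
        out.append(row)
    return out


coarse = [[1, 2],
          [3, 4]]
fine = resample_nearest(coarse, (0, 2), 1.0, (0, 2), 0.5, (4, 4))
for row in fine:
    print(row)
# [1, 1, 2, 2]
# [1, 1, 2, 2]
# [3, 3, 4, 4]
# [3, 3, 4, 4]
```

Because nearest-neighbor only copies existing values, class codes are never blended into meaningless intermediates, which is why it is the usual choice for categorical rasters while bilinear or cubic suit continuous fields.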

rasterize_vector_to_grid(input_path, grid, output_path, burn_value, verbose)

Rasterizes vector geometries onto the target grid.
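Conceptually, presence/absence rasterization marks each grid cell that a geometry covers. The pure-Python sketch below burns a single polygon ring onto a north-up grid by testing cell centers with a ray-casting point-in-polygon check. The cell-center rule, the helper names, and the single-ring input are simplifying assumptions; the module's actual rasterization may differ.

```python
def point_in_ring(x, y, ring):
    """Ray-casting test: is (x, y) inside the polygon ring?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross the ray.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize_ring(ring, origin, res, shape, burn_value=1, fill=0):
    """Burn `burn_value` into every cell whose center falls inside `ring`.

    origin: (min_x, max_y) of the grid's top-left corner
    shape:  (rows, cols)
    """
    rows, cols = shape
    grid = []
    for r in range(rows):
        y = origin[1] - (r + 0.5) * res
        grid.append([
            burn_value if point_in_ring(origin[0] + (c + 0.5) * res, y, ring) else fill
            for c in range(cols)
        ])
    return grid


square = [(1, 1), (3, 1), (3, 3), (1, 3)]
for row in rasterize_ring(square, origin=(0, 4), res=1.0, shape=(4, 4)):
    print(row)
# [0, 0, 0, 0]
# [0, 1, 1, 0]
# [0, 1, 1, 0]
# [0, 0, 0, 0]
```

With a cell-center rule, features much smaller than a cell (individual buildings on a ~270 m grid, for example) can be missed entirely, so it is worth checking how small geometries are treated before interpreting a presence/absence layer.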

create_visualization(outputs, output_path, verbose)

Creates a multi-panel matplotlib visualization of harmonized outputs.

create_interactive_visualization(outputs, target_extent, output_path, verbose)

Creates a Folium HTML map with per-layer toggle checkboxes and opacity sliders.

run_harmonization_example(workflow)

Main entry point that orchestrates the complete harmonization workflow.