Dr James Williams.

GeoAI  ·  Foundation Models  ·  Urban Intelligence

I build geospatial AI systems that operate at scale — from GNN-based urban embedding models spanning 24 cities, to cloud-native conflict data infrastructure indexing 10 million records. My work bridges spatial machine learning with real-world impact in humanitarian, urban, and policy contexts.

About

I build geospatial AI systems at the boundary of spatial computing and machine learning — designing infrastructure that translates large, messy geographic datasets into meaningful representations of places, people, and events.

My work is inherently cross-disciplinary, sitting between GIScience, computer science, and the social sciences. Whether indexing ten million conflict records, embedding urban street networks across 24 global cities, or routing cyclists through 49 fused datasets, the common thread is scale — and the conviction that spatial data, treated carefully, can support better decisions in humanitarian, urban, and policy contexts.

Research Systems

Key Systems Built

Areas of Focus

Research Themes

GeoAI & Representation Learning

Geospatial foundation models, graph neural networks for spatial representation, and discrete global grid systems (H3) for city-scale analysis.

  • Spatial GNNs
  • Place embeddings
  • OSM big-data parsing
  • H3 / DGGS

Conflict & Humanitarian Data

Cloud-native infrastructure for conflict documentation and displacement analysis, supporting interdisciplinary research into slavery and war.

  • CDISaW data platform
  • Topodex geocoding
  • Leverhulme Centre
  • Rights Lab, Nottingham

Urban Intelligence & Mobility

Data-driven analysis of urban place, active transportation, and mobility patterns using satellite data, crowdsourced trajectories, and civic APIs.

  • MORPHEME urban model
  • WalkGrid routing
  • PARM civic dashboards

Portfolio

Featured Projects

Scholarship

Selected Publications

All 21 papers
2026
journal Preprints

Macro-Regional Spatial Patterns of Ambient Air Pollution and Avoidable Hospitalizations for Community-Acquired Pneumonia in Mexico (2013–2020)

C. Hernandez-Nava, M. Mata-Rivera, R. Zagal-Flores, J. Williams

Ambient air pollution significantly contributes to respiratory illnesses, yet little is known about how industrial emissions are linked to preventable hospitalizations across atmospheric basins in middle-income countries. This study develops a basin-based geomatics framework to examine the spatial and temporal relationship between industrial pollutants and age- and sex-adjusted avoidable hospitalizations for community-acquired pneumonia (PQI 11) in Mexico from 2013 to 2020. Using state-level data grouped into eight macro-regions, we combine bivariate choropleth maps, Pearson correlations, linear regression, and longitudinal time-series analysis to identify spatial clusters of high risk and to estimate regional sensitivities to changes in PM2.5, SO2, NOx, and volatile organic compound emissions. The findings reveal notable regional differences: northern border states and the Mexico City metropolitan basin form persistent high–high clusters where elevated emissions coincide with high PQI 11 rates, while coastal and peninsular regions show lower hospitalization burdens despite medium emission levels. Although national industrial PM2.5 emissions decreased over the study period, several macro-regions (particularly CDMX_Edomex, Centro, and Centro Norte) experienced significant increases in avoidable hospitalizations and decoupled emission–health patterns. Correlation matrices and regression slopes suggest that the strength and even direction of links between pollutants and PQI 11 vary across macro-regions, with emission-responsive patterns in Centro Norte and weak or inverse relationships in Peninsula and Pacifico Sur. These findings demonstrate that national averages obscure critical spatial disparities and highlight the value of basin-based geomatics approaches for regional air-quality governance, spatial decision support, and primary-care planning aimed at reducing preventable respiratory hospitalizations.

2026
dataset Zenodo

AnythingPOI - Australia POI Dataset v0.1

J. Williams

A unified, open point-of-interest (POI) dataset for Australia containing 1,735,980 POIs produced by the AnythingPOI pipeline, which fuses OpenStreetMap and Overture Maps Foundation data using H3-indexed spatial conflation, Jaro-Winkler name matching, and multi-signal confidence scoring.

Source breakdown:

  • OSM-only: 320,149 (18.4%), from OpenStreetMap contributors (ODbL 1.0)
  • Overture-only: 1,364,195 (78.6%), from Overture Maps Foundation (CDLA-Permissive-2.0)
  • Conflated (both sources matched): 51,636 (3.0%)

Top categories: Professional & Business, Retail, Food & Beverage, Transportation, Healthcare. Full taxonomy: 18 Tier-1 categories, 196 Tier-2 subcategories.

Contents: GeoParquet files (one per Tier-1 category), PMTiles v3 for interactive map visualisation, and coverage statistics CSVs. Each POI carries a confidence_score (0.01–0.99) reflecting the strength of the conflation evidence across spatial, name, website, phone, postcode, and Wikidata signals.

Attribution: This dataset contains information from OpenStreetMap (© OpenStreetMap contributors, ODbL 1.0; openstreetmap.org/copyright) and Overture Maps Foundation (CDLA-Permissive-2.0; overturemaps.org).

License: Open Database License (ODbL) 1.0. Any public use of this database or works produced from it must include the above attribution. Derivative databases must also be released under ODbL.
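The AnythingPOI description above names Jaro-Winkler name matching and a confidence_score bounded to 0.01–0.99. The sketch below illustrates those two ideas in plain Python. It is not the AnythingPOI implementation: the function names, the signal weights, and the weighted-average rule are all hypothetical, chosen only to show how a name-similarity signal could feed a clamped multi-signal score.

```python
def jaro(s1: str, s2: str) -> float:
    """Standard Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this sliding window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1.0 - base)


# Hypothetical weights: the source lists these six signals but not how
# they are combined, so the values here are illustrative only.
HYPOTHETICAL_WEIGHTS = {
    "spatial": 0.3, "name": 0.3, "website": 0.1,
    "phone": 0.1, "postcode": 0.1, "wikidata": 0.1,
}


def confidence_score(signals: dict, weights: dict = HYPOTHETICAL_WEIGHTS) -> float:
    """Weighted mean of the available signals, clamped to [0.01, 0.99]."""
    total = sum(weights[name] for name in signals)
    if total == 0:
        return 0.01
    raw = sum(weights[name] * value for name, value in signals.items()) / total
    return min(0.99, max(0.01, raw))


if __name__ == "__main__":
    name_sim = jaro_winkler("MARTHA", "MARHTA")  # textbook pair, about 0.961
    print(round(name_sim, 4))
    print(confidence_score({"name": name_sim, "spatial": 0.8}))
```

Averaging only over the signals that are present is one possible design choice: a POI with no phone number or Wikidata link is then not penalised for the missing field. Whether the real pipeline treats missing signals this way is not stated in the description.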

2026
dataset Zenodo

AnythingPOI - Canada POI Dataset v0.1

J. Williams

A unified, open point-of-interest (POI) dataset for Canada containing 5,565,256 POIs produced by the AnythingPOI pipeline, which fuses OpenStreetMap and Overture Maps Foundation data using H3-indexed spatial conflation, Jaro-Winkler name matching, and multi-signal confidence scoring.

Source breakdown:

  • OSM-only: 451,872 (8.1%), from OpenStreetMap contributors (ODbL 1.0)
  • Overture-only: 5,037,979 (90.5%), from Overture Maps Foundation (CDLA-Permissive-2.0)
  • Conflated (both sources matched): 75,405 (1.4%)

Top categories: Professional & Business, Retail, Food & Beverage, Healthcare, Services. Full taxonomy: 18 Tier-1 categories, 196 Tier-2 subcategories.

Contents: GeoParquet files (one per Tier-1 category), PMTiles v3 for interactive map visualisation, and coverage statistics CSVs. Each POI carries a confidence_score (0.01–0.99) reflecting the strength of the conflation evidence across spatial, name, website, phone, postcode, and Wikidata signals.

Attribution: This dataset contains information from OpenStreetMap (© OpenStreetMap contributors, ODbL 1.0; openstreetmap.org/copyright) and Overture Maps Foundation (CDLA-Permissive-2.0; overturemaps.org).

License: Open Database License (ODbL) 1.0. Any public use of this database or works produced from it must include the above attribution. Derivative databases must also be released under ODbL.