Dr James Williams.
GeoAI · Foundation Models · Urban Intelligence
I build geospatial AI systems that operate at scale — from GNN-based urban embedding models spanning 24 cities, to cloud-native conflict data infrastructure indexing 10 million records. My work bridges spatial machine learning with real-world impact in humanitarian, urban, and policy contexts.
About
I build geospatial AI systems at the boundary of spatial computing and machine learning — designing infrastructure that translates large, messy geographic datasets into meaningful representations of places, people, and events.
My work is inherently cross-disciplinary, sitting between GIScience, computer science, and the social sciences. Whether indexing ten million conflict records, embedding urban street networks across 24 global cities, or routing cyclists through 49 fused datasets, the common thread is scale — and the conviction that spatial data, treated carefully, can support better decisions in humanitarian, urban, and policy contexts.
Research Systems
Key Systems Built
MORPHEME
GNN-based embedding model utilising OpenStreetMap data to autonomously quantify urban characteristics — vibrancy, safety, and function — across 24 global cities without manual labelling.
CDISaW
Central Data Infrastructure for Slavery and War — a cloud-native conflict data platform indexing over 10 million records and delivering accessible outputs for interdisciplinary research at the Leverhulme Centre.
Topodex
Cloud-based, LLM-native geocoding system resolving over 894 million places worldwide, incorporating a supervised learning feedback loop for continuous accuracy improvement.
WalkGrid
Cloud-based active travel routing engine fusing 49 datasets — including Earth Observation, Ordnance Survey, OpenStreetMap, and Overture — across a 150,000-cell H3 grid for millisecond personalised route generation at city scale.
Areas of Focus
Research Themes
GeoAI & Representation Learning
Geospatial foundation models, graph neural networks for spatial representation, and discrete global grid systems (H3) for city-scale analysis.
- Spatial GNNs
- Place embeddings
- OSM big-data parsing
- H3 / DGGS
Conflict & Humanitarian Data
Cloud-native infrastructure for conflict documentation and displacement analysis, supporting interdisciplinary research into slavery and war.
- CDISaW data platform
- Topodex geocoding
- Leverhulme Centre
- Rights Lab, Nottingham
Urban Intelligence & Mobility
Data-driven analysis of urban place, active transportation, and mobility patterns using satellite data, crowdsourced trajectories, and civic APIs.
- MORPHEME urban model
- WalkGrid routing
- PARM civic dashboards
Portfolio
Featured Projects
PlaceCrafter
A web-based geospatial framework for identifying and visualizing 'platial' functional regions by clustering OpenStreetMap Points of Interest.
View project
Leisure Walking Framework
A comprehensive, grounded-theory framework for curating personalised leisure walking experiences, creating a bridge between subjective human narratives and computational routing systems.
View project
Scholarship
Selected Publications
Macro-Regional Spatial Patterns of Ambient Air Pollution and Avoidable Hospitalizations for Community-Acquired Pneumonia in Mexico (2013–2020)
C. Hernandez-Nava, M. Mata-Rivera, R. Zagal-Flores, J. Williams
Ambient air pollution significantly contributes to respiratory illnesses, yet little is known about how industrial emissions are linked to preventable hospitalizations across atmospheric basins in middle-income countries. This study develops a basin-based geomatics framework to examine the spatial and temporal relationship between industrial pollutants and age- and sex-adjusted avoidable hospitalizations for community-acquired pneumonia (PQI 11) in Mexico from 2013 to 2020. Using state-level data grouped into eight macro-regions, we combine bivariate choropleth maps, Pearson correlations, linear regression, and longitudinal time-series analysis to identify spatial clusters of high risk and to estimate regional sensitivities to changes in PM2.5, SO2, NOx, and volatile organic compound emissions. The findings reveal notable regional differences: northern border states and the Mexico City metropolitan basin form persistent high–high clusters where elevated emissions coincide with high PQI 11 rates, while coastal and peninsular regions show lower hospitalization burdens despite medium emission levels. Although national industrial PM2.5 emissions decreased over the study period, several macro-regions, particularly CDMX_Edomex, Centro, and Centro Norte, experienced significant increases in avoidable hospitalizations and decoupled emission–health patterns. Correlation matrices and regression slopes suggest that the strength and even direction of links between pollutants and PQI 11 vary across macro-regions, with emission-responsive patterns in Centro Norte and weak or inverse relationships in Peninsula and Pacifico Sur. These findings demonstrate that national averages obscure critical spatial disparities and highlight the value of basin-based geomatics approaches for regional air-quality governance, spatial decision support, and primary-care planning aimed at reducing preventable respiratory hospitalizations.
Geospatial Experience-Oriented Notation (GEON): A Semantic Format for LLM-Native Spatial Intelligence
J. Williams
Existing geospatial data formats such as GeoJSON, Well-Known Text (WKT), and CityGML are optimised for geometric computation and rendering. While effective for Geographic Information Systems (GIS), these approaches present limitations when used with Large Language Models (LLMs). For example, coordinate arrays carry no inherent semantic meaning, spatial relationships require computational geometry to extract, and the human experience of place is usually absent. This manuscript introduces foundational work on Geospatial Experience-Oriented Notation (GEON), a text-based format that bridges machine-optimised geospatial data and human-readable spatial descriptors. GEON encodes identity, geometry, purpose, experiential qualities, spatial relationships, temporal patterns, and data provenance in a readable and structured syntax designed for human comprehension and LLM reasoning. This manuscript presents the initial specification, reference implementations in Python, Rust, and JavaScript, and an empirical evaluation demonstrating how GEON achieves 20% fewer tokens than equivalent GeoJSON files, while encoding 31% more semantic facts per token. This manuscript explores the implementation and how LLMs reason about place-making, urban design interventions, and spatial intelligence tasks that existing formats struggle to support.
Diabetes Disparities in Mexico: A Spatio-Temporal and Marginalization Index Analysis
C. Hernandez-Nava, J. Williams, S. Flores-Hernandez, M. Mata-Rivera
Understanding the geospatial and temporal distribution of diabetes mellitus in Mexico can be an essential tool in supporting vulnerable populations and addressing health inequalities. This article presents a spatio-temporal investigation of patients aged 18 years and older with diabetes mellitus in Mexico, by geographical area, over the period 2005 to 2022. The approach includes calculating diabetes-related hospitalizations and deaths and their association with the marginalization index, segmented into eight geographical areas of Mexico. Furthermore, this research stratifies by age group and by type of medical institution within the Mexican health services. The main contribution of this research is to explore the relationship between diabetes-related hospitalizations, deaths, geographical area, age, sex, and the marginalization index of populations to support preventive action. The results highlight that adults between the ages of 45 and 64 who live in areas with a high marginalization index have a greater likelihood of suffering complications related to diabetes. The age-adjusted rate of DRAH shows that the Peninsula has the highest values among geographical areas. Future research will explore mapping interventions to specific states and external datasets, to further extend the results of the analysis.
Writing
Latest from the Blog
Get in Touch
Open to research collaboration, grant partnerships, and PhD supervision enquiries.