HalesAir · Session 01

What is Air Quality?

Halesowen College · T Level Data Analytics · Friday 6 March 2026

What is air quality? · Who is the STEM Partner? · Sensor design · Research questions

Who am I?

Dr James Williams

Research Fellow in Slavery and War: Geospatial & Data Science
Rights Lab & Leverhulme Centre
University of Nottingham

Previously Lecturer in Computer Science,
Birmingham Newman University

jwilliams.science

Research areas

Geospatial data science · IoT · Human mobility · Place-based sensing · Modern slavery mapping

On this project

STEM Partner — here to support your project. You own the work; I help you do it well.

What that means in practice

I'll join sessions, answer questions, and help you apply real data-science methods to your findings.

🌿

Part 1
What is Air Quality?

Composition · Pollutants · Standards · Health

What's actually in the air?

Clean, dry air at sea level — by volume:

  • 78.1% Nitrogen (N₂) — inert
  • 20.9% Oxygen (O₂) — essential for life
  • 0.9% Argon (Ar)
  • 0.04% Carbon dioxide (CO₂)
  • trace Water vapour, noble gases
  • <1% Pollutants — tiny fraction, huge impact
Source: NOAA / US Standard Atmosphere

Air quality describes the degree to which the air is free from pollutants that could harm human health, ecosystems, or the built environment.

Indoor vs outdoor: Indoor air can be 2–5× more polluted than outdoor air, for example because of VOCs released by cleaning products.

Activity

⏱ 10 minutes

Think about your journey to college this morning.

1. (5 minutes) Write down 3 things you walked or drove past that might be producing air pollution. (Examples: buses, factories, car parks, busy junctions, construction.)

2. (5 minutes) Compare your list with the person next to you. Discuss: Did you identify the same sources? Were any surprising?

💡 We'll revisit your list when we look at monitoring site types later in the session.

The Key Pollutants

PM2.5 & PM10

Particulate matter: dust, soot, tyre wear, brake particles. PM2.5 (<2.5 µm) bypasses the body's defences and lodges deep in the lungs. Linked to an estimated 29,000 early deaths per year in the UK.

NO₂ · Nitrogen Dioxide

Produced by diesel engines and gas boilers. Inflames airways and reduces lung function. Highest near junctions — traffic idling is the main source.

VOCs · Volatile Organic Compounds

Released by solvents, paints, cleaning products, cooking, and vehicle exhausts. React with NOx in sunlight to form ground-level ozone. Our BME680 detects these.

O₃ · Ground-Level Ozone

Not emitted directly — formed from VOCs + NOx in sunlight. Higher in summer. Damages lung tissue even at moderate concentrations. Rural areas often worse than city centres.

Sources: DEFRA Air Quality Expert Group; WHO Global Air Quality Guidelines 2021

How is Air Quality Measured?

UK uses DAQI — Daily Air Quality Index — scale 1 to 10

  • 1–3: Low
  • 4–6: Moderate
  • 7–9: High
  • 10: Very High
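
The banding can be written as a small lookup. This is a minimal sketch, assuming the standard DAQI bands (1–3 Low, 4–6 Moderate, 7–9 High, 10 Very High); the function name `daqi_band` is illustrative and not part of any DEFRA tool.

```python
def daqi_band(index: int) -> str:
    """Return the DAQI band name for an index from 1 to 10."""
    if not 1 <= index <= 10:
        raise ValueError("DAQI index must be between 1 and 10")
    if index <= 3:
        return "Low"
    if index <= 6:
        return "Moderate"
    if index <= 9:
        return "High"
    return "Very High"


for i in (2, 5, 8, 10):
    print(i, daqi_band(i))
```

Note that our BME680 units do not produce a DAQI value themselves; the index is calculated from reference-grade pollutant measurements.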

Professional monitoring

  • DEFRA's AURN — 170+ fixed stations nationally
  • Instruments cost £10,000–£100,000+
  • Nearest to us: Birmingham Tyburn, Walsall Willenhall Road
  • Data updated hourly at uk-air.defra.gov.uk

Low-cost sensors (us)

  • Pico W + BME680 — under £30 per unit
  • Less precise, but many more locations possible
  • Fill gaps between AURN stations
  • Ideal for relative comparison across sites
AURN: Automatic Urban and Rural Network · Source: DEFRA / uk-air.defra.gov.uk

WHO Air Quality Guidelines (2021)

Guideline values (annual means unless stated), revised downward significantly from the 2005 guidelines

  • PM2.5: 5 µg/m³
  • PM10: 15 µg/m³
  • NO₂: 10 µg/m³
  • O₃ (peak season): 60 µg/m³
Source: WHO Global Air Quality Guidelines, Sept 2021

UK vs WHO

  • UK legal annual NO₂ limit: 40 µg/m³ — 4× the WHO guideline
  • UK PM2.5 target: 10 µg/m³ (improving, not yet at WHO level)
  • Environment Act 2021 committed to tighter targets by 2040
  • The West Midlands has a Clean Air Zone in Birmingham, introduced in response to persistent exceedances

Birmingham's city centre regularly exceeded WHO NO₂ limits before the Clean Air Zone launched in 2021.
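
The "4×" comparison is simple arithmetic, and the same calculation works for any concentration. A small sketch, using the WHO 2021 guideline values quoted on this slide; the dictionary keys and the function name `ratio_to_who` are made up for illustration.

```python
# WHO 2021 guideline values in µg/m³, as quoted on the slide.
WHO_GUIDELINES = {"PM2.5": 5, "PM10": 15, "NO2": 10, "O3_peak_season": 60}


def ratio_to_who(pollutant: str, concentration: float) -> float:
    """Return how many times the WHO guideline a concentration represents."""
    return concentration / WHO_GUIDELINES[pollutant]


uk_no2_limit = 40   # UK legal annual mean limit for NO2, µg/m³
uk_pm25_target = 10  # UK PM2.5 target, µg/m³

print(ratio_to_who("NO2", uk_no2_limit))      # 4.0
print(ratio_to_who("PM2.5", uk_pm25_target))  # 2.0
```

A ratio above 1.0 means the value is higher than the WHO guideline.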

Why does it matter here?

Health effects

  • Triggers and worsens asthma — ~5.4m people in UK affected
  • Long-term cardiovascular disease risk
  • Impairs cognitive development in children — affects school outcomes
  • PM2.5 linked to ~29,000 early deaths per year in the UK
  • Disproportionate impact on deprived communities near major roads
Sources: Asthma + Lung UK; Royal College of Physicians 2016 "Every Breath We Take"

Halesowen context

  • A456 & A459 — high-frequency bus and HGV routes
  • Industrial heritage: Cradley Heath, Dudley — legacy contamination
  • College sits between residential and retail zones
  • Significant variation expected: car park vs. courtyard vs. green space
  • No hyper-local monitoring currently exists — your data will be new

Activity

⏱ 15 minutes

Open Google Maps (or use your local knowledge) and look at the area around Halesowen College.

1. (5 minutes) Find: the nearest main road, a green or open area, and the main car park. Mark or note each one.

2. (5 minutes) Rank these three locations from worst to best air quality. Write a one-sentence justification for your top choice.

3. (5 minutes) Share back with the room: What time of day do you think pollution peaks at the worst location? Why?

💡 There are no wrong answers here — we're building hypotheses we'll test with real data over the coming months.

🔬

Part 2
Designing Your Sensor

Hardware · Capabilities · Limitations

What can our sensors measure?

Waveshare BME680 module — one board, four environmental measurements

🌡 Temperature

−40 to 85°C range · ±1.0°C accuracy
Can detect heat islands, south-facing walls, ventilation hotspots

💧 Relative Humidity

0–100%RH · ±3%RH accuracy
High humidity amplifies perceived air quality impact; dampness promotes mould VOCs

🔘 Barometric Pressure

300–1100 hPa · ±0.6 hPa accuracy
High pressure often brings still air and slower dispersion of pollutants; useful context for other readings

⚗️ VOC / Gas Resistance (IAQ)

Bosch IAQ index via BSEC library
Responds in relative terms to VOCs such as ethanol and acetone and to combustion gases; BSEC also derives a CO₂-equivalent estimate (CO₂ itself is not measured directly)

Important limitation: We do not directly measure PM2.5, PM10, NO₂, or O₃ — but VOC index combined with temperature and humidity still reveals meaningful local variation and patterns.
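
The quoted measurement ranges are also useful as a sanity check on logged data. A minimal sketch, assuming each reading is stored with the four fields below; the `Reading` class, field names, and example values are hypothetical and not the project's actual data format.

```python
from dataclasses import dataclass

# Measurement ranges quoted on the slide for the BME680.
RANGES = {
    "temperature_c": (-40.0, 85.0),
    "humidity_rh": (0.0, 100.0),
    "pressure_hpa": (300.0, 1100.0),
}


@dataclass
class Reading:
    temperature_c: float
    humidity_rh: float
    pressure_hpa: float
    iaq: float  # relative index; no range check is applied to it here


def out_of_range_fields(reading: Reading) -> list[str]:
    """Return the names of fields that fall outside the sensor's quoted range."""
    bad = []
    for name, (low, high) in RANGES.items():
        value = getattr(reading, name)
        if not low <= value <= high:
            bad.append(name)
    return bad


good = Reading(18.5, 62.0, 1008.3, 75.0)
glitch = Reading(18.5, 130.0, 1008.3, 75.0)  # made-up faulty humidity value
print(out_of_range_fields(good))    # []
print(out_of_range_fields(glitch))  # ['humidity_rh']
```

A value inside the range is not necessarily accurate; this check only catches readings the sensor could not physically have produced.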

Low-Cost Sensors: What to Know

Understanding the limitations makes your methodology stronger, not weaker.

Temperature cross-sensitivity

The BME680 heats up slightly during VOC measurement. Readings should be corrected for self-heating. Placement matters — avoid direct sun.

Relative, not absolute IAQ

The IAQ index is comparative — it tells you if air is getting better or worse, not the exact ppb of a specific gas. Useful for trends and comparisons.

Baseline drift

VOC sensors need a "burn-in" period (several days) to stabilise. Early readings may vary. We'll account for this in data cleaning (Session 05).
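
One simple way to handle burn-in during data cleaning is to discard readings taken before the sensor has stabilised. A sketch under stated assumptions: the 3-day cutoff is a guess (the slide only says "several days"), and the deployment date and readings are made up for illustration.

```python
from datetime import datetime, timedelta

BURN_IN = timedelta(days=3)  # assumed value; tune once real data is available


def drop_burn_in(readings, deployed_at, burn_in=BURN_IN):
    """Keep only (timestamp, value) pairs recorded after the burn-in period."""
    cutoff = deployed_at + burn_in
    return [(t, v) for t, v in readings if t >= cutoff]


deployed = datetime(2026, 3, 9, 9, 0)  # hypothetical deployment time
readings = [
    (deployed + timedelta(days=1), 140.0),  # still stabilising
    (deployed + timedelta(days=2), 95.0),   # still stabilising
    (deployed + timedelta(days=4), 62.0),
    (deployed + timedelta(days=5), 58.0),
]

kept = drop_burn_in(readings, deployed)
print(len(kept))  # 2
```

Whatever cutoff you choose, record it in your methodology so another team could apply the same rule.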

Research context: Low-cost sensor networks are a growing area of environmental science. Papers from the Urban Observatory (Newcastle) and OpenAQ show they're genuinely useful when calibrated and interpreted carefully.

Further reading: Snyder et al. (2013) "The changing paradigm of air pollution monitoring" — Environmental Science & Technology. Lewis & Edwards (2016) in Nature on sensor networks.

📐

Part 3
Research Design

Questions · Locations · Methodology

The Scientific Method

Data analytics relies on rigorous scientific methodology.

1. Observe & Question

Notice a phenomenon (e.g., smog near a road) and ask a specific, testable question about it. This drives the whole project.

2. Form a Hypothesis

Propose a measurable expected outcome based on existing knowledge (e.g., "VOCs will peak during morning drop-off").

3. Collect Data (Experiment)

Deploy sensors with a controlled methodology. Consistency is key: if your methodology changes halfway, the readings before and after the change are no longer directly comparable.

4. Analyse & Conclude

Process the data, test for statistical significance, and evaluate whether it supports or refutes your hypothesis.

Why Methodology Matters

Controlling Variables

  • If you compare two sensors, but one is in the sun and one is in the shade, are you measuring air quality or just temperature?
  • We must control external factors as much as possible to ensure we're measuring what we claim to measure.

Reproducibility

  • Could another team take your project plan and reproduce your exact setup?
  • Detailed documentation of your deployment sites, sensor heights, and orientation is just as important as the code.

The difference between "just numbers" and "data-driven evidence" is the rigorous methodology that produced those numbers.

Designing a Research Question

A good research question is specific, measurable, and answerable with the data you can actually collect.

❌ Too vague

"Is the air quality bad at our college?"

"How does the weather affect the air?"

"Is it worse near the road?"

✓ Specific & testable

"Does VOC index differ significantly between the car park and the main courtyard during peak arrival times (08:30–09:00)?"

"How does relative humidity vary between the north and south faces of the main building across a full week?"

"Is there a correlation between temperature and IAQ index at the main entrance between 11:00 and 14:00?"
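
Questions like these map onto standard statistics: a difference between two sites suggests a two-sample test, and a relationship between two variables suggests a correlation. The sketch below computes Welch's t statistic and the Pearson correlation coefficient using only the standard library. All numbers are made up for illustration; they are not project data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)


# Made-up IAQ readings at two sites during peak arrival time.
car_park = [92, 101, 97, 110, 105]
courtyard = [61, 58, 66, 63, 60]
print(round(welch_t(car_park, courtyard), 2))  # 11.59

# Made-up, perfectly linear temperature and IAQ values.
temperature = [12.0, 13.5, 15.0, 16.5, 18.0]
iaq = [60, 66, 72, 78, 84]
print(round(pearson_r(temperature, iaq), 2))  # 1.0
```

A large t statistic points towards a real difference, but turning it into a p-value needs the t distribution, which a statistics library can provide. Real sensor data will be far noisier than these examples.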

Activity

⏱ 10 minutes

Draft your own research question for the HalesAir project.

1. (5 minutes) Use this template as a starting point:
"Does [measurement variable] differ between [location A] and [location B] during [specific time or condition]?"

2. (5 minutes) Share with a neighbour. Check your question against the criteria (does it name a variable? is it testable?) and give feedback.

💡 Your question will evolve as you collect data — that's normal. Getting a clear starting hypothesis is what matters today.

Picking Your Monitoring Location

Scientific considerations

  • Distance from roads & car parks
  • Sun/shade exposure (temperature bias)
  • Building airflow & sheltering effects
  • Ventilation inlets / exhausts nearby
  • Human activity patterns at that spot
  • Contrast with at least one other site

Practical constraints

  • Mains power socket within cable reach
  • Wi-Fi signal stronger than −70 dBm (we'll test this)
  • Physical security — the units are small
  • Facilities permission — we need sign-off
  • Protection from direct rain/weather
  • Height: ~1.5m above ground (breathing level)
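
Signal strength in dBm is negative, so "better" means closer to zero, which is easy to get backwards. A tiny sketch of the check, using the −70 dBm threshold from the list above; the function name and example values are illustrative.

```python
MIN_RSSI_DBM = -70  # threshold from the practical constraints list


def wifi_ok(rssi_dbm: float, threshold: float = MIN_RSSI_DBM) -> bool:
    """Return True if the signal is stronger than the threshold (closer to zero)."""
    return rssi_dbm > threshold


print(wifi_ok(-58))  # True: stronger than -70 dBm
print(wifi_ok(-82))  # False: weaker than -70 dBm
```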

The most valuable datasets come from contrast — a car park vs. a courtyard vs. a green space will tell us far more than three units in similar spots.

Types of Monitoring Sites

DEFRA classification — used to contextualise and compare datasets

Kerbside

Within 1m of a busy road. Captures highest traffic-related pollution. Directly comparable to DEFRA roadside reference data. Best for vehicles-as-source studies.

Roadside

1–5m from traffic. Captures emissions with some dispersion already occurring. Most comparable to pedestrian exposure at crossings and bus stops.

Urban Background

Away from direct emission sources — rooftops, internal courtyards, parks. Represents the general "background" level for the area. Useful as a baseline site.

Indoor / Transitional

Corridors, covered walkways, building entrances. Captures people-generated pollutants (VOCs from deodorant, cleaning products, cooking) and infiltration from outdoors.
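
For outdoor sites, the distances above give a rough first guess at the site type. This is a simplification that uses distance from the road only, based on the figures on this slide; the function name is hypothetical, and the full DEFRA guidance considers more than distance.

```python
def outdoor_site_type(distance_from_road_m: float) -> str:
    """Rough outdoor site type from distance to the nearest busy road, in metres."""
    if distance_from_road_m < 0:
        raise ValueError("distance cannot be negative")
    if distance_from_road_m <= 1:
        return "Kerbside"
    if distance_from_road_m <= 5:
        return "Roadside"
    return "Urban Background"


for distance in (0.5, 3, 40):
    print(distance, outdoor_site_type(distance))
```

Indoor and transitional sites are not covered by this rule and need to be labelled by hand.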

Further reading: DEFRA "Local Air Quality Management Technical Guidance" (TG22) — free to download. Explains site classification in detail.

Activity

⏱ 10 minutes

Evaluate two potential monitoring sites.

1. Pick two locations you're considering. For each, score 1–3 on: pollution exposure · power & Wi-Fi access · scientific contrast with other sites.

2. Using the DEFRA classification from the previous slide, which site type does each location best match? Write it next to each location.

3. Which location scores higher overall? That's your leading candidate, and you'll confirm it during the main task.
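
If you want to keep your scores alongside your other project files, the comparison can be recorded in a few lines. A sketch only: the site names and scores below are hypothetical, and the criterion keys are just labels for the three criteria in step 1.

```python
CRITERIA = ("pollution_exposure", "power_and_wifi", "scientific_contrast")


def total_score(scores: dict) -> int:
    """Sum the 1-3 scores for one site, checking each criterion is valid."""
    for criterion in CRITERIA:
        if scores[criterion] not in (1, 2, 3):
            raise ValueError(f"{criterion} must be scored 1, 2 or 3")
    return sum(scores[c] for c in CRITERIA)


# Hypothetical sites and scores, for illustration only.
sites = {
    "north car park entrance": {
        "pollution_exposure": 3, "power_and_wifi": 2, "scientific_contrast": 3,
    },
    "main courtyard": {
        "pollution_exposure": 1, "power_and_wifi": 3, "scientific_contrast": 2,
    },
}

leading = max(sites, key=lambda name: total_score(sites[name]))
print(leading, total_score(sites[leading]))  # north car park entrance 8
```

An unweighted sum treats all three criteria as equally important; a site with no power or Wi-Fi may be unusable however well it scores elsewhere.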

Main Task

⏱ 30 minutes

Working individually — produce three outputs before the end of the session.

📍

Choose your monitoring site. Be precise — not "outside" but "beside the north car park entrance, under the covered walkway". Note its DEFRA site type.

Write your research question. Use the template from the research question activity earlier in the session. Check it names a variable, two comparable locations or times, and is answerable with the BME680.

✏️

Sketch your deployment plan. A hand-drawn diagram showing: where the unit sits, where the power socket is, approximate Wi-Fi coverage, and how the enclosure is mounted.

💡 These three outputs will form part of your project methodology write-up. Keep them — you'll build on them in every session from here.

Group Share-back

⏱ 15 minutes
🗣️

Present your plan.

It's time to test your methodology against peer review from the rest of the room.

1. Where is your site, and why did you pick it?

2. What is your specific research question?

3. What potential issues have you identified regarding power, security, or Wi-Fi?

Coming Up — Session 02

Date TBC · Hardware & Electronics

What we'll cover

Getting your hands on the Raspberry Pi Pico W and BME680. Breadboarding your first circuit, writing your first MicroPython to read live sensor data, and understanding GPIO & I2C communication.

To bring next time: Your site choice, research question, and deployment sketch from today.

Optional reading: MicroPython docs at micropython.org, "Quick reference for the RP2"

Questions? Get in touch:

jwilliams.science · HalesAir Project