Halesowen College · T Level Data Analytics
Building the complete data pipeline — one step at a time
Hardware, sensors, wiring, I2C. You built the node.
MicroPython basics. You read live data and printed it.
Save that data locally, sync the clock, send it to the internet.
Structured data · Persistent storage · Import into Excel/Python
CSV is intentionally simple. Its strength is that every piece of software on the planet can read it. We'll analyse it in Python in Session 05.
| timestamp | temp_c | hum_pct | pres_hpa | gas_ohm |
|---|---|---|---|---|
| 2026-03-10 09:00:02 | 18.3 | 62 | 1013.2 | 48200 |
| 2026-03-10 09:00:12 | 18.4 | 61 | 1013.2 | 49100 |
| 2026-03-10 09:00:22 | 18.5 | 61 | 1013.3 | 49800 |
"w" — writeCreates the file (or overwrites if it exists). Use once for the header row.
"a" — appendAdds to the end of the file without deleting existing content. Use every loop.
"r" — readOpens to read only. We'll use this in Session 05 to analyse the collected data.
Flash limit: The Pico has 2 MB of flash. At ~60 bytes/row × 8640 rows/day, that's ~500 KB/day — roughly 4 days before you need to download and clear the file.
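The three file modes combine into a minimal logging sketch. This is plain Python, so it runs identically on a PC and on the Pico; `read_sensor()` here is a stand-in for your Session 03 sensor read, returning fixed example values so the sketch is self-contained:

```python
def read_sensor():
    # Stand-in for your Session 03 sensor read -- returns fixed example
    # values so this sketch runs anywhere.
    return {"temp_c": 18.3, "hum_pct": 62, "pres_hpa": 1013.2, "gas_ohm": 48200}

LOG = "log.csv"

# "w" once: create the file and write the header row.
with open(LOG, "w") as f:
    f.write("timestamp,temp_c,hum_pct,pres_hpa,gas_ohm\n")

# "a" every loop: append one reading without touching earlier rows.
for timestamp in ["2026-03-10 09:00:02", "2026-03-10 09:00:12"]:
    r = read_sensor()
    with open(LOG, "a") as f:
        f.write("{},{},{},{},{}\n".format(
            timestamp, r["temp_c"], r["hum_pct"], r["pres_hpa"], r["gas_ohm"]))
```

Opening with "a" inside the loop (rather than keeping one file handle open) means each row is flushed to flash immediately, so a power cut loses at most one reading.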
Connecting to the network · NTP · Real timestamps
Never hardcode passwords in code you share or commit to Git. Store credentials in a separate secrets.py file — import it, don't paste it.
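A minimal secrets.py could look like the sketch below. The variable names are a suggestion, not a required API — whatever you choose, main.py imports them instead of hardcoding the password:

```python
# secrets.py -- keep this file OUT of Git (add "secrets.py" to .gitignore)
WIFI_SSID = "my-hotspot"        # example values: replace with your own
WIFI_PASSWORD = "change-me"

# main.py then imports the credentials instead of pasting them:
#     from secrets import WIFI_SSID, WIFI_PASSWORD
```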
In the classroom: Use your phone's mobile hotspot, not the college Wi-Fi — MicroPython's network library does not support enterprise 802.1X authentication.
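The connect_wifi() function named in the activities could be sketched as below. The timeout value and error message are illustrative choices; the try/except around the import lets the sketch load on a PC, where the MicroPython-only network module doesn't exist:

```python
try:
    import network              # MicroPython-only module (present on the Pico)
except ImportError:
    network = None              # on a PC this stays a non-runnable sketch
import time

def connect_wifi(ssid, password, timeout_s=15):
    """Join the hotspot; returns the Pico's IP, raises OSError on timeout."""
    wlan = network.WLAN(network.STA_IF)
    wlan.active(True)
    wlan.connect(ssid, password)
    deadline = time.time() + timeout_s
    while not wlan.isconnected():
        if time.time() > deadline:
            raise OSError("Wi-Fi connect timed out -- check SSID/password")
        time.sleep(0.5)
    return wlan.ifconfig()[0]   # the board's IP address on the hotspot
```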
Network Time Protocol — used by every computer on the internet to stay synchronised
Sync once at startup: ntptime.settime() queries an NTP server (by default pool.ntp.org) and sets the Pico's clock, so time.localtime() is now accurate. UTC vs local: NTP gives you UTC. For UK data during BST (summer), add 1 hour; during GMT (winter), add 0. We'll handle this in the analysis scripts.
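The get_timestamp() function from the activities could be sketched like this; the name matches the activity, and the output format matches the CSV table above. It runs unchanged under CPython and MicroPython, since both return the same (year, month, day, hour, minute, second, …) tuple layout from time.localtime():

```python
import time

# On the Pico, sync the clock once after Wi-Fi is up:
#     import ntptime
#     ntptime.settime()    # queries pool.ntp.org; sets the RTC to UTC

def get_timestamp(t=None):
    """Format a localtime tuple as 'YYYY-MM-DD HH:MM:SS' (UTC)."""
    if t is None:
        t = time.localtime()
    return "{:04d}-{:02d}-{:02d} {:02d}:{:02d}:{:02d}".format(
        t[0], t[1], t[2], t[3], t[4], t[5])
```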
HTTP POST · REST APIs · Real-time pipeline
When you visit a website, your browser sends a GET request. When you log in, it sends a POST request with your credentials. Our sensor does the same — but with air quality readings.
200 OK = success · 401 = auth error · 500 = server error
urequests — lightweight HTTP for MicroPython (install via Thonny)
ujson.dumps() — converts a Python dict to a JSON string
Call r.close() after each request to free the socket
Error handling: Wrap the POST in try/except — if Wi-Fi drops, fall back to local CSV logging rather than crashing the sensor.
Session 04 uses a test endpoint — we'll confirm data arrives on the server before going live.
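The JSON-building half of post_reading() can be sketched and tested on a PC; build_payload and its field names are illustrative, not a fixed API. The urequests call itself only runs on the Pico, so it is shown as comments:

```python
import json   # MicroPython ships this as ujson; "import json" works on recent builds

def build_payload(timestamp, temp_c, hum_pct, pres_hpa, gas_ohm):
    """Serialise one reading as the JSON body for the POST."""
    return json.dumps({
        "timestamp": timestamp, "temp_c": temp_c, "hum_pct": hum_pct,
        "pres_hpa": pres_hpa, "gas_ohm": gas_ohm,
    })

# On the Pico (URL is the test endpoint supplied in class):
#     import urequests
#     r = urequests.post(URL, data=build_payload(ts, 18.3, 62, 1013.2, 48200),
#                        headers={"Content-Type": "application/json"})
#     print("Status:", r.status_code)   # expect 200
#     r.close()                         # always free the socket
```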
The complete main.py flow for a deployed sensor node
Resilience pattern: Always write to local CSV first, then attempt remote send. If the network is down, data is safe. Next session: we'll add re-transmission of any missed readings.
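The local-first pattern can be sketched as one small function. log_and_send and broken_sender are illustrative names; a failing sender is simulated here so the fallback path runs on any machine:

```python
def log_and_send(row, send, path="log.csv"):
    """CSV first, network second: a reading is never lost to a Wi-Fi drop."""
    with open(path, "a") as f:      # local copy is written unconditionally
        f.write(row + "\n")
    try:
        send(row)                   # e.g. your post_reading() from the activities
        return True
    except OSError:                 # urequests raises OSError on network failure
        print("send failed -- reading kept in", path)
        return False

# Simulated outage: the sender always raises, yet the reading still lands on disk.
def broken_sender(row):
    raise OSError("network down")

ok = log_and_send("2026-03-10 09:00:02,18.3,62,1013.2,48200", broken_sender,
                  path="demo.csv")
print(ok)   # False: send failed, but demo.csv holds the row
```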
By the end you have a working data pipeline: sensor → CSV → server.
(8 min) Add CSV logging to your Session 03 script. Run it for 60 seconds, then download log.csv via Thonny's file manager. Open in Excel — confirm you see tidy columns.
(7 min) Add the connect_wifi() and sync_time() functions. Connect to your phone's hotspot. Verify that get_timestamp() now returns the correct date and time.
(10 min) Add post_reading() using the test endpoint URL provided. Confirm you see Status: 200 in Thonny. Check the live dashboard — your data should appear within seconds.
(5 min) Wrap the POST in a try/except block. Disconnect from Wi-Fi (turn off hotspot), run the loop, confirm it still writes to CSV and prints the fallback message.
💡 If you get OSError: -2 on connect, check your SSID spelling is exact (case-sensitive). If you get an empty CSV, check you opened with "a" not "w" in the loop.
Date TBC · Data Cleaning & Analysis
Loading your CSV into Python pandas, cleaning anomalous readings, computing summary statistics, and visualising temperature and IAQ trends as your first research outputs.
Before Session 05: Leave your sensor running overnight. Bring at least 24 hours of CSV data to the session.
Data target: We need at least 500 rows per site for meaningful statistical analysis. Keep your sensor online!
Questions? Get in touch:
jwilliams.science · HalesAir Project