Beach Water Safety: Enterococci Bacteria Classifier
Classifies ocean water as safe/unsafe for swimming by detecting enterococci bacteria using multi-sensor fusion (temp, turbidity, TDS, pH) on ESP32-S3 with Edge Impulse. Designed for remote beach safety alerts.
About this project
Ocean Water Bacteria Contamination Detection - ESP32-S3
Remote beach water quality classification based on enterococci levels using sensor fusion and LoRa
Project Overview
An Edge AI system trained with Edge Impulse that classifies ocean water quality by inferring enterococci bacteria levels using sensor fusion. Built for the Edge Impulse 2025 Hackathon, the device performs real-time inference on an ESP32-S3 and transmits each classification via LoRa to an MQTT gateway. The gateway forwards it to Home Assistant, which shows it on a dashboard and answers voice assistant queries. Motivated by frequent beach closures and 24-hour lab result delays, the goal is to deliver immediate, remote assessments of swimming safety.
Demo Video: https://youtu.be/I6PYruD1Y70
GitHub Repository: https://github.com/burgueishon/TinyML-Water-Bacteria-Classification
Technologies
Hardware: ESP32-S3 • ESP32-WROOM-32 • RA-02 LoRa • DS18B20 • SHT41 • SEN0189 • SEN0244 • SEN0161
Platforms: Edge Impulse Studio • Arduino IDE • Home Assistant
Technologies: TinyML / Edge AI • LoRa 433 MHz • MQTT • WiFi
Languages: C++ • Python
The Problem
Traditional beach water quality testing has significant limitations:
- 24-hour delay for lab results
- Weekly sampling misses contamination events between tests
- Manual collection requires personnel at each location
The Solution
A TinyML device that provides:
- Real-time classification (safe/unsafe)
- Continuous monitoring capability
- Remote operation via LoRa
Hardware
Sensor Node (ESP32-S3)
- Water Temperature: DS18B20 (OneWire)
- Ambient Temperature: SHT41 (I2C)
- Turbidity: SEN0189 (ADC + voltage divider)
- TDS: SEN0244 (ADC)
- pH: SEN0161 (ADC)
- Communication: RA-02 LoRa 433 MHz
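The three analog sensors feed the model as voltages, so each raw ADC count has to be converted first. A minimal sketch of that arithmetic, assuming a nominal 12-bit ADC with a 3.3 V full scale and a hypothetical resistor pair for the turbidity divider; the function names, resistor values, and the decision to undo the divider are illustrative and not taken from the project firmware, and a real ESP32-S3 ADC would also need calibration.

```python
def adc_to_volts(raw: int, vref: float = 3.3, bits: int = 12) -> float:
    """Convert a raw ADC count to the voltage at the ADC pin (ideal, uncalibrated)."""
    max_count = (1 << bits) - 1
    if not 0 <= raw <= max_count:
        raise ValueError(f"raw count {raw} outside 0..{max_count}")
    return raw * vref / max_count


def undo_divider(v_pin: float, r_top: float, r_bottom: float) -> float:
    """Recover the sensor output voltage from the voltage measured after a resistive divider.

    r_top sits between the sensor output and the ADC pin,
    r_bottom between the ADC pin and ground.
    """
    return v_pin * (r_top + r_bottom) / r_bottom


# Hypothetical example: a 10k/20k divider scales a 4.5 V sensor output to 3.0 V at the pin.
turbidity_pin_volts = adc_to_volts(2048)
turbidity_sensor_volts = undo_divider(turbidity_pin_volts, r_top=10_000, r_bottom=20_000)
```

Whether the model is trained on the pin voltage or the reconstructed sensor voltage does not matter much, as long as training and inference use the same convention.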
Gateway (ESP32-WROOM-32)
- LoRa receiver
- WiFi/MQTT bridge to Home Assistant
Dataset
Collection Process
- Location: Revere Beach, Massachusetts
- Period: October 23 - November 13, 2024
- Laboratory: Biomarine Research Corporation (Gloucester, MA)
- Threshold: 104 CFU/100mL (Massachusetts standard)
Samples
| Sample | Date | CFU/100mL | Classification | Status |
|---|---|---|---|---|
| 01 | Oct 23 | 74 | SAFE | Used |
| 02 | Oct 27 | <10 | SAFE | Partial (turbidity issue) |
| 03 | Oct 28 | 41 | SAFE | Excluded (turbidity failed) |
| 04 | Oct 29 | 30 | SAFE | Excluded (turbidity failed) |
| 05 | Nov 4 | 531 | UNSAFE | Used |
| 06 | Nov 5 | 31 | SAFE | Used |
| 07 | Nov 6 | 10 | SAFE | Used |
| 08 | Nov 11 | 31 | SAFE | Used |
| 09 | Nov 12 | 165 | UNSAFE | Used |
| 10 | Nov 13 | 94 | SAFE | Used |
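The Classification column follows from comparing each lab count with the 104 CFU/100mL threshold. A minimal sketch of that labeling rule, assuming that "unsafe" means strictly above the threshold and that a censored value such as "<10" can be treated as its upper bound; both choices are assumptions, and the function names are illustrative.

```python
THRESHOLD_CFU = 104.0  # CFU/100mL, the threshold stated in the collection process


def parse_cfu(text: str) -> float:
    """Parse a lab count; a censored value like '<10' is treated as its upper bound (assumption)."""
    text = text.strip()
    if text.startswith("<"):
        return float(text[1:])
    return float(text)


def classify(cfu_text: str) -> str:
    """Label a sample SAFE or UNSAFE; strictly above the threshold counts as UNSAFE (assumption)."""
    return "UNSAFE" if parse_cfu(cfu_text) > THRESHOLD_CFU else "SAFE"


# Lab counts copied from the Samples table, in order 01..10.
lab_counts = ["74", "<10", "41", "30", "531", "31", "10", "31", "165", "94"]
labels = [classify(value) for value in lab_counts]
```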
Final Dataset
- Total: 36 CSV files
- Safe: 26 files (72%)
- Unsafe: 10 files (28%)
- Sampling rate: 1 Hz
- Duration: 60 readings per file
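A file can be sanity-checked against these figures before upload. The sketch below assumes each CSV has a header row containing the five feature names used by the model, plus an optional timestamp column that is ignored; the file layout and the helper name `load_window` are assumptions, not the project's documented format.

```python
import csv
import io

FEATURES = ["temp_c", "amb_temp_c", "turb_v", "tds_v", "ph_v"]
READINGS_PER_FILE = 60  # 60 readings at 1 Hz


def load_window(csv_text: str) -> list[list[float]]:
    """Parse one CSV file into rows of the five features and check its length."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [name for name in FEATURES if name not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    # Extra columns (for example a timestamp) are ignored.
    rows = [[float(row[name]) for name in FEATURES] for row in reader]
    if len(rows) != READINGS_PER_FILE:
        raise ValueError(f"expected {READINGS_PER_FILE} readings, got {len(rows)}")
    return rows
```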
Model
Input Features (5)
- temp_c - Water temperature (°C)
- amb_temp_c - Ambient temperature (°C)
- turb_v - Turbidity voltage (V)
- tds_v - TDS voltage (V)
- ph_v - pH voltage (V)
Architecture
- Processing: Raw Data block
- Learning: Neural Network Classifier
- Output: Binary classification (safe/unsafe)
Training Results
Two experiments were compared:
| Experiment | Accuracy | F1 (safe) | F1 (unsafe) | ROC AUC |
|---|---|---|---|---|
| Spectral Features | 69.1% | 0.78 | 0.48 | 0.63 |
| Raw Data | 99.4% | 1.00 | 0.99 | 0.99 |
Raw Data clearly outperformed Spectral Features, reaching 99.4% accuracy with F1 scores of 1.00 (safe) and 0.99 (unsafe) and a ROC AUC of 0.99.
Note: The default 67/33 train-test split raises a warning due to the limited dataset size; results are indicative and not production-ready.
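Because the classes are imbalanced, per-class F1 is more informative than accuracy alone. As an illustration, the sketch below scores a hypothetical classifier that always answers "safe", using the dataset's 26 safe and 10 unsafe files as stand-in counts; these are not the project's actual confusion matrices, and the function names are illustrative.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 for one class from its true positives, false positives, and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def accuracy(correct: int, total: int) -> float:
    """Fraction of correctly classified items."""
    return correct / total


# Hypothetical "always safe" classifier on 26 safe and 10 unsafe items.
n_safe, n_unsafe = 26, 10
baseline_accuracy = accuracy(correct=n_safe, total=n_safe + n_unsafe)
baseline_f1_safe = f1_score(tp=n_safe, fp=n_unsafe, fn=0)
baseline_f1_unsafe = f1_score(tp=0, fp=0, fn=n_unsafe)
```

Such a baseline reaches about 72% accuracy while never flagging unsafe water, which is why the F1 (unsafe) column matters most when comparing the two experiments.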
Resource Usage
- Inference time: <10ms on ESP32-S3
- Flash usage: ~27KB
- RAM usage: ~368 bytes
Deployment
Inference Stabilization
Implemented rolling majority voting:
- Window of 5 consecutive inferences
- Final result = majority vote
- Confidence = averaged scores
- Suppresses single-inference noise
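The voting scheme above can be sketched as follows. This is a minimal illustration, not the project's firmware: it assumes each inference yields a score per class, that the vote goes to the highest-scoring class, and that the reported confidence is the winning class's score averaged over the window. The class name `MajorityVoter` is hypothetical.

```python
from collections import Counter, deque


class MajorityVoter:
    """Rolling majority vote over the most recent inferences."""

    def __init__(self, window: int = 5):
        self._history = deque(maxlen=window)  # oldest entries drop out automatically

    def update(self, scores: dict[str, float]) -> tuple[str, float]:
        """Add one inference's per-class scores and return (label, confidence)."""
        self._history.append(scores)
        # Each stored inference votes for its highest-scoring class.
        votes = Counter(max(s, key=s.get) for s in self._history)
        label = votes.most_common(1)[0][0]
        # Confidence: mean score of the winning class across the window (assumption).
        confidence = sum(s[label] for s in self._history) / len(self._history)
        return label, confidence
```

With two classes and a full window of 5 there can be no tie. While the window is still filling, a tie goes to the class that received a vote first.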
System Integration
- Sensor Node: ESP32-S3 with Edge Impulse model, transmits via LoRa
- MQTT Gateway: ESP32-WROOM-32, bridges LoRa to WiFi/MQTT
- Home Assistant: Dashboard with color-coded status, voice assistant queries (ESP32-S3-BOX with Claude)
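The gateway's job is essentially a format translation between the LoRa link and MQTT. A minimal sketch of one way to do it, assuming a compact "label,confidence" ASCII payload on the LoRa side and a JSON message on the MQTT side; the payload format, topic name, and JSON keys are all hypothetical, and the actual radio and MQTT client calls are left out.

```python
import json

TOPIC = "beach/water_quality/state"  # hypothetical MQTT topic
VALID_LABELS = ("safe", "unsafe")


def encode_lora(label: str, confidence: float) -> bytes:
    """Sensor node side: pack a classification into a short ASCII payload."""
    return f"{label},{confidence:.3f}".encode("ascii")


def to_mqtt(payload: bytes) -> tuple[str, str]:
    """Gateway side: turn a LoRa payload into an (MQTT topic, JSON message) pair."""
    label, confidence_text = payload.decode("ascii").split(",")
    if label not in VALID_LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    message = json.dumps({"status": label, "confidence": float(confidence_text)})
    return TOPIC, message
```

Keeping the LoRa payload short limits airtime, while JSON on the MQTT side is straightforward for Home Assistant to parse with a value template.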
Results
Live Testing
The system was deployed and tested with real ocean water. It provides stable classifications, with confidence indicators displayed on the Home Assistant dashboard.
Laboratory Validation
Final sample sent to laboratory for verification:
- Model prediction: SAFE (99.6% confidence)
- Lab result: 10 CFU/100mL
- Match: Yes ✓
The model correctly classified the water as safe, confirmed by laboratory analysis showing bacteria levels well below the 104 CFU/100mL threshold.
Conclusions
This project shows that variables such as bacteria levels in water, which these sensors cannot measure directly, can be estimated by using sensor fusion and machine learning to correlate measurable parameters with them.
I acknowledge the limitations of a small dataset, and this model isn't ready for production use. Still, the project demonstrates that the approach is feasible and could become practical with a larger dataset, additional sensors, and environmental metadata.
Future Work
- Add rain sensor - Rainfall significantly affects bacteria levels
- Add salinity sensor - Could improve classification accuracy
- Migrate to LoRaWAN - Use The Things Network (TTN) for better coverage
- Waterproof buoy enclosure - Enable actual ocean deployment
- Larger dataset - More samples across different seasons and conditions
- Multi-class classification - Safe / Caution / Unsafe / Closure levels
- Regression model - Predict actual CFU/100mL counts instead of binary classification
Links
- GitHub: https://github.com/burgueishon/TinyML-Water-Bacteria-Classification
- Demo Video: https://youtu.be/I6PYruD1Y70
Author
- Luis Burgos
- Boston, Massachusetts
Built for the Edge Impulse 2025 Hackathon - Edge AI Application Track
Dataset summary
- Data collected: 2h 58m 58s
- Sensors: temp_c, amb_temp_c, turb_v, tds_v, ph_v @ 1 Hz
- Labels: safe, unsafe
Project info
| Field | Value |
|---|---|
| Project ID | 829967 |
| Project version | 1 |
| License | 3-Clause BSD |
| No. of views | 343 |
| No. of clones | 0 |