Luis Burgos / Beach Water Safety: Enterococci Bacteria Classifier Public

Beach Water Safety: Enterococci Bacteria Classifier

Classifies ocean water as safe/unsafe for swimming by detecting enterococci bacteria using multi-sensor fusion (temp, turbidity, TDS, pH) on ESP32-S3 with Edge Impulse. Designed for remote beach safety alerts.

Tags: water quality • bacteria detection • public health • environmental monitoring • hackathon 2025 • imagine 2025 • beach safety • enterococci • enterococcus • sensor fusion

About this project

Ocean Water Bacteria Contamination Detection - ESP32-S3

Remote beach water quality classification based on enterococci levels using sensor fusion and LoRa

Beach Setup

Project Overview

This Edge AI system, trained with Edge Impulse, classifies ocean water quality by inferring enterococci bacteria levels through sensor fusion. Built for the Edge Impulse 2025 Hackathon, the device runs real-time inference on an ESP32-S3 and transmits each classification via LoRa to an MQTT gateway, which forwards it to Home Assistant. There the result appears on a dashboard and is available to voice assistant queries. Motivated by frequent beach closures and the 24-hour delay for lab results, the project aims to deliver immediate, remote assessments of swimming safety.

Demo Video: https://youtu.be/I6PYruD1Y70

GitHub Repository: https://github.com/burgueishon/TinyML-Water-Bacteria-Classification

Technologies

Hardware: ESP32-S3 • ESP32-WROOM-32 • RA-02 LoRa • DS18B20 • SHT41 • SEN0189 • SEN0244 • SEN0161

Platforms: Edge Impulse Studio • Arduino IDE • Home Assistant

Technologies: TinyML / Edge AI • LoRa 433 MHz • MQTT • WiFi

Languages: C++ • Python

The Problem

Traditional beach water quality testing has significant limitations:

  • 24-hour delay for lab results
  • Weekly sampling misses contamination events between tests
  • Manual collection requires personnel at each location

The Solution

A TinyML device that provides:

  • Real-time classification (safe/unsafe)
  • Continuous monitoring capability
  • Remote operation via LoRa

Problem vs Solution

Hardware

Sensor Node (ESP32-S3)

  • Water Temperature: DS18B20 (OneWire)
  • Ambient Temperature: SHT41 (I2C)
  • Turbidity: SEN0189 (ADC + voltage divider)
  • TDS: SEN0244 (ADC)
  • pH: SEN0161 (ADC)
  • Communication: RA-02 LoRa 433 MHz
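The analog sensors above reach the model as voltages (turb_v, tds_v, ph_v), so each raw ADC count has to be scaled to volts, and the turbidity channel also has to undo its voltage divider. A minimal sketch of that conversion follows. It assumes an ideal 12-bit ADC with a 3.3 V full scale, and the resistor values are hypothetical since the project does not state them. A real ESP32-S3 ADC is not perfectly linear and would need calibration.

```python
ADC_MAX = 4095     # full-scale count of a 12-bit ADC (assumed resolution)
ADC_REF_V = 3.3    # assumed full-scale voltage at the ADC pin


def adc_to_volts(raw: int, divider_ratio: float = 1.0) -> float:
    """Convert a raw ADC count to the voltage at the sensor output.

    divider_ratio is R2 / (R1 + R2) for a divider placed between the sensor
    and the ADC pin. A ratio of 1.0 means no divider is used.
    """
    if not 0 <= raw <= ADC_MAX:
        raise ValueError(f"raw ADC value out of range: {raw}")
    pin_volts = raw / ADC_MAX * ADC_REF_V
    return pin_volts / divider_ratio  # undo the attenuation of the divider


# Hypothetical divider: R1 = 10 kOhm on top, R2 = 20 kOhm to ground.
TURBIDITY_DIVIDER = 20_000 / (10_000 + 20_000)

turb_v = adc_to_volts(2048, TURBIDITY_DIVIDER)  # turbidity, via divider
tds_v = adc_to_volts(1024)                      # TDS, read directly
```

Whether the model was trained on the pin voltage or on the reconstructed sensor voltage is not stated in the project; the same convention has to be used for training and inference.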

LoRa Sender Connections

Gateway (ESP32-WROOM-32)

  • LoRa receiver
  • WiFi/MQTT bridge to Home Assistant

LoRa Receiver Connections

Dataset

Collection Process

  • Location: Revere Beach, Massachusetts
  • Period: October 23 - November 13, 2024
  • Laboratory: Biomarine Research Corporation (Gloucester, MA)
  • Threshold: 104 CFU/100mL (Massachusetts standard)

Samples

Sample | Date   | CFU/100mL | Classification | Status
01     | Oct 23 | 74        | SAFE           | Used
02     | Oct 27 | <10       | SAFE           | Partial (turbidity issue)
03     | Oct 28 | 41        | SAFE           | Excluded (turbidity failed)
04     | Oct 29 | 30        | SAFE           | Excluded (turbidity failed)
05     | Nov 4  | 531       | UNSAFE         | Used
06     | Nov 5  | 31        | SAFE           | Used
07     | Nov 6  | 10        | SAFE           | Used
08     | Nov 11 | 31        | SAFE           | Used
09     | Nov 12 | 165       | UNSAFE         | Used
10     | Nov 13 | 94        | SAFE           | Used
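The labels in the sample table follow from comparing each lab count with the 104 CFU/100mL threshold. A minimal sketch of that labeling rule is shown here. Two points are assumptions rather than project details: a count exactly at the threshold is treated as safe, and a below-detection result such as "<10" is parsed as its upper bound.

```python
THRESHOLD_CFU = 104  # CFU/100mL, the Massachusetts standard cited above


def parse_cfu(reported: str) -> float:
    """Parse a lab-reported count; '<10' is treated as an upper bound of 10."""
    text = reported.strip()
    if text.startswith("<"):
        return float(text[1:])
    return float(text)


def label_sample(reported: str) -> str:
    # Assumption: only counts strictly above the threshold are unsafe.
    return "unsafe" if parse_cfu(reported) > THRESHOLD_CFU else "safe"


# A few lab results taken from the sample table.
lab_results = {"01": "74", "02": "<10", "05": "531", "09": "165", "10": "94"}
labels = {sample: label_sample(cfu) for sample, cfu in lab_results.items()}
```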

Final Dataset

  • Total: 36 CSV files
  • Safe: 26 files (72%)
  • Unsafe: 10 files (28%)
  • Sampling rate: 1 Hz
  • Duration: 60 readings per file

Model

Input Features (5)

  1. temp_c - Water temperature (°C)
  2. amb_temp_c - Ambient temperature (°C)
  3. turb_v - Turbidity voltage (V)
  4. tds_v - TDS voltage (V)
  5. ph_v - pH voltage (V)
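To make the input layout concrete, the sketch below reads one recording into rows of the five features, in the order listed above. The CSV layout (a timestamp column in milliseconds followed by the five feature columns) is an assumption, and the values in the example are made up for illustration.

```python
import csv
import io

FEATURES = ["temp_c", "amb_temp_c", "turb_v", "tds_v", "ph_v"]


def load_readings(csv_text: str) -> list[list[float]]:
    """Return one [temp_c, amb_temp_c, turb_v, tds_v, ph_v] row per reading."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [name for name in FEATURES if name not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return [[float(row[name]) for name in FEATURES] for row in reader]


# Made-up readings at 1 Hz; these are not values from the project's dataset.
sample_csv = """timestamp,temp_c,amb_temp_c,turb_v,tds_v,ph_v
0,12.5,9.8,3.91,0.42,1.95
1000,12.5,9.8,3.90,0.42,1.96
"""

rows = load_readings(sample_csv)
```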

Architecture

  • Processing: Raw Data block
  • Learning: Neural Network Classifier
  • Output: Binary classification (safe/unsafe)

Training Results

Two experiments were compared:

Experiment        | Accuracy | F1 (safe) | F1 (unsafe) | ROC AUC
Spectral Features | 69.1%    | 0.78      | 0.48        | 0.63
Raw Data          | 99.4%    | 1.00      | 0.99        | 0.99

Raw Data outperformed Spectral Features on every metric: accuracy rose from 69.1% to 99.4%, and the F1 score for the unsafe class rose from 0.48 to 0.99, indicating high precision and recall for both classes.
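With roughly 72% of the files labeled safe, accuracy alone can hide weak performance on the unsafe class, which is why per-class F1 is reported. The sketch below shows how accuracy and both F1 scores follow from confusion-matrix counts. The counts are hypothetical, chosen only to illustrate the arithmetic, and are not the project's results.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 is the harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binary_report(safe_as_safe: int, safe_as_unsafe: int,
                  unsafe_as_safe: int, unsafe_as_unsafe: int) -> dict[str, float]:
    """Accuracy plus per-class F1 from the four confusion-matrix cells."""
    total = safe_as_safe + safe_as_unsafe + unsafe_as_safe + unsafe_as_unsafe
    return {
        "accuracy": (safe_as_safe + unsafe_as_unsafe) / total,
        "f1_safe": f1_score(tp=safe_as_safe, fp=unsafe_as_safe, fn=safe_as_unsafe),
        "f1_unsafe": f1_score(tp=unsafe_as_unsafe, fp=safe_as_unsafe, fn=unsafe_as_safe),
    }


# Hypothetical counts for an imbalanced test set (100 safe, 50 unsafe windows).
report = binary_report(safe_as_safe=90, safe_as_unsafe=10,
                       unsafe_as_safe=20, unsafe_as_unsafe=30)
```

In this made-up case the accuracy is 80% while the F1 score for the unsafe class is only about 0.67, the same kind of gap seen in the Spectral Features row.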

Note: The default 67/33 train-test split raises a warning due to the limited dataset size; results are indicative and not production-ready.

Model Results

Resource Usage

  • Inference time: <10 ms on ESP32-S3
  • Flash usage: ~27 KB
  • RAM usage: ~368 bytes

Deployment

Inference Stabilization

Raw inference output is stabilized with rolling majority voting:

  • Window of 5 consecutive inferences
  • Final result = majority vote
  • Confidence = averaged scores
  • Filters out single-inference noise
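The voting scheme above can be sketched in a few lines. This is an illustrative Python version, not the project's firmware. It assumes that "averaged scores" means the mean score of the winning label across the window, and that a tie while the window is still filling goes to the label seen first.

```python
from collections import Counter, deque


class RollingVote:
    """Majority vote over the most recent `window` inferences."""

    def __init__(self, window: int = 5):
        self.history = deque(maxlen=window)  # oldest entry drops out automatically

    def update(self, scores: dict[str, float]) -> tuple[str, float]:
        """Add one inference and return (majority label, averaged confidence)."""
        self.history.append(scores)
        # Each stored inference votes for its highest-scoring label.
        votes = Counter(max(s, key=s.get) for s in self.history)
        label = votes.most_common(1)[0][0]
        # Assumption: confidence is the winning label's mean score in the window.
        confidence = sum(s[label] for s in self.history) / len(self.history)
        return label, confidence


vote = RollingVote(window=5)
stream = [
    {"safe": 0.9, "unsafe": 0.1},
    {"safe": 0.8, "unsafe": 0.2},
    {"safe": 0.2, "unsafe": 0.8},  # a single noisy inference
    {"safe": 0.7, "unsafe": 0.3},
    {"safe": 0.9, "unsafe": 0.1},
]
results = [vote.update(scores) for scores in stream]
```

In this example the single "unsafe" inference never flips the reported label, although it does lower the averaged confidence while it remains in the window.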

Rolling Majority Voting

System Integration

  • Sensor Node: ESP32-S3 with Edge Impulse model, transmits via LoRa
  • MQTT Gateway: ESP32-WROOM-32, bridges LoRa to WiFi/MQTT
  • Home Assistant: Dashboard with color-coded status, voice assistant queries (ESP32-S3-BOX with Claude)
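The hand-off between the sensor node and the gateway amounts to encoding a classification into a small LoRa packet and re-publishing it as an MQTT message. The sketch below illustrates that bridge logic only. The packet format, the topic name, and the JSON field names are all hypothetical, since the project page does not document them, and the actual radio and MQTT client calls are left out.

```python
import json

MQTT_TOPIC = "beach/water_quality"  # hypothetical topic name
VALID_LABELS = ("safe", "unsafe")


def encode_packet(label: str, confidence: float) -> bytes:
    """Sensor node side: pack the result into a short ASCII LoRa payload."""
    return f"{label},{confidence:.3f}".encode("ascii")


def packet_to_mqtt(packet: bytes) -> tuple[str, str]:
    """Gateway side: turn a received LoRa payload into (topic, JSON payload)."""
    label, confidence = packet.decode("ascii").split(",")
    if label not in VALID_LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    payload = json.dumps({"status": label, "confidence": float(confidence)})
    return MQTT_TOPIC, payload


topic, payload = packet_to_mqtt(encode_packet("safe", 0.996))
```

A text payload keeps the packet easy to debug over a serial console; a binary encoding would be smaller if airtime matters.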

Home Assistant Dashboard

Results

Live Testing

The system was deployed and tested with real ocean water. It produced stable classifications, with confidence indicators displayed on the Home Assistant dashboard.

Live Test Setup • Live Test Setup - Water • Live Test Setup - Gateway • Live Test Dashboard

Laboratory Validation

A final sample was sent to the laboratory for verification:

  • Model prediction: SAFE (99.6% confidence)
  • Lab result: 10 CFU/100mL
  • Match: Yes ✓

The model correctly classified the water as safe, confirmed by laboratory analysis showing bacteria levels well below the 104 CFU/100mL threshold.

Lab Validation

Conclusions

This project shows that variables sensors can't measure directly, such as bacteria levels in water, can be estimated by using sensor fusion and machine learning to correlate them with measurable parameters.

I acknowledge the limitations of a small dataset: this model isn't ready for production use. Still, the project demonstrates that the approach is feasible given a larger dataset, additional sensors, and environmental metadata.

Future Work

  • Add rain sensor - Rainfall significantly affects bacteria levels
  • Add salinity sensor - Could improve classification accuracy
  • Migrate to LoRaWAN - Use The Things Network (TTN) for better coverage
  • Waterproof buoy enclosure - Enable actual ocean deployment
  • Larger dataset - More samples across different seasons and conditions
  • Multi-class classification - Safe / Caution / Unsafe / Closure levels
  • Regression model - Predict actual CFU/100mL counts instead of binary classification

Author

  • Luis Burgos
  • Boston, Massachusetts

Built for the Edge Impulse 2025 Hackathon - Edge AI Application Track

Dataset summary

  • Data collected: 2h 58m 58s
  • Sensors: temp_c, amb_temp_c, turb_v, tds_v, ph_v @ 1 Hz
  • Labels: safe, unsafe

Project info

  • Project ID: 829967
  • Project version: 1
  • License: 3-Clause BSD