ThingTank · TotTalk Box

Object detection + TotTalk speech processing

About this project

TotTalk helps toddlers learn to speak by pairing object recognition with real-time pronunciation coaching — everything runs offline on affordable hardware.

Demo

Video demo: main.png

Implementation link: https://github.com/PrinceP/tottalk-box

Overview

TotTalk uses the world around a toddler to introduce vocabulary, reinforce pronunciation, and keep learning fun. When a child presents an everyday object, the box recognizes it, introduces the word, listens to the child's attempt, and provides gentle feedback.

Everything runs locally on a Qualcomm RubikPi 3 so families keep complete control over their data. Vision, audio, UI, and speech feedback all execute on-device without any cloud dependency.

Core Workflow

  1. The child shows any toy or household item to the camera.
  2. TotTalk identifies the object, announces the word, and prompts the child to repeat it.
  3. Whisper-based speech recognition monitors pronunciation, offers retries when needed, and celebrates correct repetitions (a minimal matching sketch follows below).

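As a rough illustration of step 3, the feedback decision can be as simple as comparing the transcribed attempt against the target word. The snippet below is a minimal sketch; the helper name, similarity threshold, and response phrasing are assumptions, not TotTalk's actual logic.

# Minimal sketch of the repeat-after-me check (hypothetical helper, illustrative threshold).
from difflib import SequenceMatcher

def check_attempt(target: str, transcript: str, threshold: float = 0.8) -> str:
    """Compare a transcribed attempt with the target word and choose a gentle response."""
    attempt = transcript.strip().lower()
    if not attempt:
        return "I didn't hear you. Can you try again?"
    score = SequenceMatcher(None, target.lower(), attempt).ratio()
    if score >= threshold:
        return f"Great job! You said {target}!"
    return f"Almost! Let's try saying {target} again."

print(check_attempt("elephant", "ellafant"))  # a close-but-off attempt earns a gentle retry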

Highlights

  • Real-world vocabulary: Builds a personalized dictionary around objects the child already loves.
  • Active engagement: Encourages movement and call-and-response instead of passive screen time.
  • Robust toddler speech handling: Whisper.cpp pipeline tolerates silence, mumbles, and multiple attempts within edge constraints (a minimal invocation sketch follows this list).
  • Low-latency inference: Computer vision, audio processing, and UI run together on the RubikPi 3.
  • Completely offline: No cloud services—vision, transcription, and feedback logic all execute locally for privacy.
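
For the listening step, one way to run Whisper on-device is through the whisper.cpp command-line example. The sketch below calls it from Python; the binary and model paths are assumptions and should point at wherever whisper.cpp is built on the board.

# Hypothetical sketch: transcribe a recorded attempt with the whisper.cpp CLI.
# Paths are assumptions; point them at your whisper.cpp build and ggml model.
import subprocess

WHISPER_BIN = "/home/ubuntu/whisper.cpp/main"
WHISPER_MODEL = "/home/ubuntu/whisper.cpp/models/ggml-base.en.bin"

def transcribe(wav_path: str) -> str:
    """Run whisper.cpp on a 16 kHz mono WAV file and return the transcribed text."""
    result = subprocess.run(
        [WHISPER_BIN, "-m", WHISPER_MODEL, "-f", wav_path, "-nt"],  # -nt: no timestamps
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

print(transcribe("attempt.wav"))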

Edge Impulse Development Workflow

We rely on Edge Impulse to capture data, design the computer vision impulse, and deploy an optimized model to the RubikPi 3.

1. Prepare the Project & Dataset

  • Sign in to Edge Impulse Studio and create a project dedicated to TotTalk object recognition.
  • Connect the device with edge-impulse-linux to stream camera frames directly into the Data acquisition tab.
  • Label each sample so that every object class remains balanced; the studio supports bounding-box and image labeling workflows.
  • Maintain an 85% / 15% train-test split to keep hold-out data for unbiased evaluation.

Reference: Edge Impulse data acquisition and labeling tools.

2. Design the Impulse

  • In the Impulse design view, add an Image processing block to resize incoming frames (e.g., 96×96 RGB) and enable automatic normalization (a rough preprocessing sketch follows the reference below).
  • Add a Transfer Learning learning block; MobileNetV2/ResNet-based backbones usually offer a strong accuracy-to-latency balance on the RubikPi 3.
  • Configure data augmentation (random flip, crop, color shift) so the model generalizes to varied backgrounds and lighting.

Reference: Edge Impulse impulse design.
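
Conceptually, the Image block does something like the following to each frame before it reaches the learning block. This is a rough sketch, not Edge Impulse's actual implementation: it assumes a "squash"-style resize and 0-1 pixel scaling, and the TotTalk impulse itself uses 320×320 (see Impulse Design below).

# Rough, conceptual equivalent of the Image processing block (not Edge Impulse code).
import cv2
import numpy as np

def preprocess(frame_bgr, size=(96, 96)):
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)   # camera frames arrive as BGR
    resized = cv2.resize(rgb, size)                     # assumes "squash"-style resizing
    return resized.astype(np.float32) / 255.0           # scale pixels to the 0-1 range

# Example: preprocess(cv2.imread("sample.jpg"), size=(320, 320))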

3. Train & Evaluate the Model

  • Launch training with appropriate hyperparameters (learning-rate scheduler, 20–50 epochs, early stopping).
  • Use the Model testing tab to validate accuracy on the held-out dataset and inspect the per-class confusion matrix.
  • Iterate on data balance, augmentation, or impulse configuration when misclassifications appear.

Reference: Edge Impulse model training and model testing.

4. Deploy & Integrate

  • Generate a Linux eim package or TensorFlow Lite file from the Deployment tab.
  • Install the Edge Impulse Linux SDK for Python to run inference alongside the TotTalk UI.
  • Stream inference results over HTTP or pass them directly into the TotTalk feedback loop for speech prompts.

Reference: Edge Impulse Linux deployment and Linux SDK for Python.

Dataset Summary

We currently track 256 labeled images across 20 object classes. 85% of the samples train the model and 15% remain in the testing set.

Label Emoji
jcb 🚜
monkey 🐒
bat 🦇
bicycle 🚲
candle 🕯️
car 🚗
chair 🪑
comb 💇
elephant 🐘
glass 🥛
green ball 🟢
guitar 🎸
helmet ⛑️
kiwi 🥝
minion 🤖
octopus 🐙
slippers 🥿
stool 🪑
tiger 🐯
tortoise 🐢

Hardware Setup

Bill of Materials

  • RubikPi 3 (Qualcomm QCS6490 SoC)
  • 7″ LCD display (1024×600)
  • Two 5 W speakers
  • Logitech C270 HD webcam (640×480 with integrated microphone)


Device Preparation

Flash Canonical Ubuntu 24.04 using Qualcomm Launcher

Qualcomm® Launcher (see Thundercomm documentation) streamlines flashing Canonical Ubuntu 24.04 Server onto the RubikPi 3. Follow the official walkthrough to install Renesas USB firmware and replace the stock image.

Upgrade to the Latest Canonical Ubuntu Build

Update the board to the most recent certified packages:

sudo apt update
sudo apt upgrade -y
git clone -b ubuntu_setup --single-branch https://github.com/rubikpi-ai/rubikpi-script.git
cd rubikpi-script
./install_ppa_pkgs.sh

The helper script installs:

gstreamer1.0-plugins-base-apps, gstreamer1.0-qcom-python-examples, gstreamer1.0-qcom-sample-apps,
gstreamer1.0-tools, libqnn-dev, libsnpe-dev, qcom-adreno1, qcom-fastcv-binaries-dev,
qcom-libdmabufheap-dev, qcom-sensors-test-apps, qcom-video-firmware, qnn-tools, snpe-tools,
tensorflow-lite-qcom-apps, weston-autostart, xwayland, Rubikpi3 camera packages, wiringrp, wiringrp_python,
and tooling such as ffmpeg, net-tools, pulseaudio-utils, python3-pip, selinux-utils, unzip, v4l-utils.

Validate the Platform

cat /etc/os-release
uname -a

Expected output confirms Ubuntu 24.04.2 LTS and the Qualcomm-specific kernel (Linux ubuntu 6.8.0-1055-qcom ...).

Install the Edge Impulse CLI

wget https://cdn.edgeimpulse.com/firmware/linux/setup-edge-impulse-qc-linux.sh
sh setup-edge-impulse-qc-linux.sh
edge-impulse-linux

Follow the browser link presented in the terminal to authenticate the device with Edge Impulse Studio.

Software Installation

Install Drivers, AI Engine Direct, and the IM-SDK

  1. Base tooling:
sudo apt update
sudo apt install -y unzip wget curl python3 python3-pip python3-venv software-properties-common
  2. Qualcomm AI Engine Direct SDK and GStreamer components:
if [ ! -f /etc/apt/sources.list.d/ubuntu-qcom-iot-ubuntu-qcom-ppa-noble.list ]; then
    sudo apt-add-repository -y ppa:ubuntu-qcom-iot/qcom-ppa
fi

sudo apt update
sudo apt install -y gstreamer1.0-tools gstreamer1.0-plugins-good gstreamer1.0-plugins-base \
    gstreamer1.0-plugins-base-apps gstreamer1.0-plugins-qcom-good gstreamer1.0-qcom-sample-apps \
    libqnn1 libsnpe1 libqnn-dev libsnpe-dev
  3. OpenCL GPU drivers:
sudo apt update
sudo apt install -y clinfo qcom-adreno1

if [ ! -f /usr/lib/libOpenCL.so ]; then
    sudo ln -s /lib/aarch64-linux-gnu/libOpenCL.so.1.0.0 /usr/lib/libOpenCL.so
fi

sudo reboot
# After the board comes back up, verify the GPU driver:
clinfo
# Expected:
#   Number of platforms: 1
#   Platform Name: QUALCOMM Snapdragon(TM)
#   Platform Version: OpenCL 3.0 QUALCOMM build: 0808.0.7

Visual References

The screenshots below are accompanied by the steps needed to reproduce them in Edge Impulse Studio.

Data Collection

Data Setup1

Here we can see the devices connected to Edge Impulse Studio; a phone was used to capture the data directly into the studio.

The Data acquisition tab shows the collected data: 256 items, each annotated with a bounding box. The data distribution view gives a quick overview of the dataset and highlights any class imbalance.

The labeling UI is intuitive and easy to use; the screenshot below shows data labeling in action.

Data Setup2

Impulse Design

Design the impulse (signal processing + learning block) that powers detection:

  • Go to Impulse design → Create impulse.
  • Click Add a processing block and choose Image (preprocess + normalize).
  • Click Add a learning block and choose Object Detection (Images).
  • Set image size to 320×320 and click Save impulse.

Impulse

Feature Extraction

  • Navigate to Impulse design → Image.
  • Set Color depth to RGB and click Save parameters.
  • On the next page, click Generate features. This typically takes a few minutes.

RAMusage

Model Training

  • Go to Impulse design → Object Detection.
  • Advanced training settings: No color space augmentation (to preserve colored object cues).
  • Choose the latest YOLO‑Pro model and click Save & train.
  • After training, review the metrics and confusion matrix. In our run we observed ~97% precision on the training set (results vary by dataset).

Testing

Model Deployment

  • Open the Deployment tab.
  • Select Linux (AARCH64 with Qualcomm QNN) to run on RubikPi 3’s Qualcomm AI accelerator.
  • Model optimizations: Quantized (int8), since float32 is not supported for this target.
  • Click Build to compile and download the EIM (Edge Impulse Model) binary.

Deployment

Application

We integrate the Edge Impulse Linux SDK for Python to run inference on webcam frames and feed detections into the TotTalk feedback loop:
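
A minimal sketch of that wiring is shown below. It assumes the SDK is installed (it is published on PyPI as edge_impulse_linux) and that the EIM file built in the Deployment step is on the device; the model path and confidence threshold are placeholders, and the real application lives in class_gallery.py.

# Minimal sketch, not the actual class_gallery.py: run the EIM object detector on
# webcam frames and hand confident detections to the speech-feedback loop.
import cv2
from edge_impulse_linux.image import ImageImpulseRunner

MODEL_PATH = "modelfile.eim"   # placeholder path to the EIM built in the Deployment tab

with ImageImpulseRunner(MODEL_PATH) as runner:
    runner.init()
    cap = cv2.VideoCapture(0)                              # Logitech C270 webcam
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)   # the SDK expects RGB frames
            features, _cropped = runner.get_features_from_image(rgb)
            result = runner.classify(features)
            for box in result["result"].get("bounding_boxes", []):
                if box["value"] >= 0.6:                    # placeholder confidence threshold
                    print(f"Detected {box['label']} ({box['value']:.2f})")
                    # here the label would be handed to the prompt/listen/feedback loop
    finally:
        cap.release()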

Application Setup

Create an isolated environment and install TotTalk dependencies:

python3 -m venv .venv-totalk-box --system-site-packages
source .venv-totalk-box/bin/activate
pip3 install ai-edge-litert==1.3.0 Pillow
pip3 install opencv-python

Install additional system packages:

sudo apt install python3-gi python3-gi-cairo gir1.2-gtk-3.0
sudo apt install python3-venv python3-full
sudo apt install -y pkg-config cmake libcairo2-dev
sudo apt install libgirepository1.0-dev gir1.2-glib-2.0
sudo apt install build-essential python3-dev python3-pip pkg-config meson
sudo apt install fonts-noto-color-emoji
sudo apt install pulseaudio pulseaudio-utils
sudo apt install espeak-ng   # offline text-to-speech for spoken prompts (see the sketch below)
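
espeak-ng supplies the offline voice prompts. As a small illustration, a prompt could be spoken from Python like this; the voice and speed flags are illustrative choices, not TotTalk's exact settings.

# Hypothetical sketch: speak a prompt offline with espeak-ng through PulseAudio.
import subprocess

def say(text: str) -> None:
    """Synthesize a prompt through the board's speakers (flags are illustrative)."""
    subprocess.run(["espeak-ng", "-v", "en-us", "-s", "140", text], check=True)

say("Can you say elephant?")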

Running the Application

python3 class_gallery.py

The UI launches on the RubikPi 3 display, listens for camera events, and streams inference results into the speech feedback loop.

Troubleshooting

Encountering an issue? Capture logs, hardware info, and reproduction steps, then open an issue in the repository.

FAQ

For common questions, start a discussion or open a question in the repository.

Contributing

Pull requests are welcome! Please fork the project, create a descriptive branch, and submit a PR once your changes are ready.


Project info

Project ID: 794068
License: 3-Clause BSD