LLM Arduino Robot

Demos

Detect sound + send audio + run STT (speech-to-text): listen_and_run_stt.mp4
move_motors.ino (test sketch with predefined movement): move_motors.arduino.mp4
move_servos.ino (test sketch with predefined movement): move_servos.arduino.mp4

How it's made

Connections: Fritzing diagram
Components: List
Chassis and motors: Step-by-step instructions
Vision and eye movement: Step-by-step instructions
Audio capture and playback: Step-by-step instructions
ESP32-CAM frame rate study: Quality-size fps study
ESP32-CAM calibration, distortion correction and rectification: Fisheye lens setup

Description

[Missing description]

To run the project:

arduino/: Arduino test and production sketches
esp32/: ESP32 (-CAMs, -WROVER) test and production sketches
computer/: Computer test and production scripts
requirements.txt: Project dependencies

To help understand/use the project:

drawio/: Drawio flowcharts to understand the inner workings of the robot
fritzing/: Cable connections in Fritzing
guides/: Step-by-step guides for the construction of the robot
images/: Images used in the guides
LICENSE: Project license
README.md: Overview of the project
several-esp32-model-pinouts.txt: Collection of tested ESP32-CAM pinouts (use the one matching your camera in esp32/cam/XXXX-production.ino)
venv-pip-install.txt: Collection of pip install commands for manual / machine-dependent dependency installation
tts-constraints.txt: Pinned TTS dependency versions for quicker installation

To help improve the docs on the robot-building process:

tools/: Shell scripts to compress videos, etc. before pushing to GitHub

Technical Overview

Robot Components

Vision:

2x ESP32-CAM (with OV2640 camera and async web server) to send each eye's frame to the computer (upon request)

Audio:

Input: KY-037 sound sensor (sensitivity adjustable via its potentiometer) triggers an INMP441 I2S microphone recording lasting RECORDING_DURATION_MS (e.g., 5000 ms). The ESP32-WROVER sends the audio to the computer via WebSockets. Recording progress (e.g. "Listening (3s)...") is shown on an SSD1306 128x64 I2C OLED screen
Output: Speaker with MAX98357A amplifier for audio playback. Audio (from Coqui.ai's text-to-speech conversion) sent from computer, received by ESP32-WROVER, and forwarded to MAX98357A
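
To give a feel for how much data one recording involves, here is a minimal sketch. Only RECORDING_DURATION_MS (5000 ms) comes from the description above; the sample rate, the sample width, and the idea that the OLED message counts down the remaining seconds are assumptions made for illustration, not values taken from the sketches.

```python
# Only RECORDING_DURATION_MS comes from the description above.
RECORDING_DURATION_MS = 5000   # example duration from the description
SAMPLE_RATE_HZ = 16_000        # assumption: a common sample rate for speech
BYTES_PER_SAMPLE = 2           # assumption: 16-bit mono PCM


def recording_size_bytes(duration_ms: int, sample_rate_hz: int, bytes_per_sample: int) -> int:
    """Return the number of bytes of raw PCM audio in one recording."""
    samples = sample_rate_hz * duration_ms // 1000
    return samples * bytes_per_sample


def seconds_remaining(elapsed_ms: int, duration_ms: int) -> int:
    """Whole seconds left, rounded up (assumed meaning of 'Listening (3s)...')."""
    remaining_ms = max(duration_ms - elapsed_ms, 0)
    return -(-remaining_ms // 1000)  # ceiling division


size = recording_size_bytes(RECORDING_DURATION_MS, SAMPLE_RATE_HZ, BYTES_PER_SAMPLE)
print(f"{size} bytes per recording")
print(f"Listening ({seconds_remaining(2000, RECORDING_DURATION_MS)}s)...")
```

Under these assumptions a 5-second recording is a few hundred kilobytes at most, which is why it is sent in chunks over a socket rather than held as a single message.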

Mobility:

ESP32-WROVER server receives commands
Arduino Uno forwards commands to:
- L298N motor driver (for wheel movement)
- 2x SG-90 servos (for up/down/left/right eye movement)

Computer Components

Visual Processing:

YOLOv8 (You Only Look Once) to detect objects - potentially obstacles (if both frames are available)
SGBM (Semi-Global Block Matching) to estimate object depth (if both frames are available)
DeepFace to recognize interlocutor's face (if at least 1 frame is available)
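
Block matching such as SGBM produces a disparity (the horizontal pixel shift of a point between the two rectified frames), and depth then follows from the standard stereo relation Z = f * B / d. The sketch below shows only that last conversion step; the function name and the example numbers (focal length, baseline) are hypothetical and not taken from the project's calibration.

```python
from typing import Optional


def depth_from_disparity(
    disparity_px: float, focal_length_px: float, baseline_m: float
) -> Optional[float]:
    """Convert a disparity in pixels to a depth in meters (Z = f * B / d).

    Returns None when the disparity is zero or negative, which means the
    point could not be matched or is effectively at infinity.
    """
    if disparity_px <= 0:
        return None
    return focal_length_px * baseline_m / disparity_px


# Hypothetical calibration values, for illustration only.
FOCAL_LENGTH_PX = 400.0   # assumption: focal length of the rectified cameras
BASELINE_M = 0.06         # assumption: distance between the two eyes

for disparity in (48.0, 24.0, 12.0, 0.0):
    print(disparity, depth_from_disparity(disparity, FOCAL_LENGTH_PX, BASELINE_M))
```

Halving the disparity doubles the estimated depth, so distant objects are measured with less precision than nearby ones. If OpenCV's StereoSGBM is the implementation in use (an assumption here), note that it returns disparities as fixed-point values scaled by 16, which would need to be divided by 16 before applying the formula.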

Audio Processing:

WebSockets receive audio from ESP32-WROVER
Whisper for speech-to-text transcription, discarding transcriptions with fewer words than MIN_WORDS_THRESHOLD
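
The word-count filter can be sketched as follows. MIN_WORDS_THRESHOLD is named in the text, but its value (3 here) and the helper name keep_transcription are assumptions for illustration; the idea is simply to ignore very short transcriptions, which are often noise picked up by the sound sensor.

```python
MIN_WORDS_THRESHOLD = 3  # assumption: the project's actual value is not stated here


def keep_transcription(text: str, min_words: int = MIN_WORDS_THRESHOLD) -> bool:
    """Return True when the transcription has at least min_words words."""
    return len(text.split()) >= min_words


for transcription in ("", "uh", "hello there robot", "  what can you see?  "):
    status = "keep" if keep_transcription(transcription) else "discard"
    print(f"{transcription!r}: {status}")
```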

Memory:

ChromaDB to retrieve relevant long-term memories before every LLM or LMM call

AI Processing:

LMM (Large Multimodal Model) to describe the view in a context-relevant way
LLM (Large Language Model) to decide what to speak and which parts to move

Setup

Computer

On the computer:

Clone (and cd into) the repository:

git clone https://github.com/Any-Winter-4079/LLM-Arduino-Robot.git
cd LLM-Arduino-Robot

Create a virtual environment:

python3.11 -m venv venv

Activate the virtual environment:

macOS:

source venv/bin/activate

Upgrade pip:

pip install --upgrade pip

Install all of the dependencies:

pip install -r requirements.txt

Note: requirements.txt is generated via: pip freeze > requirements.txt so you'll see dependencies of dependencies.

Alternatively, install the main dependencies manually with the commands in venv-pip-install.txt. This is recommended if you do not have an M-series (M1 / M2 / etc.) Mac, since some packages need a different build for your machine (e.g. you may want tensorflow rather than tensorflow-macos and tensorflow-metal, or want to skip the nightly version of PyTorch).

Additionally, tts-constraints.txt makes installing coqui-ai's TTS easier by pinning dependency versions. If pip install TTS hangs, try pip install TTS -c tts-constraints.txt

Robot

For the robot, install the Arduino IDE (for example, v2.3.2) on the computer, and then:

Install esp32 (for example, v2.0.11) by espressif from the Boards Manager (left-side menu).

Install ArduinoWebsockets (for example, v0.5.4) from the Library Manager (left-side menu).

Install FreeRTOS (for example, v11.1.0-3) from the Library Manager (left-side menu).

And then add the following GitHub libraries:

into XXXX/Arduino/libraries (for example, Users/you/Documents/Arduino/libraries/ on macOS). They provide the async web server used to send the images to the computer, which raises the frame rate from 1 fps to 37 fps at 320x240 resolution.

ESP32

On the ESP32 side, to allow the boards to communicate with your computer on your local network, replace:


const char* ssid1 = "***"; // Your network's name
const char* password1 = "***"; // Your network's password
IPAddress staticIP1(*, *, *, *); // Your ESP32's (desired) IP on the network
IPAddress gateway1(*, *, *, *); // Your router's local gateway IP

with your primary network (e.g. your home Wi-Fi) details in esp32/cam/XXXX-production.ino and esp32/wrover/production.ino.

Replace:


const char* ssid2 = "***";
const char* password2 = "***";
IPAddress staticIP2(*, *, *, *);
IPAddress gateway2(*, *, *, *);

with your secondary (backup) network (e.g. phone hotspot) details.

And in the case of the ESP32-WROVER, replace:


const char* websocket_server_host1 = "*.*.*.*";

with your primary network's computer IP (in esp32/wrover/production.ino).

And:


const char* websocket_server_host2 = "*.*.*.*";

with your backup network's computer IP.

Note: Make sure to assign a unique static IP to each ESP32 (e.g. 192.168.1.180 and 192.168.1.181 for your ESP32-CAMs and 192.168.1.182 for your ESP32-WROVER, with your computer at 192.168.1.174).
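
Before flashing, the chosen addresses can be sanity-checked on the computer. This is a minimal sketch using Python's standard ipaddress module; the helper name check_static_ips and the 192.168.1.0/24 network are assumptions (use your own router's subnet), while the example addresses are the ones from the note above.

```python
import ipaddress
from typing import Dict, List


def check_static_ips(device_ips: Dict[str, str], network_cidr: str) -> List[str]:
    """Return a list of problems: duplicate IPs or IPs outside the network."""
    problems = []
    network = ipaddress.ip_network(network_cidr)
    seen = {}
    for name, ip_text in device_ips.items():
        ip = ipaddress.ip_address(ip_text)
        if ip not in network:
            problems.append(f"{name} ({ip}) is outside {network}")
        if ip in seen:
            problems.append(f"{name} and {seen[ip]} share {ip}")
        else:
            seen[ip] = name
    return problems


devices = {
    "left ESP32-CAM": "192.168.1.180",
    "right ESP32-CAM": "192.168.1.181",
    "ESP32-WROVER": "192.168.1.182",
    "computer": "192.168.1.174",
}
# Assumption: a typical home network; replace with your own subnet.
issues = check_static_ips(devices, "192.168.1.0/24")
print(issues or "All addresses are unique and on the same network")
```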

Finally, for each of your 2 cameras (e.g. AiThinker, M5Stack Wide) and your WROVER (e.g. Freenove), flash the corresponding (modified) production sketch (i.e. esp32/cam/m5stackwide-production.ino, esp32/cam/aithinker-production.ino or esp32/wrover/production.ino), either over USB Type-C or via the VCC/GND/TX/RX pins, with the following Arduino IDE Tools setup:


Board: "ESP32 Dev Module"
Port: "/dev/cu.usbserial-110" (select your own)
CPU Frequency: "240MHz (WiFi/BT)"
Core Debug Level: "None"
Erase All Flash Before Sketch Upload: "Disabled"
Events Run On: "Core 1"
Flash Frequency: "80MHz"
Flash Mode: "QIO"
Flash Size: "4MB (32Mb)"
JTAG Adapter: "Disabled"
Arduino Runs On: "Core 1"
Partition Scheme: "Huge APP (3MB No OTA/1MB SPIFFS)"
PSRAM: "Enabled"
Upload Speed: "115200"

Arduino Uno

Lastly, on the Arduino side, flash arduino/production.ino over its USB Type-B port with the RX pin temporarily disconnected (a device wired to RX can interfere with the serial upload).

Usage

[Missing usage]

Quick communication diagrams

Motor (top left), Audio (top right), and Servo and Cameras (bottom left) communication diagrams (more details in the 'How it's made' guides)

License

[Missing license]
