In my last blog post on the robotic lawn mower, I mentioned that I had rewritten the old Python code using AI. Since then I have refactored the entire codebase and made significant changes to the hardware and the corresponding software. I had AI write the summary below, but I will let the videos speak for themselves.
The field test went well: the Nvidia Jetson Orin Nano can now control the lawn mower based on the control signals sent by the phone via an RF transmitter.
The summary below is AI-generated and has been reviewed for accuracy.
ZTM Auto-Mower Refactored — Project History
A chronological log of all challenges, approaches, and fixes encountered during the development and integration of the refactored auto-mower control system.
Phase 1: Code Refactoring & Architecture
1.1 Modular Redesign
Goal: Refactor the original monolithic auto-mower app into a modular design with separation of threads.
Approach: Discussed and designed a new architecture with dedicated modules:
- app/hardware/: DAC, relays, GPIO
- app/io/: NRF radio, Arduino calibration, USB detection
- app/control/: stick, motor, heartbeat, calibration training
- app/web/: Flask dashboard and API
- app/ai/: AI mowing (future)
Removed BLE and unused Arduino references. Added AI mowing feature stub to read predetermined GPS mapping with camera feedback.
1.2 Training Mode
Added a training mode to collect training data via GPS and cameras for future AI-assisted mowing.
Phase 2: Deployment & NRF Communication
2.1 Initial Deployment to Jetson Nano
Challenge: First deploy to the Nano via SSH/rsync. Had to establish remote workflow using sshpass with credentials for vivek@192.168.55.1.
Issues encountered:
- Connecting Mac to Nano over USB (192.168.55.1)
- Old app auto-starting and conflicting with new app
- Setting up systemd service to auto-start the refactored app
2.2 NRF Radio — No ACK
Challenge: After deploying the refactored app, the NRF radio would not send ACK responses back to the Android app. The heartbeat was not being received.
Approaches tried:
- Hard-coded USB port assignments via .env file: fragile, ports changed on reboot
- Auto-probing all USB ports: cameras and GPS were identifiable, remaining port assigned to NRF
- Required Android NRF to be active before Nano boot
- Manual rescan from the dashboard
Root cause: The refactored app didn't start NRF communication after rescan. The NRF reader thread needed to be restarted when a new port was assigned.
2.3 USB Port Detection & Assignment
Challenge: Jetson Nano reassigns USB port numbers on reboot, so hard-coded paths broke.
Solution: Built a USB detection panel on the dashboard showing all detected ports, bytes received, device identifiers, and a dropdown to manually reassign if needed. Cameras and GPS were auto-detected by USB VID/PID; the remaining port defaulted to NRF.
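The assignment rule described above (known devices matched by VID/PID, the one leftover port handed to the NRF) can be sketched as a small pure function. This is only an illustration: `PortInfo`, `KNOWN_DEVICES`, `assign_roles`, and the VID/PID numbers are placeholders, not values from the project. On real hardware the records could be filled from pyserial's `serial.tools.list_ports.comports()`, whose entries expose `device`, `vid`, and `pid`.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class PortInfo:
    """Minimal description of one detected USB serial port (hypothetical type)."""
    device: str
    vid: Optional[int]
    pid: Optional[int]


# Placeholder VID/PID pairs; the real identifiers depend on the attached hardware.
KNOWN_DEVICES: Dict[Tuple[Optional[int], Optional[int]], str] = {
    (0x1111, 0x0001): "gps",
    (0x2222, 0x0002): "camera",
}


def assign_roles(ports: List[PortInfo]) -> Dict[str, str]:
    """Map each device path to a role; a single unidentified port becomes the NRF."""
    roles: Dict[str, str] = {}
    unknown: List[str] = []
    for port in ports:
        role = KNOWN_DEVICES.get((port.vid, port.pid))
        if role is None:
            unknown.append(port.device)
        else:
            roles[port.device] = role
    if len(unknown) == 1:
        roles[unknown[0]] = "nrf"
    else:
        # Zero or several candidates: leave the choice to the manual dropdown.
        for device in unknown:
            roles[device] = "unassigned"
    return roles
```

Refusing to guess when more than one port is unidentified is a design choice made for this sketch; it mirrors the manual reassignment dropdown on the dashboard.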
2.4 Heartbeat Lost — Emergency Stop
Challenge: Even after NRF was detected, the watchdog would trigger emergency stop after 3.5 seconds of no heartbeat.
[NRF] Parse: H,0,0,0,0,1,54933,0,248
[NRF] HB: steer=0 speed=0 E=0 R=0 cmd=0 relays=11111000
[WATCHDOG] HEARTBEAT LOST (3.5s) - EMERGENCY STOP!
Fixes:
- Reduced heartbeat timeout from 3s to 1s for faster response
- Kept heartbeat alive even when receiving junk characters on the NRF serial line
- Ensured NRF thread restart on port rescan
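A heartbeat watchdog of the kind described in this section can be sketched in a few lines. The class name `HeartbeatWatchdog` and its methods are hypothetical, and the clock is injected so the behaviour can be checked without waiting in real time. The 1-second default follows the reduced timeout mentioned above.

```python
import time
from typing import Callable


class HeartbeatWatchdog:
    """Tracks the time of the last heartbeat and reports when it is overdue."""

    def __init__(self, timeout_s: float = 1.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s
        self._clock = clock
        self._last_beat = clock()

    def beat(self) -> None:
        """Call whenever the radio link shows activity.

        To match the fix above, the reader thread would call this even when
        the received bytes turn out to be junk.
        """
        self._last_beat = self._clock()

    def expired(self) -> bool:
        """True when no heartbeat has arrived within the timeout."""
        return (self._clock() - self._last_beat) > self.timeout_s
```

A monitoring thread would poll `expired()` and trigger the emergency stop when it returns `True`. `time.monotonic` is used as the default so that wall-clock adjustments cannot cause a false trip.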
Phase 3: Thread Deadlocks & State Management
3.1 App Crash on E-STOP Command
Challenge: The app crashed when an emergency stop command was sent from the Android app. Traced to state management issues.
Root cause: Multiple threads (NRF, web, heartbeat, stick) were reading and writing a shared state object with locks, causing deadlocks.
3.2 Deadlock Analysis & Lock-Free Redesign
Challenge: The master state object with locks caused the NRF thread to deadlock, freezing the entire app.
Approach — brainstormed each shared variable:
- Heartbeat: Written by NRF, read by watchdog and web — no lock needed (single writer)
- Motor controls: Written by active control mode (NRF, web, or AI), read by DAC — only one writer active at a time
- Relay bitmask: Written by NRF, web, or e-stop — single flag overwrite, unlikely deadlock
- Stick detection: Split into two fields (vertical/horizontal angle), each set by one camera
- DAC calibration: Written only from Android app, read by DAC module
Solution: Removed the master state lock. Each thread writes only its own data; other threads only read. Eliminated all deadlock scenarios.
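One way to picture the single-writer idea is a state object where each attribute is owned by exactly one thread and multi-field values are published as immutable snapshots. The sketch below is an assumption about how this could look, not the project's actual state module. The names `SharedState`, `MotorCommand`, and `publish_motor` are invented for illustration. It relies on the fact that, in CPython, rebinding an attribute is a single step under the interpreter lock, so a reader sees either the old snapshot or the new one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MotorCommand:
    """Immutable pair, so readers never see a half-updated left/right value."""
    left_pct: float = 0.0
    right_pct: float = 0.0


class SharedState:
    """Each attribute has exactly one writer; every other thread only reads."""

    def __init__(self) -> None:
        self.last_heartbeat = 0.0          # writer: NRF thread
        self.motor = MotorCommand()        # writer: the active control mode
        self.stick_vertical_deg = 0.0      # writer: first camera
        self.stick_horizontal_deg = 0.0    # writer: second camera


def publish_motor(state: SharedState, left_pct: float, right_pct: float) -> None:
    """Build a fresh snapshot, then swap the reference in one assignment."""
    state.motor = MotorCommand(left_pct, right_pct)
```

A reader such as the DAC thread would copy `state.motor` into a local variable once per cycle and use that copy, instead of reading the two fields at different moments.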
3.3 Control Mode Conflicts (Web vs Android)
Challenge: The web dashboard and Android app kept overriding each other's motor controls and relay states. Pressing stop on Android was immediately overwritten by web.
Solution: Added a control mode toggle (WEB / ANDROID) on the dashboard. Only the active controller can write motor commands and relay states. Mode persists through e-stop (previously e-stop reset mode to web).
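The control mode toggle amounts to a gate in front of the motor values. A minimal sketch follows, assuming hypothetical names `ControlMode` and `MotorGate`; the real app may structure this differently.

```python
from enum import Enum


class ControlMode(Enum):
    WEB = "web"
    ANDROID = "android"


class MotorGate:
    """Accepts motor writes only from the currently active controller."""

    def __init__(self, mode: ControlMode = ControlMode.WEB) -> None:
        self.mode = mode
        self.left_pct = 0.0
        self.right_pct = 0.0

    def write(self, source: ControlMode, left_pct: float, right_pct: float) -> bool:
        """Return True if the write was accepted, False if it was ignored."""
        if source is not self.mode:
            return False
        self.left_pct, self.right_pct = left_pct, right_pct
        return True

    def emergency_stop(self) -> None:
        """Zero the motors but leave the mode untouched, so it survives e-stop."""
        self.left_pct = 0.0
        self.right_pct = 0.0
```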
3.4 Relay Bitmask Sync Between Android and Nano
Challenge: The relay bitmask sent continuously by Android would overwrite Nano-initiated relay changes (like start/stop mower). The relays would flip back within milliseconds.
Solution: Implemented a two-way sync protocol:
- Android sends bitmask + change flag
- Once Nano's bitmask matches, change flag resets and Android stops overwriting
- When Nano changes relays (e.g., e-stop), it sets flag=2
- Android accepts the change, updates its switches, and sends the bitmask back to confirm
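The handshake above can be modelled with two tiny state machines. Only `flag=2` for a Nano-initiated change comes from the description; using `1` for an Android-initiated change and `0` for "nothing pending" is an assumption made for this sketch, as are the class and method names.

```python
NO_CHANGE, ANDROID_CHANGE, NANO_CHANGE = 0, 1, 2  # 0 and 1 are assumed values


class NanoRelays:
    def __init__(self, mask: int = 0) -> None:
        self.mask = mask
        self.flag = NO_CHANGE

    def local_change(self, mask: int) -> None:
        """Nano changes relays itself (e.g. e-stop) and announces it."""
        self.mask = mask
        self.flag = NANO_CHANGE

    def on_packet(self, android_mask: int, android_flag: int):
        """Handle one Android packet; return (mask, flag) for the ACK."""
        if self.flag == NANO_CHANGE:
            # Ignore Android's stale mask until it echoes ours back.
            if android_mask == self.mask:
                self.flag = NO_CHANGE
        elif android_flag == ANDROID_CHANGE:
            self.mask = android_mask
        return self.mask, self.flag


class AndroidRelays:
    def __init__(self, mask: int = 0) -> None:
        self.mask = mask
        self.flag = NO_CHANGE

    def user_toggle(self, mask: int) -> None:
        self.mask = mask
        self.flag = ANDROID_CHANGE

    def packet(self):
        return self.mask, self.flag

    def on_ack(self, nano_mask: int, nano_flag: int) -> None:
        if nano_flag == NANO_CHANGE:
            # Accept the Nano's state; the next packet echoes it as confirmation.
            self.mask = nano_mask
            self.flag = NO_CHANGE
        elif self.flag == ANDROID_CHANGE and nano_mask == self.mask:
            self.flag = NO_CHANGE  # Nano caught up, stop overwriting
```

With this structure, a continuous stream of Android packets carrying an unchanged mask and flag 0 can no longer undo a relay change made on the Nano.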
Phase 4: Motor Control & DAC Calibration
4.1 DAC Voltage Mapping
Challenge: Stopping the mower sent motors to 2.8V instead of the calibrated neutral voltages (left=2.44V, right=3.21V).
Root cause: The 0% motor command was hard-coded to 2.8V instead of reading from calibration.
Solution: Designed a proper voltage mapping:
- Android sends speed/steer → Nano converts to left/right motor percentage (-100 to +100)
- 0% maps to calibrated neutral voltage
- -100% and +100% map to calibrated low/high voltages
- Exponential acceleration curve between neutral and extremes
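As a rough illustration of the mapping above, the function below converts a motor percentage into a voltage anchored at the calibrated neutral. The exact curve used in the project is not given, so the exponential ease-in and its `curve` parameter are assumptions, and `pct_to_voltage` is a hypothetical name. Calibrated low/high voltages are passed in by the caller.

```python
import math


def pct_to_voltage(pct: float, neutral_v: float, low_v: float, high_v: float,
                   curve: float = 2.0) -> float:
    """Map -100..+100 % to a voltage between low_v and high_v through neutral_v."""
    pct = max(-100.0, min(100.0, pct))
    x = abs(pct) / 100.0
    # Exponential ease-in: 0 at x=0, 1 at x=1, gentle near neutral.
    shaped = (math.exp(curve * x) - 1.0) / (math.exp(curve) - 1.0)
    target = high_v if pct >= 0 else low_v
    return neutral_v + shaped * (target - neutral_v)
```

For example, `pct_to_voltage(0, 2.44, 0.5, 4.5)` returns the neutral 2.44 V regardless of the curve setting, which is the property the original 2.8V bug violated. An inverted motor could be handled by swapping `low_v` and `high_v` in one place, so the inversion is not applied twice elsewhere in the stack.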
4.2 Left/Right Motor Inversion
Challenge: One motor uses inverted voltage (higher voltage = reverse). The inversion was applied inconsistently across the stack, sometimes double-inverting.
Issues found:
- Android displayed inverted direction for left motor
- DAC calibration panel showed pre-inversion values
- Left/right DAC pin assignments were swapped
- Steering logic was reversed (turning left spun the wrong motor faster)
Fixes applied iteratively:
- Swapped DAC pin assignments (left↔right)
- Applied inversion to right motor instead of left
- Fixed Android steering logic for gyro mode
- Added "DAC Sent" display showing actual voltage after inversion
4.3 Arduino Calibration System
Challenge: Needed to verify DAC output voltage matched intended values. No way to measure voltages remotely.
Solution: Connected an Arduino Uno to the Jetson via USB with 4 analog inputs:
- A0 = Mower left handle voltage
- A1 = Mower right handle voltage
- A2 = DAC left output
- A3 = DAC right output
Arduino streams V,mower_l,mower_r,dac_l,dac_r at 10Hz. Dashboard calibration panel shows live comparison.
4.4 Arduino Upload Difficulties
Challenge: Uploading the Arduino sketch was problematic.
Issues:
- arduino-cli not installed on Nano, so it was installed manually
- Wrong bootloader selection (atmega328old vs atmega328/new bootloader)
- Serial port conflicts when app held the Arduino port
Fix: Used the arduino:avr:nano (new bootloader) FQBN. Killed the app before each upload to release the serial port.
4.5 Arduino Left/Right Values Swapped
Challenge: The Arduino analog readings for left and right (both mower and DAC) were physically swapped compared to what the software expected.
Fix: Swapped the values in the Nano's serial parser (calibration_arduino.py) — dac_left = Arduino A3, dac_right = Arduino A2.
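A parser for the `V,...` stream with the swap applied might look like the sketch below. It is not the contents of calibration_arduino.py. The DAC swap (A3 to left, A2 to right) is stated above; applying the same swap to the mower handle channels (A1 to left, A0 to right) is an inference from "both mower and DAC" and is marked as such in the code. `CalibrationSample` and `parse_calibration_line` are hypothetical names.

```python
from typing import NamedTuple, Optional


class CalibrationSample(NamedTuple):
    mower_left: float
    mower_right: float
    dac_left: float
    dac_right: float


def parse_calibration_line(line: str) -> Optional[CalibrationSample]:
    """Parse one 'V,a0,a1,a2,a3' line; return None for anything malformed."""
    parts = line.strip().split(",")
    if len(parts) != 5 or parts[0] != "V":
        return None
    try:
        a0, a1, a2, a3 = (float(p) for p in parts[1:])
    except ValueError:
        return None
    return CalibrationSample(
        mower_left=a1,   # assumed swap, inferred from the description
        mower_right=a0,  # assumed swap, inferred from the description
        dac_left=a3,     # stated in the fix
        dac_right=a2,    # stated in the fix
    )
```

Keeping the swap in exactly one place, the parser, avoids the double-inversion pattern that caused trouble in the motor code.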
4.6 Training Mode & Calibration Verify
Built a calibration training mode that:
- Steps through a sequence of motor percentages (-100% to +100%)
- Sets DAC to corresponding voltage
- User follows instructions on the physical mower handles
- Arduino reads actual mower voltage and DAC output
- Logs differences for calibration refinement
Added a "Verify" button that sweeps 5 voltage pairs and reports accuracy.
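The Verify sweep can be thought of as a loop that sets a target pair, reads back what the Arduino measures, and compares the difference against a tolerance. The sketch below assumes two caller-supplied callbacks, `set_dac` and `read_dac`, which are hypothetical stand-ins for the real hardware modules. The 150 mV default matches the tolerance quoted in section 8.8.

```python
from typing import Callable, Iterable, List, Tuple

Pair = Tuple[float, float]


def verify_calibration(targets: Iterable[Pair],
                       set_dac: Callable[[float, float], None],
                       read_dac: Callable[[], Pair],
                       tolerance_v: float = 0.150) -> List[Tuple[float, float, float, bool]]:
    """Return (left_target, right_target, worst_error_v, passed) per step."""
    results = []
    for left_v, right_v in targets:
        set_dac(left_v, right_v)
        # On real hardware a short settle delay would likely be needed here.
        got_left, got_right = read_dac()
        worst = max(abs(got_left - left_v), abs(got_right - right_v))
        results.append((left_v, right_v, worst, worst <= tolerance_v))
    return results
```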
4.7 Calibration Panel — Edit/Save Bug
Challenge: Editing calibration values on the dashboard would be overwritten by the live polling loop before the save button could be pressed.
Fix: Added an explicit Edit/Save/Refresh workflow — live polling pauses while editing, values only commit on Save.
Phase 5: Relay Logic Refinement
5.1 Relay Auto-Toggle on Joystick Movement
Challenge: Moving the joystick while mower was IDLE would auto-enable master kill and motor relays, creating unexpected behavior.
Fix: Removed the auto-relay-on logic. Relays only change via explicit start/stop buttons, emergency stop, or manual relay toggles.
5.2 E-STOP Reset Didn't Restore Relays
Challenge: After clearing an emergency stop, relays didn't return to their pre-e-stop state. Spare relays (used for other mower switches) stayed off.
Fix: On e-stop clear, restore spare relays to their default ON state for mower readiness.
5.3 Start/Stop Mower Relay Mapping
Added relay 7 and 8 (spare 4 and 5) to the start/stop mower sequence — on when running, off when stopped.
5.4 Relay Actor Logging
Added logs for every relay toggle with the responsible actor (NRF, WEB, E-STOP, WATCHDOG, etc.) at log level 5 for debugging relay fights.
Phase 6: Dashboard & Android App Polish
6.1 Camera Preview Toggle
Challenge: Camera thread drained CPU even when not needed.
Solution: Camera only runs in stick mode, AI training, and AI running mode. Added a button to toggle camera preview on the dashboard.
6.2 Android App UI Updates
- Changed from white-on-black to black-on-white theme
- Moved NRF TX/RX section above virtual joystick
- Added split motor bars: "Sent" (from gyro/joystick) vs "Received" (from NRF ACK), side by side
- Added motor test mode with individual left/right sliders
6.3 Android Motor Value Display
Challenge: Significant delay in motor values updating on Android. Slower than the NRF round trip.
Fix: Separated sent vs received motor values into independent display bars that update at different rates.
6.4 Android Sends Left/Right Instead of Speed/Steer
Challenge: Different control modes (joystick, gyro, motor test) used different formats. Gyro sent speed/steer while motor test sent left/right, causing confusion on the Nano.
Fix: All Android control modes now send left/right motor values. Nano only expects left/right — no more speed/steer conversion needed on the backend.
6.5 Dashboard Server Restart Button
Added a button on the dashboard to restart the Flask server itself, useful for applying code changes without SSH access.
6.6 Log Level Dropdown
Added a runtime log level selector on the dashboard (levels 1-10). Higher levels include all lower-level logs. Avoids restarting the app to change verbosity.
6.7 Log Retention
- New log file created on each app start
- Logs older than 30 days auto-deleted
- Rolling file handler prevents single-file bloat
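The 30-day cleanup can be sketched with the standard library alone. The function name, the `*.log` pattern, and the use of file modification time are assumptions for illustration.

```python
import time
from pathlib import Path
from typing import List, Optional


def prune_old_logs(log_dir: str, max_age_days: int = 30,
                   now: Optional[float] = None) -> List[Path]:
    """Delete *.log files whose modification time is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed: List[Path] = []
    for path in Path(log_dir).glob("*.log"):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Running this once at app start, just before the new log file is opened, would cover the first two bullets. For the rolling behaviour, the standard library's `logging.handlers.RotatingFileHandler` (with `maxBytes` and `backupCount`) is one option, though the source does not say which handler the app uses.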
Phase 7: E-STOP on Reverse & NRF RX Loss
7.1 E-STOP Triggered on Reverse
Challenge: Moving the control from forward to reverse triggered an emergency stop.
Investigation:
- Checked for divide-by-zero when motor speed crosses from positive to negative
- Checked for variables that could cause a crash during sign change
- Examined the DAC voltage mapping at the crossover point
Root cause: The NRF RX stopped entirely when reversing. Android TX continued sending, but Nano stopped responding. The heartbeat watchdog interpreted the silence as a lost connection.
7.2 NRF RX Loss During Reverse
Challenge: The Nano stopped receiving NRF data when motor commands went negative, even though Android kept transmitting.
Investigation: Pulled and analyzed logs showing:
- NRF parse continued working for forward values
- RX went silent at the exact moment of sign change
- No crash or exception in the logs
Root cause & fix: The NRF serial read was failing on certain byte patterns that appeared in negative motor values. Added error handling to keep the NRF reader alive regardless of parse failures, and ensured junk characters don't kill the heartbeat.
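The defensive pattern described here (never let a bad line kill the reader, and count any received bytes as link activity) can be sketched as follows. The `H,` prefix comes from the log excerpt in section 2.4; the field meanings are not assumed, so the parser just returns a list of integers. `parse_heartbeat` and `handle_line` are hypothetical names.

```python
from typing import Callable, List, Optional


def parse_heartbeat(raw: bytes) -> Optional[List[int]]:
    """Decode one serial line; return the integer fields or None if unusable."""
    # errors="replace" means undecodable bytes can never raise here.
    text = raw.decode("ascii", errors="replace").strip()
    if not text.startswith("H,"):
        return None
    try:
        return [int(field) for field in text[2:].split(",")]
    except ValueError:
        return None


def handle_line(raw: bytes,
                on_alive: Callable[[], None],
                on_fields: Callable[[List[int]], None]) -> None:
    """Process one line without ever propagating a parse failure."""
    on_alive()  # any received bytes count as a sign of life for the watchdog
    try:
        fields = parse_heartbeat(raw)
    except Exception:
        fields = None  # last-resort guard so the reader loop keeps running
    if fields is not None:
        on_fields(fields)
```

Negative values such as `-40` parse normally with `int()`, so a sign change alone does not break this parser; only genuinely corrupt lines are dropped.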
Phase 8: MCP4728 → MCP42010 Migration
8.1 SPI Driver Implementation
Goal: Replace the I2C DAC (MCP4728, channel A broken) with an SPI digital potentiometer (MCP42010, dual 10kΩ).
Approach: Created mcp42010_digipot.py wrapping Python spidev. Voltage divider config: PA→5V, PB→GND, wiper as output. Maps voltage to wiper position: wiper = round(voltage / VDD * 255).
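The wiper formula above, together with the MCP42010's two-byte write command, can be sketched without touching hardware. The command byte layout (write command in bits 5:4, pot select in bits 1:0) follows the Microchip datasheet. The function names are hypothetical and this is not the contents of mcp42010_digipot.py.

```python
VDD = 5.0

CMD_WRITE = 0b0001_0000                   # C1 C0 = 01: write data
POT0, POT1, POT_BOTH = 0b01, 0b10, 0b11   # P1 P0 pot select bits


def voltage_to_wiper(voltage: float, vdd: float = VDD) -> int:
    """Apply wiper = round(voltage / VDD * 255), clamped to the 8-bit range."""
    wiper = round(voltage / vdd * 255)
    return max(0, min(255, wiper))


def build_write(pot_select: int, wiper: int) -> list:
    """Return the two bytes to send: command byte, then wiper position."""
    if pot_select not in (POT0, POT1, POT_BOTH):
        raise ValueError("pot_select must be POT0, POT1 or POT_BOTH")
    if not 0 <= wiper <= 255:
        raise ValueError("wiper must be in 0..255")
    return [CMD_WRITE | pot_select, wiper]


# With spidev, the bytes would typically be sent in one chip-select frame, e.g.:
#   spi.xfer2(build_write(POT0, voltage_to_wiper(2.5)))
```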
8.2 SPI Bus Identification on Jetson Orin Nano
Challenge: Board was actually a Jetson Orin Nano (Tegra234), not a regular Nano. SPI bus naming differs — Tegra SPI1 on header pins 19/21/23/24 maps to spidev0 in Linux.
Approaches:
- Started with DIGIPOT_SPI_BUS = 0
- Changed to 1 based on misinterpretation
- Corrected back to 0 after device tree verification
8.3 Stale __pycache__ on Nano
Challenge: After deploying updated code, Nano still ran old MCP4728 code from cached .pyc files.
Fix: Clear all __pycache__ directories before each deploy.
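A small helper for this cleanup step, written as a sketch with a hypothetical function name. It could run on the Nano right after the rsync step.

```python
import shutil
from pathlib import Path


def clear_pycache(root: str) -> int:
    """Remove every __pycache__ directory under root; return how many were removed."""
    removed = 0
    # Materialise the list first so deletion does not disturb the directory walk.
    for cache_dir in list(Path(root).rglob("__pycache__")):
        if cache_dir.is_dir():
            shutil.rmtree(cache_dir)
            removed += 1
    return removed
```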
8.4 MCP42010 Pinout Documentation Error
Challenge: Initial pinout had critical errors — pins 13 (SO) and 14 (VDD) were swapped, and PA/PB terminal assignments were reversed.
Impact: Chip was unpowered or parasitically powered. Wipers read constant ~2.75V (floating).
Fix: Cross-referenced official Microchip datasheet. Moved 5V from pin 13 to pin 14. Readings shifted to ~1.9V/2.1V — chip now powered but still not responding to commands.
8.5 Flat Wiper Readings — Exhaustive Software Testing
Challenge: After fixing VDD, wipers showed ~1.9V and ~2.1V but were completely unresponsive to SPI commands. Sweeping 0→255 produced no change.
All approaches tried (all failed):
- spidev0.0 at speeds from 10kHz to 1MHz
- spidev1.0
- All 4 SPI modes (0, 1, 2, 3)
- 20x repeated writes
- Separate pot writes vs simultaneous
- GPIO bit-bang (manual pin toggling, bypassing spidev)
- MISO response analysis (returned [0,0], which is ambiguous: could be valid chip response or floating pin)
- spi.no_cs = True: threw OSError: Invalid argument (unsupported on Orin Nano kernel)
8.6 Arduino as Remote Pin Probe
Challenge: No multimeter access remotely. Needed to verify electrical signals on the Jetson header.
Solution: Connected Jetson SPI pins directly to Arduino analog inputs:
- Jetson pin 19 (MOSI) → Arduino A0
- Jetson pin 23 (SCLK) → Arduino A1
- Jetson pin 24 (CE0/CS) → Arduino A2
- Jetson pin 21 (MISO) → Arduino A3
Toggled each Jetson pin via GPIO and read Arduino to verify signals.
8.7 Root Cause — CS Wire on Wrong Pin
Discovery from probe test:
| Pin | Toggle Result | Verdict |
|---|---|---|
| Pin 19 (MOSI) | 0V ↔ 3.49V | OK |
| Pin 23 (SCLK) | 0V ↔ 3.49V | OK |
| Pin 24 (CE0/CS) | 0V ↔ 0V | STUCK AT 0V |
Root cause: The CS wire was on pin 25 (GND) instead of pin 24 (CE0) — adjacent pins on the 40-pin header.
Impact: With CS permanently LOW, the chip was always selected but never latched data. The MCP42010 requires CS to go HIGH→LOW before clocking data, then LOW→HIGH to execute the command. Permanent LOW means the chip just shifts data through without executing.
Fix: Moved the CS wire from pin 25 to pin 24.
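The CS framing that the chip depends on is easiest to see in a bit-bang sketch. The `set_pin` callback is a hypothetical stand-in for whatever GPIO library drives the header. The sequence (CS low, 16 bits MSB first with data valid on the rising clock edge, CS high to execute) follows the behaviour described above.

```python
from typing import Callable


def bitbang_write(set_pin: Callable[[str, int], None], command: int, data: int) -> None:
    """Send one 16-bit MCP42010 frame using manual pin toggling."""
    set_pin("SCLK", 0)            # clock idles low
    set_pin("CS", 0)              # HIGH -> LOW: start of frame
    word = ((command & 0xFF) << 8) | (data & 0xFF)
    for bit in range(15, -1, -1):                # most significant bit first
        set_pin("MOSI", (word >> bit) & 1)
        set_pin("SCLK", 1)                       # chip samples MOSI on the rising edge
        set_pin("SCLK", 0)
    set_pin("CS", 1)              # LOW -> HIGH: the chip executes the command
```

With the CS wire on a ground pin, the first and last calls in this sequence have no effect on the chip, which matches the symptom of data being clocked in but never executed.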
8.8 Successful Integration
After the CS fix, the digipot immediately worked:
| Wiper | Expected | Left | Right |
|---|---|---|---|
| 0 | 0.0V | 0.015V | 0.000V |
| 64 | 1.25V | 1.266V | 1.251V |
| 128 | 2.5V | 2.527V | 2.507V |
| 192 | 3.75V | 3.778V | 3.763V |
| 255 | 5.0V | 4.990V | 4.995V |
Calibration verify passed with max error of 20mV across the full 0.5V–4.5V range, well within the 150mV tolerance.
Recurring Cross-Cutting Issues
SSH/SSHPASS Flakiness
SSH commands via sshpass intermittently failed with Permission denied throughout the project. Certain quoting styles and command chaining triggered failures. Workaround: vary quoting, split compound commands, retry.
Remote Development Workflow
All development was done from a MacBook connected to the Jetson over USB at 192.168.55.1. Code was edited locally, deployed via rsync, and logs were pulled via ssh. Arduino sketches were compiled and uploaded from the Jetson using arduino-cli.
Key Lessons Learned
- Always verify physical wiring electrically — days of software debugging couldn't find a wire on the wrong pin
- Use the Arduino as a remote probe — when you can't use a multimeter, analog inputs make an excellent remote voltmeter
- Clear __pycache__ on every deploy: stale bytecode is invisible and causes maddening "my fix didn't work" moments
- Lock-free state management: single-writer-per-variable eliminates deadlocks in multi-threaded embedded systems
- Two-way bitmask sync — when two controllers can both change relay state, an explicit "change pending" flag prevents overwrite races
- Verify datasheets directly — don't trust secondary pinout documentation; always cross-reference the manufacturer's PDF
- Adjacent pin mistakes are common — pin 24 vs pin 25 on a 40-pin header is an easy error; always double-count