How to become Doc Ock — an AR-controlled robotic arm

Augmentation Lab
4 min read · Dec 28, 2023


Authors: Alice Cai and Aida Baradari

Aida wearing LimbX on her back (left), Alice and Vik thinking about the control mechanisms (right)

Overview

Can we extend our perception of our own bodies? How do we easily control an arm that does not belong to us biologically?

Those were the questions we asked ourselves in the summer of 2022, when we came up with the idea of a 7-segment supernumerary robotic limb that we wanted to control through augmented reality.

See a demo of the limb in 2D on our YouTube channel:

How We Built It

If you are interested in a more detailed description, you can find it on our website here, where we also post about our other projects and activities as an organization.

Hardware/Software Design

We built the continuum supernumerary robotic limb (SRL) from scratch. The continuum SRL comprises three modular segments constructed from custom rigid vertebrae connected by U-joints and a flexible spine. These segments are independently actuated through a servo-driven tendon system. Whereas SRL controllers are usually designed using inverse kinematics, we used a machine learning (ML) model that maps servo angles to end-effector positions. On top of this, an AR control system built on a Microsoft HoloLens incorporates eye tracking and voice recognition to enable high-level, intention-based control.

Hardware Development

Leveraging insights from The Bootup Guide to Homebrew Two-Stage Tentacle Mechanisms, we blended off-the-shelf and custom parts in our prototypes. The SRL’s three segments have 3D-printed vertebrae, U-joints, and a polyurethane spine, optimized for structural integrity and flexibility. The servo-driven system controls movement along two planes, with an electromagnet at the end for user-directed actuation.

Actuation Circuit Design

The SRL is powered by a 16-channel 12-bit servo driver and a Raspberry Pi, with a MOSFET and a 9V power supply switching the electromagnet. The servo driver acts as an intermediary between the control system and the servo motors, providing precise control and distributing power across all of them.

Circuit wiring schematic of Raspberry Pi with servo driver
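
For readers who want to mirror the wiring in software, here is a minimal sketch of driving the tendon servos and the electromagnet from the Pi. It assumes a PCA9685-style 16-channel driver with Adafruit's ServoKit library and an arbitrary GPIO pin on the MOSFET gate; the channel assignments and pin number are placeholders, not our exact setup.

```python
# Minimal sketch: tendon servos via a 16-channel 12-bit driver, electromagnet via a MOSFET.
# Assumes a PCA9685-style board (Adafruit ServoKit) and a hypothetical GPIO pin for the gate.
import RPi.GPIO as GPIO
from adafruit_servokit import ServoKit

MAGNET_GATE_PIN = 18          # hypothetical GPIO pin driving the MOSFET gate
kit = ServoKit(channels=16)   # servo driver on the default I2C address

GPIO.setmode(GPIO.BCM)
GPIO.setup(MAGNET_GATE_PIN, GPIO.OUT)

def set_segment_angles(angles):
    """Write one angle (degrees) per tendon servo, on channels 0..len(angles)-1."""
    for channel, angle in enumerate(angles):
        kit.servo[channel].angle = angle

def set_magnet(on):
    """Switch the 9V electromagnet through the MOSFET."""
    GPIO.output(MAGNET_GATE_PIN, GPIO.HIGH if on else GPIO.LOW)

# Example: move the three segments and energize the magnet at the tip.
set_segment_angles([90, 45, 120])
set_magnet(True)
```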

Machine Learning Controller

Data Collection

Instead of a traditional kinematics-based controller, we trained a neural network to predict the servo angles needed to reach a desired location. We developed a data collection rig with an overhead camera to record video of the limb's configuration at different combinations of servo angles. We swept a large parameter space of servo angles, collecting over 800 data points, and developed a computer-vision-based post-processing system to level the alignment, segment the joints and end effector, and map pixel positions to real-world dimensions.
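
As a rough illustration of the sweep, the sketch below steps through servo-angle combinations, grabs a frame from the overhead camera, and logs the detected end-effector centroid. The `set_segment_angles()` helper from the previous snippet, the sweep resolution, the camera index, and the HSV threshold are assumptions for illustration, not the rig's actual parameters.

```python
# Data-collection sketch: sweep servo angles, detect the end-effector marker, log to CSV.
import csv, time, itertools
import cv2

cap = cv2.VideoCapture(0)          # overhead camera (index is a placeholder)
angle_grid = range(30, 151, 15)    # hypothetical sweep range per servo

def detect_end_effector(frame):
    """Return the pixel centroid of a color-thresholded marker, or None if not found."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, (0, 120, 120), (10, 255, 255))   # placeholder HSV bounds
    m = cv2.moments(mask)
    if m["m00"] == 0:
        return None
    return m["m10"] / m["m00"], m["m01"] / m["m00"]

with open("sweep_data.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["a1", "a2", "a3", "px", "py"])
    for a1, a2, a3 in itertools.product(angle_grid, repeat=3):
        set_segment_angles([a1, a2, a3])
        time.sleep(0.5)                      # let the limb settle before capturing
        ok, frame = cap.read()
        point = detect_end_effector(frame) if ok else None
        if point is not None:
            writer.writerow([a1, a2, a3, point[0], point[1]])
```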

Plotting the collected data, we can see the elliptical configuration space in which the robotic limb moves.

Scatter plot of collected data

Machine Learning Model

For this complex problem, we decided to train two neural networks. The first one, the forward model, predicts the end-effector position from the servo angles we input. We used a standard multilayer perceptron with ReLU activation functions and trained it with the Adam optimizer and a mean-squared-error (MSE) loss. Performing a hyperparameter search over the hidden-layer depth and width, the regularization term, the learning rate, and the number of epochs, we found the network structure that minimized our MSE loss.

Forward neural network structure. Input dimension: 3, hidden layers: [4, 8, 16, 32, 8, 4], output dimension: 2
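
A sketch of the forward model is shown below. The layer sizes, ReLU activations, Adam optimizer, and MSE loss follow the description above; the choice of PyTorch and the learning rate are assumptions.

```python
# Forward model sketch: 3 servo angles -> 2D end-effector position.
import torch
import torch.nn as nn

def mlp(sizes):
    """Build an MLP with ReLU between layers: sizes = [in, h1, ..., out]."""
    layers = []
    for i in range(len(sizes) - 1):
        layers.append(nn.Linear(sizes[i], sizes[i + 1]))
        if i < len(sizes) - 2:
            layers.append(nn.ReLU())
    return nn.Sequential(*layers)

forward_model = mlp([3, 4, 8, 16, 32, 8, 4, 2])
optimizer = torch.optim.Adam(forward_model.parameters(), lr=1e-3)  # lr is a placeholder
loss_fn = nn.MSELoss()

def train_step(angles, positions):
    """One gradient step on a batch of (servo angles, measured positions)."""
    optimizer.zero_grad()
    loss = loss_fn(forward_model(angles), positions)
    loss.backward()
    optimizer.step()
    return loss.item()
```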

Using this forward neural network, we then trained an inverse neural network that predicts the servo motor angles from the desired location we want the limb to move to. As with the forward network, we performed a hyperparameter search to find the best architecture. The key difference is the loss function: the Euclidean distance between the training-data position and the position the forward network predicts from the angles output by the inverse network.

Inverse neural network structure. Input dimension: 2, hidden layers: [32, 16, 8, 4], output dimension: 3
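
The inverse model and its custom loss can be sketched the same way, reusing the `mlp()` helper and the trained `forward_model` from the previous snippet. The forward model is frozen so that it only scores how close the inverse model's predicted angles would land to the target; the learning rate and the example target are placeholders.

```python
# Inverse model sketch: desired (x, y) position -> 3 servo angles.
import torch

inverse_model = mlp([2, 32, 16, 8, 4, 3])
inv_optimizer = torch.optim.Adam(inverse_model.parameters(), lr=1e-3)

# Freeze the forward model: it only evaluates the inverse model's angle predictions.
for p in forward_model.parameters():
    p.requires_grad_(False)

def inverse_train_step(positions):
    """Loss = Euclidean distance between the target position and the position
    the forward model predicts from the inverse model's angles."""
    inv_optimizer.zero_grad()
    predicted_angles = inverse_model(positions)
    reached_positions = forward_model(predicted_angles)
    loss = torch.linalg.norm(reached_positions - positions, dim=1).mean()
    loss.backward()
    inv_optimizer.step()
    return loss.item()

# Inference: servo angles for a desired (x, y) target.
with torch.no_grad():
    target = torch.tensor([[0.15, 0.30]])   # hypothetical target coordinates
    angles = inverse_model(target)
```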

Using the trained inverse neural network, we can now predict the servo angles needed to reach a desired (x, y) position. Running the network on the test data and plotting the predictions in a scatter plot, we see a shape closely resembling the elliptical distribution of our training data.

Scatter plot of predicted positions from trained neural network

We now have a machine learning controller that can predict the servo angles needed to reach a desired position. If you are interested in the code and want to take a look, see here.

AR User Control System

Our HoloLens-based system incorporates calibration, eye tracking, voice commands, and feedback mechanisms. It translates AR targets into real-world coordinates, allowing the SRL to respond to user commands like “Go there” or “Grab that”.
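
The hand-off from AR to the limb boils down to expressing the user's target in the robot's frame. Below is a minimal numpy sketch of that step, assuming a planar target to match the 2D controller above; the rotation and translation values stand in for whatever the calibration step estimates.

```python
# Map an AR gaze/voice target from the HoloLens frame into the SRL base frame.
import numpy as np

# Rigid transform estimated once during calibration (values are placeholders).
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])          # placeholder rotation
t = np.array([0.25, -0.10])          # placeholder translation (meters)

def ar_target_to_robot(target_hololens_xy):
    """Return the target expressed in the robot's workspace coordinates."""
    return R @ np.asarray(target_hololens_xy) + t

robot_xy = ar_target_to_robot([0.40, 0.55])
# robot_xy can then be fed to the inverse network to obtain servo angles.
```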

General Summary

A general summary of our project


Augmentation Lab

We are a transdisciplinary student-led lab developing augmentations to understand & improve the human condition. https://www.augmentationlab.org/