Building the SO-ARM101: First Impressions
2026·05·29 · 3 min
I built the SO-ARM101 to get hands-on with embodied AI: to collect manipulation data, train policies, and understand the hardware constraints that don't show up in papers. Here's how it went.
What Is It?
The SO-ARM101 is an open-source 6-DOF robot arm developed by The Robot Studio and Hugging Face, designed to work with LeRobot. The setup is two arms: a leader you control by hand, and a follower that mirrors it. You teleoperate to collect demonstrations, then train a neural network to replicate the behaviour autonomously.
It's 3D printed, uses off-the-shelf servo motors, and costs a few hundred pounds to build. Impressive hardware at a fairly accessible price point.
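The leader-follower idea is easy to sketch in code. The snippet below is a minimal illustration, not LeRobot's API: `SimulatedArm` and `teleop_episode` are hypothetical names, the arms are plain in-memory objects rather than real servos, and the "human" is a scripted list of poses. It shows the core loop: read the leader, command the follower, and log each (observation, action) pair as training data.

```python
from dataclasses import dataclass, field

JOINTS = 6  # the arm is described as 6-DOF


@dataclass
class SimulatedArm:
    """Hypothetical stand-in for a real arm; stores joint positions in degrees."""
    positions: list = field(default_factory=lambda: [0.0] * JOINTS)

    def read(self):
        return list(self.positions)

    def write(self, targets):
        self.positions = list(targets)


def teleop_episode(leader, follower, leader_trajectory):
    """Mirror the leader onto the follower and log (observation, action) pairs."""
    episode = []
    for pose in leader_trajectory:
        leader.write(pose)             # stands in for a human moving the leader by hand
        observation = follower.read()  # follower state before the command is applied
        action = leader.read()         # the leader's pose becomes the action label
        follower.write(action)         # the follower mirrors the leader
        episode.append({"observation": observation, "action": action})
    return episode


# A scripted three-step motion in place of real hand movements.
trajectory = [[float(step * 10)] * JOINTS for step in range(3)]
leader, follower = SimulatedArm(), SimulatedArm()
episode = teleop_episode(leader, follower, trajectory)
print(f"recorded {len(episode)} steps, follower ended at {follower.read()}")
```

A real recording would also attach camera frames and timestamps to each step; the structure of the loop stays the same.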
The Build
I printed all the parts ahead of time on my BambuLab A1 Mini in PLA Tough+. The STL files come pre-oriented for minimal supports, so this step was very straightforward.
The instructions are good, though the assembly still presented some challenges.
Thread your 3-pin cables through each joint before closing it up; the docs say to do this for good reason. It's also worth knowing upfront that the leader and follower use different motors: the follower has six identical units, while the leader uses three different gear ratios (1/191, 1/147, and 1/345) depending on the joint, since it needs to hold its own weight while still being light enough to move freely by hand.
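The trade-off between those ratios can be made concrete with some idealized arithmetic. The sketch below assumes a lossless gear train, where a 1/N ratio multiplies motor torque by N and also multiplies the effort needed to back-drive the joint by hand by N. Real servos add friction, so treat the numbers as relative comparisons rather than datasheet values. The function names are my own, for illustration only.

```python
from fractions import Fraction

# Gear ratios quoted above for the leader arm (output turns per motor turn).
LEADER_RATIOS = [Fraction(1, 191), Fraction(1, 147), Fraction(1, 345)]


def reduction(ratio):
    """Motor turns per output turn, e.g. 1/345 -> 345."""
    return 1 / ratio


def relative_backdrive_effort(ratio, baseline):
    """Idealized torque needed to back-drive a joint, relative to a baseline ratio."""
    return float(reduction(ratio) / reduction(baseline))


# The largest fraction is the smallest reduction, so it is the easiest to move by hand.
baseline = max(LEADER_RATIOS)

for ratio in sorted(LEADER_RATIOS, reverse=True):
    print(
        f"{ratio}: x{reduction(ratio)} torque, "
        f"{relative_backdrive_effort(ratio, baseline):.2f}x effort to back-drive"
    )
```

Under these assumptions, a higher reduction suits a joint that has to support more of the arm's weight, and a lower one suits a joint that mostly needs to move freely.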
At one point I put the wrong part on a servo and couldn't get it back off: the M3 screws had stripped the plastic threads and just spun freely. After several non-destructive attempts failed, the solution was to chop through the part with snips, free the servo, and print a replacement while continuing to assemble the follower arm in parallel. By the time I had finished the follower arm (and had lunch), the replacement part was ready.

Once both arms were calibrated and connected, the follower mirrored my hand movements in real time. After the build process, that moment made it worth it.
Wrist Camera
I added a 32x32 mm UVC camera module to the follower's wrist using the hex-nut mount. It's fun to watch the POV from the arm, but the camera's real purpose is training data. When you record demonstrations for imitation learning, the wrist feed gives the policy a close-up view of what the gripper is doing at the point of contact. Fixed overhead cameras can miss what actually matters during manipulation when the gripper itself occludes the object.

What's Next
Record demonstrations, train a policy (LeRobot supports ACT, SmolVLA, and others), and experiment with it. I'll write that up once I have results worth sharing.