PRISM

Precision Relocation via Intelligent Slide Manipulation

An intelligent robotic system for automated slide handling

EECS/ME 206A - Team 45

Introduction

This project is an Industry Track collaboration with Ember Robotics to design and implement a vision-guided pick-and-place system using a Techman robotic arm. Our goal is to manipulate fragile glass lab slides with high precision, which requires robust object detection, accurate motion planning, and precise actuation. This work addresses critical challenges in laboratory environments and chemical-handling manufacturing lines, where current workflows suffer from:

👷‍♂️

Manual & Inefficient

Current slide handling workflow requires manual intervention

🦠

Contamination Risks

Potential contamination risks that affect experiment results

Our Goal

We designed, built, and tested a vision-guided pick-and-place system using a Techman TM12 robotic arm that can manipulate and relocate lab glass slides with high precision. For successful task execution, the robot detects the position and orientation of each slide, picks it from a source tray, and places it accurately onto a target rack. This project tackles several interesting technical problems that make automated slide manipulation difficult:

  • Computer vision for slide identification
  • Precise pick-and-place in 3D space
  • Robust operation for continuous tasks

PRISM goal illustration

Our approach combines computer vision, motion planning, and custom hardware into a complete ROS2 system. We detect slide poses, compute precise trajectories, and execute robust pick-and-place actions with a custom adapter and gripper, targeting repeatable performance in real-world laboratory settings.

Results

The final system demonstrates consistent and repeatable slide manipulation across multiple trials, as shown in the visual demonstrations below.

Design

Design Criteria & Requirements

Functional Requirements:

System Constraints:

Design Overview

The system is designed as a modular pipeline consisting of perception, planning, and actuation components. Hardware design focuses on a custom adapter to ensure mechanical stability, while software design emphasizes a clear separation between sensing and execution, refined iteratively. This architecture separates concerns, enabling independent development and testing of each module while maintaining clear interfaces through ROS2 topics and services.

System Architecture

The system is designed to be launched via a centralized launch file that initializes all necessary ROS2 nodes. Once ready, the user sends a service request to the pick_and_place node, which orchestrates the entire manipulation pipeline. This node calls the slide_detector helper node to obtain slide information. The slide_detector subscribes to detected slide transforms (TFs) from either marker_detect or gsam_detect nodes and maintains a queue of slides to be picked. A static_tf_broadcaster connects the camera frame to the robot's TF tree, establishing the spatial reference. Finally, the ik_node computes inverse kinematics and generates optimal collision-free paths using MoveIt2, while the gripper_server node controls the servo actuator via Arduino interface.

PRISM System Architecture
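To make the node layout concrete, the following is a minimal sketch of what the centralized launch file could look like. The package name prism_bringup is a placeholder, and the executables simply mirror the node names above; the real launch file also passes parameters (camera settings, marker size, serial port, and so on).

```python
# Minimal ROS2 launch sketch for the pipeline described above.
# Package and executable names are illustrative, not taken from the repository.
from launch import LaunchDescription
from launch_ros.actions import Node

def generate_launch_description():
    return LaunchDescription([
        Node(package="prism_bringup", executable="static_tf_broadcaster"),  # camera -> robot TF link
        Node(package="prism_bringup", executable="marker_detect"),          # or gsam_detect
        Node(package="prism_bringup", executable="slide_detector"),         # queues detected slide TFs
        Node(package="prism_bringup", executable="ik_node"),                # MoveIt2 IK + path planning
        Node(package="prism_bringup", executable="gripper_server"),         # Arduino-backed gripper control
        Node(package="prism_bringup", executable="pick_and_place"),         # top-level service orchestrator
    ])
```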

Key Design Choices & Trade-offs

  • Dual perception methods (ArUco + G-SAM). Rationale: ArUco is fast but needs markers; G-SAM is flexible but slower. Trade-off: speed vs. adaptability, so we provide both options and let users choose.
  • Intel RealSense D435i camera. Rationale: already available to us, small and lightweight, with better resolution and frame rate than the built-in camera. Trade-off: additional hardware to mount and wire, but it saves development time and provides the quality needed for precise slide detection.
  • MoveIt2 + RRTstar planner. Rationale: proven collision avoidance, handles complex joint constraints, and the team is familiar with the library. Trade-off: slower than analytical IK, but more robust around obstacles.
  • Custom adapter (v2 with ribs). Rationale: no off-the-shelf mount could hold the camera and gripper together in the desired location and orientation. Trade-off: added design and fabrication time (especially over multiple iterations), but tailored to our exact needs.
  • Arduino-controlled gripper. Rationale: the Jetson Nano's PWM was unstable over long cable distances; offloading to an Arduino Nano improved reliability. Trade-off: an extra hardware component and cable, but a cleaner ROS2 interface.

Impact on Real World Engineering Criteria

These design choices directly improve real-world performance:

  • Robustness: dual perception methods (ArUco and G-SAM) provide fallback options when markers are unavailable or slides are repositioned, and the MoveIt2 IK solver prevents trajectory-planning failures.
  • Durability: the reinforced adapter v2 survived 20+ test cycles without cracking, and carefully chosen wire gauges and WAGO connectors reliably power the gripper.
  • Efficiency: ArUco detection runs at 30 fps for real-time operation, and the Arduino-controlled gripper achieves minimal latency because it runs dedicated bare-metal code.

The modular ROS2 architecture also allows parallel development and easy component swapping, reducing integration time during testing and debugging.

Implementation

The implementation integrates custom hardware with a software pipeline for perception, motion planning, and execution, enabling reliable end-to-end slide manipulation.

Perception

The perception module is responsible for detecting the microscope slide and estimating its pose within the workspace. Using Intel RealSense input, the system identifies the slide's position and orientation relative to the robot with computer-vision algorithms, providing a consistent geometric reference for downstream modules. Accurate perception is critical for reliable grasping, as small errors in pose estimation can lead to failed grasps or misalignment during placement.

Challenge

Detect the yellow slide box and the transparent glass slides using pure RGB only!

Input: An RGB image

Output: Slide position + orientation (an SE(3) pose)

Technical Challenge: The depth camera cannot return reliable depth on the transparent glass slides

Perception sensor

Strategies

ArUco Marker + OpenCV

Marker Detection
Box Localization
Slide Detection
Frame Publication
  • Marker Detection: OpenCV's stable ArUco detection locates the box and region of interest (see the sketch below)
  • Output: SE(3) pose of the box (x-y position, depth, orientation)
  • Slide Detection: Canny edge detection with 25 kernels covering the possible slots
  • Performance: Very high reliability, fast processing
  • Limitation: Requires a marker and a fixed slide position
ArUco marker detection
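The sketch below illustrates how the box pose can be recovered from a single ArUco marker with OpenCV; it is not the exact code from our marker_detect node. It assumes OpenCV 4.7+ (the ArucoDetector API), and the dictionary, marker size, and camera intrinsics are placeholders for the values used on the real setup.

```python
# Hedged sketch of marker-based box localization (not the production node).
import cv2
import numpy as np

MARKER_LENGTH = 0.03  # marker side length in meters (assumed)

def locate_box(image_bgr, camera_matrix, dist_coeffs):
    """Detect the ArUco marker on the slide box and return its pose in the camera frame."""
    dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
    detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())
    corners, ids, _ = detector.detectMarkers(image_bgr)
    if ids is None:
        return None  # no marker visible

    # Marker corners in the marker's own frame (z = 0 plane),
    # ordered to match OpenCV's corner convention (TL, TR, BR, BL).
    half = MARKER_LENGTH / 2.0
    obj_pts = np.array([[-half,  half, 0.0],
                        [ half,  half, 0.0],
                        [ half, -half, 0.0],
                        [-half, -half, 0.0]], dtype=np.float32)

    # Planar PnP recovers the SE(3) pose of the marker (and hence the box).
    ok, rvec, tvec = cv2.solvePnP(obj_pts, corners[0].reshape(4, 2),
                                  camera_matrix, dist_coeffs)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)  # 3x3 rotation matrix
    return R, tvec              # rotation + translation in the camera frame
```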

G-SAM (GroundedDINO + SAM)

Text-Object Detection
Image Segmentation (SAM)
Slide Detection
Frame Publication
  • GroundedDINO: Text-prompted object detection to identify the target objects
  • SAM: Segment Anything for precise image segmentation
  • Output: Object boxes and masked objects
  • Performance: High reliability, slower processing
  • Advantage: No marker required, adaptable to any slide position
G-SAM detection pipeline
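A rough sketch of the segmentation and pose-extraction step is shown below; it is not the exact gsam_detect code. It assumes the segment-anything Python package, treats GroundedDINO as an upstream black box that supplies a bounding box, and uses a placeholder checkpoint path and a hypothetical helper name slide_pose_from_box. In practice the SAM model would be loaded once and reused across frames.

```python
# Hedged sketch: SAM segments the slide inside a detector-provided box,
# then a rotated rectangle fit recovers the in-plane pose.
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

def slide_pose_from_box(image_rgb, box_xyxy, checkpoint="sam_vit_h.pth"):
    """Segment the slide inside box_xyxy and return its image-plane center and angle."""
    sam = sam_model_registry["vit_h"](checkpoint=checkpoint)  # placeholder checkpoint path
    predictor = SamPredictor(sam)
    predictor.set_image(image_rgb)

    # multimask_output=False keeps the single best mask for the box prompt.
    masks, scores, _ = predictor.predict(box=np.asarray(box_xyxy),
                                         multimask_output=False)
    mask = masks[0].astype(np.uint8)

    # Fit a rotated rectangle to the mask to recover in-plane position and orientation.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    (cx, cy), (w, h), angle_deg = cv2.minAreaRect(max(contours, key=cv2.contourArea))
    return (cx, cy), angle_deg
```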

Detection Visualization

Per-kernel detection frames, browsable by kernel index

Method Comparison

  • Speed: ArUco is fast; G-SAM is slower
  • Reliability: ArUco is very high; G-SAM is high
  • Marker required: ArUco yes; G-SAM no
  • Adaptability: ArUco requires the marker and a fixed box position; G-SAM handles any slide position and orientation

Planning

The planning module takes the estimated slide pose from the perception stage and uses MoveIt2 to compute a feasible, collision-free motion plan for the robotic arm. This includes determining appropriate approach, grasp, and retreat motions while respecting kinematic constraints and environmental obstacles. The resulting trajectory ensures that the robot can safely and smoothly reach the target pose without violating task constraints.

  1. Scan & Queue: Parse TFs to build the target list
  2. Aligned Pick: Match the slide frame and grip
  3. Move to Target: Plan and transport the slide
  4. Tilt & Place: Release with a slight tilt

Slide Detection Demo
Pick and Place Demo
Transport to target
Placement
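The four-step loop above can be summarized in code. The sketch below is plain Python pseudologic around assumed helpers (detect_slides, move_to, set_gripper, target_rack); in the real system these steps go through ROS2 services, TF lookups, and the MoveIt2 interfaces described in the next subsection.

```python
# Hedged sketch of the scan -> pick -> transport -> place loop; the helper
# functions are hypothetical stand-ins for the pick_and_place node's calls.

def run_pick_and_place(detect_slides, move_to, set_gripper, target_rack, hover=0.05):
    # 1. Scan & Queue: parse detected slide TFs into an ordered work list.
    queue = list(detect_slides())

    for i, slide in enumerate(queue):
        # 2. Aligned Pick: hover above the slide, match its yaw, descend, grip.
        move_to(slide.x, slide.y, slide.z + hover, slide.yaw)
        move_to(slide.x, slide.y, slide.z, slide.yaw)   # slow z-axis approach
        set_gripper(closed=True)

        # 3. Move to Target: retreat, then transport to the i-th rack slot.
        move_to(slide.x, slide.y, slide.z + hover, slide.yaw)
        tx, ty, tz, tyaw = target_rack(i)
        move_to(tx, ty, tz + hover, tyaw)

        # 4. Tilt & Place: lower with a slight tilt, release, retreat.
        move_to(tx, ty, tz, tyaw, tilt_deg=10)          # tilt angle is an assumption
        set_gripper(closed=False)
        move_to(tx, ty, tz + hover, tyaw)
```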

From Computer Vision to Motion with MoveIt2

Integrating TM12 kinematics (link_6) for precise manipulation

ik.py Implementation

  • Inverse Kinematics: MoveIt2 /compute_ik service for pose-to-joint conversion
  • Trajectory Planning: RRTstar planner with scene objects and enforced joint limits
  • Execution: /follow_joint_trajectory action client with dual velocity profiles
    • Z-axis: 20% velocity
    • XY-axes: 40% velocity
  • Technical Challenge: Quaternion normalization required for stable IK solutions

Key Capabilities

  • Pose → target joint angles conversion
  • Collision avoidance
  • Stable after quaternion fix
  • Optimized path planning with RRTstar
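Below is a minimal sketch of the quaternion-normalization and /compute_ik call pattern described above, using MoveIt2's standard GetPositionIK service from moveit_msgs. The frame and planning-group names ("base", "tmr_arm") are placeholders rather than the exact values used in ik.py.

```python
# Hedged sketch of an IK service client with quaternion normalization.
import math
import rclpy
from rclpy.node import Node
from geometry_msgs.msg import PoseStamped
from moveit_msgs.srv import GetPositionIK

def normalized(qx, qy, qz, qw):
    """Return a unit quaternion; un-normalized orientations destabilize the IK solver."""
    n = math.sqrt(qx * qx + qy * qy + qz * qz + qw * qw)
    return qx / n, qy / n, qz / n, qw / n

class IKClient(Node):
    def __init__(self):
        super().__init__("ik_client_sketch")
        self.cli = self.create_client(GetPositionIK, "/compute_ik")
        self.cli.wait_for_service()

    def solve(self, x, y, z, qx, qy, qz, qw):
        pose = PoseStamped()
        pose.header.frame_id = "base"  # assumed planning frame
        pose.pose.position.x, pose.pose.position.y, pose.pose.position.z = x, y, z
        (pose.pose.orientation.x, pose.pose.orientation.y,
         pose.pose.orientation.z, pose.pose.orientation.w) = normalized(qx, qy, qz, qw)

        req = GetPositionIK.Request()
        req.ik_request.group_name = "tmr_arm"  # assumed MoveIt2 planning group
        req.ik_request.pose_stamped = pose
        req.ik_request.avoid_collisions = True
        future = self.cli.call_async(req)
        rclpy.spin_until_future_complete(self, future)
        return future.result()  # joint solution + MoveIt2 error code
```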

Actuation

The actuation module executes the planned trajectory on the physical robot by sending motion commands to the arm and gripper. It coordinates joint-level control to achieve stable grasping and precise placement of the slide. Reliable actuation is essential for closing the loop between planning and real-world execution, ensuring that planned motions translate into consistent physical behavior. The gripper evolved from an initial MG996R servo design to an improved Arduino Nano–controlled gripper (USB to ROS2), providing more stable pick-and-place performance with the custom adapter.

Challenge

  • No gripper was available to use with the arm
  • The robot's built-in camera cannot be accessed while the ROS2 listener node is running

Our Solution

Design a custom gripper to pick and place slides, and CAD an adapter to mount camera and gripper

Initial Gripper

Initial Gripper
  • Motor: MG996R servo motor (PWM-Controlled)
  • Power: 4 AA battery holder (6V supply)
  • Controller: Jetson Nano as ROS2 computer

Improved Gripper

Gripper control diagram
  • Enhanced Control: Arduino Nano mounted next to gripper
  • Communication: USB (Serial) back to ROS2 computer
  • Software: Gripper server in ROS2 + PWM Arduino code
Hardware Components
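A minimal sketch of the gripper server side is given below: a ROS2 service that forwards open/close commands to the Arduino Nano over USB serial. The std_srvs SetBool service type, the serial port path, and the single-character protocol are assumptions rather than the exact interface in our gripper_server node.

```python
# Hedged sketch of a serial-backed gripper server (interface details assumed).
import serial                     # pyserial
import rclpy
from rclpy.node import Node
from std_srvs.srv import SetBool

class GripperServer(Node):
    def __init__(self):
        super().__init__("gripper_server")
        self.port = serial.Serial("/dev/ttyUSB0", 115200, timeout=1.0)  # assumed port
        self.srv = self.create_service(SetBool, "gripper/command", self.handle)

    def handle(self, request, response):
        # data=True closes around the slide, data=False opens;
        # the Arduino maps each byte to a servo PWM setpoint.
        self.port.write(b"C" if request.data else b"O")
        response.success = True
        response.message = "closed" if request.data else "opened"
        return response

def main():
    rclpy.init()
    rclpy.spin(GripperServer())
    rclpy.shutdown()

if __name__ == "__main__":
    main()
```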

Hardware Setup

CAD and 3D-print an adapter to connect:

  • Techman TM12M robot arm
  • Intel RealSense D435i depth camera
  • MG996R servo gripper

Adapter Iteration

Version 1

  • Column structure failed after a few tests
  • Required a more robust structure

Version 2

  • Higher infill rate to increase overall stiffness
  • Stronger ribs to distribute stress more equally
  • More suitable rib dimensions for simpler assembly

Conclusion

Summary

This project delivers a full-stack, vision-guided pick-and-place system for fragile lab slides, integrating perception, planning, and actuation on a Techman arm with custom adapters and grippers. Robust perception (ArUco and G-SAM), MoveIt2-based planning, and iterative hardware refinements improved grasp stability and placement accuracy. The system demonstrates consistent end-to-end performance under real laboratory constraints, showing that careful sensing, trajectory generation, and execution can reliably automate slide handling.

Future Work

Future work focuses on four upgrades: a more robust CV model (e.g., end-to-end YOLO) to increase detection speed, improve accuracy, and reduce false positives; an adaptive gripper to handle a wider range of slide thicknesses and materials while maintaining a stable grasp; more precise placement to preserve orientation and spacing, preventing downstream alignment issues; and speed optimization via precomputed paths plus performance monitoring to cut cycle time and surface anomalies early. Together, these enhancements aim to make the system faster, more reliable, and easier to deploy across diverse lab workflows.

🎯

Robust CV Model

End-to-end YOLO model after dataset collection

🤏

Adaptive Gripper

Enhanced gripper design for various slide types

📊

Precise Placement

Accurate placement in sorted order

Speed Optimization

Improve throughput by precomputing paths

Team 45

David Chen (Actuation): David is an EECS M.Eng. student specializing in Robotics and AI. He completed his undergraduate studies in Computer Science and Engineering at UC Irvine, where he worked on various embedded software projects involving object detection algorithms, neural networks, Raspberry Pi, Arduino, and drones. He also has hands-on experience in vehicle engineering and technical project management through his role as Project Manager of the UCI Solar Car Project, where he led a multidisciplinary team in building a fully solar-powered electric vehicle from the ground up in two years.

Ryan Chung (Website, Support): Ryan is an M.Eng. student in Mechanical Engineering at UC Berkeley, focusing on robotics and control. He completed his undergraduate degree in Mechanical Engineering at National Taiwan University, with a minor in Music specializing in piano performance from National Taiwan Normal University. His academic and research interests center on medical and rehabilitation robotics, where he aims to develop systems that support healthcare professionals and improve patient care. His background in piano performance offers the team a distinctive perspective.

Yu-Wei Chang (CAD, Support): Yu-Wei is an M.Eng. student in Mechanical Engineering at UC Berkeley with a strong interest in robotics, control, and mechanical design. He completed his undergraduate degree in Power Mechanical Engineering at National Tsing Hua University, where he worked on various projects such as a DC motor with a PI controller, Arduino applications, and mechanical structure design. His internship experiences at two manufacturing companies gave him great insight into industry needs.

Bryan Chang (Planning): Bryan is an M.Eng. student in Electrical Engineering and Computer Science at UC Berkeley, specializing in autonomous driving systems and sensor data processing. With hands-on experience from multiple internships in the autonomous vehicle industry, he has developed expertise in LiDAR perception systems and created transformer-based machine learning models to automate lane line labeling for 3D point clouds. Currently working on an autonomous racing car project, Bryan is advancing LiDAR localization systems that operate under high-performance, real-time constraints. His technical foundation spans computer vision, signal processing, and robotics, with a particular strength in transforming raw sensor data into reliable, safety-critical insights for autonomous vehicles. Bryan's passion lies in solving the complex challenges of perception and localization that enable autonomous systems to navigate safely in dynamic environments.

Kain Hung (Perception): Kain is an EECS M.Eng. student with broad enthusiasm for computer vision, robotics, and AI. He completed his undergraduate degree in the Computer Science department at National Taiwan University, completing three Human-Computer Interaction projects in which he explored hardware-software co-design skills, including Arduino, OpenCV, OpenPose, and PyTorch. His hands-on experience includes a large-scale tangible interface; a thin, card-shaped robot with ML-based perception; and a multimodal AI system.
Team 45 group photo

Additional Materials

Link to our GitHub page: https://github.com/YuKai0928/25F_206A_FinalProject

Link to our Google Drive: https://drive.google.com/drive/folders/1J0ep10ODH-zIUo68zBBcHVBDpZjkjJoT?usp=sharing