ROS2 Wrapper for Depth Anything 3

Monocular depth estimation for robotics applications with real-time performance and GPU acceleration.


Overview

Depth Anything 3 ROS2 is a wrapper that brings AI-powered monocular depth estimation to ROS2 robotics environments. It enables real-time depth perception from RGB camera streams, making it well suited to autonomous navigation, obstacle avoidance, and 3D scene understanding in robotic applications.

The wrapper integrates seamlessly with existing ROS2 camera pipelines and supports multiple model variants optimized for different hardware configurations, from edge devices to high-performance GPU workstations.

Features

  • Real-time depth estimation: process RGB images to generate accurate depth maps in real time
  • Multiple model variants: choose from Small, Base, Large, or Giant parameter models
  • GPU acceleration: CUDA support for high-performance inference on NVIDIA GPUs
  • Configurable parameters: customize topics, device selection, and model variant at runtime
  • ROS2 integration: standard sensor_msgs/Image compatibility with existing pipelines

Performance

Model performance characteristics on an NVIDIA RTX 3090:

  Model   Parameters   VRAM     FPS
  Small   80M          ~2 GB    ~45
  Base    120M         ~3 GB    ~35
  Large   350M         ~5 GB    ~25
  Giant   1.15B        ~12 GB   ~10
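One practical use of these numbers is picking the largest variant that fits a given GPU. A minimal sketch, where the `largest_fitting_variant` helper is illustrative (not part of the package) and its VRAM figures are taken from the table above:

```python
# Approximate per-variant VRAM needs in GB, taken from the table above.
VRAM_REQUIRED_GB = {"small": 2, "base": 3, "large": 5, "giant": 12}

def largest_fitting_variant(available_vram_gb):
    """Return the largest model variant whose VRAM estimate fits the budget,
    or None if even the Small model does not fit."""
    fitting = [(vram, name) for name, vram in VRAM_REQUIRED_GB.items()
               if vram <= available_vram_gb]
    return max(fitting)[1] if fitting else None

print(largest_fitting_variant(6))   # → large (5 GB fits; 12 GB does not)
print(largest_fitting_variant(24))  # → giant
```

The estimates are rough; real usage depends on input resolution and batch size, so leave headroom when choosing a variant.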

System requirements

  • ROS2: Humble or newer
  • Python: 3.10 or higher
  • PyTorch: 2.0 or higher
  • CUDA: optional, for GPU acceleration
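Before installing, it can be worth verifying that installed versions meet these minimums. A small sketch of such a check; the `meets_minimum` helper is hypothetical, not part of the package:

```python
def parse_version(text):
    """Parse a dotted version string like '2.1.0' into a comparable tuple,
    ignoring non-numeric suffixes such as '+cu118'."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())

def meets_minimum(installed, required):
    """True if the installed version string is at least the required minimum."""
    return parse_version(installed) >= parse_version(required)

# Example: check a PyTorch version string against the 2.0 floor.
print(meets_minimum("2.1.0", "2.0"))   # → True: a 2.1.x install satisfies it
print(meets_minimum("1.13.1", "2.0"))  # → False: 1.13.x is too old
```

In practice you would feed this `torch.__version__` and `platform.python_version()` rather than literals.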

Installation

1. Install PyTorch and Python dependencies:

pip install torch torchvision opencv-python

2. Install Depth Anything 3:

pip install git+https://github.com/DepthAnything/Depth-Anything-V2.git

3. Build the ROS2 package:

cd ~/ros2_ws/src
git clone https://github.com/spectateursimon/Depth-Anything-3-ROS2.git
cd ~/ros2_ws
colcon build --packages-select depth_anything_3_ros2

Usage

Basic launch with default parameters:

ros2 launch depth_anything_3_ros2 default.launch.py

Launch with custom parameters:

ros2 launch depth_anything_3_ros2 default.launch.py model_name:=depth-anything/DA3-Small device:=cpu
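Once the node is running, downstream code subscribes to the depth topic and processes each frame. A minimal sketch of what an obstacle-avoidance callback might do with a depth image, assuming it has already been converted to a NumPy array (e.g. via cv_bridge); the `nearest_obstacle_distance` helper is illustrative, not part of the package:

```python
import numpy as np

def nearest_obstacle_distance(depth_map, max_range=10.0):
    """Return the smallest valid depth value in meters.

    Zero pixels and readings beyond max_range are ignored, since real
    depth outputs often contain holes or saturated regions.
    """
    valid = depth_map[(depth_map > 0.0) & (depth_map <= max_range)]
    return float(valid.min()) if valid.size else float("inf")

# Toy 2x3 depth map in meters: the closest valid reading is 0.8 m.
depth = np.array([[0.0, 2.5, 0.8],
                  [4.0, 12.0, 3.1]])
print(nearest_obstacle_distance(depth))  # → 0.8
```

Note that monocular depth estimates are relative unless the model or a downstream step provides metric scale, so thresholds like this usually need calibration for a specific camera setup.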

Parameters

  Parameter           Default                    Description
  image_topic         /camera/color/image_raw    Input RGB camera stream
  depth_image_topic   /depth                     Output depth image topic
  device              cuda:0                     Computation device (cuda:0, cpu)
  model_name          depth-anything/DA3-Large   Model variant to use

Use cases

  • Autonomous navigation: enable depth perception for mobile robots and drones
  • Obstacle avoidance: detect and avoid obstacles in real time using monocular cameras
  • 3D scene understanding: generate depth maps for SLAM and mapping applications
  • Research and development: prototype and test depth estimation algorithms in ROS2

Credits

This ROS2 wrapper is based on Depth Anything 3 by ByteDance. Depth Anything 3 is a state-of-the-art monocular depth estimation model that achieves exceptional performance across diverse scenarios.