RoboMatrix: A Skill-centric Hierarchical Framework for Scalable Robot Task Planning and Execution in Open-World
📝Paper | 🌍Project Page | 🛢️Data
- [2025/01/08] 🔥 We release the data collection method.
- [2024/12/04] 🔥 We release the RoboMatrix supervised fine-tuning (SFT) dataset containing 1,500 high-quality human-annotated demonstration videos.
Demo video: crossing_obstacles_with_adversarial_interaction.mp4
We use robots from DJI’s RoboMaster series as the hardware platform, including the Engineering Robot (EP) and the Warrior Robot (S1). These two forms of robots share some common components, including the mobile chassis, monocular RGB camera, audio module, and controller. Additionally, each robot is equipped with a unique set of components to perform specific tasks, such as the target shooting capability of the S1 robot and the target grasping capability of the EP robot.
We modified the EP robot by mounting the camera above the robot to prevent the camera’s viewpoint from changing with the movement of the robotic arm. See 3D_Printing for the parts of the designed camera mount.
We use the BEITONG ASURA 2PRO+ GAMEPAD NEARLINK VERSION as the controller for robot teleoperation.
We developed RoboMatrix using the ROS2 framework on Ubuntu 20.04. Follow the official installation guide to install the Foxy distro of ROS2 and the necessary tools. We have also tested RoboMatrix on Ubuntu 22.04 (ROS2 Humble), which may serve as a reference if you want to install it on a later version of Ubuntu.
We provide a general installation procedure for ROS2 below, which may be helpful. If ROS2 is already installed on your system, skip this step.
ROS2 Installation
Open a terminal and check whether your system supports UTF-8.
locale
If it does not (the output shows no UTF-8 locale), generate and set one.
sudo locale-gen en_US en_US.UTF-8
sudo update-locale LC_ALL=en_US.UTF-8 LANG=en_US.UTF-8
export LANG=en_US.UTF-8
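If you prefer to script this check, the following is a minimal sketch that inspects the text printed by `locale`. The function name `supports_utf8` and the choice of variables to inspect (`LANG`, `LC_ALL`, `LC_CTYPE`) are assumptions made for illustration; they are not part of RoboMatrix or ROS2.

```python
def supports_utf8(locale_output: str) -> bool:
    """Return True if the `locale` output names a UTF-8 locale."""
    for line in locale_output.splitlines():
        key, _, value = line.partition("=")
        # Assumption: checking these three variables is enough for this purpose.
        if key.strip() not in ("LANG", "LC_ALL", "LC_CTYPE"):
            continue
        # Accept both spellings, e.g. "en_US.UTF-8" and "C.utf8".
        if "utf8" in value.lower().replace("-", ""):
            return True
    return False


if __name__ == "__main__":
    import subprocess

    # Run the same `locale` command as above and inspect its output.
    result = subprocess.run(["locale"], capture_output=True, text=True)
    print("UTF-8 supported:", supports_utf8(result.stdout))
```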
Open a terminal and check whether the Ubuntu Universe repository is enabled.
apt-cache policy | grep universe
If it is not (no output in the terminal), enable it.
sudo apt install software-properties-common
sudo add-apt-repository universe
Add the ROS2 GPG key and apt repository.
sudo apt update && sudo apt install curl gnupg2 lsb-release
sudo curl -sSL https://raw.githubusercontent.com/ros/rosdistro/master/ros.key -o /usr/share/keyrings/ros-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/ros-archive-keyring.gpg] http://packages.ros.org/ros2/ubuntu $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/ros2.list > /dev/null
Install the specified version of ROS2, using Foxy as an example.
sudo apt update
sudo apt install ros-foxy-desktop
echo "source /opt/ros/foxy/setup.bash" >> ~/.bashrc
source ~/.bashrc
To verify the installation, open a terminal and start the talker node.
ros2 run demo_nodes_cpp talker
Open a new terminal and start the listener node.
ros2 run demo_nodes_cpp listener
sudo apt install python3-colcon-common-extensions
git clone https://github.com/WayneMao/RoboMatrix.git ~/RoboMatrix
cd ~/RoboMatrix && colcon build
Install dependencies.
sudo apt install libopus-dev python3-pip
python3 -m pip install -U numpy numpy-quaternion pyyaml
Install the SDK from source.
python3 -m pip install git+https://github.com/jeguzzi/RoboMaster-SDK.git
python3 -m pip install git+https://github.com/jeguzzi/RoboMaster-SDK.git#"egg=libmedia_codec&subdirectory=lib/libmedia_codec"
pip install -r requirements.txt
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118
cd ~/RoboMatrix/robomatrix_client/robomatrix_client
git clone https://github.com/IDEA-Research/Grounding-DINO-1.5-API.git
cd Grounding-DINO-1.5-API
pip install -v -e .
Download the official RoboMaster app and follow its instructions to connect the robot to WiFi (WiFi5 only). Then connect the computer to the same WiFi network to complete the connection.
source ~/RoboMatrix/install/setup.bash
ros2 launch robomaster_ros collect_data.launch.py name:=example idx:=1 dir:=~/RoboMatrixDatasets
| Parameter | Definition | Example |
|---|---|---|
| name | A custom task name | move_to_box |
| idx | The sequence number of the current episode of the task | 10 |
| dir | The folder where the data is saved | ~/MyDatasets |
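To illustrate how the three parameters fit into the launch command, here is a small sketch that assembles the argument list in Python, which can be handy when collecting many episodes in a row. The helper `build_collect_command` is hypothetical and not part of RoboMatrix; only the `ros2 launch` command and its `name`, `idx`, and `dir` arguments come from the instructions above.

```python
import shlex
from pathlib import Path
from typing import List


def build_collect_command(name: str, idx: int, directory: str) -> List[str]:
    """Build the argument list for one data-collection episode."""
    if not name:
        raise ValueError("name must be a non-empty task name")
    # Expand "~" here because no shell does it when the list is passed
    # directly to subprocess.run.
    save_dir = Path(directory).expanduser()
    return [
        "ros2", "launch", "robomaster_ros", "collect_data.launch.py",
        f"name:={name}", f"idx:={idx}", f"dir:={save_dir}",
    ]


if __name__ == "__main__":
    # Print the command for the example values from the table.
    cmd = build_collect_command("move_to_box", 10, "~/MyDatasets")
    print(shlex.join(cmd))
```

Passing the returned list to `subprocess.run` would start the episode; the sketch only prints the command so it can be inspected first.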
Notes
- Make sure the robot is connected to the specified WiFi before running the launch file.
- Make sure the controller's button mode is XBOX, which you can view in the terminal. In the case of BEITONG, long press the POWER button to switch.
- Ensure that the robot initialization is complete before proceeding with the following operations.
Press the START button to begin recording the robot's status and to activate the other buttons on the gamepad, which lets you control the robot's movement.
The robot chassis uses speed control. The RS axis controls the translation speed of the chassis, and the LT and RT axes control its rotation speed.
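The speed-control mapping described above can be sketched roughly as follows. This is an illustration, not the RoboMatrix implementation: the speed limits, the deadzone, the axis ranges, and the sign conventions (which stick direction is forward, which trigger turns which way) are all assumptions.

```python
from dataclasses import dataclass

# Hypothetical limits; the real values are not stated in this README.
MAX_LINEAR_SPEED = 1.0   # m/s (assumed)
MAX_ANGULAR_SPEED = 1.0  # rad/s (assumed)
DEADZONE = 0.1           # ignore small stick noise (assumed)


@dataclass
class ChassisCommand:
    vx: float  # forward/backward translation speed
    vy: float  # left/right translation speed
    wz: float  # rotation speed


def apply_deadzone(value: float, deadzone: float = DEADZONE) -> float:
    """Return 0.0 for inputs smaller than the deadzone, else the input."""
    return 0.0 if abs(value) < deadzone else value


def chassis_command(rs_x: float, rs_y: float, lt: float, rt: float) -> ChassisCommand:
    """Map the RS stick (assumed [-1, 1] per axis) and the LT/RT triggers
    (assumed [0, 1]) to chassis speeds."""
    vx = apply_deadzone(rs_y) * MAX_LINEAR_SPEED
    vy = apply_deadzone(rs_x) * MAX_LINEAR_SPEED
    # Assumed convention: LT and RT rotate in opposite directions, so equal
    # pressure on both cancels out.
    wz = (apply_deadzone(lt) - apply_deadzone(rt)) * MAX_ANGULAR_SPEED
    return ChassisCommand(vx, vy, wz)
```

Because this is speed control, releasing the stick and triggers yields a zero command and the chassis stops.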
The robot arm uses position control. The HAT keys change the position of the end of the robot arm in the plane. Each press moves it a fixed distance in the specified direction.
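A minimal sketch of this step-per-press behavior is shown below. The step size, the workspace limits, the clamping, and the mapping from HAT directions to plane axes are hypothetical values chosen for illustration; the actual RoboMatrix code may differ.

```python
from typing import Dict, Tuple

# Hypothetical values: the real step size and workspace limits are not
# stated in this README.
STEP_M = 0.01
X_LIMITS = (0.08, 0.22)
Y_LIMITS = (-0.05, 0.15)

# Assumed mapping from a HAT direction to a unit move in the arm plane.
HAT_TO_DIRECTION: Dict[str, Tuple[int, int]] = {
    "left": (-1, 0),
    "right": (1, 0),
    "up": (0, 1),
    "down": (0, -1),
}


def clamp(value: float, limits: Tuple[float, float]) -> float:
    """Keep a coordinate inside the (low, high) limits."""
    low, high = limits
    return max(low, min(high, value))


def step_arm(position: Tuple[float, float], hat: str) -> Tuple[float, float]:
    """Return the new (x, y) target of the arm end after one HAT press."""
    dx, dy = HAT_TO_DIRECTION[hat]
    x = clamp(position[0] + dx * STEP_M, X_LIMITS)
    y = clamp(position[1] + dy * STEP_M, Y_LIMITS)
    return (x, y)
```

Unlike the chassis, the arm keeps its last target when no key is pressed, which is the practical difference between position control and speed control.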
The gripper control is binary. The A button opens the gripper fully, and the B button closes it fully.
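The binary gripper control can be sketched as a two-valued command. The names `GripperCommand` and `gripper_command` are hypothetical, and the handling of both buttons being pressed at once is an assumption, since the README does not cover that case.

```python
from enum import Enum
from typing import Optional


class GripperCommand(Enum):
    OPEN = "open"    # open to the maximum
    CLOSE = "close"  # close to the maximum


def gripper_command(a_pressed: bool, b_pressed: bool) -> Optional[GripperCommand]:
    """Translate the A/B button states into a gripper command."""
    # Assumption: pressing both buttons, or neither, sends no command.
    if a_pressed == b_pressed:
        return None
    return GripperCommand.OPEN if a_pressed else GripperCommand.CLOSE
```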
Press the BACK button to save the data, then press the POWER button to clean up the ROS2 node and wait for the video to finish saving.
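The START, BACK, and POWER sequence of one episode can be pictured as a small state machine. The sketch below only illustrates the order of operations described above; the state names and the rule that out-of-order presses are ignored are assumptions, not the behavior of the actual node.

```python
from enum import Enum, auto
from typing import Dict, Tuple


class State(Enum):
    IDLE = auto()       # waiting for START; other buttons are inactive
    RECORDING = auto()  # robot status is recorded, controls are active
    SAVED = auto()      # data saved, waiting for POWER
    CLOSED = auto()     # node cleaned up


# Button presses that advance the episode.
TRANSITIONS: Dict[Tuple[State, str], State] = {
    (State.IDLE, "START"): State.RECORDING,
    (State.RECORDING, "BACK"): State.SAVED,
    (State.SAVED, "POWER"): State.CLOSED,
}


def next_state(state: State, button: str) -> State:
    """Advance the episode state; presses that do not apply are ignored."""
    return TRANSITIONS.get((state, button), state)


if __name__ == "__main__":
    state = State.IDLE
    for button in ["START", "A", "BACK", "POWER"]:
        state = next_state(state, button)
        print(button, "->", state.name)
```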
Coming soon.
- Package Docker
- 🤗 Release Supervised Fine-tuning dataset
- Optimize VLA ROS communication
- Open source VLA Skill model code
- Release VLA Skill model weights
- Open source Shooting code
If you find our work helpful, please cite us:
@article{mao2024robomatrix,
title={RoboMatrix: A Skill-centric Hierarchical Framework for Scalable Robot Task Planning and Execution in Open-World},
author={Mao, Weixin and Zhong, Weiheng and Jiang, Zhou and Fang, Dong and Zhang, Zhongyue and Lan, Zihan and Jia, Fan and Wang, Tiancai and Fan, Haoqiang and Yoshie, Osamu},
journal={arXiv preprint arXiv:2412.00171},
year={2024}
}
- The implementation of the Vision-Language-Action (VLA) skill model is based on LLaVA.
- RoboMatrix-ROS is based on the official RoboMaster-SDK, a modified RoboMaster-SDK, and ROS2.
- Some additional libraries: Grounding-DINO-1.5, YOLO-World.