
Chapter 8: Multimodal Perception and Intelligent Decision-Making

Visual perception is the process by which a machine acquires environmental information through sensors and analyzes and interprets images using computer vision techniques. It encompasses tasks such as object detection and recognition, endowing the system with environmental awareness. Obstacle-avoidance decision-making builds on this perceived information, employing environment modeling, path planning, and intelligent decision-making algorithms to formulate behavior strategies that avoid obstacles, prevent collisions, and achieve predefined objectives. The two components are interdependent: visual perception supplies the environmental data, and decision-making acts on it. Together they play a critical role in fields such as autonomous driving and robot navigation, driving the intelligent application and development of unmanned systems in complex environments.


8.1 Background and Theory

Multimodal perception and intelligent decision-making technologies constitute the core pillars for intelligent unmanned systems to achieve efficient collaboration, autonomous operation, and safety assurance, forming a closed-loop mechanism of “perception–cognition–action.”


8.1.1 Multisource Information Fusion and Robust Perception

Traditional positioning and control schemes for unmanned systems often rely on a single sensor, such as GNSS satellite positioning. In heavily interfered environments (forests, urban canyons, over sea surfaces, or indoors), fusing multimodal sensor data (e.g., vision, LiDAR, IMU) has become essential. Techniques such as SLAM enable the construction of robust, continuous, and dynamically updatable environmental models.
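As a toy illustration of why fusing a drift-prone high-rate sensor with a noisy absolute one pays off, consider the classic one-dimensional complementary filter. This is a simplified stand-in for the IMU/vision fusion described above, not an RflySim API:

```python
def complementary_filter(gyro_rates, accel_angles, dt, alpha=0.98):
    """Fuse gyroscope and accelerometer pitch estimates (1-D toy example).

    gyro_rates:   angular-rate samples in rad/s (fast but drifting)
    accel_angles: absolute angle samples in rad (noisy but drift-free)
    """
    angle = accel_angles[0]  # initialize from the absolute sensor
    for rate, acc in zip(gyro_rates, accel_angles):
        # Integrate the gyro for short-term accuracy, then pull the
        # estimate toward the accelerometer to cancel long-term drift.
        angle = alpha * (angle + rate * dt) + (1 - alpha) * acc
    return angle

# A constant 0.1 rad tilt: gyro reads zero rate, accelerometer reads 0.1 rad.
est = complementary_filter([0.0] * 200, [0.1] * 200, dt=0.01)
```

Full SLAM pipelines replace this scalar blend with probabilistic estimators (EKF, factor graphs), but the principle of weighting complementary error characteristics is the same.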

8.1.2 Obstacle-Avoidance Path Planning and Intelligent Decision-Making

Obstacle-avoidance planning depends on fused perception data and leverages advanced methods—including deep learning, reinforcement learning, behavior trees, graph search, and optimization algorithms—to enable autonomous assessment of obstacle risks, dynamic adjustment of speed and heading, and real-time generation of safe and efficient paths.
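The graph-search family mentioned above can be illustrated with a minimal A* planner on an occupancy grid. This is an illustrative sketch, not code from the platform:

```python
import heapq

def astar(grid, start, goal):
    """A* on a 4-connected occupancy grid (0 = free, 1 = obstacle).

    Returns the list of cells from start to goal, or None if blocked.
    """
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan heuristic
    open_set = [(h(start), 0, start, [start])]
    best_g = {start: 0}
    while open_set:
        _, g, cur, path = heapq.heappop(open_set)
        if cur == goal:
            return path
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dr, cur[1] + dc)
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols and grid[nxt[0]][nxt[1]] == 0:
                ng = g + 1
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    heapq.heappush(open_set, (ng + h(nxt), ng, nxt, path + [nxt]))
    return None

grid = [[0, 1, 0],
        [0, 1, 0],
        [0, 0, 0]]
path = astar(grid, (0, 0), (0, 2))  # must route around the obstacle column
```

Real planners add costs for clearance and dynamics, but the risk-aware replanning loop described above reduces to repeatedly running a search like this over a freshly fused map.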


8.2 Framework and Interfaces

This section uses the RflySim toolchain, together with typical development cases, to introduce in detail the platform's support for intelligent perception and decision-making tasks, including sensor interfaces, data acquisition and processing workflows, and typical task algorithm architectures.


8.2.1 Image Acquisition in Virtual Environments

RflySim offers a high-fidelity virtual sensor simulation environment, supporting the generation of multimodal sensor data—including RGB vision, depth images, LiDAR, and IMU—providing realistic test data sources for visual perception algorithms.

8.2.2 Object Detection and Tracking

The platform supports algorithm validation for typical vision tasks, including object detection and tracking, path planning, and obstacle-avoidance strategies. It provides standardized interface frameworks to help developers efficiently transition from simulation validation to real-machine deployment.



8.3 Showcase of Outstanding Cases

Five-UAV Visual Shared SLAM Hardware-in-the-Loop Simulation:

Simulation Algorithm Development and Validation:


8.4 Course-Linked Video Lectures

Public Lecture Replay (Session 7: Multimodal Perception and Intelligent Decision-Making):

8.5 Chapter Experiment Cases

The related verification experiments and guided cases for this chapter are located in the [Installation Directory]\RflySimAPIs\8.RflySimVision folder.

8.5.1 Interface Learning Experiments

Stored in the 8.RflySimVision\0.ApiExps folder, covering foundational platform interface tutorials and general introductions to various tools.

Experiment 1: Binocular Camera System Calibration

📝 Experiment Overview: Acquire RflySim 3D images via Python interface, demonstrate binocular camera system calibration by altering the position and orientation of the chessboard, and learn visual sensor configuration and camera parameter tuning.

Experiment 2: Vision Image Capture Interface Experiment

📝 Experiment Overview: Acquire RflySim 3D images using the Python interface VisionCaptureApi, learn visual sensor configuration, camera parameter settings, aircraft control, and UE4 control, achieving real-time image capture and flight controller integration.

Experiment 3: NX and PX4 Joint Hardware-in-the-Loop (HIL) Ring-Penetration Simulation

📝 Experiment Overview: Implement hardware-in-the-loop simulation jointly between NX and Pixhawk6x. Automatically obtain the IP address via the Python interface ReqCopterSim, subscribe to image data using ROS, control the aircraft via MAVROS, and use the OpenCV library to achieve ring-penetration functionality. Learn visual sensor configuration and MAVLink communication setup.

Experiment 4: Visual Development Environment Configuration and Preliminary Knowledge

📝 Experiment Overview: Configure the RflySim visual development environment, including virtual machine setup, ROS environment configuration, NX and Pixhawk joint simulation, and visual box HIL simulation as preparatory experiments.

Experiment 5: MAVROS Python OFFBOARD Control Experiment

📝 Experiment Overview: Achieve OFFBOARD mode control of the UAV via the MAVROS Python interface, learning automatic simulation IP acquisition, rospy node programming, aircraft arming, and position setting.

Experiment 6: RflySim Vision Interface Experiment

📝 Experiment Overview: Acquire RflySim 3D images and perform real-time control via Python interface, learning the usage of vision interfaces, including acquisition and processing of multi-camera images, depth maps, point clouds, and other visual data.

Experiment 7: RflySim Vision UDP Direct Transmission with PNG Compression Experiment

📝 Experiment Overview: Implement distributed simulation via UDP direct transmission of PNG-compressed images. Images are received on a remote Linux system or another Windows PC, and flight control commands are sent back. Learn configuration of the SendProtocol transmission mode.
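The SendProtocol wire format itself is platform-specific; as a hedged sketch of the general send-compressed/receive-decompress pattern (not RflySim's actual protocol, and using zlib in place of PNG encoding), a loopback UDP round trip looks like this:

```python
import socket, zlib

# Toy "image": raw bytes standing in for a captured frame.
frame = bytes(range(256)) * 64          # 16 KiB of sample pixel data
payload = zlib.compress(frame)          # stand-in for PNG/JPEG compression

recv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv.bind(("127.0.0.1", 0))             # OS picks a free port
port = recv.getsockname()[1]

send = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send.sendto(payload, ("127.0.0.1", port))   # one datagram per (small) frame

data, _ = recv.recvfrom(65535)
restored = zlib.decompress(data)
send.close(); recv.close()
```

In the actual experiment the receiver runs on a remote Linux host and large frames are fragmented across datagrams; the principle of compress-then-transmit is the same.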

Experiment 8: MAVROS C++ OFFBOARD Control

📝 Experiment Overview: Switch the aircraft to OFFBOARD mode and arm it via MAVROS C++ program, set fixed coordinate points for position control, and learn key technologies such as ROS nodes, topic publishing, and service calls.

Experiment 9: Visual Box Hardware-in-the-Loop Simulation (Ring-Penetration via Serial Port)
  • 📦 Version Requirement: Free Edition

    📝 Experiment Overview:
    Implement hardware-in-the-loop simulation for the Vision Box, automatically obtain IP addresses via the Python interface ReqCopterSim, establish co-simulation between RflySim 3D and CopterSim, integrate ROS and MAVROS to control the aircraft for closed-loop flight, and learn visual sensor configuration and the SendProtocol image transmission mode.

Experiment 10: Automatic Generation of AI Training Dataset

📝 Experiment Overview:
Automatically generate AI training datasets using the Python interface VisionCaptureApi. Image data is output in VOC format, and point cloud data in KITTI format, suitable for training object detection and recognition models.
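The VOC image-annotation format mentioned above is plain XML. A minimal sketch of writing one annotation with the standard library follows; the tag layout is the public Pascal VOC convention, not an RflySim API, and the file name and class label are illustrative:

```python
import xml.etree.ElementTree as ET

def voc_annotation(filename, width, height, objects):
    """Build a minimal Pascal VOC annotation tree.

    objects: list of (class_name, xmin, ymin, xmax, ymax) in pixels.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"
    for name, xmin, ymin, xmax, ymax in objects:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = name
        box = ET.SubElement(obj, "bndbox")
        for tag, val in zip(("xmin", "ymin", "xmax", "ymax"),
                            (xmin, ymin, xmax, ymax)):
            ET.SubElement(box, tag).text = str(val)
    return root

xml_text = ET.tostring(
    voc_annotation("frame_0001.png", 640, 480, [("drone", 120, 80, 260, 210)]),
    encoding="unicode")
```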

Experiment 11: MAVROS C++ Control Interface Experiment

📝 Experiment Overview:
Control the aircraft via a C++ program using the MAVROS interface, achieving MAVLink-ROS message conversion. Demonstrates automatic IP acquisition and multi-mode aircraft control in distributed simulation.

Experiment 12: Multi-Camera Image Acquisition Experiment

📝 Experiment Overview:
Acquire RGB, grayscale, and depth images from three cameras using the Python interface. Covers visual sensor configuration, real-time camera parameter adjustment, and aircraft control, including usage of the VisionCaptureApi interface and UE4 control.

Experiment 13: PX4ApiTest Control Demonstration

📝 Experiment Overview:
Demonstrate aircraft control using the Python interface PX4MavCtrlV4.py, including usage of position, velocity, attitude, and acceleration control commands.

Experiment 14: RflySim 3D Object Position Acquisition

📝 Experiment Overview:
Acquire position information of dynamically created objects in RflySim 3D using the Python interface. Learn usage of the getUE4Pos function to achieve real-time aircraft position data acquisition.

Experiment 15: Point Cloud Segmentation Experiment

📝 Experiment Overview:
Acquire segmented point cloud data via the Python vision interface, enabling real-time point cloud display and processing. Covers visual sensor configuration, Open3D-based point cloud visualization, and drone control.

Experiment 16: UDP Direct Transmission of Point Cloud Data — Introductory Experiment

📝 Experiment Overview:
Learn to receive point cloud data sent by RflySim3D via UDP direct transmission using the Python interface VisionCaptureApi.py, and visualize the point cloud in a graphical interface. Master configuration of the SendProtocol transmission mode.

Experiment 17: Lightweight UAV Mass-Point Model Control Experiment
  • 📦 Version Requirement: Free Edition

    📝 Experiment Overview:
    Implements point-mass-based drone control via the Python-based PX4MavCtrl interface, delivering dynamic performance comparable to software/hardware-in-the-loop simulations, significantly reducing CPU resource consumption and enhancing flight stability.

Experiment 18: Getting Started with VMware

📝 Experiment Overview:
Learn fundamental VMware virtual machine operations, including installation and configuration, network mode selection (Bridged/NAT), basic settings, and login procedures, enabling successful VM startup and network configuration.

Experiment 19: Mid360 LiDAR Simulation

📝 Experiment Overview:
Simulates the Livox Mid360 LiDAR sensor using RflySim, establishing a complete data pipeline from sensor simulation to ROS-based visualization, and validating PX4 Offboard flight control functionality.

Experiment 20: Timestamp Acquisition

📝 Experiment Overview:
Acquires timestamp data via Python interfaces, learning to use mav.StartTimeStmplisten and vis.StartTimeStmplisten to monitor aircraft timestamps. Data includes checksum, aircraft ID, simulation start timestamp, current timestamp, and heartbeat counter.

Experiment 21: RflySim Fisheye Camera Experiment

📝 Experiment Overview:
Demonstrates fisheye camera usage in RflySim for vision-based simulation. Covers visual sensor configuration, image acquisition via VisionCaptureApi, and drone control via MAVLink.

Experiment 22: Simulated Pod UI Control System

📝 Experiment Overview:
Implements a complete UI-based control system for simulated pods in RflySim. Covers control of pod pitch/yaw angles, zoom, and magnification parameters; explores interaction mechanisms between visual sensors and the simulation environment; and introduces principles of AI-based target recognition and tracking.

Experiment 23: Distributed Vision Control

📝 Experiment Overview:
Transmits image data (PNG/JPG, compressed or uncompressed) directly via UDP to remote Linux or Windows machines, receives images, and sends back aircraft control commands—enabling distributed vision control and multi-simulation testing.

Experiment 24: RflySim Vision API – Direct UDP Uncompressed Image Transmission

📝 Experiment Overview:
Implements distributed co-simulation via direct UDP transmission of uncompressed PNG images to remote Linux or Windows machines, with image reception and aircraft command feedback. Focuses on configuring SendProtocol to 2 for uncompressed image transmission.

Experiment 25: Camera Calibration

📝 Experiment Overview:
Uses the Python VisionCaptureApi interface to capture RflySim 3D images, updates camera parameters in real time, collects calibration images, and performs intrinsic camera parameter calibration using MATLAB—learning camera imaging geometry calibration methods.

Experiment 26: ROS Environment Setup in Ubuntu Virtual Machine
  • 📦 Version Requirement: Free Edition

    📝 Experiment Overview:
    Master Ubuntu virtual machine configuration methods; learn ROS 1/2 installation and MAVROS configuration; familiarize yourself with installing commonly used libraries such as PCL and OpenCV; understand application scenarios for bridge mode and NAT mode network configurations.

Experiment 27: IMU and Camera Data Acquisition Experiment

📝 Experiment Overview:
This experiment acquires IMU and camera data via Python interface, teaching the use of the VisionCaptureApi interface, including configuring visual sensors, capturing images, dynamically modifying camera parameters in real time, and controlling aircraft flight.

Experiment 28: LiDAR Point Cloud API Display Experiment

📝 Experiment Overview:
Acquires LiDAR point cloud data via Python interface and displays it in real time. Learn to use the VisionCaptureApi visual interface and the Open3DShow point cloud visualization functionality, mastering platform-based image acquisition and shared-memory-based point cloud visualization techniques.

Experiment 29: Python Mavsdk Control Experiment

📝 Experiment Overview:
Demonstrates aircraft control using the Python mavsdk library. Automatically acquires IP addresses via ReqCopterSim, enabling distributed online simulation. Learn MAVLink communication and offboard control methods.

Experiment 30: Image Acquisition Without CopterSim

📝 Experiment Overview:
Acquires RflySim 3D image data via the Python interface VisionCaptureApi without launching CopterSim, and dynamically updates camera parameters (pose, position, FOV, etc.) in real time. Focuses on mastering the usage of visual sensor APIs and camera configuration.

Experiment 31: ROS Image Data Subscription Experiment

📝 Experiment Overview:
Subscribes to and acquires image data from RflySim via ROS. Learn distributed simulation online configuration, automatic IP acquisition via ReqCopterSim, visual sensor configuration, and rospy image topic subscription, enabling cross-platform image data transmission and processing.

Experiment 32: Vision Box Network Port Loopback HIL Simulation

📝 Experiment Overview:
Implements hardware-in-the-loop (HIL) simulation for a vision box via network port. Automatically acquires IP addresses via Python interface, establishes online connection between RflySim3D and CopterSim, and uses ROS and MAVROS to control the aircraft for ring-through missions.

Experiment 33: UDP LiDAR Point Cloud Data Transmission Experiment

📝 Experiment Overview:
Sends image acquisition requests to RflySim via Python interface VisionCaptureApi and PX4MavCtrler, receives point cloud data in UDP direct transmission mode, and dynamically plots point clouds in the virtual machine. Key focus: configuring SendProtocol transmission mode and mastering automatic IP acquisition via ReqCopterSim.

Experiment 34: RflySim Visual AI Interface Experiment
  • 📦 Version Requirement: Free Edition

    📝 Experiment Overview: A suite of six sub-experiments for the Vision AI interface, covering binocular calibration, camera calibration, AI training dataset generation, YOLO dataset generation, UE4 camera model derivation, and 3D position estimation. These experiments utilize the Python interface VisionCaptureApi to achieve real-time acquisition and processing of RflySim 3D images.

Experiment 35: Vision Sensor UDP Direct Transmission with JPEG Compression

📝 Experiment Overview: Implements distributed simulation with direct UDP transmission of JPEG-compressed images. Images are received on a remote Linux system (e.g., WinWSL, virtual machine, onboard board, intelligent vision box) or another Windows machine, and aircraft control commands are sent back.

Experiment 36: Derivation of Ideal UE4 Camera Model

📝 Experiment Overview: Uses the Python interface to acquire RflySim 3D images, applies object detection algorithms to derive the ideal UE4 camera model, calculates focal length and intrinsic/extrinsic matrices, and verifies the accuracy of camera parameters under different field-of-view angles.
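The ideal pinhole relation this experiment derives is standard: for a horizontal field of view θ and image width W, the focal length in pixels is f = W / (2·tan(θ/2)), with the principal point at the image center. A minimal sketch (not the experiment's own script):

```python
import math

def ideal_intrinsics(width, height, hfov_deg):
    """Intrinsic matrix of an ideal pinhole camera (e.g. a UE4 render camera).

    From hfov = 2*atan(W / (2*f)), the focal length in pixels is
    f = W / (2 * tan(hfov/2)); the principal point is the image center.
    """
    f = width / (2.0 * math.tan(math.radians(hfov_deg) / 2.0))
    cx, cy = width / 2.0, height / 2.0
    return [[f,   0.0, cx],
            [0.0, f,   cy],
            [0.0, 0.0, 1.0]]

K = ideal_intrinsics(640, 480, 90.0)   # f = 320 px for a 90-degree HFOV
```

Verifying this f against a calibration result (as in Experiment 25) is a quick sanity check that the simulated camera really is distortion-free.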

Experiment 37: Vision Hardware-in-the-Loop Simulation – Modified sysID Loop-through

📝 Experiment Overview: Automatically acquires IP addresses via the Python interface ReqCopterSim.py, establishes co-simulation between RflySim 3D and CopterSim, subscribes to image data via ROS, and controls the aircraft via MAVROS, enabling hardware-in-the-loop loop-through experiments with modified sys_id.

Experiment 38: Depth Map Acquisition

📝 Experiment Overview: Configures camera parameters and acquires depth map data via the Python interface. Covers usage of the VisionCaptureApi vision interface, configuration of depth cameras (TypeID=2), real-time modification of camera pose and position, and depth map reading methods.

Experiment 39: Livox LiDAR Point Cloud Visualization

📝 Experiment Overview: Uses the Python interface to control DJI Livox LiDAR scanning, acquires point cloud data, and visualizes it in real time using Open3D.

Experiment 40: Python MAVROS Control Experiment

📝 Experiment Overview: Demonstrates automatic acquisition of the simulation computer’s IP address and aircraft control via the Python MAVROS interface. Covers usage of the ReqCopterSim interface and rospy topic publishing/subscribing and service calls, enabling aircraft control in distributed simulation.

Experiment 41: Getting Started with Vision Hardware-in-the-Loop Kit

📝 Experiment Overview: Introduces the basic usage of the vision hardware-in-the-loop kit, including hardware configuration and connection setup for the NVIDIA Jetson NX and PX4 flight controller.

Experiment 42: UDP Direct Transmission of Point Cloud Data
  • 📦 Version Requirement: Free Edition

    📝 Experiment Overview: Send image capture requests via the Python interface VisionCaptureApi.py, receive processed world-coordinate point cloud data in UDP direct transmission mode, and dynamically display the point cloud in the virtual machine.

Experiment 43: MAVROS Vision-Based Ring-Through Control

📝 Experiment Overview: Automatically obtain IP addresses via Python interface, establish co-simulation between RflySim 3D and CopterSim, subscribe to image data via ROS, and control the aircraft using MAVROS to achieve drone ring-through functionality based on OpenCV.

Experiment 44: UE4 Direct UDP JPEG-Compressed Distributed Simulation

📝 Experiment Overview: Automatically obtain IP addresses via Python interface ReqCopterSim, establish co-simulation between RflySim3D and CopterSim, and implement multi-window distributed visual transmission of UDP direct-transmitted JPEG-compressed images, including configuration and control of three visual sensors.

Experiment 45: Distributed UDP-Compressed Image Transmission (Automatic IP Acquisition)

📝 Experiment Overview: This experiment implements distributed image transmission and control command feedback from Windows to Linux/WinWSL using UDP compressed image transmission mode, and teaches how to use ReqCopterSim to automatically acquire IP addresses for co-simulation.

Experiment 46: Depth Map to Point Cloud Conversion

📝 Experiment Overview: Acquire depth map data via Python vision interface, convert it into point cloud images, and display them in real time. Learn visual sensor configuration, point cloud visualization, and flight controller control interface usage.
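The depth-map-to-point-cloud step is a pinhole back-projection. A minimal NumPy sketch follows; the camera parameters f, cx, cy are illustrative placeholders, not values from the platform's configuration:

```python
import numpy as np

def depth_to_points(depth, f, cx, cy):
    """Back-project a depth map (meters) into camera-frame 3-D points.

    Pinhole model: X = (u - cx) * Z / f, Y = (v - cy) * Z / f, Z = depth.
    Returns an (N, 3) array for pixels with valid (non-zero) depth.
    """
    v, u = np.indices(depth.shape)
    z = depth.astype(np.float64)
    x = (u - cx) * z / f
    y = (v - cy) * z / f
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]

depth = np.full((4, 4), 2.0)          # flat wall 2 m in front of the camera
pts = depth_to_points(depth, f=100.0, cx=2.0, cy=2.0)
```

The resulting (N, 3) array is exactly the shape Open3D's point-cloud objects consume for display.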

Experiment 47: GetCamObjDemo – Camera Object Information Acquisition

📝 Experiment Overview: Acquire information about aircraft, objects, and cameras via Python interface; learn to control aircraft using UE4CtrlAPI.py and PX4MavCtrl.py, and retrieve visual sensor data.

Experiment 48: Accurate 3D Position Acquisition of Objects

📝 Experiment Overview: Call the sendUE4Pos function via Python interface to generate aircraft and spheres, acquire 3D coordinates of camera, object, and target center, and compute relative positional relationships among objects.

Experiment 49: Python MAVROS Drone Control

📝 Experiment Overview: Demonstrate offboard mode control of drones via Python MAVROS interface; learn automatic IP acquisition via ReqCopterSim and usage of rospy node subscription and publishing.

Experiment 50: Multi-Visual-Box Collaborative Ring-Through Simulation
  • 📦 Version Requirement: Free Edition

    📝 Experiment Overview: Conduct joint hardware-in-the-loop simulation using two NX vision boxes, automatically obtain IP addresses via Python interface, and integrate ROS and MAVROS to control the aircraft for visual navigation experiments involving ring-penetration tasks.

Experiment 51: Point Cloud Data Visualization

📝 Experiment Overview: Automatically obtain IP addresses via the Python interface ReqCopterSim, transmit point cloud data via UDP, and visualize the point clouds using the Open3DShow interface.

Experiment 52: Direct UDP Transmission of PNG-Compressed Images

📝 Experiment Overview: Configure SendProtocol to enable UDP-based compressed image transmission. Learn to automatically obtain IP addresses via ReqCopterSim, and implement distributed co-simulation where images are received and aircraft control commands are sent back from remote systems such as WSL or virtual machines.

Experiment 53: VisionCapAPI IMU Data Acquisition

📝 Experiment Overview: Acquire IMU data from CopterSim via the Python interface VisionCaptureApi.py, and learn how to configure requests for IMU data transmission and data reading using vision interfaces.

Experiment 54: Automatic YOLO Dataset Generation via RflySim Vision

📝 Experiment Overview: Use the Python interface VisionCaptureApi to retrieve RflySim 3D images and camera parameters, automatically generate datasets in YOLO format, and split them into training and testing sets using maketxt.py.
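The YOLO label format used by the generated dataset is one normalized line per object: class id, box center, and box size, all scaled to [0, 1]. A hedged sketch of that conversion and a maketxt-style split (illustrative, not the platform's actual script):

```python
import random

def to_yolo_line(cls_id, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a pixel box to a YOLO label line: class cx cy w h (normalized)."""
    cx = (xmin + xmax) / 2.0 / img_w
    cy = (ymin + ymax) / 2.0 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return f"{cls_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"

def split_dataset(names, train_ratio=0.8, seed=0):
    """Shuffle image names and split into train/test lists (maketxt-style)."""
    names = list(names)
    random.Random(seed).shuffle(names)
    cut = int(len(names) * train_ratio)
    return names[:cut], names[cut:]

line = to_yolo_line(0, 120, 80, 260, 210, 640, 480)
train, test = split_dataset([f"img_{i:04d}" for i in range(100)])
```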

Experiment 55: Distributed UDP Compressed Image Transmission (Manual IP Configuration)

📝 Experiment Overview: Transmit PNG-compressed images via UDP to a remote Linux system or another Windows computer, and send aircraft control commands back. This experiment requires manually setting IP addresses instead of automatic acquisition.

Experiment 56: Camera Segmented Image Acquisition

📝 Experiment Overview: Acquire RGB and segmented images from RflySim 3D via the Python interface VisionCaptureApi, and learn visual sensor configuration, camera parameter adjustment, and aircraft control.

Experiment 57: Linux Image Reception and ROS Publishing

📝 Experiment Overview: Use a Python interface on a remote Linux system to request sensor data from RflySim 3D, receive images and point clouds, forward them into ROS, and perform visualization using RViz in a distributed simulation setup.

Experiment 58: Vision Interface UDP Transmission Latency Test

📝 Experiment Overview: Acquire IMU and image timestamps via the Python vision interface VisionCaptureApi, compute the minimum achievable image capture latency under UDP network transmission, and evaluate latency performance at a 200 Hz image capture frequency.

Experiment 59: ROS System TF Tree Configuration Modification Experiment

📝 Experiment Overview: Customize and modify the frame_id of the ROS system's TF tree via the Config.json configuration file and Python interface, enabling TF coordinate system configuration and modification for sensor data topics. This experiment helps master TF tree construction methods in distributed simulation.

Experiment 60: One-Click PyTorch Environment Installation

📝 Experiment Overview: Learn to quickly configure a PyTorch deep learning environment using a one-click script, including automatic installation of dependencies such as CUDA and cuDNN.

Experiment 61: Ranging Sensor Experiment

📝 Experiment Overview: Create a laser ranging sensor via the Python interface and acquire ranging data in real time. This experiment covers key topics including visual sensor configuration, distance data acquisition, image display, visual interface usage, sensor parameter configuration, and aircraft control commands.

Experiment 62: Simulation of Three Position Tracking Controllers

📝 Experiment Overview: Use the Python interface PX4MavCtrlV4.py to simultaneously control the aircraft’s target position and forward velocity during visual control. This experiment teaches the usage of three position tracking controllers: PosCtrl, VelCtrlBody, and VelCtrlEarth. Note: This experiment supports execution only in the Windows Python environment.
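The difference between a body-frame and an earth-frame velocity command (as in VelCtrlBody versus VelCtrlEarth) is only the frame in which the vector is expressed: a body-frame command must be rotated by the current yaw before it matches an earth-frame one. A minimal sketch of that rotation (our illustration, not platform code):

```python
import math

def body_to_earth(vx_body, vy_body, yaw):
    """Rotate a horizontal body-frame velocity command into the earth frame.

    yaw is the heading of the body x-axis measured in the earth frame.
    """
    vx = vx_body * math.cos(yaw) - vy_body * math.sin(yaw)
    vy = vx_body * math.sin(yaw) + vy_body * math.cos(yaw)
    return vx, vy

# With the nose yawed 90 degrees, a pure "forward" command maps onto the earth y-axis.
vx_e, vy_e = body_to_earth(1.0, 0.0, math.pi / 2)
```

Position control (PosCtrl) closes a further loop on top of this, turning a position error into such a velocity command.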

Experiment 63: AirSim Interface Experiment

📝 Experiment Overview: Control the drone using the AirSim API via the Python interface, achieving position and attitude control.

Experiment 64: Infrared Grayscale and Thermal Image Acquisition

📝 Experiment Overview: Acquire infrared grayscale and thermal camera images from RflySim 3D using the Python interface VisionCaptureApi, mastering visual sensor configuration and image acquisition methods.

Experiment 65: Livox LiDAR UDP Direct Point Cloud Transmission Experiment

📝 Experiment Overview: Send image acquisition requests to RflySim 3D via the Python interface, retrieve 10 Hz point cloud data from the DJI Livox LiDAR, and use Open3D to display point clouds in real time, achieving distributed online simulation.

Experiment 66: Ceres and OpenCV Installation

📝 Experiment Overview: Learn how to quickly install the Ceres optimization library and OpenCV vision library on Ubuntu via scripts, including configuration files and one-click execution scripts, helping users rapidly set up a visual algorithm development environment.

Experiment 67: Getting Started with Anaconda

📝 Experiment Overview: Learn to manage Python environments using Anaconda, including basic operations such as environment creation, activation, and package installation and management.

Experiment 68: Point Cloud Data Transmission Experiment
  • 📦 Version Requirement: Free Edition

    📝 Experiment Overview:
    Learn to receive RflySim 3D point cloud data via both shared memory and UDP methods, and master the transmission and processing workflows for LiDAR and world-coordinate-system point clouds.

Experiment 69: UDP Direct Transmission of Camera Gimbal Data Simulation

📝 Experiment Overview:
Run the server on Ubuntu and transmit image data via UDP direct transmission. Process the returned images, subscribe to the screenshot-emitter view-window messages and gimbal control messages, and publish camera and gimbal data topics.

Experiment 70: Serial Port Hardware-in-the-Loop (HITL) Simulation

📝 Experiment Overview:
Implement hardware-in-the-loop simulation with two serial port communications using PX4MavCtrl. Learn to configure serial port parameters, baud rates, and communication connections with the flight controller, and complete control command transmission and flight validation.

Experiment 71: OpenCV 4.10 Source Code Compilation on Ubuntu 22.04

📝 Experiment Overview:
Compile OpenCV 4.10 source code offline in the Ubuntu 22.04 environment, configure CUDA/GPU acceleration parameters, and complete compilation and installation via automated scripts or manual commands. Verify CUDA functionality for both C++ and Python versions.

8.5.2 Fundamental Usage Experiments

Stored in the 8.RflySimVision\1.BasicExps folder, these experiments provide a complete set of supplementary teaching materials for beginners.

Experiment 1: Point-Mass Model Visual Ring-Crossing Experiment

📝 Experiment Overview:
A lightweight drone ring-crossing experiment based on a point-mass model. Using the Python interface VisionCaptureApi.py, obtain RflySim 3D images, identify ring positions, and control the drone to complete the ring-crossing task. Learn visual sensor configuration and point-mass model control methods.

Experiment 2: Drone Visual Tracking of a Ball

📝 Experiment Overview:
Acquire images via the RflySim visual interface, use OpenCV to detect red balls and compute their centroids, and send velocity commands through PX4MavCtrl to guide the drone toward the ball, completing a visual tracking task.
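The centroid computation here is the moments-based centroid (u = M10/M00, v = M01/M00). An equivalent NumPy sketch on a synthetic binary mask follows; it is illustrative only, since a real pipeline would build the mask from the camera image with cv2.inRange:

```python
import numpy as np

def mask_centroid(mask):
    """Centroid (u, v) of the nonzero pixels of a binary mask, or None if empty.

    Equivalent to the moments-based centroid: u = M10/M00, v = M01/M00.
    """
    vs, us = np.nonzero(mask)
    if us.size == 0:
        return None
    return us.mean(), vs.mean()

# Synthetic "red ball": a filled square blob in an otherwise empty frame.
mask = np.zeros((120, 160), dtype=np.uint8)
mask[40:60, 70:90] = 1
centroid = mask_centroid(mask)  # offset from the image center drives velocity commands
```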

Experiment 3: Drone Circular Target Following

📝 Experiment Overview:
Control a drone to follow a moving red circular target using computer vision techniques. Learn image processing algorithms for target recognition, visual servo control, and the application of PID control in drone following tasks to achieve real-time target tracking.
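The PID loop used for following can be sketched in a few lines. The gains and the toy one-axis plant below are illustrative, not tuned values from the experiment:

```python
class PID:
    """Minimal PID controller for a single axis (e.g. horizontal tracking error)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, err):
        self.integral += err * self.dt
        deriv = (err - self.prev_err) / self.dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv

# Close a toy loop: the drone's position chases a target at x = 5 m,
# with the controller output interpreted as a velocity command.
pid, x = PID(kp=1.2, ki=0.1, kd=0.3, dt=0.05), 0.0
for _ in range(400):
    x += pid.step(5.0 - x) * 0.05
```

In the actual task the error fed to the controller is the pixel offset between the detected target and the image center, on two axes.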

Experiment 4: Binocular Optical Face Recognition

📝 Experiment Overview:
Configure binocular optical cameras via the Python visual interface, acquire RflySim 3D images, and perform real-time face detection and bounding using cv2 cascade classifiers to achieve binocular visual face recognition.

Experiment 5: Screen-Capture Collision with Ball
  • 📦 Version Requirement: Free Edition

    📝 Experiment Overview:
    Acquire RflySim 3D window images via the screen capture API, detect the ball’s position using computer vision algorithms, and control the drone to collide with the ball. This experiment covers key techniques including window handle manipulation, image acquisition, and Offboard-mode velocity control.

Experiment 6: Basic Vision Control Experiment

📝 Experiment Overview:
Covers hands-on learning and practice of lightweight drone vision-based algorithms, including ring-crossing, gimbal control, target tracking, distributed cooperative control, and face recognition.

Experiment 7: RflySim Basic Vision Competition Interface Experiment

📝 Experiment Overview:
Validates the end-to-end data flow of perception pipelines in the platform, including cameras, depth cameras, and LiDAR—covering sensor data acquisition → ROS topic publishing → SLAM odometry → flight controller fusion. The experiment also teaches MAVROS control interface usage and ENU/NED coordinate system mapping.
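The ENU (ROS/MAVROS) to NED (PX4) mapping taught here is a swap-and-negate on position and velocity vectors. A minimal sketch (vector mapping only; yaw/heading sign conventions are a separate concern):

```python
def enu_to_ned(e, n, u):
    """Map an ENU vector (ROS/MAVROS convention) to NED (PX4 convention).

    (x_ned, y_ned, z_ned) = (north, east, -up). Applying the same
    swap-and-negate twice returns the original vector, so the mapping
    is its own inverse and also serves as ned_to_enu.
    """
    return n, e, -u

ned = enu_to_ned(1.0, 2.0, 3.0)   # north=2, east=1, down=-3
```

MAVROS performs this conversion internally on its standard topics; knowing it explicitly matters when comparing SLAM odometry (usually ENU) against flight-controller logs (NED).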

Experiment 8: Drone Vision-Based Ring-Crossing Experiment

📝 Experiment Overview:
Acquire UE4-rendered images via the screen capture API, identify ring positions using computer vision libraries, compute velocity control commands for the drone, and realize a vision-based control experiment where the drone takes off, sequentially flies through three rings, and lands automatically.

Experiment 9: RflySim Platform Basic Functionality Demo

📝 Experiment Overview:
Learn simulation setup and data acquisition for vision/LiDAR-based localization and obstacle avoidance using RGB cameras, depth cameras, grayscale cameras, and LiDAR sensors.

Experiment 10: Drone Recognition and Path Planning Competition

📝 Experiment Overview:
A comprehensive competition-style experiment implemented via ROS 1, covering drone takeoff, YOLO-based box detection, ArUco-based moving vehicle detection, path planning for obstacle avoidance, and precise landing. The experiment focuses on vision-based navigation and target recognition techniques.

Experiment 11: LLM-Based Drone Control

📝 Experiment Overview:
Conducts drone control experiments using Large Language Models (LLMs), enabling tasks such as takeoff, frame crossing, and QR code recognition via natural language commands. The experiment focuses on integrating LLMs with drone control systems.

Experiment 12: LLM-Based Drone Control with LiDAR SLAM

📝 Experiment Overview:
Uses a large language model to control the drone via natural language commands, combined with Mid-360 LiDAR SLAM for autonomous localization. Tasks include takeoff, frame crossing, and QR code recognition. The experiment explores LLM applications in drone vision-based navigation.

Experiment 13: RflySim Air-Ground Cooperative Racing Simulation

📝 Experiment Overview:
Teaches deployment and operation of the RflySim simulation environment in air-ground cooperative racing scenarios, including sensor configuration for drones and ground vehicles, SLAM-based localization, and motion control algorithm development.

Experiment 14: Drone Following a Circular Board

📝 Experiment Overview:
Acquires RflySim 3D images via the Python vision interface VisionCaptureApi, detects the circular board’s position, and uses keyboard input to control the board’s movement direction. The drone follows the moving board, forming a vision-based control experiment. This experiment covers vision interface usage, drone control commands, and UE control.

Experiment 15: Keyboard-Controlled Gimballed Vision
  • 📦 Version Requirement: Free Edition

    📝 Experiment Overview:
    Control the gimbal’s pitch and yaw angles using the arrow keys; use Right Ctrl + arrow keys to control roll angle; use Alt + Up/Down arrows to adjust focal length. This experiment teaches keyboard interaction and shared-memory-based image data transmission.

Experiment 16: Three-UAV Distributed Control

📝 Experiment Overview:
This experiment implements distributed control of three UAVs, where they take off sequentially and visually pass through rings. It introduces fundamental operations in UAV distributed control and vision-based control.

Experiment 17: Three-UAV Distributed Ring-Crossing Control

📝 Experiment Overview:
Implements distributed vision-based ring crossing for three UAVs using the screen capture API. Students learn the ScreenCapAPI, PX4MavCtrl interface, and UAV vision control techniques.

Experiment 18: Two-UAV Distributed Control

📝 Experiment Overview:
Implements distributed control and vision-based ring crossing for two UAVs, teaching the steps for multi-UAV software/hardware-in-the-loop simulation within the RflySim toolchain.

Experiment 19: Two-UAV Distributed Control Experiment

📝 Experiment Overview:
This experiment uses two Python scripts to acquire images via screen capture, processes them to generate velocity control commands, and enables two UAVs to perform distributed ring crossing. It introduces the screen capture API and UAV vision-based control.

8.5.3 Advanced Development Experiments

Located in the 8.RflySimVision\2.AdvExps folder, these experiments further familiarize users with low-level configuration of the platform's firmware and software ecosystem.

Experiment 1: Ubuntu Gimbal Vision Keyboard Control

📝 Experiment Overview:
Conducted in an Ubuntu virtual machine environment, this experiment receives image data via UDP direct transmission and uses arrow keys to control the gimbal’s pitch, yaw, and roll angles, while the Alt key adjusts the field of view, enabling real-time visual sensor control.

Experiment 2: Windows Gimbal Vision Keyboard Control

📝 Experiment Overview:
Uses keyboard arrow keys and key combinations to control the gimbal’s pitch, yaw, roll angles, and focal length. Students learn RflySim sensor configuration and keyboard interaction control methods.

Experiment 3: Rviz Gimbal Vision Keyboard Simulation Experiment

📝 Experiment Overview:
Introduces gimbal vision control using ROS and Rviz. Keyboard inputs (Up/Down/Left/Right/Ctrl+Left-Right/Alt+Up-Down) control the gimbal camera’s pitch, yaw, roll, and focal length, enabling simulation of gimbal vision control.

Experiment 4: RflySim Platform Vision SLAM Experiment
  • 📦 Version Requirement: Free Edition

    📝 Experiment Overview:
    Automatically obtain the IP address via the Python interface ReqCopterSim, enabling distributed co-simulation between RflySim 3D and CopterSim. Visual sensors are employed for SLAM-based control, and real-time path planning control is achieved by traversing a sequence of target coordinates, adjusting heading angles and velocities accordingly.

Experiment 5: LiDAR SLAM Experiment

📝 Experiment Overview:
Utilizing the Python interface ReqCopterSim, this experiment conducts SLAM co-simulation among the drone, RflySim 3D, and CopterSim. It realizes real-time mapping and localization using a single-line LiDAR, enabling autonomous obstacle avoidance and navigation for the drone.

Experiment 6: YOLO-Based Ball Collision Experiment

📝 Experiment Overview:
Acquiring images via the RflySim platform interface, this vision-based navigation experiment detects balloon positions using the YOLO algorithm and controls the drone to autonomously collide with the balloon.

Experiment 7: RflySim Single-Object Tracking Experiment

📝 Experiment Overview:
Capturing images from RflySim 3D via a visual interface, this experiment employs OpenCV object tracking algorithms to control one drone to track another. It provides hands-on experience with visual interfaces, camera configuration, and drone control.

Experiment 8: Visual Servoing Object Following Experiment

📝 Experiment Overview:
Using the Python interface VisionCaptureApi.py to acquire images from RflySim 3D, this experiment leverages the platform’s built-in object detection/tracking outputs and applies visual servoing algorithms to achieve real-time following of highly maneuverable targets by the drone. It covers the principles of visual servoing control and drone target tracking techniques.
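The proportional mapping at the heart of such a visual servoing loop can be sketched as follows; the gains, command convention, and function names are illustrative assumptions, not the platform's API:

```python
# Minimal image-based visual servoing (IBVS) sketch: a proportional
# controller maps the target's pixel offset from the image center to
# velocity commands. Gains and the NED sign convention (+z down) are
# assumptions for illustration.
def servo_command(bbox_center, img_size, k_yawrate=0.002, k_vz=0.002):
    cx, cy = bbox_center
    w, h = img_size
    err_x = cx - w / 2.0           # +: target is right of center
    err_y = cy - h / 2.0           # +: target is below center
    yaw_rate = k_yawrate * err_x   # turn toward the target
    vz = k_vz * err_y              # descend if target sits low in the frame
    return yaw_rate, vz
```

In a real loop these commands would be fed to the drone control interface at the camera frame rate, with the detection/tracking module supplying `bbox_center` each frame.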

Experiment 9: A* Algorithm Path Planning Experiment

📝 Experiment Overview:
Implements A* path planning, extending the traditional 4-neighborhood search to an 8-neighborhood search so that diagonal moves are allowed, yielding shorter and more flexible paths.
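A minimal sketch of the 8-neighborhood variant (pure Python, diagonal steps weighted by √2, no corner-cutting check; the grid convention 0 = free, 1 = obstacle is an assumption for illustration):

```python
import heapq
import itertools
import math

# All 8 neighbors of a cell (the 4-neighborhood would keep only
# the axis-aligned moves).
MOVES = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
         if (dx, dy) != (0, 0)]

def astar(grid, start, goal):
    rows, cols = len(grid), len(grid[0])
    h = lambda p: math.hypot(goal[0] - p[0], goal[1] - p[1])  # Euclidean heuristic
    counter = itertools.count()   # tiebreaker so the heap never compares cells
    open_set = [(h(start), 0.0, next(counter), start, None)]
    came_from = {}
    g_cost = {start: 0.0}
    while open_set:
        _, g, _, cur, parent = heapq.heappop(open_set)
        if cur in came_from:
            continue              # already expanded at lower cost
        came_from[cur] = parent
        if cur == goal:           # reconstruct path back to start
            path = []
            while cur is not None:
                path.append(cur)
                cur = came_from[cur]
            return path[::-1]
        for dx, dy in MOVES:
            nxt = (cur[0] + dx, cur[1] + dy)
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue
            if grid[nxt[0]][nxt[1]] == 1:
                continue
            new_g = g + math.hypot(dx, dy)   # diagonal step costs sqrt(2)
            if new_g < g_cost.get(nxt, float("inf")):
                g_cost[nxt] = new_g
                heapq.heappush(open_set,
                               (new_g + h(nxt), new_g, next(counter), nxt, cur))
    return None                   # no path exists
```

Weighting diagonal steps by √2 keeps the Euclidean heuristic admissible, so the returned path is still optimal under the 8-connected movement model.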

Experiment 10: KCF Ring-Through and Companion Flight Experiment

📝 Experiment Overview:
The drone performs tasks of flying through ring gates and accompanying a ground vehicle, evaluating the integration of the KCF object tracking algorithm, the EGO-Planner path planning algorithm, and the RflySim toolchain.

Experiment 11: EGO-Swarm Visual Swarm Planning

📝 Experiment Overview:
Three drones execute navigation tasks in a forest environment to validate the integration of the EGO-Swarm path planning algorithm with the RflySim toolchain. It introduces visual swarm perception and autonomous obstacle avoidance techniques.

Experiment 12: Real-Time 3D Point Cloud Mapping with Drone-Borne LiDAR

📝 Experiment Overview:
Leveraging LiDAR and odometry data from the RflySim simulation platform, this experiment performs point cloud coordinate transformation and stitching in a ROS environment. A 3D solid map is constructed using the TF transform tree and homogeneous transformation matrices.
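The core stitching step described above, mapping body-frame LiDAR points into the world frame with a homogeneous transform, can be sketched in plain Python (no ROS); the simplified pose format (x, y, z, yaw) is an assumption for illustration, whereas a real system would use the full quaternion pose from odometry:

```python
import math

def pose_to_matrix(x, y, z, yaw):
    """Build a 4x4 homogeneous transform from a planar pose (yaw-only
    rotation is a simplification for illustration)."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c,  -s,  0.0, x],
            [s,   c,  0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]

def transform_points(T, points):
    """Apply T to each body-frame point (px, py, pz), returning
    world-frame coordinates."""
    out = []
    for px, py, pz in points:
        wx = T[0][0]*px + T[0][1]*py + T[0][2]*pz + T[0][3]
        wy = T[1][0]*px + T[1][1]*py + T[1][2]*pz + T[1][3]
        wz = T[2][0]*px + T[2][1]*py + T[2][2]*pz + T[2][3]
        out.append((wx, wy, wz))
    return out

# Each incoming scan is transformed with the current odometry pose
# and appended to the global map.
world_map = []
T = pose_to_matrix(1.0, 2.0, 0.5, math.pi / 2)
world_map.extend(transform_points(T, [(1.0, 0.0, 0.0)]))
```

In the ROS version of the experiment, the TF transform tree supplies exactly this matrix (as a `geometry_msgs/TransformStamped`) instead of it being built by hand.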

Experiment 13: TF Tree Construction for Dual-Drone Systems

📝 Experiment Overview:
This experiment teaches the construction of TF (Transform) trees in ROS. Using the RflySim simulation platform, it establishes coordinate transformation relationships for a dual-drone system, enabling mastery of coordinate frame isolation and management in multi-robot systems.

Experiment 14: ESDF and Voronoi Diagram-Based Drone Path Planning
  • 📦 Version Requirement: Free Edition

    📝 Experiment Overview:
    Construct an ESDF field and Voronoi topological skeleton from a 2D grid map, perform path planning using gradient-based optimization algorithms, and map the trajectory onto the RflySim platform to achieve closed-loop obstacle avoidance for UAVs.
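One way to sketch the ESDF construction step is a multi-source BFS from obstacle cells. This yields a Manhattan-metric approximation of the Euclidean field used in practice; real implementations use exact distance transforms, and the grid convention here (1 = obstacle, 0 = free) is an assumption for illustration:

```python
from collections import deque

def esdf_bfs(grid):
    """Approximate a distance field on a 2D grid by 4-connected BFS
    seeded simultaneously from every obstacle cell."""
    rows, cols = len(grid), len(grid[0])
    dist = [[float("inf")] * cols for _ in range(rows)]
    q = deque()
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1:        # obstacle cells have distance 0
                dist[r][c] = 0.0
                q.append((r, c))
    while q:
        r, c = q.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and dist[nr][nc] > dist[r][c] + 1):
                dist[nr][nc] = dist[r][c] + 1
                q.append((nr, nc))
    return dist
```

Gradient-based trajectory optimization then pushes waypoints along the local gradient of this field (toward larger clearance), while the Voronoi skeleton, the ridge of locally maximal distance, provides the topological search graph.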

Experiment 15: YOLO-based Balloon Detection and Collision Control

📝 Experiment Overview:
Acquire images via the RflySim platform interface, detect balloon positions using the YOLO algorithm, and control the UAV to autonomously track and collide with the balloons.

Experiment 16: Simple YOLO Sensor-based 10-UAV Simulation

📝 Experiment Overview:
Identify 10 UAVs using a simplified YOLO sensor; learn visual sensor configuration methods and the application of YOLO-based object detection in large-scale UAV simulation training.

Experiment 17: A* Path Planning Experiment

📝 Experiment Overview:
Learn to perform 8-neighborhood path planning on LiDAR-based maps using the A* algorithm; master image processing with the cv2 library, feasible region segmentation, and conversion from image coordinate system to NED coordinate system.
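The image-to-NED conversion mentioned above can be sketched as follows; the origin and resolution parameters are hypothetical placeholders for values taken from the actual map configuration:

```python
def pixel_to_ned(u, v, origin_north, origin_east, resolution, img_height):
    """Convert a map-image pixel to NED coordinates.
    u: column index (grows east), v: row index (grows downward),
    resolution: metres per pixel. Image rows grow downward while
    North grows 'up' in the map image, so the row index is flipped
    before scaling. Parameter names are illustrative assumptions."""
    north = origin_north + (img_height - 1 - v) * resolution
    east = origin_east + u * resolution
    return north, east
```

After A* produces a pixel-space path on the segmented feasible region, each waypoint is passed through a conversion like this before being sent as a position setpoint.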

Experiment 18: RflySim LiDAR SLAM Demonstration

📝 Experiment Overview:
Introduce image and IMU data acquisition using the Python Vision API; analyze image acquisition latency; cover UE4 frame rate configuration, DataCheckFreq setting, and latency optimization techniques.

Experiment 19: RflySim Platform SLAM Vision and IMU Data Acquisition and Analysis

📝 Experiment Overview:
Learn to run visual SLAM simulation on the RflySim platform; configure UE4 frame rate and image acquisition frequency; acquire image and IMU data, analyze timestamps, and evaluate image acquisition latency performance for UAV control.

Experiment 20: LiDAR SLAM Vision Image Acquisition and IMU Data Collection Experiment

📝 Experiment Overview:
Learn to run the RflySim LiDAR SLAM vision image acquisition demonstration; modify configurations to adjust UE frame rate and image acquisition frequency; analyze image and IMU timestamps; evaluate image acquisition latency for UAV control applications.

Experiment 21: Simple YOLO Sensor

📝 Experiment Overview:
Detect UAVs using the YOLO algorithm; master visual sensor configuration and object detection techniques for large-scale simulation training.

Experiment 22: Real-time 3D Point Cloud Mapping with UAV LiDAR

📝 Experiment Overview:
Utilize RflySim platform LiDAR and odometry data; in a ROS environment, apply TF transformations and homogeneous transformation matrices to map local point clouds into the world coordinate system, thereby constructing a 3D solid map.

Experiment 23: A* Algorithm Path Planning ROS Experiment

📝 Experiment Overview:
Implement A* algorithm-based path planning in ROS; adapt vendor-provided real-robot code to the RflySim simulation platform, including LiDAR point cloud data processing and MAVROS-based control.

Experiment 24: UAV LiDAR 3D Point Cloud Mapping

📝 Experiment Overview:
Utilize the RflySim simulation platform's LiDAR and odometry data to perform point cloud coordinate transformation and stitching in a ROS environment. Construct a TF transform tree to map local point clouds into the world coordinate frame, and generate a coherent 3D solid map in real time using Rviz. Master techniques of multi-sensor fusion and coordinate transformation.

Experiment 25: YOLOv5 Object Detection REST API Service

📝 Experiment Overview:
Build a REST API using the Flask framework to expose the YOLOv5 object detection model, enabling other services to invoke the PyTorch Hub's YOLOv5s model for image-based object detection inference via HTTP requests.

Experiment 26: YOLOv5 Integration with W&B Visualization Training Tool

📝 Experiment Overview:
Introduce how to integrate Weights & Biases (W&B), a machine learning experiment tracking tool, into YOLOv5 to enable visualization of the training process, metric monitoring, dataset version management, and experiment comparison.

8.5.4 Custom Development Experiments

Located in the 8.RflySimVision\3.CustExps folder, these are custom development experiments designed for advanced users.

Experiment 1: VINS-Fusion Visual SLAM Mapping

📝 Experiment Overview:
Configure and run the VINS-Fusion visual SLAM algorithm in a Linux environment, performing autonomous localization and mapping using visual and IMU data streamed from the RflySim simulation platform.

Experiment 2: Integration and Validation of ORB-SLAM3 with RflySim

📝 Experiment Overview:
Learn to integrate ORB-SLAM3 as a ROS node with RflySim data streams (images and IMU), build and run the node, and collect trajectory and pose data.

Experiment 3: UAV Control Using Behavior Trees

📝 Experiment Overview:
Implement UAV takeoff, point-to-point flight, and landing using behavior trees. Design a custom hover node to enable stable hovering at the target point for 10 seconds.
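A minimal, ROS-free sketch of such a tree, including a custom hover node that stays RUNNING for a fixed number of ticks, might look like this (the action names, stubbed print-based actions, and tick-based timing are illustrative assumptions):

```python
# Tick-driven behavior-tree sketch: a Sequence node runs children in
# order; a child returning RUNNING suspends the sequence until the
# next tick. Drone API calls are stubbed as prints.
SUCCESS, RUNNING = "SUCCESS", "RUNNING"

class Sequence:
    def __init__(self, children):
        self.children, self.idx = children, 0
    def tick(self):
        while self.idx < len(self.children):
            status = self.children[self.idx].tick()
            if status == RUNNING:
                return RUNNING       # resume here on the next tick
            self.idx += 1
        return SUCCESS

class Action:
    def __init__(self, name):
        self.name = name
    def tick(self):
        print(f"executing {self.name}")   # stand-in for a drone command
        return SUCCESS

class Hover:
    """Custom node: remain RUNNING for `ticks` ticks (e.g. 10 ticks at
    1 Hz for a 10 s hover), then succeed."""
    def __init__(self, ticks):
        self.remaining = ticks
    def tick(self):
        if self.remaining > 0:
            self.remaining -= 1
            return RUNNING
        return SUCCESS

tree = Sequence([Action("takeoff"), Action("goto_target"),
                 Hover(ticks=10), Action("land")])
status = RUNNING
while status == RUNNING:   # in practice, ticked at a fixed control rate
    status = tree.tick()
```

In the actual experiment each `Action` would issue MAVROS/offboard commands and report RUNNING until the maneuver completes; the structure of the tick loop is unchanged.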

Experiment 4: Cooperative Reconnaissance and Strike by Dual UAVs

📝 Experiment Overview:
Simulate a cooperative dual-UAV combat scenario: UAV #2 performs reconnaissance to locate a red balloon, transmits the target information to UAV #1, and UAV #1 takes off and strikes the target based on the received data. This validates the system's capabilities in information sharing, target recognition, and precise strike.

Experiment 5: LLM-Based Behavior Tree UAV Control

📝 Experiment Overview:
Load the Qwen3-0.6b model via Ollama, parse natural language commands into behavior trees and ROS control instructions to achieve semantic-to-action decision-making for UAVs, and execute maneuvering tasks using PX4 SITL.
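The parsing stage can be illustrated with a toy stand-in for the model: a keyword lookup that maps a command string onto an ordered action list. In the real experiment this mapping is produced by Qwen3 via Ollama; all keywords and action names below are hypothetical:

```python
# Hypothetical command-to-action mapping standing in for the LLM.
ACTION_KEYWORDS = {
    "take off": "takeoff",
    "cross": "fly_through_frame",
    "scan": "read_qr_code",
    "land": "land",
}

def parse_command(text):
    """Return the actions mentioned in `text`, ordered by where they
    appear, ready to be assembled into a behavior-tree sequence."""
    text = text.lower()
    found = [(text.index(kw), action)
             for kw, action in ACTION_KEYWORDS.items() if kw in text]
    return [action for _, action in sorted(found)]
```

The resulting ordered list is then turned into a Sequence of behavior-tree leaf nodes, each of which emits the corresponding ROS control instructions.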

Experiment 6: Audio-Visual Sensor Fusion Experiment
  • 📦 Version Requirement: Full Version

    📝 Experiment Overview:
    Achieves sound source localization using the Interaural Time Difference (ITD) binaural principle, integrates with visual detection for target recognition and tracking, and trains users in multi-sensor fusion-based drone control methods.
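The ITD bearing estimate rests on a single relation: a plane wave arriving at angle θ relative to the forward axis reaches two microphones spaced d metres apart with delay Δt = d·sin(θ)/c, so θ = arcsin(c·Δt/d). A minimal sketch (the 343 m/s speed of sound assumes air at roughly 20 °C):

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, air at ~20 C (an assumed constant)

def itd_bearing(delta_t, mic_spacing):
    """Estimate the bearing angle (radians) of a sound source from the
    measured inter-microphone time delay."""
    s = SPEED_OF_SOUND * delta_t / mic_spacing
    s = max(-1.0, min(1.0, s))   # clamp against noise-induced overshoot
    return math.asin(s)
```

The fused pipeline then uses this coarse acoustic bearing to steer the camera toward the source, after which visual detection refines the target's position for tracking.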

Experiment 7: Air-Ground Cooperative Track SITL Simulation

📝 Experiment Overview:
A software-in-the-loop (SITL) simulation platform tailored for the Air-Ground Cooperative Track of the 28th China Robot & Artificial Intelligence Competition. It encompasses the complete competition workflow simulation for both drones and ground vehicles:
- Drone: Takeoff → Detect square → Fly through → Detect QR code → Pop balloon
- Ground Vehicle: Detect gate → Pass through → Circumnavigate course → Return to start
Designed for algorithm validation and competition training.