
ScreenCapApiV4 Interface Documentation

Introduction

Overview: This module provides a Python interface for multi-window capture and manipulation on the Windows platform. It supports retrieving window handles, extracting window content into OpenCV-format images, and moving specified windows. It meets the requirements for screen content acquisition in multi-window scenarios.

In RflySim drone simulation tasks, simulation visualization windows and various task debugging windows often coexist. Many vision-based autonomous drone tasks need to capture real-time video frames directly from the simulation windows rather than acquire them through camera sensor channels. This module adapts to the Windows windowing mechanism: it supports enumerating and capturing multiple target windows at once and outputs images in an OpenCV-compatible format, so developers can feed them directly into vision detection and recognition algorithms. In RflySim computer vision workflows it is commonly used for screen capture in multi-window simulations, for adjusting the window layout of simulation demos, and for similar tasks.

Quick Start

A minimal working example: copy it and adjust only the target window title to run.

from RflySimSDK.vision.ScreenCapApiV4 import getWndHandls, getHwndInfo, getCVImg, clearHWND
import cv2
import win32gui

# 1. Enumerate all top-level windows and locate the FlightGear simulator window (its title typically contains "FlightGear")
flight_gear_hwnd = None
for hwnd in getWndHandls():
    if "FlightGear" in win32gui.GetWindowText(hwnd):
        flight_gear_hwnd = hwnd
        break

if flight_gear_hwnd is None:
    raise RuntimeError("FlightGear simulator window not found. Please start RflySim simulation first.")

# 2. Collect the window information needed for capture; position and size are retrieved from the handle
w_info = getHwndInfo(flight_gear_hwnd)

# 3. Continuously capture and display frames
try:
    while True:
        # Retrieve the current frame as an OpenCV-format BGR image
        frame = getCVImg(w_info)
        if frame is None:
            break

        # Display the captured frame (frame.shape[:2] holds its height and width)
        cv2.imshow("FlightGear Screen Capture", frame)

        # Press 'q' to exit
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
finally:
    # Release the GDI resources held for the window, then close the display window
    clearHWND(w_info)
    cv2.destroyAllWindows()

Environment & Dependencies

  • Python Environment: >= 3.8.10
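  • Dependencies: ctypes, cv2, d3dshot, numpy, sys, win32con, win32gui, win32ui (ctypes and sys are part of the Python standard library; cv2 is provided by the opencv-python package; win32con, win32gui, and win32ui are provided by the pywin32 package)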
  • Prerequisites: Before calling this interface, ensure the system supports screen capture functionality and that the RflySimSDK.vision module has been correctly imported.

Core Interface Description

The module ScreenCapApiV4.py includes configuration variables, helper functions, and core business classes.

Global Constants and Enumerations

This section lists all globally accessible constants and enumeration definitions directly referenceable within the module.

Standalone Constants

None


Global/Standalone Functions

window_enumeration_handler(hwnd, window_hwnds)

Function Description: A callback function for window enumeration, used to collect handles of all top-level windows during Windows window enumeration.

Parameters:

  • hwnd: Handle of the window currently being enumerated
  • window_hwnds: List container used to store collected window handles

Return Value:

  • None (returning a non-zero value indicates continuing enumeration)

Exceptions: None


getWndHandls()

Function Description: Enumerates all top-level windows currently present in the system and collects their window handles.

Parameters: None

Return Value:

  • list[int]: A list of handles for all top-level windows in the system

Exceptions: None
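
Because getWndHandls() returns bare handles, picking out one simulator window usually needs a title lookup on top of it. The sketch below shows one possible way to do that. `find_windows_by_title` and its `get_title` parameter are hypothetical names for illustration, not part of ScreenCapApiV4, and the window titles are stand-in data. On Windows, `win32gui.GetWindowText` would be a natural choice for `get_title`.

```python
from typing import Callable, Iterable, List, Optional


def find_windows_by_title(
    handles: Iterable[int],
    get_title: Callable[[int], Optional[str]],
    keyword: str,
    exact: bool = False,
) -> List[int]:
    """Return the handles whose title matches `keyword`.

    Hypothetical helper: `get_title` maps a handle to its title text.
    With exact=False a substring match is used; with exact=True the
    whole title must equal `keyword`.
    """
    matches = []
    for hwnd in handles:
        title = get_title(hwnd) or ""
        is_match = (title == keyword) if exact else (keyword in title)
        if is_match:
            matches.append(hwnd)
    return matches


# Stand-in data so the sketch runs anywhere; real handles would come
# from getWndHandls() and real titles from the windowing system.
fake_titles = {101: "RflySim3D-0", 102: "RflySim3D-1", 103: "QGroundControl"}
rfly_windows = find_windows_by_title(fake_titles, fake_titles.get, "RflySim3D")
exact_window = find_windows_by_title(fake_titles, fake_titles.get, "RflySim3D-1", exact=True)
```

Returning every match, rather than only the first, lets the caller decide what to do when several simulation windows share part of a title.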


getHwndInfo(hWnd)

Function Description: Retrieves basic information (e.g., position and size) of a specified window based on its handle, for subsequent window screenshot operations.

Parameters:

  • hWnd: Handle of the target window

Return Value:

  • dict: A dictionary containing window handle, position coordinates, and size dimensions

Exceptions: None


getCVImg(wInfo)

Function Description: Captures a screenshot of the specified window and converts it into an OpenCV-compatible BGR image format.

Parameters:

  • wInfo: Window information dictionary containing window handle, position, and size, obtained via getHwndInfo

Return Value:

  • numpy.ndarray: BGR screenshot array in OpenCV format

Exceptions: None


getCVImgList(wInfoList)

Function Description: Performs batch screenshot capture of multiple windows and converts them into OpenCV format images.

Parameters:

  • wInfoList: A list of window information dictionaries, each representing a single window

Return Value:

  • list[numpy.ndarray]: A list of OpenCV-format screenshots corresponding to each window

Exceptions: None
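
Since the returned list corresponds position by position to the input windows, it can be convenient to re-associate each frame with its window handle before further processing. The following is a minimal sketch of that bookkeeping. `frames_by_handle` is a hypothetical helper, not part of ScreenCapApiV4, and it assumes that a failed capture shows up as `None` in the list.

```python
from typing import Any, Dict, Sequence


def frames_by_handle(handles: Sequence[int], frames: Sequence[Any]) -> Dict[int, Any]:
    """Map each handle to its frame, dropping entries whose frame is None.

    Hypothetical helper: `frames` is expected to be in the same order as
    `handles`, as a batch capture over the matching window list would give.
    """
    if len(handles) != len(frames):
        raise ValueError("handles and frames must have the same length")
    return {hwnd: frame for hwnd, frame in zip(handles, frames) if frame is not None}


# Stand-in strings take the place of image arrays so the sketch runs anywhere.
result = frames_by_handle([101, 102, 103], ["frame-a", None, "frame-c"])
```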


moveWd(hwd, x=0, y=0, topMost=False)

Function Description: Moves the specified window to a given screen coordinate position, optionally setting it to always stay on top.

Parameters:

  • hwd: Handle of the target window
  • x: Target X-coordinate of the window’s top-left corner on the screen (default: 0)
  • y: Target Y-coordinate of the window’s top-left corner on the screen (default: 0)
  • topMost: Whether to set the window as always-on-top (default: False, i.e., not always on top)

Return Value:

  • None

Exceptions: None
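
For demo layouts with several simulation windows, the target coordinates for moveWd can be computed up front. The sketch below lays windows out row by row in a grid. `tile_positions` is a hypothetical helper, and the 640x480 cell size is an arbitrary assumption rather than anything the module prescribes.

```python
from typing import List, Tuple


def tile_positions(count: int, cell_width: int, cell_height: int,
                   columns: int = 2) -> List[Tuple[int, int]]:
    """Top-left (x, y) for `count` windows placed row by row in a grid."""
    if columns < 1:
        raise ValueError("columns must be at least 1")
    return [
        ((index % columns) * cell_width, (index // columns) * cell_height)
        for index in range(count)
    ]


# Three windows in a two-column grid of 640x480 cells.
positions = tile_positions(3, 640, 480)
# Each (x, y) pair would then be handed on, e.g. moveWd(hwd, x, y).
```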


clearHWND(wInfo)

Function Description: Releases GDI resources occupied during window screenshot operations to prevent resource leaks.

Parameters:

  • wInfo: Window information dictionary that has completed screenshot operations and contains GDI resource handles requiring release

Return Value:

  • None

Exceptions: None
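
One way to make sure the release step always runs, even when capture code raises, is to wrap acquisition and release in a context manager. This is a sketch under stated assumptions: `window_capture` is a hypothetical wrapper, and the `acquire` and `release` callables are injected so the example runs without a real window. On Windows they would presumably be getHwndInfo and clearHWND.

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterator


@contextmanager
def window_capture(hwnd: int,
                   acquire: Callable[[int], Any],
                   release: Callable[[Any], None]) -> Iterator[Any]:
    """Yield the window info from `acquire`; always pass it to `release` on exit."""
    w_info = acquire(hwnd)
    try:
        yield w_info
    finally:
        # Runs on normal exit and when the body raises.
        release(w_info)


# Stand-ins so the sketch runs without a real window.
released = []
with window_capture(42, acquire=lambda h: {"hWnd": h}, release=released.append) as info:
    captured_handle = info["hWnd"]
```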


WinInfo Class

Stores resources and dimensional information related to Windows window screenshot operations, providing foundational data structures for screen capture functionality.

__init__(hWnd, width, height, saveDC, saveBitMap, mfcDC, hWndDC)

Function Description: Initializes a window information object, storing various resource handles and dimensional parameters required for window screenshot operations.

Parameters (Args):

| Parameter Name | Type | Required | Default | Description |
|---|---|---|---|---|
| hWnd | int | Yes | - | Handle of the target window |
| width | int | Yes | - | Width of the capture area (in pixels) |
| height | int | Yes | - | Height of the capture area (in pixels) |
| saveDC | int | Yes | - | Handle of the compatible device context |
| saveBitMap | int | Yes | - | Handle of the bitmap object |
| mfcDC | int | Yes | - | Handle of the MFC device context |
| hWndDC | int | Yes | - | Handle of the target window's device context |

Return Value (Returns):

  • WinInfo instance object

Exceptions (Raises):

  • None

Advanced Usage Example

Demonstrates complex composite scenarios (e.g., multi-class collaboration, asynchronous control, batch operations)

This example combines multi-window information retrieval with asynchronous batch screenshot processing: frames from several drone simulation views are acquired together and saved asynchronously, which suits bulk generation of visual datasets for multi-agent cooperation:

```python
import asyncio
import os

import cv2
import win32gui
from RflySimSDK.vision.ScreenCapApiV4 import clearHWND, getCVImg, getHwndInfo, getWndHandls


async def capture_single_window(w_info, save_path):
    # Capture one frame from the target window, then save it in a worker thread
    frame = getCVImg(w_info)
    if frame is None:
        return f"Failed to capture window for {save_path}"
    # run_in_executor is used because asyncio.to_thread requires Python 3.9+
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, cv2.imwrite, save_path, frame)
    return f"Screenshot saved to {save_path}"


async def batch_capture_multiple_drones(window_name_list, save_dir):
    os.makedirs(save_dir, exist_ok=True)
    # Collect information for every window whose title is in the target list
    hwnds = [h for h in getWndHandls() if win32gui.GetWindowText(h) in window_name_list]
    w_info_list = [getHwndInfo(h) for h in hwnds]
    try:
        # Run the multi-window screenshot tasks concurrently
        tasks = [
            capture_single_window(w_info, f"{save_dir}/window_{i}.png")
            for i, w_info in enumerate(w_info_list)
        ]
        for res in await asyncio.gather(*tasks):
            print(res)
    finally:
        # Release the GDI resources held for each window
        for w_info in w_info_list:
            clearHWND(w_info)


# Start batch asynchronous screenshot tasks
if __name__ == "__main__":
    target_windows = ["RflySim Simulation 1", "RflySim Simulation 2", "RflySim Simulation 3"]
    asyncio.run(batch_capture_multiple_drones(target_windows, "./drone_vision_data"))
```

Notes and Pitfall Avoidance Guide

  • Window Validity Validation: Before calling any screenshot function, verify that the target window still exists and is capturable, for example with win32gui.IsWindow(hwnd) and win32gui.IsIconic(hwnd). If the simulation window has been closed or minimized, skipping this check can cause the capture call to fail.
  • Multiple Window Instance Conflicts: Do not instantiate multiple WinInfo objects for the same simulation window. Repeated instantiation will cause the window device context resources to be redundantly occupied, ultimately leading to screenshot lag or even program crashes.
  • Asynchronous Task Resource Limits: When performing batch asynchronous screenshots, the number of windows processed simultaneously should not exceed 8. Excessive concurrent tasks will consume substantial GPU memory and CPU resources, resulting in reduced simulation frame rates or dropped screenshot frames.
  • Window Title Matching Rules: When a window is selected by a partial title match, the first window whose title matches is used. If several simulation windows have similar titles, match on the full window title to avoid selecting the wrong one.
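
The concurrency limit mentioned in the notes can be enforced with a semaphore instead of being left to the caller. The following is a minimal sketch: `run_limited` is a hypothetical helper, and `fake_capture` is a stand-in for a real capture-and-save coroutine so the example runs on any platform.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def run_limited(jobs: Sequence[Callable[[], Awaitable[str]]],
                      limit: int = 8) -> List[str]:
    """Run the async jobs with at most `limit` of them in flight at once."""
    gate = asyncio.Semaphore(limit)

    async def guarded(job: Callable[[], Awaitable[str]]) -> str:
        async with gate:
            return await job()

    # gather preserves the order of the submitted jobs in its result list.
    return await asyncio.gather(*(guarded(job) for job in jobs))


# Stand-in job that records how many captures run at the same time.
state = {"active": 0, "peak": 0}


async def fake_capture(index: int) -> str:
    state["active"] += 1
    state["peak"] = max(state["peak"], state["active"])
    await asyncio.sleep(0.01)  # placeholder for capture + save
    state["active"] -= 1
    return f"window-{index}"


jobs = [lambda i=i: fake_capture(i) for i in range(20)]
results = asyncio.run(run_limited(jobs, limit=8))
```

Passing zero-argument callables rather than ready-made coroutines means no job starts until the semaphore admits it.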

Changelog

  • 2024-08-05: fix: Added HTML version of API documentation
  • 2024-07-17: fix: Updated VisionCaptureApi interface
  • 2023-10-23: feat: Added all Python common labs