ScreenCapApiV4 Interface Documentation¶
Introduction¶
Overview: This module provides a Python interface for multi-window capture and manipulation on the Windows platform. It supports retrieving window handles, extracting window content into OpenCV-format images, and moving specified windows. It meets the requirements for screen content acquisition in multi-window scenarios.
In RflySim drone simulation tasks, simulation visualization windows and various task debugging windows often coexist. Many vision-based autonomous drone tasks need to capture real-time frames directly from simulation windows rather than acquire them through camera sensor channels. This module adapts to the Windows windowing mechanism, supports enumerating and capturing multiple target windows at once, and outputs images in an OpenCV-compatible format so that developers can feed them directly into vision detection and recognition algorithms. It is commonly used in RflySim platform computer vision workflows for screen capture in multi-window simulations, simulation demo layout adjustments, and similar tasks.
Quick Start¶
A minimal working example: copy it and change only the minimal configuration (such as the window title) to run it.
```python
import cv2
import win32gui
from RflySimSDK.vision.ScreenCapApiV4 import getWndHandls, getHwndInfo, getCVImg, clearHWND

# 1. Enumerate top-level windows and locate the simulator window by its title
#    (here a title containing "FlightGear"; adjust it to match your window)
flight_gear_hwnd = None
for hwnd in getWndHandls():
    if "FlightGear" in win32gui.GetWindowText(hwnd):
        flight_gear_hwnd = hwnd
        break

if flight_gear_hwnd is None:
    raise RuntimeError("FlightGear simulator window not found. Please start the RflySim simulation first.")

# 2. Retrieve the window information (handle, position, size) used for capture
w_info = getHwndInfo(flight_gear_hwnd)

# 3. Retrieve and print the window dimensions
left, top, right, bottom = win32gui.GetWindowRect(flight_gear_hwnd)
print(f"Simulator window size: width {right - left}, height {bottom - top}")

# 4. Continuously capture and display frames
try:
    while True:
        # Capture the current frame as an OpenCV-format BGR image
        frame = getCVImg(w_info)
        if frame is None:
            break
        # Display the captured frame
        cv2.imshow("FlightGear Screen Capture", frame)
        # Press 'q' to exit
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
finally:
    # Release GDI resources and close the display window
    clearHWND(w_info)
    cv2.destroyAllWindows()
```
Environment & Dependencies¶
- Python Environment: >= 3.8.10
- Dependencies: ctypes, cv2, d3dshot, numpy, sys, win32con, win32gui, win32ui
- Prerequisites: Before calling this interface, ensure the system supports screen capture functionality and that the RflySimSDK.vision module has been correctly imported.
Core Interface Description¶
The module ScreenCapApiV4.py includes configuration variables, helper functions, and core business classes.
Global Constants and Enumerations¶
This section lists all globally accessible constants and enumeration definitions directly referenceable within the module.
Standalone Constants¶
None
Global/Standalone Functions¶
window_enumeration_handler(hwnd, window_hwnds)¶
Function Description: A callback function for window enumeration, used to collect the handles of all top-level windows during Windows window enumeration.
Parameters:
- hwnd: Handle of the window currently being enumerated
- window_hwnds: List container used to store the collected window handles
Return Value:
- None. (For Windows enumeration callbacks, a non-zero return value means that enumeration continues.)
Exceptions: None
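The callback-plus-container pattern described above can be sketched without any Windows dependency. In this sketch, `fake_enum_windows` is a hypothetical stand-in for an enumeration routine such as `win32gui.EnumWindows`, and `collect_handle` plays the role of `window_enumeration_handler`; neither is the module's actual code, and the handle values are made up.

```python
def fake_enum_windows(callback, extra):
    """Hypothetical stand-in for a system window-enumeration routine.

    Calls callback(hwnd, extra) once per window handle, passing the
    caller-supplied `extra` object through to the callback unchanged.
    """
    for hwnd in (1001, 1002, 1003):  # made-up handle values
        callback(hwnd, extra)


def collect_handle(hwnd, window_hwnds):
    """Append each enumerated handle to the caller-supplied list."""
    window_hwnds.append(hwnd)


handles = []
fake_enum_windows(collect_handle, handles)
print(handles)
```

Because the list is passed in by the caller, the callback itself needs no return value and keeps no state of its own.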
getWndHandls()¶
Function Description: Enumerates all top-level windows currently present in the system and collects their window handles.
Parameters:
None
Return Value:
list[int]: A list of handles for all top-level windows in the system
Exceptions: None
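Since the returned list contains handles only, locating a particular window usually means looking up each handle's title. The sketch below shows one way to do that; `find_by_title` and the sample titles are hypothetical, and the title lookup is passed in as a function so the example runs anywhere.

```python
def find_by_title(handles, get_title, keyword):
    """Return the handles whose title contains `keyword`, in enumeration order."""
    return [h for h in handles if keyword in get_title(h)]


# Made-up handles and titles used only for this illustration
titles = {101: "RflySim Simulation 1", 102: "Notepad", 103: "RflySim Simulation 2"}

matches = find_by_title(titles.keys(), titles.get, "RflySim")
print(matches)
```

On Windows, `win32gui.GetWindowText` could be passed as `get_title`, with the result of `getWndHandls()` as `handles`.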
getHwndInfo(hWnd)¶
Function Description: Retrieves basic information (e.g., position and size) of a specified window based on its handle, for subsequent window screenshot operations.
Parameters:
- hWnd: Handle of the target window
Return Value:
dict: A dictionary containing window handle, position coordinates, and size dimensions
Exceptions: None
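As a rough illustration of how position and size relate, Windows reports a window rectangle as `(left, top, right, bottom)` (the form returned by `win32gui.GetWindowRect`), and width and height follow by subtraction. The dictionary keys below are hypothetical; the actual keys returned by `getHwndInfo` are not specified in this document.

```python
def rect_to_info(hwnd, rect):
    """Build an illustrative info dictionary from a (left, top, right, bottom) rectangle."""
    left, top, right, bottom = rect
    return {
        "hWnd": hwnd,            # hypothetical key names, for illustration only
        "x": left,
        "y": top,
        "width": right - left,
        "height": bottom - top,
    }


info = rect_to_info(1001, (100, 50, 1380, 770))
print(info["width"], info["height"])
```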
getCVImg(wInfo)¶
Function Description: Captures a screenshot of the specified window and converts it into an OpenCV-compatible BGR image format.
Parameters:
- wInfo: Window information dictionary containing the window handle, position, and size, obtained via getHwndInfo
Return Value:
numpy.ndarray: BGR screenshot array in OpenCV format
Exceptions: None
getCVImgList(wInfoList)¶
Function Description: Captures screenshots of multiple windows in a batch and converts them into OpenCV-format images.
Parameters:
- wInfoList: A list of window information dictionaries, each representing a single window
Return Value:
list[numpy.ndarray]: A list of OpenCV-format screenshots corresponding to each window
Exceptions: None
moveWd(hwd, x=0, y=0, topMost=False)¶
Function Description: Moves the specified window to a given screen coordinate position, optionally setting it to always stay on top.
Parameters:
- hwd: Handle of the target window
- x: Target X-coordinate of the window's top-left corner on the screen (default: 0)
- y: Target Y-coordinate of the window's top-left corner on the screen (default: 0)
- topMost: Whether to set the window as always-on-top (default: False, i.e., not always on top)
Return Value:
- None
Exceptions: None
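For demo layout adjustments, the target coordinates for several windows are often computed as a grid. The sketch below computes the top-left corner of each cell; `grid_positions` is a hypothetical helper, and the cell size is an arbitrary example value.

```python
def grid_positions(count, columns, cell_width, cell_height):
    """Return the top-left (x, y) of each cell for `count` windows, filled row by row."""
    return [
        ((i % columns) * cell_width, (i // columns) * cell_height)
        for i in range(count)
    ]


positions = grid_positions(4, 2, 960, 540)
print(positions)  # → [(0, 0), (960, 0), (0, 540), (960, 540)]
```

Each `(x, y)` pair could then be passed to `moveWd(hwd, x, y)` for the corresponding window handle.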
clearHWND(wInfo)¶
Function Description: Releases GDI resources occupied during window screenshot operations to prevent resource leaks.
Parameters:
- wInfo: Window information dictionary that has completed screenshot operations and contains GDI resource handles requiring release
Return Value:
- None
Exceptions: None
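One way to make sure the release always happens is to pair acquisition and release with `try`/`finally`. The sketch below takes the three operations as arguments so that it runs without Windows; `capture_once` and the stand-in lambdas are hypothetical and only exercise the pattern.

```python
def capture_once(hwnd, get_info, get_img, clear):
    """Capture one frame and always release the window resources afterwards."""
    w_info = get_info(hwnd)
    try:
        return get_img(w_info)
    finally:
        clear(w_info)  # runs even if get_img raises


# Stand-ins used only to exercise the pattern
released = []
frame = capture_once(
    1001,
    get_info=lambda hwnd: {"hWnd": hwnd},
    get_img=lambda w_info: "frame-for-%d" % w_info["hWnd"],
    clear=released.append,
)
print(frame, released)
```

With this module, the call would presumably look like `capture_once(hwnd, getHwndInfo, getCVImg, clearHWND)`. For continuous capture it is cheaper to keep the window information alive across frames and release it once at the end.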
WinInfo Class¶
Stores resources and dimensional information related to Windows window screenshot operations, providing foundational data structures for screen capture functionality.
__init__(hWnd, width, height, saveDC, saveBitMap, mfcDC, hWndDC)¶
Function Description: Initializes a window information object, storing various resource handles and dimensional parameters required for window screenshot operations.
Parameters (Args):
| Parameter Name | Type | Required | Default | Description |
|---|---|---|---|---|
| hWnd | int | Yes | - | Handle of the target window |
| width | int | Yes | - | Width of the capture area (in pixels) |
| height | int | Yes | - | Height of the capture area (in pixels) |
| saveDC | int | Yes | - | Handle of the compatible device context |
| saveBitMap | int | Yes | - | Handle of the bitmap object |
| mfcDC | int | Yes | - | Handle of the MFC device context |
| hWndDC | int | Yes | - | Handle of the target window's device context |
Return Value (Returns):
WinInfo instance object
Exceptions (Raises):
- None
Advanced Usage Example¶
Demonstrates complex composite scenarios (e.g., multi-class collaboration, asynchronous control, batch operations)
This example combines multi-window information retrieval with asynchronous batch screenshot processing: frames from several drone simulation views are captured together and saved asynchronously, which suits bulk generation of multi-agent cooperative vision datasets:
```python
import asyncio
import os

import cv2
import win32gui
from RflySimSDK.vision.ScreenCapApiV4 import getWndHandls, getHwndInfo, getCVImg, clearHWND


async def capture_single_window(title, w_info, save_path):
    # Capture the target window, then write the image in a worker thread.
    # run_in_executor works on Python 3.8, unlike asyncio.to_thread (3.9+).
    frame = getCVImg(w_info)
    if frame is not None:
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, cv2.imwrite, save_path, frame)
        return f"Screenshot saved to {save_path}"
    return f"Failed to capture window {title}"


async def batch_capture_multiple_drones(window_name_list, save_dir):
    os.makedirs(save_dir, exist_ok=True)
    # Collect window information for every window whose title is in the target list
    targets = []
    for hwnd in getWndHandls():
        title = win32gui.GetWindowText(hwnd)
        if title in window_name_list:
            targets.append((title, getHwndInfo(hwnd)))
    try:
        # Run the screenshot tasks for all windows concurrently
        tasks = [capture_single_window(t, wi, f"{save_dir}/{t}.png") for t, wi in targets]
        for res in await asyncio.gather(*tasks):
            print(res)
    finally:
        # Release the GDI resources held for each window
        for _, wi in targets:
            clearHWND(wi)


# Start batch asynchronous screenshot tasks
if __name__ == "__main__":
    target_windows = ["RflySim Simulation 1", "RflySim Simulation 2", "RflySim Simulation 3"]
    asyncio.run(batch_capture_multiple_drones(target_windows, "./drone_vision_data"))
```
Notes and Pitfall Avoidance Guide¶
- Window Validity Validation: Before calling any screenshot function, verify that the target window still exists and can be captured. If the simulation window has been closed or minimized, capturing without this check can cause the program to raise an exception.
- Multiple Window Instance Conflicts: Do not create multiple WinInfo objects for the same simulation window. Each one holds device context resources for that window, so duplicates occupy those resources redundantly and can lead to screenshot lag or even program crashes.
- Asynchronous Task Resource Limits: When performing batch asynchronous screenshots, the number of windows processed simultaneously should not exceed 8. Excessive concurrent tasks consume substantial GPU memory and CPU resources, resulting in reduced simulation frame rates or dropped screenshot frames.
- Window Title Matching Rules: When a window is selected by a partial (fuzzy) name, the first window whose title matches is used. If several simulation windows share similar names, use the full window title to avoid matching the wrong window.
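The limit of 8 simultaneous windows mentioned above can be enforced with an `asyncio.Semaphore`. The following is a minimal sketch under that assumption; `run_limited` and `fake_capture` are hypothetical names, and the stand-in job only sleeps where real capture and save work would go.

```python
import asyncio

MAX_CONCURRENT = 8  # limit taken from the note above


async def run_limited(jobs, limit=MAX_CONCURRENT):
    """Run zero-argument coroutine functions with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job):
        async with semaphore:
            return await job()

    return await asyncio.gather(*(guarded(job) for job in jobs))


# Demonstration with a stand-in job that records peak concurrency
active = 0
peak = 0


async def fake_capture():
    global active, peak
    active += 1
    peak = max(peak, active)
    await asyncio.sleep(0.01)  # placeholder for capture and save work
    active -= 1
    return "ok"


results = asyncio.run(run_limited([fake_capture] * 20))
print(len(results), peak)
```

All 20 jobs complete, but no more than `MAX_CONCURRENT` of them run at the same time.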
Changelog¶
- 2024-08-05: fix: Added HTML version of API documentation
- 2024-07-17: fix: Updated VisionCaptureApi interface
- 2023-10-23: feat: Added all Python common labs