3.3.5. YOLO World
YOLO World uses a "prompt-then-detect" paradigm to make inference over a user-defined vocabulary efficient. The vocabulary is encoded into embeddings ahead of time, and those embeddings are reparameterized as parameters within the model, so the detection step itself runs faster.
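The idea of encoding the vocabulary once and reusing it during detection can be illustrated with a small toy sketch. It is not the YOLO World implementation: toy_text_embedding is a hypothetical stand-in for a real text encoder, and PromptThenDetect and classify are names made up for this example. It only shows that, once the vocabulary is stored as fixed weights, the detection step needs nothing more than dot products.

```python
import hashlib
import math


def toy_text_embedding(word, dim=8):
    """Hypothetical stand-in for a text encoder: a deterministic unit vector."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    vec = [b / 255.0 - 0.5 for b in digest[:dim]]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


class PromptThenDetect:
    """Toy illustration: the vocabulary is encoded once, then reused."""

    def __init__(self, vocabulary):
        # "Prompt" step: run the (toy) text encoder once per word and
        # keep the results as fixed weights.
        self.vocabulary = list(vocabulary)
        self.weights = [toy_text_embedding(w) for w in self.vocabulary]

    def classify(self, region_feature):
        # "Detect" step: no text encoder call, only dot products
        # between the region feature and the stored weights.
        scores = [
            sum(a * b for a, b in zip(weight, region_feature))
            for weight in self.weights
        ]
        best = max(range(len(scores)), key=scores.__getitem__)
        return self.vocabulary[best], scores[best]


detector = PromptThenDetect(["window", "door"])
# Assumption for the demo: the image region's feature happens to match
# the embedding of "window" exactly.
label, score = detector.classify(toy_text_embedding("window"))
print(label, round(score, 3))
```

In a real model the region feature would come from the image backbone, and the similarity scores would be combined with box predictions; the sketch leaves both out.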
After adding YOLO World as an Extension node, click the plus sign to place it in the workflow editing area. The node's Image input must be connected to image data captured by an upstream Media node. Its things input must be an Array, either passed in from an upstream node or defined as a custom Array parameter. Once both inputs are supplied, the node outputs an Array result.

Result Testing
When the things input contains the detection object "window" and the input image is a photo of a window, the output is
The YOLO World node reports whether each object listed in the things array is present in the image, along with the position and size of each detected object. The things array can hold multiple objects, which are then detected simultaneously.
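A downstream step might group the node's result by requested object, as in the sketch below. The exact structure of the output Array is not specified here, so the field names label, x, y, width and height are assumptions made for illustration, as is the helper name summarize_detections. The sample values are invented and are not real node output.

```python
def summarize_detections(things, detections):
    """Group detected boxes by requested object name.

    Assumes each detection is a dict with the hypothetical keys
    "label", "x", "y", "width" and "height".
    """
    found = {name: [] for name in things}
    for det in detections:
        label = det.get("label")
        if label in found:
            box = (det["x"], det["y"], det["width"], det["height"])
            found[label].append(box)
    return found


# Invented sample data, for illustration only.
things = ["window", "door"]
detections = [
    {"label": "window", "x": 40, "y": 25, "width": 200, "height": 150},
]

summary = summarize_detections(things, detections)
for name in things:
    status = "present" if summary[name] else "absent"
    print(name, status, summary[name])
```

Objects that were requested but not detected keep an empty list, so a later node can tell "absent" apart from "not requested".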