HorusEye investigates whether natural language can serve as a dynamic attention mechanism for Vision-Language Models under degraded emergency conditions.
-
Updated
Jun 16, 2026 - Python
HorusEye investigates whether natural language can serve as a dynamic attention mechanism for Vision-Language Models under degraded emergency conditions.
Referring Expression Comprehension: Grounding natural language in visual scenes using Words as Classifiers approach
A modular multimodal framework that generates object masks from text prompts. It uses a lightweight cross-modal decoder to fuse features from interchangeable vision and language backbones.
Add a description, image, and links to the refcoco topic page so that developers can more easily learn about it.
To associate your repository with the refcoco topic, visit your repo's landing page and select "manage topics."