VILA: NVIDIA's Open-Source Vision Language Model Family from NVlabs
Vision Language Models (VLMs) that can reason about both images and text have become one of the most active areas in AI research. VILA (Visual …