nvidia/llama-nemoretriever-colembed-3b-v1 Visual Document Retrieval • 4B • Updated 17 days ago • 569 • 35
Explainable-Vision-Language-Model 🥶 Generate a video visualizing how a multimodal model attends to an image while generating text • Running on Zero • 13