---
pipeline_tag: image-text-to-text
tags:
- MLX
- mlx
base_model:
- Qwen/Qwen3-VL-4B-Instruct
---

# Qwen3-VL-4B-Instruct

Run **Qwen3-VL-4B-Instruct** optimized for **Apple Silicon** on MLX with [NexaSDK](https://github.com/NexaAI/nexa-sdk).

## Quickstart

1. **Install [NexaSDK](https://github.com/NexaAI/nexa-sdk)**
2. Run the model locally with one line of code:

   ```bash
   nexa infer NexaAI/qwen3vl-4B-fp16-mlx
   ```

## Model Description

**Qwen3-VL-4B-Instruct** is a 4-billion-parameter instruction-tuned multimodal large language model from Alibaba Cloud’s Qwen team.
As part of the **Qwen3-VL** series, it fuses powerful vision-language understanding with conversational fine-tuning, optimized for real-world applications such as chat-based reasoning, document analysis, and visual dialogue.

The *Instruct* variant is tuned for following user prompts naturally and safely, producing concise, relevant, and user-aligned responses across text, image, and video contexts.

## Features

- **Instruction-Following**: Optimized for dialogue, explanation, and user-friendly task completion.
- **Vision-Language Fusion**: Understands and reasons across text, images, and video frames.
- **Multilingual Capability**: Handles multiple languages for diverse global use cases.
- **Contextual Coherence**: Balances reasoning ability with natural, grounded conversational tone.
- **Lightweight & Deployable**: 4B parameters make it efficient for edge and device-level inference.

## Use Cases

- Visual chatbots and assistants
- Image captioning and scene understanding
- Chart, document, or screenshot analysis
- Educational or tutoring systems with visual inputs
- Multilingual, multimodal question answering

## Inputs and Outputs

**Input:**
- Text prompts, image(s), or mixed multimodal instructions.

**Output:**
- Natural-language responses or visual reasoning explanations.
- Can return structured text (summaries, captions, answers, etc.) depending on the prompt.

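For scripted use, the Quickstart command can be wrapped in a small shell function that first checks whether the `nexa` CLI is on the `PATH`. This is only a minimal sketch: the function name `run_qwen3vl`, the `MODEL` variable, and the error message are illustrative and not part of NexaSDK. The only thing taken from the Quickstart is the `nexa infer NexaAI/qwen3vl-4B-fp16-mlx` invocation.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of NexaSDK): run the Quickstart command
# only when the `nexa` CLI is available.

MODEL="NexaAI/qwen3vl-4B-fp16-mlx"  # model ID copied from the Quickstart

run_qwen3vl() {
  # `command -v` returns non-zero when `nexa` cannot be resolved on PATH.
  if ! command -v nexa >/dev/null 2>&1; then
    echo "nexa CLI not found; install NexaSDK first: https://github.com/NexaAI/nexa-sdk" >&2
    return 127
  fi
  # Same invocation as the Quickstart, with the model ID taken from $MODEL.
  nexa infer "$MODEL"
}
```

After sourcing the file, calling `run_qwen3vl` either starts the model or returns exit status 127 with a pointer to the install page, which gives a clearer failure than the shell's default "command not found" message.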

## License

Refer to the [official Qwen license](https://huggingface.co/Qwen) for terms of use and redistribution.