  1. LLaVA

    We introduce LLaVA (Large Language-and-Vision Assistant), an end-to-end trained large multimodal model that connects a vision encoder and LLM for general-purpose visual and language …

  2. LLaVA: Large Language and Vision Assistant - GitHub

    With additional scaling to LLaVA-1.5, LLaVA-NeXT-34B outperforms Gemini Pro on some benchmarks. It can now process 4x more pixels and perform more tasks/applications than before.

  3. LLaVA: Large Language and Vision Assistant - Microsoft Research

    LLaVA is an open-source project, collaborating with the research community to advance the state of the art in AI. LLaVA represents the first end-to-end trained large multimodal model (LMM) that achieves …

  4. [2304.08485] Visual Instruction Tuning - arXiv.org

    Apr 17, 2023 · When fine-tuned on Science QA, the synergy of LLaVA and GPT-4 achieves a new state-of-the-art accuracy of 92.53%. We make GPT-4 generated visual instruction tuning data, our model …

  5. LLaVA Architecture: From Frozen ViT to Fine-Tuned LLM

    Jun 10, 2025 · A complete technical breakdown of the LLaVA-1.5 multimodal visual assistant. Explore its architecture, open-source training data, and how to use the model.

  6. LLaVa - Hugging Face


  7. haotian-liu/LLaVA | DeepWiki

    Apr 18, 2025 · LLaVA (Large Language and Vision Assistant) is an open-source project that combines vision and language capabilities to create a multimodal AI system. This document provides a high …

  8. LLaVA: Large Language and Vision Assistant Explained | Encord

    In this blog, we will delve into the evolution of visual instruction tuning and explore the specifics of LLaVA, along with its recent iterations, LLaVA-1.5 and LLaVA-1.6 (or LLaVA-NeXT).

  9. Understanding LLaVA: Large Language and Vision Assistant

    Dec 11, 2023 · One of the best places to start is a project that is making waves across all AI/ML communities: LLaVA. LLaVA or Large Language and Vision Assistant is a joint effort from …

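  10. SOTA in One Day of Training! LLaVA-1.5: A Complete Breakdown of Multimodal AI's "Value-for-Money King"

    Dec 21, 2025 · 3. How does LLaVA-1.5 resolve the conflict between "short academic answers" and "long everyday conversations"? Trained two different models to handle each separately; used "Response Format Prompting"; forced the model to output only one … in all cases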