Vision-Guided Chunking Is All You Need: Enhancing RAG with Multimodal Document Understanding • arXiv:2506.16035 • Published Jun 19, 2025 • 86 upvotes
Stream-Omni: Simultaneous Multimodal Interactions with Large Language-Vision-Speech Model • arXiv:2506.13642 • Published Jun 16, 2025 • 27 upvotes
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers • arXiv:2505.21497 • Published May 27, 2025 • 106 upvotes
VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents • arXiv:2410.10594 • Published Oct 14, 2024 • 28 upvotes
Skywork-VL Reward: An Effective Reward Model for Multimodal Understanding and Reasoning • arXiv:2505.07263 • Published May 12, 2025 • 30 upvotes
WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch • arXiv:2505.03733 • Published May 6, 2025 • 17 upvotes
Learning Dynamics in Continual Pre-Training for Large Language Models • arXiv:2505.07796 • Published May 12, 2025 • 19 upvotes
MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining • arXiv:2505.07608 • Published May 12, 2025 • 81 upvotes
On Path to Multimodal Generalist: General-Level and General-Bench • arXiv:2505.04620 • Published May 7, 2025 • 83 upvotes
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities • arXiv:2505.02567 • Published May 5, 2025 • 79 upvotes
Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play • arXiv:2505.02707 • Published May 5, 2025 • 86 upvotes
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning • arXiv:2504.17192 • Published Apr 24, 2025 • 113 upvotes
AlayaDB: The Data Foundation for Efficient and Effective Long-context LLM Inference • arXiv:2504.10326 • Published Apr 14, 2025 • 26 upvotes