Av

Avi66

AI & ML interests

ML Research, LLMs, Applications MultiModality

Collections 5

ReTool: Reinforcement Learning for Strategic Tool Use in LLMs

Paper • 2504.11536 • Published Apr 15 • 63
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning

Paper • 2505.24726 • Published May 30 • 277
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

Paper • 2503.12605 • Published Mar 16 • 35
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

Paper • 2506.13585 • Published Jun 16 • 273

mervinpraison/llama-3.1-tamilan-8B

8B • Updated Aug 3, 2024 • 29
mradermacher/Llama-3.1-70B-Instruct-Tamil-GGUF

71B • Updated Jul 30, 2024 • 103
RichardErkhov/Hemanth-thunder_-_Tamil-Mistral-7B-v0.1-gguf

7B • Updated Jun 26, 2024 • 194
mradermacher/Tamil-Mistral-7B-Instruct-v0.1-i1-GGUF

7B • Updated Dec 17, 2024 • 379

models 0

None public yet

datasets 0

None public yet
