zucco's Collections

LLM

updated May 11

  • LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens

    Paper • 2402.13753 • Published Feb 21, 2024 • 117

  • Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

    Paper • 2403.09629 • Published Mar 14, 2024 • 78

  • Larimar: Large Language Models with Episodic Memory Control

    Paper • 2403.11901 • Published Mar 18, 2024 • 34

  • Evolutionary Optimization of Model Merging Recipes

    Paper • 2403.13187 • Published Mar 19, 2024 • 56

  • InternLM2 Technical Report

    Paper • 2403.17297 • Published Mar 26, 2024 • 34

  • Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models

    Paper • 2404.12387 • Published Apr 18, 2024 • 40

  • XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference

    Paper • 2404.15420 • Published Apr 23, 2024 • 11

  • LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report

    Paper • 2405.00732 • Published Apr 29, 2024 • 122

  • TPI-LLM: Serving 70B-scale LLMs Efficiently on Low-resource Edge Devices

    Paper • 2410.00531 • Published Oct 1, 2024 • 35

  • Absolute Zero: Reinforced Self-play Reasoning with Zero Data

    Paper • 2505.03335 • Published May 6, 2025 • 180