LLM-Drop

updated Oct 23, 2024

Model weights for the paper "What Matters in Transformers? Not All Attention is Needed" (https://arxiv.org/abs/2406.15786).

  • s1ghhh/Llama-2-13b-Drop8Block

    13B • Updated Sep 8, 2024 • 4 downloads • 2 likes

  • s1ghhh/Llama-2-13b-Drop4Block

    13B • Updated Sep 8, 2024 • 4 downloads • 2 likes

  • s1ghhh/Llama-2-13b-Drop4Attn

    13B • Updated Sep 8, 2024 • 4 downloads • 2 likes

  • s1ghhh/Llama-2-13b-Drop8Attn

    13B • Updated Sep 8, 2024 • 4 downloads • 2 likes

  • s1ghhh/Llama-2-13b-Drop4MLP

    13B • Updated Sep 8, 2024 • 3 downloads • 2 likes

  • s1ghhh/Llama-2-13b-Drop8MLP

    13B • Updated Sep 8, 2024 • 2 downloads • 2 likes

  • s1ghhh/Mistral-7B-v0.1-Drop4Block

    7B • Updated Sep 8, 2024 • 3 downloads • 2 likes

  • s1ghhh/Mistral-7B-v0.1-Drop8Block

    7B • Updated Sep 8, 2024 • 3 downloads • 2 likes

  • s1ghhh/Mistral-7B-v0.1-Drop4Attn

    7B • Updated Sep 8, 2024 • 3 downloads • 2 likes

  • s1ghhh/Mistral-7B-v0.1-Drop8Attn

    7B • Updated Sep 8, 2024 • 3 downloads • 2 likes

  • s1ghhh/Mistral-7B-v0.1-Drop4MLP

    7B • Updated Sep 8, 2024 • 3 downloads • 2 likes

  • s1ghhh/Mistral-7B-v0.1-Drop8MLP

    7B • Updated Sep 8, 2024 • 3 downloads • 2 likes

  • s1ghhh/Llama-2-70b-Drop

    Text Generation • Updated Oct 23, 2024 • 4 downloads • 2 likes

  • s1ghhh/Llama-3-70b-Drop

    Text Generation • 71B • Updated Oct 23, 2024 • 5 downloads • 4 likes
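
The repository names in the list above appear to follow a pattern of base model, then `Drop`, then an optional count and component (`Block`, `Attn`, or `MLP`). The sketch below splits a repository ID into those parts. It is only a reading of the names as listed, not something the collection page documents; `DropVariant` and `parse_repo_id` are hypothetical helper names introduced for illustration.

```python
import re
from typing import NamedTuple, Optional


class DropVariant(NamedTuple):
    """Parts of a repository ID, as inferred from its name (an assumption)."""
    owner: str
    base_model: str
    num_dropped: Optional[int]   # None when the name carries no count
    component: Optional[str]     # "Block", "Attn", "MLP", or None


# Assumed naming pattern: <owner>/<base>-Drop[<count><component>]
_REPO_PATTERN = re.compile(
    r"^(?P<owner>[^/]+)/(?P<base>.+)-Drop(?:(?P<count>\d+)(?P<kind>Block|Attn|MLP))?$"
)


def parse_repo_id(repo_id: str) -> DropVariant:
    """Split a repository ID from the list into its name components."""
    match = _REPO_PATTERN.match(repo_id)
    if match is None:
        raise ValueError(f"unrecognized repository name: {repo_id!r}")
    count = match.group("count")
    return DropVariant(
        owner=match.group("owner"),
        base_model=match.group("base"),
        num_dropped=int(count) if count is not None else None,
        component=match.group("kind"),
    )


if __name__ == "__main__":
    for repo in (
        "s1ghhh/Llama-2-13b-Drop8Block",
        "s1ghhh/Mistral-7B-v0.1-Drop4MLP",
        "s1ghhh/Llama-2-70b-Drop",
    ):
        print(parse_repo_id(repo))
```

The two 70B entries carry no count or component in their names, so the helper returns `None` for both fields rather than guessing.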