---
title: LLaMA 7B Server
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: "1.0.0"
app_file: app.py
pinned: false
---

# LLaMA 7B Server

A FastAPI-based server for interacting with the LLaMA 7B model.

## Features

- [x] Text generation
- [x] Model parameter configuration
- [x] REST API interface

## API Usage

Make a POST request to `/generate` with the following JSON body:

```json
{
  "prompt": "your prompt here",
  "max_length": 2048,
  "num_beams": 3,
  "early_stopping": true,
  "no_repeat_ngram_size": 3
}
```

Example using curl:

```bash
curl -X POST http://localhost:7860/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello, how are you?"}'
```

Example using Python:

```python
import requests

url = "http://localhost:7860/generate"
data = {
    "prompt": "Hello, how are you?",
    "max_length": 2048,
    "num_beams": 3,
    "early_stopping": True,
    "no_repeat_ngram_size": 3,
}

response = requests.post(url, json=data)
result = response.json()
print(result["generated_text"])  # the generated text
```

## Model Details

- Model: LLaMA 7B
- Parameters: 7 billion
- Language: Multilingual

## Technical Details

- Framework: FastAPI
- Python Version: 3.9+
- Dependencies: See requirements.txt
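Beyond the basic snippet above, a client will usually want a request timeout and error handling. Below is a minimal sketch using only the standard library; the `build_payload` and `generate` names are illustrative helpers, not part of the server itself, and the defaults mirror the JSON body shown earlier:

```python
import json
import urllib.error
import urllib.request


def build_payload(prompt, max_length=2048, num_beams=3,
                  early_stopping=True, no_repeat_ngram_size=3):
    """Assemble the JSON body expected by the /generate endpoint."""
    return {
        "prompt": prompt,
        "max_length": max_length,
        "num_beams": num_beams,
        "early_stopping": early_stopping,
        "no_repeat_ngram_size": no_repeat_ngram_size,
    }


def generate(prompt, host="http://localhost:7860", timeout=60, **params):
    """POST a generation request and return the generated text.

    Raises RuntimeError if the server is unreachable or returns an error.
    """
    req = urllib.request.Request(
        f"{host}/generate",
        data=json.dumps(build_payload(prompt, **params)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return json.loads(resp.read())["generated_text"]
    except urllib.error.URLError as exc:
        raise RuntimeError(f"generation request failed: {exc}") from exc
```

This avoids a dependency on `requests` and fails fast with a clear error when the server is down instead of hanging indefinitely.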