DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge
- Fixed EOS Token
- Better merge and less yapping.
- Can use emoji somehow, almost like Gemma-3. I don't know how or why, but it's cute sometimes.
Settings
"temperature": 0.8,
"top_k": 40,
"top_p": 0.95,
"min_p": 0.05,
"repeat_last_n": 40,
"repeat_penalty": 1.2,
Run on Ollama
These are non-imatrix quants. I'll release the imatrix version later.
GGUF 3-bit Q3_K_M, about 27 GB of VRAM/RAM:
ollama run hf.co/h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge-GGUF:Q3_K_M
GGUF 4-bit Q4_K_M, about 30 GB of VRAM/RAM:
ollama run hf.co/h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge-GGUF:Q4_K_M
GGUF 5-bit Q5_K_M, about 33 GB of VRAM/RAM:
ollama run hf.co/h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge-GGUF:Q5_K_M
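The sampler values under Settings and the quant tags above can be combined in a single request to a local Ollama server. The following is a minimal sketch, not something from the original card: it assumes Ollama is listening on its default address (`http://localhost:11434`) and uses its `/api/chat` endpoint. The helper names `pick_quant`, `build_chat_request`, and `send` are hypothetical, and the memory figures are copied from this README rather than measured.

```python
import json
import urllib.request

REPO = "hf.co/h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge-GGUF"

# Memory estimates in GB, as stated in this README (not independently measured).
QUANTS = [("Q3_K_M", 27), ("Q4_K_M", 30), ("Q5_K_M", 33)]

# Sampler values from the Settings section.
OPTIONS = {
    "temperature": 0.8,
    "top_k": 40,
    "top_p": 0.95,
    "min_p": 0.05,
    "repeat_last_n": 40,
    "repeat_penalty": 1.2,
}


def pick_quant(available_gb):
    """Return the largest listed quant whose stated footprint fits, else None."""
    fitting = [(name, gb) for name, gb in QUANTS if gb <= available_gb]
    if not fitting:
        return None
    return max(fitting, key=lambda q: q[1])[0]


def build_chat_request(prompt, available_gb):
    """Build a non-streaming /api/chat payload using the recommended settings."""
    quant = pick_quant(available_gb)
    if quant is None:
        raise ValueError("no listed quant fits in the given memory budget")
    return {
        "model": f"{REPO}:{quant}",
        "messages": [{"role": "user", "content": prompt}],
        "options": dict(OPTIONS),
        "stream": False,
    }


def send(payload, url="http://localhost:11434/api/chat"):
    """POST the payload to Ollama and return the assistant's reply text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["message"]["content"]


payload = build_chat_request("hi", available_gb=32)
print(payload["model"])
# With an Ollama server running and the model pulled:
# print(send(payload))
```

Passing the values through `options` keeps them per-request, so the defaults stored with the model are left untouched.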
Output Sample
Using GGUF Q5_K_M with default Ollama settings. FP-16 here.
User: hi
Assistant: Hello! It's nice to meet you. How are you doing today? 😊
User: alright tell me some joke
Assistant: Sure, here's a lighthearted joke for you:
What do you call a fake noodle?
An impasta! 🍝🤪
Hope that made you chuckle. If you'd like another one or need any other assistance, just let me know!
User: Give me 5 non-formal ways to say 'See you later' in French.
Assistant: Certainly! Here are five casual ways to say "see you later" in French:
1. À plus tard - This is a very common and informal way to bid farewell.
2. Salut à plus - A more relaxed version that combines "hello/goodbye" (salut) with "see you later."
3. Ciao à plus - Using the Italian "ciao" gives it an even more casual feel.
4. Bisous à plus tard - This one adds a friendly touch by including "kisses" before saying see you later.
5. À la prochaine - Literally means "until next time," but is often used casually as well.
Remember that French people generally use "tu" (informal) with friends and family, while "vous" is more formal for strangers or in professional settings. These phrases work best when using the informal "tu." 🇫🇷✨
Merge Details
This is a merge of pre-trained language models created using mergekit.
Merge Method
This model was merged using the TIES merge method, with Gryphe/Pantheon-RP-1.8-24b-Small-3.1 as the base.
Models Merged
The following models were included in the merge:
- PocketDoc/Dans-PersonalityEngine-V1.2.0-24b
Configuration
The following YAML configuration was used to produce this model:
models:
  - model: Gryphe/Pantheon-RP-1.8-24b-Small-3.1
    parameters:
      density: 0.7
      weight: 0.7
  - model: PocketDoc/Dans-PersonalityEngine-V1.2.0-24b
    parameters:
      density: 0.5
      weight: 0.5
merge_method: ties
base_model: Gryphe/Pantheon-RP-1.8-24b-Small-3.1
parameters:
  normalize: false
  int8_mask: true
dtype: float16
tokenizer:
  source: union
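To make the `density` and `weight` parameters more concrete, here is a simplified sketch of the TIES idea on toy vectors. It is not mergekit's implementation, only an illustration under stated assumptions: each model is a flat list of floats, `density` is the fraction of largest-magnitude differences from the base that are kept, `weight` scales those differences, and the result is left unnormalized (matching `normalize: false`). The function names and the toy numbers are hypothetical.

```python
def trim(delta, density):
    """Keep the largest-magnitude `density` fraction of entries; zero the rest."""
    k = int(round(density * len(delta)))
    order = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    keep = set(order[:k])
    return [d if i in keep else 0.0 for i, d in enumerate(delta)]


def ties_merge(base, models):
    """Merge `models` into `base`; each model is (params, density, weight)."""
    # 1. Task vectors: difference from the base, trimmed and weighted.
    weighted = []
    for params, density, weight in models:
        delta = [p - b for p, b in zip(params, base)]
        weighted.append([weight * d for d in trim(delta, density)])

    merged = []
    for j, b in enumerate(base):
        column = [w[j] for w in weighted]
        # 2. Elect a sign per parameter from the summed weighted deltas.
        sign = 1.0 if sum(column) >= 0 else -1.0
        # 3. Sum only the deltas that agree with the elected sign
        #    (no division by total weight, i.e. unnormalized).
        agreeing = [c for c in column if c * sign > 0]
        merged.append(b + sum(agreeing))
    return merged


# Toy example with hypothetical values (not the real model weights).
base = [0.0, 1.0, -1.0, 2.0]
model_a = [0.5, 1.0, -1.5, 2.25]
model_b = [-0.25, 1.5, -0.5, 2.0]

merged = ties_merge(base, [(model_a, 0.5, 1.0), (model_b, 0.5, 0.5)])
print(merged)  # [0.5, 1.25, -1.5, 2.0]
```

In the third position the two models pull in opposite directions; the larger weighted change wins the sign election and the disagreeing change is dropped, which is how TIES reduces interference between merged models.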
My System
Motherboard: Asus Maximus Z790
CPU: Intel i9-13900KS
RAM: 96 GB, 6400 MHz
GPU: RTX 4090 OC, 24 GB