Instructions for using h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge with libraries and local apps.

Libraries

How to use h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge")
model = AutoModelForCausalLM.from_pretrained("h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
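
The same call can be made from Python. The following is a minimal sketch using only the standard library; it assumes the server started above is listening on localhost:8000, and the helper names `build_chat_request` and `chat` are made up for this example. The response is assumed to follow the OpenAI chat-completions shape.

```python
import json
import urllib.request

MODEL_ID = "h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge"


def build_chat_request(prompt, base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, base_url="http://localhost:8000"):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, base_url)) as resp:
        body = json.load(resp)
    # Assumption: OpenAI-style response with a `choices` list.
    return body["choices"][0]["message"]["content"]


# Usage (requires a running server):
# print(chat("What is the capital of France?"))
```

Passing a different `base_url`, for example `http://localhost:30000`, should work the same way against any other server that exposes the same OpenAI-compatible endpoint.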

Use Docker

docker model run hf.co/h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge

SGLang

How to use h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge with Docker Model Runner:
```
docker model run hf.co/h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge

Fixed EOS token.
Better merge and less yapping.
Can somehow use emoji, almost like Gemma-3. I don't know how or why, but it's just cute sometimes.

Settings

"temperature": 0.8,
"top_k": 40,
"top_p": 0.95,
"min_p": 0.05,
"repeat_last_n": 40,
"repeat_penalty": 1.2,

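The option names above match Ollama's sampling parameters, so one way to apply them is through the `options` field of Ollama's REST API. This is a sketch under that assumption: it assumes a local Ollama server on its default port 11434 and the Q4_K_M tag from this model's GGUF repo; `build_ollama_payload` and `ask_ollama` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: the Q4_K_M tag; swap in another quant tag as needed.
OLLAMA_MODEL = "hf.co/h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge-GGUF:Q4_K_M"

# Sampling settings copied from the Settings section.
SETTINGS = {
    "temperature": 0.8,
    "top_k": 40,
    "top_p": 0.95,
    "min_p": 0.05,
    "repeat_last_n": 40,
    "repeat_penalty": 1.2,
}


def build_ollama_payload(prompt):
    """Build the JSON body for Ollama's /api/chat endpoint."""
    return {
        "model": OLLAMA_MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "options": dict(SETTINGS),
        "stream": False,  # ask for one JSON object instead of a stream
    }


def ask_ollama(prompt, base_url="http://localhost:11434"):
    """Send the chat request and return the reply text."""
    req = urllib.request.Request(
        f"{base_url}/api/chat",
        data=json.dumps(build_ollama_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]


# Usage (requires a running Ollama server with the model pulled):
# print(ask_ollama("hi"))
```
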
Run on Ollama

These are non-imatrix. I'll release the imatrix version later.

GGUF 3-bit Q3_K_M, about 27 GB of VRAM/RAM:

ollama run hf.co/h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge-GGUF:Q3_K_M

GGUF 4-bit Q4_K_M, about 30 GB of VRAM/RAM:

ollama run hf.co/h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge-GGUF:Q4_K_M

GGUF 5-bit Q5_K_M, about 33 GB of VRAM/RAM:

ollama run hf.co/h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge-GGUF:Q5_K_M
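
To choose between these, a small helper can pick the largest quant that fits a given memory budget. This is only a sketch: the memory figures are the approximate ones quoted above, and `pick_quant` and `ollama_command` are hypothetical names, not part of Ollama.

```python
# Approximate VRAM/RAM needs in GB, as quoted in this section.
QUANT_MEMORY_GB = {"Q3_K_M": 27, "Q4_K_M": 30, "Q5_K_M": 33}
GGUF_REPO = "hf.co/h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge-GGUF"


def pick_quant(available_gb):
    """Return the largest quant whose quoted requirement fits, or None."""
    fitting = [(gb, name) for name, gb in QUANT_MEMORY_GB.items() if gb <= available_gb]
    if not fitting:
        return None
    return max(fitting)[1]


def ollama_command(available_gb):
    """Return the matching `ollama run` command line, or None if nothing fits."""
    quant = pick_quant(available_gb)
    if quant is None:
        return None
    return f"ollama run {GGUF_REPO}:{quant}"


print(ollama_command(32))
```

With a 32 GB budget the helper selects Q4_K_M, since the 33 GB quoted for Q5_K_M does not fit.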

Output Sample

Using GGUF Q5_K_M with default Ollama settings. FP-16 here.

User: hi
Assistant: Hello! It's nice to meet you. How are you doing today? 😊

User: alright tell me some joke
Assistant: Sure, here's a lighthearted joke for you:
What do you call a fake noodle?
An impasta! 🍝🤪
Hope that made you chuckle. If you'd like another one or need any other assistance, just let me know!

User: Give me 5 non-formal ways to say 'See you later' in French.
Assistant: Certainly! Here are five casual ways to say "see you later" in French:
1. À plus tard - This is a very common and informal way to bid farewell.
2. Salut à plus - A more relaxed version that combines "hello/goodbye" (salut) with "see you later."
3. Ciao à plus - Using the Italian "ciao" gives it an even more casual feel.
4. Bisous à plus tard - This one adds a friendly touch by including "kisses" before saying see you later.
5. À la prochaine - Literally means "until next time," but is often used casually as well.
Remember that French people generally use "tu" (informal) with friends and family, while "vous" is more formal for strangers or in professional settings. These phrases work best when using the informal "tu." 🇫🇷✨

Merge Details

This is a merge of pre-trained language models created using mergekit.

Merge Method

This model was merged using the TIES merge method, with Gryphe/Pantheon-RP-1.8-24b-Small-3.1 as the base.

Models Merged

The following models were included in the merge:

PocketDoc/Dans-PersonalityEngine-V1.2.0-24b

Configuration

The following YAML configuration was used to produce this model:

models:
  - model: Gryphe/Pantheon-RP-1.8-24b-Small-3.1
    parameters:
      density: 0.7
      weight: 0.7
  - model: PocketDoc/Dans-PersonalityEngine-V1.2.0-24b
    parameters:
      density: 0.5
      weight: 0.5
      
merge_method: ties
base_model: Gryphe/Pantheon-RP-1.8-24b-Small-3.1
parameters:
  normalize: false
  int8_mask: true
dtype: float16
tokenizer:
  source: union
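
To give a feel for what `density` and `weight` do in a TIES merge, here is a toy sketch of the three steps described in the TIES paper (trim, elect sign, merge) on plain Python lists. It is not mergekit's implementation; the function names are hypothetical, and real merges operate per tensor on the full model weights. `normalize=False` mirrors the config above.

```python
def trim(delta, density):
    """Keep the top `density` fraction of entries by magnitude; zero the rest."""
    k = int(density * len(delta))
    order = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    keep = set(order[:k])
    return [d if i in keep else 0.0 for i, d in enumerate(delta)]


def ties_merge(base, models, densities, weights, normalize=False):
    """Merge fine-tuned parameter lists into `base`, TIES-style."""
    # 1. Task vectors (model minus base), trimmed by density, scaled by weight.
    weighted = []
    for params, density, weight in zip(models, densities, weights):
        delta = [p - b for p, b in zip(params, base)]
        weighted.append([weight * d for d in trim(delta, density)])

    merged = []
    for i, b in enumerate(base):
        column = [w[i] for w in weighted]
        # 2. Elect a sign per parameter from the summed weighted deltas.
        sign = 1.0 if sum(column) >= 0 else -1.0
        # 3. Sum only the deltas that agree with the elected sign.
        agreeing = [(c, wt) for c, wt in zip(column, weights) if c * sign > 0]
        update = sum(c for c, _ in agreeing)
        if normalize and agreeing:
            update /= sum(wt for _, wt in agreeing)
        merged.append(b + update)
    return merged


base = [0.0, 0.0, 0.0, 0.0]
model_a = [1.0, -0.2, 0.5, 0.0]
model_b = [-0.4, 0.1, 0.6, 0.3]
print(ties_merge(base, [model_a, model_b], densities=[0.5, 0.5], weights=[1.0, 1.0]))
```

In this toy run, the first parameter keeps only `model_a`'s positive change because `model_b` disagrees on sign, while the third parameter receives both changes. Small-magnitude entries are dropped by the trim step before the vote.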

My System

Motherboard: Asus Maximus Z790
CPU: Intel i9-13900KS
RAM: 96 GB 6400 MHz
GPU: RTX-4090 OC 24GB

Safetensors

Model size: 24B params
Tensor type: F16

Paper for h34v7/DansXPantheon-RP-Engine-V1.2-24b-Small-Instruct-Ties-Merge

Resolving Interference When Merging Models (arXiv:2306.01708, published Jun 2, 2023)