How to use with vLLM
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Delta-Vector/Plesio-32B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Delta-Vector/Plesio-32B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
Use Docker
docker model run hf.co/Delta-Vector/Plesio-32B

Plesio-32B


Model Information


32B parameters | GLM-4 32B | Creative / Fresh Prose | Co-writing/Roleplay/Adventure Generalist

Another series of merges, since I could never beat Archaeo-32B-KTO! This time it starts off with a GLM merge between Rei and Neon (thanks, Auri!!!).

The merge uses the oh-so-great 0.2 SLERP merge weight, with Neon as the base.
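For readers unfamiliar with SLERP (spherical linear interpolation), here is a minimal sketch of what a 0.2 weight means on a pair of toy weight vectors. It is only an illustration and not the merge tool's actual implementation. The `slerp` helper is hypothetical, and treating `t=0` as the base (Neon) and `t=1` as the other model (Rei) is an assumption; the linked merging config is the authoritative source.

```python
import math


def slerp(v0, v1, t, eps=1e-8):
    """Spherically interpolate between two equal-length vectors.

    Assumption for this sketch: t=0 returns v0 (the base), t=1 returns v1.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 < eps or norm1 < eps:
        # A zero vector has no direction, so fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Cosine of the angle between the two vectors, clamped for acos safety.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)

    if abs(sin_omega) < eps:
        # Nearly parallel vectors: linear interpolation is numerically safer.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Toy stand-ins for one flattened tensor from each model.
base_weights = [1.0, 0.0]
other_weights = [0.0, 1.0]
merged = slerp(base_weights, other_weights, t=0.2)
```

With a weight of 0.2 the result stays much closer to the base vector than to the other one, which matches the idea of Neon being the base of the merge.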

Support me on Ko-Fi: https://ko-fi.com/deltavector

Quantized Versions

Available Downloads

Prompting

The model has been tuned with the GLM-4 prompt format.

Samplers

For testing this model, I used Temp=1 and Min-P=0.1.
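As a rough sketch, the sampler settings above could be passed to a locally running vLLM server through its OpenAI-compatible chat endpoint. The URL and model name mirror the vLLM example from this page; `build_payload` and `chat` are hypothetical helper names. `min_p` is not part of the standard OpenAI schema, so this assumes the server accepts it as an extra sampling parameter in the request body, which vLLM does to the best of my knowledge.

```python
import json
import urllib.request

# Assumed defaults: a server started with `vllm serve "Delta-Vector/Plesio-32B"`.
API_URL = "http://localhost:8000/v1/chat/completions"
MODEL = "Delta-Vector/Plesio-32B"


def build_payload(prompt, temperature=1.0, min_p=0.1):
    """Build a chat-completions request body using the card's sampler settings."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        # Extra sampling parameter; assumed to be accepted by the server.
        "min_p": min_p,
    }


def chat(prompt, url=API_URL):
    """POST the payload to the server and return the assistant's reply text."""
    body = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    return reply["choices"][0]["message"]["content"]
```

Calling `chat("Write the opening line of an adventure story.")` only works while the server is running. Because the request goes through the chat endpoint, the server applies the model's chat template, so the prompt does not need to be formatted by hand.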

Merging Config
https://files.catbox.moe/j9kyfy.yml

Credits

Thank you to Lucy Knada, Auri, Ateron, Alicat, Intervitens, Cgato, Kubernetes Bad and the rest of Anthracite.

Model size: 33B params (Safetensors), tensor type: BF16