🦅 ZeroXClem/Qwen3-4B-Sky-High-Hermes
ZeroXClem/Qwen3-4B-Sky-High-Hermes is a highly distilled and precision-merged Qwen3-based 4B parameter model optimized for ultra-high reasoning, neutral-aligned autonomy, and elevated conversational depth.
This model is an evolution of ZeroXClem/Qwen3-4B-Hermes-Axion-Pro. It combines several Heretic-abliterated reasoning models with distillations of Claude (Sonnet 4, Haiku 4.5, Opus 3), Gemini 3 Pro Preview, MiMo-V2-Flash, and MiniMax-M2.1, all within a 262,144-token context window.
🧠 Merge Methodology
- Merge Method: model_stock
- Base Model: ZeroXClem/Qwen3-4B-Hermes-Axion-Pro
- Precision: bfloat16
- Tokenizer Source: Qwen/Qwen3-4B-Thinking-2507
- Context Length: 262,144 tokens
name: ZeroXClem/Qwen3-4B-Sky-High-Hermes
base_model: ZeroXClem/Qwen3-4B-Hermes-Axion-Pro
dtype: bfloat16
merge_method: model_stock
models:
- model: DavidAU/Qwen3-4B-Instruct-2507-Polaris-Alpha-Distill-Heretic-Abliterated
- model: DavidAU/Qwen3-4B-Thinking-2507-Gemini-3-Pro-Preview-High-Reasoning-Distill-Heretic-Abliterated
- model: DavidAU/Qwen3-4B-Claude-Sonnet-4-Reasoning-Distill-Heretic-Abliterated
- model: TeichAI/Qwen3-4B-Thinking-2507-Claude-Haiku-4.5-High-Reasoning-Distill
- model: TeichAI/Qwen3-4B-Instruct-2507-Claude-Opus-3-Distill
- model: TeichAI/Qwen3-4B-Thinking-2507-MiMo-V2-Flash-Distill
- model: TeichAI/Qwen3-4B-Thinking-2507-MiniMax-M2.1-Distill
tokenizer_source: Qwen/Qwen3-4B-Thinking-2507
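To give a feel for what `merge_method: model_stock` does, here is a minimal sketch of the interpolation rule as described in the Model Stock paper. It is not mergekit's implementation: flat Python lists stand in for per-layer weight tensors, and the names `cosine` and `model_stock_merge` are hypothetical. The assumption is that the merged weights are `t * average + (1 - t) * base`, with `t = N·cosθ / (1 + (N − 1)·cosθ)` and `cosθ` taken as the mean pairwise cosine between the fine-tuned models' offsets from the base.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def model_stock_merge(base, finetuned):
    """Merge one layer's weights; returns (merged_weights, interpolation_ratio).

    base      -- flat list of base-model weights for the layer
    finetuned -- list of flat lists, one per fine-tuned model
    """
    n = len(finetuned)
    if n < 2:
        raise ValueError("need at least two fine-tuned models")

    # Offsets of each fine-tuned model from the base ("task vectors").
    deltas = [[w - b for w, b in zip(ft, base)] for ft in finetuned]

    # Assumption: use the mean pairwise cosine as the single angle estimate.
    pairs = list(combinations(deltas, 2))
    cos_theta = sum(cosine(u, v) for u, v in pairs) / len(pairs)

    denom = 1 + (n - 1) * cos_theta
    if denom == 0:
        raise ValueError("interpolation ratio is undefined for this angle")
    t = n * cos_theta / denom

    # Plain average of the fine-tuned weights, then interpolate toward the base.
    avg = [sum(col) / n for col in zip(*finetuned)]
    merged = [t * a + (1 - t) * b for a, b in zip(avg, base)]
    return merged, t
```

When the fine-tuned offsets are orthogonal, `t` is 0 and the layer stays at the base weights; when they point the same way, `t` is 1 and the layer becomes the plain average.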
⚔️ Purpose & Strengths
Sky-High-Hermes is designed to soar above traditional instruction-tuned models by harmonizing long context, deep reasoning, and full-spectrum capability.
✨ Features
- 🔍 High-Reasoning Ability — Claude, Gemini, and MiMo distills support deep multi-hop and abstract reasoning tasks.
- 🧠 262K Context Window — Massive memory capacity ideal for longform tasks, coding agents, or narrative tracking.
- 🔓 Abliterated via Heretic Protocol — Uncensored, refusal-resistant generation with KL divergence near-zero.
- 💬 Sublime Dialogue & RP — High immersion support for fictional characters, story continuations, and dynamic roleplay.
- ⚙️ Engineering-Grade — Code writing, debugging, and architectural reasoning.
- 🧬 Unsloth Optimized — Fast inference and training via TRL and Unsloth acceleration.
🛠️ Recommended Use
This is a Class 1 model suitable for:
- Creative Writing
- Autonomous Agents
- Code Assistance
- Science & Research
- Story & Worldbuilding
- Philosophical Discourse
- NSFW (Uncensored Use)
For optimal inference:
- Set smoothing_factor = 1.5 (in KoboldCpp, WebUI, or SillyTavern)
- Enable enable_thinking=True for structured prompt reasoning
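With Transformers, one way to pass the thinking flag is as an extra keyword argument to `apply_chat_template`, which forwards it to the chat template. The helper below is a hypothetical sketch (the name `build_prompt` is not part of any library), and whether the flag changes the rendered prompt depends on the template shipped with the tokenizer.

```python
def build_prompt(tokenizer, user_text, enable_thinking=True):
    """Render a single-turn chat prompt as a string (hypothetical helper).

    `tokenizer` is expected to be a Hugging Face tokenizer with a chat template.
    """
    messages = [{"role": "user", "content": user_text}]
    return tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
        # Extra keyword arguments are forwarded to the chat template;
        # a template that does not read this variable will ignore it.
        enable_thinking=enable_thinking,
    )
```

The returned string can then be tokenized and passed to `model.generate` in the usual way.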
🚀 Example Usage
from transformers import AutoModelForCausalLM, AutoTokenizer

# Keep the repo ID and the loaded model object under separate names.
model_name = "ZeroXClem/Qwen3-4B-Sky-High-Hermes"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

prompt = "Explain the philosophical connection between memory and time."
messages = [{"role": "user", "content": prompt}]
# Render the conversation with the model's chat template.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
# Decodes the full sequence, so the printed text includes the prompt.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
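The example above prints the decoded sequence as-is. If the output contains a reasoning trace, it can be separated from the final answer with a small helper. The sketch below assumes the trace is delimited by `<think>` and `</think>` tags, that the tags survive decoding, and that the opening tag may be absent because the chat template already emitted it. The function name `split_reasoning` is hypothetical.

```python
def split_reasoning(text, open_tag="<think>", close_tag="</think>"):
    """Split decoded text into (reasoning, answer).

    Returns an empty reasoning string when no closing tag is found.
    """
    # Everything after the last closing tag is treated as the final answer.
    head, sep, tail = text.rpartition(close_tag)
    if not sep:
        return "", text.strip()
    # Keep only what follows the first opening tag, if there is one.
    reasoning = head.split(open_tag, 1)[-1].strip()
    return reasoning, tail.strip()
```

For example, `reasoning, answer = split_reasoning(tokenizer.decode(outputs[0], skip_special_tokens=True))` would keep the trace and the answer in separate variables.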
🔐 Licensing & Alignment
- License: Apache 2.0
- Censorship: Abliterated via Heretic v1.0.1
- KL Divergence: ~0.00 – 0.06
- Refusal Rate: ≤ 8/100
- Content Moderation: Recommended when deploying in production
💌 Acknowledgements
Gratitude to:
- DavidAU – For crafting and liberating high-reasoning uncensored blends
- TeichAI – For high-resolution Claude/Gemini distillations
- Unsloth Team – For inference-accelerated Qwen3 tuning frameworks
- MergeKit (Arcee AI) Team - For providing such a fantastic framework to merge these LLMs
- Alibaba Qwen Team - For providing such amazing open source base models
🕊️ Soar higher. Think freer. Respond truer. Built with love by ZeroXClem | 2026
