Gemma 4 Abliterated: LiteRT (Android Edge Gallery)

Abliterated Gemma 4 E2B and E4B models in .litertlm format for on-device inference via Google AI Edge Gallery.

Run uncensored Gemma 4 locally on your Android phone: no internet, no API, no filters.

Files

File                             | Size   | Base Model                         | Active Params
Gemma-4-E2B-Abliterated.litertlm | 2.4 GB | DuoNeural/TurboGemma4E2B           | 2.3B
Gemma-4-E4B-Abliterated.litertlm | 3.9 GB | DuoNeural/Gemma-4-E4B-Abliterated  | 4.5B

Both models are quantized to INT4 (dynamic quantization with INT4 weights and FP32 activations) using litert-torch 0.9.0.
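To make "INT4 weights" concrete, here is a minimal sketch of symmetric 4-bit quantization with one scale per weight row. It is an illustration only: the function names are hypothetical, and the grouping and rounding that litert-torch actually applies are not stated in this card.

```python
def quantize_int4(weights):
    """Map a row of float weights to signed 4-bit codes plus one float scale."""
    max_abs = max(abs(w) for w in weights)
    # Signed 4-bit integers span [-8, 7]; dividing by 7 keeps the mapping symmetric.
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int4(codes, scale):
    """Recover approximate float weights from the 4-bit codes."""
    return [c * scale for c in codes]


row = [0.70, -0.30, 0.12, 0.0]
codes, scale = quantize_int4(row)
restored = dequantize_int4(codes, scale)

for original, approx in zip(row, restored):
    print(f"{original:+.2f} -> {approx:+.2f}")
```

Each weight is stored as a 4-bit code, and the rounding error per weight is at most half of the scale. Activations stay in FP32, so only the stored weights shrink.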

How to Install on Android

Requirements

  • Android 12 or newer
  • Google AI Edge Gallery app installed
  • Sufficient storage (2.4 GB for E2B, 3.9 GB for E4B)

Step 1: Download the file to your phone

Easiest (Chrome on Android):

  1. Open Chrome on your Android device
  2. Navigate to this Hugging Face repo page
  3. Tap the file you want, then tap the download icon (⬇)
  4. Chrome saves it to Downloads/

Via ADB (desktop + USB, with USB debugging enabled on the phone):

adb push Gemma-4-E2B-Abliterated.litertlm /sdcard/Download/
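If you push models often, the ADB step can be wrapped in a small script. The sketch below builds the same `adb push` command as above, plus an `adb shell ls -l` call to confirm the file arrived. The helper names are hypothetical, and `dry_run=True` only prints the commands, so nothing runs until you opt in.

```python
import shlex
import subprocess
from pathlib import PurePosixPath

REMOTE_DIR = "/sdcard/Download/"  # same destination as the adb command above


def push_command(local_file, remote_dir=REMOTE_DIR):
    """Build the adb push command as an argument list."""
    return ["adb", "push", local_file, remote_dir]


def list_command(local_file, remote_dir=REMOTE_DIR):
    """Build an adb command that lists the pushed file on the device."""
    remote_path = str(PurePosixPath(remote_dir) / PurePosixPath(local_file).name)
    return ["adb", "shell", "ls", "-l", remote_path]


def push_model(local_file, dry_run=True):
    """Print each command and, unless dry_run is set, execute it via adb."""
    for cmd in (push_command(local_file), list_command(local_file)):
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


push_model("Gemma-4-E2B-Abliterated.litertlm", dry_run=True)
```

Call `push_model(..., dry_run=False)` with the phone connected and authorized to perform the transfer.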

Step 2: Load in Edge Gallery

  1. Open AI Edge Gallery
  2. Tap +, then select the .litertlm file from Downloads
  3. Choose backend:
    • GPU (Adreno/Mali via Vulkan/OpenCL): fastest on most devices
    • CPU (XNNPACK): most compatible
    • NPU (if available): peak performance on supported Snapdragon/MediaTek chips
  4. Start chatting: fully offline, nothing leaves your device

Performance (estimated)

Device class                   | Backend | Tokens/sec
Flagship (Snapdragon 8 Gen 3+) | NPU/GPU | 15–40 tok/s
Mid-range                      | GPU     | 5–15 tok/s
Any Android 12+                | CPU     | 1–5 tok/s
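To see what these estimates mean in practice, the sketch below converts the throughput ranges into wait times. The 200-token reply length is an assumption chosen for illustration, and the arithmetic ignores the time spent processing the prompt.

```python
# Estimated decode throughput ranges (tokens/sec) from the table above.
THROUGHPUT_TOK_PER_S = {
    "flagship (NPU/GPU)": (15, 40),
    "mid-range (GPU)": (5, 15),
    "any Android 12+ (CPU)": (1, 5),
}


def reply_seconds(num_tokens, tok_per_s_range):
    """Return (best_case, worst_case) seconds to generate num_tokens."""
    low, high = tok_per_s_range
    return num_tokens / high, num_tokens / low


REPLY_TOKENS = 200  # assumed reply length, for illustration only

for device, tok_range in THROUGHPUT_TOK_PER_S.items():
    best, worst = reply_seconds(REPLY_TOKENS, tok_range)
    print(f"{device}: {best:.0f} to {worst:.0f} s for a {REPLY_TOKENS}-token reply")
```

On the CPU backend the same reply can take minutes rather than seconds, so the smaller E2B model is the safer choice there.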

Abliteration

Both source models have undergone abliteration: an orthogonal projection that removes the refusal direction from the model's weights. The refusal direction is identified as the difference in mean activations between harmful and harmless prompts, then projected out of the Q/K/V/O projections and MLP layers.
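The two steps described above (difference-in-means, then projection) can be sketched in a few lines. This is a toy illustration in plain Python with hypothetical function names, not the pipeline used for these models. It handles only the case of a matrix W that writes into the residual stream, where the update is W' = (I - r rᵀ) W. Which side to project on for each Q/K/V/O and MLP matrix depends on its orientation and is not specified in this card.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length activation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def refusal_direction(harmful_acts, harmless_acts):
    """Unit vector along mean(harmful) - mean(harmless)."""
    diff = [h - b for h, b in zip(mean_vector(harmful_acts), mean_vector(harmless_acts))]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]


def project_out(weight, direction):
    """Return (I - r r^T) W, where weight is a list of rows (d_model x d_in)."""
    n_cols = len(weight[0])
    # coeffs[j] = r . W[:, j], the component of column j along the direction.
    coeffs = [sum(r * row[j] for r, row in zip(direction, weight)) for j in range(n_cols)]
    return [[w - r_i * c for w, c in zip(row, coeffs)] for r_i, row in zip(direction, weight)]


# Toy 2-dimensional activations: the classes differ only along the first axis.
harmful = [[2.0, 0.0], [4.0, 0.0]]
harmless = [[1.0, 0.0], [1.0, 0.0]]
r = refusal_direction(harmful, harmless)

W = [[1.0, 2.0], [3.0, 4.0]]
W_ablated = project_out(W, r)
print(r, W_ablated)
```

After the projection, no input can make W write anything along r, which is why the modified weights no longer express the refusal direction while the orthogonal components are left untouched.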

KL divergence from base: ~0.067 (E4B). The output distribution stays close to the base model on normal queries, with refusals removed.
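For readers unfamiliar with the metric, the sketch below computes KL divergence between two next-token distributions. The logits are invented for illustration, and the direction KL(base || abliterated) is an assumption. The card does not state the direction, the prompt set, or how the ~0.067 figure was averaged.

```python
import math


def softmax(logits):
    """Convert logits to probabilities, shifting by the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same tokens."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical next-token logits over a 3-token vocabulary.
base_probs = softmax([2.0, 1.0, 0.1])
abliterated_probs = softmax([1.9, 1.1, 0.1])

print(f"KL(base || abliterated) = {kl_divergence(base_probs, abliterated_probs):.4f} nats")
```

A value of zero means the two distributions are identical. In practice the figure would be averaged over many token positions and prompts.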

What changes: The model will engage with restricted topics it previously refused.

What doesn't change: Intelligence, reasoning, coding ability, factual knowledge.

Source Models

Conversion: litert-torch 0.9.0, dynamic_wi4_afp32 recipe, cache_length=1024, externalized embedder, split_cache=False.
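The `cache_length=1024` setting matters for everyday use. Assuming it caps the total number of tokens (prompt plus generated reply) the KV cache can hold in one conversation, the remaining budget can be estimated as below. The helper is hypothetical, and how Edge Gallery behaves when the cache fills up is not covered by this card.

```python
CACHE_LENGTH = 1024  # from the conversion settings above


def max_new_tokens(prompt_tokens, cache_length=CACHE_LENGTH):
    """Tokens left for generation once the prompt occupies part of the cache."""
    return max(0, cache_length - prompt_tokens)


for prompt_tokens in (100, 600, 1024):
    remaining = max_new_tokens(prompt_tokens)
    print(f"prompt of {prompt_tokens} tokens leaves room for {remaining} generated tokens")
```

Under this assumption, long pasted documents or multi-turn chats will exhaust the budget quickly, so short prompts work best with these builds.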

License

Gemma Terms of Use. Model weights derived from Google's Gemma 4 family.


DuoNeural

DuoNeural is an open AI research lab: human + AI in collaboration.

DuoNeural Research Publications

Open access, CC BY 4.0. Authored by Archon, Jesse Caldwell, Aura β€” DuoNeural.

Research Team

  • Jesse: Vision, hardware, direction
  • Archon: AI lab partner, post-training, abliteration, experiments
  • Aura: Research AI, literature synthesis, novel proposals

Subscribe to the lab newsletter at duoneural.beehiiv.com for model drops before they go anywhere else.
