Show me your work... Art gallery.

#10

pinned

by Phil2Sat - opened Oct 26, 2025

Discussion

Phil2Sat

Owner Oct 26, 2025

The main goal is to compare v5.0 to v5.3 and the different quants.

For me it takes over 2 minutes per image.

For NSFW content, maybe upload it elsewhere and link it.

Please add details about steps, model quant and version, samplers, and so on.

Let the war begin.

Phil2Sat pinned discussion Oct 26, 2025

Cemrek

Oct 26, 2025

Based on my observations so far, v5 is currently the best. I noticed that v5.2 significantly alters faces when editing photos. It exaggerates body contours in an unnatural way. It's disproportionate. v5.3 is currently in the download phase.

Phil2Sat

Owner Oct 26, 2025

•

edited Oct 26, 2025

https://civitai.com/models/1939453/qwenedit-consistence-lora can slightly fix this, but here is what I observed:

LoRAs apply differently on FP8 models than on Q-quantized models.

So in my testing phase I tried to recreate all the LoRAs from v5.2 and apply them to the stock 2509 GGUF. The result should be the same with the same weights, but it's not.

I get weird embossed images out of it, so I thought about why. I guess it's the dynamic range: it's lower on FP8, because my GGUF uses F16/Q8 blocks, and those are very similar. Think of Q8 as nearly F16 and FP8 as about half of that.
So if you apply a LoRA that was trained on FP8 to F16/Q8 with weight 1, it oversteers heavily.
Take the qwenedit-consistence-lora: the recommended weight is 0.48, but that gives gridlines. Now think about the halved dynamic range: applying it at somewhere between a quarter and half of that weight gives good results, so somewhere between 0.12 and 0.24 should produce the wanted results.
Next thing: from v5 to v5.3, two extra LoRAs were added. Every new LoRA wants to change the result the way it was trained. The Qwen base is Chinese, so a lot of Chinese training data; https://civitai.com/models/2058077/qwen-imagensfwadv1 is Western, so a lot of Western training data.
Keeping faces consistent is something I wasn't able to achieve at all, no matter whether I tried SDXL, Flux or Qwen. I guess the only way to keep faces really consistent is to train a character LoRA and flood the model with thousands of training images of that exact face.
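The weight arithmetic above can be written down as a tiny sketch. This is only the rule of thumb from this post (a quarter to half of the recommended weight), not a measured property of FP8 or GGUF quants; the function name `scaled_lora_range` and its factor parameters are made up for illustration.

```python
def scaled_lora_range(recommended, low_factor=0.25, high_factor=0.5):
    """Return a (low, high) strength range for a LoRA.

    Assumption: the LoRA's recommended weight was tuned on an FP8 base,
    and on an F16/Q8 GGUF base somewhere between a quarter and half of
    that weight works better. The factors are a heuristic, not a spec.
    """
    low = round(recommended * low_factor, 4)
    high = round(recommended * high_factor, 4)
    return (low, high)


# qwenedit-consistence-lora: recommended weight 0.48
low, high = scaled_lora_range(0.48)
print(f"try LoRA strength between {low} and {high}")
```

For the 0.48 example this gives the 0.12 to 0.24 range mentioned above; other LoRAs may need different factors.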

Here is an actual test. This is what I got with v5.3 and consistence at 0.24, 4 steps. BTW, it's one of my first images with v5.3.
The face is squished and always looks younger. This input is a generated image taken from somewhere, but if I take a portrait of my wife, the result is never ever my wife, no matter which model I use. I've been trying for a month or so.

Cemrek

Oct 26, 2025

•

edited Oct 26, 2025

I downloaded v5.3 and started using it. I edited higher quality and more proportionate images than with v5.2. The disproportionate and constant face-changing problem in v5.2 seems to be solved for now.

Using: v5.2 Q6 vs. v5.3 Q6

boyetosekuji

Oct 26, 2025

I compared NSFW v5 to v5.3 (both Q4_K_M) and found v5.3 to be much worse: the image contrast changes, some images have gridlines, and hand positions change too. v5 kept an image tone similar to the original; only some images had hand placement issues.

Phil2Sat

Owner Oct 26, 2025

•

edited Oct 26, 2025

What settings? Maybe try euler_a with beta/sgm_uniform.
Large changes sometimes need one or two more steps.
Did you add additional LoRAs? If yes, try half weight.

Keep in mind that every additional LoRA needs some percentage of a step for its changes. With a lot of LoRAs (+3 in v5.3), it could need more steps to bring in every detail all the LoRAs want to add. At least that's how my brain translates it.

boyetosekuji

Oct 26, 2025

My settings: 4 steps, no LoRAs, euler_a, sgm_uniform, Qwen2.5-VL-7B-Instruct-abliterated, pig qwen vae. I've tried 4 steps and 8 steps, changed schedulers and samplers, and swapped to Qwen_image_vae; the image tone definitely shifts toward red.

Phil2Sat

Owner Oct 26, 2025

•

edited Oct 26, 2025

Ah, the tone. Try the cfg norm node with 0.92.

But did the gridlines disappear at higher steps?

Phil2Sat

Owner Oct 27, 2025

•

edited Oct 27, 2025

Text encoder: Q4_K_M
Model: v5.3 Q5_K_M
5 steps, euler_a, beta
Edit + pose transfer (OpenPose included but not needed)

Workflow included:

For a little more realism, let the NSFW kick in by adding some bad words:

omnyom

Oct 27, 2025

Hey Phil, great workflow. I wonder what this node is and where I can find it? Thanks ;D

Phil2Sat

Owner Oct 27, 2025

It's not needed, simply remove it. It's for when you want to pre-transform a person's position: maybe they are sitting and you want a lying position, so it's easier for Qwen to get into position in a few steps.

But if you want it, it's from MixLab: https://github.com/shadowcz007/comfyui-mixlab-nodes. I have such a bunch of nodes installed that I don't really remember what is what and have to look them up, hehe.

omnyom

Oct 28, 2025

•

edited Oct 28, 2025

Phil, I'm sorry to bother you again. Skill issue on my part: I was wondering why the fingers don't get interpreted? Sorry, this isn't the place to ask, but I'm still learning, and there's a lot to process in so little time.
But, to honor the main topic, I'm including my artwork (I would not call this artwork, just a test!) using Q6_K.

Phil2Sat

Owner Oct 28, 2025

•

edited Oct 28, 2025

Try changing the OpenPose node to the DWPose node:

I'm also testing right now, and DWPose gives better fingers.

Just testing different things...
