Show me your work... Art gallery.
The main goal is to compare v5.0-v5.3 and the different quants.
For me it takes over 2 min per image.
Maybe upload NSFW content elsewhere and link it.
Please add details about steps, model, quant, version, samplers, and so on...
Let the war begin.
Based on my observations so far, v5 is currently the best. I noticed that v5.2 significantly alters faces when editing photos. It exaggerates body contours in an unnatural way. It's disproportionate. v5.3 is currently in the download phase.
https://civitai.com/models/1939453/qwenedit-consistence-lora can slightly fix this, but here is what I observed:
- LoRAs apply differently on FP8 and Q-quantized models
So in my testing phase I tried to recreate all the LoRAs from 5.2 and apply them to the stock 2509 GGUF. The result should be the same with the same weights, but it's not.
I get weird embossed images out of it, so I thought about why. It's the dynamic range, I guess: it's lower on FP8. My GGUF uses F16/Q_8 blocks, and those are very similar; think of Q8 as nearly F16 and FP8 as half of that.
So if you apply a LoRA that was trained on FP8 to F16/Q_8 with weight 1, it oversteers heavily.
Let's take the qwenedit-consistence-lora: the recommended weight is 0.48, but it gives gridlines. Now think about the halved dynamic range: applying it at somewhere between a quarter and half of the recommended weight gives good results, so somewhere between 0.12 and 0.24 should produce the wanted results.
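The weight arithmetic above can be written out as a tiny helper. This is only a sketch of the rule of thumb from this post (quarter to half of the recommended weight when an FP8-trained LoRA is applied to an F16/Q_8 GGUF); the function name and the default factors are illustrative assumptions, not part of any ComfyUI or LoRA API.

```python
def scaled_lora_weight_range(recommended, low_factor=0.25, high_factor=0.5):
    """Return (low, high) LoRA weights to try, as fractions of the recommended weight.

    The 0.25 and 0.5 defaults encode the "quarter to half" heuristic from the
    post above; they are an assumption, not a measured value.
    """
    low = round(recommended * low_factor, 4)
    high = round(recommended * high_factor, 4)
    return low, high


# Recommended weight of the consistence LoRA, as quoted in the post.
low, high = scaled_lora_weight_range(0.48)
print(f"try LoRA weights between {low} and {high}")
```

For the recommended 0.48 this gives the 0.12 to 0.24 range mentioned above; the same helper can be reused for other LoRAs as a starting point for testing.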
Next thing: from v5 to v5.3, two extra LoRAs were added. Every new LoRA wants to change the result the way it was trained. The Qwen base is Chinese, so it has a lot of Chinese training data; https://civitai.com/models/2058077/qwen-imagensfwadv1 is Western, so it has a lot of Western training data.
- Keeping faces consistent is something I wasn't able to achieve at all, no matter whether I tried SDXL, Flux, or Qwen. I guess the only way to keep faces really consistent is to train a character LoRA and flood the model with thousands of training images of that exact face.
Here is an actual test; that's what I got with 5.3 and consistence at 0.24, 4 steps. BTW, it's one of my first images with 5.3.
The face is squished and always looks younger. This is a generated input taken from somewhere, but if I take a portrait of my wife, the result is never ever my wife, no matter what model I use. I've been trying for a month or so.
I downloaded v5.3 and started using it. The edited images are higher quality and more proportionate than with v5.2. The disproportion and constant face-changing problem in v5.2 seems to be solved for now.
Used: v5.2 Q6 vs. v5.3 Q6
I compared NSFW v5 to v5.3 (both Q4_K_M) and found v5.3 to be much worse: the image contrast changes, some images have gridlines, and hand positions change too. v5 kept an image tone similar to the original; only some images had hand placement issues.
What settings? Maybe try euler_a with beta/sgm_uniform.
Large changes sometimes need one or two more steps.
Did you add additional LoRAs? If yes, try half weight.
Think of it this way: every additional LoRA needs some fraction of a step to make its changes. With a lot of LoRAs (+3 in v5.3), it could need more steps to bring in every detail all the LoRAs want to add. At least that's how my brain translates it.
My settings: 4 steps, no LoRAs, euler_a, sgm_uniform, Qwen2.5-VL-7B-Instruct-abliterated, pig qwen vae. I've tried 4 steps and 8 steps, changed schedulers and samplers, and swapped to Qwen_image_vae; the image tone definitely shifts toward red.
Ah, the tone: try the CFG norm node with 0.92.
But did the gridlines disappear at higher step counts?
It's not needed; simply remove it. It's for when you want to pre-transform the position of a person: maybe the person is sitting and you want a lying position, so it's easier for Qwen to get into position in a few steps.
But if you want it: MixLab, https://github.com/shadowcz007/comfyui-mixlab-nodes. I have such a bunch of nodes installed that I don't really remember what is what; I have to look them up, hehe.
Phil, I'm sorry to bother you again. Skill issue on my part: I was wondering why the fingers don't get interpreted? Sorry, this isn't the place to ask, but I'm still learning; there's a lot to process in so little time.
But, to honor the main topic, I'm including my artwork (I would not call this artwork, just a test!) using Q6_K.







