---
title: DiffSketcher
emoji: 🎨
colorFrom: blue
colorTo: purple
sdk: custom
app_file: handler.py
pinned: false
license: mit
tags:
- text-to-svg
- vector-graphics
- diffusion
- sketch
- art
pipeline_tag: text-to-image
---
# DiffSketcher: Text Guided Vector Sketch Synthesis
DiffSketcher generates vector sketches from text prompts, using a latent diffusion model as guidance. Given a natural language description, it produces the sketch as an SVG.
## Model Description
DiffSketcher uses Stable Diffusion to guide the generation of vector graphics. It optimizes the SVG paths so that the rendered sketch matches the semantic content of the input text while keeping the look of a hand-drawn sketch.
## Usage
### Direct API Call
```python
import requests

API_URL = "https://api-inference.huggingface.co/models/jree423/diffsketcher"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}  # replace with your own token


def query(payload):
    """POST the payload to the inference endpoint and return the parsed JSON."""
    response = requests.post(API_URL, headers=headers, json=payload)
    response.raise_for_status()  # raise on HTTP errors instead of parsing an error body
    return response.json()


output = query({
    "inputs": "a beautiful mountain landscape",
    "parameters": {
        "num_paths": 96,
        "num_iter": 500,
        "guidance_scale": 7.5,
        "width": 224,
        "height": 224,
        "seed": 42,
    },
})
```
### Using the Inference Client
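```python
from huggingface_hub import InferenceClient

client = InferenceClient("jree423/diffsketcher")

# Assumes a huggingface_hub version that still provides `InferenceClient.post`.
# Parameters omitted here are expected to take the defaults listed under "Parameters".
result = client.post(
    json={
        "inputs": "a cat sitting on a windowsill",
        "parameters": {
            "num_paths": 128,
            "guidance_scale": 8.0,
        },
    }
)
```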
## Parameters
- **num_paths** (int, default: 96): Number of SVG paths to generate. More paths create more detailed sketches.
- **num_iter** (int, default: 500): Number of optimization iterations. More iterations improve quality but take longer.
- **guidance_scale** (float, default: 7.5): Controls how closely the generation follows the text prompt.
- **width** (int, default: 224): Output SVG width in pixels.
- **height** (int, default: 224): Output SVG height in pixels.
- **seed** (int, default: 42): Random seed for reproducible results.
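
The parameters above can be collected into a small payload builder. This is a minimal sketch, not part of the model's API: `build_payload` and `DEFAULT_PARAMETERS` are hypothetical names, and the defaults are copied from the list above. The helper rejects parameter names that are not documented here, which catches typos before a request is sent.

```python
from typing import Any

# Defaults copied from the parameter list above.
DEFAULT_PARAMETERS: dict[str, Any] = {
    "num_paths": 96,
    "num_iter": 500,
    "guidance_scale": 7.5,
    "width": 224,
    "height": 224,
    "seed": 42,
}


def build_payload(prompt: str, **overrides: Any) -> dict[str, Any]:
    """Build a request payload (hypothetical helper, not part of the model API).

    Raises ValueError for parameter names that are not in the documented list.
    """
    unknown = set(overrides) - set(DEFAULT_PARAMETERS)
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")
    # Start from the documented defaults and apply the caller's overrides.
    return {"inputs": prompt, "parameters": {**DEFAULT_PARAMETERS, **overrides}}


payload = build_payload("a red apple", num_paths=128, seed=7)
```

The resulting `payload` has the same shape as the dictionaries passed in the usage examples, so it can be sent with either approach shown under "Usage".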
## Output Format
The model returns a JSON object containing:
- `svg`: The generated SVG content as a string
- `svg_base64`: Base64 encoded SVG for easy transmission
- `prompt`: The input text prompt
- `parameters`: The parameters used for generation
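
A response with these fields could be turned into an SVG file roughly as follows. This is a sketch under assumptions: `extract_svg` is a hypothetical helper, `svg_base64` is assumed to be standard Base64 of the UTF-8 encoded SVG text, and the `sample` dictionary is a stand-in for a real response.

```python
import base64
from pathlib import Path


def extract_svg(result: dict) -> str:
    """Return SVG markup from a response, preferring `svg` over `svg_base64`."""
    if result.get("svg"):
        return result["svg"]
    if result.get("svg_base64"):
        # Assumption: standard Base64 over UTF-8 encoded SVG text.
        return base64.b64decode(result["svg_base64"]).decode("utf-8")
    raise KeyError("response contains neither 'svg' nor 'svg_base64'")


# Stand-in response for illustration; a real one would come from the API call.
svg_markup = '<svg xmlns="http://www.w3.org/2000/svg" width="224" height="224"></svg>'
sample = {
    "svg_base64": base64.b64encode(svg_markup.encode("utf-8")).decode("ascii"),
    "prompt": "a red apple",
}

decoded = extract_svg(sample)
Path("sketch.svg").write_text(decoded, encoding="utf-8")
```

Writing the file with an explicit UTF-8 encoding keeps the output consistent across platforms, and the saved `sketch.svg` can be opened in a browser or vector editor.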
## Examples
### Simple Objects
- "a red apple"
- "a flying bird"
- "a vintage car"
### Complex Scenes
- "a mountain landscape with trees"
- "a city skyline at sunset"
- "a garden with flowers and butterflies"
### Artistic Styles
- "a portrait in the style of Van Gogh"
- "minimalist line drawing of a face"
- "abstract geometric patterns"
## Technical Details
- **Base Model**: Stable Diffusion 2.1
- **Framework**: PyTorch + Diffusers
- **Vector Rendering**: DiffVG (differentiable vector graphics)
- **Optimization**: Adam optimizer with custom learning rates for different SVG parameters
## Citation
```bibtex
@inproceedings{xing2023diffsketcher,
title={DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models},
author={Xing, XiMing and others},
booktitle={NeurIPS},
year={2023}
}
```
## License
This model is released under the MIT License.