SmolLM2-1.7B-Instruct-Prompt-Enhancer

Model Description

SmolLM2-1.7B-Instruct-Prompt-Enhancer is a fine-tuned version of unsloth/SmolLM2-1.7B-Instruct, trained to convert simple image descriptions into SVG-friendly prompts. It expands basic concepts into detailed, vector-oriented descriptions that emphasize geometric shapes, flat design principles, and SVG-compatible visual elements.

Key Innovation: SVG-Optimized Prompt Engineering

This model addresses a critical gap in vector graphics generation:

Input: Simple, casual image descriptions ("a lighthouse overlooking the ocean")
Output: Detailed SVG-friendly prompts with geometric precision and flat design specifications
Purpose: Optimize text-to-SVG generation by providing vector-appropriate prompts

Intended Use

This model transforms simple descriptions into SVG-friendly prompts by:

Preserving all original elements while expanding description detail
Adding geometric precision for complex shapes and arrangements
Specifying SVG constraints (no gradients, no shadows, clean edges)
Emphasizing flat design principles for vector compatibility
Providing spatial arrangements and compositional guidance

Model Details

Base Model: unsloth/SmolLM2-1.7B-Instruct
Model Size: 1.7B parameters
Fine-tuning Method: LoRA (Low-Rank Adaptation)
Training Framework: Transformers + TRL + PEFT + Unsloth
License: apache-2.0

Training Details

Training Configuration

Training Method: Supervised Fine-Tuning (SFT) with LoRA
LoRA Configuration:
- r: 24
- lora_alpha: 48
- lora_dropout: 0.05
- Target modules: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
Training Parameters:
- Epochs: 5
- Learning Rate: 8e-5
- Batch Size: 8 (per device)
- Gradient Accumulation Steps: 2
- Max Sequence Length: 2048
- LR Scheduler: Cosine
- NEFTune Noise Alpha: 5 (for improved generalization)
- Validation: 10% holdout with early stopping

Enhanced Dataset

Size: 13,000 examples of simple→SVG-friendly transformations
Sources: Generated using Claude 3.5 Sonnet and Gemini 2.0 Flash
Quality: High-quality prompt engineering examples
Coverage: Diverse visual concepts, geometric patterns, everyday objects, and complex compositions
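
The card does not document how the simple→SVG-friendly pairs were stored. As a hypothetical illustration, one plausible layout is a conversational record with system, user, and assistant turns, reusing the user-message template from the Basic Usage code in this card. The function name `to_chat_record` and the `messages` key are assumptions, not the author's confirmed format.

```python
def to_chat_record(simple: str, enhanced: str, system_msg: str) -> dict:
    """Build one hypothetical SFT record from a (simple, enhanced) pair.

    The user-message wording mirrors the template used in this card's
    Basic Usage code; the overall record layout is an assumption.
    """
    user_msg = (
        "Transform this into an SVG-friendly prompt with geometric shapes "
        f"and flat design: {simple}"
    )
    return {
        "messages": [
            {"role": "system", "content": system_msg},
            {"role": "user", "content": user_msg},
            {"role": "assistant", "content": enhanced},
        ]
    }


# Example pair taken from the "three trees on a hill" example in this card.
record = to_chat_record(
    simple="three trees on a hill",
    enhanced=(
        "Vector illustration of minimalist landscape featuring three geometric "
        "tree structures with triangular canopies and rectangular trunks "
        "positioned on an elevated curved hill shape, solid earth tones and "
        "greens, no gradients or shadows, clean hard edges, flat design "
        "aesthetic, 2D perspective with simplified natural forms."
    ),
    system_msg="You are an expert prompt engineer ...",  # shortened placeholder
)

for turn in record["messages"]:
    print(turn["role"], "->", turn["content"][:60])
```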

Usage

Installation

pip install transformers torch accelerate

Basic Usage

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch

model_path = "kawchar85/SmolLM2-1.7B-Instruct-Prompt-Enhancer"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
    trust_remote_code=True,
    device_map="auto"
)

# Create pipeline
chat_pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    return_full_text=False,
)

def get_message(simple_prompt):
    system_msg = """\
You are an expert prompt engineer specializing in converting simple image concepts into SVG-friendly prompts.
When given a short description, output exactly one enhanced prompt that:
1. PRESERVES ALL DETAILS mentioned in the simple prompt - every element must appear in the enhanced version
2. NEVER omits or changes any objects, colors, or shapes mentioned in the simple prompt
3. Uses geometric precision to represent complex elements
4. Includes terms like "vector illustration," "flat design," "minimalist," "geometric shapes," "solid colors"
5. Specifies "no gradients", "no shadows", "clean edges", "hard edges" 
6. Mentions "2D perspective" or "flat style" to avoid 3D rendering
7. Emphasizes "solid fill colors" and "uniform stroke weight"
8. Always includes specific spatial arrangement of elements (positioned, arranged, distributed)
9. Keeps the background simple (solid color, transparent, or minimal)
10. Does NOT just paste the simple prompt with minor additions - truly EXPANDS and DESCRIBES it
11. Is written in 30-70 words as one fluid, descriptive sentence.
"""
    
    user_msg = f'Transform this into an SVG-friendly prompt with geometric shapes and flat design: {simple_prompt}'
    return [
        {"role": "system", "content": system_msg},
        {"role": "user", "content": user_msg}
    ]

# Generate SVG-friendly prompt
description = "a lighthouse overlooking the ocean"
messages = get_message(description)

output = chat_pipe(
    messages, 
    max_new_tokens=128,
    do_sample=False,
)

print(f"Original: {description}")
print(f"Enhanced: {output[0]['generated_text']}")

Example Transformations

Example 1: Abstract Patterns

Simple: "overlapping magenta and cyan trapezoids"

Enhanced: "Flat vector design with minimalist styling featuring magenta and cyan trapezoidal shapes of varying sizes arranged to create overlapping geometric patterns, solid fill colors, no gradients, no textures, clean hard edges, 2D perspective for efficiency, uniform stroke weight, ideal for SVG format."

Example 2: Transport

Simple: "ambulance driving to hospital"

Enhanced: "Vector depiction of an ambulance defined by a rectangular body, circular wheels, and a red cross symbol displayed on the side as simple intersecting rectangles, driving towards a stylized hospital depicted with a rectangular building and a smaller cross, solid color fills, no gradients or shadows, clean hard edges, minimalist emergency vehicle design viewed in 2D."

Example 3: Household

Simple: "simple bed with two pillows"

Enhanced: "Flat vector design of a minimalist bed, constructed from a rectangular frame, supporting a rectangular mattress and two square pillows positioned symmetrically at the head, solid fill colors in calming blues and whites, clean edges, no gradients or shadows, 2D top-down view, emphasizing simple geometric forms."

Example 4: Landscape

Simple: "a lighthouse overlooking the ocean"

Enhanced: "Vector illustration of coastal scene featuring tall cylindrical structure with white light element positioned above flat blue water surface, geometric lighthouse construction with light positioning, solid fill colors with no gradients or shadows, clean hard edges, flat design aesthetic, 2D perspective, oceanic silhouette with beacon indication."

Example 5: Abstract Patterns

Simple: "khaki triangles and azure crescents"

Enhanced: "Vector illustration of geometric celestial pattern featuring triangular khaki shapes with crescent curve silhouettes, geometric crescents with triangle elements, solid fill colors with no gradients or shadows, clean hard edges, flat design aesthetic, 2D perspective, uniform stroke weight."

Example 6: Fashion

Simple: "burgundy corduroy pants with patch pockets and silver buttons"

Enhanced: "Vector illustration of minimalist clothing design featuring burgundy trapezoidal leg shapes with textured pattern indication through subtle geometric lines, rectangular patch pockets positioned symmetrically, circular silver button elements arranged vertically, flat design aesthetic with solid fill colors, no gradients or shadows, clean hard edges, 2D perspective for apparel visualization."

Example 7: Landscape

Simple: "three trees on a hill"

Enhanced: "Vector illustration of minimalist landscape featuring three geometric tree structures with triangular canopies and rectangular trunks positioned on an elevated curved hill shape, solid earth tones and greens, no gradients or shadows, clean hard edges, flat design aesthetic, 2D perspective with simplified natural forms."

Example 8: Technology

Simple: "a robot with gears and buttons"

Enhanced: "Vector design of mechanical figure composed of rectangular torso, circular head, and cylindrical limbs, featuring geometric gear wheels as circular elements with radiating teeth, square and circular button interfaces distributed across the surface, solid metallic colors, no gradients or shadows, clean hard edges, minimalist robotic aesthetic in 2D perspective."

Key Features

SVG Optimization Principles

Geometric Precision: Complex shapes described through basic geometric forms
Flat Design Emphasis: Consistent specification of 2D perspective and flat styling
Technical Constraints: Consistently specifies constraints such as "no gradients," "no shadows," and "clean edges"
Vector Terminology: Uses "vector illustration," "solid fill colors," "uniform stroke weight"
Spatial Awareness: Detailed positioning and arrangement descriptions
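
Since these principles amount to a checklist, a generated prompt can be screened with a few string tests before it is passed to a text-to-SVG system. The sketch below is a rough heuristic, not part of the model or its training pipeline; the function name `check_svg_prompt` and the keyword groups are assumptions chosen to match the phrases listed in this card.

```python
# Keyword groups: a check passes if any keyword in the group appears.
# Note: this only detects that a term is mentioned, not that it is negated.
REQUIRED_GROUPS = {
    "mentions_gradients": ("gradient",),
    "mentions_shadows": ("shadow",),
    "clean_edges": ("clean edges", "hard edges"),
    "flat_or_2d": ("2d", "flat"),
}


def check_svg_prompt(text: str, min_words: int = 30, max_words: int = 70) -> dict:
    """Return a per-check report for one enhanced prompt."""
    lowered = text.lower()
    report = {
        name: any(keyword in lowered for keyword in keywords)
        for name, keywords in REQUIRED_GROUPS.items()
    }
    n_words = len(text.split())
    report["word_count_ok"] = min_words <= n_words <= max_words
    return report


# Enhanced prompt from the "three trees on a hill" example in this card.
enhanced = (
    "Vector illustration of minimalist landscape featuring three geometric "
    "tree structures with triangular canopies and rectangular trunks "
    "positioned on an elevated curved hill shape, solid earth tones and "
    "greens, no gradients or shadows, clean hard edges, flat design "
    "aesthetic, 2D perspective with simplified natural forms."
)

for name, passed in check_svg_prompt(enhanced).items():
    print(f"{name}: {passed}")
```

A failed check is a cue to inspect or regenerate the output and does not prove the prompt is unusable, because wording such as "without shading" would slip past these keyword tests.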

Content Preservation

Element Fidelity: Trained to preserve all original objects, colors, and shapes, though an output can still drop a detail (Example 5 omits "azure")
Detail Expansion: Simple concepts are elaborated with geometric precision
Contextual Enhancement: Spatial relationships and compositions are clarified
Style Consistency: Maintains coherent SVG-friendly vocabulary throughout
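
Element fidelity can be spot-checked by testing whether each content word of the simple prompt still appears in the enhanced one. The sketch below is a crude heuristic under stated assumptions: the stopword list, the plural stripping, and the function name `missing_elements` are all illustrative, and substring matching will miss synonyms and paraphrases.

```python
import re

# Small illustrative stopword list (an assumption, not exhaustive).
STOPWORDS = {"a", "an", "the", "and", "or", "of", "on", "in", "to", "with", "at", "by", "for"}


def missing_elements(simple: str, enhanced: str) -> list:
    """List content words from `simple` that do not appear in `enhanced`."""
    enhanced_lower = enhanced.lower()
    missing = []
    for word in re.findall(r"[a-z]+", simple.lower()):
        if word in STOPWORDS:
            continue
        # Crude plural handling: "triangles" also matches "triangle".
        stem = word[:-1] if word.endswith("s") and len(word) > 3 else word
        if stem not in enhanced_lower:
            missing.append(word)
    return missing


# Example 5 from this card.
simple = "khaki triangles and azure crescents"
enhanced = (
    "Vector illustration of geometric celestial pattern featuring triangular "
    "khaki shapes with crescent curve silhouettes, geometric crescents with "
    "triangle elements, solid fill colors with no gradients or shadows, clean "
    "hard edges, flat design aesthetic, 2D perspective, uniform stroke weight."
)

print(missing_elements(simple, enhanced))  # ['azure']
```

A non-empty result flags a prompt for review. Here it shows that the color "azure" was dropped in Example 5, so outputs are worth checking when exact colors matter.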

Performance

Inference Speed: ~2-3 seconds per transformation (hardware-dependent)
Output Length: Optimized for 30-70 words (concise yet comprehensive)
Consistency: Reliable SVG-specific terminology and constraint specification
Quality: High-quality prompt engineering with geometric precision

Limitations

Specialized Domain: Optimized for SVG/vector use cases; may not suit other prompt types
Length Constraints: Designed for concise enhancements (30-70 words)
Style Specificity: Focused on flat design aesthetic rather than diverse art styles
Vector Focus: May over-emphasize geometric precision for organic/natural subjects

Technical Specifications

Architecture: Transformer-based language model (1.7B parameters)
Context Length: 2048 tokens (supports detailed prompt transformations)
Training: SFT with a 10% validation holdout, early stopping, and NEFTune noise (alpha=5) for improved generalization
Optimization: LoRA fine-tuning (r=24, alpha=48) with cosine scheduling
Inference: Optimized for short, precise outputs with deterministic generation

Citation

@misc{smollm2-prompt-enhancer-2025,
  title={SmolLM2-1.7B-Instruct-Prompt-Enhancer: Specialized Model for SVG-Friendly Prompt Generation},
  author={kawchar85},
  year={2025},
  url={https://huggingface.co/kawchar85/SmolLM2-1.7B-Instruct-Prompt-Enhancer}
}
