Usage Examples
Image-to-Image (Default Mode)
- Connect a TOP to the first input of StreamDiffusionTD
- SD Mode is set to `img2img` by default
- Use Step Sliders to control denoising:
- Higher values (40-49) = closer to input image
- Lower values (1-20) = more AI transformation
- Modify prompt and adjust settings to refine output
Text-to-Image Mode
Generate images from text without an input image:
- Set SD Mode to `txt2img` in Settings 2
- Configure your prompt
- Uses seed-based generation
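"Seed-based generation" means the initial latent noise is derived from a fixed seed, so the same prompt and seed reproduce the same image. A minimal illustration of that reproducibility using Python's RNG as a stand-in (the real pipeline seeds its torch noise generator the same way):

```python
import random

def make_noise(seed, n=4):
    """Stand-in for latent-noise sampling: same seed -> identical values."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

# Same seed reproduces the noise (and thus the image); a new seed changes it.
print(make_noise(42) == make_noise(42))  # True
print(make_noise(42) == make_noise(43))  # False
```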
Model Selection
Built-in Options
- `stabilityai/sd-turbo` - Best supported, under 8GB VRAM
- `stabilityai/sdxl-turbo` - Higher quality, needs ~24GB VRAM for all features
- `prompthero/openjourney-v4` - Alternative style
LoRA Compatibility Warning
LoRA support is experimental/untested. Use v0.2.99 for confirmed LoRA support.
`sd-turbo` is SD 2.1-based, so SD 1.5 LoRAs will NOT work with it.
| Model | LoRA Version Needed |
|---|---|
| sd-turbo | SD 2.1 LoRAs (rare) |
| SD 1.5 models | SD 1.5 LoRAs |
| SDXL models | SDXL LoRAs |
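The compatibility table above reduces to a simple lookup. A sketch of a pre-flight check, with family labels of our own choosing (this is not a StreamDiffusionTD API):

```python
# Map each base-model family to the LoRA generation it accepts.
# Family names here are illustrative labels, not identifiers from the operator.
LORA_COMPAT = {
    "sd-turbo": "SD 2.1",  # SD 2.1-based; SD 1.5 LoRAs will not load
    "sd-1.5": "SD 1.5",
    "sdxl": "SDXL",
}

def lora_is_compatible(base_family: str, lora_version: str) -> bool:
    """Return True if a LoRA of `lora_version` matches the base model family."""
    return LORA_COMPAT.get(base_family) == lora_version

print(lora_is_compatible("sd-turbo", "SD 1.5"))  # False: sd-turbo needs SD 2.1 LoRAs
print(lora_is_compatible("sdxl", "SDXL"))        # True
```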
IP Adapter Workflow
Use reference images to guide generation style:
- Enable IP Adapter in Settings 1 (before starting stream)
- Connect reference image to IP Adapter Image parameter
- Adjust IP Adapter Scale (0.0-1.0) to control influence
- Click IP Adapter Update pulse to apply new reference image
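Conceptually, IP Adapter Scale weights how strongly the reference-image conditioning competes with the prompt conditioning. A toy linear-blend sketch of that weighting (illustrative only; the real IP Adapter injects image features inside the attention layers rather than blending vectors like this):

```python
def blend_conditioning(text_feat, image_feat, scale):
    """Toy illustration of IP Adapter Scale as a linear weight.
    scale=0.0 ignores the reference image; scale=1.0 follows it fully.
    """
    return [(1.0 - scale) * t + scale * i for t, i in zip(text_feat, image_feat)]

# At scale 0.25 the result leans 75% on the text conditioning.
print(blend_conditioning([1.0, 0.0], [0.0, 1.0], 0.25))  # [0.75, 0.25]
```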
FaceID Mode
- Enable FaceID toggle in IP Adapter section
- Provide face reference image
- Generated output will preserve facial features
Requirements:
- Requires `insightface` package (installed automatically)
- Uses `buffalo_l` model for face detection
- Only works with SD 1.5 and SDXL models, NOT sd-turbo (SD 2.1 architecture)
Note: whether to use IP Adapter must be decided before the TensorRT engine build; it cannot be toggled afterward.
StreamV2V / Cached Attention (New in v0.3.1)
Video-to-video temporal consistency is back in v0.3.1 using cached attention maps. This smooths frame-to-frame transitions for video input.
Setup
- Go to Settings 2
- Enable Cached Attention (`Cattenable`)
- Set Max Frames (how many frames to cache, default 3)
- Set Interval (how often the cache updates, default 1)
- On the Models page, set Acceleration to `tensorrt` (required)
- Start stream
Tips
- Resolution is locked to the TRT engine build dimensions (e.g., a 512x512 engine only works at 512x512 with V2V)
- Lower max frames = less VRAM, less temporal smoothing
- Higher max frames = more VRAM, more consistency between frames
- Use img2img mode for best results
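The Max Frames / Interval behavior can be pictured as a fixed-size frame cache that refreshes on a schedule. A minimal sketch under the parameter descriptions above (class and method names are hypothetical, and the real cache holds attention maps, not raw frames):

```python
from collections import deque

class AttentionFrameCache:
    """Keep at most `max_frames` entries, refreshing every `interval` frames."""
    def __init__(self, max_frames=3, interval=1):
        self.cache = deque(maxlen=max_frames)  # oldest entries fall off automatically
        self.interval = interval
        self.frame_count = 0

    def step(self, frame):
        if self.frame_count % self.interval == 0:
            self.cache.append(frame)           # cache on schedule
        self.frame_count += 1
        return list(self.cache)

cache = AttentionFrameCache(max_frames=3, interval=2)
for f in range(6):
    kept = cache.step(f)
print(kept)  # [0, 2, 4]: every 2nd frame cached, at most 3 kept
```

Lowering `max_frames` shrinks the deque (less VRAM, less smoothing); raising it keeps more history for stronger frame-to-frame consistency.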
FX Processors
Two built-in FX processors ship with the operator. Add them from the FX Processors parameter page. All parameters update live without restarting.
Quick Start
- Go to the FX Processors parameter page
- Click + to add a processor
- Select from the dropdown (`feedback_loop` or `feedback_grade`)
- Adjust parameters to taste
Common Setup: Infinite Zoom with Color Grading
- Add `feedback_loop` with `zoom: 1.02`, `feedback_strength: 0.7`
- Add `feedback_grade` with `strength: 0.5` and adjust brightness/contrast/hue as needed
- Both run in image_pre, so their effects compound through the feedback loop each frame
- Small values go a long way since everything accumulates. Try `hue_degrees: 2` for slow color cycling
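Because both processors sit inside the feedback loop, their effects compound geometrically frame over frame, which is why small values go a long way. Assuming roughly 30 fps, a 2% per-frame zoom grows like this:

```python
# Feedback effects multiply every frame, so per-frame values compound geometrically.
zoom_per_frame = 1.02      # the zoom: 1.02 setting above
frames_per_second = 30     # assumed stream rate for illustration

after_1s = zoom_per_frame ** frames_per_second
after_5s = zoom_per_frame ** (5 * frames_per_second)
print(f"after 1 s: {after_1s:.2f}x zoom")   # ~1.81x
print(f"after 5 s: {after_5s:.1f}x zoom")   # ~19.5x
```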
You can also create your own custom processors and drop them into the custom_processors/ folder. See the FX Processors page for full parameter reference and custom processor guide.
ControlNet
Input Routing
| Backend | ControlNet Input |
|---|---|
| Local | TOP input 2 |
| Daydream | TOP input 1 |
This switches automatically based on Backend selection.
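The routing rule in the table amounts to a one-entry lookup per backend. A sketch (input indices from the table; the function name is hypothetical, since the operator does this switching internally):

```python
# Which TOP input carries the ControlNet conditioning image, per backend.
CONTROLNET_INPUT = {"Local": 2, "Daydream": 1}

def controlnet_input_index(backend: str) -> int:
    """Return the TOP input index for the ControlNet image."""
    try:
        return CONTROLNET_INPUT[backend]
    except KeyError:
        raise ValueError(f"unknown backend: {backend}")

print(controlnet_input_index("Local"))     # 2
print(controlnet_input_index("Daydream"))  # 1
```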
Setup
- Enable ControlNet on the ControlNet page before starting stream
- Select ControlNet model matching your base model:
  - For SDXL: `xinsir/controlnet-depth-sdxl-1.0`, `xinsir/controlnet-canny-sdxl-1.0`, `xinsir/controlnet-tile-sdxl-1.0`
- Adjust weight to control influence
Preprocessor Options
- `canny` - Edge detection
- `depth` - Depth estimation (CPU, higher VRAM usage)
- `depth_tensorrt` - Depth estimation (GPU accelerated, auto-builds on first use, ~60% faster)
- `hed` - Holistically-nested edge detection
- `external` - Use pre-processed input
- `feedback` - Use previous output as conditioning
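To picture what an edge preprocessor like `canny` feeds ControlNet: the map is bright wherever neighboring pixels differ sharply. A toy gradient-magnitude sketch (not the real Canny algorithm, which adds Gaussian smoothing, non-maximum suppression, and hysteresis thresholding):

```python
def edge_map(img):
    """Toy edge detector: sum of absolute horizontal and vertical differences.
    `img` is a 2D list of grayscale values in [0, 255]."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h - 1):
        for x in range(w - 1):
            gx = img[y][x + 1] - img[y][x]   # horizontal gradient
            gy = img[y + 1][x] - img[y][x]   # vertical gradient
            out[y][x] = abs(gx) + abs(gy)
    return out

# A hard vertical boundary produces a strong response along that column.
img = [[0, 0, 255, 255]] * 4
print(edge_map(img)[0])  # [0, 255, 0, 0]
```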
Multi-ControlNet (New in v0.3.1)
Dual ControlNet streaming is verified (e.g., depth_tensorrt + canny). To use multiple ControlNets, add additional CN blocks with the + button.
Note: Dual ControlNet on 24GB GPUs runs near the VRAM ceiling (~23.5 GB on a 4090) with reduced FPS (4-9 FPS). Single ControlNet is recommended for most workflows.
Depth TRT Auto-Build
When using depth_tensorrt as a preprocessor, the TRT engine builds automatically the first time (~2 minutes on a 4090). After that, it loads instantly. This gives roughly 60% better FPS and uses a fraction of the VRAM compared to the regular depth preprocessor.
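The auto-build behavior is a standard build-once, load-after cache pattern. A sketch of that logic (path layout and function names are hypothetical, not the operator's actual internals):

```python
from pathlib import Path
import tempfile

def get_engine(cache_dir, name, build_fn):
    """Build the engine on first use, then reuse the cached file on later runs."""
    path = Path(cache_dir) / f"{name}.engine"
    if not path.exists():
        path.write_bytes(build_fn())   # slow first build (~2 min for depth TRT on a 4090)
        return path, "built"
    return path, "cached"              # near-instant load afterwards

with tempfile.TemporaryDirectory() as d:
    _, first = get_engine(d, "depth", lambda: b"engine-bytes")
    _, second = get_engine(d, "depth", lambda: b"engine-bytes")
print(first, second)  # built cached
```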
Loading Custom Models
Via HuggingFace ID
- Find model on huggingface.co
- Copy the path (e.g., `stabilityai/sd-turbo`)
- Paste into “Model Id” parameter
- Start stream - downloads automatically on first use
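HuggingFace model IDs follow a `namespace/repo` shape, so a quick format check can catch a local path pasted into Model Id by mistake. A sketch (the regex is a simplification of the Hub's actual naming rules):

```python
import re

# Loose "namespace/repo" check; the Hub's real rules also restrict length
# and character placement, so treat this only as a pre-flight hint.
HF_ID = re.compile(r"^[\w.-]+/[\w.-]+$")

def looks_like_hf_id(model_id: str) -> bool:
    return bool(HF_ID.match(model_id))

print(looks_like_hf_id("stabilityai/sd-turbo"))         # True
print(looks_like_hf_id("C:/models/model.safetensors"))  # False: a local path, not an ID
```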
Via Local File
Local .safetensors path support is unverified. Use HuggingFace IDs for reliable model loading.
Working Models List
After successful streaming (200+ frames), models are saved to: StreamDiffusion/streamdiffusionTD/working_models.json
These appear in the “My Models” dropdown.
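Since the working-models list is plain JSON, it is easy to inspect or script against. A sketch that assumes the file holds a JSON array of model IDs (the actual schema is not documented here, so adjust to what you find in the file):

```python
import json
import os
import tempfile
from pathlib import Path

def load_working_models(path):
    """Read the working-models list, tolerating a missing file.
    Assumes a JSON array of HuggingFace model IDs."""
    p = Path(path)
    if not p.exists():
        return []
    return json.loads(p.read_text())

# Demo against a temporary stand-in for working_models.json:
with tempfile.TemporaryDirectory() as d:
    f = os.path.join(d, "working_models.json")
    Path(f).write_text(json.dumps(["stabilityai/sd-turbo"]))
    print(load_working_models(f))                          # ['stabilityai/sd-turbo']
    print(load_working_models(os.path.join(d, "missing.json")))  # []
```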
Cloud Mode (Daydream)
Zero installation workflow:
- Set Backend to “Daydream”
- Enter API key
- Click Start Stream
Cloud Features
- No GPU required locally
- Works on Mac
- Multi-ControlNet support
- IP Adapter with FaceID
Note: FX Processors, StreamV2V, and custom processors are only available with the Local backend.
VRAM Budget Reference (RTX 4090 / 24 GB)
All values are as measured by nvidia-smi (and include the ~5 GB Windows baseline).
| Config | VRAM | FPS | Notes |
|---|---|---|---|
| Base SDXL-turbo TRT (512x512) | ~18 GB | ~26 | Baseline |
| + Single ControlNet (canny) | ~18 GB | ~15 | +2.4 GB per CN TRT engine |
| + Depth preprocessor (PyTorch) | ~25 GB | ~9 | PyTorch depth adds ~5 GB |
| + Depth preprocessor (depth_tensorrt) | ~18 GB | ~14.5 | TRT engine ~52 MB, auto-builds first use |
| + Dual CN (depth_tensorrt + canny) | ~23.5 GB | 4-9 | At VRAM ceiling on 4090 |
| + IPAdapter | +4.3 GB | ~19 | |
| + StreamV2V (cached attention) | ~13 GB | ~21 | Requires peft package |
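A quick way to sanity-check a planned configuration against the 24 GB ceiling is to add the table's approximate deltas to the baseline. A rough estimator using the numbers above (all figures are approximate nvidia-smi readings from this table, not guarantees):

```python
# Rough VRAM estimator from the table above (RTX 4090, 512x512, nvidia-smi
# readings that include the ~5 GB Windows baseline). Deltas are approximate.
BASE_SDXL_TURBO_TRT = 18.0         # GB, baseline config
DELTAS = {
    "controlnet_trt_engine": 2.4,  # per additional ControlNet TRT engine
    "ip_adapter": 4.3,
    "depth_pytorch": 5.0,          # PyTorch depth preprocessor
}

def estimate_vram(extras, base=BASE_SDXL_TURBO_TRT, ceiling=24.0):
    """Return (estimated GB, fits-under-ceiling) for a list of extras."""
    total = base + sum(DELTAS[e] for e in extras)
    return total, total <= ceiling

total, fits = estimate_vram(["ip_adapter"])
print(f"{total:.1f} GB, fits under 24 GB: {fits}")  # 22.3 GB, fits under 24 GB: True
```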