Background generator
The Background generator setting controls which AI model draws the background image for each section (“act”) of your song. The app ships with a cloud default that works out of the box, plus two optional local SDXL models for users who want everything to stay on their own machine.

What this setting does
Every render needs one background image per act. The Background generator setting picks which model produces those images. It doesn’t change anything else about the render — prompts, scene analysis, and the act-per-section structure are identical regardless of which generator you pick.
The three options
Pollinations.ai (default)
A hosted cloud service. Prompts are sent over the internet and finished 4K images come back a few seconds later. No download, no GPU required, no setup.
- Where it runs: cloud (pollinations.ai).
- Output: 4K, direct from the service.
- Cost: free, no account needed.
- Speed: ~5–15 seconds per act, depending on load.
- Requirements: an internet connection.
SDXL-Lightning (local)
A Stable Diffusion XL Lightning model that runs on your own GPU. Weights are downloaded once (~7.3 GB) and every image from then on is generated offline. Fast 4-step inference via the Lightning LoRA — great for quick iteration.
- Where it runs: on your machine (Apple Silicon via Metal, or NVIDIA via CUDA).
- Output: 4K image with better detail; accepts more detailed prompts.
- Cost: free after the one-time download.
- Speed: typically faster than the cloud on a capable GPU, and not rate-limited.
- Requirements: a GPU with at least 6 GB of VRAM (8 GB recommended) and the model weights on disk.
- Trade-off: faces and skin are softer than RealVisXL — pick RealVisXL when acts feature people prominently.
RealVisXL V5 (local)
A photoreal SDXL fine-tune that also runs on your own GPU. Weights are downloaded once (~6.5 GB) and every image from then on is generated offline. Slower than Lightning (~20 inference steps instead of 4) but produces noticeably stronger faces and skin.
- Where it runs: on your machine (Apple Silicon via Metal, or NVIDIA via CUDA).
- Output: 1344×768 native, full 20-step inference.
- Cost: free after the one-time download.
- Speed: ~45–90 seconds per act on a capable GPU — slower than Lightning, faster than older baseline diffusion.
- Requirements: a GPU with at least 8 GB of VRAM (10 GB recommended) and the model weights on disk.
- Best for: songs where acts feature people, performance shots, or anything where faces matter.
Which one should I pick?
| If you care about… | Pick |
|---|---|
| Zero setup, no download | Pollinations.ai |
| Native 4K with no upscaler | Pollinations.ai |
| Working offline | SDXL-Lightning or RealVisXL V5 |
| Nothing leaving your machine | SDXL-Lightning or RealVisXL V5 |
| Not hitting cloud rate limits on big batches | SDXL-Lightning or RealVisXL V5 |
| Fast iteration on a capable GPU | SDXL-Lightning |
| Strong faces and skin (people in your acts) | RealVisXL V5 |
| Lowest VRAM footprint among local options | SDXL-Lightning |
Short version: stick with Pollinations.ai unless you specifically want local-only rendering. If you go local, pick SDXL-Lightning for speed, or RealVisXL V5 when acts feature people.
Hardware requirements
For both local generators (SDXL-Lightning and RealVisXL V5):
- macOS: Apple Silicon (M1 or newer) with Metal. Integrated GPUs on Intel Macs are not supported.
- Windows: NVIDIA GPU with CUDA. Minimum VRAM depends on the model — ≥6 GB for SDXL-Lightning, ≥8 GB for RealVisXL V5.
- CPU-only machines are explicitly rejected. Local SDXL on CPU takes minutes per image, so the app refuses to start either model and shows a message pointing you back to Pollinations.ai.
Pollinations.ai has no local hardware requirements beyond a working internet connection.
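The rules above can be read as a small eligibility check. The following is a minimal sketch, not the app's actual code: the function name, the backend labels, and the generator ids are hypothetical, and only the VRAM minimums come from this page. It applies the VRAM check to CUDA GPUs only, which is how the requirement is worded for Windows.

```python
from typing import Optional

# Minimum VRAM in GB per local generator, as stated on this page.
# The dictionary keys are hypothetical ids, not the app's real ones.
MIN_VRAM_GB = {"sdxl-lightning": 6, "realvisxl-v5": 8}


def local_generator_problem(generator: str, backend: str,
                            vram_gb: Optional[float] = None) -> Optional[str]:
    """Return None if the generator may run, else a short reason.

    backend is one of "metal", "cuda", or "cpu" (hypothetical labels).
    """
    if backend == "cpu":
        return "CPU-only machines are not supported; use Pollinations.ai"
    if backend == "metal":
        # Apple Silicon: no separate VRAM check in this sketch.
        return None
    if backend == "cuda":
        needed = MIN_VRAM_GB[generator]
        if vram_gb is None or vram_gb < needed:
            return f"needs at least {needed} GB of VRAM"
        return None
    return "no supported GPU detected"
```

For example, `local_generator_problem("realvisxl-v5", "cuda", 6)` reports that 8 GB is needed, while the same 6 GB card passes for SDXL-Lightning.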
Switching generators
- Open Settings → Background generator.
- You’ll see a card for each option. The active one is marked In use.
- Click Use this on the card you want.
The change applies to the next act rendered — already-rendered backgrounds stay as they are.
Installing a local generator
Both SDXL-Lightning and RealVisXL V5 use the same install flow on their cards under Settings → Background generator. The first time you pick a local generator, you’ll need to get its weights.
Download from inside the app
Click Download on the generator card. The weights stream from HuggingFace into <app data>/models/sdxl/<generator-id>/ (about 7.3 GB for SDXL-Lightning, 6.5 GB for RealVisXL V5). The card’s button switches to a disabled Downloading… label, and a progress bar appears at the bottom of the app window — visible on every page — showing the percentage complete. Cancel on that bottom bar aborts the download and cleans up any partial file.
When the download finishes, the bottom bar disappears, the app fires a “Background generator ready” notification, and the card shows an Installed badge plus a Delete button. Clicking Delete removes the weights from disk (useful if you want to reclaim the space and go back to Pollinations.ai or the other local option).
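The download behaviour described above (progress percentage, cancel, cleanup of the partial file) could look roughly like the sketch below. It is an illustration under assumptions, not the app's implementation: the function name, the `.part` suffix, and the callback shape are all hypothetical, and the chunks are passed in as a plain iterable so the example runs without a network connection. The caller supplies the destination path; the `<app data>` location is deliberately left unspecified here.

```python
import threading
from pathlib import Path
from typing import Callable, Iterable


def download_weights(chunks: Iterable[bytes], total_bytes: int, dest: Path,
                     cancel: threading.Event,
                     on_progress: Callable[[int], None]) -> bool:
    """Stream chunks to dest; return True when complete, False if cancelled.

    total_bytes must be greater than zero. A cancelled download removes
    its partial file.
    """
    dest.parent.mkdir(parents=True, exist_ok=True)
    partial = dest.with_name(dest.name + ".part")  # hypothetical naming
    written = 0
    with open(partial, "wb") as fh:
        for chunk in chunks:
            if cancel.is_set():
                break
            fh.write(chunk)
            written += len(chunk)
            # Report whole-number percentage, capped at 100.
            on_progress(min(100, written * 100 // total_bytes))
    if cancel.is_set():
        partial.unlink(missing_ok=True)  # clean up the partial file
        return False
    partial.replace(dest)  # only a finished download gets the final name
    return True
```

Writing to a temporary name and renaming at the end means a half-finished file is never mistaken for installed weights.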
“I already have the weights”
If you already have the weights on disk — from diffusers, ComfyUI, a previous app install, etc. — click I already have the weights on the card and paste the path to the model file or folder. The app links it into its models dir without re-downloading.
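One way to link existing weights without copying them is a symbolic link. The sketch below assumes that approach; this page does not say how the app links the files, and the function name and arguments are hypothetical. On Windows, creating symlinks may require extra privileges.

```python
import os
from pathlib import Path


def link_existing_weights(source: Path, models_dir: Path, generator_id: str) -> Path:
    """Link user-supplied weights into models_dir/<generator_id>/ without copying."""
    source = source.expanduser().resolve()
    if not source.exists():
        raise FileNotFoundError(f"No weights found at {source}")
    target_dir = models_dir / generator_id
    target_dir.mkdir(parents=True, exist_ok=True)
    link = target_dir / source.name
    if link.is_symlink() or link.exists():
        raise FileExistsError(f"{link} already exists")
    # Works for both a single model file and a folder of weights.
    os.symlink(source, link, target_is_directory=source.is_dir())
    return link
```

Because only a link is created, the original file or folder stays where it is and no extra disk space is used.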
Retrying a failed download
If the download drops mid-stream, the card keeps the error message visible. Click Download again to resume or retry.
Prompt enrichment
The app shapes prompts differently for each generator so you get the most out of whichever model you’re using. The SCENE prompt the LLM writes for each act is generator-agnostic — what gets added to it depends on which generator is active.
Pollinations.ai
Pollinations is sent the SCENE prompt verbatim. Its URL-based API can’t take a negative prompt and prefers shorter inputs. No camera/medium tokens, no negative — short and direct.
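To illustrate why a URL-based API leaves no room for a negative prompt, here is a minimal sketch of how a prompt might be placed into a request URL. The endpoint shape is an assumption and is not stated on this page; check the service's own documentation before relying on it. The function name is hypothetical.

```python
from urllib.parse import quote

# Assumed endpoint shape; not taken from this page.
ASSUMED_BASE = "https://image.pollinations.ai/prompt/"


def pollinations_url(scene_prompt: str, base: str = ASSUMED_BASE) -> str:
    """Percent-encode the SCENE prompt into the URL path, otherwise unchanged."""
    return base + quote(scene_prompt, safe="")
```

The whole prompt travels inside the URL, which is also why shorter inputs are preferable.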
SDXL (local generators)
Both SDXL-Lightning and RealVisXL V5 get the same enriched treatment because they can both use it:
- Positive prompt: the SCENE prompt plus camera/lens language (medium shot / 50mm / anamorphic, depending on the song’s mood), a medium descriptor (vivid film, muted grain, soft palette, etc., based on the palette), and a fixed composition hint that reserves negative space for lyric overlay. The SCENE prompt itself is anchored in the song’s Setting, a single location chosen once at analysis time. Each act’s subject (the person, vehicle, or object the lyrics are about) gets placed inside that setting, so the camera pans across one consistent world instead of jumping locations every act.
- Negative prompt: built from three sources.
  - Technical defaults (watermarks, extra limbs, blurriness, JPEG artefacts).
  - Auto-derived avoid tokens: when the AI Lyric Generator builds the visual direction for a song, it lists 3–5 things this song should not depict (e.g. for a moody late-night track: “bright daylight, crowds, smiling faces”). These are added automatically.
  - Per-act override: if you’ve edited the negative prompt for a specific act in the Generate dialog, your edit replaces both the technical defaults and the auto avoid tokens for that one act only.
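The assembly described in the list above can be sketched as follows. This is an illustration only: the function name, the constant wording, and the comma-joined format are assumptions, not the app's actual prompt text.

```python
from typing import List, Optional, Tuple

# Wording of these constants is illustrative, not the app's actual text.
COMPOSITION_HINT = "negative space reserved for lyric overlay"
TECHNICAL_DEFAULTS = ["watermark", "extra limbs", "blurry", "jpeg artifacts"]


def build_sdxl_prompts(scene: str, camera: str, medium: str,
                       avoid_tokens: List[str],
                       override: Optional[str] = None) -> Tuple[str, str]:
    """Return (positive, negative) prompts for a local SDXL generator."""
    # Positive: SCENE prompt + camera/lens + medium + fixed composition hint.
    positive = ", ".join([scene, camera, medium, COMPOSITION_HINT])
    if override is not None:
        # A per-act override replaces both defaults and avoid tokens.
        negative = override
    else:
        negative = ", ".join(TECHNICAL_DEFAULTS + avoid_tokens)
    return positive, negative
```

Note that the override is a full replacement rather than an addition, matching the per-act behaviour described above.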
Editing the negative prompt per act
When a local SDXL generator (Lightning or RealVisXL V5) is active, the per-act AI button on the Create page opens a dialog that has a Negative Prompt textarea below the positive prompt.
- It’s pre-filled with whatever the model would use right now — technical defaults plus song-level avoid tokens, or your saved override if you’ve edited it before.
- Edit it and hit Generate. Your edit is saved to the act and used for this and any future re-renders of that act.
- A Reset to default link appears when your textarea differs from the auto value; clicking it reverts to whatever the model would emit on its own and clears the saved override.
The textarea is hidden entirely when Pollinations is active because there’s nothing to send.
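The pre-fill, save, and reset behaviour of the Negative Prompt textarea can be modelled with a few lines of state. This is a hypothetical sketch: the class and method names are invented for illustration, and treating an unchanged textarea as "no override" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NegativePromptField:
    auto_value: str                       # what the model would use on its own
    saved_override: Optional[str] = None  # per-act edit, if any

    def prefill(self) -> str:
        """Text shown when the dialog opens."""
        if self.saved_override is not None:
            return self.saved_override
        return self.auto_value

    def show_reset(self, current_text: str) -> bool:
        """The reset link appears only when the text differs from the auto value."""
        return current_text != self.auto_value

    def generate(self, current_text: str) -> str:
        """Save an edit as the act's override and return the text to send."""
        # Assumption: text equal to the auto value is not stored as an override.
        self.saved_override = current_text if current_text != self.auto_value else None
        return current_text

    def reset(self) -> str:
        """Clear the saved override and return to the auto value."""
        self.saved_override = None
        return self.auto_value
```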
Memory and the idle release
Local SDXL pipelines (both Lightning and RealVisXL V5) are several GB of weights and stay resident in VRAM/RAM after each render so back-to-back regenerates don’t pay the cold-load cost again. Holding onto that much memory after you’ve stopped generating makes typing in the per-act prompt textarea sluggish, so the app drops the pipeline automatically after 90 seconds of inactivity. The next time you hit Generate, it’s reloaded transparently: that first call after an idle period takes the usual ~30 seconds (longer for RealVisXL), and every call after that is fast again.
Renders that walk through multiple acts back-to-back keep the pipeline warm; the timer only fires once nothing’s been generated for a full 90 s. The render queue also drops the pipeline immediately when it finishes the last waiting song, so a freshly-drained queue doesn’t sit on memory waiting for the timer.
Switching between SDXL-Lightning and RealVisXL V5 unloads the previous pipeline first — only one local generator is resident at a time.
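The keep-warm and idle-release behaviour can be sketched as a small cache that holds at most one pipeline. This is an illustrative model under assumptions, not the app's code: the class name, the `loader` callback, and the periodic `tick()` call are hypothetical, and the clock is injectable so the logic can be exercised without waiting 90 real seconds.

```python
import time
from typing import Callable, Optional

IDLE_SECONDS = 90.0  # idle window stated on this page


class PipelineCache:
    """Keeps at most one local pipeline loaded and drops it when idle."""

    def __init__(self, loader: Callable[[str], object],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._clock = clock
        self._name: Optional[str] = None
        self._pipeline: Optional[object] = None
        self._last_used = 0.0

    def get(self, name: str) -> object:
        """Return the pipeline for name, loading it only if needed."""
        if self._name != name:
            self.release()                       # unload the other generator first
            self._pipeline = self._loader(name)  # cold load
            self._name = name
        self._last_used = self._clock()          # each use restarts the idle window
        return self._pipeline

    def release(self) -> None:
        """Drop the pipeline now (also usable when the render queue drains)."""
        self._pipeline = None
        self._name = None

    def tick(self) -> None:
        """Call periodically; releases the pipeline after the idle window."""
        if self._pipeline is not None and self._clock() - self._last_used >= IDLE_SECONDS:
            self.release()

    @property
    def loaded(self) -> Optional[str]:
        return self._name
```

Back-to-back calls to `get()` keep resetting the idle window, which is why a multi-act render stays warm, and calling `release()` directly models the immediate drop after the last queued song.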
When a local generator refuses to run
A few conditions will cause the app to fall back to Pollinations.ai or surface an error instead of silently producing garbage. These apply to both SDXL-Lightning and RealVisXL V5:
- Not available in this build — the Python runtime that powers local diffusion (diffusers / transformers) wasn’t bundled with the version of the app you’re running. The card shows a Not available in this build notice and the Download button is hidden so you can’t start a multi-GB download you wouldn’t be able to use. If the weights are already on disk from a previous install, the Delete button stays available so you can reclaim the space. Update the app to a build that includes the runtime, then the card returns to normal.
- No supported GPU detected — the card stays disabled and the Use this button is replaced with an explanation of what’s missing (no Metal, no CUDA, or a CUDA GPU with less VRAM than the model needs — 6 GB for Lightning, 8 GB for RealVisXL V5).
- CPU-only fallback attempted — the app refuses with a message recommending Pollinations.ai instead. Image generation on CPU is too slow to be useful.
- Weights missing or corrupt — the card shows a clear error and the Download button reappears.
In every case the error is shown on the card itself, not buried in logs, and switching back to Pollinations.ai is one click.
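As a rough model of how a card could decide which notice to show, the sketch below checks the conditions in a fixed order. The order of precedence, the enum, and the function name are assumptions made for illustration; this page does not specify how the app prioritises the checks. CPU-only machines are folded into the "no supported GPU" case here.

```python
from enum import Enum


class CardState(Enum):
    NOT_IN_BUILD = "Not available in this build"
    NO_GPU = "No supported GPU detected"
    NEEDS_DOWNLOAD = "Weights missing or corrupt"
    READY = "Ready"


def card_state(runtime_bundled: bool, gpu_ok: bool, weights_ok: bool) -> CardState:
    """Pick the first failing condition, in an assumed order of precedence."""
    if not runtime_bundled:
        return CardState.NOT_IN_BUILD   # nothing else matters without the runtime
    if not gpu_ok:
        return CardState.NO_GPU         # includes CPU-only machines in this sketch
    if not weights_ok:
        return CardState.NEEDS_DOWNLOAD
    return CardState.READY
```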
See also
- Create — where backgrounds are generated as part of the render.
- File locations — where the SDXL weights are stored on disk.