# No-GPU Image Slideshow Video Production

When AI image generation APIs are unavailable (rate-limited, blocked, or unreachable) and the server has no GPU, use **Pillow (PIL)** to draw infographic-style images and **espeak** for offline TTS, then assemble the video with ffmpeg.

## Workflow

```
Write Script → PIL Generate Images → espeak TTS → ffmpeg Image+Audio → Concat → HTTP Serve
```

## Step 1: Generate Infographic Images with PIL

```bash
pip3 install Pillow --break-system-packages
```

### Key PIL Patterns

```python
from PIL import Image, ImageDraw, ImageFont

width, height = 1280, 720
img = Image.new('RGB', (width, height), (10, 10, 40))  # dark bg
draw = ImageDraw.Draw(img)
font = ImageFont.truetype("/usr/share/fonts/truetype/dejavu/DejaVuSans-Bold.ttf", 36)
font2 = ImageFont.truetype("/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf", 24)
```

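### ⚠️ MULTILINE TEXT: Top/bottom anchors not supported

Drawing text that contains `\n` with `anchor="mt"` raises `ValueError: anchor not supported for multiline text`. Pillow rejects anchors whose vertical part is `t` (top) or `b` (bottom) for multiline text. Use separate single-line calls instead: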

```python
draw.text((x, y), "line1", fill=(R,G,B), font=font)
draw.text((x, y+30), "line2", fill=(R,G,B), font=font)
```
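The per-line approach can be wrapped in a small helper that also centers each line. This is a minimal sketch: `draw_centered_lines` and its `line_height` parameter are hypothetical names, and the built-in default font is only a stand-in for the DejaVu fonts loaded earlier.

```python
from PIL import Image, ImageDraw, ImageFont

def draw_centered_lines(draw, lines, center_x, top_y, font, fill, line_height=40):
    """Draw each line with its own draw.text() call, centered on center_x.

    Returns the y coordinate just below the last line.
    """
    y = top_y
    for line in lines:
        # textlength() measures a single line of text, in pixels
        line_width = draw.textlength(line, font=font)
        draw.text((int(center_x - line_width / 2), y), line, fill=fill, font=font)
        y += line_height
    return y

img = Image.new("RGB", (1280, 720), (10, 10, 40))
draw = ImageDraw.Draw(img)
font = ImageFont.load_default()  # stand-in; use ImageFont.truetype(...) for real slides
end_y = draw_centered_lines(draw, ["line1", "line2"], 640, 100, font, (255, 255, 255))
```

A fixed `line_height` keeps the layout predictable across fonts; pick a value slightly larger than the font size.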

### Useful Drawing Primitives

| Shape | Code |
|-------|------|
| Rectangle | `draw.rectangle([x1,y1,x2,y2], fill=(R,G,B), outline=(R,G,B), width=N)` |
| Ellipse/Circle | `draw.ellipse([x1,y1,x2,y2], fill=(R,G,B), outline=(R,G,B), width=N)` |
| Line | `draw.line([x1,y1,x2,y2], fill=(R,G,B), width=N)` |
| Arc | `draw.arc([x1,y1,x2,y2], start, end, fill=(R,G,B), width=N)` |
| Text | `draw.text((x,y), "text", fill=(R,G,B), font=font)` |

**Color scheme for tech infographics:** bg=(10,10,40), cyan=(0,200,255), red=(200,50,50), green=(100,255,100), gold=(255,200,50), white=(255,255,255), light=(200,200,255).
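Putting the primitives and the color scheme together, one slide could be rendered roughly as follows. This is a sketch, not a fixed layout: `PALETTE`, `load_font`, `render_slide`, the pixel offsets, and the `slide_00.png` file name are all illustrative assumptions.

```python
from PIL import Image, ImageDraw, ImageFont

PALETTE = {
    "bg": (10, 10, 40), "cyan": (0, 200, 255), "red": (200, 50, 50),
    "green": (100, 255, 100), "gold": (255, 200, 50),
    "white": (255, 255, 255), "light": (200, 200, 255),
}
FONT_DIR = "/usr/share/fonts/truetype/dejavu/"

def load_font(name, size):
    """Load a DejaVu font, falling back to Pillow's built-in font if it is missing."""
    try:
        return ImageFont.truetype(FONT_DIR + name, size)
    except OSError:
        return ImageFont.load_default()

def render_slide(title, bullets, path, size=(1280, 720)):
    """Render a framed title-plus-bullets slide and save it to path."""
    img = Image.new("RGB", size, PALETTE["bg"])
    draw = ImageDraw.Draw(img)
    # Frame around the whole slide
    draw.rectangle([40, 40, size[0] - 40, size[1] - 40], outline=PALETTE["cyan"], width=3)
    # Title with an underline
    draw.text((80, 70), title, fill=PALETTE["gold"], font=load_font("DejaVuSans-Bold.ttf", 36))
    draw.line([80, 125, size[0] - 80, 125], fill=PALETTE["cyan"], width=2)
    # One dot marker and one single-line text call per bullet
    body_font = load_font("DejaVuSans.ttf", 24)
    for i, bullet in enumerate(bullets):
        y = 160 + i * 50
        draw.ellipse([80, y + 8, 92, y + 20], fill=PALETTE["green"])
        draw.text((110, y), bullet, fill=PALETTE["white"], font=body_font)
    img.save(path)
    return img

slide = render_slide("How TTS Works", ["Text in", "Audio out"], "slide_00.png")
```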

## Step 2: Offline TTS with espeak

When `edge-tts` is too slow (network-dependent), use `espeak` offline:

```bash
apt-get install -y espeak
espeak -vzh "你的文字" -w output.wav
ffmpeg -y -i output.wav -codec:a libmp3lame -b:a 128k output.mp3
```

**espeak limitations:** Chinese voice is mechanical but functional. Fast and fully offline.
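When the narration is generated from a script, the same two commands can be driven from Python. The sketch below passes arguments as lists, which avoids shell-quoting problems with narration text; the function names and the `stem` parameter are hypothetical.

```python
import shutil
import subprocess

def espeak_command(text, wav_path, voice="zh"):
    """Build the espeak argument list (same flags as the shell example)."""
    return ["espeak", f"-v{voice}", text, "-w", wav_path]

def wav_to_mp3_command(wav_path, mp3_path):
    """Build the ffmpeg WAV-to-MP3 argument list."""
    return ["ffmpeg", "-y", "-i", wav_path,
            "-codec:a", "libmp3lame", "-b:a", "128k", mp3_path]

def synthesize(text, stem, voice="zh"):
    """Run espeak, then ffmpeg. Returns the MP3 path; raises if a tool fails."""
    if shutil.which("espeak") is None or shutil.which("ffmpeg") is None:
        raise RuntimeError("espeak and ffmpeg must be installed and on PATH")
    subprocess.run(espeak_command(text, f"{stem}.wav", voice), check=True)
    subprocess.run(wav_to_mp3_command(f"{stem}.wav", f"{stem}.mp3"), check=True)
    return f"{stem}.mp3"

example_cmd = espeak_command("你的文字", "output.wav")
```

Calling `synthesize("你的文字", "audio_00")` would then produce `audio_00.wav` and `audio_00.mp3` in the working directory.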

**edge-tts** (better quality, network-dependent): takes ~10-15s per short segment. Generate segments one at a time, each in its own `terminal()` call with a 60s timeout.
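Outside an agent environment, the same one-segment-at-a-time pattern with a 60s timeout can be approximated with `subprocess`. This is a sketch under assumptions: `run_segment` and `edge_tts_command` are hypothetical helpers, the voice name `zh-CN-XiaoxiaoNeural` is only an example, and the `edge-tts` CLI must already be installed.

```python
import subprocess
import sys

def edge_tts_command(text, mp3_path, voice="zh-CN-XiaoxiaoNeural"):
    """Build an edge-tts CLI call for one segment (voice name is an example)."""
    return ["edge-tts", "--voice", voice, "--text", text, "--write-media", mp3_path]

def run_segment(cmd, timeout_s=60):
    """Run one command. Returns True on success, False on timeout,
    non-zero exit, or a missing executable."""
    try:
        subprocess.run(cmd, check=True, timeout=timeout_s)
        return True
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError, OSError):
        return False

# Harmless stand-in command to show the return value; a real call would be
# run_segment(edge_tts_command("你的文字", "audio_00.mp3"))
ok = run_segment([sys.executable, "-c", "pass"])
```

A `False` result is a natural point to fall back to espeak for that segment.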

## Step 3: Create Video from Image + Audio

```bash
ffmpeg -y -framerate 30 -loop 1 -i image.png -i audio.mp3 \
  -c:v libx264 -preset fast -crf 23 \
  -c:a aac -b:a 128k -shortest -pix_fmt yuv420p \
  segment.mp4
```

**Key flags:** `-framerate 30` and `-loop 1` are input options, so both must come before `-i image.png`. `-shortest` stops encoding when the audio ends (the looped image would otherwise never end). `-pix_fmt yuv420p` maximizes player compatibility.
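For several slides, the command can be built once per segment in Python. The sketch below mirrors the flags above; `segment_command` and the `slide_NN.png` / `audio_NN.mp3` naming are assumptions, while the `tech_vid_NN.mp4` outputs match the names used in the concat step.

```python
import subprocess

def segment_command(image, audio, output, fps=30):
    """Build the ffmpeg call that turns one still image plus audio into a clip."""
    return [
        "ffmpeg", "-y",
        "-framerate", str(fps), "-loop", "1", "-i", image,  # input options precede -i
        "-i", audio,
        "-c:v", "libx264", "-preset", "fast", "-crf", "23",
        "-c:a", "aac", "-b:a", "128k",
        "-shortest", "-pix_fmt", "yuv420p",
        output,
    ]

def render_segments(count):
    """Run ffmpeg once per segment; raises CalledProcessError on failure."""
    for i in range(count):
        cmd = segment_command(f"slide_{i:02d}.png", f"audio_{i:02d}.mp3",
                              f"tech_vid_{i:02d}.mp4")
        subprocess.run(cmd, check=True)

cmd = segment_command("slide_00.png", "audio_00.mp3", "tech_vid_00.mp4")
```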

## Step 4: Concat All Segments

```bash
rm -f list.txt  # start clean so re-runs do not append duplicate entries
for i in 00 01 02 03 04 05; do echo "file 'tech_vid_$i.mp4'" >> list.txt; done
ffmpeg -y -f concat -safe 0 -i list.txt \
  -c:v libx264 -preset fast -crf 22 \
  -c:a aac -b:a 128k -movflags +faststart \
  final.mp4
cp final.mp4 /tmp/tech_video.mp4  # assumes an HTTP server already serves /tmp on :8080
```
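The list file can also be written from Python, which overwrites it on every run instead of appending. A minimal sketch, assuming segment names contain no single quotes; `write_concat_list` is a hypothetical helper.

```python
from pathlib import Path

def write_concat_list(segments, list_path="list.txt"):
    """Write an ffmpeg concat-demuxer list, replacing any existing file."""
    lines = [f"file '{name}'" for name in segments]
    Path(list_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines

segments = [f"tech_vid_{i:02d}.mp4" for i in range(6)]
entries = write_concat_list(segments)
```

If nothing is serving `/tmp` yet, `python3 -m http.server 8080 --directory /tmp` is one simple option.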

## When to Use This Workflow

- **No GPU** on server (most VPS have no GPU)
- **Free AI image APIs are rate-limited** (common from shared hosting IPs)
- **Subject is abstract/scientific**: PIL infographics explain concepts better than generic AI art
- **Need quick turnaround**: PIL + espeak runs entirely locally, so there is no waiting on API rate limits (roughly 10x faster in practice)

## Known Free Image Generation API Status

| API | Status | Issue |
|-----|--------|-------|
| Pollinations.ai | ✅ Works | Rate-limited: 1 concurrent req/IP |
| HuggingFace Inference | ❌ | DNS: api-inference.huggingface.co may not resolve |
| Replicate | ❌ | Requires API key (401) |
| Clipdrop | ❌ | Requires API key |
| Craiyon | ❌ | API changed, returns HTML |