Generative Image Pipeline
Teaching myself AI image generation by shipping a book-writing agent
Role
Self-directed experiment
Tech Stack
I wanted to learn AI image generation properly — not toy prompts in a playground, but something with a hard, unforgiving output: a physical book someone would actually pay for. So I set the constraint and built an agent to hit it. Give it a title, get back a print-ready picture book. No human in the loop.
The constraint
A real book is a brutal test for image models. Thirty pages that have to feel like one book, not thirty unrelated images. Text and art that line up. Resolution and bleed that survive a printer. If any stage is sloppy, you can see it — in your hands.
So the bar was simple: type a title, get a finished, print-ready PDF — cover and all — good enough to send straight to a print-on-demand service.
How it worked
The agent ran the whole pipeline:
- Plan the book. From just a title, generate the structure — what goes on every page.
- Meta-prompt the art. For each page, the agent writes the image prompt — AI prompting AI. Getting this layer right mattered more than any single image model.
- Generate in parallel. Pages render concurrently through the image model, so a full book takes minutes, not hours.
- QA and retry. Each page is checked automatically and regenerated when it misses the bar.
- Assemble for print. Pages and cover are composed into a print-ready PDF — correct trim, bleed, and resolution.
A full thirty-page book, start to finish, in about ten minutes.
What I learned
Most of the difficulty wasn't the image model — it was everything around it. Consistency across pages is the real problem; one great image is easy, thirty that belong together is not. Meta-prompting — having the model write its own image prompts — was the highest-leverage layer; small changes there moved quality more than swapping models did. And print is unforgiving: DPI, bleed, and trim turn "looks fine on screen" into "wrong in your hands." The automated QA-and-retry loop is what made the output usable instead of a pile of near-misses.
What transfers
The goal was never to become a publisher. I pushed a handful of titles through print-on-demand purely to close the loop — proof the pipeline produced real, sellable artifacts, each one reviewed by hand before it shipped, not just nice-looking screen demos.
What transfers is the part that isn't the model at all: wrapping a generative-AI capability in the planning, meta-prompting, QA, and assembly that turn a clever demo into something that comes out right every time.
Want to ship something like this?
Book a 30-minute consult. No pitch - just a fit conversation.