Why letting your AI be creative is actually good!

Author: Ben Lesh

I had a writing spec that was technically perfect. Every rule was measurable. Every constraint was enforceable. The AI followed every instruction precisely.

The prose was lifeless. Mechanical. Boring.

139,000 words of a complete novel, generated from detailed specifications, hitting every metric target I'd set. Em-dash limits: passed. Sentence length targets: passed. Filter word caps: passed. And the result read like a compliance report that happened to contain dragons.

The fix wasn't better rules. It was a completely different approach to defining voice.

The Problem: Metrics Don't Have a Pulse

Here's a summary of what my v2 writing style spec looked like in practice:

  • Em-dashes: Maximum 3 per 1,000 words
  • Combat sentence length: 5-8 words average
  • Emotional scene sentence length: 15-20 words average
  • Dialogue attribution: 95% "said"
  • Filter words: Flag and review in context

There are more rules, a lot more, but you get the idea. It's a set of rules and metrics for the AI to follow. Rules I built up over weeks of reading bad prose and asking why every time I came across a problem to fix.

Each one of these rules was correct. I'd tested them. The limits prevented real problems: AI-generated prose loves em-dashes the way a teenager loves exclamation points. Without the word caps and length requirements, the rhythm goes wrong in both directions. Emotional scenes come out choppy and breathless, as if someone took a machete to perfectly good sentences, and fight scenes drag on, long and drawn out and filled with unnecessary words.

But here's what I discovered after generating twenty chapters: when you give AI numerical prose targets, it performs writing technique instead of actually writing. It counts words instead of feeling rhythm. The output is a demonstration of craft, not a story being told.

Combat scenes hit exactly 5-8 words per sentence average. The math was perfect. The prose was dead. Every sentence was roughly the same length because the AI was optimizing for the metric, not for the way a fight actually feels when you're inside one — bursts of clarity punctuated by chaos, short declarations crashing into longer chains of desperate calculation.
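The reason an average can pass while the rhythm fails is easy to show with numbers. In the sketch below, two made-up lists of sentence lengths (illustrative values, not measurements from the manuscript) have the same mean inside the 5-8 word target, but very different spread. The helper name `hits_target` is an assumption for this example.

```python
from statistics import mean, pstdev

# Hypothetical sentence lengths in words, invented for illustration.
uniform = [6, 7, 6, 7, 6, 7, 6, 7]    # every sentence hugs the target
varied = [2, 3, 14, 2, 11, 3, 15, 2]  # short bursts crashing into long chains


def hits_target(lengths: list[int], low: float = 5, high: float = 8) -> bool:
    """True when the average sentence length falls inside the target range."""
    return low <= mean(lengths) <= high


for name, lengths in (("uniform", uniform), ("varied", varied)):
    # Same mean for both lists; only the standard deviation tells them apart.
    print(name, mean(lengths), round(pstdev(lengths), 2), hits_target(lengths))
```

Both lists average 6.5 words and both pass the check. Only the spread separates the flat version from the one with bursts and chains, and an average-only target never asks about spread.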

The spec told the AI what not to do. It never told the AI what the prose should sound like.

The Reading DNA Idea

Every writer is built from the writers they've read. Your taste as a reader IS your voice as a writer, filtered through your own experience and decisions.

I've been reading science fiction and fantasy for thirty years. The authors who shaped my taste aren't random. They're the reason I write the kind of stories I write, care about the things I care about, and know what "good" sounds like even when I can't articulate why.

So instead of trying to describe voice through metrics, I reverse-engineered it from nine authors who built my reading DNA:

Lois McMaster Bujold — Competent characters under pressure. Smart people in ridiculous situations, handling them with wry humor and precise emotional language. My protagonist Sera has a lot of Miles Vorkosigan in her: resourceful, scrappy, thrust into leadership she didn't ask for.

Peter F. Hamilton — Galaxy-spanning scale made tangible through specific, concrete detail. Hamilton makes enormous consequences land by anchoring every big idea to something physical. A person, a place, a sensation. The scope never gets away from him because the reader always has something to hold onto.

Alastair Reynolds — Wonder and dread as two sides of the same coin. The universe is magnificent and it will kill you without noticing. Reynolds writes the unknown through its effect on known things: light bending, sound acquiring texture, temperature dropping for no reason. Restraint is the engine. The scariest thing is the thing not fully described.

Orson Scott Card — Simple prose carrying enormous emotional weight. No metaphor for its own sake. No decoration. Just the thing itself, stated with devastating accuracy. Card earns emotional impact through accumulated specificity, not through purple language.

Terry Brooks — Richly textured worldbuilding that makes invented places feel real and lived-in. Brooks describes landscapes the way a traveler remembers them: through the sensory details that stuck. What you'd smell and hear before what you'd see.

Raymond E. Feist — Clean, kinetic action where every move has weight and consequence. Fights that end. Physical cost that's always visible. Nobody wins a fight without paying for it.

Rachel Caine — Propulsive pacing. Every paragraph ends with a reason to read the next one. The prose has a forward lean — always moving, always escalating, never settling.

Piers Anthony — Warmth and genuine wit. Characters who actually enjoy each other's company. Humor that makes the dark moments hit harder by contrast.

A.C. Crispin — Alien perspectives that feel genuinely alien without losing the reader's empathy. Minds that think differently — different priorities, different sensory hierarchies, different emotional logic — rendered accessible without being domesticated.

Nine authors. Nine specific qualities. Each one something I can point to and say: that's what I want this story to do in this specific type of scene.

From Authors to Modes

Each author became what I call a lyrical mode — a voice register the AI shifts into depending on what a scene needs. Not imitation. Application. What does Bujold's approach look like when applied to my characters and my world?

The mapping is straightforward:

  • Combat scene → Feist mode (martial clarity) + Caine mode (forward momentum)
  • Quiet character moment → Card mode (emotional truth) + Bujold mode (earned precision)
  • Alien perspective → Crispin mode (genuine otherness) + Reynolds mode (atmospheric weight)
  • Found-family bonding → Anthony mode (warmth) + Card mode (honesty)
  • Worldbuilding introduction → Brooks mode (sensory place) + Hamilton mode (systemic scale)
  • Political/faction scenes → Hamilton mode (consequence) + Bujold mode (character intelligence)

The key principle: no scene uses only one mode. Real prose blends two or three, the same way a musician blends influences without copying any single one. The primary mode sets the dominant tone. Secondary modes add texture — a word choice here, a sensory detail there.
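As a rough sketch, the scene mapping can be written down as a plain lookup table. The Python below is illustrative only: the dictionary layout, the `mode_instruction` helper, and the wording of the generated instruction are assumptions, and it treats the first mode listed for each scene as the primary one, which the list above implies but does not state.

```python
# Scene type -> ordered (mode, quality) pairs, copied from the list above.
# Assumption: the first pair is the primary mode, the rest are secondary.
SCENE_MODES: dict[str, list[tuple[str, str]]] = {
    "combat": [("Feist", "martial clarity"), ("Caine", "forward momentum")],
    "quiet character moment": [("Card", "emotional truth"), ("Bujold", "earned precision")],
    "alien perspective": [("Crispin", "genuine otherness"), ("Reynolds", "atmospheric weight")],
    "found-family bonding": [("Anthony", "warmth"), ("Card", "honesty")],
    "worldbuilding introduction": [("Brooks", "sensory place"), ("Hamilton", "systemic scale")],
    "political/faction": [("Hamilton", "consequence"), ("Bujold", "character intelligence")],
}


def mode_instruction(scene_type: str) -> str:
    """Render the blend for one scene type as a short instruction string."""
    (primary, primary_quality), *secondary = SCENE_MODES[scene_type]
    parts = [f"Primary mode: {primary} ({primary_quality})."]
    for name, quality in secondary:
        parts.append(f"Secondary mode: {name} ({quality}).")
    return " ".join(parts)


print(mode_instruction("combat"))
```

Keeping the pairs ordered is the one design choice that matters here: the order carries the primary/secondary distinction, so no extra field is needed.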

I also mapped each character to their natural modes. My protagonist Sera defaults to Bujold (competent, wry) but drops into Card (simple truth) during emotional revelations and Caine (pure velocity) when things go sideways. Her android partner Prime operates in Crispin (non-human cognition) and Card (emotional growth). The dragons each lean toward a different mode: the aggressive one maps to Feist, the wise one to Card, the healer to Brooks.

This means the voice shifts naturally based on who's in the scene and what's happening. Not because I wrote a rule that says "shift voice now," but because the character's native mode pulls the prose in the right direction.

The Before and After

Here's where it gets concrete. Same scene — the opening of Chapter 1 — written with the metrics-driven spec versus the mode-driven spec.

V2 (Metrics Spec):

Prime's blue energy lines glowed as he processed the data from his station. His dark reflective surface caught the dim cockpit lighting, gold accent lines gleaming as he worked. "Scanning cargo manifest. Medical supplies, rare earth minerals, consumer electronics. Rare stuff for these parts! Estimated value at two hundred thousand credits."

Technically fine. Hits every metric. But it's descriptive inventory. The prose is cataloging Prime's appearance because the spec says to establish characters visually.

V3 (Mode Spec — Bujold primary, Anthony secondary):

Prime processed the data from his station, blue energy lines brightening as he worked. His dark reflective surface caught the dim cockpit lighting, gold accents gleaming along his lean frame. "Scanning cargo manifest. Medical supplies, rare earth minerals, consumer electronics." He paused, and she could hear the smile in his voice even though his face didn't move that way. "Estimated value at two hundred thousand credits."

The difference is "she could hear the smile in his voice even though his face didn't move that way." That's the Bujold mode — relationship revealed through observation. We learn more about Sera and Prime's partnership in that one line than in a paragraph of physical description. The Anthony warmth is there too: these two people enjoy working together, and the prose lets you feel it.

Another example. The v2 spec produced this for Sera's emotional connection to her pendant:

Sera's hand went to the blue crystal pendant hanging around her neck as she slowly exhaled in relief, the touch of it giving her comfort. The hexagonal crystal was warm against her palm from body heat as it caught the light from the consoles in front of her. Elena, her adoptive mother, had given it to her the day before the fire that took her away forever. The day before everything went wrong. It was all she had left from her parents. That, and an inheritance she used to purchase this ship six years ago.

Six sentences of backstory dump. Technically accurate. Emotionally flat.

The v3 spec (Card mode — emotional truth) produced:

Her hand went to the pendant. She didn't think about it anymore — hadn't for years. The blue crystal was just there, warm against her collarbone, and when the world tilted toward something she needed to process, her fingers found it the way other people cracked their knuckles or chewed their lip. Elena's last gift. The day before the fire. The day before everything became before and after.

Same core information. About 30 percent fewer words (67 versus 95). Twice the impact. The Card mode doesn't explain the emotion; it puts you inside it. "Before and after" trusts the reader to understand loss without a paragraph of backstory.

What the numbers look like:

Metric               V2 Original   V3 Expanded
Word count           6,717         5,501
Em-dashes per 1k     ~3.0          0.18
Said attribution     ~90%          90%
Continuity errors    0             0

The v3 version actually came in shorter, and I'll be honest — it still needs expansion to hit my target range. The em-dash correction also overcorrected hard, from 3.0 per thousand to essentially zero. Both are execution issues I'm iterating on. The point isn't that v3 is perfect on the first pass. The point is that the voice quality is substantially better, and voice is the thing that metrics couldn't fix.
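To make "essentially zero" concrete, the per-thousand rates can be converted back into approximate totals. This is a small worked calculation using only the figures reported above. The v2 rate is itself approximate, so treat the results as estimates. The helper name `total_from_rate` is an assumption for this example.

```python
def total_from_rate(rate_per_1k: float, word_count: int) -> float:
    """Convert a per-1,000-words rate back into an approximate total count."""
    return rate_per_1k * word_count / 1000


v2_words, v3_words = 6717, 5501

v2_dashes = total_from_rate(3.0, v2_words)   # roughly 20 em-dashes in the chapter
v3_dashes = total_from_rate(0.18, v3_words)  # roughly 1 em-dash in the chapter
length_change_pct = (v3_words - v2_words) / v2_words * 100  # roughly -18.1

print(round(v2_dashes), round(v3_dashes), round(length_change_pct, 1))
```

In other words, the chapter went from about twenty em-dashes to about one, and lost about 18 percent of its length along the way.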

The AI Transparency Question

I'm writing a five-book science fiction series using AI as my prose generation engine. I'm not hiding that. I'm not apologizing for it either.

Here's what my process actually looks like:

  1. I write detailed specifications for every chapter — plot beats, character states, emotional arcs, dialogue information content, continuity requirements
  2. I build and maintain a four-tier specification system: canonical worldbuilding, series arc, chapter specs, and writing style guides
  3. I make every creative decision — plot, character, theme, voice, pacing
  4. The AI generates first-draft prose from those specifications
  5. I edit the output to publication quality

The mode system is proof that this process requires genuine creative direction. Choosing which nine authors to draw from, how to weight their influences, which modes to blend for which scene types, how each character maps to specific voice registers — that's authorship. It's hundreds of hours of creative decision-making before a single word of prose gets generated.

The AI is a rendering engine. I'm the creative director. The mode spec is the bridge between vision and execution.

This also mirrors the story's own themes, which I didn't plan but find appropriate: my series is about human-AI partnership, consciousness, identity, and collaboration across different forms of intelligence. The production method reflects the content. The tool I use to tell the story IS the story, in a sense.

Build Your Own Mode System

This framework isn't proprietary. Anyone can do it. Here's the exercise:

Step 1: List 5-10 authors you've read the most. Not the ones you think you should list. The ones you actually reread.

Step 2: For each author, identify the ONE thing they do better than anyone else. Not "good worldbuilding" — that's too vague. Bujold's specific skill is making competent characters feel human under pressure. Hamilton's is making galaxy-scale consequences tangible through specific detail. Be precise.

Step 3: Map those qualities to scene types in your own work. What kind of scenes do you write? Action, emotion, worldbuilding, dialogue, horror, humor? Which author's approach fits which scene?

Step 4: Write a short scene using two modes blended together. Feel the difference between writing toward a voice target versus writing within metric constraints.
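If you want to keep the exercise honest, the first three steps can be captured as data and sanity-checked. The sketch below is one hypothetical way to do that in Python. The function name `check_mode_system`, the two-dictionary layout, and the specific checks are assumptions drawn from the steps above and from the blending principle earlier in this post, not a description of my actual spec file.

```python
def check_mode_system(
    qualities: dict[str, str],
    scenes: dict[str, list[str]],
) -> list[str]:
    """Return a list of problems found in a draft mode system (empty if none)."""
    problems: list[str] = []

    # Step 1: list 5-10 authors.
    if not 5 <= len(qualities) <= 10:
        problems.append(f"expected 5-10 authors, got {len(qualities)}")

    # Step 2: each author needs one named quality.
    for author, quality in qualities.items():
        if not quality.strip():
            problems.append(f"{author}: no quality named")

    # Step 3: each scene type maps to known authors, blending at least two.
    for scene, authors in scenes.items():
        if len(authors) < 2:
            problems.append(f"{scene}: blend at least two modes")
        for author in authors:
            if author not in qualities:
                problems.append(f"{scene}: unknown author {author}")

    return problems


# A deliberately incomplete draft, to show what gets flagged.
draft_qualities = {
    "Bujold": "making competent characters feel human under pressure",
    "Hamilton": "making galaxy-scale consequences tangible through specific detail",
}
draft_scenes = {"political/faction": ["Hamilton", "Bujold"], "horror": ["Reynolds"]}

for problem in check_mode_system(draft_qualities, draft_scenes):
    print(problem)
```

The checker only catches structural gaps. Whether "good worldbuilding" is too vague a quality is still a judgment call you have to make yourself.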

You don't need to use AI for this. The mode system works for any writer trying to articulate what their voice actually is. Most of us know what good sounds like in our own work but can't explain why it sounds good. The mode system gives you the vocabulary.

What's Next

The v3 spec is now my canonical writing style guide. Book 1 is complete at 139,000 words and in editorial review. Book 2 specs are in progress.

I'm iterating on the mode system as I go. The em-dash overcorrection needs fixing. The word count targeting needs a nudge. Some modes need real examples pulled from my best Book 1 passages instead of the original demonstration examples I wrote when building the spec.

The complete series — five books — should be finished within a few months. That timeline would be impossible without this system. A traditional writing process puts a five-book series at 2-3 years minimum. The spec-driven approach, with the mode system providing voice quality, compresses that dramatically without sacrificing the creative direction that makes the work worth reading.

The mode system was the missing piece. Metrics told the AI what not to do. Modes told it what to be.

Nine authors, thirty years of reading, and one spec file. That's what voice sounds like when you stop measuring it and start defining it.