Most developers believe clean code is about readability — making the next engineer's job easier. In the AI-assisted building era, I think that's the wrong frame. Clean code is about context window management, and if you're building solo with an LLM in the loop, every architectural choice you make is really a choice about how much of your own codebase the AI can hold in its head at once.

I've been running a strict 200-line-per-file rule across every solo app I build. It sounds like an arbitrary constraint. It isn't. It's the single highest-leverage rule I've adopted since I started shipping multiple apps in parallel, and it's the reason I can hand a broken file to an AI and get it fixed in one prompt instead of arguing with it through five rounds of hallucinations.

This is the piece nobody writing about "AI coding" is talking about, because most of them haven't actually tried to ship a dozen production apps with AI as the primary author.

The Assumption That Breaks Solo Building

Senior developers default to building robust, deeply integrated systems. The mental model is "platform" — shared libraries, massive utility files, tightly-coupled modules that all speak the same internal dialect. In traditional team environments this is correct. You optimize for ten engineers being able to work on the same codebase without stepping on each other.

When you're solo and the AI is your teammate, that model becomes a liability. Every line of shared code is a gravity well. When you ask an AI to modify a feature inside a 1,000-line file, it loses the thread of the original architecture almost instantly. It starts guessing. It overwrites edge cases you put there for a reason. You end up spending 20% of your time building and 80% of your time babysitting the AI to keep it from breaking what already works.

The assumption that "good architecture scales with team size" is wrong in the solo + AI context. Solo AI architecture scales down, not up. The smaller and more isolated your modules, the better the AI performs.

Why Large Files Break AI-Assisted Building

The failure mode is technical, and it's driven by a specific behavior of current LLMs: attention decay in long contexts. When a file is large, the AI's "attention" is diluted across the whole thing, and three predictable things happen.

The context anchor problem. The AI prioritizes the top and bottom of the file when generating changes. It reads the imports, it reads the exports, and it pattern-matches against both. Middle-of-file logic — where most of your real business rules live — is frequently skimmed or ignored. This isn't laziness. It's how the underlying attention mechanism weights tokens in long inputs.

The semantic drift. As the file grows, the AI starts losing track of which function does what. It will confidently reference a helper that no longer exists, or mistake two similarly-named functions for each other. The longer the file, the faster this happens. Around 300-400 lines I see a noticeable quality cliff. By 800 lines the AI is essentially guessing.

The glue failure. The more interconnected your modules are, the more likely the AI is to create a breaking change in a dependency you didn't ask it to touch. It sees that two functions share state and decides to "refactor" one without realizing the other depends on a specific edge case. You run the test suite and watch things explode in places you weren't working on.

None of these failure modes are visible from a code-quality review. The file looks fine. It compiles. The tests pass on the first try. But the next time you ask the AI to modify it, something subtle breaks — and now you're debugging a ghost.

The 200-Line Rule And What It Actually Enforces

The rule is simple: no source file exceeds 200 lines. If it hits 150, I start planning the split. If it crosses 200, I split before the next feature.

This looks arbitrary. It isn't. Roughly 200 lines is the threshold below which the AI can hold all of a file's logic clearly in a single prompt. Below that, quality stays high. Above it, quality starts decaying fast. The exact number depends on the model, but 200 is a conservative ceiling that works across Claude, GPT, Gemini, and local models.
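Enforcement is easy to automate, so the rule never depends on willpower. A minimal sketch in TypeScript, assuming a Node environment and .ts/.tsx sources; the thresholds mirror the 150/200 split described above:

```typescript
import { readdirSync, readFileSync, statSync } from "node:fs";
import { join, extname } from "node:path";

const WARN = 150;  // soft cap: start planning the split
const LIMIT = 200; // hard cap: split before the next feature

// Recursively collect source files under a root directory.
function sourceFiles(root: string, exts = [".ts", ".tsx"]): string[] {
  return readdirSync(root).flatMap((name) => {
    const path = join(root, name);
    if (statSync(path).isDirectory()) return sourceFiles(path, exts);
    return exts.includes(extname(path)) ? [path] : [];
  });
}

// Classify a file's line count against the two thresholds.
export function classify(lineCount: number): "ok" | "warn" | "over" {
  if (lineCount > LIMIT) return "over";
  if (lineCount > WARN) return "warn";
  return "ok";
}

// Print every file that needs attention; wire this into CI or a pre-commit hook.
export function check(root: string): void {
  for (const file of sourceFiles(root)) {
    const lines = readFileSync(file, "utf8").split("\n").length;
    const status = classify(lines);
    if (status !== "ok") console.log(`${status.toUpperCase()}  ${lines}  ${file}`);
  }
}
```

Run it against your source root before each commit; "WARN" means plan the split, "OVER" means the split comes before the next feature.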

When the rule is enforced strictly, several things happen automatically.

Files become single-purpose. You can't fit two responsibilities into 200 lines without the file feeling crowded. The constraint pushes you toward modules that each do exactly one thing. This is "good architecture" as a side effect of the line limit, not as a goal.

Dependencies become explicit. When every file is small, you can't hide cross-module coupling behind a big utility grab-bag. You're forced to import specific functions from specific places, which makes the data flow visible when the AI (or you) reads the code later.

Replacement becomes cheap. If a 120-line module is broken, I don't argue with the AI over a partial fix. I delete the file and ask the AI to rewrite it from scratch in one prompt. With a 200-line cap, this is a 30-second move. With a 1,000-line file, a full rewrite is a catastrophic loss of edge cases.

The replacement move is the real unlock. Once you accept that entire files can be disposable, your development loop changes. You stop fighting the AI. You start treating each module as something you'll probably throw away and regenerate several times during the project's lifetime.

Type Safety As A Physical Barrier

The 200-line rule handles context decay. The other half of solo-AI building is preventing the AI from hallucinating data shapes across module boundaries.

I use strict schema validation (Zod or the equivalent) at every entry and exit point of every module. If module A sends data to module B, the schema at the boundary is so rigid that the AI physically cannot invent a new field or format without the type checker screaming.

This sounds like overkill for solo building. It's the opposite. In team development, schemas are a communication tool — they tell other engineers what to expect. In solo-AI development, schemas are a restraint system — they tell the AI what it is not allowed to invent. Without them, the AI will fabricate plausible-looking data shapes that break your integrations in subtle ways you won't catch until production.

Every boundary gets a schema. Every function that crosses a module boundary validates its inputs and outputs. The type checker becomes your co-author, and its job is to veto every AI hallucination before it ships.
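The same restraint works even without a library. The sketch below uses a hand-rolled guard; `PhotoRecord` and `parsePhotoRecord` are hypothetical names for illustration, and in practice a Zod schema plays the same role with less code:

```typescript
// Hypothetical boundary type between a capture module and a sync module.
export interface PhotoRecord {
  id: string;
  takenAt: string; // ISO-8601 timestamp
  lat: number;
  lng: number;
}

// Validate unknown data at the module boundary. Anything the AI
// invents — an extra field, a renamed key, a wrong type — fails
// here at the seam, not three modules later in production.
export function parsePhotoRecord(input: unknown): PhotoRecord {
  if (typeof input !== "object" || input === null) {
    throw new TypeError("PhotoRecord must be an object");
  }
  const o = input as Record<string, unknown>;
  if (
    typeof o.id !== "string" ||
    typeof o.takenAt !== "string" || Number.isNaN(Date.parse(o.takenAt)) ||
    typeof o.lat !== "number" ||
    typeof o.lng !== "number"
  ) {
    throw new TypeError("PhotoRecord failed boundary validation");
  }
  // Return only the declared fields, silently dropping anything extra.
  return { id: o.id, takenAt: o.takenAt, lat: o.lat, lng: o.lng };
}
```

Note the last line: the parser rebuilds the object from the declared fields, so hallucinated extras never propagate past the boundary.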

The "Bring Your Own Engine" Philosophy

I avoid platform-locked dependencies whenever I can. Not because I'm ideological about open source, but because solo building benefits enormously from being able to lift and shift the core logic to a different environment in minutes.

This has two practical consequences. First, I can switch LLMs without rewriting anything. If GPT-5 starts hallucinating more than Claude on a specific kind of task, I can swap the API call and keep moving. Second, if a platform changes its pricing or its terms of service, I don't care. The core logic lives in plain files on my machine, not inside a vendor's environment.

The tradeoff is that I give up some of the "magic" platform-specific features — the cloud-based IDEs, the automatic deployment, the one-click integrations. For solo builders shipping specialized utilities, those features rarely matter. For big teams shipping a single product, they might. Know which one you are before you pick.

A Realistic Example

Imagine building a tool for field contractors that captures photo evidence, GPS-tags it, and syncs to a database.

The wrong way: one massive App.tsx file handling camera access, photo uploads, GPS coordinates, database syncing, auth, and the UI all at once. It's 800 lines by the end of week one. The AI can't reliably modify any piece of it without breaking another piece. Every feature takes three rounds of prompt-fix-prompt to land correctly.

The right way: four distinct files of about 50 lines each. Capture.ts does nothing but handle the camera. Location.ts only talks to the GPS. Sync.ts handles the API. A tiny App.tsx glues them together through a strict interface. When the GPS logic breaks, you rewrite Location.ts from scratch in a single AI prompt without touching the camera logic at all.

Same feature set. Same eventual codebase size. Radically different maintenance cost. The difference isn't elegance — it's containment. When a bug shows up, you know exactly which file is responsible because the AI can't blur the boundaries between small files the way it can between big ones.

Failure Modes Worth Watching For

  • The "helper" bloat. Creating a utils.ts file that becomes a dumping ground. Once it exists, the AI starts pulling its irrelevant helpers into every specialized component, slowing everything down and increasing token costs. Solution: split utils into single-purpose files the moment a second unrelated helper lands in the same file.
  • The refactor trap. Asking the AI to "clean up" a large file. The AI will often simplify by removing edge cases that were actually critical business logic. Solution: don't refactor large files. Delete them and regenerate from scratch against a clear spec.
  • Premature DRY abstraction. Creating shared abstractions too early to avoid code duplication. You sacrifice the ability to delete or change one module without breaking three others. In solo-speed building, duplication is better than bad abstraction. You can always extract a shared helper later once the pattern is obvious.
  • Ignoring the 150-line warning. If you wait until a file hits 200 to start planning the split, you're already behind. Split when you hit 150, before the growth gets painful.

Decision Rules For Solo AI Building

If a file hits 150 lines, stop building and plan the split. Don't wait for 200. The decay starts before the hard limit.

If the AI fails to fix a bug in two prompts, delete the file and ask the AI to recreate the specific logic in a new, smaller file. Never argue with the AI over a large, broken file. The replacement is almost always cheaper than the negotiation.

If you need "global state," use a physical database or a simple local file instead of a complex state management library. Keep the modules dumb and the data smart. Global state libraries tend to create implicit coupling that the AI can't reason about.
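A sketch of the "dumb modules, smart data" pattern, assuming a Node environment; the file name and state shape are illustrative:

```typescript
import { existsSync, readFileSync, writeFileSync } from "node:fs";

// Global state is a plain JSON file, not a state management library.
// Illustrative default path; pass your own per app.
const STATE_FILE = "app-state.json";

// Every module reads state through this one function.
export function loadState<T>(fallback: T, file = STATE_FILE): T {
  if (!existsSync(file)) return fallback;
  return JSON.parse(readFileSync(file, "utf8")) as T;
}

// ...and writes state through this one. The entire data flow is two
// functions, which both you and the AI can hold in view at once.
export function saveState<T>(state: T, file = STATE_FILE): void {
  writeFileSync(file, JSON.stringify(state, null, 2));
}
```

There's no hidden store, no subscriptions, no implicit coupling: if a module touches shared state, the import line says so.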

If the file is about to exceed 200 lines, the split is mandatory — not optional. The rule doesn't bend. If you bend it once, you'll bend it every time, and the constraint stops working.

Do not apply these rules if you're building a high-frequency trading system, a core banking kernel, or any system where millisecond latency and deep memory sharing are the primary requirements. For those, team-scale architecture patterns still win. Solo-AI rules are for specialized utilities, internal tools, and product experiments — the kind of thing one person ships in a month.

The Tradeoff You're Actually Making

You gain velocity, clean mental models, and the ability to hand any file to an AI and get useful work back on the first prompt. You sacrifice the elegant deep-integration that makes senior engineers nostalgic for the monolithic era. For solo builders who need to ship and iterate, this is the correct trade. For team leads running a product with five engineers, it's probably not.

There's a subtler tradeoff too: small files mean more files, which means more imports, which means more places to forget a dependency. You trade one kind of cognitive load (holding a big file in your head) for another (holding a lot of small files in your head). The second kind is much easier for both you and the AI, but it's not zero cost. A good editor with fast symbol navigation matters a lot in this world.

Hidden Assumptions

This whole approach depends on a few things being true. The AI has to be good enough at code generation to meaningfully produce 200-line modules on demand — it is, for current-generation models. The domain has to be amenable to decomposition — not everything is, but most product work is. And you have to be disciplined enough to actually enforce the rule — which is the part most developers fail at.

The rule also assumes you care more about shipping velocity than architectural elegance. If your goal is to build a beautiful codebase that impresses other engineers, ignore everything in this piece. If your goal is to ship a working product this month and another next month, the 200-line rule is the most useful constraint I know.

The Real Takeaway

Solo-scale software isn't about elegant architecture. It's about containment. By limiting the physical size of your files, you keep the AI operating as an efficient builder instead of a confused editor. High-velocity execution requires the discipline to keep things small enough to be replaceable — because "replaceable" is the only state in which AI-assisted code stays healthy over time.

The developers I know who are shipping the most AI-assisted apps in 2026 have all independently converged on some version of this rule. The exact line count varies — some use 150, some use 250 — but the principle is identical. Make files small enough that the AI can see all of them at once, and the AI becomes a competent co-author. Let files grow, and the AI becomes an unreliable intern who keeps breaking things.

FAQ

Why 200 lines specifically? It's empirical, not theoretical. Across every current-generation LLM I've tested, quality stays high up to roughly 200 lines and starts decaying noticeably past 300. 200 gives you a buffer. You could argue for 180 or 250 — the exact number isn't magic. What matters is that you pick one and enforce it.

What about tests? Test files can usually run longer (400-500 lines) because tests are more isolated by nature and the AI handles them better. But I still prefer splitting test files to match their source files one-to-one. It makes the relationship between source and test obvious.

Doesn't this create an explosion of tiny files? Yes. That's the point. A codebase of 40 small files is much easier for both you and an AI to reason about than a codebase of 10 huge files. File count is cheap. File complexity is expensive.

Atticus Li

Leads applied experimentation at NRG Energy. $30M+ in verified revenue impact through behavioral economics and CRO.