How I Built a Content Publishing Pipeline That Runs While I Sleep

Atticus Li

The technical architecture behind an automated content pipeline: from data source to published article, with quality gates that catch bad content.

Atticus Li April 7, 2026 6 min read

The Problem With Manual Publishing

When I started my blog, every article required the same tedious process: write in a text editor, format it for the CMS, add meta descriptions and tags, upload images, preview, and publish. Each article took thirty minutes of pure administrative work on top of the writing time.

At one article per week, that was manageable. At multiple articles per week, it was unsustainable. I needed a system that could take a finished article and publish it without my involvement, while still maintaining quality standards.

Here is how I built it.

The Architecture

The pipeline has four stages:

Content Input — Articles enter the system from multiple sources
Enhancement — AI improves and standardizes the content
Quality Gate — Automated scoring determines if the article is publish-ready
Publication — Approved articles are converted and published to the CMS

Each stage is independent and can be modified without affecting the others. This modularity was a deliberate design choice — it means I can swap out any component as better tools become available.
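One way to get this kind of modularity is to treat each stage as a function that takes an article and returns an article. The sketch below is illustrative only: the stage functions, the dict-based article, and the word-count check are hypothetical stand-ins, not the pipeline's actual code.

```python
from typing import Callable

# An article is modeled as a plain dict here; the post does not show the real type.
Article = dict
Stage = Callable[[Article], Article]


def run_pipeline(article: Article, stages: list[Stage]) -> Article:
    """Pass the article through each stage in order."""
    for stage in stages:
        article = stage(article)
    return article


# Hypothetical stand-in stages. Any one can be replaced without touching the others.
def enhance(article: Article) -> Article:
    return {**article, "title": article["title"].strip()}


def quality_gate(article: Article) -> Article:
    # Placeholder check: approve anything with at least three words.
    return {**article, "approved": len(article["body"].split()) >= 3}


def publish(article: Article) -> Article:
    return {**article, "published": article.get("approved", False)}


result = run_pipeline(
    {"title": "  Hello  ", "body": "one two three four"},
    [enhance, quality_gate, publish],
)
```

Because every stage shares the same signature, swapping a component means changing one entry in the list passed to `run_pipeline`.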

Stage 1: Content Input

Articles enter the pipeline through several channels:

Direct API — I send a JSON payload with title, markdown body, and metadata
Webhook — External tools push content that gets normalized into the standard format
Batch import — For bulk publishing, I send an array of articles

Every input channel normalizes to the same format: title, markdown body, description, tags, category, and intent. This normalization is crucial because it means the downstream stages do not need to know where the content came from.
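As a rough sketch, the shared format could be a small dataclass with one normalizer per channel. The six fields come from the list above. The payload keys (`title`, `body`, `metadata`) and the function names are assumptions, since the post does not show the actual JSON shape.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    """The standard format every input channel normalizes to."""
    title: str
    body_markdown: str
    description: str = ""
    tags: list[str] = field(default_factory=list)
    category: str = ""
    intent: str = ""


def from_api(payload: dict) -> Article:
    """Normalize a direct API payload (assumed keys: title, body, metadata)."""
    meta = payload.get("metadata", {})
    return Article(
        title=payload["title"].strip(),
        body_markdown=payload["body"],
        description=meta.get("description", ""),
        # Tags are trimmed and lowercased so downstream stages see one spelling.
        tags=[t.strip().lower() for t in meta.get("tags", [])],
        category=meta.get("category", ""),
        intent=meta.get("intent", ""),
    )


def from_batch(payloads: list[dict]) -> list[Article]:
    """Batch import is the same normalization applied to an array of payloads."""
    return [from_api(p) for p in payloads]
```

A webhook normalizer would follow the same pattern: map whatever the external tool sends onto `Article` and hand that to the next stage.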

Stage 2: Enhancement

Raw content goes through an AI enhancement pass that:

Standardizes voice and tone to match my brand guidelines
Adds proper heading hierarchy
Ensures SEO elements are in place (keyword in first paragraph, proper heading structure)
Generates a meta description if one was not provided
Suggests tags based on content analysis

The enhancement is not about rewriting. It is about polish. The ideas and structure come from the input. The enhancement handles the mechanical optimization that every article needs but is tedious to do manually.

Stage 3: Quality Gate

This is the most important stage. Not every article should be published, and the quality gate makes that decision automatically.

The scoring system evaluates:

Depth — Is the content substantive enough? Short, thin articles get rejected.
Originality — Does this add something new, or does it just restate common knowledge?
Structure — Are headings logical? Is there a clear flow?
SEO readiness — Are the technical elements in place?

Articles that score above the threshold get published automatically. Articles that fall somewhat below it get flagged for manual review. Articles that fall far below it get rejected.
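The three outcomes can be sketched as a pair of cutoffs applied to a combined score. The threshold values, the unweighted average, and the 0-to-1 scale below are all assumptions for illustration. The post does not state the real numbers or how the four criteria are combined.

```python
from enum import Enum


class Decision(Enum):
    PUBLISH = "publish"
    REVIEW = "review"
    REJECT = "reject"


# Hypothetical cutoffs on a 0-to-1 scale; the actual values are not given in the post.
PUBLISH_THRESHOLD = 0.8
REJECT_THRESHOLD = 0.4

CRITERIA = ("depth", "originality", "structure", "seo")


def overall_score(scores: dict[str, float]) -> float:
    """Unweighted mean of the four criterion scores (an assumed combination rule)."""
    return sum(scores[c] for c in CRITERIA) / len(CRITERIA)


def decide(scores: dict[str, float]) -> Decision:
    """Publish at or above the threshold, reject far below it, review in between."""
    total = overall_score(scores)
    if total >= PUBLISH_THRESHOLD:
        return Decision.PUBLISH
    if total < REJECT_THRESHOLD:
        return Decision.REJECT
    return Decision.REVIEW
```

Keeping the cutoffs as named constants makes the "deliberately high" bar a single value to tune rather than logic scattered through the code.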

The threshold is deliberately high. I would rather publish fewer articles than publish bad ones. The quality gate flags about one in five articles as needing additional work.

Stage 4: Publication

Approved articles are converted from markdown to the CMS format (Portable Text for Sanity) and published with all metadata:

Slug is auto-generated from the title
Publication date is set
Category and tags are assigned
The article goes live and appears in the sitemap

The conversion handles all the formatting details: headings, bold text, links, lists, blockquotes, and inline code. It also validates that the converted output renders correctly.

The Technical Stack

The pipeline runs on:

API endpoints — Serverless functions that handle input and orchestration
AI API — For content enhancement and quality scoring
Sanity CMS — For storage and rendering
Markdown parser — Custom converter from markdown to Portable Text

Total infrastructure cost is minimal because everything is serverless and scales to zero when not in use.

What I Learned Building This

Quality Gates Are Non-Negotiable

The temptation with automation is to remove friction. But some friction is valuable. The quality gate adds friction on purpose: it slows down publication to protect quality. Without it, the pipeline would publish everything, and the blog's quality would degrade quickly.

Markdown Is the Universal Format

Every stage of the pipeline uses markdown as the interchange format. This was the best design decision I made. Markdown is human-readable, easy to generate, and convertible to any output format. If I ever switch CMS platforms, only the final conversion stage needs to change.

Monitoring Matters More Than You Think

When publishing is automated, you need visibility into what is being published. I built monitoring that alerts me when:

An article fails the quality gate (so I can review it)
The publication rate changes significantly (could indicate a source problem)
An article is published with missing metadata (potential bug)

Without monitoring, automated publishing is a liability because problems compound silently.
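Two of these checks are simple enough to sketch. The required-field list, the function names, and the "twice the recent average" rule are assumptions chosen for illustration. The post does not say how "significantly" is defined or where alerts are sent.

```python
# Hypothetical set of fields that must be non-empty on a published article.
REQUIRED_FIELDS = ("title", "description", "tags", "category")


def missing_metadata(article: dict) -> list[str]:
    """Return the required fields that are absent or empty."""
    return [f for f in REQUIRED_FIELDS if not article.get(f)]


def rate_changed(today: int, recent_daily_counts: list[int], factor: float = 2.0) -> bool:
    """True when today's count is more than `factor` times above or below the recent average."""
    if not recent_daily_counts:
        return False  # No history yet, so there is nothing to compare against.
    baseline = sum(recent_daily_counts) / len(recent_daily_counts)
    if baseline == 0:
        return today > 0
    return today > baseline * factor or today < baseline / factor


def collect_alerts(article: dict, today: int, recent_daily_counts: list[int]) -> list[str]:
    """Gather human-readable alert messages; delivery is left to the caller."""
    alerts = []
    missing = missing_metadata(article)
    if missing:
        alerts.append(f"Published with missing metadata: {', '.join(missing)}")
    if rate_changed(today, recent_daily_counts):
        alerts.append(f"Publication rate changed: {today} today")
    return alerts
```

The checks return data rather than sending notifications themselves, so the same logic can feed email, chat, or a dashboard.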

Edge Cases Are the Hard Part

The happy path — a well-formatted article entering the pipeline and publishing successfully — was easy to build. The edge cases consumed most of the development time:

Articles with malformed markdown
Duplicate slugs that would overwrite existing articles
Images that fail to upload
Content that passes quality scoring but contains factual errors

Each edge case required a specific handler. The pipeline is more error handling than publishing logic.
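The duplicate-slug case is a good example of such a handler. A minimal sketch, assuming the set of existing slugs can be looked up before publishing (the function names and the numeric-suffix scheme are illustrative, not taken from the pipeline):

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Lowercase, strip accents, and join alphanumeric runs with hyphens."""
    ascii_title = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")


def unique_slug(title: str, existing: set[str]) -> str:
    """Append -2, -3, ... until the slug does not collide with an existing one."""
    base = slugify(title) or "untitled"
    slug, n = base, 2
    while slug in existing:
        slug = f"{base}-{n}"
        n += 1
    return slug
```

For example, `unique_slug("Hello World", {"hello-world"})` yields `hello-world-2` instead of overwriting the existing article.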

The Results

Since building this pipeline:

Publishing time per article dropped from thirty minutes to under one minute of my time
Content output increased significantly without increasing my work hours
Quality remained consistent because the quality gate enforces standards
I can publish from anywhere — my phone, a script, a scheduled task

How to Build Your Own

You do not need to replicate this exact system. The principles transfer:

Standardize your input format. Pick one format (markdown is ideal) and make everything convert to it.
Build a quality gate. Automated publishing without quality control is a recipe for disaster.
Make it modular. Each stage should be replaceable without affecting the others.
Monitor everything. Automated systems need more monitoring, not less.
Start simple. Begin with a direct API that publishes markdown to your CMS. Add enhancement and quality scoring later.

FAQ

How long did it take to build this pipeline?

The basic version (API to CMS) took a day. The enhancement and quality gate took another week of iteration. The monitoring and edge case handling took another week. Total: a little over two weeks of part-time work.

Does the quality gate ever make mistakes?

Yes. It occasionally passes content that needs editing, and occasionally flags content that is fine. The threshold is tuned to err on the side of caution — better to review an article unnecessarily than to publish a bad one.

Can I use this with WordPress instead of Sanity?

Yes. The pipeline is CMS-agnostic. Only the final publication stage needs to change. WordPress has REST and XML-RPC APIs that serve the same function as Sanity's API.

How do you handle images?

Images are optional in my pipeline. When an image URL is provided, the pipeline downloads it, uploads it to the CMS asset storage, and attaches it to the article. Most of my articles publish without images initially and get them added later.

Atticus Li

Experimentation and growth leader. Builds AI-powered tools, runs conversion programs, and writes about economics, behavioral science, and shipping faster.
