The Problem With Manual Publishing
When I started my blog, every article required the same tedious process: write in a text editor, format it for the CMS, add meta descriptions and tags, upload images, preview, and publish. Each article took thirty minutes of pure administrative work on top of the writing time.
At one article per week, that was manageable. At multiple articles per week, it was unsustainable. I needed a system that could take a finished article and publish it without my involvement, while still maintaining quality standards.
Here is how I built it.
The Architecture
The pipeline has four stages:
- Content Input — Articles enter the system from multiple sources
- Enhancement — AI improves and standardizes the content
- Quality Gate — Automated scoring determines if the article is publish-ready
- Publication — Approved articles are converted and published to the CMS
Each stage is independent and can be modified without affecting the others. This modularity was a deliberate design choice — it means I can swap out any component as better tools become available.
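To make the modularity concrete, here is a minimal sketch of four stages chained as plain functions. The stage bodies are placeholders and every name (`run_pipeline`, `normalize`, `enhance`, `quality_gate`, `publish`) is hypothetical rather than taken from the real system; the point is only that each stage shares one signature and can be swapped without touching the others.

```python
from typing import Callable

# Assumed contract: every stage takes an article dict and returns a new one.
Stage = Callable[[dict], dict]


def run_pipeline(article: dict, stages: list[Stage]) -> dict:
    """Pass the article through each stage in order."""
    for stage in stages:
        article = stage(article)
    return article


# Placeholder stages. Real ones would call an AI API or the CMS.
def normalize(article: dict) -> dict:
    return {**article, "title": article["title"].strip()}


def enhance(article: dict) -> dict:
    return {**article, "enhanced": True}


def quality_gate(article: dict) -> dict:
    # Toy rule standing in for real scoring: approve bodies over 20 characters.
    return {**article, "approved": len(article["body"]) > 20}


def publish(article: dict) -> dict:
    return {**article, "published": article["approved"]}


result = run_pipeline(
    {"title": "  Hello  ", "body": "A body long enough to pass the toy gate."},
    [normalize, enhance, quality_gate, publish],
)
```

Replacing a component then means replacing one entry in the list passed to `run_pipeline`.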
Stage 1: Content Input
Articles enter the pipeline through several channels:
- Direct API — I send a JSON payload with title, markdown body, and metadata
- Webhook — External tools push content that gets normalized into the standard format
- Batch import — For bulk publishing, I send an array of articles
Every input channel normalizes to the same format: title, markdown body, description, tags, category, and intent. This normalization is crucial because it means the downstream stages do not need to know where the content came from.
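A sketch of what that normalization might look like. The six fields come from the list above; the payload keys (`body`, `headline`, `content`, `labels`) and the function names are assumptions made for illustration, since the real channel formats are not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    """The standard format every input channel normalizes to."""
    title: str
    body_markdown: str
    description: str = ""
    tags: list[str] = field(default_factory=list)
    category: str = ""
    intent: str = ""


def normalize_api_payload(payload: dict) -> Article:
    # Direct API: assumed to already use names close to the standard format.
    return Article(
        title=payload["title"].strip(),
        body_markdown=payload["body"],
        description=payload.get("description", ""),
        tags=[t.strip().lower() for t in payload.get("tags", []) if t.strip()],
        category=payload.get("category", ""),
        intent=payload.get("intent", ""),
    )


def normalize_webhook_event(event: dict) -> Article:
    # Hypothetical external tool that sends "headline"/"content" and
    # comma-separated labels instead of a tag list.
    labels = event.get("labels", "")
    return Article(
        title=event["headline"].strip(),
        body_markdown=event["content"],
        tags=[t.strip().lower() for t in labels.split(",") if t.strip()],
    )


def normalize_batch(payloads: list[dict]) -> list[Article]:
    # Batch import: an array of direct-API payloads.
    return [normalize_api_payload(p) for p in payloads]
```

Downstream stages only ever see `Article`, so adding a new channel means adding one more `normalize_*` function.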
Stage 2: Enhancement
Raw content goes through an AI enhancement pass that:
- Standardizes voice and tone to match my brand guidelines
- Adds proper heading hierarchy
- Ensures SEO elements are in place (keyword in first paragraph, proper heading structure)
- Generates a meta description if one was not provided
- Suggests tags based on content analysis
The enhancement is not about rewriting. It is about polish. The ideas and structure come from the input. The enhancement handles the mechanical optimization that every article needs but is tedious to do manually.
Stage 3: Quality Gate
This is the most important stage. Not every article should be published, and the quality gate makes that decision automatically.
The scoring system evaluates:
- Depth — Is the content substantive enough? Short, thin articles get rejected.
- Originality — Does this add something new, or does it just restate common knowledge?
- Structure — Are headings logical? Is there a clear flow?
- SEO readiness — Are the technical elements in place?
Articles that score above the threshold get published automatically. Articles below the threshold get flagged for manual review. Articles far below the threshold get rejected.
The threshold is deliberately high. I would rather publish fewer articles than publish bad ones. In practice, about one in five articles fails the gate and needs additional work.
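The three outcomes can be sketched as a weighted score checked against two cutoffs. The weights, the cutoff values, and the 0-to-1 scale are all assumptions chosen for illustration. Only the four criteria and the publish, review, and reject outcomes come from the description above.

```python
from enum import Enum


class Decision(Enum):
    PUBLISH = "publish"
    REVIEW = "review"
    REJECT = "reject"


# Assumed values, not the real configuration.
PUBLISH_THRESHOLD = 0.8  # at or above: publish automatically
REJECT_THRESHOLD = 0.5   # below: reject outright
WEIGHTS = {"depth": 0.3, "originality": 0.3, "structure": 0.2, "seo": 0.2}


def overall_score(scores: dict[str, float]) -> float:
    """Combine per-criterion scores (each 0.0 to 1.0) into one weighted score."""
    return sum(weight * scores[name] for name, weight in WEIGHTS.items())


def decide(scores: dict[str, float]) -> Decision:
    total = overall_score(scores)
    if total >= PUBLISH_THRESHOLD:
        return Decision.PUBLISH
    if total < REJECT_THRESHOLD:
        return Decision.REJECT
    # Between the two cutoffs: hold for manual review.
    return Decision.REVIEW


decision = decide({"depth": 0.9, "originality": 0.9, "structure": 0.9, "seo": 0.9})
```

Keeping the cutoffs as named constants makes the "deliberately high" threshold a single value to tune.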
Stage 4: Publication
Approved articles are converted from markdown to the CMS format (Portable Text for Sanity) and published with all metadata:
- Slug is auto-generated from the title
- Publication date is set
- Category and tags are assigned
- The article goes live and appears in the sitemap
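Slug generation is the easiest of these steps to show. This is one common approach using only the standard library, not necessarily the exact rules the pipeline applies; `slugify` is a name chosen for the sketch.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn an article title into a lowercase, hyphen-separated slug."""
    # Decompose accented characters, then drop anything that is not ASCII.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Replace every run of non-alphanumeric characters with a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower())
    # Remove hyphens left over at either end.
    return slug.strip("-")


slug = slugify("The Problem With Manual Publishing")
# slug == "the-problem-with-manual-publishing"
```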
The conversion handles all the formatting details: headings, bold text, links, lists, blockquotes, and inline code. It also validates that the converted output renders correctly.
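For a sense of what the conversion produces, here is a heavily simplified sketch that maps a small markdown subset to Portable Text blocks. It assumes every block sits on its own line and covers only headings, blockquotes, bullet items, bold text, and inline code. Links are left out because they need `markDefs` entries, and the real converter handles far more than this. The function names are hypothetical.

```python
import re
import uuid

# Matches **bold** or `inline code`; the capture group keeps matches in re.split.
INLINE = re.compile(r"(\*\*[^*]+\*\*|`[^`]+`)")


def _key() -> str:
    # Sanity expects a unique _key on every array item.
    return uuid.uuid4().hex[:12]


def _spans(text: str) -> list[dict]:
    """Split a line into Portable Text spans, marking bold and inline code."""
    spans = []
    for part in INLINE.split(text):
        if not part:
            continue
        marks: list[str] = []
        if INLINE.fullmatch(part):
            if part.startswith("**"):
                part, marks = part[2:-2], ["strong"]
            else:
                part, marks = part[1:-1], ["code"]
        spans.append({"_type": "span", "_key": _key(), "text": part, "marks": marks})
    return spans


def markdown_to_portable_text(markdown: str) -> list[dict]:
    blocks = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line:
            continue
        block = {"_type": "block", "_key": _key(), "style": "normal", "markDefs": []}
        heading = re.match(r"(#{1,6})\s+(.*)", line)
        if heading:
            block["style"] = f"h{len(heading.group(1))}"
            line = heading.group(2)
        elif line.startswith("> "):
            block["style"] = "blockquote"
            line = line[2:]
        elif line.startswith(("- ", "* ")):
            block["listItem"] = "bullet"
            block["level"] = 1
            line = line[2:]
        block["children"] = _spans(line)
        blocks.append(block)
    return blocks


blocks = markdown_to_portable_text("## Title\n\nUse **bold** and `code`.\n- item")
```

Each returned dict is one Portable Text block, ready to be placed in the body array of a Sanity document.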
The Technical Stack
The pipeline runs on:
- API endpoints — Serverless functions that handle input and orchestration
- AI API — For content enhancement and quality scoring
- Sanity CMS — For storage and rendering
- Markdown parser — Custom converter from markdown to Portable Text
Total infrastructure cost is minimal because everything is serverless and scales to zero when not in use.
What I Learned Building This
Quality Gates Are Non-Negotiable
The temptation with automation is to remove friction. But some friction is valuable. The quality gate adds friction on purpose — it slows down publication to ensure quality. Without it, the pipeline would publish everything, and the blog quality would degrade fast.
Markdown Is the Universal Format
Every stage of the pipeline uses markdown as the interchange format. This was the best design decision I made. Markdown is human-readable, easy to generate, and convertible to any output format. If I ever switch CMS platforms, only the final conversion stage needs to change.
Monitoring Matters More Than You Think
When publishing is automated, you need visibility into what is being published. I built monitoring that alerts me when:
- An article fails the quality gate (so I can review it)
- The publication rate changes significantly (could indicate a source problem)
- An article is published with missing metadata (potential bug)
Without monitoring, automated publishing is a liability because problems compound silently.
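Two of those checks are simple enough to sketch. The 50% tolerance, the list of required fields, and the function names are assumptions for illustration, and how an alert actually gets delivered is out of scope here.

```python
from statistics import mean

# Assumed set of fields that count as "metadata" for this check.
REQUIRED_FIELDS = ("title", "description", "tags", "category")


def rate_changed(recent_daily_counts: list[int], today: int, tolerance: float = 0.5) -> bool:
    """True when today's count differs from the recent average by more than `tolerance`."""
    if not recent_daily_counts:
        return False  # no baseline yet, nothing to compare against
    baseline = mean(recent_daily_counts)
    if baseline == 0:
        return today > 0
    return abs(today - baseline) / baseline > tolerance


def missing_metadata(article: dict) -> list[str]:
    """Return the names of required fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not article.get(name)]


alerts = []
if rate_changed([2, 2, 2], today=6):
    alerts.append("publication rate changed")
if missing_metadata({"title": "T", "tags": []}):
    alerts.append("published with missing metadata")
```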
Edge Cases Are the Hard Part
The happy path — a well-formatted article entering the pipeline and publishing successfully — was easy to build. The edge cases consumed most of the development time:
- Articles with malformed markdown
- Duplicate slugs that would overwrite existing articles
- Images that fail to upload
- Content that passes quality scoring but contains factual errors
Each edge case required a specific handler. The pipeline is more error handling than publishing logic.
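As an example of one such handler, here is a sketch for the duplicate-slug case. Appending a numeric suffix is only one possible strategy, and it is an assumption here (refusing to publish would be another). `existing` stands in for whatever lookup the CMS provides.

```python
def unique_slug(slug: str, existing: set[str]) -> str:
    """Return `slug`, or `slug-2`, `slug-3`, ... if it is already taken."""
    if slug not in existing:
        return slug
    suffix = 2
    while f"{slug}-{suffix}" in existing:
        suffix += 1
    return f"{slug}-{suffix}"


taken = {"my-article", "my-article-2"}
new_slug = unique_slug("my-article", taken)
# new_slug == "my-article-3"
```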
The Results
Since building this pipeline:
- Publishing time per article dropped from thirty minutes to under one minute of my time
- Content output increased significantly without increasing my work hours
- Quality remained consistent because the quality gate enforces standards
- I can publish from anywhere — my phone, a script, a scheduled task
How to Build Your Own
You do not need to replicate this exact system. The principles transfer:
- Standardize your input format. Pick one format (markdown is ideal) and make everything convert to it.
- Build a quality gate. Automated publishing without quality control is a recipe for disaster.
- Make it modular. Each stage should be replaceable without affecting the others.
- Monitor everything. Automated systems need more monitoring, not less.
- Start simple. Begin with a direct API that publishes markdown to your CMS. Add enhancement and quality scoring later.
FAQ
How long did it take to build this pipeline?
The basic version (API to CMS) took a day. The enhancement and quality gate took another week of iteration. The monitoring and edge case handling took another week. Total: a little over two weeks of part-time work.
Does the quality gate ever make mistakes?
Yes. It occasionally passes content that needs editing, and occasionally flags content that is fine. The threshold is tuned to err on the side of caution — better to review an article unnecessarily than to publish a bad one.
Can I use this with WordPress instead of Sanity?
Yes. The pipeline is CMS-agnostic. Only the final publication stage needs to change. WordPress has REST and XML-RPC APIs that serve the same function as Sanity's API.
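As a rough sketch of what the swapped-in publication stage might look like, the following builds (but does not send) a request to the WordPress REST API posts endpoint, authenticating with an application password over Basic auth. The site URL, credentials, and function name are placeholders, and the body is assumed to have been converted from markdown to HTML already.

```python
import base64
import json
from urllib.request import Request


def build_wordpress_request(
    site_url: str, username: str, app_password: str,
    title: str, html: str, slug: str,
) -> Request:
    """Build a POST request that would create a published WordPress post."""
    token = base64.b64encode(f"{username}:{app_password}".encode()).decode()
    payload = {"title": title, "content": html, "slug": slug, "status": "publish"}
    return Request(
        url=f"{site_url.rstrip('/')}/wp-json/wp/v2/posts",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_wordpress_request(
    "https://example.com/", "editor", "placeholder-password",
    title="Hello", html="<p>Hello</p>", slug="hello",
)
```

Passing `request` to `urllib.request.urlopen` would send it. The earlier stages stay exactly as they are.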
How do you handle images?
Images are optional in my pipeline. When an image URL is provided, the pipeline downloads it, uploads it to the CMS asset storage, and attaches it to the article. Most of my articles publish without images initially and get them added later.