AI Will Write Your Microservices. It Cannot Design Them.
AI coding tools are remarkably good at implementing individual microservices. Define the API contract, specify the data model, describe the business logic, and AI generates a working service in hours. This efficiency is transformative.
It is also dangerous. Because building is now fast, teams skip the architecture decisions that determine whether the system works at scale. AI generates the bricks. You still need to design the building.
I have built and maintained microservice architectures for years. AI has changed how fast I can build individual services. It has not changed which decisions matter or why. Here are the five architecture decisions that must stay with humans, and how AI can help with everything else.
Decision 1: Service Boundaries
Where you draw the boundaries between services is the most consequential architecture decision you will make. Get it wrong and you end up with either a distributed monolith (all the complexity of microservices, none of the benefits) or services so fine-grained that network calls dominate your performance.
Why AI Cannot Decide This
Service boundaries are business decisions, not technical ones. They should align with organizational boundaries, domain concepts, and deployment cadences. AI does not understand your org chart, your team structure, or which parts of the system change together.
How AI Can Help
AI can inform the boundary decision and handle the work that follows from it. It excels at:
- Identifying dependencies between services that might suggest boundary adjustments
- Generating interface definitions based on your boundary decisions
- Analyzing existing code to suggest natural decomposition points
- Documenting the rationale for each boundary decision
- Creating dependency diagrams from your service definitions
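One dependency check from the list above can be sketched in a few lines: services that call each other in a cycle usually cannot be deployed independently, which hints that a boundary may be in the wrong place. This is a minimal sketch, not a full analysis tool. The service names and the DEPENDENCIES map are hypothetical, and cycles are deduplicated by their set of members, which is a simplification.

```python
# Hypothetical call graph: each service maps to the services it calls.
DEPENDENCIES = {
    "checkout": ["inventory", "payments"],
    "inventory": ["pricing"],
    "pricing": ["inventory"],  # pricing and inventory call each other
    "payments": [],
}


def find_cycles(graph):
    """Return each dependency cycle once, as a list of service names."""
    cycles = []
    seen_keys = set()

    def visit(node, path):
        if node in path:
            # We walked back onto the current path: everything from the
            # first occurrence of node onward forms a cycle.
            cycle = path[path.index(node):]
            key = frozenset(cycle)
            if key not in seen_keys:
                seen_keys.add(key)
                cycles.append(cycle)
            return
        for dep in graph.get(node, []):
            visit(dep, path + [node])

    for service in graph:
        visit(service, [])
    return cycles


if __name__ == "__main__":
    for cycle in find_cycles(DEPENDENCIES):
        print("Review this boundary:", " -> ".join(cycle + cycle[:1]))
```

On the sample graph, find_cycles(DEPENDENCIES) returns [["inventory", "pricing"]]. Whether that cycle means the two services should merge is still a human call.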
The Rule I Follow
A service boundary is correct if the team that owns the service can deploy it independently without coordinating with other teams. If deploying your service requires another team to deploy theirs, the boundary is in the wrong place.
Decision 2: Communication Patterns
How services talk to each other defines the system's reliability, performance, and complexity profile. Synchronous HTTP, asynchronous messaging, event sourcing, gRPC -- each pattern has different tradeoffs that ripple through the entire system.
Why AI Cannot Decide This
Communication pattern selection depends on your specific reliability requirements, latency budgets, team expertise, and operational maturity. AI tends to suggest whatever pattern matches the keywords in your prompt, not whatever pattern matches your constraints.
How AI Can Help
Once you choose a communication pattern, AI is excellent at implementing it:
- Generating message schemas and event definitions
- Implementing retry logic with exponential backoff and jitter
- Building dead letter queue handlers
- Creating circuit breakers with configurable thresholds
- Writing integration tests that simulate network failures
- Generating client libraries for service-to-service communication
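To make the retry item above concrete, here is a minimal sketch of retry with exponential backoff and "full jitter" (each delay is drawn uniformly between zero and an exponentially growing cap). The function names, default values, and the choice of retryable exceptions are my assumptions for illustration. Tune them to your own latency budget.

```python
import random
import time


def backoff_delays(attempts, base=0.1, cap=5.0, rng=random.random):
    """Yield one delay per retry: rng() * min(cap, base * 2**attempt)."""
    for attempt in range(attempts):
        yield rng() * min(cap, base * (2 ** attempt))


def call_with_retry(operation, attempts=4,
                    retry_on=(ConnectionError, TimeoutError),
                    sleep=time.sleep, **backoff):
    """Call operation(); on a retryable error, wait and try again.

    `attempts` is the number of retries after the first call. Extra keyword
    arguments (base, cap, rng) are passed through to backoff_delays.
    """
    delays = backoff_delays(attempts, **backoff)
    while True:
        try:
            return operation()
        except retry_on:
            delay = next(delays, None)
            if delay is None:
                raise  # retries exhausted: surface the last error
            sleep(delay)
```

The sleep and rng parameters exist so tests can run without real waiting or randomness. Only retry operations that are safe to repeat. Retrying a non-idempotent call can duplicate side effects.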
My Default Stack
Synchronous HTTP for queries that need immediate responses. Asynchronous messaging for operations that can tolerate latency. Event sourcing only when you genuinely need an audit trail or time-travel debugging. Start simple and add complexity only when the simple approach demonstrably fails.
Decision 3: Data Ownership
Which service owns which data? Can services share a database? How do you handle data that multiple services need?
Why AI Cannot Decide This
Data ownership decisions involve business logic, compliance requirements, consistency needs, and organizational politics. AI cannot know which data is regulated, which data changes frequently, or which data multiple teams need concurrent access to.
How AI Can Help
Once you assign ownership, AI can help by:
- Generating database schemas for each service based on ownership decisions
- Implementing data synchronization patterns between services
- Building API contracts for data access across service boundaries
- Creating data migration scripts when ownership changes
- Implementing eventual consistency patterns with proper conflict resolution
The Principle
Every piece of data has exactly one owning service. Other services access that data through the owner's API, never directly from the owner's database. This principle is simple to state and difficult to enforce. AI cannot enforce it. You must.
Violations of this principle are the single most common source of microservice failures I have seen. Shared databases create hidden coupling that defeats the entire purpose of the architecture.
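Enforcement stays with you, but a small check can at least surface violations. The sketch below assumes you can produce two inputs: a registry mapping each service to the tables it owns, and a list of (service, table) pairs describing direct database access, perhaps derived from credentials or query logs. All names here are hypothetical.

```python
# Hypothetical registry: each table appears under exactly one owning service.
OWNERSHIP = {
    "order-service": ["orders", "order_items"],
    "customer-service": ["customers", "addresses"],
}

# Hypothetical record of which service touches which table directly.
ACCESSES = [
    ("order-service", "orders"),
    ("order-service", "customers"),  # bypasses the customer-service API
    ("customer-service", "addresses"),
]


def owner_index(ownership):
    """Map table -> owning service, rejecting tables claimed twice."""
    index = {}
    for service, tables in ownership.items():
        for table in tables:
            if table in index:
                raise ValueError(
                    f"{table} is claimed by {index[table]} and {service}"
                )
            index[table] = service
    return index


def find_violations(ownership, accesses):
    """Return (service, table, owner) for each direct access by a non-owner.

    Tables missing from the registry are reported with owner None.
    """
    index = owner_index(ownership)
    return [
        (service, table, index.get(table))
        for service, table in accesses
        if index.get(table) != service
    ]
```

With the sample data, find_violations(OWNERSHIP, ACCESSES) returns [("order-service", "customers", "customer-service")]. Running a check like this in CI is one option. It does not replace the review conversation about why the shortcut was taken.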
Decision 4: Failure Handling Strategy
In a distributed system, partial failure is the norm. Your failure handling strategy determines whether a service outage cascades into a system outage.
Why AI Cannot Decide This
Failure handling involves understanding which failures your business can tolerate and for how long. Can your checkout process tolerate the recommendation service being down? For how long? What is the user experience during degradation? These are business decisions with technical implementations.
How AI Can Help
Once you define your failure tolerance, AI implements the mechanisms:
- Circuit breakers that trip at your specified thresholds
- Fallback behaviors for each degraded state
- Health check endpoints that accurately reflect service status
- Alerting rules based on your severity definitions
- Graceful degradation paths for each critical dependency
- Bulkhead patterns to isolate failure domains
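As an illustration of the first two items, here is a minimal circuit breaker with a fallback. It is a sketch under simple assumptions: single-threaded use, consecutive failures as the trip condition, and one trial call after the timeout. The class name, parameters, and defaults are mine, not taken from any library.

```python
import time


class CircuitBreaker:
    """Closed -> open after N consecutive failures -> trial call after a timeout."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, operation, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                return fallback()  # open: skip the dependency entirely
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = operation()
        except Exception:
            self.failures += 1
            half_open = self.opened_at is not None
            if half_open or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # trip, or re-trip after a failed trial
            return fallback()
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

In the checkout example, the operation would fetch recommendations and the fallback might return an empty list, so checkout keeps working while the recommendation service is down. The threshold and timeout values are the business decision. The class only implements them.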
Decision 5: Deployment and Orchestration
How services get deployed, scaled, and monitored is an architecture decision that affects everything from development velocity to incident response time.
Why AI Cannot Decide This
Deployment decisions depend on your team size, operational expertise, cost constraints, and compliance requirements. A three-person startup has different deployment needs than a fifty-person engineering organization. The right choice for your stage may be wrong a year later.
How AI Can Help
Once you choose your deployment platform, AI can help by:
- Generating Dockerfile configurations optimized for your services
- Creating infrastructure-as-code definitions
- Building CI/CD pipeline configurations
- Implementing health checks and readiness probes
- Writing deployment scripts with rollback capabilities
- Generating monitoring dashboards and alert configurations
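A readiness probe is a good example of something small that is easy to get subtly wrong. The sketch below shows one way to aggregate dependency checks into a probe response. The function name, the response shape, and the choice of 503 for "not ready" are assumptions. Wire the result into whatever HTTP framework and orchestrator you actually use.

```python
from http import HTTPStatus


def readiness_report(checks):
    """Run each named check and return (http_status, body) for a readiness endpoint.

    `checks` maps a dependency name to a zero-argument callable that returns
    a truthy value when that dependency is usable.
    """
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False  # a check that raises counts as failing
    ready = all(results.values())
    status = HTTPStatus.OK if ready else HTTPStatus.SERVICE_UNAVAILABLE
    return int(status), {"ready": ready, "checks": results}
```

Which dependencies belong in the readiness check is a design decision. Including an optional dependency here would take the service out of rotation whenever that dependency fails, which is the cascading behavior the failure handling strategy is meant to prevent.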
What AI Handles Excellently
With architecture decisions made, AI accelerates everything else:
Individual service implementation: Given a clear API contract and data model, AI generates complete service implementations including handlers, middleware, database access, and tests.
Boilerplate generation: Service scaffolding, configuration files, Dockerfiles, CI/CD pipelines -- all the repetitive setup that is necessary but not intellectually challenging.
Test generation: Unit tests, integration tests, contract tests. AI generates broad test suites that cover the happy path, edge cases, and failure scenarios, though you still need to confirm they exercise the failure modes that matter to your system.
Documentation: API documentation, architecture decision records, runbooks. AI generates clear documentation from your code and design decisions.
The Architect's Workflow
My workflow for building microservices with AI:
- Design the architecture (human): Service boundaries, communication patterns, data ownership, failure handling, deployment strategy.
- Document decisions (human + AI): Write architecture decision records. AI helps structure and articulate the reasoning.
- Define contracts (human + AI): API specifications, message schemas, data models. AI generates the formal definitions from natural language descriptions.
- Implement services (AI + human review): AI generates each service. I review for correctness, security, and adherence to architecture decisions.
- Test the system (AI + human): AI generates tests. I verify that tests cover the actual failure modes we care about, not just the happy path.
- Deploy and monitor (AI + human): AI generates deployment configurations. I verify they match our operational requirements.
The pattern: humans decide, AI implements, humans verify. This loop repeats at every level of the system.
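To illustrate the "define contracts" step, here is a hedged sketch of a message contract written as a Python dataclass. The OrderPlaced event, its fields, and the validation rules are hypothetical. The point is that the contract is explicit, versioned, and checked at the service boundary.

```python
import json
from dataclasses import MISSING, asdict, dataclass, fields


@dataclass(frozen=True)
class OrderPlaced:
    """Hypothetical event contract shared by a producer and its consumers."""

    order_id: str
    customer_id: str
    total_cents: int
    schema_version: int = 1

    def to_json(self):
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, raw):
        data = json.loads(raw)
        # Fields without a default are required by the contract.
        required = {f.name for f in fields(cls) if f.default is MISSING}
        missing = required - set(data)
        if missing:
            raise ValueError(f"missing fields: {sorted(missing)}")
        # Ignore unknown fields so producers can add fields without
        # breaking existing consumers.
        known = {f.name for f in fields(cls)}
        event = cls(**{k: v for k, v in data.items() if k in known})
        if not isinstance(event.total_cents, int) or event.total_cents < 0:
            raise ValueError("total_cents must be a non-negative integer")
        return event
```

Ignoring unknown fields is one compatibility policy among several. Rejecting them is stricter and catches typos earlier, at the cost of coupling producer and consumer releases. That tradeoff is an architecture decision, not an implementation detail.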
FAQ
Can AI help decide between microservices and a monolith?
AI can outline the tradeoffs, but the decision depends on your team size, deployment needs, and scaling requirements. Most startups should start with a monolith and extract services when specific scaling needs arise. AI cannot make that judgment because it requires understanding your organizational context.
How many microservices should I start with?
As few as possible. Every service adds operational overhead for deployment, monitoring, debugging, and inter-service communication. Start with two or three services at natural domain boundaries and split further only when you have concrete reasons.
Should I use AI-generated code in production microservices?
Yes, with the same review rigor you would apply to human-written code. AI-generated code that passes code review and comprehensive testing is production-ready. The source of the code matters less than the quality of the review process.
What about service mesh and API gateway decisions?
These are infrastructure decisions that depend on your scale and operational maturity. Most startups do not need a service mesh. An API gateway becomes valuable once you have more than a handful of services. Do not add infrastructure complexity before you need it -- you will know when you need it because the pain of not having it becomes obvious.