Prompting for Architecture and Design — Prompt Engineering for Developers

AI as an Architecture Consultant

Architecture tasks draw on a different strength of AI tools than code generation does. You are not asking the AI to write code -- you are asking it to think through design decisions, evaluate tradeoffs, and propose solutions to structural problems. The quality of your architectural prompts determines whether you get generic textbook advice or analysis tailored to your specific constraints.

The key difference between architecture prompts and code prompts is that architecture prompts are more about reasoning and less about output format. You want the AI to think through implications, compare alternatives, and surface issues you might have missed -- not just produce a file you can copy-paste.

The Architecture Prompt: Requirements + Constraints + Tradeoffs

Every architecture prompt should include three elements: what you need the system to do (requirements), what limits the solution (constraints), and what tensions exist between competing goals (tradeoffs).

Bad -- Requirements only:

Design a caching layer for our API

This tells the AI nothing about your scale, existing infrastructure, or what you are optimizing for. You will most likely get a generic Redis tutorial.

Good -- Requirements + Constraints + Tradeoffs:

Design a caching layer for our REST API.

Requirements:
- Cache GET responses for /api/products and /api/categories
- Support cache invalidation when products are updated via the admin panel
- Handle approximately 10,000 requests per second at peak
- Cache hit should return in under 5ms

Constraints:
- Must use our existing Redis 7 cluster (3 nodes, 16GB each)
- Must fit into our Express.js middleware chain
- Cannot add new infrastructure (no Varnish, no CDN for this phase)
- Must work with our existing JWT authentication (some responses are user-specific)

Tradeoffs I am considering:
- Simple TTL-based expiration vs event-driven invalidation
- Caching at the route handler level vs middleware level
- Full response caching vs partial/fragment caching

Propose an architecture with your recommended approach for each tradeoff,
and explain why.
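
If you write prompts like this often, it can help to assemble them from structured fields so that none of the three elements is left out. The following is a minimal Python sketch, not a library API: the `ArchitecturePrompt` class and its methods are hypothetical names made up for illustration, and the section titles simply mirror the prompt above.

```python
from dataclasses import dataclass, field


@dataclass
class ArchitecturePrompt:
    """Hypothetical helper holding the three elements of an architecture prompt."""
    task: str
    requirements: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    tradeoffs: list[str] = field(default_factory=list)

    def missing_sections(self) -> list[str]:
        """Return the names of any sections that were left empty."""
        sections = {
            "requirements": self.requirements,
            "constraints": self.constraints,
            "tradeoffs": self.tradeoffs,
        }
        return [name for name, items in sections.items() if not items]

    def render(self) -> str:
        """Build the prompt text, refusing to render an incomplete prompt."""
        missing = self.missing_sections()
        if missing:
            raise ValueError(f"Prompt is missing: {', '.join(missing)}")
        parts = [self.task, ""]
        for title, items in (
            ("Requirements:", self.requirements),
            ("Constraints:", self.constraints),
            ("Tradeoffs I am considering:", self.tradeoffs),
        ):
            parts.append(title)
            parts.extend(f"- {item}" for item in items)
            parts.append("")
        parts.append(
            "Propose an architecture with your recommended approach for each "
            "tradeoff, and explain why."
        )
        return "\n".join(parts)


prompt = ArchitecturePrompt(
    task="Design a caching layer for our REST API.",
    requirements=["Cache GET responses for /api/products and /api/categories"],
    constraints=["Must use our existing Redis 7 cluster (3 nodes, 16GB each)"],
    tradeoffs=["Simple TTL-based expiration vs event-driven invalidation"],
)
print(prompt.render())
```

The `ValueError` is the point of the sketch: a requirements-only prompt like the "Bad" example fails before it is ever sent.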

Comparing Alternatives

One of the most valuable uses of AI in architecture is comparing multiple approaches. The AI can lay out pros and cons faster than you can research them -- as long as you specify the comparison criteria.

Bad:

Should I use WebSockets or SSE?

Good:

Compare three approaches for adding real-time order status updates to our
Next.js 14 e-commerce app:

Option A: WebSockets (via Socket.io)
Option B: Server-Sent Events (SSE)
Option C: Short polling (fetch every 5 seconds)

Compare on these criteria:
1. Implementation complexity (we have 2 backend devs, 1 frontend dev)
2. Infrastructure cost (we run on AWS ECS with an ALB)
3. Scalability to 50,000 concurrent users
4. Reliability when users are on mobile networks
5. Compatibility with our existing Next.js API routes

Current setup: Next.js 14 App Router, deployed on AWS ECS behind ALB,
PostgreSQL database. Orders update status 3-5 times over 30 minutes
(placed -> confirmed -> preparing -> shipped -> delivered).

For each option, give me the implementation approach, estimated effort in
developer-days, and your recommended choice with reasoning.

This prompt specifies the exact options to compare, the criteria that matter to your team, and the context of your existing system. The AI can now give you a genuinely useful comparison instead of a generic "it depends."
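
Before sending a comparison prompt, it is worth running the basic arithmetic yourself so you can sanity-check the AI's scalability claims. The sketch below uses only numbers from the prompt above (50,000 concurrent users, a 5-second polling interval, up to 5 status updates over 30 minutes). The function names are hypothetical, and it assumes that polls are spread evenly over time and that every connected user has one active order.

```python
def polling_requests_per_second(concurrent_users: int, interval_seconds: float) -> float:
    """Average request rate if every user polls once per interval."""
    return concurrent_users / interval_seconds


def push_events_per_second(concurrent_users: int, updates_per_order: int,
                           window_minutes: float) -> float:
    """Average event rate if the server only sends actual status changes."""
    return concurrent_users * updates_per_order / (window_minutes * 60)


poll_rate = polling_requests_per_second(50_000, 5)
push_rate = push_events_per_second(50_000, 5, 30)

print(f"Short polling: {poll_rate:,.0f} requests/sec")
print(f"Push (WebSockets or SSE): {push_rate:,.0f} events/sec")
```

Under these assumptions short polling generates about 10,000 requests per second, while a push-based option sends about 139 events per second but has to hold 50,000 connections open. Including figures like these in the prompt gives the AI something concrete to reason about for the scalability criterion.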

The "Explain Tradeoffs" Pattern

Sometimes you do not want the AI to choose for you -- you want it to lay out the implications so you can make an informed decision.

We need to decide how to handle file uploads in our SaaS application.

Option A: Upload directly to S3 from the browser using presigned URLs
Option B: Upload through our API server, which then forwards to S3

For each option, explain:
- Security implications (who can upload what, how do we validate files?)
- Performance characteristics (latency, bandwidth usage on our server)
- Complexity of implementation
- Impact on our current Express.js API server (CPU, memory, bandwidth)
- How it handles large files (up to 500MB video uploads)
- Error handling and retry behavior

Do not recommend one over the other. I want to understand the full picture
before deciding.

The explicit "do not recommend" instruction is important. It pushes the AI toward a balanced analysis instead of anchoring on one option and arguing for it.

Architecture Review Prompts

You can use AI to review existing architecture for potential issues. This is especially useful before scaling, before a major feature addition, or when joining a new project.

Review this database schema for our multi-tenant SaaS project management app.
Focus on scalability issues we might hit at 10,000+ tenants with an average
of 50 users each.

[paste schema or reference the file]

Specific concerns:
1. Query performance: Which queries will become slow as data grows?
2. Tenant isolation: Can data leak between tenants with our current design?
3. Index strategy: Are we missing indexes for common query patterns?
4. Migration risk: Which changes would require downtime to implement later?

Our current query patterns:
- List all projects for a tenant (most frequent, ~100 projects per tenant)
- Get all tasks for a project with assignee info (second most frequent)
- Dashboard aggregation: task counts by status per tenant
- Search tasks by title across all projects in a tenant

We use PostgreSQL 15 with Prisma ORM. Row-Level Security is not currently
enabled.
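
A review prompt like this is more useful when the target scale is expressed as approximate row counts. The sketch below multiplies out the figures from the prompt, reading "~100 per tenant" as roughly 100 projects per tenant. The `estimated_rows` function is a hypothetical helper, and the tasks-per-project figure is a made-up placeholder, because the prompt does not state one.

```python
def estimated_rows(tenants: int, users_per_tenant: int,
                   projects_per_tenant: int, tasks_per_project: int) -> dict[str, int]:
    """Back-of-envelope row counts for the main tables at the target scale."""
    projects = tenants * projects_per_tenant
    return {
        "users": tenants * users_per_tenant,
        "projects": projects,
        "tasks": projects * tasks_per_project,
    }


rows = estimated_rows(
    tenants=10_000,
    users_per_tenant=50,
    projects_per_tenant=100,
    tasks_per_project=200,  # placeholder assumption, replace with your own data
)
for table, count in rows.items():
    print(f"{table}: ~{count:,} rows")
```

With the placeholder value, the tasks table ends up two orders of magnitude larger than the others, which tells you where to point the query performance and index questions.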

Design Document Generation

AI can draft Architecture Decision Records (ADRs) from decisions you have already made. You supply the context, the decision, and the rejected alternatives; the AI handles the write-up.

Generate an ADR (Architecture Decision Record) for our decision to migrate
from REST to GraphQL for our mobile API.

Context:
- Our mobile app currently makes 8-12 REST calls per screen
- Users on 3G connections experience 3-5 second load times
- We have 45 REST endpoints, growing by ~5 per quarter
- Backend team: 4 developers experienced with Express.js
- Mobile team: 2 developers (React Native)

Decision: Adopt GraphQL (Apollo Server) for new mobile-facing endpoints while
keeping REST for admin panel and webhooks.

Reasons: reduce over-fetching, allow mobile team to query exactly what they
need, reduce number of round-trips per screen.

Rejected alternatives:
1. BFF (Backend for Frontend) pattern -- would still require us to build and
   maintain a dedicated endpoint for each screen
2. REST with sparse fieldsets -- clients would need to specify fields for every
   request, and our current REST framework does not support this well
3. Full GraphQL migration -- too risky, too much effort for the admin panel
   which works fine with REST

Format: Use an ADR template with Status, Context, Decision, Consequences
(positive and negative), and Compliance sections.
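
Once the AI returns a draft, it is worth checking that every requested section is actually present. The following is a minimal sketch, assuming the draft puts each section name on its own line (optionally as a Markdown heading or followed by a colon). `REQUESTED_SECTIONS` mirrors the Format line of the prompt above, and `missing_sections` is a hypothetical helper, not part of any ADR tooling.

```python
REQUESTED_SECTIONS = ["Status", "Context", "Decision", "Consequences", "Compliance"]


def normalize_heading(line: str) -> str:
    """Strip Markdown '#' markers, surrounding spaces, and a trailing colon."""
    return line.strip().lstrip("#").strip().rstrip(":").strip()


def missing_sections(adr_text: str) -> list[str]:
    """Return requested sections that never appear as a heading line."""
    headings = {normalize_heading(line) for line in adr_text.splitlines()}
    return [name for name in REQUESTED_SECTIONS if name not in headings]


# Shortened example draft; the section bodies are omitted on purpose.
draft = """\
# ADR: Adopt GraphQL for mobile-facing endpoints

## Status
Accepted

## Context
(text omitted)

## Decision
(text omitted)

## Consequences
(text omitted)
"""

print(missing_sections(draft))  # this draft has no Compliance section
```

A non-empty result is a cue to ask the AI a follow-up question for the missing section rather than writing it from scratch.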

Designing for Unknown Scale

When you know your system will need to scale but do not know exactly how, use AI to map out a scaling strategy:

Our notification service currently handles 1,000 notifications per minute.
We expect this to grow to 100,000 per minute within 12 months.

Current architecture:
- Node.js service processes notifications synchronously
- Each notification: render template, call email/SMS/push API, update database
- PostgreSQL stores notification records and user preferences
- Deployed as a single instance on AWS ECS

Propose a phased scaling plan:
Phase 1 (now to 10K/min): What can we change without major architecture shifts?
Phase 2 (10K to 50K/min): What infrastructure changes are needed?
Phase 3 (50K to 100K/min): What architectural redesign is required?

For each phase:
- Specific technical changes
- Estimated implementation effort
- What breaks if we skip this phase
- How to measure when we need to move to the next phase
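
The last bullet, knowing when to move to the next phase, comes down to comparing measured throughput against the phase boundaries. The sketch below encodes the boundaries from the prompt above. The function names are hypothetical, and the 70% trigger ratio is an assumption chosen for illustration, not a recommendation from any source.

```python
from typing import Optional

# Upper bound of each phase from the prompt, in notifications per minute.
PHASES = [
    ("Phase 1", 10_000),
    ("Phase 2", 50_000),
    ("Phase 3", 100_000),
]


def per_second(per_minute: float) -> float:
    """Convert a per-minute rate to a per-second rate."""
    return per_minute / 60


def current_phase(peak_per_minute: float) -> Optional[str]:
    """Return the phase whose upper bound covers the load, or None if beyond the plan."""
    for name, upper_bound in PHASES:
        if peak_per_minute <= upper_bound:
            return name
    return None


def next_phase_due(peak_per_minute: float, trigger_ratio: float = 0.7) -> bool:
    """True once peak load reaches trigger_ratio of the current phase's upper bound."""
    phase = current_phase(peak_per_minute)
    if phase is None:
        return True  # load already exceeds everything the plan covers
    upper_bound = dict(PHASES)[phase]
    return peak_per_minute >= trigger_ratio * upper_bound


for load in (1_000, 8_000, 100_000):
    print(f"{load:,}/min = {per_second(load):,.1f}/sec, "
          f"{current_phase(load)}, start next phase: {next_phase_due(load)}")
```

Today's 1,000 notifications per minute is about 17 per second, and the 12-month target of 100,000 per minute is about 1,667 per second. Stating the rates per second in the prompt makes it easier to compare them against the limits of the email, SMS, and push providers.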

API Design Prompts

API design is a specialized architecture task where AI can help you think through contracts, versioning, and edge cases.

Design the REST API contract for a team collaboration feature in our project
management app.

Features:
- Create/edit/delete teams
- Add/remove team members (with roles: admin, member, viewer)
- Team-scoped project access
- Invite users by email (creates pending invitations)

Design decisions I need help with:
1. URL structure: /api/teams/:id/members or /api/memberships?teamId=X
2. How to handle the invitation flow (separate resource or state on membership?)
3. How to represent role changes (PATCH on membership or dedicated endpoint?)
4. Pagination strategy for team member lists
5. What to return when creating a team (just the team, or team + membership?)

Constraints:
- Must be consistent with our existing API patterns (RESTful, JSON:API-ish
  response format with data/meta/links)
- Must support our RBAC middleware (roles are checked via middleware, not in
  handlers)
- Team operations need audit logging (who did what, when)

Provide the endpoint list with HTTP method, URL, request body, response body,
and status codes for each.

The Decision Matrix Pattern

When facing a complex decision with many factors, ask the AI to create a structured comparison:

I need to choose a message queue for our microservices architecture. Create
a decision matrix comparing:

Options: RabbitMQ, AWS SQS, Apache Kafka, Redis Streams

Criteria (weighted by importance):
- Operational complexity (weight: 5) - we have no dedicated DevOps team
- Message ordering guarantees (weight: 4) - important for our event sourcing
- Throughput at peak bursts of 50K messages/sec (weight: 3)
- AWS integration (weight: 4) - we are all-in on AWS
- Dead letter queue support (weight: 5) - critical for our reliability SLA
- Cost at our scale (weight: 3) - ~10M messages per day on average
- Team familiarity (weight: 2) - team has used RabbitMQ before

For each option, rate each criterion on a 1-5 scale with a brief justification.
Calculate the weighted score and provide a final recommendation.

This structure makes the comparison systematic and leaves a transparent, documentable record of how the decision was reached. Treat the ratings as a first draft to check against your own experience with each tool.
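
The weighted score itself is simple arithmetic, so you can recompute it instead of trusting the total the AI reports. The sketch below takes the weights from the prompt above, with shortened criterion names. The `weighted_score` function is a hypothetical helper, and the ratings for "Option X" and "Option Y" are made-up placeholders, not assessments of any real message queue.

```python
# Weights from the prompt above (criterion names shortened).
WEIGHTS = {
    "Operational complexity": 5,
    "Message ordering": 4,
    "Throughput": 3,
    "AWS integration": 4,
    "Dead letter queues": 5,
    "Cost": 3,
    "Team familiarity": 2,
}

# Best possible total: every criterion rated 5.
MAX_SCORE = 5 * sum(WEIGHTS.values())


def weighted_score(ratings: dict[str, int]) -> int:
    """Sum of weight * rating, after checking the ratings are complete and in range."""
    if set(ratings) != set(WEIGHTS):
        raise ValueError("ratings must cover exactly the weighted criteria")
    for criterion, rating in ratings.items():
        if not 1 <= rating <= 5:
            raise ValueError(f"{criterion}: rating {rating} is outside 1-5")
    return sum(WEIGHTS[criterion] * rating for criterion, rating in ratings.items())


# Placeholder ratings for illustration only; substitute the AI's ratings here.
ratings_by_option = {
    "Option X": dict(zip(WEIGHTS, [4, 3, 3, 5, 5, 4, 2])),
    "Option Y": dict(zip(WEIGHTS, [2, 5, 5, 3, 3, 2, 4])),
}

for option, ratings in ratings_by_option.items():
    print(f"{option}: {weighted_score(ratings)}/{MAX_SCORE}")
```

With these placeholder ratings the totals are 102 and 86 out of a possible 130. Changing one weight and rerunning shows quickly how sensitive the ranking is to your priorities.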

Common Pitfalls in Architecture Prompts

Avoid asking the AI to design an entire system from scratch in a single prompt. This produces surface-level answers for everything instead of depth on the parts that matter. Instead, break the architecture discussion into focused topics: data model first, then API design, then caching strategy, then deployment architecture.

Avoid prompts that assume the AI knows your business domain. "Design the billing system" gives the AI almost nothing to work with unless you add context about your pricing model, payment providers, tax requirements, and scale.

Always include what already exists. The AI's recommendation changes dramatically when it knows you already have Redis, already use PostgreSQL, already deploy on AWS. Without this context, it might recommend a completely different stack that would require months of migration.