Project: AI-Powered Features
AI Features in Your Apps
There is a crucial distinction that many people miss: using AI to build software and building software that uses AI are two different skills. Throughout this bootcamp, you have been mastering the first skill --- writing prompts that generate code, iterating with AI to build features, debugging with AI assistance. Now we are going to learn the second skill: integrating AI capabilities into the applications your users interact with.
FlowTask is about to get a lot smarter. Here is what we are adding:
- A chatbot that helps users manage tasks through conversation
- Text summarization that condenses long task descriptions and notes
- Semantic search that finds tasks by meaning, not just keyword matching
- Content generation that writes task descriptions, project summaries, and email drafts
- AI-powered form validation that goes beyond regex patterns
Each of these features follows the same architecture: your application sends a request to an AI API, receives a response, and presents it to the user. The complexity is in the details --- streaming, caching, context management, and cost control.
The AI API Basics
Before building features, let us understand how AI APIs work.
Making Your First API Call
The Anthropic API (Claude) and OpenAI API (GPT) follow similar patterns. Here is a basic call using the Anthropic SDK:
import Anthropic from "@anthropic-ai/sdk";
const anthropic = new Anthropic({
apiKey: process.env.ANTHROPIC_API_KEY,
});
const response = await anthropic.messages.create({
model: "claude-sonnet-4-20250514",
max_tokens: 1024,
messages: [
{ role: "user", content: "Summarize this text in one sentence: ..." },
],
});
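// content is a list of typed blocks; only text blocks have a text field
const firstBlock = response.content[0];
if (firstBlock.type === "text") console.log(firstBlock.text);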
And the equivalent with OpenAI:
import OpenAI from "openai";
const openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY,
});
const response = await openai.chat.completions.create({
model: "gpt-4o",
messages: [
{ role: "user", content: "Summarize this text in one sentence: ..." },
],
});
console.log(response.choices[0].message.content);
Request/Response Structure
Every AI API call includes:
- Model: Which model to use (affects quality, speed, and cost)
- Messages: The conversation history (system, user, and assistant messages)
- Max tokens: Limits the response length
- Temperature: Controls randomness (values near 0 give focused, mostly repeatable output; values near 1 give more varied, creative output)
The response includes the generated text, usage statistics (tokens consumed), and metadata.
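To make the response structure concrete, here is a minimal sketch of pulling the text and token usage out of a response object. The `AiResponse` type is a hand-written, simplified stand-in modeled on an Anthropic Messages response (a `content` array of typed blocks plus a `usage` object), and the helper names are illustrative rather than part of any SDK.

```typescript
// Simplified stand-in for the response shape (an assumption for illustration).
type ResponseBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; name: string };

interface AiResponse {
  content: ResponseBlock[];
  usage: { input_tokens: number; output_tokens: number };
}

// Join every text block; non-text blocks are skipped.
function extractText(response: AiResponse): string {
  let text = "";
  for (const block of response.content) {
    if (block.type === "text") text += block.text;
  }
  return text;
}

// Total tokens consumed by the call (input plus output).
function totalTokens(response: AiResponse): number {
  return response.usage.input_tokens + response.usage.output_tokens;
}

const exampleResponse: AiResponse = {
  content: [{ type: "text", text: "FlowTask ships Friday." }],
  usage: { input_tokens: 42, output_tokens: 7 },
};
```

Logging `totalTokens` for each call is a cheap way to see which features consume the most tokens before any billing dashboard exists.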
Models and Pricing
Model selection is a trade-off between quality, speed, and cost:
| Model | Best For | Speed | Cost |
|-------|----------|-------|------|
| Claude Haiku / GPT-4o Mini | Simple tasks, classification | Fast | Low |
| Claude Sonnet / GPT-4o | Most features | Medium | Medium |
| Claude Opus / o1 | Complex reasoning | Slower | Higher |
For FlowTask, we will use a mid-tier model for most features and a smaller model for simple tasks like validation. This keeps costs manageable while maintaining quality.
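One way to keep that choice in a single place is a small routing helper. This is a sketch under assumptions: the feature names, the tier split, and the placeholder model IDs are all hypothetical, and real IDs would come from your provider's current model list.

```typescript
// Feature names loosely mirror the FlowTask features in this lesson (assumed).
type Feature = "chat" | "summarize" | "generate" | "validate";
type Tier = "small" | "mid";

// Which tier each feature uses; the split itself is a judgment call.
const FEATURE_TIER: Record<Feature, Tier> = {
  chat: "mid",
  generate: "mid",
  summarize: "small",
  validate: "small",
};

// Model IDs are passed in so they live in config and can be updated in one place.
function pickModel(feature: Feature, models: Record<Tier, string>): string {
  return models[FEATURE_TIER[feature]];
}

// Placeholder IDs for illustration only; substitute real model IDs.
const demoModels: Record<Tier, string> = {
  small: "small-model-id",
  mid: "mid-model-id",
};
```

With this in place, moving a feature to a cheaper tier is a one-line change to `FEATURE_TIER` rather than a search through every API call.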
Environment Setup
Add your API key to .env.local:
ANTHROPIC_API_KEY=sk-ant-your-key-here
Create a shared AI client:
// lib/ai.ts
import Anthropic from "@anthropic-ai/sdk";
export const ai = new Anthropic({
apiKey: process.env.ANTHROPIC_API_KEY,
});
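A missing key otherwise surfaces only as an authentication error on the first request. As a hedged sketch, a small guard can fail at startup instead; `requireEnv` is a hypothetical helper, not part of the SDK, and it takes the environment object as a parameter so it can be tested without touching the real environment.

```typescript
// Minimal shape of an environment object such as process.env.
type Env = Record<string, string | undefined>;

// Return the variable's value, or throw a clear error if it is unset or blank.
function requireEnv(env: Env, key: string): string {
  const value = env[key];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${key}`);
  }
  return value;
}
```

In lib/ai.ts this could be used as `apiKey: requireEnv(process.env, "ANTHROPIC_API_KEY")`.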
Building a Chatbot Component
A chatbot is the most visible AI feature you can add. Users type natural language, and the AI responds with helpful actions.
The Prompt
Build a chatbot for FlowTask that helps users manage their tasks through
conversation. It should:
1. Chat UI component:
- Floating button in the bottom-right corner (chat bubble icon)
- Clicking it opens a chat panel (400px wide, 500px tall)
- Message list showing user and assistant messages with different
styling (user = right-aligned blue, assistant = left-aligned gray)
- Text input at the bottom with a send button
- Typing indicator (three animated dots) while waiting for response
- Close button to minimize the chat
2. Chat API route at /api/chat:
- Accepts messages array in the request body
- Sends to Claude API with a system prompt that knows about FlowTask
- Streams the response back to the client
- The system prompt should include the user's recent tasks and projects
as context
3. System prompt:
"You are FlowTask Assistant, a helpful AI that helps users manage
their tasks and projects. You can answer questions about their tasks,
suggest prioritization, and provide productivity tips. The user's
current data is provided below."
4. Include the user's task summary in the system prompt: total tasks,
overdue count, list of project names.
Streaming Responses
Streaming is critical for chatbot UX. Without streaming, users stare at a blank screen for several seconds while the full response generates. With streaming, they see words appear in real time.
The API route streams the response:
// app/api/chat/route.ts
import { NextRequest, NextResponse } from "next/server";
import { ai } from "@/lib/ai";
import { getCurrentUser } from "@/lib/auth";
import { prisma } from "@/lib/prisma";
export async function POST(request: NextRequest) {
const user = await getCurrentUser();
if (!user) return new NextResponse("Unauthorized", { status: 401 });
const { messages } = await request.json();
// Fetch user context
const taskCount = await prisma.task.count({
where: { project: { userId: user.id } },
});
const overdueCount = await prisma.task.count({
where: {
project: { userId: user.id },
status: { not: "DONE" },
dueDate: { lt: new Date() },
},
});
const projects = await prisma.project.findMany({
where: { userId: user.id },
select: { name: true },
});
const systemPrompt = `You are FlowTask Assistant. The user has ${taskCount} total tasks, ${overdueCount} overdue. Their projects: ${projects.map((p) => p.name).join(", ")}. Help them manage their work effectively.`;
const stream = await ai.messages.stream({
model: "claude-sonnet-4-20250514",
max_tokens: 1024,
system: systemPrompt,
messages,
});
// Return as a ReadableStream
const encoder = new TextEncoder();
const readable = new ReadableStream({
async start(controller) {
for await (const event of stream) {
if (
event.type === "content_block_delta" &&
event.delta.type === "text_delta"
) {
controller.enqueue(encoder.encode(event.delta.text));
}
}
controller.close();
},
});
return new NextResponse(readable, {
headers: { "Content-Type": "text/plain; charset=utf-8" },
});
}
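The route above forwards `messages` from the request body to the API without checking it. A minimal validation sketch follows; `parseMessages`, the `ChatMessage` type, and the 50-message cap are assumptions for illustration, and a schema library would work equally well.

```typescript
type ChatRole = "user" | "assistant";

interface ChatMessage {
  role: ChatRole;
  content: string;
}

// Validate untrusted JSON from the request body before forwarding it.
// Returns null when the payload is not a non-empty list of well-formed messages.
function parseMessages(body: unknown, maxMessages = 50): ChatMessage[] | null {
  if (typeof body !== "object" || body === null) return null;
  const raw = (body as { messages?: unknown }).messages;
  if (!Array.isArray(raw) || raw.length === 0 || raw.length > maxMessages) {
    return null;
  }
  const result: ChatMessage[] = [];
  for (const item of raw) {
    if (typeof item !== "object" || item === null) return null;
    const { role, content } = item as { role?: unknown; content?: unknown };
    // Only user and assistant turns may come from the client
    if (role !== "user" && role !== "assistant") return null;
    if (typeof content !== "string" || content.length === 0) return null;
    result.push({ role: role === "user" ? "user" : "assistant", content });
  }
  return result;
}
```

In the route, a `null` result would map to a 400 response before any tokens are spent.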
Consuming the Stream on the Client
The React component reads the stream and updates the UI token by token:
const sendMessage = async (content: string) => {
  const newMessages = [...messages, { role: "user", content }];
  setMessages(newMessages);
  setIsStreaming(true);
  try {
    const response = await fetch("/api/chat", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ messages: newMessages }),
    });
    if (!response.ok || !response.body) {
      throw new Error(`Chat request failed with status ${response.status}`);
    }
    const reader = response.body.getReader();
    const decoder = new TextDecoder();
    let assistantMessage = "";
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      // stream: true keeps multi-byte characters intact across chunk boundaries
      assistantMessage += decoder.decode(value, { stream: true });
      setMessages([
        ...newMessages,
        { role: "assistant", content: assistantMessage },
      ]);
    }
  } finally {
    // Clear the typing indicator even if the request or the stream fails
    setIsStreaming(false);
  }
};
Conversation History
The chatbot maintains context by sending the full message history with each request. But there is a limit --- models have a context window (number of tokens they can process). When the conversation gets too long, you need to trim older messages:
function trimMessages(
  messages: Message[],
  maxTokens: number = 4000
): Message[] {
  // Work on a copy so the caller's array (often React state) is not mutated
  const trimmed = [...messages];
  // Rough estimation: 1 token ~ 4 characters
  let totalChars = trimmed.reduce((sum, m) => sum + m.content.length, 0);
  // Drop the oldest messages first, always keeping at least the last two
  while (totalChars > maxTokens * 4 && trimmed.length > 2) {
    const removed = trimmed.shift();
    totalChars -= removed!.content.length;
  }
  return trimmed;
}
Keep the most recent messages and drop the oldest exchanges first. The system prompt is passed separately through the system parameter, so trimming never removes it.
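A variation worth considering: some chat APIs expect the history to open with a user message, and trimming from the front can leave an assistant reply first. The sketch below is one possible approach, not the lesson's implementation; `Turn` and `trimHistory` are hypothetical names, and it uses the same rough estimate of 4 characters per token.

```typescript
interface Turn {
  role: "user" | "assistant";
  content: string;
}

// Keep the newest messages that fit the budget (about 4 characters per token).
// The input array is not modified.
function trimHistory(turns: Turn[], maxTokens = 4000): Turn[] {
  const maxChars = maxTokens * 4;
  const kept: Turn[] = [];
  let used = 0;
  // Walk backwards from the newest message and stop once the budget is spent.
  for (let i = turns.length - 1; i >= 0; i--) {
    const turn = turns[i];
    if (turn === undefined) continue;
    // The newest message is always kept, even if it alone exceeds the budget
    if (kept.length > 0 && used + turn.content.length > maxChars) break;
    kept.unshift(turn);
    used += turn.content.length;
  }
  // Drop leading assistant messages so the history opens with a user turn.
  while (kept.length > 1 && kept[0]?.role === "assistant") kept.shift();
  return kept;
}
```

Walking from the newest message backwards means the budget is always spent on the most recent context first.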
Text Summarization Feature
Long task descriptions and project notes are common. Summarization makes them digestible.
The Prompt
Add a "Summarize" button to task descriptions that are longer than 200
characters. When clicked:
1. Send the description to the Claude API with the prompt: "Summarize
this task description in 2-3 bullet points. Be concise."
2. Show a loading spinner while processing
3. Display the summary below the original description in a highlighted box
4. Cache the summary in a new field on the Task model (summary: String?)
so the same description isn't summarized twice
5. Add a "Regenerate" button to create a new summary
The API Pattern
Every summarization request follows this pattern:
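export async function summarizeText(text: string): Promise<string> {
  const response = await ai.messages.create({
    // Small, fast model; check your provider's model list for the current ID
    model: "claude-3-5-haiku-20241022",
    max_tokens: 256,
    messages: [
      {
        role: "user",
        content: `Summarize this in 2-3 concise bullet points:\n\n${text}`,
      },
    ],
  });
  // content is a list of typed blocks; only text blocks have a text field
  const block = response.content[0];
  return block?.type === "text" ? block.text : "";
}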
Notice we use a small, inexpensive model here. Summarization does not require advanced reasoning, so a fast, low-cost model is usually sufficient. This is a key cost optimization strategy.
Caching Results
Caching prevents paying for the same summarization twice:
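async function getOrCreateSummary(taskId: string): Promise<string> {
  const task = await prisma.task.findUnique({
    where: { id: taskId },
    select: { description: true, summary: true },
  });
  // Fail with a clear error instead of asserting non-null on a missing task
  if (!task) throw new Error(`Task ${taskId} not found`);
  // Cache hit: reuse the stored summary
  if (task.summary) return task.summary;
  // Nothing to summarize, so skip the API call
  if (!task.description) return "";
  const summary = await summarizeText(task.description);
  await prisma.task.update({
    where: { id: taskId },
    data: { summary },
  });
  return summary;
}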
When the description changes, invalidate the cache by setting summary to null in the update handler.
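As a sketch of that invalidation step, the update handler can decide whether to clear the cached summary while building the update payload. `TaskPatch` and `buildTaskUpdate` are hypothetical names; the result is meant to be passed as the `data` argument of an update call.

```typescript
interface TaskPatch {
  title?: string;
  description?: string;
}

// Build the data for a task update. When the description actually changes,
// include summary: null so the cached summary is regenerated on the next request.
function buildTaskUpdate(
  currentDescription: string | null,
  patch: TaskPatch
): TaskPatch & { summary?: null } {
  const descriptionChanged =
    patch.description !== undefined && patch.description !== currentDescription;
  return descriptionChanged ? { ...patch, summary: null } : { ...patch };
}
```

Comparing against the current description avoids throwing away a valid summary when a form resubmits unchanged text.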
Semantic Search with Embeddings
Keyword search finds tasks containing specific words. Semantic search finds tasks with similar meaning. The difference is enormous.
Keyword search for "deploy the website" misses a task titled "push the app to production." Semantic search catches it because the meaning is similar.
What Embeddings Are
An embedding is a list of numbers (a vector) that represents the meaning of text. Similar meanings produce similar vectors. The AI model converts text into these vectors, and then you compare vectors using mathematical distance.
Think of it like coordinates on a map. "Deploy the website" and "push the app to production" end up near each other on the meaning map, even though they share zero words.
Generating Embeddings
// lib/embeddings.ts
import OpenAI from "openai";
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
export async function generateEmbedding(text: string): Promise<number[]> {
const response = await openai.embeddings.create({
model: "text-embedding-3-small",
input: text,
});
return response.data[0].embedding;
}
We use OpenAI's embedding model here because it is widely supported and cost-effective. The resulting vector has 1536 dimensions.
Storing Embeddings
For a simple implementation, store embeddings as JSON in your database:
Add an embedding field to the Task model as a JSON column. Create a
function that generates and stores embeddings for task titles when tasks
are created or updated. Create a search endpoint that:
1. Takes a search query
2. Generates an embedding for the query
3. Computes cosine similarity against all task embeddings
4. Returns the top 10 most similar tasks
For production, mention that pgvector is the recommended solution.
Cosine Similarity
Cosine similarity measures how closely two vectors point in the same direction. It ranges from -1 to 1: values near 1.0 indicate very similar meaning, and values near 0.0 indicate unrelated text:
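function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same number of dimensions");
  }
  let dotProduct = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dotProduct += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  // A zero vector has no direction; return 0 instead of dividing by zero
  return denominator === 0 ? 0 : dotProduct / denominator;
}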
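A worked example with toy numbers may help. The 2-D vectors below are made up for illustration (real embeddings have hundreds or thousands of dimensions), and `cosine` is a compact restatement of the same formula for equal-length vectors.

```typescript
// Compact cosine similarity for the worked example; assumes equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 2-D "embeddings" with invented values.
const deployWebsite = [0.9, 0.1];
const pushToProduction = [0.8, 0.3];
const buyMilk = [0.1, 0.9];

const similarScore = cosine(deployWebsite, pushToProduction); // about 0.97
const unrelatedScore = cosine(deployWebsite, buyMilk); // about 0.22
```

The first pair points in nearly the same direction, so it scores close to 1. The second pair shares little direction and scores much lower, which is what lets the search rank "push the app to production" above unrelated tasks.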
The Search Flow
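import { Prisma } from "@prisma/client";
// The return type is inferred: each result holds the selected fields plus a score
export async function semanticSearch(query: string, userId: string) {
  const queryEmbedding = await generateEmbedding(query);
  const tasks = await prisma.task.findMany({
    where: {
      project: { userId },
      // Assumes embedding is a nullable Json column, which is filtered with Prisma.DbNull
      embedding: { not: Prisma.DbNull },
    },
    select: {
      id: true,
      title: true,
      status: true,
      embedding: true,
    },
  });
  // Score every task, drop the raw vector from the result, and keep the 10 best
  return tasks
    .map(({ embedding, ...task }) => ({
      ...task,
      score: cosineSimilarity(queryEmbedding, embedding as number[]),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, 10);
}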
For production with thousands of tasks, you would use PostgreSQL with the pgvector extension, which handles similarity search at the database level with proper indexing. The prompt to migrate:
Replace the in-memory cosine similarity search with pgvector. Add a vector
column to the Task model, create an ivfflat index, and use the <=> operator
for cosine distance in a raw SQL query.
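To give a feel for what that migration produces, here is a rough sketch of the query side. The table and column names ("Task", "Project", "projectId", "userId", embedding) are assumptions about the schema, and `toVectorLiteral` is a hypothetical helper. pgvector accepts vectors as text literals such as '[0.1,0.2]', and `<=>` returns cosine distance, so similarity is 1 minus that value.

```typescript
// Format an embedding as a pgvector text literal, e.g. [0.1,0.2,0.3]
function toVectorLiteral(embedding: number[]): string {
  if (embedding.some((x) => !Number.isFinite(x))) {
    throw new Error("Embedding contains a non-finite value");
  }
  return `[${embedding.join(",")}]`;
}

// Parameterized query: $1 is the query vector literal, $2 is the user id.
// Schema names here are assumptions; adjust them to the real tables.
const SEMANTIC_SEARCH_SQL = `
  SELECT t.id, t.title, t.status, 1 - (t.embedding <=> $1::vector) AS score
  FROM "Task" t
  JOIN "Project" p ON p.id = t."projectId"
  WHERE p."userId" = $2 AND t.embedding IS NOT NULL
  ORDER BY t.embedding <=> $1::vector
  LIMIT 10
`;
```

With Prisma this could be run as `prisma.$queryRawUnsafe(SEMANTIC_SEARCH_SQL, toVectorLiteral(queryEmbedding), userId)`. The values travel as bound parameters rather than being concatenated into the SQL string.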
Content Generation
AI can draft content that users then refine. This saves enormous time for repetitive writing tasks.
The Prompt
Add content generation features to FlowTask:
1. "Generate Description" button when creating a task: user enters a title,
clicks the button, and AI generates a detailed task description with
acceptance criteria.
2. "Generate README" button on the project page: AI analyzes all tasks in
the project and generates a project overview document.
3. "Draft Email" button on completed tasks: AI drafts a status update email
to stakeholders based on the task details and completion notes.
Each generated piece of content should appear in an editable text area so
the user can modify it before saving. Include a "Regenerate" button for
each.
Templated Generation
The key to useful content generation is good prompts with context variables:
export async function generateTaskDescription(
title: string,
projectName: string
): Promise<string> {
const response = await ai.messages.create({
model: "claude-sonnet-4-20250514",
max_tokens: 512,
messages: [
{
role: "user",
content: `Generate a detailed task description for a task titled "${title}" in the project "${projectName}". Include:
- A clear objective (1-2 sentences)
- 3-5 acceptance criteria as a checklist
- Any relevant technical notes
Keep it concise and actionable.`,
},
],
});
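// content is a list of typed blocks; only text blocks have a text field
const block = response.content[0];
return block?.type === "text" ? block.text : "";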
}
The generated content is a starting point, not a final product. Always present it in an editable field so users can adjust it. This builds trust --- users feel in control rather than replaced.
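Because the title and project name are user input placed directly in a prompt, it can help to normalize them first. The following is a sketch, not a complete defense against prompt injection; `cleanForPrompt`, `buildDescriptionPrompt`, and the 120-character cap are assumptions for illustration.

```typescript
// Collapse whitespace, swap double quotes for single quotes, and cap the length
// so user text cannot close the quoted slot early or bloat the prompt.
function cleanForPrompt(value: string, maxLength = 120): string {
  return value.replace(/\s+/g, " ").replace(/"/g, "'").trim().slice(0, maxLength);
}

// Build the same prompt as above from cleaned inputs.
function buildDescriptionPrompt(title: string, projectName: string): string {
  return [
    `Generate a detailed task description for a task titled "${cleanForPrompt(title)}" in the project "${cleanForPrompt(projectName)}". Include:`,
    "- A clear objective (1-2 sentences)",
    "- 3-5 acceptance criteria as a checklist",
    "- Any relevant technical notes",
    "Keep it concise and actionable.",
  ].join("\n");
}
```

Keeping the template in its own pure function also makes it easy to unit test prompt wording without calling the API.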
AI-Powered Form Validation
Traditional validation checks format: is this a valid email? Is this field empty? AI validation checks meaning: does this task title actually describe a task? Is this description clear enough to be actionable?
The Prompt
Add AI-powered validation to the task creation form:
1. When the user blurs the title field, send it to a validation endpoint
2. The AI checks: is this a clear, actionable task title? If not, suggest
a better version.
3. Display the suggestion below the field as a clickable hint:
"Suggestion: [improved title]" — clicking it replaces the current title
4. Use the smallest model available (Haiku) for speed and cost
5. Debounce the validation to avoid excessive API calls (500ms delay)
6. Show a subtle loading indicator while validating
The Validation Pattern
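export async function validateTaskTitle(title: string): Promise<{
  isValid: boolean;
  suggestion?: string;
}> {
  if (title.length < 5) return { isValid: false };
  const response = await ai.messages.create({
    // Small, fast model; check your provider's model list for the current ID
    model: "claude-3-5-haiku-20241022",
    max_tokens: 100,
    messages: [
      {
        role: "user",
        content: `Is this a clear, actionable task title? "${title}"
Reply with JSON: {"isValid": true/false, "suggestion": "improved version or null"}
Only suggest improvements if the title is vague or unclear.`,
      },
    ],
  });
  const block = response.content[0];
  try {
    // The model can return malformed JSON, so a parse error must not reach the form
    const parsed = JSON.parse(block?.type === "text" ? block.text : "");
    return {
      isValid: parsed.isValid === true,
      suggestion:
        typeof parsed.suggestion === "string" ? parsed.suggestion : undefined,
    };
  } catch {
    // Fail open: an unparseable reply should not block the user
    return { isValid: true };
  }
}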
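Model output is text, not guaranteed JSON, and replies are sometimes wrapped in extra prose. A tolerant parser is one way to handle that. The sketch below uses hypothetical names (`TitleCheck`, `parseTitleCheck`) and returns `null` whenever the reply cannot be trusted.

```typescript
interface TitleCheck {
  isValid: boolean;
  suggestion?: string;
}

// Parse a model reply that should contain a JSON object. Extract the outermost
// {...} first, because the JSON may be surrounded by explanatory text.
function parseTitleCheck(raw: string): TitleCheck | null {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  try {
    const parsed: unknown = JSON.parse(raw.slice(start, end + 1));
    if (typeof parsed !== "object" || parsed === null) return null;
    const { isValid, suggestion } = parsed as Record<string, unknown>;
    if (typeof isValid !== "boolean") return null;
    // Keep the suggestion only when it is a non-empty string
    return typeof suggestion === "string" && suggestion.trim() !== ""
      ? { isValid, suggestion }
      : { isValid };
  } catch {
    return null;
  }
}
```

The caller can treat `null` as "no opinion" and simply skip the hint, so a bad reply never blocks task creation.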
This is a lightweight but impressive feature. Users notice when an app gives smart suggestions, and it costs fractions of a cent per validation.
Streaming Responses in Detail
Streaming deserves a deeper look because it impacts every AI feature's user experience.
Why Streaming Matters
Without streaming, the flow is: the user clicks, waits several seconds seeing nothing, then the full response appears at once. With streaming, the first words typically appear within a few hundred milliseconds and the rest flows in as it is generated. The total generation time is about the same, but the perceived latency drops dramatically.
The Server-Sent Events Pattern
For features beyond the chatbot, you can use Server-Sent Events (SSE):
// app/api/generate/route.ts
import { NextRequest, NextResponse } from "next/server";
import { ai } from "@/lib/ai";
export async function POST(request: NextRequest) {
const { prompt } = await request.json();
const stream = await ai.messages.stream({
model: "claude-sonnet-4-20250514",
max_tokens: 1024,
messages: [{ role: "user", content: prompt }],
});
const encoder = new TextEncoder();
const readable = new ReadableStream({
async start(controller) {
for await (const event of stream) {
if (
event.type === "content_block_delta" &&
event.delta.type === "text_delta"
) {
const data = `data: ${JSON.stringify({ text: event.delta.text })}\n\n`;
controller.enqueue(encoder.encode(data));
}
}
controller.enqueue(encoder.encode("data: [DONE]\n\n"));
controller.close();
},
});
return new NextResponse(readable, {
headers: {
"Content-Type": "text/event-stream",
"Cache-Control": "no-cache",
Connection: "keep-alive",
},
});
}
Error Handling for Streams
Streams can fail mid-delivery. Handle this gracefully:
const consumeStream = async (response: Response) => {
  const reader = response.body?.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  try {
    while (reader) {
      const { done, value } = await reader.read();
      if (done) break;
      // A network chunk can end mid-line, so buffer until a full line arrives
      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split("\n");
      buffer = lines.pop() ?? ""; // keep the trailing partial line for the next read
      for (const line of lines) {
        if (!line.startsWith("data: ")) continue;
        const data = line.slice(6); // Remove "data: "
        if (data === "[DONE]") return;
        const parsed = JSON.parse(data);
        setText((prev) => prev + parsed.text);
      }
    }
  } catch (error) {
    setError("Connection lost. Please try again.");
  } finally {
    reader?.releaseLock();
  }
};
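One failure mode is easy to miss: a network chunk can end in the middle of a `data:` line, and parsing that fragment as JSON throws. The sketch below isolates the line buffering into a small pure parser so the behavior can be tested without a network; `createSseParser` is a hypothetical helper and handles only the simple `data: ...` framing used in this lesson.

```typescript
// Incremental parser for "data: ...\n\n" framing.
// feed() accepts arbitrary chunks and returns the payloads completed so far.
function createSseParser() {
  let buffer = "";
  return {
    feed(chunk: string): string[] {
      buffer += chunk;
      const lines = buffer.split("\n");
      buffer = lines.pop() ?? ""; // the last piece may be an incomplete line
      const payloads: string[] = [];
      for (const line of lines) {
        if (line.startsWith("data: ")) payloads.push(line.slice(6));
      }
      return payloads;
    },
  };
}

const sseParser = createSseParser();
// The first chunk ends mid-line, so nothing is emitted yet.
const firstBatch = sseParser.feed('data: {"text":"Hel');
// The second chunk completes that line and then delivers the DONE marker.
const secondBatch = sseParser.feed('lo"}\n\ndata: [DONE]\n\n');
```

Here `firstBatch` is empty and `secondBatch` holds the completed JSON payload followed by `[DONE]`, so `JSON.parse` only ever sees whole lines.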
Cost Management and Token Optimization
AI API calls cost money. Without careful management, costs can spiral quickly.
Understanding Token Pricing
Tokens are the units of text that AI models process. Roughly, one token equals about four English characters or three-quarters of a word. Both input tokens (what you send) and output tokens (what the model generates) have costs, with output tokens typically costing more.
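As a worked example of that arithmetic, the sketch below estimates tokens from character count and converts token counts to dollars. The rates ($3 per million input tokens, $15 per million output tokens) are example numbers only, and the helper names are hypothetical. Real token counts come from the `usage` field of each response.

```typescript
// Rough heuristic: about 4 characters per token for English text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Prices expressed in dollars per million tokens.
interface Rates {
  inputPerMillion: number;
  outputPerMillion: number;
}

function estimateCost(inputTokens: number, outputTokens: number, rates: Rates): number {
  return (
    (inputTokens * rates.inputPerMillion + outputTokens * rates.outputPerMillion) /
    1_000_000
  );
}

// Example rates only; check your provider's pricing page for real numbers.
const exampleRates: Rates = { inputPerMillion: 3, outputPerMillion: 15 };

// 1,000 input tokens cost $0.003 and 500 output tokens cost $0.0075.
const exampleCost = estimateCost(1000, 500, exampleRates); // 0.0105 dollars
```

At these example rates the 500 output tokens cost more than the 1,000 input tokens, which is why capping response length matters as much as trimming context.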
Strategies for Cost Control
Use the smallest model that works. Summarization, classification, and simple extraction do not need the most powerful model. Use Haiku or GPT-4o Mini for these tasks and reserve Sonnet or GPT-4o for complex reasoning.
Cache aggressively. If the same input produces the same output, cache it. Task description summarization is a perfect candidate --- the description rarely changes, so the summary can be stored.
Truncate context. Do not send the user's entire task history as context if the question only requires recent data. Trim to what is relevant:
function buildContext(tasks: Task[], maxChars: number = 2000): string {
let context = "";
for (const task of tasks) {
const line = `- ${task.title} (${task.status})\n`;
if (context.length + line.length > maxChars) break;
context += line;
}
return context;
}
Set max_tokens appropriately. A summarization that should produce 2-3 sentences does not need max_tokens: 4096. Set it to 256 or less. This limits both cost and response time.
Monitor usage. Track API spending per feature:
export async function trackUsage(
  userId: string,
  feature: string,
  inputTokens: number,
  outputTokens: number
) {
  await prisma.apiUsage.create({
    data: {
      userId, // stored so per-user budgets can be computed from this table
      feature,
      inputTokens,
      outputTokens,
      estimatedCost:
        inputTokens * 0.000003 + outputTokens * 0.000015, // Example rates
      timestamp: new Date(),
    },
  });
}
Build a simple dashboard showing daily costs by feature. This visibility helps you identify which features are expensive and where to optimize.
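The aggregation behind such a dashboard can be quite small. The following is a sketch that groups already-fetched usage rows in memory; `UsageRow` and `dailyCostByFeature` are hypothetical names, and days are bucketed in UTC. For large tables a database-side GROUP BY would be the better fit.

```typescript
interface UsageRow {
  feature: string;
  estimatedCost: number;
  timestamp: Date;
}

// Group usage rows by UTC day and feature, summing the estimated cost.
// Result shape: { "2025-01-15": { chat: 0.75, summarize: 0.125 } }
function dailyCostByFeature(
  rows: UsageRow[]
): Record<string, Record<string, number>> {
  const totals: Record<string, Record<string, number>> = {};
  for (const row of rows) {
    const day = row.timestamp.toISOString().slice(0, 10); // YYYY-MM-DD in UTC
    const perFeature = totals[day] ?? (totals[day] = {});
    perFeature[row.feature] = (perFeature[row.feature] ?? 0) + row.estimatedCost;
  }
  return totals;
}
```

The nested object maps directly onto a table with one row per day and one column per feature.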
Setting Budgets
Implement a hard spending limit:
async function checkBudget(userId: string): Promise<boolean> {
const monthStart = new Date();
monthStart.setDate(1);
monthStart.setHours(0, 0, 0, 0);
const usage = await prisma.apiUsage.aggregate({
where: { userId, timestamp: { gte: monthStart } },
_sum: { estimatedCost: true },
});
const monthlyBudget = 50; // $50 per user per month
return (usage._sum.estimatedCost || 0) < monthlyBudget;
}
When the budget is exceeded, gracefully degrade: disable non-essential AI features while keeping core functionality working.
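One possible shape for that degradation is a feature gate driven by the spend figure. This is a sketch under assumptions: the feature names are hypothetical, and which AI features (if any) stay on over budget is a product decision, so the `ESSENTIAL` list here is only a placeholder.

```typescript
type AiFeature = "chat" | "search" | "summarize" | "generate" | "validate";

const ALL_FEATURES: AiFeature[] = ["chat", "search", "summarize", "generate", "validate"];
// Placeholder choice: keep only semantic search when the budget is exhausted.
const ESSENTIAL: AiFeature[] = ["search"];

// Full feature set while under budget, essentials only once it is used up.
function enabledFeatures(spent: number, monthlyBudget: number): AiFeature[] {
  return spent < monthlyBudget ? ALL_FEATURES : ESSENTIAL;
}

function isFeatureEnabled(
  feature: AiFeature,
  spent: number,
  monthlyBudget: number
): boolean {
  return enabledFeatures(spent, monthlyBudget).some((f) => f === feature);
}
```

The UI can call `isFeatureEnabled` to hide or disable buttons, while each API route repeats the same check so the limit cannot be bypassed from the client.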
What's Next
FlowTask is now an intelligent application. It has a chatbot, summarization, semantic search, content generation, and smart validation --- all powered by AI APIs.
In the next lesson, we shift gears from building features to optimizing how you build. We are diving into advanced vibe coding techniques: MCP servers that connect your AI to external tools, Plan Mode for complex features, custom slash commands, subagents, hooks, and building your personal AI development environment. These techniques will make every future project faster.