Chapter 19: Integrating AI Capabilities (Vercel AI SDK)
In the first 18 chapters of this book, we've built an extremely solid SAAS foundation. We have all the core components for payment (Stripe), authentication (better-auth), database (Drizzle), DevOps (GitHub Actions), and feature releases (Feature Flags). Now, it's time to inject "intelligence" into our SAAS—this is the promise of "AI SAAS" in the book's subtitle.
Coming from Python, you are probably used to calling the OpenAI API with the requests library, or talking to AWS Bedrock with boto3. These calls typically live in a Flask or Django backend, where you manually handle the API request, errors, and retries, then return the final JSON to the frontend (which figures out how to render it).
This workflow becomes painful once you need AI streaming responses. How do you proxy a chunked HTTP response from a Python backend to a React frontend and update the UI in real time? It usually means hand-rolling a WebSocket or Server-Sent Events (SSE) setup.
Vercel AI SDK was born to solve this problem. It's a framework-agnostic (but extremely Next.js-friendly) toolkit designed to build AI-powered, streaming-first user interfaces in JavaScript/TypeScript applications in the simplest way possible.
In this chapter, we'll practice with the Vercel AI SDK and explore how to build a pluggable backend architecture to support multiple AI providers.
19.1. [Skill Practice]: Quickly Implement Streaming with Vercel AI SDK
[Skill Practice: claude-nextjs-skills/vercel-ai-sdk/SKILL.md]
Vercel AI SDK is not an AI model. It's an adapter and UI helper. Its core value lies in providing two powerful React Hooks:
- useChat(): for building ChatGPT-like multi-turn conversation interfaces.
- useCompletion(): for simple "input-output" style completions (e.g., "summarize this text").
These Hooks automatically handle all frontend complexity for you: managing message lists (messages), handling user input (input), tracking loading states (isLoading), and rendering streaming responses to the UI in real-time.
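We'll spend the rest of this section on useChat. As a quick taste of useCompletion, here is a minimal, hypothetical sketch (the component path, styling, and the /api/completion route are assumptions; that route would look like the chat route below, except it receives a single prompt field instead of a messages array):
// src/components/summarize/SummarizeBox.tsx (hypothetical path)
'use client';
import { useCompletion } from 'ai/react';
export function SummarizeBox() {
  // `useCompletion` manages a single prompt/response pair instead of a
  // message history; by default it POSTs `{ prompt }` to /api/completion
  const { completion, input, handleInputChange, handleSubmit, isLoading } =
    useCompletion({ api: '/api/completion' });
  return (
    <form onSubmit={handleSubmit} className="space-y-2">
      <textarea
        value={input}
        onChange={handleInputChange}
        placeholder="Paste text to summarize..."
        className="w-full p-2 border rounded-lg"
      />
      <button type="submit" disabled={isLoading} className="p-2 bg-blue-500 text-white rounded-lg">
        Summarize
      </button>
      {/* The streamed completion text grows in real time */}
      <p className="whitespace-pre-wrap">{completion}</p>
    </form>
  );
}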
Let's build a classic chatbot interface. This requires two parts:
File 1: src/app/api/chat/route.ts (Backend API)
First, we need a Route Handler to receive requests from the frontend and securely call the AI API. Vercel AI SDK provides seamless integration with libraries like OpenAI, Google Gemini, Anthropic, etc.
// src/app/api/chat/route.ts
import { OpenAI } from 'openai';
import { OpenAIStream, StreamingTextResponse } from 'ai';
// Create OpenAI client from .env.local
const openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY,
});
// Key: Switch Vercel deployment to Edge runtime
// This makes responses extremely fast as it runs on global edge nodes
export const runtime = 'edge';
export async function POST(req: Request) {
try {
// Parse messages from the `useChat` Hook's request
const { messages } = await req.json();
// Check credits (from Chapter 12)
// (Pseudo-code: you need a method to get userId from request)
// const userId = ...;
// const creditResult = await consumeCredits(userId, 1, 'AI Chat Completion');
// if (creditResult.error) {
// return new Response(JSON.stringify(creditResult), { status: 402 }); // 402 Payment Required
// }
// 1. Call OpenAI API
const response = await openai.chat.completions.create({
model: 'gpt-4o-mini',
stream: true, // Key: enable streaming response
messages: messages,
});
// 2. Convert OpenAI's response to Vercel AI SDK's stream
const stream = OpenAIStream(response);
// 3. Return streaming response to frontend
// This automatically handles all chunked transfer complexity
return new StreamingTextResponse(stream);
} catch (error) {
console.error('Chat API Error:', error);
return new Response(JSON.stringify({ error: 'Failed to get response' }), { status: 500 });
}
}

Architecture Analysis:
- runtime = 'edge': This is a major performance optimization. Our chat API is no longer a Node.js Serverless Function but an Edge Function distributed globally, which dramatically reduces the time to first byte (TTFB) of AI responses.
- OpenAIStream and StreamingTextResponse: This is where the Vercel AI SDK's magic lies. You don't need to manually manage chunks: just pipe OpenAI's raw stream through OpenAIStream, then wrap it in StreamingTextResponse and return it.
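Before we move on to the frontend, here is one way the commented-out credit check could be wired in. This is only a sketch: consumeCredits is the helper from Chapter 12, while getUserIdFromRequest and both import paths are hypothetical placeholders for however your app resolves the session on the Edge runtime (e.g., via better-auth).
// Hypothetical sketch: deduct credits before calling the model.
// `consumeCredits` is the Chapter 12 helper; `getUserIdFromRequest` and the
// import paths are placeholders for your own session/credits modules.
import { consumeCredits } from '@/lib/credits';
import { getUserIdFromRequest } from '@/lib/auth-helpers';

export async function checkCreditsOrFail(req: Request): Promise<Response | null> {
  const userId = await getUserIdFromRequest(req);
  if (!userId) {
    return new Response(JSON.stringify({ error: 'Unauthorized' }), { status: 401 });
  }

  const creditResult = await consumeCredits(userId, 1, 'AI Chat Completion');
  if (creditResult.error) {
    // 402 Payment Required: the frontend maps this to an "upgrade your plan" toast
    return new Response(JSON.stringify(creditResult), { status: 402 });
  }

  return null; // null means "enough credits, carry on"
}
Inside POST, you would call it right after parsing the body: const failure = await checkCreditsOrFail(req); if (failure) return failure;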
File 2: src/components/chat/ChatInterface.tsx (Frontend UI)
Now for the frontend. In a client component, we only need to use the useChat Hook to drive the entire UI.
// src/components/chat/ChatInterface.tsx
'use client';
import { useChat } from 'ai/react';
import { useEffect } from 'react';
import { toast } from 'sonner'; // (Assuming using sonner for toasts)
export function ChatInterface() {
const { messages, input, handleInputChange, handleSubmit, isLoading, error } = useChat({
// `useChat` automatically sends POST requests to this API
api: '/api/chat',
// (Optional) When streaming response completes
onFinish: (message) => {
console.log('Stream finished. Final message:', message);
// You can trigger "final confirmation" of credit consumption here
},
// (Optional) Handle errors
onError: (err) => {
if (err.message.includes('402')) {
toast.error('Insufficient credits. Please upgrade your plan.');
} else {
toast.error('An error occurred. Please try again.');
}
}
});
return (
<div className="flex flex-col h-[500px] border rounded-lg">
{/* Message display area */}
<div className="flex-1 overflow-y-auto p-4 space-y-2">
{messages.map((m) => (
<div key={m.id} className={`p-2 rounded-lg ${
m.role === 'user' ? 'bg-blue-100 text-right' : 'bg-gray-100'
}`}>
<span className="font-bold">
{m.role === 'user' ? 'You' : 'AI'}:
</span>
<span className="whitespace-pre-wrap">{m.content}</span>
</div>
))}
{isLoading && <div className="text-gray-500">AI is typing...</div>}
</div>
{/* Input form */}
{/* `handleSubmit` automatically handles form submission,
adds user message to `messages` list,
and calls the API
*/}
<form onSubmit={handleSubmit} className="flex p-2 border-t">
<input
value={input}
onChange={handleInputChange}
placeholder="Ask me anything..."
className="flex-1 p-2 border rounded-lg"
disabled={isLoading}
/>
<button type="submit" disabled={isLoading} className="ml-2 p-2 bg-blue-500 text-white rounded-lg">
Send
</button>
</form>
</div>
);
}

That's it! With just two files, we've built a fully functional, streaming-capable AI chat interface that handles loading and error states. The useChat Hook acts like a state machine, managing everything for us.
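Beyond the fields we destructured above, useChat (in the ai/react version used here) also returns helpers such as stop, reload, and setMessages. Here is a hedged sketch of a "Stop / Regenerate" control; the id option is intended to share conversation state between hook instances, but depending on your SDK version you may prefer to simply destructure stop and reload from the same useChat call inside ChatInterface:
// src/components/chat/ChatControls.tsx (hypothetical)
'use client';
import { useChat } from 'ai/react';
export function ChatControls() {
  // Same `id` + `api` as the main <ChatInterface /> so both hooks see the same chat
  const { isLoading, stop, reload, messages } = useChat({ id: 'main-chat', api: '/api/chat' });
  return (
    <div className="flex gap-2 p-2">
      {/* Abort the in-flight streaming request */}
      {isLoading && (
        <button type="button" onClick={() => stop()} className="p-2 border rounded-lg">
          Stop generating
        </button>
      )}
      {/* Ask the API to regenerate the last assistant message */}
      {!isLoading && messages.length > 0 && (
        <button type="button" onClick={() => reload()} className="p-2 border rounded-lg">
          Regenerate
        </button>
      )}
    </div>
  );
}
If you go this route, pass the same id to the useChat call in ChatInterface as well, so both components operate on one conversation.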
19.2. [Code Analysis]: Analyzing the AI Abstraction Layer (Avoiding Vendor Lock-in)
The solution in section 19.1 is excellent, but it has an architectural flaw: our src/app/api/chat/route.ts hardcodes openai.
- What if the OpenAI API goes down?
- What if Google's Gemini Pro is cheaper and better for certain tasks?
- What if we want to integrate Replicate (Stable Diffusion) for a generateImage() feature?
A mature AI SAAS must never lock itself into a single AI provider. We need an Abstraction Layer.
Let's design a hypothetical src/lib/ai-provider.ts (or src/ai-interface.ts).
// src/lib/ai-provider.ts
import { OpenAI } from 'openai';
import { GoogleGenerativeAI } from '@google/generative-ai';
import { OpenAIStream, GoogleGenerativeAIStream } from 'ai';
import { ChatCompletionMessageParam } from 'openai/resources';
// 1. Define a unified provider interface
interface IAiChatProvider {
// Accept a generic message array, return a readable stream that
// `StreamingTextResponse` can consume
createChatStream(
messages: ChatCompletionMessageParam[]
): Promise<ReadableStream>;
}
// 2. Implement OpenAI provider
class OpenAiProvider implements IAiChatProvider {
private openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
async createChatStream(messages: ChatCompletionMessageParam[]) {
const response = await this.openai.chat.completions.create({
model: 'gpt-4o-mini',
stream: true,
messages: messages,
});
return OpenAIStream(response);
}
}
// 3. Implement Google Gemini provider
class GoogleGeminiProvider implements IAiChatProvider {
private genAI = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY!);
// (Note: Gemini's message format differs from OpenAI's, so we need a converter.
// Gemini only knows the roles 'user' and 'model', so 'assistant' must be mapped.)
private convertMessagesToGemini(messages: ChatCompletionMessageParam[]) {
return messages.map((m) => ({
role: m.role === 'assistant' ? 'model' : 'user',
parts: [{ text: m.content as string }],
}));
}
async createChatStream(messages: ChatCompletionMessageParam[]) {
const model = this.genAI.getGenerativeModel({ model: 'gemini-1.5-pro' });
// Gemini needs message format conversion
const geminiMessages = this.convertMessagesToGemini(messages);
const stream = await model.generateContentStream({
contents: geminiMessages,
});
// Use Vercel AI SDK's Google stream converter
return GoogleGenerativeAIStream(stream);
}
}
// 4. (Optional) Implement Replicate provider (for images)
// ... Similarly, can define a `createImageStream` interface for image generation ...
// 5. Create a factory function to select provider
function getAiProvider(): IAiChatProvider {
const provider = process.env.AI_PROVIDER || 'openai';
if (provider === 'google') {
return new GoogleGeminiProvider();
}
// Default to OpenAI
return new OpenAiProvider();
}
// 6. Export our main AI interface
export const ai = {
chat: getAiProvider(),
// image: getImageProvider(), // (For image features)
};

Integration: Refactoring src/app/api/chat/route.ts
Now, our API route becomes extremely clean and "agnostic". It no longer cares who the AI provider is.
// src/app/api/chat/route.ts (refactored)
import { StreamingTextResponse } from 'ai';
import { ai } from '@/lib/ai-provider'; // Import our abstraction layer
export const runtime = 'edge';
export async function POST(req: Request) {
try {
const { messages } = await req.json();
// ... (credit check logic) ...
// 1. Call abstraction interface, not OpenAI
// ai.chat might be OpenAI or Google, this API doesn't care
const stream = await ai.chat.createChatStream(messages);
// 2. Return stream to frontend
return new StreamingTextResponse(stream);
} catch (error) {
console.error('Chat API Error:', error);
return new Response(JSON.stringify({ error: 'Failed to get response' }), { status: 500 });
}
}

With this architecture, we can switch our SAAS from OpenAI to Google Gemini by changing a single environment variable, AI_PROVIDER="google". On Vercel, an environment variable change takes effect on the next deployment; with Vercel Edge Config you can even flip the provider at runtime without redeploying. Either way, we gain high availability and cost control.
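If you want the runtime switch, one option is to read the provider name per request from Edge Config. The sketch below is a hypothetical variant added to src/lib/ai-provider.ts: `get` comes from @vercel/edge-config, while the 'ai-provider' item name and the fallback chain are our own conventions, not part of any SDK.
// src/lib/ai-provider.ts (hypothetical variant): resolve the provider per request
// `get` is from @vercel/edge-config; 'ai-provider' is our own item name
import { get } from '@vercel/edge-config';

export async function getAiProviderDynamic(): Promise<IAiChatProvider> {
  // Prefer the Edge Config value, fall back to the env var, then to OpenAI
  const name =
    (await get<string>('ai-provider').catch(() => undefined)) ??
    process.env.AI_PROVIDER ??
    'openai';
  return name === 'google' ? new GoogleGeminiProvider() : new OpenAiProvider();
}
The route would then call await getAiProviderDynamic() inside each request and use its createChatStream, instead of relying on the module-level ai object, which picks a provider once at import time.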
Chapter 19 Summary
In this chapter, we finally injected core "intelligence" into our SAAS.
- Vercel AI SDK: We leveraged the useChat Hook (frontend) and StreamingTextResponse (backend) from the ai package to build a fully functional, streaming-capable AI chat interface with minimal effort. This solves the massive pain point of handling streaming data in traditional architectures.
- Edge Runtime: We deployed our AI API route to the edge runtime, dramatically reducing access latency for global users.
- Architecture Decoupling (Abstraction Layer): We designed an ai-provider.ts abstraction layer that defines a unified IAiChatProvider interface, decoupling our API routes from specific AI providers (like OpenAI or Google).
- Avoiding Vendor Lock-in: This abstraction layer gives our SAAS extreme flexibility, allowing us to switch the underlying AI model through a single environment variable based on cost, performance, or availability. This is key to building a robust, scalable AI SAAS.