Chapter 19: Integrating AI Capabilities (Vercel AI SDK)
In the first 18 chapters of this book, we've built an extremely solid SAAS foundation. We have all the core components for payment (Stripe), authentication (better-auth), database (Drizzle), DevOps (GitHub Actions), and feature releases (Feature Flags). Now, it's time to inject "intelligence" into our SAAS—this is the promise of "AI SAAS" in the book's subtitle.
Coming from Python, you are probably used to calling the OpenAI API with the requests library, or talking to AWS Bedrock with boto3. These calls typically live in a Flask or Django backend, where you manually handle the API request, errors, and retries, then return the final JSON to the frontend (which figures out how to render it).
This workflow becomes painful once you need AI streaming responses. How do you proxy a chunked HTTP response from a Python backend to a React frontend and update the UI in real time? It usually means hand-rolling a WebSocket or Server-Sent Events (SSE) setup.
Vercel AI SDK was born to solve this problem. It's a framework-agnostic (but extremely Next.js-friendly) toolkit designed to build AI-powered, streaming-first user interfaces in JavaScript/TypeScript applications in the simplest way possible.
In this chapter, we'll practice with the Vercel AI SDK and explore how to build a pluggable backend architecture to support multiple AI providers.
19.1. [Skill Practice]: Quickly Implement Streaming with Vercel AI SDK
[Skill Practice: claude-nextjs-skills/vercel-ai-sdk/SKILL.md]
Vercel AI SDK is not an AI model. It's an adapter and UI helper. Its core value lies in providing two powerful React Hooks:
- useChat(): for building ChatGPT-like multi-turn conversation interfaces.
- useCompletion(): for simple "input-output" style completions (e.g., "summarize this text").
These Hooks automatically handle all frontend complexity for you: managing message lists (messages), handling user input (input), tracking loading states (isLoading), and rendering streaming responses to the UI in real-time.
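We'll spend the rest of this section on useChat. As a quick taste of useCompletion, here is a minimal, hypothetical sketch (the component path, styling, and the /api/completion route are assumptions; that route would look like the chat route below, except it receives a single prompt field instead of a messages array):
// src/components/summarize/SummarizeBox.tsx (hypothetical path)
'use client';
import { useCompletion } from 'ai/react';
export function SummarizeBox() {
  // `useCompletion` manages a single prompt/response pair instead of a
  // message history; by default it POSTs `{ prompt }` to /api/completion
  const { completion, input, handleInputChange, handleSubmit, isLoading } =
    useCompletion({ api: '/api/completion' });
  return (
    <form onSubmit={handleSubmit} className="space-y-2">
      <textarea
        value={input}
        onChange={handleInputChange}
        placeholder="Paste text to summarize..."
        className="w-full p-2 border rounded-lg"
      />
      <button type="submit" disabled={isLoading} className="p-2 bg-blue-500 text-white rounded-lg">
        Summarize
      </button>
      {/* The streamed completion text grows in real time */}
      <p className="whitespace-pre-wrap">{completion}</p>
    </form>
  );
}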
Let's build a classic chatbot interface. This requires two parts:
File 1: src/app/api/chat/route.ts (Backend API)
First, we need a Route Handler to receive requests from the frontend and securely call the AI API. Vercel AI SDK provides seamless integration with libraries like OpenAI, Google Gemini, Anthropic, etc.
// src/app/api/chat/route.ts
import { OpenAI } from 'openai';
import { OpenAIStream, StreamingTextResponse } from 'ai';
// Create OpenAI client from .env.local
const openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY,
});
// Key: Switch Vercel deployment to Edge runtime
// This makes responses extremely fast as it runs on global edge nodes
export const runtime = 'edge';
export async function POST(req: Request) {
try {
// Parse messages from the `useChat` Hook's request
const { messages } = await req.json();
// Check credits (from Chapter 12)
// (Pseudo-code: you need a method to get userId from request)
// const userId = ...;
// const creditResult = await consumeCredits(userId, 1, 'AI Chat Completion');
// if (creditResult.error) {
// return new Response(JSON.stringify(creditResult), { status: 402 }); // 402 Payment Required
// }
// 1. Call OpenAI API
const response = await openai.chat.completions.create({
model: 'gpt-4o-mini',
stream: true, // Key: enable streaming response
messages: messages,
});
// 2. Convert OpenAI's response to Vercel AI SDK's stream
const stream = OpenAIStream(response);
// 3. Return streaming response to frontend
// This automatically handles all chunked transfer complexity
return new StreamingTextResponse(stream);
} catch (error) {
console.error('Chat API Error:', error);
return new Response(JSON.stringify({ error: 'Failed to get response' }), { status: 500 });
}
}

Architecture Analysis:
- runtime = 'edge': This is a major performance optimization. Our chat API is no longer a Node.js Serverless Function but an Edge Function distributed globally, which dramatically reduces the time to first byte (TTFB) of AI responses.
- OpenAIStream and StreamingTextResponse: This is where the Vercel AI SDK's magic lies. You don't need to manually manage chunks: just pipe OpenAI's raw stream through OpenAIStream, then wrap it in StreamingTextResponse and return it.
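Before we move on to the frontend, here is one way the commented-out credit check could be wired in. This is only a sketch: consumeCredits is the helper from Chapter 12, while getUserIdFromRequest and both import paths are hypothetical placeholders for however your app resolves the session on the Edge runtime (e.g., via better-auth).
// Hypothetical sketch: deduct credits before calling the model.
// `consumeCredits` is the Chapter 12 helper; `getUserIdFromRequest` and the
// import paths are placeholders for your own session/credits modules.
import { consumeCredits } from '@/lib/credits';
import { getUserIdFromRequest } from '@/lib/auth-helpers';

export async function checkCreditsOrFail(req: Request): Promise<Response | null> {
  const userId = await getUserIdFromRequest(req);
  if (!userId) {
    return new Response(JSON.stringify({ error: 'Unauthorized' }), { status: 401 });
  }

  const creditResult = await consumeCredits(userId, 1, 'AI Chat Completion');
  if (creditResult.error) {
    // 402 Payment Required: the frontend maps this to an "upgrade your plan" toast
    return new Response(JSON.stringify(creditResult), { status: 402 });
  }

  return null; // null means "enough credits, carry on"
}
Inside POST, you would call it right after parsing the body: const failure = await checkCreditsOrFail(req); if (failure) return failure;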
File 2: src/components/chat/ChatInterface.tsx (Frontend UI)
Now for the frontend. In a client component, we only need to use the useChat Hook to drive the entire UI.
// src/components/chat/ChatInterface.tsx
'use client';
import { useChat } from 'ai/react';
import { useEffect } from 'react';
import { toast } from 'sonner'; // (Assuming using sonner for toasts)
export function ChatInterface() {
const { messages, input, handleInputChange, handleSubmit, isLoading, error } = useChat({
// `useChat` automatically sends POST requests to this API
api: '/api/chat',
// (Optional) When streaming response completes
onFinish: (message) => {
console.log('Stream finished. Final message:', message);
// You can trigger "final confirmation" of credit consumption here
},
// (Optional) Handle errors
onError: (err) => {
if (err.message.includes('402')) {
toast.error('Insufficient credits. Please upgrade your plan.');
} else {
toast.error('An error occurred. Please try again.');
}
}
});
return (
<div className="flex flex-col h-[500px] border rounded-lg">
{/* Message display area */}
<div className="flex-1 overflow-y-auto p-4 space-y-2">
{messages.map((m) => (
<div key={m.id} className={`p-2 rounded-lg ${
m.role === 'user' ? 'bg-blue-100 text-right' : 'bg-gray-100'
}`}>
<span className="font-bold">
{m.role === 'user' ? 'You' : 'AI'}:
</span>
<span className="whitespace-pre-wrap">{m.content}</span>
</div>
))}
{isLoading && <div className="text-gray-500">AI is typing...</div>}
</div>
{/* Input form */}
{/* `handleSubmit` automatically handles form submission,
adds user message to `messages` list,
and calls the API
*/}
<form onSubmit={handleSubmit} className="flex p-2 border-t">
<input
value={input}
onChange={handleInputChange}
placeholder="Ask me anything..."
className="flex-1 p-2 border rounded-lg"
disabled={isLoading}
/>
<button type="submit" disabled={isLoading} className="ml-2 p-2 bg-blue-500 text-white rounded-lg">
Send
</button>
</form>
</div>
);
}

That's it! With just two files, we've built a fully functional, streaming-capable AI chat interface that handles loading and error states. The useChat Hook acts like a state machine, managing everything for us.
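Beyond the fields we destructured above, useChat (in the ai/react version used here) also returns helpers such as stop, reload, and setMessages. Here is a hedged sketch of a "Stop / Regenerate" control; the id option is intended to share conversation state between hook instances, but depending on your SDK version you may prefer to simply destructure stop and reload from the same useChat call inside ChatInterface:
// src/components/chat/ChatControls.tsx (hypothetical)
'use client';
import { useChat } from 'ai/react';
export function ChatControls() {
  // Same `id` + `api` as the main <ChatInterface /> so both hooks see the same chat
  const { isLoading, stop, reload, messages } = useChat({ id: 'main-chat', api: '/api/chat' });
  return (
    <div className="flex gap-2 p-2">
      {/* Abort the in-flight streaming request */}
      {isLoading && (
        <button type="button" onClick={() => stop()} className="p-2 border rounded-lg">
          Stop generating
        </button>
      )}
      {/* Ask the API to regenerate the last assistant message */}
      {!isLoading && messages.length > 0 && (
        <button type="button" onClick={() => reload()} className="p-2 border rounded-lg">
          Regenerate
        </button>
      )}
    </div>
  );
}
If you go this route, pass the same id to the useChat call in ChatInterface as well, so both components operate on one conversation.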
19.2. [Code Analysis]: Analyzing the AI Abstraction Layer (Avoiding Vendor Lock-in)
The solution in section 19.1 is excellent, but it has an architectural flaw: our src/app/api/chat/route.ts hardcodes openai.
- What if the OpenAI API goes down?
- What if Google's Gemini Pro is cheaper and better for certain tasks?
- What if we want to integrate Replicate (Stable Diffusion) for a generateImage() feature?
A mature AI SAAS must never lock itself into a single AI provider. We need an Abstraction Layer.
Let's design a hypothetical src/lib/ai-provider.ts (or src/ai-interface.ts).
// src/lib/ai-provider.ts
import { OpenAI } from 'openai';
import { GoogleGenerativeAI } from '@google/generative-ai';
import { OpenAIStream, GoogleGenerativeAIStream } from 'ai';
import { ChatCompletionMessageParam } from 'openai/resources';
// 1. Define a unified provider interface
interface IAiChatProvider {
// Accept a generic message array, return a readable stream that
// `StreamingTextResponse` can consume
createChatStream(
messages: ChatCompletionMessageParam[]
): Promise<ReadableStream>;
}
// 2. Implement OpenAI provider
class OpenAiProvider implements IAiChatProvider {
private openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
async createChatStream(messages: ChatCompletionMessageParam[]) {
const response = await this.openai.chat.completions.create({
model: 'gpt-4o-mini',
stream: true,
messages: messages,
});
return OpenAIStream(response);
}
}
// 3. Implement Google Gemini provider
class GoogleGeminiProvider implements IAiChatProvider {
private genAI = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY!);
// (Note: Gemini's message format differs from OpenAI's, so we need a converter.
// Gemini only knows the roles 'user' and 'model', so 'assistant' must be mapped.)
private convertMessagesToGemini(messages: ChatCompletionMessageParam[]) {
return messages.map((m) => ({
role: m.role === 'assistant' ? 'model' : 'user',
parts: [{ text: m.content as string }],
}));
}
async createChatStream(messages: ChatCompletionMessageParam[]) {
const model = this.genAI.getGenerativeModel({ model: 'gemini-1.5-pro' });
// Gemini needs message format conversion
const geminiMessages = this.convertMessagesToGemini(messages);
const stream = await model.generateContentStream({
contents: geminiMessages,
});
// Use Vercel AI SDK's Google stream converter
return GoogleGenerativeAIStream(stream);
}
}
// 4. (Optional) Implement Replicate provider (for images)
// ... Similarly, can define a `createImageStream` interface for image generation ...
// 5. Create a factory function to select provider
function getAiProvider(): IAiChatProvider {
const provider = process.env.AI_PROVIDER || 'openai';
if (provider === 'google') {
return new GoogleGeminiProvider();
}
// Default to OpenAI
return new OpenAiProvider();
}
// 6. Export our main AI interface
export const ai = {
chat: getAiProvider(),
// image: getImageProvider(), // (For image features)
};

Integration: Refactoring src/app/api/chat/route.ts
Now, our API route becomes extremely clean and "agnostic". It no longer cares who the AI provider is.
// src/app/api/chat/route.ts (refactored)
import { StreamingTextResponse } from 'ai';
import { ai } from '@/lib/ai-provider'; // Import our abstraction layer
export const runtime = 'edge';
export async function POST(req: Request) {
try {
const { messages } = await req.json();
// ... (credit check logic) ...
// 1. Call abstraction interface, not OpenAI
// ai.chat might be OpenAI or Google, this API doesn't care
const stream = await ai.chat.createChatStream(messages);
// 2. Return stream to frontend
return new StreamingTextResponse(stream);
} catch (error) {
console.error('Chat API Error:', error);
return new Response(JSON.stringify({ error: 'Failed to get response' }), { status: 500 });
}
}

With this architecture, we can switch our SAAS from OpenAI to Google Gemini by changing a single environment variable, AI_PROVIDER="google". On Vercel, an environment variable change takes effect on the next deployment; with Vercel Edge Config you can even flip the provider at runtime without redeploying. Either way, we gain high availability and cost control.
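If you want the runtime switch, one option is to read the provider name per request from Edge Config. The sketch below is a hypothetical variant added to src/lib/ai-provider.ts: `get` comes from @vercel/edge-config, while the 'ai-provider' item name and the fallback chain are our own conventions, not part of any SDK.
// src/lib/ai-provider.ts (hypothetical variant): resolve the provider per request
// `get` is from @vercel/edge-config; 'ai-provider' is our own item name
import { get } from '@vercel/edge-config';

export async function getAiProviderDynamic(): Promise<IAiChatProvider> {
  // Prefer the Edge Config value, fall back to the env var, then to OpenAI
  const name =
    (await get<string>('ai-provider').catch(() => undefined)) ??
    process.env.AI_PROVIDER ??
    'openai';
  return name === 'google' ? new GoogleGeminiProvider() : new OpenAiProvider();
}
The route would then call await getAiProviderDynamic() inside each request and use its createChatStream, instead of relying on the module-level ai object, which picks a provider once at import time.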
Chapter 19 Summary
In this chapter, we finally injected core "intelligence" into our SAAS.
- Vercel AI SDK: We leveraged the useChat Hook (frontend) and StreamingTextResponse (backend) from the ai package to build a fully functional, streaming-capable AI chat interface with minimal effort. This solves the massive pain point of handling streaming data in traditional architectures.
- Edge Runtime: We deployed our AI API route to the edge runtime, dramatically reducing access latency for global users.
- Architecture Decoupling (Abstraction Layer): We designed an ai-provider.ts abstraction layer that defines a unified IAiChatProvider interface, decoupling our API routes from specific AI providers (like OpenAI or Google).
- Avoiding Vendor Lock-in: This abstraction layer gives our SAAS extreme flexibility, allowing us to switch the underlying AI model through a single environment variable based on cost, performance, or availability. This is key to building a robust, scalable AI SAAS.