Introduction: The Problem with Basic RAG
When building [Kaiwakai](https://www.kaiwakai.com/), a Japanese learning application, I implemented a RAG (Retrieval-Augmented Generation) system to help users search through podcast transcripts and vocabulary. The initial implementation worked well: users could search for Japanese words and get relevant context from our database of transcripts and vocabulary entries.
However, I quickly realized a limitation: the RAG system could only provide information that existed in our database. When users searched for a word, they'd get transcript examples and basic definitions, but they were missing crucial learning context:
- What JLPT level is this word?
- How is the kanji structured?
- What are the mnemonics to remember it?
- What are example sentences from a structured curriculum?
This information exists in the WaniKani API, the API of a comprehensive Japanese learning platform, but our RAG system had no way to access it. The solution? Model Context Protocol (MCP) tools combined with OpenAI's function calling.
The Solution: MCP Tools + Function Calling
What is MCP?
Model Context Protocol (MCP) is a standardized way to provide external context and tools to AI models. Instead of the AI being limited to its training data and your vector database, MCP allows the AI to call external APIs and services when it needs additional information.
Building the WaniKani MCP Tool
I created @kaiwakai/wanikani-mcp, an npm package that wraps the WaniKani API v2 and exposes it as an MCP tool. Here's how it works:
```javascript
// The MCP tool is defined and provided to OpenAI
const toolDefinition = {
  name: "get_wanikani_data",
  description: "Fetches comprehensive learning data from WaniKani for a Japanese word",
  parameters: {
    type: "object",
    properties: {
      word: {
        type: "string",
        description: "The Japanese word to look up"
      }
    },
    required: ["word"]
  }
};
```
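For completeness, this definition is handed to OpenAI inside a `tools` array. Here's a sketch of that wrapper shape (the definition is repeated so the snippet stands alone):

```javascript
// The MCP tool definition, repeated from above for self-containment
const toolDefinition = {
  name: "get_wanikani_data",
  description:
    "Fetches comprehensive learning data from WaniKani for a Japanese word",
  parameters: {
    type: "object",
    properties: {
      word: { type: "string", description: "The Japanese word to look up" }
    },
    required: ["word"]
  }
};

// Wrapped in the "tools" array shape the Chat Completions API expects
const tools = [{ type: "function", function: toolDefinition }];
```

The wrapped array is what gets passed as the `tools` option of a chat-completions request.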
The Enhanced RAG Pipeline
The new architecture follows this flow:
1. User searches for a word (e.g., "勉強")
2. Generate embedding using OpenAI's text-embedding-3-small
3. Hybrid search in Neon Postgres with pgvector (exact match + vector similarity)
4. First AI call with function calling enabled - GPT-4o-mini decides if it needs WaniKani data
5. Tool execution (if requested) - Call the WaniKani MCP tool
6. Second AI call with enriched context - Generate explanation with both RAG results AND WaniKani data
7. Return comprehensive response to the user
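Step 3, the hybrid search, might look roughly like this in SQL. This is a sketch under assumptions: the table `vocab_entries` and the `text`/`embedding` columns are illustrative names, not the app's real schema. `$1` is the query word and `$2` is the query embedding passed as a pgvector literal:

```javascript
// Hybrid search: exact substring matches rank first (distance 0),
// then pgvector cosine-distance neighbors fill the rest.
const HYBRID_SEARCH_SQL = `
  (SELECT id, text, 0.0 AS distance
     FROM vocab_entries
    WHERE text LIKE '%' || $1 || '%'
    LIMIT 5)
  UNION ALL
  (SELECT id, text, embedding <=> $2::vector AS distance
     FROM vocab_entries
    ORDER BY embedding <=> $2::vector
    LIMIT 5)
  ORDER BY distance
  LIMIT 5;
`;
```

With a driver like node-postgres, a number array can be serialized to a pgvector literal with `JSON.stringify(embedding)` before being passed as `$2`.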
Here's the complete architecture:

[Architecture diagram: user query → embedding → hybrid search (Neon Postgres + pgvector) → GPT-4o-mini with function calling → WaniKani MCP tool (when requested) → GPT-4o-mini with enriched context → response]
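Steps 4–6 can be sketched as a two-call loop. This is a hedged sketch, not Kaiwakai's actual code: `client` stands in for an OpenAI-SDK-style chat-completions client (anything with a `create` method of that shape), and `callWaniKaniTool` is a hypothetical wrapper around the @kaiwakai/wanikani-mcp tool:

```javascript
async function explainWord(client, callWaniKaniTool, word, ragContext) {
  const messages = [
    { role: "system", content: "You are a Japanese tutor. Use the RAG context and, if helpful, WaniKani data." },
    { role: "user", content: `Word: ${word}\nRAG context:\n${ragContext}` }
  ];
  const tools = [{
    type: "function",
    function: {
      name: "get_wanikani_data",
      description: "Fetches comprehensive learning data from WaniKani for a Japanese word",
      parameters: {
        type: "object",
        properties: { word: { type: "string" } },
        required: ["word"]
      }
    }
  }];

  // First call: the model may answer directly or request the tool.
  const first = await client.create({ model: "gpt-4o-mini", messages, tools });
  const choice = first.choices[0].message;
  if (!choice.tool_calls || choice.tool_calls.length === 0) {
    return choice.content ?? "";
  }

  // Tool execution: run each requested call and feed the result back.
  messages.push(choice);
  for (const call of choice.tool_calls) {
    const args = JSON.parse(call.function.arguments);
    const result = await callWaniKaniTool(args.word);
    messages.push({ role: "tool", tool_call_id: call.id, content: result });
  }

  // Second call: generate the final explanation with enriched context.
  const second = await client.create({ model: "gpt-4o-mini", messages, tools });
  return second.choices[0].message.content ?? "";
}
```

The decision of whether WaniKani is consulted at all lives entirely in the first call's `tool_calls` check.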
Key Benefits
1. Intelligent Decision Making
The AI doesn't blindly call external APIs for every query. It analyzes the user's search and the RAG results first, then decides if additional context from WaniKani would be helpful. This saves API calls and reduces latency.
2. Richer Context
Users now get:
- Transcript examples from the RAG database
- JLPT level and difficulty rating from WaniKani
- Kanji breakdowns and radicals
- Mnemonics and learning tips
- Structured example sentences
3. Seamless Integration
From the user's perspective, it's just a single search. They don't need to know that behind the scenes, we're orchestrating multiple AI calls, vector searches, and API requests.
4. Cost-Effective
Using GPT-4o-mini for function calling keeps costs low:
- Embeddings: $0.02 per 1M tokens
- Function calling + generation: $0.15 per 1M input tokens, $0.60 per 1M output tokens
- WaniKani is only called when necessary (its API is rate-limited to 60 requests/minute anyway)
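To make the cost claim concrete, here's a back-of-the-envelope estimate at the rates above. The per-query token counts are hypothetical round numbers, not measurements from Kaiwakai:

```javascript
const EMBED_PER_M = 0.02; // $ per 1M embedding tokens
const IN_PER_M = 0.15;    // $ per 1M input tokens (gpt-4o-mini)
const OUT_PER_M = 0.60;   // $ per 1M output tokens (gpt-4o-mini)

const costPerQuery =
  (50 / 1e6) * EMBED_PER_M +        // ~50 tokens to embed the query
  ((2 * 1500) / 1e6) * IN_PER_M +   // two calls, ~1,500 input tokens each
  ((2 * 400) / 1e6) * OUT_PER_M;    // two calls, ~400 output tokens each

// roughly $0.00093 per enriched query
```

Even with the second AI call, an enriched query stays well under a tenth of a cent.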
Example Response
When a user searches for "勉強", they now get:
From RAG:
- 5 relevant transcript segments where "勉強" appears
- Vocabulary entries with definitions
From WaniKani (via MCP):
- Level: JLPT N5, WaniKani Level 8
- Kanji breakdown: 勉 (exertion) + 強 (strong)
- Mnemonic: "You need strong exertion to study effectively"
- Example sentences with English translations
All presented in a unified, AI-generated explanation that connects the dots.

Conclusion
Combining RAG with MCP tools and function calling transforms a basic search system into an intelligent learning assistant. The AI can now:
1. Search your proprietary data (transcripts, vocabulary)
2. Decide when external context is needed
3. Fetch that context intelligently
4. Synthesize everything into a coherent explanation
This pattern isn't limited to language learning - any RAG system can benefit from MCP tools:
- E-commerce: RAG for product catalog + API for real-time inventory
- Healthcare: RAG for medical literature + API for drug interactions
- Finance: RAG for company docs + API for live market data
The key is giving the AI the ability to know what it doesn't know and the tools to fill those gaps.