Previously, I built a RAG Search System for a Japanese Learning App with Next.js and OpenAI: a platform that collects episodes from a Japanese TV show and lets you search for Japanese words you'd like to study, listening to their pronunciation in real contexts. Using a similar structure, and in order to deepen my knowledge of building RAG (Retrieval-Augmented Generation) systems, I created a second project for searching a catalog of manga (Japanese comics/graphic novels).
I wanted to create a system that understands natural language queries and provides intelligent recommendations.
Users can search with queries like "recommend me a manga like Death Note but with more action" or "I want something similar to One Piece but shorter", and the system returns relevant manga with AI-generated explanations for why each recommendation matches their request.
This project combines vector embeddings, semantic search, and generative AI to create a search experience that goes beyond keyword matching to truly understand user intent.

1. The User Interface

The interface is designed to be clean and intuitive, focusing the user's attention on discovering manga through natural language search.
Component Architecture
The application is built with a modular component architecture, separating concerns between presentation, state management, and data fetching. Each component has a specific responsibility: the Hero provides branding, the SearchBar handles user input and API interactions, while AIRecommendations and BrowseGrid manage different result displays. This separation makes the codebase maintainable and allows each piece to evolve independently.
AI Recommendations Section
The top section displays three carefully selected manga with detailed, personalized explanations. Each recommendation card features:
- Cover Image: A visual preview of the manga
- Title and Metadata: Basic information including score, publication status, and genres
- AI Explanation: The key differentiator, 2-3 sentences explaining specifically why this manga matches the user's query
- Similarity Score: A numerical indicator of how closely it matches the search intent

These AI-generated explanations transform the search from a simple matching exercise into a curated recommendation experience. Instead of wondering why a particular manga appeared in results, users immediately understand the connection to their query.
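To make the card's data concrete, here is a sketch of what a recommendation object might look like; the field names and types are illustrative assumptions, not the project's actual schema.

```typescript
// Shape of a single AI recommendation card (field names are illustrative
// assumptions, not the project's actual types).
interface MangaRecommendation {
  title: string;
  coverImageUrl: string;
  score: number;         // MyAnimeList score, e.g. 8.7
  status: string;        // publication status, e.g. "Finished"
  genres: string[];
  aiExplanation: string; // 2-3 sentence GPT-generated rationale
  similarity: number;    // similarity score in [0, 1]
}

const example: MangaRecommendation = {
  title: "Death Note",
  coverImageUrl: "https://example.com/death-note.jpg",
  score: 8.7,
  status: "Finished",
  genres: ["Supernatural", "Suspense"],
  aiExplanation: "Matches your request for a psychological cat-and-mouse story.",
  similarity: 0.82,
};
```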
2. Understanding the Architecture
What is RAG?
RAG (Retrieval-Augmented Generation) is a technique that combines information retrieval with AI text generation. Instead of relying solely on a language model's training data, RAG first retrieves relevant information from a database, then uses that context to generate informed responses.
In this manga search platform, RAG works in two stages:
1. Retrieval: Find manga similar to the user's query using vector similarity search
2. Generation: Use GPT-4o-mini to generate personalized explanations for why each manga matches the query
This approach provides more accurate, contextual recommendations than either technique could achieve alone.
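The two stages can be sketched as a small pipeline; here the retriever and generator are injected as functions so each stage stays decoupled. The helper names and the retrieve-12/explain-3 split follow the description above, but the code itself is an illustrative sketch, not the project's actual implementation.

```typescript
type Manga = { title: string; similarity: number };

// Minimal two-stage RAG pipeline: retrieve candidates, then generate
// explanations for only the top few to keep latency down.
async function recommend(
  query: string,
  retrieve: (q: string, limit: number) => Promise<Manga[]>,
  explain: (q: string, m: Manga) => Promise<string>,
): Promise<Array<Manga & { aiExplanation?: string }>> {
  // Stage 1: retrieval - vector similarity search over the catalog.
  const matches = await retrieve(query, 12);
  // Stage 2: generation - explain only the top 3 results.
  return Promise.all(
    matches.map(async (m, i) =>
      i < 3 ? { ...m, aiExplanation: await explain(query, m) } : m,
    ),
  );
}
```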
The Tech Stack
The system combines several technologies, each serving a specific role:
- Next.js 15 with App Router handles the frontend and API routes
- PostgreSQL with pgvector extension stores manga data and performs vector similarity searches
- OpenAI text-embedding-3-small converts text into 1536-dimensional vectors
- GPT-4o-mini generates natural language explanations for recommendations
- Jikan API provides manga data from MyAnimeList
- Vercel hosts the application and PostgreSQL database
Architecture Flow

How the Data Flows
When a user searches for manga, here's what happens:
1. The search query is converted into a vector embedding using OpenAI's embedding model
2. PostgreSQL's pgvector extension performs a cosine similarity search to find manga with similar embeddings
3. The top 12 results are retrieved based on similarity scores
4. The top 3 results are sent to GPT-4o-mini with the original query to generate contextual explanations
5. Results are returned to the frontend: 3 with AI explanations, 9 without
This architecture ensures fast searches (vector lookups are sub-100ms) while still providing rich, contextual recommendations through AI-generated explanations.
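The cosine similarity at the heart of step 2 is simple to state in code. Below is a plain TypeScript version of cosine distance as pgvector computes it (1 minus cosine similarity), just to make the math concrete; in production the database does this, not application code.

```typescript
// Cosine distance, as pgvector's <=> operator computes it:
// 1 - (a.b) / (|a| * |b|). Smaller distance means more similar.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Identical vectors have distance 0, orthogonal vectors distance 1, which is why ordering ascending by distance returns the closest matches first.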
3. Data Pipeline: From API to Vector Database
Setting Up PostgreSQL with pgvector
The foundation of the search system is PostgreSQL with the pgvector extension, which adds support for vector similarity search directly in the database. The manga table stores both traditional fields (title, synopsis, genres, score) and a special embedding column of type vector(1536) that holds the OpenAI-generated embeddings.
To enable fast similarity searches across thousands of records, I created an IVFFlat index on the embedding column. This index uses approximate nearest neighbor search with 100 lists, providing an optimal balance between search speed and accuracy for datasets up to 100K records. The result is consistent sub-100ms query times even as the database grows.
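For illustration, the table and index described above might be created with DDL along these lines. The column names here are assumptions (the actual schema may differ); the statements are kept as strings so they can be run through any Postgres client.

```typescript
// Illustrative DDL for the manga table with its vector(1536) embedding
// column. Column names are assumptions, not the project's exact schema.
const createTableSQL = `
  CREATE EXTENSION IF NOT EXISTS vector;
  CREATE TABLE IF NOT EXISTS manga (
    id        INTEGER PRIMARY KEY,  -- MyAnimeList id
    title     TEXT NOT NULL,
    synopsis  TEXT,
    genres    TEXT[],
    score     REAL,
    embedding vector(1536)          -- text-embedding-3-small output
  );
`;

// Approximate nearest-neighbor index; 100 lists suits catalogs up to ~100K rows.
const createIndexSQL = `
  CREATE INDEX IF NOT EXISTS manga_embedding_idx
    ON manga USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);
`;
```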
Importing and Embedding Manga Data
The import process fetches manga data from the Jikan API (MyAnimeList's unofficial API) and generates embeddings for each entry. To create rich, semantically meaningful embeddings, I combine multiple fields—title (English and Japanese), synopsis, genres, themes, and demographics—into a single text string before sending it to OpenAI's text-embedding-3-small model.
Rate limiting was the biggest challenge during import. Jikan allows only 3 requests per second, and OpenAI has tier-based limits, so the import script includes deliberate delays (350ms between Jikan calls, 200ms between OpenAI calls) to stay within bounds. The script also uses idempotent inserts with ON CONFLICT clauses, allowing it to safely resume if interrupted. Importing 200 manga with embeddings takes approximately 15 minutes with these rate limits in place.
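The import loop described above can be sketched as follows. The delays match the figures mentioned (350ms and 200ms), but the helper names and upsert columns are illustrative assumptions; the real script's structure may differ.

```typescript
// Sketch of the paced import loop. Helper names are illustrative.
const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

// Idempotent upsert so the script can safely resume after an interruption.
const upsertSQL = `
  INSERT INTO manga (id, title, synopsis, embedding)
  VALUES ($1, $2, $3, $4)
  ON CONFLICT (id) DO UPDATE
    SET title = EXCLUDED.title,
        synopsis = EXCLUDED.synopsis,
        embedding = EXCLUDED.embedding;
`;

async function importManga(
  ids: number[],
  fetchManga: (id: number) => Promise<{ title: string; synopsis: string }>,
  embed: (text: string) => Promise<number[]>,
  save: (sql: string, params: unknown[]) => Promise<void>,
) {
  for (const id of ids) {
    const manga = await fetchManga(id);
    await sleep(350); // stay under Jikan's ~3 requests/second limit
    const embedding = await embed(`${manga.title}\n${manga.synopsis}`);
    await sleep(200); // stay within OpenAI tier-based rate limits
    await save(upsertSQL, [id, manga.title, manga.synopsis, embedding]);
  }
}
```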
4. Building the Semantic Search Backend
The search API endpoint receives a user's natural language query and converts it into a vector embedding using the same OpenAI model used during import. This ensures the query vector exists in the same semantic space as the manga embeddings, making similarity comparisons meaningful.
PostgreSQL's pgvector extension then performs a cosine similarity search using the <=> operator, which calculates the distance between the query vector and every manga embedding in the database, returning results ordered by similarity.
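A query along these lines expresses that search; the column names are assumptions, but the `<=>` operator and the `1 - distance` conversion to a similarity score are how pgvector's cosine distance is typically used.

```typescript
// Illustrative pgvector search query. <=> is cosine distance, so ordering
// ascending returns the closest matches; 1 - distance gives a similarity
// score for display. $1 is the query embedding, cast to the vector type.
const searchSQL = `
  SELECT id, title, synopsis, genres, score,
         1 - (embedding <=> $1::vector) AS similarity
  FROM manga
  ORDER BY embedding <=> $1::vector
  LIMIT 12;
`;
```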
Once the top 12 matches are retrieved, the first three are sent to GPT-4o-mini along with the original query to generate contextual explanations. The prompt asks the model to explain why each manga matches the user's specific request, considering the synopsis, genres, and themes. This takes 2-3 seconds but transforms raw similarity scores into human-readable recommendations that help users understand why these manga appeared in their results.
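The prompt sent for each of the top three might be assembled like this; the exact wording and field names are illustrative, not the project's actual prompt.

```typescript
// Sketch of the explanation prompt sent to GPT-4o-mini for one manga.
// The wording is an illustrative assumption.
function buildExplanationPrompt(
  query: string,
  manga: { title: string; synopsis: string; genres: string[] },
): string {
  return [
    `A user searched for: "${query}"`,
    `Manga: ${manga.title}`,
    `Genres: ${manga.genres.join(", ")}`,
    `Synopsis: ${manga.synopsis}`,
    "In 2-3 sentences, explain why this manga matches the user's request,",
    "referencing its synopsis, genres, and themes.",
  ].join("\n");
}
```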
The API architecture uses Next.js 15's App Router with separate route handlers: /api/search for semantic search queries and /api/manga for fetching popular manga on initial page load. Both endpoints return structured JSON with manga metadata, but the search endpoint includes additional fields like aiExplanation and similarity scores for the top results. This clean separation makes the frontend logic simple—it just needs to display whatever data the API returns.
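The 3-with-explanations / 9-without split could be shaped like this before returning JSON; the field and function names are assumptions made for illustration.

```typescript
type SearchResult = {
  title: string;
  similarity?: number;
  aiExplanation?: string;
};

// Shape the /api/search payload: the top 3 keep their AI explanation and
// similarity score, the remaining 9 carry only basic metadata.
function shapeResponse(results: SearchResult[]): {
  recommendations: SearchResult[];
  browse: SearchResult[];
} {
  return {
    recommendations: results.slice(0, 3),
    browse: results.slice(3, 12).map(({ title }) => ({ title })),
  };
}
```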
Conclusion
Building this manga search platform reinforced several key lessons about RAG systems: vector embeddings are surprisingly effective at capturing semantic meaning, pgvector makes production-ready vector search accessible without specialized infrastructure, and combining retrieval with generative AI creates experiences that neither technique could achieve alone. The gap between a working prototype and a polished user experience often comes down to thoughtful details—proper rate limiting, idempotent operations, and clean component architecture.