DeepSer is a weekend system-design project where I built a Perplexity-like web search engine from scratch.
Given a user query, the system generates search queries using an LLM, fetches relevant web pages, scrapes them using browser automation, summarizes the content with LLMs, and finally produces a structured AI-generated report.
This project focuses on async systems, browser-based scraping, queues, and AI pipelines, and is built purely for learning and experimentation.
https://medium.com/@yashraj504300/building-my-own-perplexity-web-search-f6ce5cfa5d7c
- User enters a query in the frontend.
- An LLM converts the query into multiple web search queries.
- Brave Search API fetches URLs → URLs are pushed to RabbitMQ.
- Async Playwright-based scraper consumes URLs, extracts content, and summarizes it using LLMs.
- Redis tracks progress, and the final report is served back to the frontend.
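The second step above, turning one user question into several search queries, can be sketched as a small parse-and-validate helper. This is an illustrative sketch only: the prompt wording, the function name `parse_search_queries`, and the assumption that the LLM replies with a JSON array are mine, not taken from the DeepSer codebase.

```python
import json

# Hypothetical prompt for the query-generation step (an assumption, not the
# project's actual prompt).
QUERY_PROMPT = (
    "Rewrite the user's question into 3 distinct web search queries. "
    "Respond with a JSON array of strings only.\n\nQuestion: {question}"
)

def parse_search_queries(llm_response: str, max_queries: int = 3) -> list[str]:
    """Parse the LLM's JSON-array reply into a clean list of search queries."""
    queries = json.loads(llm_response)
    # Keep only non-empty strings and cap the count defensively, since LLM
    # output is not guaranteed to follow the prompt exactly.
    return [q.strip() for q in queries if isinstance(q, str) and q.strip()][:max_queries]

# Example: a well-formed LLM reply.
reply = '["playwright async scraping", "rabbitmq python consumer", "fastapi sse"]'
print(parse_search_queries(reply))
```

Each parsed query would then be sent to the Brave Search API, and the resulting URLs published to RabbitMQ.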
- Native async support (perfect for FastAPI + asyncio)
- Faster startup and execution via DevTools Protocol
- Official Docker images with browsers preinstalled
- Lower memory usage when running parallel scraping tasks
The scraper container keeps the browser warm, reusing it across tasks for performance.
Concurrency is controlled using semaphores to limit the number of open tabs and avoid crashes in Docker.
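The tab-limiting pattern described here is essentially an `asyncio.Semaphore` wrapped around each scrape task. A minimal sketch, with the browser work stubbed out by a sleep (the function names and the limit of 3 tabs are illustrative, not the project's actual values):

```python
import asyncio

MAX_TABS = 3       # cap on simultaneously open tabs (illustrative value)
open_tabs = 0      # instrumentation to demonstrate the cap holds
peak_tabs = 0

async def scrape_one(sem: asyncio.Semaphore, url: str) -> str:
    global open_tabs, peak_tabs
    async with sem:                    # waits while MAX_TABS pages are open
        open_tabs += 1
        peak_tabs = max(peak_tabs, open_tabs)
        await asyncio.sleep(0.01)      # stands in for page.goto + extraction
        open_tabs -= 1
    return f"scraped:{url}"

async def main() -> list[str]:
    sem = asyncio.Semaphore(MAX_TABS)
    urls = [f"https://example.com/{i}" for i in range(10)]
    # All tasks are scheduled at once; the semaphore serializes the excess.
    return await asyncio.gather(*(scrape_one(sem, u) for u in urls))

results = asyncio.run(main())
print(len(results), "pages, peak concurrent tabs:", peak_tabs)
```

In the real scraper the body inside `async with sem:` would open a page on the shared warm browser (`browser.new_page()` in async Playwright) and close it before releasing the slot, which keeps memory bounded inside the container.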
- Frontend: Next.js
- Backend: FastAPI (API layer)
- Scraping: Playwright (async, Chromium)
- LLMs: Groq APIs
- Search: Brave Search API
- Queue: RabbitMQ
- State Tracking: Redis
- Database: PostgreSQL
- Infra: Docker & Docker Compose
git clone https://github.com/YashRaj1506/DeepSer.git
cd DeepSer

Create a .env file in the root directory and paste the following:
BRAVE_API_KEY=
GROQ_API_KEY_1=
GROQ_API_KEY_2=
GROQ_API_KEY_3=
GROQ_API_KEY_4=
GROQ_API_KEY_5=
# Database
POSTGRES_HOST=postgres
POSTGRES_PORT=5432
POSTGRES_DB=marketresearch
POSTGRES_USER=postgres
POSTGRES_PASSWORD=password
# RabbitMQ
RABBITMQ_HOST=rabbitmq
RABBITMQ_USER=guest
RABBITMQ_PASS=guest
# Frontend / Backend
NEXT_PUBLIC_API_URL=http://localhost:8000
FRONTEND_URL=http://localhost:3000
# Django
DJANGO_SECRET_KEY=your-django-secret-key-here
DJANGO_DEBUG=True
# Google OAuth (optional)
GOOGLE_CLIENT_ID=
GOOGLE_CLIENT_SECRET=

docker compose up --build

Frontend → http://localhost:3000
- Batch URLs in RabbitMQ and perform atomic bulk upserts to reduce DB costs.
- Replace frontend polling with Server-Sent Events (SSE).
- Reuse RabbitMQ connections instead of creating one per publish.
- Improve retries, failure handling, and observability.
- Smarter deduplication and ranking of sources.
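On the SSE item: replacing polling mostly means streaming progress updates in the `text/event-stream` wire format instead of having the frontend re-fetch state. A minimal sketch of that formatting (the field names `event:` and `data:` come from the SSE spec; the event names and payload shape are my own, not DeepSer's):

```python
import json

def sse_event(event: str, data: dict) -> str:
    """Serialize one Server-Sent Events message in text/event-stream format."""
    # An SSE message is newline-delimited fields terminated by a blank line.
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"

# A progress update as the scraper works through its URLs.
print(sse_event("progress", {"scraped": 4, "total": 10}), end="")
```

In FastAPI, an async generator yielding such strings would be wrapped in a `StreamingResponse` with `media_type="text/event-stream"`, and the Redis progress keys already tracked per query would feed the generator.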
