Muhammad Rafi Arsya
RAG · AI · Self-Hosted · Ongoing · 2026

PaperMind

A local RAG-powered academic paper analyzer. Upload any PDF, ask questions, and get structured summaries — running 100% privately on self-hosted hardware. Zero cloud. Zero cost. Zero data leakage.

RAG · LLM · React + Vite · FastAPI · ChromaDB · Ollama · Self-Hosted
Actively developed and self-hosted: a local LLM runs 24/7 on a Linux mini PC. No cloud API. No cost per query.
Status: In Development
Year: 2026
Role: Solo Developer
Platform: Web
Overview

PaperMind is a local RAG (Retrieval-Augmented Generation) system I built to solve a real problem: reading dense academic papers is time-consuming, and general-purpose AI tools like ChatGPT tend to hallucinate when asked about documents they have not seen.

PaperMind reads your PDF first: it splits the text into chunks, converts each chunk to a vector embedding using nomic-embed-text, and stores the embeddings in ChromaDB. When you ask a question, it finds the most relevant chunks and sends them as context to Mistral 7B or Phi3 Mini running locally via Ollama. The LLM is prompted to answer strictly from the paper, so answers stay grounded in the retrieved text instead of being guessed.
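As a rough illustration of the last step, here is a minimal sketch of how retrieved chunks might be packed into a prompt that tells the model to stay within the paper. The function name `build_grounded_prompt` and the prompt wording are assumptions made for this example, not PaperMind's actual code.

```python
from typing import List


def build_grounded_prompt(question: str, chunks: List[str]) -> str:
    """Assemble a prompt that restricts the model to the retrieved chunks.

    Hypothetical helper: the name and the instruction wording are
    illustrative assumptions, not taken from PaperMind.
    """
    # Number each chunk so an answer can point back to a specific passage.
    context = "\n\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


if __name__ == "__main__":
    demo_chunks = [
        "The study uses a sample of 120 participants.",
        "Results were statistically significant.",
    ]
    print(build_grounded_prompt("How many participants were there?", demo_chunks))
```

The resulting string would then be sent to the local model through Ollama. Telling the model to admit when the context lacks an answer is one common way to discourage guessing, though it does not guarantee it.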

The entire stack — React frontend, FastAPI backend, ChromaDB, and local LLM — runs on a Linux mini PC via Docker Compose. No cloud dependency. No API cost. All data stays on-device.

Key Features
Natural Language Q&A
Ask anything about your paper in plain English. PaperMind retrieves the 5 most relevant chunks via cosine similarity and answers strictly from the document.
Structured Auto-Summary
One-click structured summary covering main topic, objectives, methodology, key findings, and conclusions — extracted from the paper, not guessed.
100% Local LLM
Runs Mistral 7B or Phi3 Mini via Ollama on local hardware. No OpenAI API key. No per-token cost. No data leaves the machine.
Multi-Paper Library
Upload and manage multiple PDFs side by side. Each paper's vectors stay isolated in ChromaDB, with metadata filtering keeping queries scoped to a single paper.
Privacy-First Design
Everything runs locally — ideal for unpublished research or sensitive academic work. No third-party API receives your documents.
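The retrieval behind the Q&A and multi-paper features can be sketched in plain Python. In PaperMind the search itself is handled by ChromaDB; the brute-force version below is only a stand-in to show the idea. The record layout, the `paper_id` key, and both function names are assumptions made for this example.

```python
import math
from typing import Any, Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(
    query_vec: Sequence[float],
    records: List[Dict[str, Any]],
    paper_id: str,
    k: int = 5,
) -> List[Tuple[float, str]]:
    """Return the k chunks of one paper that are most similar to the query.

    Each record is assumed to look like
    {"paper_id": str, "text": str, "embedding": list of floats}.
    """
    # Metadata filter: only score chunks that belong to the selected paper.
    scored = [
        (cosine_similarity(query_vec, r["embedding"]), r["text"])
        for r in records
        if r["paper_id"] == paper_id
    ]
    # Highest similarity first, then keep the top k.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy 2-D vectors; real embeddings have far more dimensions.
    store = [
        {"paper_id": "paper-a", "text": "methods", "embedding": [1.0, 0.0]},
        {"paper_id": "paper-a", "text": "results", "embedding": [0.6, 0.8]},
        {"paper_id": "paper-b", "text": "unrelated", "embedding": [1.0, 0.0]},
    ]
    for score, text in top_k_chunks([1.0, 0.0], store, "paper-a"):
        print(f"{score:.2f}  {text}")
```

Filtering by paper before scoring is what keeps one paper's chunks from leaking into answers about another, which is the purpose of the metadata filter.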
RAG Architecture
Nginx — Reverse Proxy
Port 3001 · Routes /api/* → FastAPI
React 18 + Vite — Frontend
Static build · Lucide icons · Mobile responsive
FastAPI — Backend API
Python · /upload /ask /summarize /papers
ChromaDB — Vector Database
Persistent · Cosine similarity · Metadata filtering
Ollama — Local LLM Runtime
Mistral 7B / Phi3 Mini · nomic-embed-text · Host machine
PyMuPDF — PDF Processing
Text extraction · 800-char chunks · 100-char overlap
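The chunking settings listed for PDF processing (800-character chunks with a 100-character overlap) can be illustrated with a small sliding-window splitter. This is a plain-Python sketch under those two numbers only. The real pipeline may split differently, for example with a LangChain text splitter, and the function name `chunk_text` is hypothetical.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 800, overlap: int = 100) -> List[str]:
    """Split text into fixed-size character windows that overlap.

    Defaults mirror the 800/100 figures on this page; the function itself
    is an illustrative assumption, not PaperMind's implementation.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # each window starts 700 chars after the last
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    sample = "".join(chr(97 + i % 26) for i in range(2000))
    pieces = chunk_text(sample)
    print([len(p) for p in pieces])
```

With the defaults, a 2,000-character text yields three chunks of 800, 800, and 600 characters, and the last 100 characters of each chunk repeat at the start of the next. The overlap helps a sentence that straddles a boundary stay intact in at least one chunk.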
Build Progress
PDF Upload + Chunking: 100%
Vector Embedding: 100%
Q&A via RAG: 100%
Auto Summary: 100%
Streaming Responses: ~20%
Citation Highlighting: 0%
Tech Stack
React 18 · Vite · FastAPI · Python · ChromaDB · Ollama · Mistral 7B · Phi3 Mini · nomic-embed-text · PyMuPDF · LangChain · Docker · Nginx · Ubuntu Linux
By the Numbers
Local LLM Size: 4.4 GB
API Cost/Month: RM 0
Chars per Chunk: 800
Chunks per Query: 5
Local AI. Zero cost. Zero cloud.
A 4.4GB language model running on my own hardware, answering questions about research papers without sending a single byte to any external server. Built from scratch — RAG pipeline, Docker stack, systemd config and all.