I'm a Software Engineer with a strong focus on data systems, backend engineering, and production AI/ML systems. I recently completed my MS in Computer Science at Indiana University Bloomington, where I combined solid CS fundamentals with hands-on experience building scalable, real-world systems.
I enjoy working close to the system boundary, thinking about performance, reliability, evaluation, and tradeoffs, and shipping solutions that hold up in production.
- Data & Platform Engineering: Designing and deploying Spark / Databricks pipelines, ingestion workflows, caching layers, and backend services with clear throughput, cost, and fault-tolerance goals.
- Production AI & ML Systems: Building NLP and ML systems end to end, from training and evaluation to deployment, monitoring, and iteration.
- Retrieval & Search Systems: Developing RAG pipelines, vector indexing, semantic search, and ranking systems over large document corpora.
- Backend & API Development: Shipping FastAPI / Flask services with batching, caching, observability, and clean interfaces for downstream users.
- Built a retrieval-augmented QA system over 10K+ documents
- Improved Recall@5 from 0.62 to 0.87 using HNSW indexing and cross-encoder reranking
- Deployed with FastAPI, Redis caching, and Docker
- Focused on latency, memory efficiency, and cost optimization
- Designed a tool-using agent with a structured plan → act → observe → reflect loop
- Achieved 19/20 task success with validation, retries, and timeouts
- Published to PyPI with logging, tracing, and documented failure modes
- Trained Transformer models on 50K sentence pairs
- Improved BLEU score from 28.4 to 32.7
- Implemented beam search decoding and deployed an inference API
- Software Engineer, ML & Data Systems (Contract): Built Spark SQL ingestion pipelines on Databricks, evaluation frameworks, and batch / near-real-time workflows with caching and idempotent writes.
- Computer Vision Research Assistant: Trained and deployed 3D point cloud models (PointNet++, KPConv), built annotation tooling, and shipped inference APIs for large-scale datasets.
- Software Engineer, ML Systems: Developed multimodal NLP systems, fine-tuned large models, and engineered data pipelines processing 100K+ records.
- Languages & Frameworks: Python, SQL, FastAPI, Flask, Django, JavaScript
- Data & Distributed Systems: Spark (PySpark, Spark SQL), Databricks, Data Pipelines, Distributed Processing, Redis
- AI / ML: PyTorch, TensorFlow, Transformers, scikit-learn, NLP, RAG, Model Evaluation
- Production & Cloud: Docker, Git, CI/CD, AWS, GCP, MLflow, Weights & Biases
- Software Engineering (Backend / Full Stack)
- Data, Platform, and Infrastructure roles
- Production-focused AI / ML Engineering
I'm especially excited by teams building systems, platforms, and developer-facing tools that scale to real users.
- Email: atharvagurav01@gmail.com
- LinkedIn: linkedin.com/in/atharvagurav01
Always open to conversations about engineering, systems, and building things that matter.


