Guri10/README.md

👋 Hi, I’m Atharva Gurav

I’m a Software Engineer with a strong focus on data systems, backend engineering, and production AI/ML systems. I recently completed my MS in Computer Science at Indiana University Bloomington, where I combined solid CS fundamentals with hands-on experience building scalable, real-world systems.

I enjoy working close to the system boundary, where performance, reliability, evaluation, and tradeoffs matter, and shipping solutions that hold up in production.


🧠 What I Work On

  • Data & Platform Engineering: Designing and deploying Spark / Databricks pipelines, ingestion workflows, caching layers, and backend services with clear throughput, cost, and fault-tolerance goals.

  • Production AI & ML Systems: Building NLP and ML systems end to end, from training and evaluation to deployment, monitoring, and iteration.

  • Retrieval & Search Systems: Developing RAG pipelines, vector indexing, semantic search, and ranking systems over large document corpora.

  • Backend & API Development: Shipping FastAPI / Flask services with batching, caching, observability, and clean interfaces for downstream users.


🚀 Selected Projects

QueryGenie: RAG System for Research Papers

  • Built a retrieval-augmented QA system over 10K+ documents
  • Improved Recall@5 from 0.62 to 0.87 using HNSW indexing and cross-encoder reranking
  • Deployed with FastAPI, Redis caching, and Docker
  • Focused on latency, memory efficiency, and cost optimization
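
As a rough illustration of the Recall@5 metric cited above, here is a minimal sketch. The README does not say how Recall@5 was computed, so this assumes the common hit-rate definition: a query counts as a hit when at least one relevant document appears in its top k results. The function name `recall_at_k` and its input shapes are hypothetical and not taken from QueryGenie.

```python
def recall_at_k(retrieved, relevant, k=5):
    """Fraction of queries with at least one relevant document in the top k.

    retrieved: list of ranked document-id lists, one per query (best first).
    relevant:  list of relevant document-id collections, one per query.
    Assumption: hit-rate style Recall@k; other definitions exist.
    """
    if not retrieved:
        return 0.0
    hits = 0
    for ranked_ids, relevant_ids in zip(retrieved, relevant):
        # A query is a hit if its top-k slice overlaps the relevant set.
        if set(ranked_ids[:k]) & set(relevant_ids):
            hits += 1
    return hits / len(retrieved)
```

Under this definition, reranking can only raise Recall@5 by pulling relevant documents from lower ranks into the top five, which is why it is usually measured after the reranking stage.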

AgentLoop: Autonomous AI Agent Framework

  • Designed a tool-using agent with a structured plan → act → observe → reflect loop
  • Achieved 19/20 task success with validation, retries, and timeouts
  • Published to PyPI with logging, tracing, and documented failure modes
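
The plan → act → observe → reflect loop with validation, retries, and timeouts might look roughly like the sketch below. This is not AgentLoop's actual API: `run_agent`, its callbacks (`plan`, `act`, `validate`), and the history format are all hypothetical. The timeout here is a simplification that is checked after a step returns rather than interrupting it.

```python
import time


def run_agent(task, plan, act, validate, max_retries=3, timeout_s=5.0):
    """Run a plan -> act -> observe -> reflect loop until a step validates.

    All names and parameters are illustrative assumptions, not AgentLoop's API.
    """
    history = []  # reflections from earlier attempts, fed back to the planner
    for attempt in range(1, max_retries + 1):
        step = plan(task, history)              # plan: pick the next action
        started = time.monotonic()
        try:
            result, error = act(step), None     # act: run the tool or action
        except Exception as exc:
            result, error = None, repr(exc)
        elapsed = time.monotonic() - started

        # observe: check the time budget, then validate the result
        if error is None and elapsed > timeout_s:
            error = f"step took {elapsed:.2f}s, over the {timeout_s}s budget"
        elif error is None and not validate(result):
            error = "result failed validation"

        # reflect: record what happened so the next plan can adapt
        history.append({"attempt": attempt, "step": step, "error": error})
        if error is None:
            return result, history
    raise RuntimeError(f"task failed after {max_retries} attempts: {history}")
```

Passing the history back into `plan` is what distinguishes this from a plain retry wrapper: each new attempt can take earlier failures into account.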

Neural Machine Translation System

  • Trained Transformer models on 50K sentence pairs
  • Improved BLEU score from 28.4 to 32.7
  • Implemented beam search decoding and deployed an inference API

🧑‍💼 Experience Highlights

  • Software Engineer – ML & Data Systems (Contract): Built Spark SQL ingestion pipelines on Databricks, evaluation frameworks, and batch / near-real-time workflows with caching and idempotent writes.

  • Computer Vision Research Assistant: Trained and deployed 3D point cloud models (PointNet++, KPConv), built annotation tooling, and shipped inference APIs for large-scale datasets.

  • Software Engineer – ML Systems: Developed multimodal NLP systems, fine-tuned large models, and engineered data pipelines processing 100K+ records.


🛠️ Tech Stack

Languages & Frameworks: Python, SQL, FastAPI, Flask, Django, JavaScript

Data & Distributed Systems: Spark (PySpark, Spark SQL), Databricks, Data Pipelines, Distributed Processing, Redis

AI / ML: PyTorch, TensorFlow, Transformers, scikit-learn, NLP, RAG, Model Evaluation

Production & Cloud: Docker, Git, CI/CD, AWS, GCP, MLflow, Weights & Biases


📌 What I’m Interested In

  • Software Engineering (Backend / Full Stack)
  • Data, Platform, and Infrastructure roles
  • Production-focused AI / ML Engineering

I’m especially excited by teams building systems, platforms, and developer-facing tools that scale to real users.


📫 Let’s Connect

Always open to conversations about engineering, systems, and building things that matter.


Pinned

  1. QueryGenie (Public)

    A completely free RAG chatbot using arXiv papers

    Python

  2. AgentLoop (Public)

    An autonomous agent system demonstrating LLM-based decision-making in a closed-loop control architecture

    Python

  3. Deepfake-Audio-Detection-with-XAI (Public)

    This project focuses on detecting deepfake audio using advanced neural network architectures like VGG16, MobileNet, ResNet, and custom CNNs. It incorporates explainable AI (XAI) methods like LIME, …

    Jupyter Notebook 51 9

  4. neural-machine-translation (Public)

    This project implements a custom neural machine translation (NMT) system using a sequence-to-sequence architecture with attention.

    Python 2

  5. nyc_taxi_data_analytics (Public)

    The goal of this project is to perform data analytics on NYC taxi data using various tools and technologies, including GCP Storage, Python, Compute Instance, Mage Data Pipeline Tool, BigQuery, and …

    Jupyter Notebook

  6. thinkboard (Public)

    JavaScript