Aditi Mishra · AI Engineer

AI / ML Engineer · Boston, MA

Aditi Mishra

Building production AI systems that think, reason, and act — from fine-tuned LLMs to enterprise RAG pipelines at scale.

LangChain, RAG Pipelines, Fine-Tuning · LoRA, PySpark, Pinecone, RAGAS · TruLens, Kubernetes, MCP Protocol

aditi_mishra · profile.json

$ →query_profile --full

name: "Aditi Mishra"

role: "AI / ML Engineer"

current: ScaleUp Labs · Boston

yoe: "3+ years production AI"

location: "Boston, MA"

$ →get_impact --verified

inference_cost: ↓ 70% at ScaleUp Labs

data_accuracy: 87% → 98.1% at IBM

students_mentored: 1,000+ at Google DSC

pipeline_uptime: 99.9% over 18 months

70% · Inference cost reduced at ScaleUp Labs

90% · Manual data entry eliminated via OCR + LLM pipeline

99.9% · Pipeline uptime over 18 months at IBM

1,000+ · Students mentored through AI/ML at Google DSC

35% · Retrieval accuracy improvement via optimized RAG

87→98% · Data accuracy improvement at IBM

01 credentials / education

Academic foundation

🎓 Most Recent · 2025

2024 – 2025 · Graduate

Master of Science in Data Science

Clark University · Worcester, Massachusetts

GPA 3.7 · Dean's List

2018 – 2022 · Undergraduate

B.Tech in Electronics Engineering

Dr. Abdul Kalam Technical University · India

GPA 3.4 · Honors

→ Industry Certifications

☁️ AWS Certified Cloud Practitioner

⚡ Databricks Certified · Apache Spark 3.0

🐍 Machine Learning with Python · IBM

🪟 Microsoft MTA · Programming Using Python

02 work_history / experience

Where I've built AI

🚀

ScaleUp Labs · Boston, MA

AI-first startup building enterprise automation tooling

May 2025 – Present · Current

AI Engineer

⚡ Cut inference costs by 70% — rebuilt RAG system (LangChain + Pinecone) with re-ranking, query compression, and chunking optimization, reducing monthly LLM spend from $14K → $4.2K while improving retrieval accuracy by 35%.

📄 Automated 90% of manual document processing — built OCR + LLM pipeline (Tesseract + GPT-4) that converts unstructured PDFs into structured PostgreSQL records, eliminating a 3-person manual review team.

🔗 Shipped enterprise MCP interface — designed Model Context Protocol layer enabling secure, auditable LLM ↔ enterprise data connectivity with role-based access controls.

📊 Reduced hallucination rate by 42% — built LLM evaluation harness using RAGAS and custom metrics; integrated semantic personalization layer for response grounding.
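The 70% figure in the first bullet follows directly from the two monthly spend numbers quoted there. A quick arithmetic check, using only those two figures (the function and variable names are illustrative, not taken from the ScaleUp Labs codebase):

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the percentage drop from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Monthly LLM spend as quoted in the bullet above.
monthly_before = 14_000.0  # $14K
monthly_after = 4_200.0    # $4.2K

reduction = percent_reduction(monthly_before, monthly_after)
print(f"{reduction:.0f}% lower monthly LLM spend")
```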

→ stack

LangChain, Pinecone, RAG, MCP, Tesseract OCR, PostgreSQL, LoRA/PEFT, RAGAS, Hugging Face

🔷

IBM · India

Global technology and consulting, data engineering division

Jun 2022 – Aug 2024 · 2 yrs 2 mos

Data Engineer

📈 Improved data accuracy from 87% to 98.1% — built automated validation pipeline using Great Expectations with 200+ rules, enabling the ML team to ship models 3× faster by eliminating manual data cleaning sprints.

🏗️ Maintained 99.9% uptime across 40+ daily ML pipeline jobs — re-architected PySpark ETL on Hadoop with Docker + Kubernetes orchestration and Jenkins CI/CD, cutting incident response from 4 hours → 22 minutes.

🚀 Reduced feature engineering time by 60% — redesigned data ingestion to publish real-time feature streams via Apache Kafka, replacing nightly batch jobs that blocked 6 downstream ML experiments daily.
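The rule-based validation idea in the first bullet can be illustrated in a few lines. This is a minimal sketch in plain Python, not the Great Expectations API and not the IBM pipeline: the three rules, the column names, and the sample rows are all hypothetical, chosen only to show how per-row rules roll up into a single pass rate.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Rule = Callable[[Row], bool]


def id_present(row: Row) -> bool:
    return row.get("id") is not None


def amount_non_negative(row: Row) -> bool:
    amount = row.get("amount")
    return isinstance(amount, (int, float)) and amount >= 0


def currency_known(row: Row) -> bool:
    return row.get("currency") in {"USD", "EUR", "INR"}


# Hypothetical rule set; a real suite would be far larger.
RULES: Dict[str, Rule] = {
    "id_present": id_present,
    "amount_non_negative": amount_non_negative,
    "currency_known": currency_known,
}


def validate(rows: List[Row]) -> Dict[str, List[int]]:
    """Map each rule name to the indices of the rows that fail it."""
    failures: Dict[str, List[int]] = {name: [] for name in RULES}
    for index, row in enumerate(rows):
        for name, rule in RULES.items():
            if not rule(row):
                failures[name].append(index)
    return failures


def pass_rate(rows: List[Row], failures: Dict[str, List[int]]) -> float:
    """Percentage of rows that pass every rule."""
    if not rows:
        return 100.0
    bad_rows = {i for indices in failures.values() for i in indices}
    return 100.0 * (len(rows) - len(bad_rows)) / len(rows)


# Invented sample data, one clean row and three rows that each break one rule.
rows: List[Row] = [
    {"id": 1, "amount": 10.0, "currency": "USD"},
    {"id": None, "amount": 5.0, "currency": "EUR"},
    {"id": 3, "amount": -2.0, "currency": "INR"},
    {"id": 4, "amount": 7.5, "currency": "GBP"},
]
failures = validate(rows)
print(failures, pass_rate(rows, failures))
```

Keeping the failing row indices per rule, rather than a single boolean, makes it possible to report which rule is responsible when the pass rate drops.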

→ stack

PySpark, Hadoop, Apache Kafka, Great Expectations, Docker, Kubernetes, Jenkins, PostgreSQL

🎓

Google Developer Student Club · India

University-level AI/ML community leadership

Sep 2021 – Jun 2022 · Lead

Lead — AI / ML / Data Science

👩‍🏫 Mentored 1,000+ students through hands-on NLP & Computer Vision projects, achieving an 85% end-to-end project completion rate on GCP deployments — highest in the club's history.

📚 Designed and delivered 12-session ML curriculum covering model training to cloud deployment; curriculum adopted by 2 other chapters in the following semester.

→ stack

TensorFlow, PyTorch, Google Cloud, NLP, Computer Vision

verified impact / by the numbers

Real numbers, real systems

LLM Inference Cost

70% ↓

$14K → $4.2K/month via RAG optimization at ScaleUp Labs

Source: ScaleUp Labs · 2025

Data Accuracy

87% → 98.1% via automated Great Expectations validation at IBM

Source: IBM · 2022–2024

Pipeline Uptime

99.9%

Across 40+ daily ML jobs over 18 months at IBM via K8s + CI/CD

Source: IBM · 2022–2024

Students Mentored

1,000+

85% project completion rate · Curriculum adopted by 2 chapters

Source: Google DSC · 2021–2022

03 technical_skills / expertise

What I build with

3+ years applying these tools in production — shipping RAG systems, ETL pipelines, and LLM evaluations at IBM and ScaleUp Labs.

🧠

LLM & GenAI

Production · 2 years

RAG Pipelines

LangChain / LlamaIndex

Fine-Tuning · LoRA / PEFT

Prompt Engineering

LLM Eval · RAGAS · TruLens

Hugging Face Transformers

Used in production at

ScaleUp Labs, Personal Projects

⚡

Data Engineering

Production · 3 years

Python · SQL

PySpark · Apache Spark

ETL Pipeline Design

PostgreSQL · Data Modeling

Apache Kafka

Data Quality · Great Expectations

Used in production at

IBM (2 yrs), ScaleUp Labs

☁️

Cloud & MLOps

Production · 2.5 years

AWS · GCP · Azure

Pinecone · FAISS · Weaviate

Docker · Kubernetes

MLOps · Model Serving

Jenkins · CI/CD

Terraform · IaC

Used in production at

IBM (K8s cluster), ScaleUp Labs

04 projects / deployed_systems

AI I've built & shipped

Artha Savvy · v1.2

Artha Savvy · AI Finance Agent

Conversational agent that analyzes personal finances and generates risk-aware plans via LLM reasoning over private financial documents — zero data leaves the user's environment.

→ Architecture

User query → Mistral-7B (self-hosted) → FAISS retriever over PDF corpus → Re-ranker → Plan generator → Structured JSON → React UI. All inference on-device.
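The retrieve-then-re-rank step in this flow can be sketched without any external dependencies. This is an illustration only, under loud assumptions: bag-of-words cosine similarity stands in for the FAISS embedding search, a query-term coverage score stands in for whatever re-ranker the project actually uses, and the documents and query are invented.

```python
import math
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase, split on whitespace, and strip basic punctuation."""
    tokens = (t.strip(".,;:!?").lower() for t in text.split())
    return [t for t in tokens if t]


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: List[str], k: int) -> List[int]:
    """Stage 1: cheap similarity search over every document (FAISS stand-in)."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), i) for i, d in enumerate(docs)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:k]]


def rerank(query: str, docs: List[str], candidates: List[int]) -> List[int]:
    """Stage 2: re-order only the candidates by query-term coverage."""
    q_terms = set(tokenize(query))

    def coverage(i: int) -> float:
        if not q_terms:
            return 0.0
        return len(q_terms & set(tokenize(docs[i]))) / len(q_terms)

    return sorted(candidates, key=lambda i: (-coverage(i), i))


# Invented toy corpus and query.
docs = [
    "Credit card rewards",
    "Credit card statement showing monthly interest charges and late fees for March 2024",
    "Monthly rent and utilities",
    "Grocery receipts",
]
query = "monthly credit card interest"

candidates = retrieve(query, docs, k=3)
ranked = rerank(query, docs, candidates)
print(candidates, ranked)
```

In this toy run the short document wins the first stage because cosine similarity penalizes length, and the second stage promotes the longer statement that covers every query term. That is the usual motivation for a re-ranker: the first stage only has to be good enough to get the right passage into the candidate set.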

→ Stack

LangChain, Mistral-7B, FAISS, React, FastAPI

80% · Planning tasks automated

100% · On-device, zero data egress

1.2s · Avg response latency

artha-savvy.vercel.app · Finance Chat

EU AI Act Scanner · v1.0

EU AI Act Compliance Scanner

Automated auditing system that classifies AI services against EU AI Act risk tiers, maps requirements, and generates structured compliance reports — replacing a 2-week manual audit process.

→ Architecture

Service descriptor → GPT-4 risk classifier → EU Act requirement mapper → Gap analysis → PDF report generator. Processes 1 service in <4 minutes vs. 2-week manual baseline.

→ Stack

OpenAI GPT-4, LangChain, ReportLab, FastAPI, Streamlit

60% · Audit time reduced

<4 min · Per service vs. 2 weeks

3 tiers · Risk classification

eu-scanner.streamlit.app · Compliance Report

RAG Eval Harness · Open Source

LLM Evaluation · Developer Tool

RAG Evaluation Harness

Open-source toolkit for measuring RAG system quality across faithfulness, answer relevancy, context precision, and hallucination rate — built from production experience optimizing the ScaleUp Labs pipeline.

→ Architecture

Test dataset loader → Multi-metric evaluator (RAGAS, TruLens, custom embedding checks) → Regression detector → HTML/JSON report exporter. Plugs into any LangChain or LlamaIndex pipeline with 3 lines of code.
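The regression detector stage can be pictured as a comparison between a stored baseline and the scores from the current run. The sketch below is an assumption about how such a check might look, not the harness's actual code: the `Regression` class, the `detect_regressions` function, the tolerance value, and all scores are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Regression:
    metric: str
    baseline: float
    current: float


# Metrics where a lower score is better; all others are higher-is-better.
LOWER_IS_BETTER = {"hallucination_rate"}


def detect_regressions(
    baseline: Dict[str, float],
    current: Dict[str, float],
    tolerance: float = 0.02,
) -> List[Regression]:
    """Flag metrics that got worse than the baseline by more than `tolerance`."""
    regressions: List[Regression] = []
    for metric, base in baseline.items():
        if metric not in current:
            continue  # metric not measured in this run; nothing to compare
        cur = current[metric]
        worsened_by = (cur - base) if metric in LOWER_IS_BETTER else (base - cur)
        if worsened_by > tolerance:
            regressions.append(Regression(metric, base, cur))
    return regressions


# Invented scores for illustration only.
baseline = {
    "faithfulness": 0.91,
    "answer_relevancy": 0.88,
    "context_precision": 0.80,
    "hallucination_rate": 0.06,
}
current = {
    "faithfulness": 0.86,
    "answer_relevancy": 0.885,
    "context_precision": 0.79,
    "hallucination_rate": 0.10,
}

for r in detect_regressions(baseline, current):
    print(f"{r.metric}: {r.baseline:.2f} -> {r.current:.2f}")
```

A fixed tolerance keeps small run-to-run noise from failing a build. A check like this could be wrapped in a Pytest test so that any non-empty result fails CI.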

→ Stack

Python, RAGAS, TruLens, DeepEval, LangChain, Pytest, Jinja2

42% · Hallucination reduction at ScaleUp Labs

6 · Evaluation metrics tracked

3 lines · To integrate into any RAG pipeline

How I architect AI systems

Let's build something intelligent.