Lahari Karrotu
Software engineer
Backend & cloud systems · Healthcare technology
I gravitate toward the messy middle of engineering — where specifications meet legacy data, where latency and correctness trade off, and where "it works on my machine" is not the bar. Professionalism, to me, is taking that middle seriously: documentation, triage, and the judgment to know when to stop tuning and ship.

Where my attention goes
The work I seek out: boundaries, reliability, and problems that do not fit a single Jira label.
Regulated data & API shape
Making FHIR payloads, legacy sources, and client expectations line up — including what never leaves the server, and why.
Production and the story logs tell
Tracing failures across Lambda, databases, and gateways; caring about the trend line, not a single hero fix.
Infrastructure you can reason about
Terraform, CloudFormation, and IAM that a teammate can audit — not magic that only one person understands.
Side engineering with the same bar
Multimodal and vision projects where models, APIs, and error handling are designed together, not bolted on.
What I build
Capabilities I return to — at work and in my own time
About
Context, not a pitch
Where I've worked
Roles where scope, compliance, and on-call reality mattered
Software Engineer
Cigna
Aug 2024 – Present
- Build and maintain FHIR R4 REST APIs that translate legacy claims and eligibility data into HL7 resources for the member app: serialization, bundle pagination, and versioned profiles at large member scale
- Own a Golang-based masking pipeline in API Gateway that redacts PII/PHI (SSNs, diagnosis codes, insurance IDs) before responses reach clients, aligned with HIPAA minimum-necessary practice
- Design and operate AWS serverless infrastructure for claims microservices (Lambda, DynamoDB key design, SQS DLQs, API Gateway throttling), managed with Terraform and CloudFormation and least-privilege IAM across environments
- Implement Pega BPM decision tables for adjudication and routing so business teams can update insurance rules independently
- Lead production incident triage with Splunk; addressed HikariCP pool exhaustion and DynamoDB hot partitions, bringing the member login 5xx rate from 3.2% to under 0.5% over two sprint cycles
- Partner across product, compliance, and engineering in SAFe Agile, including architecture reviews with distributed teams
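As a rough illustration of the masking idea in the bullets above, the sketch below redacts sensitive fields from a response payload before it would be returned to a client. It is written in Python for brevity (the production pipeline described here is Go), and the field names, the SSN pattern, and the mask string are assumptions made for illustration, not the actual rules.

```python
import re
from typing import Any, Optional

# Hypothetical field names and pattern, chosen for illustration only.
SENSITIVE_KEYS = {"ssn", "diagnosis_code", "insurance_id"}
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
MASK = "***REDACTED***"


def mask_payload(value: Any, key: Optional[str] = None) -> Any:
    """Return a copy of value with sensitive fields and SSN-like strings masked."""
    # A sensitive key masks its whole value, whatever the type.
    if key is not None and key.lower() in SENSITIVE_KEYS:
        return MASK
    if isinstance(value, dict):
        return {k: mask_payload(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [mask_payload(item) for item in value]
    if isinstance(value, str):
        # Catch SSN-shaped strings that leak into free-text fields.
        return SSN_PATTERN.sub(MASK, value)
    return value
```

Masking by key and by pattern together is a deliberate choice in this sketch: key-based rules cover structured fields, while the pattern catches values that end up in notes or descriptions.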
Software Engineering Intern
Zoho
Aug 2021 – May 2022
- Developed production J2EE SaaS backend modules in layered MVC, shipping features from database to API within Agile sprints
- Designed a YAML-to-table JDBC data access layer replacing per-entity DAO boilerplate across 14 entities
- Built authentication and sessions: BCrypt (work factor 12), OTP 2FA via JavaMail with SMTP TLS options, idle timeout for multi-tenant production
- Optimized MySQL by replacing N+1 query patterns with indexed JOINs, taking page loads from ~4.8s to under 200ms on realistic datasets
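The N+1 fix in the last bullet can be sketched in a few lines. This example uses Python's built-in sqlite3 module rather than MySQL and JDBC so that it runs anywhere, and the customers/orders schema is a hypothetical stand-in for the real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    CREATE INDEX idx_orders_customer ON orders (customer_id);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO orders VALUES (1, 1, 20.0), (2, 1, 5.0), (3, 2, 12.5);
""")


def totals_n_plus_one(conn):
    """One query for the customers, then one more per customer (N+1 round trips)."""
    result = {}
    customers = conn.execute("SELECT id, name FROM customers").fetchall()
    for customer_id, name in customers:
        row = conn.execute(
            "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer_id = ?",
            (customer_id,),
        ).fetchone()
        result[name] = row[0]
    return result


def totals_joined(conn):
    """Same answer in a single query, using a JOIN on the indexed foreign key."""
    rows = conn.execute("""
        SELECT c.name, COALESCE(SUM(o.total), 0)
        FROM customers c
        LEFT JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id, c.name
    """).fetchall()
    return dict(rows)
```

Both functions return the same mapping; the difference is that the second issues one statement regardless of how many customers exist, which is where the page-load savings would come from.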
GitHub
Public rhythm — commits when I have something worth pushing
Commit Activity
Active · Real-time commit activity over the past year
Workflow orchestration
A live sketch of how layered pipelines fit together — useful for intuition, not a deployment diagram
LLM Processing
Vector Search
API Gateway
Frontend
Live Workflow: This visualization shows AI pipeline orchestration in real time, from LLM processing and vector search to API gateway and frontend delivery. Each step shows actual metrics and processing states, reflecting production system behavior with fault tolerance, monitoring, and scalable architecture.
System design snapshot
One detailed thread — e-commerce + AI — showing how I structure services and trade-offs
SmartBuy AI: Production E-Commerce Platform with AI Integration
A scalable full-stack eCommerce platform with AI-powered navigation, real-time inventory management, and optimized performance.
Architecture Overview
- Frontend: Next.js (SSR/SSG) with React, deployed on Vercel Edge Network for global CDN distribution
- Backend: FastAPI microservices with async/await for scalable request handling
- Database: PostgreSQL with connection pooling, fact/dimension tables for analytics, Redis cache layer (95% hit rate)
- Data Pipelines: Python + SQL ETL workflows with Airflow orchestration, data quality checks, and analytics-ready schemas
- AI Service: OpenAI API with request batching, response caching, and fallback mechanisms
- Vector Search: Pinecone for semantic product search with 50K+ product embeddings
Scalability & Performance
- Load Handling: Scalable architecture with horizontal scaling, connection pooling, and caching strategies
- Response Times: API latency <300ms (p95), AI queries <2s, page loads <1.2s
- Caching Strategy: Multi-layer (CDN, Redis, in-memory) reducing DB load by 80%
- Database: Read replicas for scaling, connection pooling (max 100 connections), query optimization
- Monitoring: Real-time metrics (Prometheus), error tracking (Sentry), APM (New Relic)
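To make the multi-layer caching bullet concrete, here is a minimal read-through sketch: check an in-process cache, then a shared cache, and only then the database. The shared layer is represented by a second in-memory TTL cache standing in for Redis, and the class names, TTL values, and loader function are all assumptions for illustration rather than the SmartBuy implementation.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny cache with per-entry expiry; a stand-in for one cache layer."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl = ttl_seconds
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            # Expired entries are dropped lazily on read.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._store[key] = (time.monotonic() + self.ttl, value)


class LayeredReader:
    """Read-through lookup: local cache, then shared cache, then the database."""

    def __init__(self, local: TTLCache, shared: TTLCache,
                 load_from_db: Callable[[str], str]) -> None:
        self.local = local
        self.shared = shared
        self.load_from_db = load_from_db
        self.db_reads = 0  # counts how often the slowest layer is hit

    def get(self, key: str) -> str:
        value = self.local.get(key)
        if value is not None:
            return value
        value = self.shared.get(key)
        if value is None:
            value = self.load_from_db(key)
            self.db_reads += 1
            self.shared.set(key, value)
        # Backfill the faster layer so the next read stops earlier.
        self.local.set(key, value)
        return value
```

A short TTL on the local layer and a longer one on the shared layer is one way to bound staleness, which matches the trade-off of accepting eventual consistency for product data.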
Key Design Decisions & Trade-offs
1. Microservices vs Monolith
Chose FastAPI microservices for independent scaling of AI service vs. core eCommerce logic, accepting added complexity for better resource utilization.
2. Vector DB Selection
Pinecone over self-hosted (FAISS/Milvus) for managed scalability and lower ops overhead, trading cost for reliability.
3. Caching Strategy
Multi-layer caching (CDN → Redis → DB) prioritized read performance, accepting eventual consistency for product data.
4. Failure Handling
Circuit breakers for AI API, graceful degradation to keyword search, retry logic with exponential backoff for improved reliability during service outages.
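The failure-handling decision above combines three ideas: a circuit breaker, retries with exponential backoff, and degradation to keyword search. The sketch below shows one way they can fit together. The thresholds, the delay values, and the `semantic_search` and `keyword_search` callables are hypothetical placeholders, not the project's actual code.

```python
import time
from typing import Callable, List, Optional


class CircuitBreaker:
    """Opens after repeated failures, then allows a trial call after a cool-down."""

    def __init__(self, failure_threshold: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        # Once the cool-down has passed, let a trial call through (half-open).
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()


def search(query: str,
           semantic_search: Callable[[str], List[str]],
           keyword_search: Callable[[str], List[str]],
           breaker: CircuitBreaker,
           retries: int = 2,
           base_delay: float = 0.1,
           sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Try the AI-backed search with backoff; fall back to keyword search."""
    if breaker.allow():
        for attempt in range(retries + 1):
            try:
                results = semantic_search(query)
                breaker.record_success()
                return results
            except Exception:
                breaker.record_failure()
                if attempt == retries or not breaker.allow():
                    break
                # Exponential backoff: base_delay, 2x, 4x, ...
                sleep(base_delay * (2 ** attempt))
    # Graceful degradation: the user still gets results.
    return keyword_search(query)
```

Passing `clock` and `sleep` as parameters keeps the sketch testable without real waiting; in a service these would simply default to the standard library functions.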
Scaling Strategy
Horizontal Scaling
FastAPI instances behind load balancer, auto-scaling based on CPU/memory (2-10 instances), stateless design for easy scaling.
Database Scaling
Read replicas for query distribution, connection pooling, query optimization, and planned sharding for 100K+ products.
Cost Optimization
AI request batching, response caching (60% cache hit rate), CDN for static assets, reducing infrastructure costs by 40%.
How I work
Habits and standards — not a sales sheet
End-to-end ownership
I prefer owning a slice from schema or contract through deployment and metrics. Side projects are how I keep that muscle — HealthScan, Blinds & Boundaries, SmartBuy, Resume Tailor — whole systems, not diagram-only ideas.
Stay with the failure mode
Production work taught me to read traces and logs before opinions. I care about why something broke for real users, not only that a graph went green again.
Depth across languages
Java, Python, Golang, TypeScript — each for what it is good at. Professionalism is knowing when FHIR and AWS are the story and when a small FastAPI service is the right tool.
Projects
Work I have carried from idea to something runnable — some experimental, all intentional

SmartBuy v2 — Multimodal AI Shopping Agent
A multimodal shopping agent that holds a persistent WebSocket connection to the Gemini Multimodal Live API for simultaneous voice, webcam context, and two-way interaction.

HealthScan — AI Healthcare Assistant
Multimodal assistant that reads prescription photos, extracts medications with vision LLMs, checks interactions via RxNorm, and returns diet and lifestyle guidance — JWT auth, PII scrubbing, and audit-minded API design.

AI Resume Tailor — Full-Stack Job Application Platform
A Vercel-only Next.js App Router platform that parses job descriptions with Claude AI, tailors resume DOCX output via XML manipulation, and logs applications to Google Sheets.

Blinds & Boundaries — AI Virtual Try-On Platform
Virtual try-on application for window blinds using computer vision and 3D rendering. Enables customers to visualize products in their actual space with realistic lighting and perspective. Deployed on Azure with real-time image processing.

Railway Predictive Maintenance System
Real-time predictive maintenance system for railway equipment using machine learning. Processes sensor data streams with TensorFlow LSTM models to predict equipment failures, enabling proactive maintenance and reducing downtime.

Get in touch
If you are working on a hard systems problem, a regulated product, or something at the edge of useful AI, I am glad to read a thoughtful note. No need for a perfect subject line.
I'd love to hear from you! Let's discuss your next project.
Or email me directly at laharikarrotu@gmail.com
Currently turning over
- Agentic flows that earn their complexity: tool use, guardrails, and when not to call a model
- Event-driven boundaries and backpressure, the boring parts of "scale"
- Interop in healthcare: standards that help when they are implemented with care
- Small, sharp open-source contributions when a repo's problem statement is clear
Case Study: SmartBuy AI
How I shipped an AI shopping assistant end-to-end
Overview
SmartBuy AI embeds a conversational assistant inside a full-stack eCommerce experience. It handles search, recommendations, and navigation with a lightweight vector index and FastAPI services.
Problem
Users struggled to find products quickly and dropped off after slow search or poor recommendations.
Solution
Conversational assistant + semantic search backed by FastAPI, embeddings, and cached intents for sub-3s responses.
Tech Stack
Next.js, TypeScript, FastAPI, PostgreSQL, OpenAI API, Tailwind, Vercel
My Role
Architecture, API design, AI integration, frontend UX, deployment.
Results
- AI query responses in <3s with cached intents
- Faster product discovery via semantic search and voice/text navigation
- Production-ready deployment with monitoring and uptime focus
Architecture (high level)
Architecture: ScanX Agent Loop
Tech stack
What I reach for regularly — familiarity earned by use
Technical breadth
Grouped for clarity — not a contest of how many logos fit on a slide
🧩 Full-Stack & Backend
🤖 AI / ML & Agents
📡 Data Engineering & Pipelines
☁️ Cloud & Infrastructure
Education
Academic background in Computer Science
M.S. Computer Science — Systems, Data Engineering & Applied AI
Florida Institute of Technology
Aug 2022 – May 2024
GPA: 3.6 / 4.0
B.Tech Computer Science
KL University
Aug 2018 – May 2022
8.7 / 10
Certifications & Badges
Professional certifications and verified achievements
ServiceNow Certified System Administrator
ServiceNow
Cisco CCNA: Switching, Routing & Wireless
Cisco
I Write Too
Thoughts on AI, distributed systems, and what I'm learning
How I think about engineering
Trade-offs and patterns I have actually pushed through — not generic interview fodder
Scaling FastAPI for High-Throughput AI Workloads
How I optimized FastAPI microservices for scalable request handling, implementing connection pooling, async/await patterns, and request batching for AI API calls.
Vector Database Optimization for Semantic Search
Deep dive into optimizing Pinecone vector search for 50K+ product embeddings, achieving 99.5% retrieval accuracy with <500ms query time through embedding compression and indexing strategies.
Building Fault-Tolerant LLM Pipelines
Designing resilient AI workflows with circuit breakers, graceful degradation, retry logic, and fallback mechanisms for improved reliability during AI service outages.
Multi-Layer Caching Strategy for E-Commerce
Implementing CDN → Redis → Database caching layers, achieving 95% cache hit rate and reducing database load by 80% while maintaining data consistency.
Real-Time Monitoring for Production AI Systems
Building comprehensive monitoring with Prometheus, Sentry, and custom metrics for LLM pipelines, enabling real-time alerting and performance tracking with <100ms overhead.
ETL Pipeline Design for Analytics at Scale
Architecting Python + SQL ETL pipelines processing 1,000+ structured records/day with data quality checks, fact/dimension table design, and Airflow orchestration, improving query performance by 35% and reducing data issues by 40%.



