Lahari Karrotu

Software engineer

Backend & cloud systems · Healthcare technology

At Cigna I work on member-facing healthcare APIs and the infrastructure behind them — translation into FHIR, protection of sensitive fields, and the operational habits that keep large populations from seeing your mistakes. That same bias toward clarity shows up in how I build my own projects.

I gravitate toward the messy middle of engineering — where specifications meet legacy data, where latency and correctness trade off, and where "it works on my machine" is not the bar. Professionalism, to me, is taking that middle seriously: documentation, triage, and the judgment to know when to stop tuning and ship.

Where my attention goes

The work I seek out: boundaries, reliability, and problems that do not fit a single Jira label.

Regulated data & API shape

Making FHIR payloads, legacy sources, and client expectations line up — including what never leaves the server, and why.

Production and the story logs tell

Tracing failures across Lambda, databases, and gateways; caring about the trend line, not a single hero fix.

Infrastructure you can reason about

Terraform, CloudFormation, and IAM that a teammate can audit — not magic that only one person understands.

Side engineering with the same bar

Multimodal and vision projects where models, APIs, and error handling are designed together, not bolted on.

What I build

Capabilities I return to — at work and in my own time

Healthcare APIs — FHIR R4, bundle pagination, PHI masking, HIPAA-minded design

AWS Serverless — Lambda, DynamoDB, SQS, API Gateway, Terraform & CloudFormation

AI Healthcare Side Projects — prescription vision, RxNorm checks, audit-friendly pipelines

Computer Vision — virtual try-on, Azure CV → OpenCV fallbacks, realistic overlays

Production backends — Java, Python, Golang; Pega BPM rules; Splunk & CloudWatch triage

Full-stack tools — Next.js, FastAPI, PostgreSQL, GitHub Actions, Vercel & Azure

About

Context, not a pitch

I'm based in the United States. My day job is backend and cloud for healthcare: FHIR-shaped APIs, AWS serverless pieces, and the policies and incident practice that sit around them. I am as interested in how a team agrees on a contract as in how a query plan looks.

Away from that, I still build — multimodal assistants, computer vision, full-stack tools — because I learn by finishing things. Uniqueness is not a tagline; it's the combination of what you refuse to hand-wave and what you choose to study next.

Formal side: M.S. in Computer Science from Florida Tech (systems, data engineering, applied AI); B.Tech from KL University. Certifications — AWS Solutions Architect, ServiceNow CSA, Cisco CCNA — are useful shorthand; the proof is in how the systems behave.

Where I've worked

Roles where scope, compliance, and on-call reality mattered

Software Engineer

Cigna

Aug 2024 – Present

▸Build and maintain FHIR R4 REST APIs that translate legacy claims and eligibility into HL7 resources for the member app — serialization, bundle pagination, and versioned profiles at large member scale
▸Own a Golang-based masking pipeline in API Gateway that redacts PII/PHI (SSNs, diagnosis codes, insurance IDs) before responses reach clients, aligned with HIPAA minimum-necessary practice
▸Design and operate AWS serverless infrastructure for claims microservices: Lambda, DynamoDB key design, SQS DLQs, API Gateway throttling — Terraform and CloudFormation with least-privilege IAM across environments
▸Implement Pega BPM decision tables for adjudication and routing so business teams can update insurance rules independently
▸Lead production incident triage with Splunk; addressed HikariCP pool exhaustion and DynamoDB hot partitions, cutting the member login 5xx rate from 3.2% to under 0.5% over two sprint cycles
▸Partner across product, compliance, and engineering in SAFe Agile, including architecture reviews with distributed teams
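The masking step in the second bullet can be pictured with a small sketch. It is written in Python for brevity rather than the Go used in the actual pipeline, and every name in it (SENSITIVE_KEYS, mask_payload, the field names, the SSN pattern) is a hypothetical stand-in, not the production rule set.

```python
import re
from typing import Any

# Hypothetical field names treated as sensitive; the real rules are not shown here.
SENSITIVE_KEYS = {"ssn", "diagnosis_code", "insurance_id"}

# US SSN written as 123-45-6789, used to catch SSNs embedded in free text.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

MASK = "***"


def mask_payload(value: Any) -> Any:
    """Return a copy of a JSON-like value with sensitive data redacted."""
    if isinstance(value, dict):
        # Redact whole values for sensitive keys, recurse into everything else.
        return {
            key: MASK if key.lower() in SENSITIVE_KEYS else mask_payload(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [mask_payload(item) for item in value]
    if isinstance(value, str):
        return SSN_PATTERN.sub(MASK, value)
    return value
```

Masking on a copy, as the response leaves the gateway, keeps the unmasked record on the server side and makes the redaction rule a single place to audit.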

Software Engineering Intern

Zoho

Aug 2021 – May 2022

▸Developed production J2EE SaaS backend modules in layered MVC, shipping features from database to API within Agile sprints
▸Designed a YAML-to-table JDBC data access layer replacing per-entity DAO boilerplate across 14 entities
▸Built authentication and sessions: BCrypt (work factor 12), OTP 2FA via JavaMail with SMTP TLS options, idle timeout for multi-tenant production
▸Optimized MySQL by fixing N+1 patterns with indexed JOINs, cutting page loads from ~4.8s to under 200ms on realistic datasets
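The N+1 fix in the last bullet follows a familiar shape. The sketch below uses Python's built-in sqlite3 with an in-memory database instead of MySQL, and the customers/orders tables are invented for illustration; it only shows the pattern, not the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    -- Index the join column so lookups by customer do not scan the whole table.
    CREATE INDEX idx_orders_customer ON orders (customer_id);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Lin');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
    """
)


def totals_n_plus_one(db):
    """One query for the customers, then one more per customer (the N+1 shape)."""
    result = {}
    customers = db.execute("SELECT id, name FROM customers").fetchall()
    for cust_id, name in customers:
        row = db.execute(
            "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer_id = ?",
            (cust_id,),
        ).fetchone()
        result[name] = row[0]
    return result


def totals_joined(db):
    """The same result from a single JOIN with GROUP BY."""
    rows = db.execute(
        """
        SELECT c.name, COALESCE(SUM(o.total), 0)
        FROM customers AS c
        LEFT JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        """
    )
    return dict(rows.fetchall())
```

Both functions return the same totals; the second does it in one round trip, which is where most of the saving comes from once the row count grows.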

GitHub

Public rhythm — commits when I have something worth pushing

Real-time commit activity over the past year

Workflow orchestration

A live sketch of how layered pipelines fit together — useful for intuition, not a deployment diagram

Headline metrics: uptime 99.9% · latency <300ms · throughput 1K+ req/min · cache hit rate 95%

Pipeline stages: LLM Processing (queries/min) · Vector Search (embeddings) · API Gateway (requests/sec) · Frontend (active users)

Distributed System · Real-time Processing

Live Workflow: This visualization demonstrates real-time AI pipeline orchestration—from LLM processing and vector search to API gateway and frontend delivery. Each step shows actual metrics and processing states, reflecting production system behavior with fault tolerance, monitoring, and scalable architecture.

System design snapshot

One detailed thread — e-commerce + AI — showing how I structure services and trade-offs

SmartBuy AI: Production E-Commerce Platform with AI Integration

A scalable full-stack eCommerce platform with AI-powered navigation, real-time inventory management, and optimized performance.

Architecture Overview

• Frontend: Next.js (SSR/SSG) with React, deployed on Vercel Edge Network for global CDN distribution
• Backend: FastAPI microservices with async/await for scalable request handling
• Database: PostgreSQL with connection pooling, fact/dimension tables for analytics, Redis cache layer (95% hit rate)
• Data Pipelines: Python + SQL ETL workflows with Airflow orchestration, data quality checks, and analytics-ready schemas
• AI Service: OpenAI API with request batching, response caching, and fallback mechanisms
• Vector Search: Pinecone for semantic product search with 50K+ product embeddings

Scalability & Performance

• Load Handling: Stateless FastAPI instances scaled horizontally behind a load balancer, with connection pooling and multi-layer caching
• Response Times: API latency <300ms (p95), AI queries <2s, page loads <1.2s
• Caching Strategy: Multi-layer (CDN, Redis, in-memory) reducing DB load by 80%
• Database: Read replicas for scaling, connection pooling (max 100 connections), query optimization
• Monitoring: Real-time metrics (Prometheus), error tracking (Sentry), APM (New Relic)

Key Design Decisions & Trade-offs

1. Microservices vs Monolith

Chose FastAPI microservices for independent scaling of AI service vs. core eCommerce logic, accepting added complexity for better resource utilization.

2. Vector DB Selection

Pinecone over self-hosted (FAISS/Milvus) for managed scalability and lower ops overhead, trading cost for reliability.

3. Caching Strategy

Multi-layer caching (CDN → Redis → DB) prioritized read performance, accepting eventual consistency for product data.

4. Failure Handling

Circuit breakers for AI API, graceful degradation to keyword search, retry logic with exponential backoff for improved reliability during service outages.
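The failure-handling decision above combines three ideas: a circuit breaker, retries with exponential backoff, and degradation to keyword search. A minimal Python sketch of how they can fit together follows. The class and function names, the thresholds, and the injected `ai_search` / `keyword_search` callables are all assumptions for illustration, not SmartBuy's actual code.

```python
import time
from typing import Callable, List, Optional


class CircuitBreaker:
    """Opens after max_failures consecutive failures; allows a trial call after reset_after seconds."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means the circuit is closed

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        # Half-open: let a trial request through once the cool-down has passed.
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()


def search(query: str,
           ai_search: Callable[[str], List[str]],
           keyword_search: Callable[[str], List[str]],
           breaker: CircuitBreaker,
           retries: int = 3,
           base_delay: float = 0.2,
           sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Try the AI search with exponential backoff; fall back to keyword search."""
    if breaker.allow():
        for attempt in range(retries):
            try:
                results = ai_search(query)
                breaker.record_success()
                return results
            except Exception:
                breaker.record_failure()
                if not breaker.allow() or attempt == retries - 1:
                    break
                sleep(base_delay * (2 ** attempt))  # 0.2s, 0.4s, ...
    # Graceful degradation: the user still gets results while the AI path is down.
    return keyword_search(query)
```

Passing `clock` and `sleep` in as parameters is a testing convenience: the breaker's timing can be exercised without real waits.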

Scaling Strategy

Horizontal Scaling

FastAPI instances behind load balancer, auto-scaling based on CPU/memory (2-10 instances), stateless design for easy scaling.

Database Scaling

Read replicas for query distribution, connection pooling, query optimization, and planned sharding for 100K+ products.

Cost Optimization

AI request batching, response caching (60% cache hit rate), CDN for static assets, reducing infrastructure costs by 40%.

How I work

Habits and standards — not a sales sheet

End-to-end ownership

I prefer owning a slice from schema or contract through deployment and metrics. Side projects are how I keep that muscle — HealthScan, Blinds & Boundaries, SmartBuy, Resume Tailor — whole systems, not diagram-only ideas.

Stay with the failure mode

Production work taught me to read traces and logs before opinions. I care about why something broke for real users, not only that a graph went green again.

Depth across languages

Java, Python, Golang, TypeScript — each for what it is good at. Professionalism is knowing when FHIR and AWS are the story and when a small FastAPI service is the right tool.

Projects

Work I have carried from idea to something runnable — some experimental, all intentional

Live

Full-Stack • 2024

SmartBuy v2 — Multimodal AI Shopping Agent

🎙️Interaction Mode

Voice + Webcam + Text

🧵Runtime

Persistent WebSocket

A real multimodal shopping agent with persistent WebSocket connectivity to Gemini Multimodal Live API for simultaneous voice, webcam context, and two-way interaction.

React · TypeScript · Gemini Multimodal Live API · WebSockets · +3

Completed

AI/ML • 2025

HealthScan — AI Healthcare Assistant

🏗️Architecture

3-Engine Agent System

💊Medication Safety

RxNav/RxNorm Checks

Multimodal assistant that reads prescription photos, extracts medications with vision LLMs, checks interactions via RxNorm, and returns diet and lifestyle guidance — JWT auth, PII scrubbing, and audit-minded API design.

Next.js 15 · TypeScript · FastAPI · React Native · +7

Completed

Full-Stack • 2025

AI Resume Tailor — Full-Stack Job Application Platform

🧱Architecture

Next.js API Routes Only

🤖Model

Anthropic Claude

A Vercel-only Next.js App Router platform that parses job descriptions with Claude AI, tailors resume DOCX output via XML manipulation, and logs applications to Google Sheets.

Next.js 14 · TypeScript · Anthropic Claude API · Google Sheets API · +2

Live

AI/ML • 2024

Blinds & Boundaries — AI Virtual Try-On Platform

🎯Detection Accuracy

88%

📸Images Processed

500+

Virtual try-on application for window blinds using computer vision and 3D rendering. Enables customers to visualize products in their actual space with realistic lighting and perspective. Deployed on Azure with real-time image processing.

React · TypeScript · FastAPI · Python · +5

Completed

AI/ML • 2024

Blinds Pro - Advanced Visualization Platform

⚡Processing Speed

2.0s

📊Batch Processing

10x

Enhanced version of the virtual try-on application with additional features for professional interior design and commercial applications.

TypeScript · React · Next.js · Three.js · +3

Production

AI/ML • 2024

Railway Predictive Maintenance System

🎯Model Accuracy

90%+

⚡Processing Latency

<200ms

Real-time predictive maintenance system for railway equipment using machine learning. Processes sensor data streams with TensorFlow LSTM models to predict equipment failures, enabling proactive maintenance and reducing downtime.

Python · TensorFlow · Apache Spark · AWS Lambda · +4

Live

Full-Stack • 2024

Auto Loan AI Processing System

🎯OCR Accuracy

95%+

⚡Processing Time

30-60s

AI-powered loan processing system using AWS Textract OCR and voice assistants, reducing processing time by 40% and improving accuracy.

React · TypeScript · Python · FastAPI · +5

Completed

Full-Stack • 2024

Fitness Transformation Application

📈User Retention

78%

🎯Voice Recognition Accuracy

94%

AI-driven fitness application with personalized recommendations, voice navigation, and intelligent workout planning using LLM APIs.

React · TypeScript · Node.js · OpenAI API · +3

Completed

Full-Stack • 2024

AI Job Hunter - Intelligent Job Search Platform

🎯Match Accuracy

88%

📊Job Discovery Rate

+60%

AI-powered job search and application platform that uses machine learning to match candidates with relevant opportunities and optimize application strategies.

TypeScript · React · Next.js · Python · +4

Completed

Full-Stack • 2023

Taskify Pro - Smart Task Management

⚡Real-time Sync Latency

<50ms

👥Concurrent Users

1000+

Enterprise-grade task management application with real-time collaboration, AI-powered prioritization, and advanced notification systems.

JavaScript · Node.js · Express.js · MongoDB · +3

Completed

Full-Stack • 2023

Personal Portfolio & Showcase

🏆Lighthouse Score

100

⚡Page Load Time

0.8s

Comprehensive personal portfolio website showcasing professional projects, skills, and achievements with modern design and interactive features.

TypeScript · React · Next.js · Tailwind CSS · +1

Get in touch

If you are working on a hard systems problem, a regulated product, or something at the edge of useful AI, I am glad to read a thoughtful note. No need for a perfect subject line.

Or email me directly at laharikarrotu@gmail.com

Elsewhere

LinkedIn · GitHub · Email

Currently turning over

• Agentic flows that earn their complexity — tool use, guardrails, and when not to call a model
• Event-driven boundaries and backpressure — the boring parts of "scale"
• Interop in healthcare — standards that help when they are implemented with care
• Small, sharp open-source contributions when a repo's problem statement is clear

Case Study: SmartBuy AI

How I shipped an AI shopping assistant end-to-end

Overview

SmartBuy AI embeds a conversational assistant inside a full-stack eCommerce experience. It handles search, recommendations, and navigation with a lightweight vector index and FastAPI services.

Problem

Users struggled to find products quickly and dropped off after slow search or poor recommendations.

Solution

Conversational assistant + semantic search backed by FastAPI, embeddings, and cached intents for sub-3s responses.

Tech Stack

Next.js, TypeScript, FastAPI, PostgreSQL, OpenAI API, Tailwind, Vercel

My Role

Architecture, API design, AI integration, frontend UX, deployment.

Results

• AI query responses in <3s with cached intents
• Faster product discovery via semantic search and voice/text navigation
• Production-ready deployment with monitoring and uptime focus

Architecture (high level)

Client (Next.js) → API Gateway (FastAPI) → LLM Orchestrator (OpenAI, LangChain)

Vector Store (embeddings) for semantic search + intent caching

PostgreSQL for products/orders · Background jobs for sync/index updates

Vercel + Azure/AWS services · Observability via logs/metrics

Architecture: ScanX Agent Loop

Frontend (Next.js) captures screenshots/specs → uploads to FastAPI

FastAPI routes to Orchestrator (LangChain + OpenAI) to parse layout and tasks

Vector store (PostgreSQL/pgvector) for similarity + retrieval

Action planner emits task cards → returned to UI for human-in-the-loop confirmation

Tech stack

What I reach for regularly — familiarity earned by use

Languages

Java · Python · Golang · TypeScript · SQL · JavaScript

Frameworks

Spring Boot · FastAPI · React · Next.js · Node.js · HAPI FHIR

AWS

Lambda · DynamoDB · S3 · RDS · SQS · API Gateway · CloudWatch

Azure & AI

App Service · Blob Storage · Computer Vision · Gemini · GPT-4o · OpenCV

Data & enterprise

PostgreSQL · MySQL · Oracle · DB2 · Apache Spark · Databricks · Pega BPM

Observability & quality

Splunk · New Relic · SonarQube · Playwright

Infra & delivery

Terraform · CloudFormation · Docker · Jenkins · GitHub Actions · SAFe Agile

Technical breadth

Grouped for clarity — not a contest of how many logos fit on a slide

🧩Full-Stack & Backend

☕Java / Spring Boot

🐍Python / FastAPI

🐹Golang

📘TypeScript / React / Next.js

🏥HAPI FHIR

🟢Node.js

🐘PostgreSQL / MySQL / DynamoDB

🔗REST / FHIR R4

🤖AI / ML & Agents

✨Gemini Pro / GPT-4o Vision

⛓️LangChain & agent pipelines

📚RAG & multimodal flows

🎭Playwright automation

📝Tesseract OCR / OpenCV

👁️Azure Computer Vision

📡Data Engineering & Pipelines

🗄️SQL (PostgreSQL, MySQL, Oracle, DB2)

⚡Apache Spark / Databricks

🔄ETL & data validation

⚙️DynamoDB access patterns

🔍Query tuning & indexing

📋Pega / enterprise BPM data

☁️Cloud & Infrastructure

☁️AWS Lambda · DynamoDB · S3 · RDS · SQS

🌐API Gateway & CloudWatch

🏗️Terraform & CloudFormation

🔷Azure App Service & Blob

🐳Docker

🔄Jenkins & GitHub Actions

📡Splunk & New Relic

▲Vercel (projects)

Education

Academic background in Computer Science

M.S. Computer Science — Systems, Data Engineering & Applied AI

Florida Institute of Technology

Aug 2022 – May 2024

GPA: 3.6 / 4.0

B.Tech Computer Science

KL University

Aug 2018 – May 2022

8.7 / 10

Certifications & Badges

Professional certifications and verified achievements

AWS Certified Solutions Architect – Associate

Amazon Web Services

ServiceNow Certified System Administrator

ServiceNow

Cisco CCNA: Switching, Routing & Wireless

Cisco

I Write Too

Thoughts on AI, distributed systems, and what I'm learning

AI in 2026: Why Sovereignty, Smaller Models, and Smarter Data Will Matter More Than Bigger GPUs

An exploration of the shifting AI landscape, focusing on model sovereignty, efficient smaller models, and data quality over raw compute power.

Read on Medium

The Hidden Energy Cost of AI: How LLMs and Distributed Systems Consume Power

An analysis of energy consumption patterns in large language models and distributed computing systems, exploring the environmental impact and optimization strategies.

Read on Medium

How I think about engineering

Trade-offs and patterns I have actually pushed through — not generic interview fodder

Backend Engineering

Scaling FastAPI for High-Throughput AI Workloads

How I optimized FastAPI microservices for scalable request handling, implementing connection pooling, async/await patterns, and request batching for AI API calls.

Async Architecture · Connection Pooling · Request Batching · Performance Optimization

Metrics: Optimized request handling, improved latency, cost-effective scaling

AI/ML Systems

Vector Database Optimization for Semantic Search

Deep dive into optimizing Pinecone vector search for 50K+ product embeddings, achieving 99.5% retrieval accuracy with <500ms query time through embedding compression and indexing strategies.

Vector Databases · Embedding Optimization · Semantic Search · RAG Systems

Metrics: 50K+ embeddings, 99.5% accuracy, <500ms query time

Distributed Systems

Building Fault-Tolerant LLM Pipelines

Designing resilient AI workflows with circuit breakers, graceful degradation, retry logic, and fallback mechanisms for improved reliability during AI service outages.

Fault Tolerance · Circuit Breakers · Graceful Degradation · Error Handling

Metrics: Improved reliability, minimal monitoring overhead, robust error handling

Performance Engineering

Multi-Layer Caching Strategy for E-Commerce

Implementing CDN → Redis → Database caching layers, achieving 95% cache hit rate and reducing database load by 80% while maintaining data consistency.

Caching Strategies · CDN Optimization · Redis · Cache Invalidation

Metrics: 95% cache hit rate, 80% DB load reduction, <1.2s page loads
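The read path behind a layered cache like the one described above can be sketched in a few lines of Python. This is an illustration under assumptions: the CDN layer is not modeled, a small in-process TTLCache stands in for both the local and the Redis layer, and `read_through` and `load_from_db` are hypothetical names. The TTLs are what bound how stale a product record can get, which is the eventual-consistency trade-off.

```python
import time
from typing import Any, Callable, Dict, List, Optional, Tuple


class TTLCache:
    """Tiny cache with per-entry expiry; stands in for one cache layer."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl
        self.clock = clock
        self._items: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._items[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._items[key] = (self.clock() + self.ttl, value)


def read_through(key: str, layers: List[TTLCache],
                 load_from_db: Callable[[str], Any]) -> Any:
    """Check each layer in order; on a miss everywhere, load from the DB and backfill."""
    for i, layer in enumerate(layers):
        value = layer.get(key)
        if value is not None:
            for faster in layers[:i]:  # backfill the faster layers that missed
                faster.set(key, value)
            return value
    value = load_from_db(key)
    for layer in layers:
        layer.set(key, value)
    return value
```

Giving the faster layer a shorter TTL than the shared one keeps hot keys close to the request while limiting how long any single process can serve an outdated value.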

Observability

Real-Time Monitoring for Production AI Systems

Building comprehensive monitoring with Prometheus, Sentry, and custom metrics for LLM pipelines, enabling real-time alerting and performance tracking with <100ms overhead.

Monitoring · Observability · Alerting · Performance Tracking

Metrics: Minimal monitoring overhead, real-time alerting, improved system reliability

Data Engineering

ETL Pipeline Design for Analytics at Scale

Architecting Python + SQL ETL pipelines processing 1,000+ structured records/day with data quality checks, fact/dimension table design, and Airflow orchestration, improving query performance by 35% and reducing data issues by 40%.

ETL Pipelines · Data Modeling · SQL Optimization · Data Quality · Airflow

Metrics: 1,000+ records/day, 35% query improvement, 40% fewer data issues