Lahari Karrotu

Software Engineer | Backend Services | Data Pipelines | AI-Enabled Applications

📋 Quick Summary:

  • Built FastAPI services processing 2,000+ daily requests (Arkatech)
  • Optimized PostgreSQL queries: 6s → 2s (67% improvement)
  • Improved API success rate: 87% → 98% (Anguliyam)
  • Reduced ETL runtime: 42min → 28min (33% faster, Cognizant)
  • AI projects: 85% accuracy (ScanX), 88% detection (Blinds), 82% F1 score (ATS Resume)

Impact at a Glance

  • 2,000+ API Requests/Day (production system)
  • 🚀 67% Load Time Improved (6s → 2s)
  • 98% API Success Rate (87% → 98%)
  • 📊 33% ETL Runtime Saved (42min → 28min)
  • 🎯 85% AI Accuracy (ScanX: 100+ interfaces)
  • 👁️ 88% Detection Rate (Blinds: 150 test images)
  • 🧠 82% NLP Model F1 (ATS: 500+ job posts)
  • 💾 78% Cache Speedup (4.1s → 0.9s)

What I Build

Quick scan: Full-stack AI systems, production APIs, and data pipelines

LLM Agents & Conversational Systems
Full-Stack AI Applications (Next.js + FastAPI)
Real-Time Voice Assistants
Distributed Data & Stream Processing Pipelines
Cloud-Native Deployments (Azure, AWS)
MLOps (MLflow, Docker, CI/CD, Terraform)

About

Building reliable systems for production environments

Software Engineer with experience building and maintaining backend services, data pipelines, and AI-enabled applications in production. My work focuses on Python and FastAPI, with hands-on experience deploying and operating systems on AWS and Azure.
I have worked on API design, database optimization, and AI/ML service integration, and I'm comfortable supporting applications through deployment, monitoring, and ongoing improvements.
I prefer building systems that are reliable, maintainable, and designed with real-world constraints in mind.

Professional Experience

Building production systems with measurable impact

Software Engineer

Arkatech Solutions

May 2025 – Present

  • Built FastAPI services integrating OpenAI and Azure Vision APIs, processing 2,000+ daily requests for document extraction
  • Optimized slow PostgreSQL queries by adding composite indexes—reduced dashboard load time from 6s to 2s
  • Implemented error handling with exponential backoff for API rate limits and circuit breakers for service failures (retry sketch below)
  • Deployed to Azure App Service with GitHub Actions CI/CD, configured Key Vault for secrets and Application Insights for monitoring
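
A minimal sketch of the backoff pattern described above, assuming an async client call that raises a rate-limit error; `RateLimitError` and `call_with_backoff` are illustrative names, not the production code:

```python
import asyncio
import random

class RateLimitError(Exception):
    """Stand-in for the exception a (hypothetical) API client raises on HTTP 429."""

async def call_with_backoff(func, *args, max_retries: int = 5, base_delay: float = 1.0):
    """Retry an async API call with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return await func(*args)
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            # Exponential backoff: 1s, 2s, 4s, ... with jitter to avoid thundering herds.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            await asyncio.sleep(delay)
```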

Software Engineer Intern

Anguliyam AI Solutions

Jun 2024 – May 2025

  • Developed voice navigation system with FastAPI backend integrated with Google Speech-to-Text API
  • Added Redis caching layer to reduce API response time from 3s to 800ms under load (caching sketch below)
  • Built JWT authentication with role-based access control for admin and standard users
  • Fixed production bug where Azure Vision API calls failed silently—added retry logic improving success rate from 87% to 98%
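
A simplified read-through cache in the spirit of the bullet above, using redis-py's asyncio client; the key prefix, TTL, and `transcribe` callable are placeholders:

```python
import json
import redis.asyncio as redis

cache = redis.Redis(host="localhost", port=6379, decode_responses=True)

async def get_transcription(audio_id: str, transcribe) -> dict:
    """Return a cached result if present; otherwise call the slow API and cache it."""
    key = f"transcription:{audio_id}"
    cached = await cache.get(key)
    if cached is not None:
        return json.loads(cached)
    result = await transcribe(audio_id)               # external speech-to-text call
    await cache.set(key, json.dumps(result), ex=300)  # 5-minute TTL
    return result
```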

Software Engineer Intern

Cognizant Technology Solutions

Jan 2022 – Aug 2022

  • Worked on AWS Lambda functions processing S3 uploads, transforming CSV/JSON data with Pandas before DynamoDB loading
  • Contributed to PySpark jobs on EMR processing 200-500GB daily logs with schema validation before Redshift loading
  • Built Python ETL pipeline using SQLAlchemy—optimized batch size reducing runtime from 42min to 28min (batching sketch below)
  • Debugged Spark job OutOfMemoryError by tuning executor memory and partition count with senior engineer guidance
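
A rough sketch of the batched-load idea, assuming SQLAlchemy 2.x Core; the table, columns, and batch size are made up for illustration:

```python
from sqlalchemy import create_engine, MetaData, Table, Column, Integer, String, insert

engine = create_engine("postgresql+psycopg2://user:pass@host/warehouse")
metadata = MetaData()
events = Table(
    "events", metadata,
    Column("id", Integer, primary_key=True),
    Column("payload", String),
)

def load_rows(rows: list[dict], batch_size: int = 5_000) -> None:
    """Insert rows in fixed-size batches; tuning batch_size is what cut the runtime."""
    with engine.begin() as conn:
        for start in range(0, len(rows), batch_size):
            conn.execute(insert(events), rows[start:start + batch_size])
```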

Data Engineering Intern

EPAM Systems

Dec 2020 – Mar 2021

  • Built Python ETL pipeline migrating 500K+ customer records from MySQL to PostgreSQL with data validation
  • Optimized slow analytics query by adding composite index—reduced execution time from 8min to 2.5min
  • Developed data quality checks using Pandas to catch nulls, duplicates, and integrity violations before loading (see the sketch below)
  • Collaborated on star schema design and documented ETL workflows
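
A condensed version of the kind of pre-load checks described above; the column names are placeholders, not the actual schema:

```python
import pandas as pd

def validate(df: pd.DataFrame) -> pd.DataFrame:
    """Fail fast on nulls, duplicates, and basic integrity violations before loading."""
    issues = []
    if df["customer_id"].isnull().any():
        issues.append("null customer_id values")
    if df.duplicated(subset=["customer_id"]).any():
        issues.append("duplicate customer_id rows")
    if (df["signup_date"] > pd.Timestamp.today()).any():
        issues.append("signup_date in the future")
    if issues:
        raise ValueError(f"Data quality check failed: {', '.join(issues)}")
    return df
```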

GitHub Activity

Live commit activity and open source contributions

Commit Activity

[GitHub commit activity graph: real-time commit activity over the past year]

Real-Time Workflow Orchestration

Live visualization of production AI pipelines and distributed system workflows

  • Uptime: 99.9%
  • Latency: <300ms
  • Throughput: 1K+ req/min
  • Cache hit rate: 95%

Pipeline stages: LLM Processing (queries/min) → Vector Search (embeddings) → API Gateway (requests/sec) → Frontend (active users)

Live Workflow: This visualization demonstrates real-time AI pipeline orchestration—from LLM processing and vector search to API gateway and frontend delivery. Each step shows actual metrics and processing states, reflecting production system behavior with fault tolerance, monitoring, and scalable architecture.

System Design

Architecture, scalability, and distributed systems expertise

SmartBuy AI: Production E-Commerce Platform with AI Integration

A scalable full-stack eCommerce platform with AI-powered navigation, real-time inventory management, and optimized performance.

Architecture Overview

  • Frontend: Next.js (SSR/SSG) with React, deployed on Vercel Edge Network for global CDN distribution
  • Backend: FastAPI microservices with async/await for scalable request handling
  • Database: PostgreSQL with connection pooling, fact/dimension tables for analytics, Redis cache layer (95% hit rate)
  • Data Pipelines: Python + SQL ETL workflows with Airflow orchestration, data quality checks, and analytics-ready schemas
  • AI Service: OpenAI API with request batching, response caching, and fallback mechanisms
  • Vector Search: Pinecone for semantic product search with 50K+ product embeddings (query sketch below)
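
A sketch of the semantic-search path, assuming the Pinecone v3 client and OpenAI embeddings; the index name, model, and API key handling are illustrative:

```python
from openai import OpenAI
from pinecone import Pinecone

openai_client = OpenAI()          # reads OPENAI_API_KEY from the environment
pc = Pinecone(api_key="...")      # placeholder key
index = pc.Index("products")      # hypothetical index name

def semantic_search(query: str, top_k: int = 5):
    """Embed the query, then retrieve the nearest product embeddings."""
    emb = openai_client.embeddings.create(
        model="text-embedding-3-small", input=query
    ).data[0].embedding
    return index.query(vector=emb, top_k=top_k, include_metadata=True)
```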

Scalability & Performance

  • Load Handling: Scalable architecture with horizontal scaling, connection pooling, and caching strategies
  • Response Times: API latency <300ms (p95), AI queries <2s, page loads <1.2s
  • Caching Strategy: Multi-layer (CDN, Redis, in-memory) reducing DB load by 80%
  • Database: Read replicas for scaling, connection pooling (max 100 connections), query optimization (pool config sketch below)
  • Monitoring: Real-time metrics (Prometheus), error tracking (Sentry), APM (New Relic)
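
A minimal pool configuration matching the constraints above, assuming an async SQLAlchemy engine; the numbers mirror the stated limits rather than production settings:

```python
from sqlalchemy.ext.asyncio import create_async_engine

# Cap connections per instance so the whole fleet stays under the database's 100-connection budget.
engine = create_async_engine(
    "postgresql+asyncpg://user:pass@replica-host/shop",
    pool_size=10,        # steady-state connections per instance
    max_overflow=5,      # short bursts
    pool_pre_ping=True,  # drop dead connections before use
    pool_recycle=1800,   # recycle every 30 minutes
)
```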

Key Design Decisions & Trade-offs

1. Microservices vs Monolith

Chose FastAPI microservices for independent scaling of AI service vs. core eCommerce logic, accepting added complexity for better resource utilization.

2. Vector DB Selection

Pinecone over self-hosted (FAISS/Milvus) for managed scalability and lower ops overhead, trading cost for reliability.

3. Caching Strategy

Multi-layer caching (CDN → Redis → DB) prioritized read performance, accepting eventual consistency for product data.

4. Failure Handling

Circuit breakers for AI API, graceful degradation to keyword search, retry logic with exponential backoff for improved reliability during service outages.
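
One way to express this fallback behaviour; a toy circuit breaker under those assumptions, not the production implementation:

```python
import time

class CircuitBreaker:
    """Open after repeated failures, then allow a retry once a cool-down has passed."""
    def __init__(self, max_failures: int = 5, reset_after: float = 30.0):
        self.max_failures, self.reset_after = max_failures, reset_after
        self.failures, self.opened_at = 0, 0.0

    def allow(self) -> bool:
        if self.failures < self.max_failures:
            return True
        if time.monotonic() - self.opened_at > self.reset_after:
            self.failures = 0  # half-open: give the AI service another chance
            return True
        return False

    def record_failure(self) -> None:
        self.failures += 1
        self.opened_at = time.monotonic()

breaker = CircuitBreaker()

async def search(query: str, ai_search, keyword_search):
    """Prefer semantic search; degrade to keyword search when the breaker is open or the call fails."""
    if breaker.allow():
        try:
            return await ai_search(query)
        except Exception:
            breaker.record_failure()
    return await keyword_search(query)
```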

Scaling Strategy

Horizontal Scaling

FastAPI instances behind load balancer, auto-scaling based on CPU/memory (2-10 instances), stateless design for easy scaling.

Database Scaling

Read replicas for query distribution, connection pooling, query optimization, and planned sharding for 100K+ products.

Cost Optimization

AI request batching, response caching (60% cache hit rate), CDN for static assets, reducing infrastructure costs by 40%.

Why Hire Me

Real impact, production systems, measurable results

🚀 Production-Ready Code

Built FastAPI services processing 2,000+ daily requests. Experience with real-world constraints: latency, reliability, monitoring, and deployment on AWS/Azure.

Full-Stack Expertise

Python/FastAPI backend, React/Next.js frontend, PostgreSQL/MongoDB, AWS/Azure cloud. End-to-end ownership from design to deployment with CI/CD pipelines.

🎯 Problem Solver

Optimized PostgreSQL queries reducing dashboard load time by 67% (6s→2s). Improved Azure Vision API success rate from 87% to 98%. Reduced ETL pipeline runtime by 33% (42min→28min).

Projects

Production-grade applications demonstrating expertise in full-stack development, AI/ML integration, and cloud solutions

SmartBuy AI eCommerce Platform (Live · Full-Stack · 2024)

  • ⚛️ Tech Stack: Next.js + FastAPI
  • 🗄️ Database: PostgreSQL + Redis

Production eCommerce platform with AI-powered navigation and real-time inventory management. Features semantic search, personalized recommendations, and secure payment integration. Built with React, Next.js, FastAPI, and PostgreSQL.

React · TypeScript · Next.js · FastAPI (+5)

ScanX / ScanToActionAI (Completed · AI/ML · 2025)

  • 🎯 UI Detection Accuracy: 85%
  • 📱 Test Interfaces: 100+

UI agent that turns screenshots and specs into actionable workflows with guided UX.

Next.js · TypeScript · FastAPI · OpenAI API (+2)

ATS-Personalized Resume Generator (Completed · AI/ML · 2024)

  • 🧠 NER Model F1 Score: 82%
  • 📄 Resume Parser Accuracy: 90%

AI-powered resume optimization tool that analyzes job descriptions using NLP and generates ATS-optimized resumes. Extracts key requirements, matches skills, and creates personalized resumes with high compatibility scores.

Python · FastAPI · NLP Libraries · Machine Learning (+4)

Blinds & Boundaries - Virtual Try-On Application (Live · AI/ML · 2024)

  • 🎯 Detection Accuracy: 88%
  • 📸 Images Processed: 500+

Virtual try-on application for window blinds using computer vision and 3D rendering. Enables customers to visualize products in their actual space with realistic lighting and perspective. Deployed on Azure with real-time image processing.

React · TypeScript · FastAPI · Python (+5)

Blinds Pro - Advanced Visualization Platform (Completed · AI/ML · 2024)

  • Processing Speed: 2.0s
  • 📊 Batch Processing: 10x

Enhanced version of the virtual try-on application with additional features for professional interior design and commercial applications.

TypeScript · React · Next.js · Three.js (+3)

Railway Predictive Maintenance System (Production · AI/ML · 2024)

  • 🎯 Model Accuracy: 90%+
  • Processing Latency: <200ms

Real-time predictive maintenance system for railway equipment using machine learning. Processes sensor data streams with TensorFlow LSTM models to predict equipment failures, enabling proactive maintenance and reducing downtime.

Python · TensorFlow · Apache Spark · AWS Lambda (+4)

Auto Loan AI Processing System (Live · Full-Stack · 2024)

  • 🎯 OCR Accuracy: 95%+
  • Processing Time: 30-60s

AI-powered loan processing system using AWS Textract OCR and voice assistants, reducing processing time by 40% and improving accuracy.

React · TypeScript · Python · FastAPI (+5)

Fitness Transformation Application (Completed · Full-Stack · 2024)

  • 📈 User Retention: 78%
  • 🎯 Voice Recognition Accuracy: 94%

AI-driven fitness application with personalized recommendations, voice navigation, and intelligent workout planning using LLM APIs.

React · TypeScript · Node.js · OpenAI API (+3)

AI Job Hunter - Intelligent Job Search Platform (Completed · Full-Stack · 2024)

  • 🎯 Match Accuracy: 88%
  • 📊 Job Discovery Rate: +60%

AI-powered job search and application platform that uses machine learning to match candidates with relevant opportunities and optimize application strategies.

TypeScript · React · Next.js · Python (+4)

Taskify Pro - Smart Task Management (Completed · Full-Stack · 2023)

  • Real-time Sync Latency: <50ms
  • 👥 Concurrent Users: 1000+

Enterprise-grade task management application with real-time collaboration, AI-powered prioritization, and advanced notification systems.

JavaScript · Node.js · Express.js · MongoDB (+3)

Personal Portfolio & Showcase (Completed · Full-Stack · 2023)

  • 🏆 Lighthouse Score: 100
  • Page Load Time: 0.8s

Comprehensive personal portfolio website showcasing professional projects, skills, and achievements with modern design and interactive features.

TypeScript · React · Next.js · Tailwind CSS (+1)

Let's Work Together

Open to new opportunities, collaborative projects, and technical challenges. Available for full-time roles in Software Engineering, Backend Development, and AI/ML Systems.

Get In Touch

I'd love to hear from you! Let's discuss your next project.

Or email me directly at laharikarrothu@gmail.com

Professional Profiles

What I’m learning now

  • Azure Functions + Durable orchestrations for AI workflows
  • Evaluating LLM agents with guardrails and tracing
  • Stream-processing patterns with Kafka + Flink

Case Study: SmartBuy AI

How I shipped an AI shopping assistant end-to-end

Overview

SmartBuy AI embeds a conversational assistant inside a full-stack eCommerce experience. It handles search, recommendations, and navigation with a lightweight vector index and FastAPI services.

Problem

Users struggled to find products quickly and dropped off after slow search or poor recommendations.

Solution

Conversational assistant + semantic search backed by FastAPI, embeddings, and cached intents for sub-3s responses.

Tech Stack

Next.js, TypeScript, FastAPI, PostgreSQL, OpenAI API, Tailwind, Vercel

My Role

Architecture, API design, AI integration, frontend UX, deployment.

Results

  • AI query responses in <3s with cached intents
  • Faster product discovery via semantic search and voice/text navigation
  • Production-ready deployment with monitoring and uptime focus

Architecture (high level)

Client (Next.js) → API Gateway (FastAPI) → LLM Orchestrator (OpenAI, LangChain)
Vector Store (embeddings) for semantic search + intent caching
PostgreSQL for products/orders · Background jobs for sync/index updates
Vercel + Azure/AWS services · Observability via logs/metrics
Next.js UI → FastAPI Gateway → LLM (OpenAI) → LangChain/RAG → Vector Store → PostgreSQL

Architecture: ScanX Agent Loop

Frontend (Next.js) captures screenshots/specs → uploads to FastAPI
FastAPI routes to Orchestrator (LangChain + OpenAI) to parse layout and tasks
Vector store (PostgreSQL/pgvector) for similarity + retrieval (similarity query sketch below)
Action planner emits task cards → returned to UI for human-in-the-loop confirmation
Next.js UI → FastAPI Ingest → LLM Orchestrator → Parser / Planner → Vector Store → Task Cards UI
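
A sketch of the similarity lookup, assuming pgvector's cosine-distance operator and a hypothetical `ui_elements(embedding vector, payload jsonb)` table:

```python
from sqlalchemy import create_engine, text

engine = create_engine("postgresql+psycopg2://user:pass@host/scanx")

def nearest_elements(query_embedding: list[float], k: int = 5):
    """Retrieve the k most similar stored UI elements using pgvector cosine distance (<=>)."""
    vector_literal = "[" + ",".join(str(x) for x in query_embedding) + "]"
    sql = text(
        "SELECT payload, embedding <=> CAST(:q AS vector) AS distance "
        "FROM ui_elements ORDER BY distance LIMIT :k"
    )
    with engine.connect() as conn:
        return conn.execute(sql, {"q": vector_literal, "k": k}).fetchall()
```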

Tech Stack

The tools I ship with

Languages: Python · JavaScript · TypeScript · SQL
Frontend: React · Next.js · Tailwind
Backend: FastAPI · Node.js · Express.js · REST APIs
AI/ML: OpenAI API · LangChain · Azure Computer Vision · OpenCV
Cloud: AWS (Lambda, S3, EC2) · Azure (App Service, Blob Storage) · Docker · GitHub Actions
Data: Apache Spark · PySpark · ETL Pipelines · PostgreSQL · MongoDB · Redis

Technical Expertise

Technologies and tools I work with

🧩Full-Stack & Backend

⚛️React
Next.js
📘TypeScript
🟢Node.js
FastAPI
🐍Python
🐘PostgreSQL
🍃MongoDB
🧠Redis
🔗REST & GraphQL APIs

🤖AI / ML & Agents

🤖LLMs & Agents
OpenAI API
⛓️LangChain
📚Vector DBs (Pinecone/FAISS)
🧭Embeddings & RAG
🧪MLflow
🧠TensorFlow
🔥PyTorch

📡Data Engineering & Pipelines

🗄️SQL (PostgreSQL, MySQL)
Apache Spark
📊Kafka
🌪️Apache Airflow
❄️Snowflake
📊Data Modeling (Fact/Dimension)
🔄ETL/ELT Pipelines
Data Quality & Validation
🔍Data Lineage & Observability

☁️Cloud & Infrastructure

☁️AWS (Lambda, S3, Glue)
🔷Azure (Functions, App Service)
🐳Docker
Kubernetes
🌍Terraform
🔄CI/CD
Vercel
📡Monitoring & Logging

Education

Academic background in Computer Science

Master of Science in Computer Science

Florida Institute of Technology

2024

GPA: 3.5/4.0

Bachelor of Science in Computer Science

KL University

2022

Certifications & Badges

Professional certifications and verified achievements

  • 🧩 HackerRank Problem Solving · HackerRank
  • 🗄️ HackerRank SQL (Intermediate) · HackerRank
  • 🧱 HackerRank Data Structures · HackerRank
  • 📊 HackerRank Data Science · HackerRank
  • 🛰️ Big Data Specialization · Undergraduate Program

Technical Writing

Articles and insights on AI, distributed systems, and software engineering

Technical Deep Dives

In-depth analysis of system design, performance optimization, and engineering decisions

Backend Engineering

Scaling FastAPI for High-Throughput AI Workloads

How I optimized FastAPI microservices for scalable request handling, implementing connection pooling, async/await patterns, and request batching for AI API calls (see the sketch below).

Tags: Async Architecture · Connection Pooling · Request Batching · Performance Optimization
Metrics: Optimized request handling, improved latency, cost-effective scaling
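
A toy version of the batching idea, assuming the OpenAI embeddings endpoint (which accepts a list of inputs per request) and a shared async client; the route and model name are illustrative:

```python
from fastapi import FastAPI
from pydantic import BaseModel
from openai import AsyncOpenAI

app = FastAPI()
client = AsyncOpenAI()  # one shared client reuses its HTTP connection pool

class EmbedRequest(BaseModel):
    texts: list[str]

@app.post("/embed")
async def embed(req: EmbedRequest):
    """Send many texts in a single upstream call instead of one call per text."""
    resp = await client.embeddings.create(model="text-embedding-3-small", input=req.texts)
    return {"vectors": [item.embedding for item in resp.data]}
```
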
AI/ML Systems

Vector Database Optimization for Semantic Search

Deep dive into optimizing Pinecone vector search for 50K+ product embeddings, achieving 99.5% retrieval accuracy with <500ms query time through embedding compression and indexing strategies.

Tags: Vector Databases · Embedding Optimization · Semantic Search · RAG Systems
Metrics: 50K+ embeddings, 99.5% accuracy, <500ms query time

Distributed Systems

Building Fault-Tolerant LLM Pipelines

Designing resilient AI workflows with circuit breakers, graceful degradation, retry logic, and fallback mechanisms for improved reliability during AI service outages.

Tags: Fault Tolerance · Circuit Breakers · Graceful Degradation · Error Handling
Metrics: Improved reliability, minimal monitoring overhead, robust error handling

Performance Engineering

Multi-Layer Caching Strategy for E-Commerce

Implementing CDN → Redis → Database caching layers, achieving 95% cache hit rate and reducing database load by 80% while maintaining data consistency.

Tags: Caching Strategies · CDN Optimization · Redis · Cache Invalidation
Metrics: 95% cache hit rate, 80% DB load reduction, <1.2s page loads

Observability

Real-Time Monitoring for Production AI Systems

Building comprehensive monitoring with Prometheus, Sentry, and custom metrics for LLM pipelines, enabling real-time alerting and performance tracking with <100ms overhead (instrumentation sketch below).

Tags: Monitoring · Observability · Alerting · Performance Tracking
Metrics: Minimal monitoring overhead, real-time alerting, improved system reliability
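
A minimal instrumentation sketch using prometheus-client; the metric names and port are examples, not the article's actual setup:

```python
import time
from prometheus_client import Counter, Histogram, start_http_server

LLM_REQUESTS = Counter("llm_requests_total", "LLM calls by outcome", ["status"])
LLM_LATENCY = Histogram("llm_request_seconds", "LLM call latency in seconds")

def call_llm_instrumented(call, prompt: str):
    """Wrap an LLM call with a request counter and a latency histogram."""
    start = time.perf_counter()
    try:
        result = call(prompt)
        LLM_REQUESTS.labels(status="ok").inc()
        return result
    except Exception:
        LLM_REQUESTS.labels(status="error").inc()
        raise
    finally:
        LLM_LATENCY.observe(time.perf_counter() - start)

start_http_server(9100)  # exposes /metrics for Prometheus to scrape
```
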
Data Engineering

ETL Pipeline Design for Analytics at Scale

Architecting Python + SQL ETL pipelines processing 1,000+ structured records/day with data quality checks, fact/dimension table design, and Airflow orchestration, improving query performance by 35% and reducing data issues by 40% (DAG sketch below).

Tags: ETL Pipelines · Data Modeling · SQL Optimization · Data Quality · Airflow
Metrics: 1,000+ records/day, 35% query improvement, 40% fewer data issues
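
A skeleton of the orchestration layer described above, using Airflow's TaskFlow API (Airflow 2.4+); task bodies and the schedule are placeholders:

```python
from datetime import datetime
from airflow.decorators import dag, task

@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def analytics_etl():
    @task
    def extract() -> list[dict]:
        return [{"id": 1, "amount": 42.0}]  # stand-in for the real source query

    @task
    def quality_check(rows: list[dict]) -> list[dict]:
        assert all(r["amount"] is not None for r in rows), "null amounts found"
        return rows

    @task
    def load(rows: list[dict]) -> None:
        print(f"loading {len(rows)} rows into the fact table")  # stand-in for the warehouse load

    load(quality_check(extract()))

analytics_etl()
```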