Software Engineer
📋 Quick Summary:
- Built FastAPI services processing 2,000+ daily requests (Arkatech)
- Optimized PostgreSQL queries: 6s → 2s (67% improvement)
- Improved API success rate: 87% → 98% (Anguliyam)
- Reduced ETL runtime: 42min → 28min (33% faster, Cognizant)
- AI projects: 85% accuracy (ScanX), 88% detection (Blinds), 82% F1 score (ATS Resume)

Impact at a Glance
What I Build
Quick scan: Full-stack AI systems, production APIs, and data pipelines
About
Building reliable systems for production environments
Professional Experience
Building production systems with measurable impact
Software Engineer
Arkatech Solutions
May 2025 – Present
- Built FastAPI services integrating OpenAI and Azure Vision APIs, processing 2,000+ daily requests for document extraction
- Optimized slow PostgreSQL queries by adding composite indexes—reduced dashboard load time from 6s to 2s
- Implemented error handling with exponential backoff for API rate limits and circuit breakers for service failures (sketch below)
- Deployed to Azure App Service with GitHub Actions CI/CD, configured Key Vault for secrets and Application Insights for monitoring
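A minimal sketch of the backoff pattern referenced above; the URL and payload are illustrative placeholders, not the production integration:

```python
import random
import time

import httpx

MAX_RETRIES = 5


def call_with_backoff(url: str, payload: dict) -> dict:
    """Call an upstream API, retrying rate-limited requests with exponential backoff."""
    for attempt in range(MAX_RETRIES):
        response = httpx.post(url, json=payload, timeout=30)
        if response.status_code == 429:
            # Back off 1s, 2s, 4s, ... plus jitter before retrying
            time.sleep(2 ** attempt + random.random())
            continue
        response.raise_for_status()
        return response.json()
    raise RuntimeError(f"Gave up after {MAX_RETRIES} attempts: {url}")
```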
Software Engineer Intern
Anguliyam AI Solutions
Jun 2024 – May 2025
- Developed voice navigation system with FastAPI backend integrated with Google Speech-to-Text API
- Added Redis caching layer to reduce API response time from 3s to 800ms under load (sketch below)
- Built JWT authentication with role-based access control for admin and standard users
- Fixed production bug where Azure Vision API calls failed silently—added retry logic improving success rate from 87% to 98%
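Roughly how a read-through Redis cache in front of a FastAPI endpoint can look; the route and the slow-path helper are placeholders, not the production code:

```python
import json

import redis.asyncio as redis
from fastapi import FastAPI

app = FastAPI()
cache = redis.Redis(host="localhost", port=6379, decode_responses=True)
CACHE_TTL_SECONDS = 300  # serve repeat lookups from Redis for 5 minutes


async def compute_route(route_id: str) -> dict:
    ...  # placeholder for the expensive path (DB query + Speech-to-Text call)


@app.get("/navigation/{route_id}")
async def get_route(route_id: str):
    cached = await cache.get(f"route:{route_id}")
    if cached:
        return json.loads(cached)  # cache hit: skip the slow path entirely
    result = await compute_route(route_id)
    await cache.setex(f"route:{route_id}", CACHE_TTL_SECONDS, json.dumps(result))
    return result
```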
Software Engineer Intern
Cognizant Technology Solutions
Jan 2022 – Aug 2022
- Worked on AWS Lambda functions processing S3 uploads, transforming CSV/JSON data with Pandas before DynamoDB loading (sketch below)
- Contributed to PySpark jobs on EMR processing 200-500GB daily logs with schema validation before Redshift loading
- Built Python ETL pipeline using SQLAlchemy—optimized batch size reducing runtime from 42min to 28min
- Debugged Spark job OutOfMemoryError by tuning executor memory and partition count with senior engineer guidance
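A minimal Lambda handler along the lines described above; the event shape follows the standard S3 trigger, while the table name and validation columns are placeholders:

```python
import io

import boto3
import pandas as pd

s3 = boto3.client("s3")
table = boto3.resource("dynamodb").Table("processed-uploads")  # hypothetical table name


def handler(event, context):
    """Triggered on S3 upload: read CSV, clean with pandas, batch-write to DynamoDB."""
    record = event["Records"][0]["s3"]
    obj = s3.get_object(Bucket=record["bucket"]["name"], Key=record["object"]["key"])
    df = pd.read_csv(io.BytesIO(obj["Body"].read()))

    df = df.dropna(subset=["id"]).drop_duplicates(subset=["id"])  # basic validation

    with table.batch_writer() as batch:  # batches puts into DynamoDB write requests
        for row in df.to_dict(orient="records"):
            # Cast values to str for simplicity; real code maps types explicitly
            batch.put_item(Item={k: str(v) for k, v in row.items()})
```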
Data Engineering Intern
EPAM Systems
Dec 2020 – Mar 2021
- Built Python ETL pipeline migrating 500K+ customer records from MySQL to PostgreSQL with data validation
- Optimized slow analytics query by adding composite index—reduced execution time from 8min to 2.5min
- Developed data quality checks using Pandas to catch nulls, duplicates, and integrity violations before loading (sketch below)
- Collaborated on star schema design and documented ETL workflows
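A small example of the kind of pandas quality gate described above; the required columns are placeholders for the real schema:

```python
import pandas as pd

REQUIRED_COLUMNS = ["customer_id", "email", "created_at"]  # hypothetical schema


def validate(df: pd.DataFrame) -> pd.DataFrame:
    """Reject batches that would violate constraints in the PostgreSQL target."""
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")

    errors = []
    if df["customer_id"].isna().any():
        errors.append("null customer_id")
    if df["customer_id"].duplicated().any():
        errors.append("duplicate customer_id")
    if errors:
        raise ValueError(f"Data quality check failed: {errors}")
    return df
```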
GitHub Activity
Live commit activity and open source contributions
Commit Activity
Real-time commit activity over the past year
Real-Time Workflow Orchestration
Live visualization of production AI pipelines and distributed system workflows
LLM Processing
Vector Search
API Gateway
Frontend
Live Workflow: This visualization demonstrates real-time AI pipeline orchestration—from LLM processing and vector search to API gateway and frontend delivery. Each step shows actual metrics and processing states, reflecting production system behavior with fault tolerance, monitoring, and scalable architecture.
System Design
Architecture, scalability, and distributed systems expertise
SmartBuy AI: Production E-Commerce Platform with AI Integration
A scalable full-stack eCommerce platform with AI-powered navigation, real-time inventory management, and optimized performance.
Architecture Overview
- Frontend: Next.js (SSR/SSG) with React, deployed on Vercel Edge Network for global CDN distribution
- Backend: FastAPI microservices with async/await for scalable request handling
- Database: PostgreSQL with connection pooling, fact/dimension tables for analytics, Redis cache layer (95% hit rate)
- Data Pipelines: Python + SQL ETL workflows with Airflow orchestration, data quality checks, and analytics-ready schemas
- AI Service: OpenAI API with request batching, response caching, and fallback mechanisms
- Vector Search: Pinecone for semantic product search with 50K+ product embeddings (sketch below)
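As a sketch of the vector search path: embed the query, then look up the nearest product vectors in Pinecone. The index and model names here are placeholders, not the deployed configuration:

```python
import os

from openai import OpenAI
from pinecone import Pinecone

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
index = Pinecone(api_key=os.environ["PINECONE_API_KEY"]).Index("products")  # hypothetical index


def semantic_search(query: str, top_k: int = 5) -> list[dict]:
    """Embed the query text and return metadata for the closest product vectors."""
    embedding = openai_client.embeddings.create(
        model="text-embedding-3-small", input=query
    ).data[0].embedding
    results = index.query(vector=embedding, top_k=top_k, include_metadata=True)
    return [match["metadata"] for match in results["matches"]]
```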
Scalability & Performance
- Load Handling: Scalable architecture with horizontal scaling, connection pooling, and caching strategies
- Response Times: API latency <300ms (p95), AI queries <2s, page loads <1.2s
- Caching Strategy: Multi-layer (CDN, Redis, in-memory) reducing DB load by 80% (sketch below)
- Database: Read replicas for scaling, connection pooling (max 100 connections), query optimization
- Monitoring: Real-time metrics (Prometheus), error tracking (Sentry), APM (New Relic)
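A minimal read-through sketch of the multi-layer lookup noted above (in-memory, then Redis, then PostgreSQL); the DB helper and TTL are placeholders:

```python
import json

import redis

local_cache: dict[str, dict] = {}  # per-process layer, cleared on deploy
redis_cache = redis.Redis(decode_responses=True)


def fetch_product_from_db(product_id: str) -> dict:
    ...  # placeholder for the PostgreSQL query behind the cache layers


def get_product(product_id: str) -> dict:
    """Read-through lookup: in-memory -> Redis -> PostgreSQL."""
    if product_id in local_cache:
        return local_cache[product_id]

    cached = redis_cache.get(f"product:{product_id}")
    if cached:
        product = json.loads(cached)
    else:
        product = fetch_product_from_db(product_id)
        redis_cache.setex(f"product:{product_id}", 300, json.dumps(product))

    local_cache[product_id] = product
    return product
```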
Key Design Decisions & Trade-offs
1. Microservices vs Monolith
Chose FastAPI microservices for independent scaling of AI service vs. core eCommerce logic, accepting added complexity for better resource utilization.
2. Vector DB Selection
Pinecone over self-hosted (FAISS/Milvus) for managed scalability and lower ops overhead, trading cost for reliability.
3. Caching Strategy
Multi-layer caching (CDN → Redis → DB) prioritized read performance, accepting eventual consistency for product data.
4. Failure Handling
Circuit breakers for AI API, graceful degradation to keyword search, retry logic with exponential backoff for improved reliability during service outages.
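A simplified sketch of the circuit-breaker-plus-fallback behavior described in decision 4; both search functions are placeholders:

```python
import time


class CircuitBreaker:
    """Open the circuit after repeated failures; allow a trial call after a cooldown."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = 0.0

    def is_open(self) -> bool:
        if self.failures < self.max_failures:
            return False
        if time.monotonic() - self.opened_at > self.reset_after:
            self.failures = 0  # half-open: let one request through to probe recovery
            return False
        return True

    def record_failure(self) -> None:
        self.failures += 1
        self.opened_at = time.monotonic()


def ai_semantic_search(query: str) -> list[dict]:
    ...  # placeholder: vector search backed by the AI service


def keyword_search(query: str) -> list[dict]:
    ...  # placeholder: plain SQL keyword fallback


breaker = CircuitBreaker()


def search(query: str) -> list[dict]:
    if breaker.is_open():
        return keyword_search(query)  # degrade gracefully while the AI API is down
    try:
        return ai_semantic_search(query)
    except Exception:
        breaker.record_failure()
        return keyword_search(query)
```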
Scaling Strategy
Horizontal Scaling
FastAPI instances behind load balancer, auto-scaling based on CPU/memory (2-10 instances), stateless design for easy scaling.
Database Scaling
Read replicas for query distribution, connection pooling, query optimization, and planned sharding for 100K+ products.
Cost Optimization
AI request batching, response caching (60% cache hit rate), CDN for static assets, reducing infrastructure costs by 40%.
Why Hire Me
Real impact, production systems, measurable results
Production-Ready Code
Built FastAPI services processing 2,000+ daily requests. Experience with real-world constraints: latency, reliability, monitoring, and deployment on AWS/Azure.
Full-Stack Expertise
Python/FastAPI backend, React/Next.js frontend, PostgreSQL/MongoDB, AWS/Azure cloud. End-to-end ownership from design to deployment with CI/CD pipelines.
Problem Solver
Optimized PostgreSQL queries reducing dashboard load time by 67% (6s→2s). Improved Azure Vision API success rate from 87% to 98%. Reduced ETL pipeline runtime by 33% (42min→28min).
Projects
Production-grade applications demonstrating expertise in full-stack development, AI/ML integration, and cloud solutions

SmartBuy AI eCommerce Platform
Production eCommerce platform with AI-powered navigation and real-time inventory management. Features semantic search, personalized recommendations, and secure payment integration. Built with React, Next.js, FastAPI, and PostgreSQL.

ATS-Personalized Resume Generator
AI-powered resume optimization tool that analyzes job descriptions using NLP and generates ATS-optimized resumes. Extracts key requirements, matches skills, and creates personalized resumes with high compatibility scores.

Blinds & Boundaries - Virtual Try-On Application
Virtual try-on application for window blinds using computer vision and 3D rendering. Enables customers to visualize products in their actual space with realistic lighting and perspective. Deployed on Azure with real-time image processing.
Railway Predictive Maintenance System
Real-time predictive maintenance system for railway equipment using machine learning. Processes sensor data streams with TensorFlow LSTM models to predict equipment failures, enabling proactive maintenance and reducing downtime.
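A minimal Keras sketch in the spirit of this project, classifying whether a sensor window precedes a failure; the window length and feature count are illustrative, not the trained model:

```python
import tensorflow as tf

WINDOW = 60      # illustrative: 60 sensor readings per sequence
N_FEATURES = 8   # illustrative: vibration, temperature, current, etc.

# Binary classifier over sliding sensor windows
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(WINDOW, N_FEATURES)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(X_windows, y_failure_labels, epochs=10, validation_split=0.2)
```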

Let's Work Together
Open to new opportunities, collaborative projects, and technical challenges. Available for full-time roles in Software Engineering, Backend Development, and AI/ML Systems.
Get In Touch
I'd love to hear from you! Let's discuss your next project.
Or email me directly at laharikarrothu@gmail.com
What I’m learning now
- Azure Functions + Durable orchestrations for AI workflows
- Evaluating LLM agents with guardrails and tracing
- Stream-processing patterns with Kafka + Flink
Case Study: SmartBuy AI
How I shipped an AI shopping assistant end-to-end
Overview
SmartBuy AI embeds a conversational assistant inside a full-stack eCommerce experience. It handles search, recommendations, and navigation with a lightweight vector index and FastAPI services.
Problem
Users struggled to find products quickly and dropped off after slow search or poor recommendations.
Solution
Conversational assistant + semantic search backed by FastAPI, embeddings, and cached intents for sub-3s responses.
Tech Stack
Next.js, TypeScript, FastAPI, PostgreSQL, OpenAI API, Tailwind, Vercel
My Role
Architecture, API design, AI integration, frontend UX, deployment.
Results
- AI query responses in <3s with cached intents
- Faster product discovery via semantic search and voice/text navigation
- Production-ready deployment with monitoring and uptime focus
Architecture (high level)
Architecture: ScanX Agent Loop
Tech Stack
The tools I ship with
Technical Expertise
Technologies and tools I work with
🧩Full-Stack & Backend
🤖AI / ML & Agents
📡Data Engineering & Pipelines
☁️Cloud & Infrastructure
Education
Academic background in Computer Science
Master of Science in Computer Science
Florida Institute of Technology
2024
GPA: 3.5/4.0
Bachelor of Science in Computer Science
KL University
2022
Certifications & Badges
Professional certifications and verified achievements
HackerRank Problem Solving
HackerRank
HackerRank SQL (Intermediate)
HackerRank
HackerRank Data Structures
HackerRank
HackerRank Data Science
HackerRank
Big Data Specialization
Undergraduate Program
Technical Writing
Articles and insights on AI, distributed systems, and software engineering
Technical Deep Dives
In-depth analysis of system design, performance optimization, and engineering decisions
Scaling FastAPI for High-Throughput AI Workloads
How I optimized FastAPI microservices for scalable request handling, implementing connection pooling, async/await patterns, and request batching for AI API calls.
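As a sketch of the batching idea: group texts and send one embeddings request per batch with asyncio, rather than one request per text. The model name and batch size are illustrative:

```python
import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment
BATCH_SIZE = 64         # illustrative batch size


async def embed_all(texts: list[str]) -> list[list[float]]:
    """Send one embeddings request per batch and gather them concurrently."""
    batches = [texts[i:i + BATCH_SIZE] for i in range(0, len(texts), BATCH_SIZE)]
    responses = await asyncio.gather(*[
        client.embeddings.create(model="text-embedding-3-small", input=batch)
        for batch in batches
    ])
    return [item.embedding for resp in responses for item in resp.data]
```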
Vector Database Optimization for Semantic Search
Deep dive into optimizing Pinecone vector search for 50K+ product embeddings, achieving 99.5% retrieval accuracy with <500ms query time through embedding compression and indexing strategies.
Building Fault-Tolerant LLM Pipelines
Designing resilient AI workflows with circuit breakers, graceful degradation, retry logic, and fallback mechanisms for improved reliability during AI service outages.
Multi-Layer Caching Strategy for E-Commerce
Implementing CDN → Redis → Database caching layers, achieving 95% cache hit rate and reducing database load by 80% while maintaining data consistency.
Real-Time Monitoring for Production AI Systems
Building comprehensive monitoring with Prometheus, Sentry, and custom metrics for LLM pipelines, enabling real-time alerting and performance tracking with <100ms overhead.
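A minimal sketch of how such instrumentation can look with prometheus_client on a FastAPI route; the metric names and LLM call are placeholders:

```python
import time

from fastapi import FastAPI
from prometheus_client import Counter, Histogram, make_asgi_app

app = FastAPI()
app.mount("/metrics", make_asgi_app())  # Prometheus scrapes this endpoint

LLM_CALLS = Counter("llm_calls_total", "LLM API calls", ["status"])
LLM_LATENCY = Histogram("llm_latency_seconds", "LLM API call latency")


async def call_llm(prompt: dict) -> dict:
    ...  # placeholder for the real pipeline step


@app.post("/generate")
async def generate(prompt: dict):
    start = time.perf_counter()
    try:
        result = await call_llm(prompt)
        LLM_CALLS.labels(status="ok").inc()
        return result
    except Exception:
        LLM_CALLS.labels(status="error").inc()
        raise
    finally:
        LLM_LATENCY.observe(time.perf_counter() - start)
```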
ETL Pipeline Design for Analytics at Scale
Architecting Python + SQL ETL pipelines processing 1,000+ structured records/day with data quality checks, fact/dimension table design, and Airflow orchestration, improving query performance by 35% and reducing data issues by 40%.
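A skeleton of the orchestration layer, assuming Airflow 2.x; the DAG name and task bodies are placeholders:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(): ...   # placeholder: pull raw records from the source system
def validate(): ...  # placeholder: null/duplicate/integrity checks
def load(): ...      # placeholder: upsert into fact/dimension tables


with DAG(
    dag_id="daily_analytics_etl",  # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    validate_task = PythonOperator(task_id="validate", python_callable=validate)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> validate_task >> load_task
```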




