Bhavik Mundra
4× AWS Certified • TCS (Vanguard Client)

Data Engineer • AWS | PySpark | Real-Time Pipelines

Building production-grade PySpark ETLs on AWS Glue & EMR on EKS • Migrated mainframe/COBOL workloads for Vanguard

About Me

I'm a Data Engineer with production experience delivering high-scale, regulated financial workloads at TCS for Vanguard.

I specialize in modernizing legacy ETL systems — migrating mainframe/COBOL + DB2 jobs to fully serverless AWS architectures using PySpark, Glue, Lambda, Step Functions, and EMR on EKS.

Currently focused on building real-time streaming pipelines, data lakehouses with Apache Iceberg, and Kubernetes-native data platforms using Argo Workflows and Karpenter.

Ujjain → Indore, India
Available for full-time roles

Technical Expertise

AWS Glue
PySpark
EMR on EKS
Apache Iceberg
Kinesis
Lambda
Step Functions
Terraform
ArgoCD
Karpenter
SageMaker

Experience

Data Engineer

Tata Consultancy Services (TCS)

Vanguard Project — Client-Embedded Team

Mar 2025 – Present

Indore, India

  • Collaborated directly with Vanguard’s US team to migrate mainframe/COBOL ETL jobs and DB2 data to AWS, enabling real-time client portfolio calculations.
  • Developed production-grade PySpark ETL pipelines in AWS Glue and serverless Lambda functions, orchestrated with Step Functions, using S3 for storage and CloudFormation for infrastructure as code (IaC).
  • Designed secure, cost-efficient AWS architecture for regulated financial workloads.
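
As an illustration of the Glue + Lambda + Step Functions pattern in the bullets above, here is a minimal sketch of a state machine definition written in Amazon States Language and built as a Python dictionary. It is not the client implementation: the job name, the Lambda ARN, the state names, and the retry settings are all hypothetical placeholders.

```python
import json


def build_state_machine(glue_job_name: str, lambda_arn: str) -> dict:
    """Return an Amazon States Language definition: run a Glue job, then a Lambda."""
    return {
        "Comment": "Hypothetical ETL flow: Glue job followed by a Lambda step",
        "StartAt": "RunGlueJob",
        "States": {
            "RunGlueJob": {
                "Type": "Task",
                # The .sync suffix makes Step Functions wait for the job run to finish.
                "Resource": "arn:aws:states:::glue:startJobRun.sync",
                "Parameters": {"JobName": glue_job_name},
                # Assumed retry policy: up to 2 retries with exponential backoff.
                "Retry": [
                    {
                        "ErrorEquals": ["States.TaskFailed"],
                        "IntervalSeconds": 60,
                        "MaxAttempts": 2,
                        "BackoffRate": 2.0,
                    }
                ],
                "Next": "PostProcess",
            },
            "PostProcess": {
                "Type": "Task",
                "Resource": "arn:aws:states:::lambda:invoke",
                # Pass the whole state input to the function as its payload.
                "Parameters": {"FunctionName": lambda_arn, "Payload.$": "$"},
                "End": True,
            },
        },
    }


# Placeholder identifiers, for illustration only.
definition = build_state_machine(
    "example-etl-job",
    "arn:aws:lambda:us-east-1:123456789012:function:example-post-process",
)
definition_json = json.dumps(definition, indent=2)
```

The resulting `definition_json` string is the kind of document that could be supplied as the state machine definition in a CloudFormation template, which keeps the orchestration under version control alongside the rest of the infrastructure.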

Tech Stack I Ship With

Kinesis
Lambda
Step Functions
SageMaker
CloudFormation
Cognito
EC2
Glue
RDS
Elastic Kubernetes Service (EKS)
S3
Apache Spark
AWS
C++
Docker
FastAPI
Flask
Git
HashiCorp Terraform
Pandas
PostgreSQL
Python

Featured Projects

Real-Time Fraud Detection with PySpark on EKS + SageMaker

50k TPS pipeline via Kinesis → PySpark streaming + SageMaker inference. Cut false positives by 41%.

PySpark • Kinesis • EKS • SageMaker • 50k TPS
View on GitHub

Iceberg Data Lakehouse with Glue & Athena

Zero-ETL lakehouse with schema evolution → reduced Athena query costs by 68%.

Apache Iceberg • Glue • Athena • Cost Optimization
View on GitHub

Spot-Optimized PySpark Platform with Karpenter

90% Spot Graviton fleet on EMR on EKS using Karpenter + Slack-triggered Step Functions → savings of 72%.

Karpenter • EMR on EKS • Spot Instances • Graviton • Step Functions
View on GitHub

Multi-Region PySpark Failover (RTO < 3 min)

Active-passive DR for 2TB analytics using S3 CRR, DynamoDB Global Tables, Route53 health checks + Terraform automation.

Disaster Recovery • S3 CRR • DynamoDB Global • Route53 • Terraform
View on GitHub

LLM-Powered Anomaly Detection with Bedrock

PySpark + Bedrock Titan embeddings on EKS → OpenSearch + plain-English Slack alerts via LLM summaries.

Bedrock Titan • OpenSearch • EKS • LLM • Anomaly Detection
View on GitHub

Multi-Domain Data Processing with Argo Workflows

Kubernetes-native data mesh with ArgoCD + isolated scaling and cost tagging.

ArgoCD • Argo Workflows • EKS • Data Mesh
View on GitHub

AWS Certifications

All certificates are publicly verifiable on Credly.

AWS Certified Data Engineer Associate
AWS Certified Developer Associate
AWS Certified Solutions Architect Associate
AWS Certified Cloud Practitioner