Student Mental Health Risk Prediction
Summary
Developed a predictive analytics model with 84% accuracy to support proactive mental health intervention programs in education systems.
Results-driven Data Engineer with extensive experience designing scalable data solutions across Azure, GCP, and AWS. Proven expertise in building cloud-native ETL pipelines, developing in Python and SQL, and optimizing data infrastructure for real-time insights. Adept at using CI/CD, automation, and governance frameworks to improve data reliability, increase deployment velocity by 60%, and drive business-critical decisions.
Azure Data Engineer
Irving, TX, US
Summary
Spearheaded ETL architecture in Azure Data Factory and Databricks, enabling real-time insights that accelerated executive decision-making across finance and risk portfolios.
Highlights
Optimized data load processes, reducing load time by 40% and enhancing SLA compliance, which accelerated revenue recognition from partner analytics reports.
Led the strategic adoption of dbt Core for modular data transformations in Databricks SQL, improving collaboration and enabling version-controlled, testable, and reusable data models.
Orchestrated robust CI/CD pipelines using Jenkins and Azure DevOps, increasing deployment velocity by 60% and significantly reducing production downtime risks.
Strengthened data governance by enhancing GDPR and CCPA compliance through role-based access controls, mitigating enterprise regulatory risks and audit penalties.
Collaborated with BI teams to optimize Power BI semantic models, decreasing report latency by 35% and improving stakeholder data adoption.
GCP Data Engineer
India
Summary
Designed and deployed scalable GCP-based data solutions for real-time fraud detection and enterprise data integration, optimizing data access and insights for business leaders.
Highlights
Engineered and deployed near real-time streaming pipelines on Dataflow and Pub/Sub, increasing fraud prevention effectiveness by 25%.
Automated critical data workflows using Cloud Composer, reducing manual intervention time by 60% and enabling data teams to prioritize revenue-driving initiatives.
Orchestrated the migration of legacy data to Hadoop on Cloud Dataproc, enabling scalable data science workloads and unlocking historical data insights for executive dashboards.
Integrated diverse external data sources into the enterprise data warehouse (Snowflake) via secure REST/SOAP APIs, enabling advanced analytics and comprehensive reporting.
Optimized BigQuery performance for 1B+ row datasets, reducing query time by 45% and accelerating time-to-insight for product strategy leaders while maintaining robust data security.
Data Engineer
India
Summary
Engineered high-accuracy ETL pipelines and optimized data infrastructure to support multi-TB analytics and accelerate product lifecycle decisions for global R&D.
Highlights
Engineered high-accuracy ETL pipelines using PySpark and HiveQL, powering multi-TB analytics and enabling data-driven product lifecycle decisions at scale.
Migrated on-prem data pipelines to AWS using Terraform, decreasing provisioning delays by 50% and increasing engineering velocity for R&D analytics.
Introduced Docker/Kubernetes for containerization, reducing infrastructure costs by 25% and improving environment consistency across development and production.
Achieved 99.95% system uptime through proactive monitoring with CloudWatch and Nagios, ensuring continuous data availability for global business units.
Streamlined data preprocessing using Pandas/NumPy, accelerating AI model training timelines and supporting faster go-to-market cycles.
Master of Science
Business Analytics
GPA: 3.36
Courses
Business Process Analytics
Predictive Analytics
Data Mining
Programming Languages for BA
Issued By
Amazon Web Services (AWS)
Issued By
Databricks
Issued By
Penn Engineering (University of Pennsylvania)
Issued By
W3Schools
Airflow, dbt, Informatica PowerCenter, IBM DataStage, Fivetran.
Azure (ADF, Synapse, Databricks), GCP (BigQuery, Dataflow, Pub/Sub), AWS (EC2, S3, Glue, EMR).
Python (NumPy, Pandas, scikit-learn, PyTorch), SQL (Advanced), T-SQL, Java, Scala, Shell, R.
Spark, PySpark, Hive, Kafka, Hadoop, Flink, HDFS, Presto.
Docker, Kubernetes, Jenkins, Terraform, GitHub, Azure DevOps, Agile.
Snowflake, BigQuery, Azure Synapse, Vertica, Teradata, MongoDB, DynamoDB, HBase.
Power BI, Tableau, Excel, DAX, Microsoft Fabric, Grafana, Qlik.
GDPR, CCPA, RBAC, IAM.
Summary
Built executive dashboards to highlight global health disparities, enabling NGO executives to allocate resources and grants more equitably.