Senior Data Engineer
Building Scalable Data Solutions | ETL Specialist | Big Data Expert
Transforming Complex Data Challenges into Efficient Solutions
Senior Data Engineer with extensive experience in designing and implementing robust data architectures. Specialized in building scalable ETL pipelines and optimizing data workflows for large-scale systems. Strong background in distributed computing and big data technologies.
Performance Optimization
Reduced ETL processing time by 90% through advanced optimization techniques
Innovation
Developed custom data quality framework reducing manual testing effort by 80%
Scale
Managed data pipelines processing over 10 GB of data daily with 99.9% uptime
Software Engineer ETL
R1 RCM
Sep 2022 - Present
- Designed and maintained ETL pipelines (SQL, Python, SSIS, Azure Databricks, PySpark) for data transformation, ensuring data accuracy, reliability, and accessibility for BI.
- Analyzed complex datasets to identify patterns, trends, and insights, informing business decisions and optimizing data processing/analysis.
- Collaborated with cross-functional teams to maintain data consistency throughout the ETL pipeline, meeting business requirements.
- Proactively identified and resolved data quality issues using various tools and techniques, delivering high-quality data for stakeholders.
- Migrated ETL solutions, reducing workload from 3 FTE to 0.5 FTE through 90% workflow efficiency gains while maintaining quality.
Quality Engineer ETL
Assimilate Solutions, A SitusAMC Company
Jan 2021 - Sep 2022
- Owned data validation for the data service Scrum team, ensuring data integrity using SQL queries and automated Python scripts on Snowflake.
- Validated Informatica mappings and workflows for accurate data processing and loading.
- Documented test cases and results for efficient data quality tracking and monitoring.
- Collaborated with onshore teams and stakeholders to meet data quality requirements.
Associate Analyst
GlobalLogic
Oct 2019 - Jan 2020
- Collaborated on training Google Lens models with crowd-sourced data to enable richer shopping experiences.
- Improved team efficiency and quality through data analysis and reporting.
Programming Languages
Databases
System Design
ETL Tools
Reporting
File Handling
Python Libraries
Tools & Technologies
Data Science with Python
Data Analysis Specialization
Snowflake
The Complete Masterclass
Python Bootcamp
From Zero to Hero in Python
Scala 3
Complete Development Masterclass
Power BI
Essential Training
Tableau
Essential Training
ETL - Batch & Historical
Enterprise-scale ETL pipeline for efficient data transfer between Delta tables and an on-premises SQL Server.
- Created ETL pipeline leveraging multiple technologies for data transfer
- Optimized load time using Python concurrency and parallel Databricks workflows
- Implemented comprehensive data validation and quality checks
- Added webhooks for notifications and dashboard monitoring
ETL - Data Streaming
Real-time data streaming pipeline using Google Pub/Sub for efficient data distribution.
- Built real-time streaming pipeline with Google Pub/Sub integration
- Implemented data transformation and multi-server distribution
- Added parallel processing and retry mechanisms
- Orchestrated via Windows service with auto-recovery
ML - Stock Prediction Model
Machine learning pipeline for stock market analysis and prediction using multiple models.
- Developed ML pipeline for Yahoo Finance data extraction
- Implemented data cleansing and technical indicator generation
- Trained and maintained multiple prediction models
- Created action recommendation system based on model predictions
Testing Tool - ETL
Python GUI application for automated ETL validation and testing.
- Developed automated ETL validation GUI tool
- Implemented smoke testing and standard checks
- Added support for heterogeneous data source comparison
- Automated validation summary reporting
Let's Build Something Amazing
I'm always interested in hearing about new projects and opportunities in data engineering and analytics.