I am a data engineer with 3+ years of experience building scalable data pipelines, ETL workflows, and cloud-based analytics solutions on AWS, GCP, Azure, and Databricks. I specialize in end-to-end data architecture and work hands-on with PySpark, Python, SQL, Hadoop, Delta Lake, Airflow, Dataflow, Step Functions, Glue, Redshift, BigQuery, and Dataform.
I help businesses migrate legacy systems, automate ETL jobs, design real-time and batch pipelines, implement data lakes and warehouses, and deliver high-quality, secure, analytics-ready datasets. I have built pipelines processing 2 TB/day, improved performance by 40%, and maintained 99.9% data accuracy through data governance, schema validation, and CI/CD automation.
Whether you need data migration, Databricks development, cloud ETL, data quality frameworks, orchestration workflows, or end-to-end pipeline design, I deliver reliable, production-grade solutions with complete documentation and ongoing support.