Data Engineer (PySpark, Apache Airflow, dbt, Databricks, Kafka, Terraform, AWS, GitHub Actions, Kubernetes)
• Built and maintained highly scalable big data engineering workloads on a cloud-native, open-source stack
• Engineered robust, end-to-end ELT pipelines using a medallion architecture on Azure Databricks
• Migrated legacy RDBMS- or warehouse-based systems to a modern data lakehouse architecture to overcome performance and scalability issues
• Mitigated pipeline and storage performance issues to maximize throughput, minimize latency, and reduce operational cost
• Constructed automated, reliable, GitOps-powered CI/CD pipelines using Jenkins and Jenkins X
• Scaled containerized deployments with Kubernetes tooling in cloud and on-prem environments
• Collaborated with upstream and downstream stakeholders to align data engineering solutions with business needs
• Diagnosed data system health issues using metrics built with open-source monitoring tools
• Developed highly available, event-sourced microservices on the Akka platform
• Enforced compliance and best practices through technical leadership, conducting domain, design, code, and test review sessions