I am a Senior Data Engineer with 15+ years of enterprise experience designing and scaling modern data infrastructure, ETL/ELT pipelines, and cloud-based solutions. My expertise includes Spark (Scala/PySpark), Hive, SQL, Python, Ab Initio, and Kafka, with a track record in performance tuning, cost optimization, and pipeline reliability. I have led large-scale migrations to Iceberg-based data lakehouse architectures on AWS S3, delivering analytics-ready, compliant data platforms in highly regulated industries such as healthcare and insurance.
What sets me apart is my ability to balance hands-on technical delivery with strategic design. I have built and maintained high-volume, HIPAA-compliant pipelines that process terabytes of structured and unstructured data daily. I have also automated workflows with shell scripts and job schedulers, and worked with cross-functional teams to keep pipelines aligned with evolving business needs.
Whether you need data ingestion, transformation, real-time streaming, or lakehouse modernization, I deliver secure, scalable, and efficient solutions tailored to your organization.