=> You will get support for writing highly scalable ETL & Data Processing jobs, Data Migration (Homogeneous & Heterogeneous), real-time data streaming, and analytics dashboards.
===Data Engineering===
• Creating Data Pipelines on Cloud Platforms such as Amazon Web Services (AWS) and Google Cloud Platform (GCP)
• Writing Extract-Transform-Load (ETL) jobs for data processing using technologies such as AWS Glue, PySpark, GCP Dataproc, AWS EMR, and AWS Lambda (see the batch ETL sketch after this list)
• Building real-time data pipelines for streaming data using Apache Kafka, AWS Kinesis, and GCP Dataflow (see the streaming sketch after this list)
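
As an illustration, a batch ETL job of the kind listed above usually reduces to a read-transform-write PySpark script. The sketch below is a minimal example only; the S3 paths, bucket name, and column names (order_id, order_ts, amount) are hypothetical placeholders, not taken from any real project.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

# Extract: read raw CSV files from a (hypothetical) landing bucket.
raw = spark.read.option("header", "true").csv("s3://example-bucket/raw/orders/")

# Transform: cast types, drop incomplete rows, derive a partition column.
clean = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .dropna(subset=["order_id", "order_ts"])
       .withColumn("order_date", F.to_date("order_ts"))
)

# Load: write curated Parquet, partitioned by date, ready for Athena or Redshift Spectrum.
clean.write.mode("overwrite").partitionBy("order_date").parquet("s3://example-bucket/curated/orders/")

spark.stop()

The same read-transform-write pattern runs largely unchanged as an AWS Glue job or on GCP Dataproc / AWS EMR; mainly the session setup and the storage URIs change.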
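For the real-time case, one common option among Kafka, Kinesis, and Dataflow is Spark Structured Streaming consuming a Kafka topic. Again a minimal sketch under stated assumptions: the broker address, topic name, event schema, and S3 paths are placeholders, and the spark-sql-kafka connector is assumed to be on the Spark classpath.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("streaming-sketch").getOrCreate()

# Assumed JSON event layout on the topic.
event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("value", DoubleType()),
    StructField("event_ts", TimestampType()),
])

# Read the stream from Kafka; the value column arrives as bytes and is parsed as JSON.
events = (
    spark.readStream
         .format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")
         .option("subscribe", "events")
         .load()
         .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
         .select("e.*")
)

# Append the parsed events to Parquet, with checkpointing for reliable file output.
query = (
    events.writeStream
          .format("parquet")
          .option("path", "s3://example-bucket/streams/events/")
          .option("checkpointLocation", "s3://example-bucket/checkpoints/events/")
          .outputMode("append")
          .start()
)

query.awaitTermination()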
========= Data Engineering Skills =========
Expertise:
Amazon Web Services:
RDS, EC2, Glue, Lambda, Data Migration Service (DMS), S3, SageMaker, Batch, ECS, ECR, Athena, Redshift, QuickSight, Kinesis.
Google Cloud Platform:
Dataflow, Pub/Sub, BigQuery, Dataproc, Google Cloud Storage, Google Data Studio, Cloud Functions, Google AutoML, Google Datalab.
Tools & Libraries:
PySpark, Spark, Scala, Python, Hadoop, Hive, Spark ML, Airflow.
Database:
PostgreSQL, MySQL, Oracle, DynamoDB, MongoDB, AWS Aurora, Microsoft SQL Server (MSSQL).
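
As a concrete illustration of the heterogeneous data migration mentioned at the top, the sketch below copies a PostgreSQL table into Parquet on S3 with PySpark over JDBC. All connection details, the table name, the partitioning bounds, and the bucket path are made-up placeholders, and the PostgreSQL JDBC driver is assumed to be available to Spark.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("migration-sketch").getOrCreate()

# Read the source table over JDBC, splitting the read on a numeric key for parallelism.
source = (
    spark.read.format("jdbc")
         .option("url", "jdbc:postgresql://db-host:5432/sales")
         .option("dbtable", "public.customers")
         .option("user", "etl_user")
         .option("password", "placeholder")
         .option("partitionColumn", "customer_id")
         .option("lowerBound", "1")
         .option("upperBound", "1000000")
         .option("numPartitions", "8")
         .load()
)

# Write to the target as Parquet; a Glue/Athena table can then be defined over this path.
source.write.mode("overwrite").parquet("s3://example-bucket/migrated/customers/")

spark.stop()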