Data Engineer (ETL), Kafka & Python

$5/hr Starting at $25

• 13+ years of hands-on experience in IT, including 4+ years in Big Data technologies.
• Hands-on experience in software design and development, and as an administrator and architect, with DataStax Cassandra, Kafka, Spark, PySpark, Hive, HDFS, ZooKeeper, Elasticsearch, Kibana, Python, Sqoop, and Hadoop.
• Experience in Cassandra database modeling and administration, with in-depth knowledge of Apache Cassandra architecture.
• Experience in designing data models in Cassandra and working with Cassandra Query Language (CQL).
• Building and managing Cassandra using DataStax OpsCenter; experience in Cassandra backup and recovery/restore.
• Hands-on experience building a multi-node Cassandra cluster on a single machine/server.
• Generated Cassandra backup reports using Python and shell scripts.
• Hands-on experience building multi-node Kafka and Elasticsearch clusters.
• Generated reports per business requirements from Cassandra using Python.
• Experience in performance tuning and maintenance of Cassandra databases.
• Developed Kafka producers and consumers in Python using pykafka.
• Knowledge of Cassandra read and write paths and internal architecture.
• Developed ETL / data preparation / pre-processing jobs in Python to persist data into Cassandra and Elasticsearch.
• Built data pipelines using Kafka, Python, Elasticsearch, and Cassandra.
• Experience in developing scalable solutions using Cassandra (NoSQL), Python, and Kafka.
• Experience in developing SQL, stored procedures, functions, and triggers in DB2; comfortable working with databases such as MySQL, NoSQL stores, and MSSQL.
• Worked closely on Cassandra history and incremental loads from Teradata and DB2 databases, resolving loading issues and tuning the loader for optimal performance.
• Experience with Cassandra nodetool utilities such as repair, flush, compact, snapshot, refresh, cfstats, tpstats, ring, info, and drain.
• Data streaming using Kafka.
• Built a Docker image for DataStax Cassandra and created multiple containers to run a multi-node instance on a single server.
• Excellent understanding of Hadoop architecture and the components of a Hadoop cluster (Job Tracker, Task Tracker, Name Node, and Data Node).
• Understanding of YARN (Resource Manager, Node Manager, Application Manager, and Container).
• Developed scripts/code to ingest data into HDFS from various data sources.
• Developed Python programs to automate manual system activities.
• Importing and exporting data from relational and NoSQL databases using Sqoop.
• Prepared Sqoop, Flume, and Hive jobs using Oozie workflows (Designer) and automated them with Oozie scheduling (Coordinator).
• Developed a Python and Cassandra integration program to generate various reports.
• Experience with Agile/Scrum and the SDLC Waterfall model.
• Communicating effectively with customers to gather requirements and report task status.
• Experienced

About

$5/hr Ongoing


Skills & Expertise

Administrative Assistant, Apache, Apache Hive, Cassandra, Data Management, Database Development, DB2, Design, Elasticsearch, ETL, Flask, Management, Modeling, Node.js, NoSQL, Program Management, Python, Reports, Requirements Analysis, Software Development, SQL, Systems Engineering, Teradata, Writing

0 Reviews

This Freelancer has not received any feedback.