

Programming & Development › Database Design & Administration

Azure Databricks Data Engineer (Python)

$30/hr Starting at $25

I am a developer with extensive database experience, having worked with Microsoft SQL Server versions 2000 through 2019, including on-premises BI tooling (SSIS / SSRS). Currently I work with Azure services, specifically building notebooks and pipelines in Azure Databricks using Python. I also use Azure Data Factory, ADLS, Azure Key Vault for storing sensitive data, Azure Synapse Analytics, Git Flow, and related tooling.

My current project is Billing Hub. We consume data from many different sources and ingest it into the raw area. We apply cleansing and store the result in the curated layer, then perform aggregations and store that data in the aggregated layer. Finally there are two final layers: platform_services and invoice. The invoice layer can be thought of as the gold layer, from which the end user consumes the data.

Our team follows Agile methodologies, so we have daily stand-ups, sprint planning, refinement, retrospectives, and so on. We use Azure DevOps for creating User Stories and breaking them down into tasks.

In Azure Databricks we follow best practices: each notebook includes Markdown cells, titled commands (Cmd), an ADB parameterization section, function declarations, a section for importing libraries, and common reusable functionality imported from other notebooks. Code is indented following PEP 8 guidelines, sensitive data (e.g. passwords) is stored as secrets in Azure Key Vault, and we optimize tables with Z-Ordering. We also maintain separate environments for our daily work: DEV, QA, UAT, and PROD.

Thanks for your attention, and let's keep in touch!
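The raw → curated → aggregated flow described above is a medallion-style architecture. On the project it runs as PySpark notebooks in Azure Databricks; the pure-Python sketch below (with hypothetical field names like `source` and `amount`) only illustrates the shape of each layer, not the actual pipeline.

```python
# Conceptual sketch of the raw -> curated -> aggregated layering described
# above. The real pipeline uses PySpark in Azure Databricks; the data and
# field names here are hypothetical.

RAW = [
    {"source": "api_a", "amount": "100.50", "currency": "usd"},
    {"source": "api_b", "amount": " 20.00", "currency": "USD"},
    {"source": "api_a", "amount": None,     "currency": "USD"},  # bad row
]

def curate(rows):
    """Cleansing step: drop incomplete rows, normalize types and casing."""
    curated = []
    for r in rows:
        if r["amount"] is None:
            continue  # discard rows that fail basic quality checks
        curated.append({
            "source": r["source"],
            "amount": float(r["amount"]),
            "currency": r["currency"].upper(),
        })
    return curated

def aggregate(rows):
    """Aggregation step: total amount per source."""
    totals = {}
    for r in rows:
        totals[r["source"]] = totals.get(r["source"], 0.0) + r["amount"]
    return totals

curated = curate(RAW)       # curated layer
aggregated = aggregate(curated)  # aggregated layer
print(aggregated)  # {'api_a': 100.5, 'api_b': 20.0}
```

Each layer is a pure function of the previous one, which mirrors how the Databricks notebooks pass data from zone to zone.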
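The Z-Order optimization mentioned above works because Delta Lake records min/max statistics per file and can skip files whose range excludes a query's filter value; clustering related values together makes those ranges narrow. A simplified one-dimensional sketch of that file-skipping idea, in plain Python with hypothetical data:

```python
# Simplified illustration of why clustering data (as Delta Lake's
# OPTIMIZE ... ZORDER BY does) enables file skipping. Files keep min/max
# stats per column; a query can ignore any file whose range excludes the
# filter value. The data and file layout here are hypothetical.

def split_into_files(rows, rows_per_file=3):
    """Write rows into 'files', recording min/max stats for each."""
    files = []
    for i in range(0, len(rows), rows_per_file):
        chunk = rows[i:i + rows_per_file]
        files.append({"rows": chunk, "min": min(chunk), "max": max(chunk)})
    return files

def files_scanned(files, value):
    """Count files whose [min, max] range could contain the value."""
    return sum(1 for f in files if f["min"] <= value <= f["max"])

data = [7, 1, 9, 3, 8, 2, 5, 4, 6]

unclustered = split_into_files(data)        # values scattered across files
clustered = split_into_files(sorted(data))  # clustered, like Z-Ordering

print(files_scanned(unclustered, 5))  # 3 -- every file must be read
print(files_scanned(clustered, 5))    # 1 -- two files are skipped
```

Z-Ordering generalizes this to multiple columns at once by interleaving their bits, so range pruning stays effective on more than one filter column.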

About

$30/hr Ongoing



Skills & Expertise

Billing, Data Management, Data Warehouse, Database Design, Database Development, DevOps, Engineering, GitHub, Microsoft SQL Server, Python, SQL, Stored Procedures

0 Reviews

This Freelancer has not received any feedback.