- Data Engineer designing and deploying scalable ETL pipelines, automating data workflows, and building reliable cloud-based data systems, with a focus on improving data accessibility, consistency, and performance across platforms.
- Key expertise in Python, SQL, PySpark, and Delta Lake.
- In-depth knowledge of Databricks and AWS services such as Glue, Lambda, Redshift, Athena, and S3 for building end-to-end data engineering solutions, with skills transferable to the Azure ecosystem (ADF, Azure SQL, Azure Functions, ADLS, etc.); see the boto3 sketch after this list.
- Proficient in:
- Data Ingestion
- Data Aggregation
- Data Transformation
- Data Pre-processing
- Supervised and unsupervised machine learning techniques
- Data Validation
- Implemented an end-to-end ETL pipeline with monitoring and transformation following the medallion architecture, along with a metadata-driven pipeline that streamlines ingestion, transformation, and validation across multiple data sources (see the PySpark/Delta sketch after this list).
- Experienced in integrating data pipelines with BI tools to enable faster decision-making and improve data visibility.
- Involved in data governance, lineage tracking, and monitoring to ensure consistent data quality and operational reliability (see the validation sketch after this list).
- Experience in creating CI/CD workflows and pipelines using Azure DevOps and GitHub Actions.
- I am currently exploring frameworks like Apache Airflow, dbt, Great Expectations and Apache Iceberg to diversify my personal toolkit.
- I am also exploring the Infrastructure as Code paradigm and DevOps practices using Jenkins and Terraform.
- I am an AWS Certified Developer - Associate and AWS Certified Data Engineer - Associate.
- I am also Microsoft Certified: Azure Fundamentals.
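
As a quick illustration of how the AWS services above can fit together, here is a minimal boto3 sketch that triggers a Glue ETL job and then queries the curated data through Athena. The job name, database, table, and S3 output location are placeholders, not references to a real deployment.

```python
import boto3

# Placeholder names for illustration only.
GLUE_JOB_NAME = "daily-sales-etl"
ATHENA_DATABASE = "analytics"
ATHENA_OUTPUT = "s3://my-athena-results/queries/"

glue = boto3.client("glue")
athena = boto3.client("athena")

# Kick off a Glue ETL job run.
run = glue.start_job_run(JobName=GLUE_JOB_NAME)
print("Started Glue job run:", run["JobRunId"])

# Query the transformed data in S3 through Athena.
query = athena.start_query_execution(
    QueryString="SELECT region, SUM(amount) AS total FROM sales GROUP BY region",
    QueryExecutionContext={"Database": ATHENA_DATABASE},
    ResultConfiguration={"OutputLocation": ATHENA_OUTPUT},
)
print("Athena query execution id:", query["QueryExecutionId"])
```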
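
To sketch the medallion-style, metadata-driven approach described above: a per-source metadata record drives a bronze-to-silver hop in PySpark with Delta Lake. Paths, column names, and dedup keys are illustrative only, and a Delta-enabled Spark environment (e.g. Databricks) is assumed.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("medallion-demo").getOrCreate()

# Metadata describing one source; in practice this would come from a config table.
source = {
    "name": "orders",
    "raw_path": "s3://my-lake/raw/orders/",        # landing zone (placeholder)
    "bronze_path": "s3://my-lake/bronze/orders/",
    "silver_path": "s3://my-lake/silver/orders/",
    "dedup_keys": ["order_id"],
}

# Bronze: ingest raw files as-is, adding load metadata.
bronze = (
    spark.read.json(source["raw_path"])
    .withColumn("_ingested_at", F.current_timestamp())
)
bronze.write.format("delta").mode("append").save(source["bronze_path"])

# Silver: clean, cast, and deduplicate on the keys declared in the metadata.
silver = (
    spark.read.format("delta").load(source["bronze_path"])
    .dropDuplicates(source["dedup_keys"])
    .withColumn("order_date", F.to_date("order_date"))
    .filter(F.col("order_id").isNotNull())
)
silver.write.format("delta").mode("overwrite").save(source["silver_path"])
```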
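
And a lightweight sketch of the kind of validation step mentioned above, using plain PySpark checks as a stand-in for fuller frameworks like Great Expectations. The table path and column names are placeholders.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq-checks").getOrCreate()

def run_quality_checks(df, required_columns, key_column):
    """Lightweight checks: required columns present, no null or duplicate keys."""
    missing = [c for c in required_columns if c not in df.columns]
    null_keys = df.filter(F.col(key_column).isNull()).count()
    duplicate_keys = df.count() - df.dropDuplicates([key_column]).count()
    passed = not missing and null_keys == 0 and duplicate_keys == 0
    return {
        "missing_columns": missing,
        "null_keys": null_keys,
        "duplicate_keys": duplicate_keys,
        "passed": passed,
    }

# Validate the silver table (placeholder path/columns) before exposing it to BI tools.
silver = spark.read.format("delta").load("s3://my-lake/silver/orders/")
checks = run_quality_checks(silver, ["order_id", "order_date", "amount"], "order_id")
if not checks["passed"]:
    raise ValueError(f"Data quality checks failed: {checks}")
```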