
Job Description

Senior Advanced Data Engineer

Job Summary:

Lead the design, development, and optimization of highly scalable, robust, and secure data pipelines and data processing architectures within cloud environments.

Serve as a technical expert, driving the modernization and performance tuning of data ingestion, transformation, and consumption layers to support advanced analytics, machine learning initiatives, and operational reporting.

Define best practices for data governance, quality, security, and lifecycle management, ensuring data assets are reliable and meet enterprise-level SLAs.

Core Technical Responsibilities:

- Architect and implement end-to-end extract-transform-load (ETL) and ELT data pipelines using advanced features of a major cloud platform (AWS, GCP, or Azure).

- Design and manage complex data processing workflows using modern orchestration tools, demonstrating deep experience with Apache Airflow (a minimal DAG sketch follows this list).

- Develop and optimize high-volume data transformation jobs leveraging cloud-native tools like Google Dataflow (built on Apache Beam) and cloud data warehousing solutions like Snowflake or Google BigQuery (see the Beam sketch after this list).

- Implement data movement strategies using Managed File Transfer (MFT) solutions and traditional integration services like SQL Server Integration Services (SSIS).

- Use general-purpose programming languages, particularly Python and Java, to develop custom data processing modules, APIs, and automation scripts.

- Ensure data security and credential management within pipelines by integrating with secret-management solutions such as HashiCorp Vault or equivalent cloud-native services (e.g., AWS Secrets Manager, Azure Key Vault); a retrieval sketch follows this list.

- Design and maintain relational and non-relational database schemas, optimize complex SQL queries for performance, and manage data warehousing operations.

- Implement Continuous Integration/Continuous Deployment (CI/CD) practices for data pipelines using tools like Azure Pipelines or comparable services (e.g., GitLab CI, Jenkins on cloud infrastructure).
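
To illustrate the orchestration expectation above, the sketch below defines a two-task daily Airflow pipeline. It is a minimal sketch assuming the Airflow 2.x Python API; the DAG id, task ids, and the extract/load callables are hypothetical placeholders, not anything this role prescribes.

```python
# Minimal Airflow 2.x sketch: a two-task daily pipeline.
# dag_id, task_ids, and the callables are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(ds, **context):
    # Placeholder: pull the day's partition from the source system.
    print(f"extracting partition {ds}")


def load(ds, **context):
    # Placeholder: write transformed rows to the warehouse.
    print(f"loading partition {ds}")


with DAG(
    dag_id="daily_ingest",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task  # load runs only after extract succeeds
```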
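
In the same spirit, here is a minimal Apache Beam pipeline of the kind Google Dataflow executes, shown on the local DirectRunner; the input path, output prefix, and filter predicate are illustrative assumptions.

```python
# Minimal Apache Beam sketch; runs locally on the DirectRunner.
# Input path, output prefix, and the filter predicate are hypothetical.
import json

import apache_beam as beam

with beam.Pipeline() as pipeline:
    (
        pipeline
        | "Read" >> beam.io.ReadFromText("events.jsonl")
        | "Parse" >> beam.Map(json.loads)
        | "KeepErrors" >> beam.Filter(lambda e: e.get("level") == "ERROR")
        | "Serialize" >> beam.Map(json.dumps)
        | "Write" >> beam.io.WriteToText("errors")
    )
```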
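
For the credential-management bullet, a sketch of pulling warehouse credentials from AWS Secrets Manager at runtime instead of hard-coding them; the secret name and region are hypothetical, and HashiCorp Vault or Azure Key Vault would be accessed through their own clients in the same pattern.

```python
# Sketch: fetch credentials from AWS Secrets Manager at task runtime.
# The secret name and region are hypothetical placeholders.
import json

import boto3

client = boto3.client("secretsmanager", region_name="us-east-1")
secret = client.get_secret_value(SecretId="prod/warehouse/credentials")
credentials = json.loads(secret["SecretString"])  # e.g. {"user": ..., "password": ...}
```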

Cloud Platform and Data Tool Expertise:

- Demonstrated hands-on experience with core data engineering services of at least one major cloud provider (AWS, GCP, or Azure), e.g., S3/ADLS/Cloud Storage, Lambda/Cloud Functions, EMR/Dataproc.

- Expertise in writing high-performance, optimized SQL queries and stored procedures for large-scale data sets.

- Experience working with data presentation and reporting technologies, including SQL Server Reporting Services (SSRS) or equivalent BI tools.

- Proven ability to use Google Dataflow and Snowflake features, including an understanding of warehouse sizing, clustering keys, and performance-tuning mechanisms (see the sketch after this list).
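
As a hedged illustration of the Snowflake tuning levers mentioned above, the sketch below resizes a virtual warehouse around a heavy job and sets a clustering key so partition pruning works on a common filter column; the connection parameters, warehouse, table, and column names are all hypothetical.

```python
# Sketch: Snowflake performance levers via snowflake-connector-python.
# Connection parameters and object names are hypothetical placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="etl_user", password="...", warehouse="ETL_WH"
)
cur = conn.cursor()
# Scale the warehouse up for a heavy backfill, then back down afterwards.
cur.execute("ALTER WAREHOUSE ETL_WH SET WAREHOUSE_SIZE = 'LARGE'")
# A clustering key lets Snowflake prune micro-partitions on the filter column.
cur.execute("ALTER TABLE sales.fact_orders CLUSTER BY (order_date)")
cur.execute("ALTER WAREHOUSE ETL_WH SET WAREHOUSE_SIZE = 'XSMALL'")
cur.close()
conn.close()
```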

Required Experience and Competencies:

- At least 10 years of progressive, hands-on experience in data engineering, data warehousing, or software development focused on data-intensive applications.

- Demonstrated ability to function as a technical leader, mentoring junior engineers and driving technical direction for data initiatives.

- Deep understanding of data modeling techniques (Dimensional, 3NF, Data Vault) and their application in cloud data warehouses.

- Strong problem-solving skills and the ability to debug and optimize complex distributed data systems.

- Excellent communication skills for collaborating with product owners, data scientists, and business stakeholders.

- Bachelor's or Master's degree in Computer Science, Engineering, or a related quantitative field.

Preferred Skills and Certifications:

- Professional certification in a major cloud platform (AWS Certified Data Analytics - Specialty, Google Cloud Professional Data Engineer, or Microsoft Certified: Azure Data Engineer Associate).

- Experience with real-time data streaming technologies such as Apache Kafka or Google Pub/Sub (a producer sketch follows this list).

- Working knowledge of Infrastructure as Code (IaC) tools such as Terraform or CloudFormation/ARM templates for deploying data infrastructure.

- Familiarity with containerization and orchestration technologies such as Docker and Kubernetes for microservices supporting the data platform.

- Experience with NoSQL databases (e.g., MongoDB, Cassandra) and graph databases.
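
For the streaming item in this list, a minimal producer sketch using the kafka-python package; the broker address, topic name, and message shape are hypothetical placeholders, and Google Pub/Sub would use its own client library.

```python
# Minimal Kafka producer sketch using the kafka-python package.
# Broker address, topic, and payload are hypothetical placeholders.
import json

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("order-events", {"order_id": 42, "status": "shipped"})
producer.flush()  # block until buffered messages are actually delivered
```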
