Posted on: 04/12/2025
Description:
Key Responsibilities:
Data Architecture & Platform Development:
- Design and implement robust cloud-based data architectures, including data lakes, data warehouses, and real-time streaming systems.
- Develop scalable ETL/ELT pipelines using cloud-native tools such as AWS Glue, Azure Data Factory, GCP Dataflow, Databricks, or Apache Airflow (a minimal sketch follows this list).
- Integrate structured, semi-structured, and unstructured data from various sources into centralized platforms.
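For illustration, a minimal sketch of the kind of ELT pipeline this work involves, assuming Apache Airflow 2.4+; the DAG id, task names, and extract/load logic are hypothetical placeholders:

    # Minimal sketch of a daily ELT DAG, assuming Apache Airflow 2.4+.
    # Source/target names and helper logic are illustrative placeholders.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator


    def extract_orders(**context):
        # Placeholder: pull raw order records from a source system.
        print("extracting raw orders for", context["ds"])


    def load_to_warehouse(**context):
        # Placeholder: load transformed records into the warehouse.
        print("loading curated orders for", context["ds"])


    with DAG(
        dag_id="orders_daily_elt",
        start_date=datetime(2025, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        extract = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
        load = PythonOperator(task_id="load_to_warehouse", python_callable=load_to_warehouse)
        extract >> load  # run extract before load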
Data Pipeline Engineering:
- Build, automate, and optimize high-performance data pipelines for ingestion, transformation, and processing.
- Ensure data availability, reliability, and integrity across all processing stages.
- Implement data quality checks, validations, and monitoring frameworks.
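A minimal sketch of a data quality gate of the kind described above, assuming pandas; the column names and null-rate threshold are illustrative assumptions:

    # Minimal sketch of a data-quality gate, assuming pandas is available.
    # Column names and the null-rate threshold are illustrative assumptions.
    import pandas as pd


    def validate_orders(df: pd.DataFrame, max_null_rate: float = 0.01) -> None:
        # Required columns must be present.
        required = {"order_id", "customer_id", "amount"}
        missing = required - set(df.columns)
        if missing:
            raise ValueError(f"missing columns: {sorted(missing)}")

        # Primary key must be unique.
        if df["order_id"].duplicated().any():
            raise ValueError("duplicate order_id values found")

        # Null rate on critical fields must stay under the threshold.
        null_rate = df["amount"].isna().mean()
        if null_rate > max_null_rate:
            raise ValueError(f"amount null rate {null_rate:.2%} exceeds {max_null_rate:.2%}")


    validate_orders(pd.DataFrame({"order_id": [1, 2], "customer_id": [10, 11], "amount": [9.5, 20.0]}))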
Cloud Technologies & Infrastructure:
- Work with cloud-native services such as AWS Redshift, Snowflake, BigQuery, Azure Synapse, S3, ADLS, GCS, and Lambda.
- Develop and manage infrastructure as code (IaC) using Terraform, CloudFormation, or ARM templates.
- Optimize cloud resource utilization to ensure cost efficiency and performance.
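As one concrete example of cost optimization, cold data can be tiered to cheaper storage classes; a minimal sketch using boto3, where the bucket name, prefix, and day thresholds are hypothetical:

    # Minimal sketch: tier cold objects to cheaper S3 storage classes, assuming
    # boto3 and valid AWS credentials. Bucket name and thresholds are illustrative.
    import boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket="example-data-lake",  # hypothetical bucket
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "tier-raw-zone",
                    "Status": "Enabled",
                    "Filter": {"Prefix": "raw/"},
                    "Transitions": [
                        {"Days": 30, "StorageClass": "STANDARD_IA"},
                        {"Days": 90, "StorageClass": "GLACIER"},
                    ],
                }
            ]
        },
    )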
Big Data & Analytics Support:
- Work with distributed data processing frameworks such as Apache Spark, Hadoop, PySpark, Kafka, and Flink.
- Support data scientists with feature engineering, dataset preparation, and model deployment pipelines.
- Enable analytics platforms and BI tools through curated datasets and semantic layers.
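A minimal sketch of curated-dataset preparation with PySpark, the kind of feature-engineering support described above; the input/output paths and column names are illustrative assumptions:

    # Minimal sketch of curated-dataset preparation, assuming PySpark is installed.
    # Paths and column names are illustrative assumptions.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("curate_orders").getOrCreate()

    orders = spark.read.parquet("s3://example-bucket/raw/orders/")  # hypothetical path

    # Aggregate raw events into per-customer features for downstream models and BI.
    features = (
        orders.groupBy("customer_id")
        .agg(
            F.count("*").alias("order_count"),
            F.sum("amount").alias("lifetime_value"),
            F.max("order_ts").alias("last_order_ts"),
        )
    )

    features.write.mode("overwrite").parquet("s3://example-bucket/curated/customer_features/")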
Data Governance & Security:
- Implement data governance practices including metadata management, data cataloging, lineage tracking, and security controls.
- Ensure compliance with organizational, regulatory, and cloud security standards.
- Manage access control, encryption, data masking, and privacy requirements.
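A minimal sketch of column-level data masking, using only the Python standard library; the record layout and the truncated-SHA-256 scheme are illustrative assumptions, not a vetted privacy design:

    # Minimal sketch of column-level masking before data leaves a restricted zone.
    # Pure standard library; the PII field names are illustrative assumptions.
    import hashlib


    def mask_email(email: str) -> str:
        # Keep the domain for analytics; replace the local part with a stable hash
        # so joins still work without exposing the raw address.
        local, _, domain = email.partition("@")
        digest = hashlib.sha256(local.encode()).hexdigest()[:12]
        return f"{digest}@{domain}"


    def mask_record(record: dict) -> dict:
        masked = dict(record)
        if "email" in masked:
            masked["email"] = mask_email(masked["email"])
        return masked


    print(mask_record({"customer_id": 42, "email": "jane.doe@example.com"}))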
Collaboration & Stakeholder Management:
- Work closely with product teams, business analysts, and engineering stakeholders to understand data needs.
- Participate in Agile ceremonies, provide estimates, and support sprint deliveries.
- Communicate technical concepts clearly to non-technical stakeholders.
Monitoring, Optimization & Troubleshooting:
- Monitor data workflows and pipelines to proactively identify issues and optimize performance.
- Troubleshoot failures in ingestion, transformation, or downstream usage.
- Conduct root-cause analysis and propose preventive improvements.
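A minimal sketch of the monitoring-and-retry pattern described above; the alerting hook is a hypothetical placeholder for a real pager or chat integration:

    # Minimal sketch of pipeline-step monitoring: retry with exponential backoff
    # and an alert hook on final failure. The alert function is a placeholder.
    import logging
    import time

    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("pipeline")


    def alert_on_call(step_name: str) -> None:
        # Placeholder for a real pager/Slack/webhook integration.
        log.error("ALERT: step %s exhausted retries", step_name)


    def run_with_retries(step, name: str, attempts: int = 3, base_delay: float = 2.0):
        for attempt in range(1, attempts + 1):
            try:
                return step()
            except Exception:
                log.exception("step %s failed (attempt %d/%d)", name, attempt, attempts)
                if attempt == attempts:
                    alert_on_call(name)
                    raise
                time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff


    run_with_retries(lambda: print("ingesting batch"), name="ingest_orders")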
Required Skills & Qualifications:
- Bachelor's or Master's degree in Computer Science, Data Engineering, Information Systems, or a related field.
- Expertise in at least one major cloud platform: AWS, Azure, or GCP.
- Strong programming skills in Python, SQL, Scala, or Java.
- Proficiency with ETL/ELT tools, data orchestration tools, and workflow scheduling frameworks.
- Experience with relational and NoSQL databases (e.g., Redshift, BigQuery, Synapse, MongoDB, DynamoDB).
- Solid understanding of data modeling, warehousing concepts, and distributed systems.
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1584749