Posted on: 05/11/2025
Description:
We're looking for a Data Engineer I to help us build and scale data infrastructure that powers real-time decision-making and intelligent product experiences.
If transforming messy data into clean pipelines excites you, and you love designing systems that perform at scale, you'll thrive in this role. You'll work at the intersection of cloud technologies, big data, and automation, shaping the way we handle, manage, and leverage data across the organization.
Responsibilities:
- Build and maintain reliable, scalable ETL/ELT pipelines using modern cloud-based tools (see the sketch after this list)
- Design and optimize data lakes and data warehouses for real-time and batch processing
- Ingest, transform, and organize large volumes of structured and unstructured data
- Collaborate with analysts, data scientists, and backend engineers to define data needs
- Monitor, troubleshoot, and improve pipeline performance, cost-efficiency, and reliability
- Implement data validation, consistency checks, and quality frameworks
- Apply data governance best practices and ensure compliance with privacy and security standards
- Use CI/CD tools to automate workflow and pipeline deployments
- Automate repetitive tasks using scripting, workflow tools, and scheduling systems
- Translate business logic into data logic while working cross-functionally
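To make the pipeline and validation work above concrete, here is a minimal sketch of a daily ETL job with a consistency check, written as an Apache Airflow DAG (one of the orchestration tools named under Requirements). The bucket path, staging file, and checks are hypothetical placeholders, not details of our actual stack.

```python
# Minimal sketch of a daily ETL DAG with a validation step (Airflow 2.x).
# The S3 path, staging file, and checks below are hypothetical placeholders.
from datetime import datetime

import pandas as pd
from airflow import DAG
from airflow.operators.python import PythonOperator

STAGING_PATH = "/tmp/orders_staging.parquet"  # placeholder location


def extract_and_transform():
    # Extract raw CSV (placeholder path), then clean it with pandas.
    df = pd.read_csv("s3://example-bucket/raw/orders.csv")
    df = df.dropna(subset=["order_id"]).drop_duplicates("order_id")
    df["order_date"] = pd.to_datetime(df["order_date"])
    df.to_parquet(STAGING_PATH)


def validate():
    # Consistency checks: fail the run early if the data looks wrong.
    df = pd.read_parquet(STAGING_PATH)
    assert len(df) > 0, "staging table is empty"
    assert df["order_id"].is_unique, "duplicate order_id after dedup"


with DAG(
    dag_id="orders_etl",
    start_date=datetime(2025, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    etl = PythonOperator(task_id="extract_and_transform",
                         python_callable=extract_and_transform)
    check = PythonOperator(task_id="validate", python_callable=validate)
    etl >> check  # run validation only after the transform succeeds
```

Failing the DAG on a bad consistency check (rather than silently loading) is the pattern behind the "data validation and quality frameworks" responsibility above.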
Requirements:
- 3 to 5 years of experience in Data Engineering or a similar role
- Strong foundation in cloud-native data infrastructure and scalable architecture design
- Strong Python skills and familiarity with libraries such as pandas and PySpark (see the sketch after this list)
- Hands-on experience with at least one major cloud provider (AWS, Azure, GCP)
- Experience with ETL tools like AWS Glue, Azure Data Factory, GCP Dataflow, or Apache NiFi
- Proficient with storage systems like S3, Azure Blob Storage, GCP Cloud Storage, or HDFS
- Familiar with data warehouses like Redshift, BigQuery, Snowflake, or Synapse
- Experience with serverless compute services like AWS Lambda, Azure Functions, or GCP Cloud Functions
- Familiar with data streaming tools like Kafka, Kinesis, Pub/Sub, or Event Hubs
- Proficient in SQL, with knowledge of relational (PostgreSQL, MySQL) and NoSQL (MongoDB, DynamoDB) databases
- Familiar with big data frameworks like Hadoop or Apache Spark
- Experience with orchestration tools like Apache Airflow, Prefect, GCP Workflows, or ADF Pipelines
- Familiar with CI/CD tools like GitLab CI, Jenkins, or Azure DevOps
- Proficient with Git, GitHub, or GitLab workflows
- Strong communication and collaboration skills, and a problem-solving mindset
- Experience with data observability or monitoring tools (bonus points)
- Contributions to internal data platform development (bonus points)
- Comfort working in data mesh or distributed data ownership environments (bonus points)
- Experience building data validation pipelines with Great Expectations or similar tools (bonus points)
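To give a flavor of the Python/PySpark work this role involves, here is a minimal batch-transform sketch. The input path, column names, and aggregation are hypothetical examples rather than a description of our actual pipelines.

```python
# Minimal PySpark batch-transform sketch; the paths and column names are
# hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily_revenue").getOrCreate()

# Ingest raw event data (e.g., from S3, GCS, or HDFS).
events = spark.read.json("s3a://example-bucket/raw/events/")

# Transform: keep completed purchases and aggregate revenue per day.
daily_revenue = (
    events
    .filter(F.col("status") == "completed")
    .withColumn("event_date", F.to_date("event_ts"))
    .groupBy("event_date")
    .agg(F.sum("amount").alias("revenue"),
         F.countDistinct("user_id").alias("buyers"))
)

# Load: write partitioned Parquet for downstream warehouse ingestion.
daily_revenue.write.mode("overwrite").partitionBy("event_date") \
    .parquet("s3a://example-bucket/curated/daily_revenue/")
```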
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1570042