hirist

Job Description :


Key Responsibilities :


- Design, build, and optimize data pipelines and workflows on the Databricks platform using Apache Spark.

- Develop scalable and efficient ETL/ELT processes to ingest, transform, and process large volumes of structured and unstructured data.

- Collaborate with data scientists, analysts, and business stakeholders to understand data requirements and deliver solutions.

- Implement data quality checks, monitoring, and alerting to ensure pipeline reliability and accuracy.

- Optimize performance of data processing jobs and storage for cost-efficiency and scalability.

- Work closely with cloud engineering teams to deploy and maintain data infrastructure on cloud platforms such as Azure, AWS, or GCP.

- Develop and maintain technical documentation for data architecture and pipeline processes.

- Support data governance and compliance by implementing security and access controls.

- Troubleshoot and resolve data issues promptly to minimize downtime.
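The data-quality checks mentioned above can be illustrated with a minimal sketch. This is a toy example in plain Python standing in for a Spark/Databricks job; the `validate_batch` helper, its field names, and its null-ratio threshold are hypothetical, not part of this role's actual stack.

```python
# Toy data-quality gate for a pipeline batch (hypothetical helper;
# plain Python stands in for a Spark/Databricks validation step).

def validate_batch(rows, required_fields, max_null_ratio=0.1):
    """Return (ok, report): reject an empty batch, or one where any
    required field is null more often than max_null_ratio allows."""
    if not rows:
        return False, {"reason": "empty batch"}
    report = {}
    for field in required_fields:
        nulls = sum(1 for row in rows if row.get(field) is None)
        ratio = nulls / len(rows)
        report[field] = ratio
        if ratio > max_null_ratio:
            return False, report
    return True, report

# Example batch: one of three "amount" values is missing (ratio 1/3).
batch = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},
    {"id": 3, "amount": 7.5},
]
ok, report = validate_batch(batch, ["id", "amount"], max_null_ratio=0.5)
```

In a real pipeline this kind of gate would typically run as a step in the job itself, failing fast and firing an alert before bad data reaches downstream tables.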


Required Skills & Qualifications :


- 5+ years of experience in data engineering or related roles.

- Strong hands-on experience with Databricks and Apache Spark (PySpark, Scala, or Spark SQL).

- Proficient in programming languages such as Python, Scala, or Java.

- Experience with cloud platforms (Azure, AWS, or GCP) and their data services (e.g., Azure Data Lake, S3, BigQuery).

- Solid understanding of ETL/ELT concepts, data modeling, and data warehousing principles.

- Experience working with large-scale datasets and distributed computing frameworks.

- Familiarity with SQL and NoSQL databases.

- Knowledge of DevOps practices including CI/CD pipelines and infrastructure as code.

- Strong problem-solving skills and ability to work in a fast-paced environment.

- Excellent communication and collaboration skills.


Preferred Qualifications :


- Experience with workflow orchestration tools like Apache Airflow or Azure Data Factory.

- Knowledge of containerization (Docker, Kubernetes).

- Understanding of data governance, security, and compliance standards.

- Familiarity with machine learning pipelines and tools.
