
Job Description

Job Title: Databricks Developer

Job Overview:

We are seeking an experienced Databricks Developer to join our data engineering team. The ideal candidate will have expertise in building data solutions with Databricks, Apache Spark, and the Azure or AWS cloud platforms. As a Databricks Developer, you will build scalable data pipelines, optimize big data processing workflows, and collaborate with data scientists and analysts to derive insights from large datasets.

Responsibilities:

- Design, develop, and maintain efficient data pipelines using Databricks on Azure/AWS.

- Collaborate with data engineers and data scientists to understand business requirements and build data solutions.

- Implement ETL workflows to process and transform large datasets in Spark and Databricks.

- Write scalable and optimized PySpark/Scala/SQL code for data processing.

- Integrate and process structured, semi-structured, and unstructured data from different sources (e.g., databases, flat files, APIs).

- Optimize Spark jobs for performance, scalability, and cost efficiency.

- Maintain and improve existing data pipelines and production workflows.

- Monitor and troubleshoot data pipelines, ensuring reliability and performance.

- Leverage Databricks notebooks for data analysis, visualization, and reporting.

- Ensure data quality, security, and governance within the data pipeline.

- Collaborate with stakeholders to design solutions that meet business needs.

Requirements:

- 3+ years of hands-on experience as a Databricks Developer or Big Data Engineer.

- Strong experience with Databricks, Apache Spark, and related technologies (e.g., Delta Lake, MLflow).

- Proficiency in Python, Scala, and SQL for data processing.

- Experience working with cloud platforms such as Azure or AWS and their respective big data services (e.g., Azure Databricks, AWS Glue, Amazon S3).

- Knowledge of ETL processes and workflow automation.

- Experience with distributed computing, data storage systems, and performance optimization techniques.

- Strong understanding of data lakes, data warehouses, and data pipelines.

- Familiarity with data formats like Parquet, Avro, ORC, and JSON.

- Experience in data integration and working with REST APIs.

- Good understanding of CI/CD pipelines, version control (e.g., Git), and deployment workflows.

- Familiarity with data visualization tools (e.g., Power BI, Tableau, or Databricks notebooks).

- Strong problem-solving and troubleshooting skills.

- Ability to work in a collaborative and fast-paced environment.
