hirist

Job Description

Data Ingestion & Integration :

- Ingest data from a variety of sources such as Azure SQL DB, Google Analytics, Google Play Store, Apple App Store, Salesforce, and others.

- Develop and optimize ETL/ELT pipelines to transform data from CSV, JSON, SQL tables, and APIs into usable formats.

- Work with REST APIs to pull data from various external sources and integrate it into our data ecosystem.
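The ingestion responsibilities above can be sketched with Python's standard library alone. The endpoint shape and field names below (a `{"data": [...]}` envelope with `id`/`metric` fields) are hypothetical placeholders, not any specific vendor API:

```python
import json
from urllib.request import urlopen  # stdlib HTTP client


def fetch_json(url: str) -> dict:
    """Pull a JSON payload from a REST endpoint."""
    with urlopen(url) as resp:
        return json.load(resp)


def rows_from_payload(payload: dict) -> list:
    """Flatten the hypothetical {'data': [...]} envelope into flat rows,
    defaulting a missing metric to 0 so downstream steps see a uniform shape."""
    return [
        {"id": item["id"], "metric": item.get("metric", 0)}
        for item in payload.get("data", [])
    ]


# Illustrative payload shaped like a typical analytics API response:
sample = {"data": [{"id": "a1", "metric": 42}, {"id": "a2"}]}
print(rows_from_payload(sample))  # [{'id': 'a1', 'metric': 42}, {'id': 'a2', 'metric': 0}]
```

Keeping the flattening step as a pure function (separate from the network call) makes the pipeline easy to unit-test against canned payloads.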

Data Transformation & Modeling :

- Design and implement efficient data transformation processes to cleanse, aggregate, and enrich data.

- Apply industry best practices for data modeling to ensure scalability, performance, and data integrity.

- Collaborate with data analysts and data scientists to provide clean, high-quality datasets for reporting and analysis.
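A minimal sketch of the cleanse-then-aggregate pattern described above, in plain Python. The record shape (`country`/`amount`) is illustrative; in practice the same steps would typically run in Spark or SQL:

```python
from collections import defaultdict


def cleanse(rows):
    """Drop rows with missing keys, normalize strings, cast amounts to float."""
    out = []
    for r in rows:
        if not r.get("country") or r.get("amount") in (None, ""):
            continue  # skip unusable records
        out.append({"country": r["country"].strip().upper(),
                    "amount": float(r["amount"])})
    return out


def aggregate(rows):
    """Sum amounts per country -- a minimal aggregate/enrich step."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["country"]] += r["amount"]
    return dict(totals)


raw = [{"country": " in ", "amount": "10.5"},
       {"country": "IN", "amount": 4.5},
       {"country": "", "amount": 1}]
print(aggregate(cleanse(raw)))  # {'IN': 15.0}
```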

Databricks & Cluster Management :

- Utilize Databricks for data processing, transformation, and orchestration tasks.

- Manage and optimize Databricks clusters for performance, reliability, and cost-effectiveness.

- Implement Databricks workflows to automate and streamline data pipelines.

- Use Unity Catalog for data governance and metadata management, ensuring compliance and data access control.
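As a rough illustration of the workflow-automation point, the snippet below builds a job-definition payload in Python. The field names follow the shape of the Databricks Jobs API 2.1 create request; the job name, notebook path, cluster id, and schedule are all hypothetical placeholders:

```python
def workflow_payload(job_name: str, notebook_path: str, cluster_id: str) -> dict:
    """Build a jobs-create payload. Field names follow the Databricks
    Jobs API 2.1 shape; all values here are hypothetical placeholders."""
    return {
        "name": job_name,
        "tasks": [
            {
                "task_key": "ingest",
                "notebook_task": {"notebook_path": notebook_path},
                "existing_cluster_id": cluster_id,
            }
        ],
        "schedule": {
            "quartz_cron_expression": "0 0 2 * * ?",  # nightly at 02:00
            "timezone_id": "UTC",
        },
    }


payload = workflow_payload("nightly-ingest", "/Repos/etl/ingest", "hypothetical-cluster-id")
print(payload["tasks"][0]["task_key"])  # ingest
```

Building the payload as data (rather than hand-editing JSON) keeps job definitions reviewable and easy to template across environments.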

Experience :

- 5+ years of hands-on experience in data engineering or a related field.

- Proven experience with Databricks and Databricks workflows, including cluster management and data pipeline orchestration.

- Strong experience in data ingestion from SQL databases (Azure SQL DB), APIs (Google Analytics, Google Play Store, Apple App Store, Salesforce), and file-based sources (CSV, JSON).

Technical Skills :

- Proficiency in SQL for data manipulation and transformation.

- Experience with Python or Scala for writing and managing data workflows.

- Working knowledge of REST APIs for data integration.

- Experience in data transformation using Apache Spark, Delta Lake, or similar technologies.

- Knowledge of cloud platforms such as Azure, with a focus on Azure SQL DB.

- Familiarity with Unity Catalog for metadata management and governance.
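The SQL proficiency listed above boils down to transformation queries like the one below. An in-memory SQLite database stands in for Azure SQL DB here (same SQL idioms, zero setup); the table and column names are illustrative:

```python
import sqlite3

# In-memory SQLite as a stand-in for Azure SQL DB; schema is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE installs (store TEXT, day TEXT, n INTEGER)")
conn.executemany("INSERT INTO installs VALUES (?, ?, ?)",
                 [("play", "2024-01-01", 120),
                  ("app_store", "2024-01-01", 80),
                  ("play", "2024-01-02", 150)])

# Aggregate installs per store -- a typical reporting transformation.
rows = conn.execute(
    "SELECT store, SUM(n) FROM installs GROUP BY store ORDER BY store"
).fetchall()
print(rows)  # [('app_store', 80), ('play', 270)]
```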

Data Engineering Best Practices :

- Understanding of data architecture, data pipelines, and the ETL/ELT process.

- Experience in data modeling, optimizing queries, and working with large datasets.

- Familiarity with data governance, metadata management, and data access controls.

Preferred Skills :

- Knowledge of Apache Kafka or other real-time streaming technologies.

- Experience with Data Lake or Data Warehouse technologies.

- Familiarity with additional data transformation tools such as Apache Airflow or dbt.

- Understanding of machine learning workflows and data pipelines.
