Posted on: 07/12/2025
Description:
Key Responsibilities:
- Design, develop, and maintain ETL pipelines using Databricks, PySpark, and Python (see the batch sketch after this list).
- Build scalable, high-performance data ingestion, transformation, and processing workflows.
- Optimize existing ETL pipelines for performance, scalability, and reliability.
- Develop and maintain data warehouse solutions, ensuring efficient data storage and retrieval.
- Design robust data models to support analytics, reporting, and machine learning use cases.
- Collaborate with data architects and analysts to define and implement best practices in data modeling.
- Work extensively with AWS cloud platform, including S3, Redshift, Glue, and other related services.
- Ensure secure, cost-effective, and highly available data solutions.
- Monitor and optimize cloud-based data pipelines and storage systems.
- Develop and support real-time data pipelines using streaming frameworks such as Kafka.
- Implement monitoring, alerting, and error-handling mechanisms for streaming data.
- Ensure timely delivery of real-time data to downstream consumers.
- Work with business intelligence and analytics teams to provide clean, structured datasets.
- Support visualization and analytics efforts using tools like Tableau, Power BI, or R.
- Provide actionable insights and recommendations based on data analysis.
- Mentor junior and mid-level data engineers, review code, and ensure adherence to best practices.
- Lead design discussions, technical reviews, and implementation strategies for complex data engineering projects.
- Collaborate with cross-functional teams to ensure successful project delivery.
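In practice, the batch responsibilities above look like the following. This is a minimal PySpark sketch, assuming a hypothetical S3 bucket, column names, and Delta output path; it illustrates the kind of ETL pipeline this role owns rather than prescribing an implementation.

    # Minimal batch ETL sketch: raw S3 CSV -> cleaned, partitioned Delta table.
    # All paths, column names, and the dedup key are hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("orders_etl").getOrCreate()

    # Ingest: read raw order files landed in S3 (hypothetical bucket/prefix).
    raw = (spark.read
           .option("header", "true")
           .csv("s3://example-raw-bucket/orders/"))

    # Transform: type the amount column, drop malformed rows,
    # and deduplicate on the (hypothetical) order_id key.
    cleaned = (raw
               .withColumn("amount", F.col("amount").cast("double"))
               .filter(F.col("order_id").isNotNull())
               .dropDuplicates(["order_id"]))

    # Load: write a Delta table partitioned for efficient retrieval,
    # per the data warehouse responsibilities above.
    (cleaned.write
     .format("delta")
     .mode("overwrite")
     .partitionBy("order_date")
     .save("s3://example-curated-bucket/orders_delta/"))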
Required Skills & Qualifications:
- Minimum 8 years of experience in technology (application development or production support).
- Minimum 5 years of hands-on experience in data engineering, ETL development, Python, PySpark, and Databricks.
- Strong understanding of data warehousing, data modeling, and big data concepts.
- Proficiency with Spark, Hive, and SQL.
- Experience with cloud platforms, preferably AWS, and related data services.
- Experience with streaming frameworks such as Kafka (a minimal streaming sketch follows this list).
- Familiarity with data visualization and analytics tools (Tableau, R, Power BI).
- Strong problem-solving, debugging, and performance tuning skills.
- Excellent collaboration, communication, and mentoring abilities.
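On the streaming side, Spark Structured Streaming is a common way to consume Kafka topics on Databricks. Here is a minimal sketch, assuming a hypothetical broker address, topic name, payload schema, and checkpoint/output paths (on Databricks the Kafka source is available out of the box; elsewhere it requires the spark-sql-kafka package):

    # Minimal Structured Streaming sketch: Kafka topic -> Delta sink.
    # Broker, topic, schema, and paths are all hypothetical.
    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.types import StructType, StructField, StringType, DoubleType

    spark = SparkSession.builder.appName("orders_stream").getOrCreate()

    event_schema = StructType([
        StructField("order_id", StringType()),
        StructField("amount", DoubleType()),
    ])

    # Subscribe to the (hypothetical) Kafka topic.
    events = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", "broker.example.com:9092")
              .option("subscribe", "orders")
              .load())

    # Kafka values arrive as bytes; parse the JSON payload into columns.
    parsed = (events
              .select(F.from_json(F.col("value").cast("string"), event_schema)
                      .alias("event"))
              .select("event.*"))

    # Deliver to downstream consumers; the checkpoint lets the stream
    # recover from failures without dropping or duplicating events.
    (parsed.writeStream
     .format("delta")
     .option("checkpointLocation", "s3://example-bucket/checkpoints/orders/")
     .outputMode("append")
     .start("s3://example-bucket/orders_stream_delta/"))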
Preferred Qualifications:
- Experience in end-to-end data pipeline architecture for large-scale systems.
- Exposure to CI/CD pipelines for data engineering workflows.
- Knowledge of data governance, quality, and compliance frameworks.
- Experience with machine learning pipelines and ML feature engineering.
Functional Area: Data Engineering
Job Code: 1586084