Posted on: 07/10/2025
We are seeking an experienced Data Architect with strong expertise in designing, building, and maintaining scalable data pipelines and data processing solutions. The ideal candidate will have hands-on experience with modern data engineering tools and frameworks, solid programming capabilities, and strong analytical thinking to support data-driven decision-making.
Key Responsibilities:
- Design, develop, and manage robust data pipelines and ETL workflows using PySpark, SQL, and Apache Airflow (a minimal sketch follows this list).
- Integrate data from multiple sources using connectors such as JDBC, Impala, and Hive.
- Optimize data ingestion and transformation processes for performance and scalability.
- Collaborate with data scientists, analysts, and business teams to understand requirements and deliver reliable datasets.
- Implement data quality checks, error handling, and pipeline monitoring to ensure high data integrity.
- Work with large-scale data systems in a distributed environment.
- Participate in code reviews and contribute to continuous improvement of data engineering practices.
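To give candidates a concrete feel for the day-to-day work, here is a minimal sketch of the kind of pipeline described above: an Airflow DAG that triggers a PySpark aggregation and applies a simple row-count quality gate before writing. The DAG id, file paths, and column names are hypothetical placeholders, not a description of our production stack.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_daily_orders_etl():
    # Imported inside the task so the DAG file parses without PySpark installed.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("daily_orders_etl").getOrCreate()

    # Extract: raw order events (path is a placeholder).
    orders = spark.read.parquet("/data/raw/orders")

    # Transform: aggregate revenue per region per day.
    daily = (
        orders.withColumn("order_date", F.to_date("created_at"))
        .groupBy("order_date", "region")
        .agg(F.sum("amount").alias("revenue"), F.count("*").alias("order_count"))
    )

    # Data quality gate: fail the task (and surface an Airflow alert) on an empty load.
    if daily.count() == 0:
        raise ValueError("Quality check failed: no rows produced for daily_orders")

    # Load: write partitioned output for downstream consumers.
    daily.write.mode("overwrite").partitionBy("order_date").parquet(
        "/data/curated/daily_orders"
    )
    spark.stop()


with DAG(
    dag_id="daily_orders_etl",
    start_date=datetime(2025, 1, 1),
    schedule="@daily",  # "schedule_interval" on Airflow releases before 2.4
    catchup=False,
) as dag:
    PythonOperator(task_id="transform_orders", python_callable=run_daily_orders_etl)
```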
Mandatory Skills:
- Strong hands-on experience in PySpark, SQL, and Apache Airflow.
- Experience in developing and managing data pipelines and ETL workflows.
- Proficiency with data connectors such as JDBC, Impala, and Hive (see the connector sketch after this list).
- Good analytical and programming skills with attention to detail.
- Solid understanding of data structures, data modeling, and performance optimization.
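As an illustration of the connector skills listed above, the sketch below reads one source through the Hive metastore and another over JDBC, then joins them into a curated table. The host, credentials, and table names are invented placeholders; Impala sources are typically reached over the same JDBC mechanism.

```python
from pyspark.sql import SparkSession

# enableHiveSupport() lets Spark query Hive metastore tables directly;
# the JDBC options below target a hypothetical upstream Postgres database.
spark = (
    SparkSession.builder.appName("source_ingestion")
    .enableHiveSupport()
    .getOrCreate()
)

# Hive: query a warehouse table through the shared metastore.
customers = spark.sql("SELECT customer_id, region FROM warehouse.customers")

# JDBC: pull an operational table over a connector.
payments = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db-host:5432/sales")
    .option("dbtable", "public.payments")
    .option("user", "etl_user")
    .option("password", "***")
    .option("fetchsize", "10000")  # rows per round trip, tuned for throughput
    .load()
)

# Join the two sources into one curated dataset.
curated = payments.join(customers, "customer_id")
curated.write.mode("overwrite").saveAsTable("warehouse.curated_payments")
```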
Good to Have:
- Exposure to cloud platforms (Azure / AWS / GCP) and their data services.
- Experience with big data ecosystems (e.g., Hadoop, Databricks).
- Knowledge of version control (Git) and CI/CD for data pipelines (a sample pipeline unit test that a CI job could run follows this list).
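For the CI/CD point, a common setup versions pipeline code in Git and has the CI job run unit tests on every commit. Below is a sketch of such a test using pytest against a local Spark session; aggregate_revenue is a hypothetical helper that mirrors the aggregation logic sketched earlier, not an existing module.

```python
import pytest
from pyspark.sql import SparkSession, functions as F


def aggregate_revenue(orders_df):
    # Hypothetical transformation under test: revenue per region per day.
    return (
        orders_df.withColumn("order_date", F.to_date("created_at"))
        .groupBy("order_date", "region")
        .agg(F.sum("amount").alias("revenue"))
    )


@pytest.fixture(scope="session")
def spark():
    # A single-threaded local session keeps CI runs lightweight.
    return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()


def test_aggregate_revenue_sums_per_region(spark):
    rows = [
        ("2025-01-01", "EU", 10.0),
        ("2025-01-01", "EU", 5.0),
    ]
    df = spark.createDataFrame(rows, ["created_at", "region", "amount"])
    result = aggregate_revenue(df).collect()
    assert len(result) == 1
    assert result[0]["revenue"] == 15.0
```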
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1556239