
Data Engineer - Python/Pandas

Pravi HR Advisory
Hyderabad
4 - 6 Years

Posted on: 17/07/2025

Job Description

We are seeking a highly skilled Data Engineer to join our AI/ML team. The ideal candidate will design and maintain robust data pipelines to support AI model development and deployment across large-scale, high-performance systems. You will work closely with data scientists, machine learning engineers, and software developers to ensure data reliability, scalability, and performance.


Key Responsibilities:

- Design, build, and optimize scalable ETL/ELT data pipelines for structured and unstructured data.

- Collaborate with AI/ML teams to prepare, clean, and manage datasets for model training and evaluation.

- Create and manage data lakes and data warehouses for efficient access and long-term storage.

- Develop real-time data ingestion pipelines using streaming platforms (e.g., Kafka, Spark Streaming, Flink).

- Automate and schedule data workflows using tools like Apache Airflow, Luigi, or Prefect.

- Ensure data quality, integrity, and lineage using validation frameworks and monitoring tools.

- Work with DevOps to deploy, monitor, and scale data infrastructure in cloud environments (AWS, GCP, Azure).

- Collaborate on building feature stores and maintaining versioned datasets for reproducible ML workflows.


Required Skills:

- Proficiency in Python and experience with data libraries (Pandas, NumPy, PyArrow).

- Strong knowledge of SQL and working experience with relational and NoSQL databases (PostgreSQL, MongoDB, BigQuery, etc.).

- Experience with data pipeline orchestration tools (Airflow, dbt, etc.).

- Familiarity with distributed systems and tools like Apache Kafka, Spark, or Hadoop.

- Understanding of the AI/ML lifecycle and experience in data preparation for machine learning.

- Hands-on experience with Docker, Git, and CI/CD pipelines.


Preferred Qualifications:

- Background in machine learning, data science, or big data systems.

- Experience with cloud data platforms (AWS Glue, S3, Redshift, GCP Dataflow, BigQuery).

- Exposure to ML metadata tracking tools (MLflow, Weights & Biases, or Neptune.ai).

- Knowledge of data governance and security best practices.


Education:

- Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.

