- Data Pipeline Development : Design, build, and optimize scalable, secure, and reliable data pipelines to ingest, process, and transform large volumes of structured and unstructured data.

- Data Architecture : Architect and maintain data storage solutions, including data lakes, data warehouses, and databases, ensuring performance, scalability, and cost-efficiency.

- Data Integration : Integrate data from diverse sources, including APIs, third-party systems, and streaming platforms, ensuring data quality and consistency.

- Performance Optimization : Monitor and optimize data systems for performance, scalability, and cost, implementing best practices for partitioning, indexing, and caching.

- Collaboration : Work closely with data scientists, analysts, and software engineers to understand data needs and deliver solutions that enable advanced analytics, machine learning, and reporting.

- Data Governance : Implement data governance policies, ensuring compliance with data security requirements, privacy regulations (e.g., GDPR, CCPA), and internal standards.

- Automation : Develop automated processes for data ingestion, transformation, and validation to improve efficiency and reduce manual intervention.

- Mentorship : Guide and mentor junior data engineers, fostering a culture of technical excellence and continuous learning.

- Troubleshooting : Diagnose and resolve complex data-related issues, ensuring high availability and reliability of data systems.

Required Qualifications

- Education : Bachelor's or Master's degree in Computer Science, Engineering, Data Science, or a related field.

- Experience : 3+ years of experience in data engineering or a related role, with a proven track record of building scalable data pipelines and infrastructure.

Technical Skills :

- Proficiency in a programming language such as Python.

- Expertise in SQL and experience with NoSQL databases (e.g., MongoDB, Cassandra).

- Strong experience with cloud platforms (e.g., AWS, GCP) and cloud data warehouses (e.g., Redshift, BigQuery, Snowflake).

- Hands-on experience with ETL/ELT and workflow orchestration tools (e.g., Apache Airflow, Talend, Informatica) and data integration frameworks.

- Familiarity with big data technologies (e.g., Hadoop, Spark, Kafka) and distributed systems.

- Knowledge of containerization and orchestration tools (e.g., Docker, Kubernetes) is a plus.

Soft Skills :

- Excellent problem-solving and analytical skills.

- Strong communication and collaboration abilities.

- Ability to work in a fast-paced, dynamic environment and manage multiple priorities.

- Certifications (optional but preferred) : Cloud certifications (e.g., AWS Certified Data Analytics, Google Professional Data Engineer) or relevant data engineering certifications.