
DemandMatrix - Data Engineer - Big Data Platform

DemandMatrix
Anywhere in India/Multiple Locations
7 - 10 Years

Posted on: 18/07/2025

Job Description :


Key Responsibilities :


- Design, develop, and maintain robust big data pipelines using Spark (PySpark), Hadoop, and related technologies (a minimal PySpark sketch follows this list).

- Architect and implement large-scale data platforms on AWS or GCP with high availability, scalability, and performance.

- Handle structured and unstructured data transformation, cleansing, and integration for analytics, AI, and knowledge graph applications.

- Optimize data processing workflows with an in-depth understanding of data locality, disk I/O, network I/O, and shuffling strategies.

- Develop and deploy data applications in Unix/Linux-based environments, ensuring optimal system performance and reliability.

- Apply strong software engineering practices, including code reviews, testing, version control, and continuous integration.

- Configure and integrate data tools such as Sqoop, Flume, Pig, and Hive with RDBMS sources for ETL and big data querying use cases.

- Monitor and ensure performance, data quality, and security across data infrastructure and pipelines.

- Lead and contribute to architectural decisions and reference implementations adhering to industry best practices.

- Uphold and promote high standards in engineering, following Unix philosophy and functional programming principles.

- Collaborate with cross-functional teams to understand business needs and translate them into scalable data solutions.
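
To illustrate the kind of pipeline and shuffle-aware optimization work described above, here is a minimal PySpark batch sketch. It is purely illustrative and not part of the role's codebase: the bucket paths, column names, and job name are hypothetical.

```python
# A minimal, illustrative PySpark batch pipeline: read, cleanse, enrich, aggregate.
# All paths, schemas, and names below are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.functions import broadcast

spark = (
    SparkSession.builder
    .appName("events-daily-aggregation")            # hypothetical job name
    .config("spark.sql.shuffle.partitions", "200")  # tune to cluster and data size
    .getOrCreate()
)

# Read raw event data from cloud storage (s3a:// on AWS; gs:// works the same on GCP)
events = spark.read.parquet("s3a://example-bucket/raw/events/")  # hypothetical path

# Cleanse: drop malformed rows and derive a partition-friendly date column
clean = (
    events
    .dropna(subset=["event_id", "user_id"])
    .withColumn("event_date", F.to_date("event_ts"))
)

# Small dimension table: broadcasting it keeps the join map-side and avoids
# shuffling the large event table - one example of a shuffle strategy
dims = spark.read.parquet("s3a://example-bucket/dims/users/")    # hypothetical path
enriched = clean.join(broadcast(dims), on="user_id", how="left")

# Aggregate and write partitioned output for downstream analytics
(
    enriched
    .groupBy("event_date", "country")
    .agg(
        F.count("event_id").alias("events"),
        F.countDistinct("user_id").alias("users"),
    )
    .write.mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3a://example-bucket/curated/daily_events/")       # hypothetical path
)
```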


Required Skillsets :


- Strong foundation in Computer Engineering, Unix/Linux, Data Structures, and Algorithms.

- 7+ years of experience in big data platform architecture and development on AWS or GCP.

- Deep expertise in Apache Spark (especially PySpark) and Hadoop/MapReduce.

- Solid understanding of data processing models (streaming, batch, event-based); a streaming sketch follows this list.

- Experience with NoSQL stores like MongoDB, HBase (on HDFS), and Elasticsearch.

- Proficiency in Python and working with unstructured text data.

- Experience with ETL tools such as Sqoop and Flume, and big data querying tools like Hive and Pig.

- Familiarity with RDBMS systems and SQL performance tuning.

- Strong grasp of big data design patterns, functional computation models, and orthogonal code design.

- Passion for clean, maintainable code with a commitment to engineering excellence and standards.
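
As a companion to the batch sketch above, here is a minimal Spark Structured Streaming sketch illustrating the streaming model named in this list. It is an assumption-laden illustration, not part of the posting: the Kafka brokers, topic, and checkpoint path are hypothetical.

```python
# A minimal Structured Streaming counterpart to the batch pipeline above.
# Requires the spark-sql-kafka connector package on the classpath.
# Brokers, topic, and checkpoint path are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-streaming").getOrCreate()

# Read an event stream from Kafka
stream = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical brokers
    .option("subscribe", "events")                     # hypothetical topic
    .load()
)

# Project the payload and the Kafka message timestamp; real jobs would parse
# the payload into a structured schema here
events = stream.select(
    F.col("value").cast("string").alias("payload"),
    F.col("timestamp").alias("event_ts"),
)

# Count events per 5-minute window, tolerating 10 minutes of late data
counts = (
    events
    .withWatermark("event_ts", "10 minutes")
    .groupBy(F.window("event_ts", "5 minutes"))
    .count()
)

# Console sink for illustration; production jobs target durable sinks
query = (
    counts.writeStream
    .outputMode("update")
    .option("checkpointLocation", "/tmp/checkpoints/events")  # hypothetical path
    .format("console")
    .start()
)
query.awaitTermination()
```

The watermark bounds how long state for each window is retained, which is the standard trade-off between late-data tolerance and memory in event-based processing.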

