hirist

Job Description

About the Role :


We are seeking a highly skilled Data Engineer to design, build, and maintain scalable, high-performance data pipelines and systems that support advanced analytics and business intelligence.

You will work with cutting-edge big data technologies and frameworks to ingest, process, transform, and manage massive datasets.

The ideal candidate thrives in a fast-paced environment, is passionate about data innovation, and excels at bridging academic research with production-scale business solutions.


Key Responsibilities :

- Design, develop, and maintain robust, scalable big data pipelines using tools such as Apache Spark, Airflow, Bodo, Flume, Flink, and others for ingesting and processing terabytes of data efficiently.

- Architect and implement data warehouses and data marts utilizing technologies like Presto, Snowflake, Hadoop, and other cloud or on-premise data platforms.

- Build optimized data models and schemas to enable fast and reliable query performance.

- Develop and manage strategies for data ingestion, cleansing, transformation, and aggregation from heterogeneous sources including structured, semi-structured, and unstructured data.

- Understand and apply graph query languages such as GQL, Gremlin, and Cypher for modeling and querying complex relationships.

- Build scalable graph-based system architectures to support business use cases involving relationship and network analysis.

- Apply graph data technologies for business impact, including data management, infrastructure optimization, budgeting, trade-offs, and workflow/project management.

- Write efficient and maintainable code in Python, Scala, or Rust, especially for supercomputing environments handling large-scale datasets (TB+).

- Develop quantitative analytics and business operation dashboards using tools like Tableau, Apache Superset, or similar visualization platforms.

- Stay abreast of developments in natural language processing (NLP) and large language models (LLMs) and incorporate these capabilities where relevant.

- Evaluate cutting-edge academic methods and prototype their application in production systems.

- Use advanced problem-solving skills combined with deep understanding of statistics, probability, algorithms, and mathematics to deliver innovative data solutions.

- Collaborate effectively with data scientists, software engineers, product managers, and business stakeholders to deliver high-impact projects.

- Operate efficiently in a dynamic, fast-moving development environment with changing priorities, tight deadlines, and limited resources.

- Demonstrate strong self-motivation and ability to work independently as well as within cross-functional teams.


Required Qualifications & Skills :

- Bachelor's or Master's degree in Computer Science, Engineering, Data Science, or a related discipline.

- Extensive experience in building and managing big data pipelines using Apache Spark, Airflow, Flink, Kafka, or comparable frameworks.

- Strong expertise in data warehousing solutions and data modeling using platforms such as Snowflake, Presto, and Hadoop.

- Proficiency with graph databases and query languages (GQL, Gremlin, Cypher) is highly desirable.

- Advanced programming skills in Python, Scala, or Rust, with a focus on performance and scalability.

- Familiarity with data visualization tools such as Tableau, Superset, or equivalents.

- Experience with data mining, relational and NoSQL databases, and data automation practices.

- Understanding of natural language processing (NLP) and working with large language models is a plus.

- Strong foundation in probability, statistics, algorithms, and mathematical modeling.

- Excellent analytical, problem-solving, and communication skills.
