Posted on: 30/07/2025
About the Role:
You will work with cutting-edge big data technologies and frameworks to ingest, process, transform, and manage massive datasets.
The ideal candidate thrives in a fast-paced environment, is passionate about data innovation, and excels at bridging academic research with production-scale business solutions.
Key Responsibilities:
- Design, develop, and maintain robust, scalable big data pipelines using tools such as Apache Spark, Airflow, Bodo, Flume, and Flink for ingesting and processing terabytes of data efficiently (a minimal pipeline sketch follows this list).
- Architect and implement data warehouses and data marts utilizing technologies like Presto, Snowflake, Hadoop, and other cloud or on-premise data platforms.
- Build optimized data models and schemas to enable fast and reliable query performance.
- Develop and manage strategies for data ingestion, cleansing, transformation, and aggregation from heterogeneous sources including structured, semi-structured, and unstructured data.
- Understand and apply graph query languages such as GQL, Gremlin, and Cypher for modeling and querying complex relationships (see the Cypher sketch after this list).
- Build scalable graph-based system architectures to support business use cases involving relationship and network analysis.
- Apply graph data technologies for business impact, including data management, infrastructure optimization, budgeting, trade-offs, and workflow/project management.
- Write efficient and maintainable code in Python, Scala, or Rust, especially for supercomputing environments handling large-scale datasets (TB+).
- Develop quantitative analytics and business operation dashboards using tools like Tableau, Apache Superset, or similar visualization platforms.
- Stay abreast of developments in natural language processing (NLP) and large language models (LLMs) and incorporate these capabilities where relevant.
- Evaluate cutting-edge academic methods and prototype their application in production systems.
- Use advanced problem-solving skills combined with deep understanding of statistics, probability, algorithms, and mathematics to deliver innovative data solutions.
- Collaborate effectively with data scientists, software engineers, product managers, and business stakeholders to deliver high-impact projects.
- Operate efficiently in a dynamic, fast-moving development environment with changing priorities, tight deadlines, and limited resources.
- Demonstrate strong self-motivation and ability to work independently as well as within cross-functional teams.
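To give a flavour of the pipeline work described above, here is a minimal PySpark sketch of a batch job: it ingests raw JSON events, cleanses and aggregates them, and writes a partitioned Parquet table. The bucket paths and column names (`event_type`, `ts`) are illustrative assumptions, not details from this role.

```python
# Hypothetical batch pipeline: ingest raw events, clean and
# aggregate them, and write partitioned Parquet for the warehouse.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("event_pipeline").getOrCreate()

# Ingest: read raw JSON events (schema inferred for brevity;
# assumes timestamps in a default-parseable format).
events = spark.read.json("s3://example-bucket/raw/events/")

# Cleanse and transform: drop malformed rows, derive a date column,
# and aggregate to daily counts per event type.
daily_counts = (
    events
    .dropna(subset=["event_type", "ts"])
    .withColumn("event_date", F.to_date(F.col("ts")))
    .groupBy("event_date", "event_type")
    .count()
)

# Load: write date-partitioned Parquet for downstream queries.
daily_counts.write.mode("overwrite").partitionBy("event_date") \
    .parquet("s3://example-bucket/curated/daily_event_counts/")
```

In practice a job like this would be scheduled and parameterised by an orchestrator such as Airflow rather than run ad hoc.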
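For the graph responsibilities, the sketch below issues a Cypher query through the official neo4j Python driver to find accounts within two hops of a flagged account, the kind of relationship and network analysis the role describes. The connection URI, credentials, node label `Account`, relationship type `TRANSACTED_WITH`, and properties are all hypothetical.

```python
# Hypothetical two-hop network query against a graph database.
from neo4j import GraphDatabase

driver = GraphDatabase.driver(
    "bolt://localhost:7687", auth=("neo4j", "password")
)

# Find accounts connected to any flagged account within 1-2 hops.
query = """
MATCH (a:Account {flagged: true})-[:TRANSACTED_WITH*1..2]-(b:Account)
RETURN DISTINCT b.id AS account_id
"""

with driver.session() as session:
    for record in session.run(query):
        print(record["account_id"])

driver.close()
```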
Required Qualifications & Skills:
- Bachelor's or Master's degree in Computer Science, Engineering, Data Science, or a related discipline.
- Extensive experience in building and managing big data pipelines using Apache Spark, Airflow, Flink, Kafka, or comparable frameworks.
- Strong expertise in data warehousing solutions and data modeling on platforms such as Snowflake, Presto, and Hadoop.
- Proficiency with graph databases and query languages (GQL, Gremlin, Cypher) is highly desirable.
- Advanced programming skills in Python, Scala, or Rust, with a focus on performance and scalability.
- Familiarity with data visualization tools such as Tableau, Superset, or equivalents.
- Experience with data mining, relational and NoSQL databases, and data automation practices.
- Understanding of natural language processing (NLP) and working with large language models is a plus.
- Strong foundation in probability, statistics, algorithms, and mathematical modeling.
- Excellent analytical, problem-solving, and communication skills.
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1522334