Posted on: 22/04/2026
About Gradera : Digital Twin & Physical AI Platform
At Gradera, we are building a next-generation Digital Twin and Physical AI platform that enables enterprises to model, simulate, and optimize complex real-world systems. Our work brings together strategy, architecture, data, simulation, and experience design to power decision-making across large-scale operational environments such as manufacturing, logistics, and supply chain networks.
This platform-led initiative applies AI-native execution, advanced simulation, and governed orchestration to help organizations test scenarios, predict outcomes, and continuously improve performance. We operate with an enterprise-first mindset, prioritizing reliability, transparency, and measurable business impact as we build intelligent systems that scale beyond a single industry or use case.
Role : Data Engineer
Overview :
We are seeking skilled Data Engineers to join our Data & Digital Twin Foundation team. You will design, build, and maintain data pipelines that power digital twin platforms, real-time operational systems, and AI/ML workloads.
Working closely with data architects, simulation engineers, and ML teams, you will transform raw operational data into high-quality, governed datasets that drive intelligent decision-making.
Our core data platform stack includes :
Data Platform & Lakehouse :
- Databricks as the single source of truth for all data
- Real-time data pipelines built on Kafka for ingestion into the lakehouse
- Databricks SQL for analytical queries
- Unity Catalog for metadata management and governance
- Teradata for data warehousing and business intelligence
Stream & Event Processing :
- Apache Kafka for real-time event ingestion
- Structured Streaming for continuous data processing (see the sketch after this list)
- Delta Live Tables for declarative, quality-enforced pipelines
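As a rough illustration of how these streaming pieces fit together, here is a minimal PySpark sketch of a Kafka-to-Delta ingestion job. The broker address, topic, payload schema, and table names are hypothetical placeholders, not details of our platform.

```python
# Minimal sketch: Kafka -> Structured Streaming -> Delta (names are placeholders).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import (StructType, StructField, StringType,
                               DoubleType, TimestampType)

spark = SparkSession.builder.appName("telemetry-ingest").getOrCreate()

# Hypothetical payload schema for an operational event.
event_schema = StructType([
    StructField("asset_id", StringType()),
    StructField("metric", StringType()),
    StructField("value", DoubleType()),
    StructField("event_ts", TimestampType()),
])

# Subscribe to a Kafka topic; broker and topic are illustrative.
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")
       .option("subscribe", "plant.telemetry")
       .option("startingOffsets", "latest")
       .load())

# Kafka delivers raw bytes; parse the JSON value into typed columns.
events = (raw.selectExpr("CAST(value AS STRING) AS json")
          .select(F.from_json("json", event_schema).alias("e"))
          .select("e.*"))

# Continuously append to a Delta table; the checkpoint plus Delta's
# transactional commits are what make restarts safe.
query = (events.writeStream
         .format("delta")
         .option("checkpointLocation", "/chk/plant_telemetry")
         .outputMode("append")
         .toTable("bronze.plant_telemetry"))
```

The same pattern scales out by partitioning the Kafka topic, since Spark parallelizes reads per topic partition.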
Data Quality :
- Delta Live Tables expectations for data validation (see the sketch after this list)
- Data profiling and anomaly detection
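To make the expectations bullet concrete, here is a minimal Delta Live Tables sketch; it assumes a hypothetical bronze table and runs only inside a Databricks DLT pipeline. All table, column, and constraint names are illustrative.

```python
# Minimal DLT sketch with expectations (names are placeholders; the dlt
# module is only available inside a Databricks Delta Live Tables pipeline).
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Telemetry events that passed validation")
@dlt.expect_or_drop("valid_asset", "asset_id IS NOT NULL")
@dlt.expect_or_drop("valid_timestamp", "event_ts IS NOT NULL")
@dlt.expect("plausible_value", "value BETWEEN -1e6 AND 1e6")  # log only, keep row
def silver_telemetry():
    return (dlt.read_stream("bronze_plant_telemetry")
            .withColumn("ingested_at", F.current_timestamp()))
```

Here expect_or_drop removes rows that fail the constraint and records the violation count, while plain expect keeps the rows but surfaces the metric, which is the hook for the profiling and anomaly detection mentioned above.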
Key Responsibilities :
- Design, develop, and maintain scalable data pipelines using Databricks, PySpark, and Delta Lake (see the sketch after this list)
- Build real-time and batch ingestion pipelines from diverse operational systems on high-throughput Kafka streams
- Implement data transformations that serve digital twin platforms and operational analytics
- Integrate Kafka event streams with Databricks for real-time operational state updates
- Implement data quality checks using Delta Live Tables expectations
- Ensure data governance compliance through Unity Catalog (lineage, access control, metadata)
- Optimize pipeline performance, reliability, and cost efficiency
- Write clean, well-documented, and testable code following engineering best practices
- Collaborate with ML engineers to deliver feature-engineered datasets
- Participate in code reviews, knowledge sharing, and continuous improvement initiatives
- Support production data systems through monitoring, troubleshooting, and incident resolution
- Build business data warehouse solutions using Teradata for business intelligence
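As one concrete illustration of the pipeline work listed above, here is a minimal batch PySpark sketch that derives per-asset daily features for ML consumers and registers them as a Unity Catalog table. The catalog, schema, table, and column names are hypothetical.

```python
# Minimal sketch: batch feature build registered in Unity Catalog
# (three-level names). All names below are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("feature-build").getOrCreate()

telemetry = spark.read.table("ops.silver.plant_telemetry")

# Simple time-series features per asset and day: mean, max, and reading count.
features = (telemetry
            .withColumn("event_date", F.to_date("event_ts"))
            .groupBy("asset_id", "event_date")
            .agg(F.avg("value").alias("value_avg"),
                 F.max("value").alias("value_max"),
                 F.count("*").alias("n_readings")))

# Publish as a governed Delta table; when run on Databricks, Unity Catalog
# can record lineage and enforce access control on the result.
(features.write
 .format("delta")
 .mode("overwrite")
 .saveAsTable("ops.gold.asset_daily_features"))
```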
Preferred Qualifications :
- 4+ years of hands-on data engineering experience
- Track record of building and maintaining production-grade data pipelines
- Experience with Delta Live Tables for declarative pipeline development
- Experience working in agile, cross-functional teams
- Familiarity with time-series data patterns and operational data modeling
Highly Desirable :
- Experience building data pipelines for digital twin or simulation platforms
- Familiarity with operational state modeling for real-time systems
- Exposure to physics-informed or time-series ML feature engineering
- Experience working with distributed, multidisciplinary teams
- Exposure to industrial domains such as Manufacturing, Logistics, or Transportation is a plus
Location: Hyderabad, Telangana
Department: Engineering
Employment Type: Full-Time
Functional Area: Data Engineering
Job Code: 1630380