Posted on: 21/09/2025
What You'll Do:
- Be the Data Tech Leader: Mentor engineers, champion data engineering best practices, and raise the bar for technical excellence across the org.
- Architect at Scale: Design and lead petabyte-scale data ingestion, processing, and analytics platforms using Snowflake, Apache Spark, Iceberg, Parquet, and AWS-native services.
- Own the Data Flow: Build streaming and batch pipelines handling billions of events daily, orchestrated through Apache Airflow for reliability and fault tolerance (a minimal orchestration sketch follows this list).
- Set the Standards: Define frameworks for data modeling, schema evolution, partitioning strategies, and data quality/observability for analytics and AI workloads.
- Code Like a Pro: Stay hands-on, writing high-performance data processing jobs in Python, SQL, and Scala, and conducting deep-dive reviews when it matters most.
- Master the Lakehouse: Architect data lakes and warehouse solutions that balance cost, performance, and scalability, leveraging AWS S3 and Snowflake.
- Solve Complex Problems: Debug and optimize long-running jobs, data skew, and high-volume ETL bottlenecks with elegance and efficiency.
- Collaborate and Influence: Work with the Product, AI/ML, and Platform teams to ensure that data solutions directly power real-time cyber risk analytics.
- Innovate Constantly: Evaluate and introduce emerging data technologies (e.g., Flink, Druid, Rockset) to keep SAFE at the forefront of data engineering innovation.
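
For illustration, here is a minimal sketch of the kind of Airflow orchestration described above, assuming Airflow 2.4+ with the standard PythonOperator; the DAG id, task names, and callables are hypothetical placeholders, not part of any actual codebase.

```python
# A minimal daily batch-pipeline DAG sketch (assumes Airflow 2.4+).
# All names below are hypothetical placeholders.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_events(ds: str, **_) -> None:
    # Pull the day's raw events for the logical date `ds` (placeholder).
    print(f"extracting events for {ds}")


def transform_events(ds: str, **_) -> None:
    # Clean, deduplicate, and partition the extracted events (placeholder).
    print(f"transforming events for {ds}")


with DAG(
    dag_id="daily_event_pipeline",
    start_date=datetime(2025, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args={
        # Retries with a delay provide the fault tolerance the role calls for.
        "retries": 3,
        "retry_delay": timedelta(minutes=5),
    },
) as dag:
    extract = PythonOperator(task_id="extract_events", python_callable=extract_events)
    transform = PythonOperator(task_id="transform_events", python_callable=transform_events)

    extract >> transform  # transform runs only after extract succeeds
```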
What We're Looking For:
- 8+ years of experience in data engineering, with a proven track record of designing and scaling distributed data systems.
- Deep expertise in big data processing frameworks (Apache Spark, Flink) and workflow orchestration (Airflow).
- Strong hands-on experience with data warehousing (Snowflake) and data lakehouse architectures (Iceberg, Parquet).
- Proficiency in Python, SQL, Scala, and Go/Node.js, with the ability to optimize large-scale ETL/ELT workloads.
- Expertise in real-time data ingestion pipelines using Kafka or Kinesis, handling billions of events daily (see the sketch after this list).
- Experience operating in cloud-native environments (AWS) and leveraging services like S3, Lambda, ECS, Glue, and Athena.
- Strong understanding of data modeling, schema design, indexing, and query optimization for analytical workloads.
- Proven leadership in mentoring engineers, driving architectural decisions, and aligning data initiatives with product goals.
- Experience in streaming architectures, CDC pipelines, and data observability frameworks.
- Ability to navigate ambiguous problems, high-scale challenges, and lead teams toward innovative solutions.
- Proficient in deploying containerized applications (Docker, Kubernetes, ECS).
- Familiarity with AI coding assistants such as Cursor, Claude Code, or GitHub Copilot.
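
For illustration, a minimal sketch of the kind of Kafka ingestion loop mentioned above, assuming the kafka-python client; the topic, broker address, consumer group, and sink function are hypothetical placeholders.

```python
# A minimal streaming-ingestion consumer sketch (assumes kafka-python).
# Topic, broker, group, and sink are hypothetical placeholders.
import json

from kafka import KafkaConsumer


def write_to_lake(event: dict) -> None:
    # Placeholder sink: in practice this would buffer events and flush
    # batches to S3 as Parquet for downstream Snowflake/Iceberg use.
    print(event)


consumer = KafkaConsumer(
    "cyber-events",                      # hypothetical topic
    bootstrap_servers=["localhost:9092"],
    group_id="ingestion-workers",        # consumer group enables horizontal scaling
    enable_auto_commit=False,            # commit offsets only after a successful write
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

for message in consumer:
    write_to_lake(message.value)
    consumer.commit()  # at-least-once delivery: commit after the sink succeeds
```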
Preferred Qualifications:
- Exposure to CI/CD pipelines, automated testing, and infrastructure-as-code for data workflows.
- Familiarity with real-time analytics engines (Druid, Pinot, Rockset) or machine learning data pipelines.
- Contributions to open-source data projects or thought leadership in the data engineering community.
- Prior experience in cybersecurity, risk quantification, or other high-scale SaaS domains.
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1549707