Posted on: 21/09/2025
We're looking for a seasoned Senior Software Development Engineer (SDE III), Data Engineering, with deep expertise in building large-scale data ingestion, processing, and analytics platforms.
In this role, you'll collaborate closely with Product, Design, and cross-functional stakeholders to design and implement high-throughput data pipelines and lakehouse solutions that power analytics, AI, and real-time decision-making for predicting and preventing cyber breaches.
What You'll Do:
- Technical Leadership: Lead and mentor developers, fostering a culture of innovation, quality, and continuous improvement.
- Design and Develop: Architect and implement high-scale data pipelines leveraging Apache Spark, Flink, and Airflow to process streaming and batch data efficiently.
- Data Lakehouse and Storage Optimization: Build and maintain data lakes and ingestion frameworks using Snowflake, Apache Iceberg, and Parquet, ensuring scalability, cost efficiency, and optimal query performance.
- Data Modeling and System Design: Design robust, maintainable data models to handle structured and semi-structured datasets for analytical and operational use cases.
- Real-time and Batch Processing: Develop low-latency pipelines using Kafka and Spark Structured Streaming, supporting billions of events per day.
- Workflow Orchestration: Automate and orchestrate end-to-end ELT processes with Airflow, ensuring reliability, observability, and recovery from failures.
- Cloud Infrastructure: Build scalable, secure, and cost-effective data solutions leveraging AWS native services (S3, Lambda, ECS, etc.).
- Mentorship: Lead by example, guide junior engineers, conduct reviews, and drive adoption of best practices for data engineering excellence.
- Monitoring and Optimization: Implement strong observability, data quality checks, and performance tuning to maintain high data reliability and pipeline efficiency.
What We're Looking For:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 6+ years of experience in data engineering with a proven track record of designing large-scale, distributed data systems.
- Strong expertise in Snowflake and other distributed analytical data stores.
- Hands-on experience with Apache Spark, Flink, Airflow, and modern data lakehouse formats (Iceberg, Parquet).
- Deep understanding of data modeling, schema design, query optimization, and partitioning strategies at scale.
- Proficiency in Python, SQL, Scala, and Go/Node.js, with strong debugging and performance-tuning skills.
- Experience in streaming architectures, CDC pipelines, and data observability frameworks.
- Ability to mentor engineers, review designs, and lead technical discussions.
- Familiarity with AI coding assistants such as Cursor, Claude Code, or GitHub Copilot.
Preferred Qualifications:
- Exposure to CI/CD pipelines, automated testing, and infrastructure-as-code for data workflows.
- Familiarity with streaming platforms (Kafka, Kinesis, Pulsar) and real-time analytics engines (Druid, Pinot, Rockset).
- Understanding of data governance, lineage tracking, and compliance requirements in a multi-tenant SaaS platform.
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1549693