Posted on: 08/10/2025
Job Description:
Function: Software Engineering / Backend Development
Skills: Spark, Spark Streaming
Cognite is revolutionising industrial data management through our flagship product, Cognite Data Fusion - a state-of-the-art SaaS platform that transforms how industrial companies leverage their data. We're seeking a Senior Data Platform Engineer who excels at building high-performance distributed systems and thrives in a fast-paced startup environment. You'll be working on cutting-edge data infrastructure challenges that directly impact how Fortune 500 industrial companies manage their most critical operational data.
Responsibilities:
High-Performance Data Systems:
- Design and implement robust data processing pipelines using Apache Spark, Flink, and Kafka for terabyte-scale industrial datasets.
- Build efficient APIs and services that serve thousands of concurrent users with sub-second response times.
- Optimise data storage and retrieval patterns for time-series, sensor, and operational data.
- Implement advanced caching strategies using Redis and in-memory data structures.
Distributed Processing Excellence:
- Engineer Spark applications with a deep understanding of the Catalyst optimiser, partitioning strategies, and performance tuning.
- Develop real-time streaming solutions processing millions of events per second with Kafka and Flink.
- Design efficient data lake architectures using S3/GCS with optimised partitioning and file formats (Parquet, ORC).
- Implement query optimisation techniques for OLAP datastores like ClickHouse, Pinot, or Druid.
Scalability and Performance:
- Scale systems to 10K+ QPS while maintaining high availability and data consistency.
- Optimise JVM performance through garbage collection tuning and memory management.
- Implement comprehensive monitoring using Prometheus, Grafana, and distributed tracing.
- Design fault-tolerant architectures with proper circuit breakers and retry mechanisms.
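The retry mechanisms mentioned in the bullet above could be sketched roughly as follows. This is a minimal, hypothetical illustration of exponential backoff (all names invented for this sketch), not code from any actual Cognite system:

```java
import java.util.concurrent.Callable;

public class RetryDemo {
    // Hypothetical helper: retries `task` up to `maxAttempts` times,
    // doubling the delay between attempts (exponential backoff).
    static <T> T withRetry(Callable<T> task, int maxAttempts, long baseDelayMs)
            throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Delay grows as baseDelayMs * 2^(attempt-1).
                    Thread.sleep(baseDelayMs * (1L << (attempt - 1)));
                }
            }
        }
        throw last; // all attempts exhausted
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // A flaky task that fails twice, then succeeds on the third call.
        String result = withRetry(() -> {
            if (++calls[0] < 3) throw new RuntimeException("transient failure");
            return "ok";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

A production circuit breaker would additionally track recent failure rates and short-circuit calls entirely while the downstream dependency is unhealthy; libraries such as Resilience4j provide both patterns.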
Technical Innovation:
- Contribute to open-source projects in the big data ecosystem (Spark, Kafka, Airflow).
- Research and prototype new technologies for industrial data challenges.
- Collaborate with product teams to translate complex requirements into scalable technical solutions.
- Participate in architectural reviews and technical design discussions.
Requirements:
Distributed Systems Experience (4-6 years):
- Production Spark experience - built and optimised large-scale Spark applications with an understanding of their internals.
- Streaming systems proficiency - implemented real-time data processing using Kafka, Flink, or Spark Streaming.
- JVM Language expertise - strong programming skills in Java, Scala, or Kotlin with performance optimisation experience.
Data Platform Foundations (3+ years):
- Big data storage systems - hands-on experience with data lakes, columnar formats, and table formats (Iceberg, Delta Lake).
- OLAP query engines - worked with Presto/Trino, ClickHouse, Pinot, or similar high-performance analytical databases.
- ETL/ELT pipeline development - built robust data transformation pipelines using tools like dbt, Airflow, or custom frameworks.
Infrastructure and Operations:
- Kubernetes production experience - deployed and operated containerised applications in production environments.
- Cloud platform proficiency - hands-on experience with AWS, Azure, or GCP data services.
- Monitoring and observability - implemented comprehensive logging, metrics, and alerting for data systems.
Technical Depth Indicators:
Performance Engineering:
- System optimisation experience - delivered measurable performance improvements (2x+ throughput gains).
- Resource efficiency - optimised systems for cost while maintaining performance requirements.
- Concurrency expertise - designed thread-safe, high-concurrency data processing systems.
Data Engineering Best Practices:
- Data quality frameworks - implemented validation, testing, and monitoring for data pipelines.
- Schema evolution - managed backwards-compatible schema changes in production systems.
- Data modelling expertise - designed efficient schemas for analytical workloads.
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1557850