
Cognite - Senior Data Platform Engineer

Cognite
6 - 10 Years
Bangalore

Posted on: 13/02/2026

Job Description

Cognite is revolutionizing industrial data management through our flagship product, Cognite Data Fusion, a state-of-the-art SaaS platform that transforms how industrial companies leverage their data.

We're seeking a Senior Data Platform Engineer who excels at building high-performance distributed systems and thrives in a fast-paced startup environment. You'll work on cutting-edge data infrastructure challenges that directly impact how Fortune 500 industrial companies manage their most critical operational data.

The core responsibilities of this role include:

High-Performance Data Systems:

- Design and implement robust data processing pipelines using Apache Spark, Flink, and Kafka for terabyte-scale industrial datasets.
- Build efficient APIs and services that serve thousands of concurrent users with sub-second response times.
- Optimize data storage and retrieval patterns for time-series, sensor, and operational data.
- Implement advanced caching strategies using Redis and in-memory data structures (see the sketch below).
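
As a rough illustration of the read-through caching pattern named in the last bullet, here is a minimal Scala sketch using the Jedis Redis client. The Redis host, the TTL, and the loadFromStore loader are hypothetical placeholders, not details from this posting.

    import redis.clients.jedis.Jedis

    // Minimal read-through cache: check Redis first, fall back to the
    // backing store on a miss, then populate the cache with a TTL.
    object ReadThroughCache {
      private val jedis = new Jedis("localhost", 6379) // assumed local Redis
      private val ttlSeconds = 300                     // assumed expiry

      // Placeholder standing in for a real database or service call.
      private def loadFromStore(key: String): String = s"value-for-$key"

      def get(key: String): String = {
        val cached = jedis.get(key)
        if (cached != null) cached // cache hit
        else {
          val fresh = loadFromStore(key)      // cache miss: load from source
          jedis.setex(key, ttlSeconds, fresh) // write back with expiry
          fresh
        }
      }
    }

A production version would typically add connection pooling (JedisPool), value serialization, and protection against cache stampedes.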

Distributed Processing Excellence:

- Engineer Spark applications with a deep understanding of the Catalyst optimizer, partitioning strategies, and performance tuning.
- Develop real-time streaming solutions processing millions of events per second with Kafka and Flink.
- Design efficient data lake architectures using S3/GCS with optimized partitioning and file formats (Parquet, ORC); see the sketch below.
- Implement query optimization techniques for OLAP datastores such as ClickHouse, Pinot, or Druid.
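
To give the partitioning and file-format bullet some concrete shape, here is a minimal Spark sketch in Scala that writes date- and site-partitioned Parquet; the bucket URIs and column names are illustrative assumptions only.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.col

    object SensorLakeWriter {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("sensor-lake-writer")
          .getOrCreate()

        // Hypothetical raw sensor readings with event_date and site_id columns.
        val readings = spark.read.parquet("s3a://example-bucket/raw/readings/")

        readings
          .repartition(col("event_date"))       // co-locate rows for each partition value
          .write
          .partitionBy("event_date", "site_id") // directory-level partition pruning
          .mode("overwrite")
          .parquet("s3a://example-bucket/curated/readings/")

        spark.stop()
      }
    }

Repartitioning on the partition column before the write keeps each output directory to a small number of well-sized files rather than one fragment per task.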

Scalability and Performance:

- Scale systems to 10K+ QPS while maintaining high availability and data consistency.
- Optimize JVM performance through garbage collection tuning and memory management.
- Implement comprehensive monitoring using Prometheus, Grafana, and distributed tracing.
- Design fault-tolerant architectures with proper circuit breakers and retry mechanisms (see the sketch below).
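
The last bullet is easiest to picture in code. Below is a dependency-free Scala sketch of retries with exponential backoff behind a simple failure-count circuit breaker; the thresholds are arbitrary assumptions, and a production system would normally reach for a library such as resilience4j instead.

    import scala.util.{Failure, Success, Try}

    // Naive circuit breaker: opens after maxFailures consecutive failures
    // and rejects calls until resetMillis have elapsed.
    class CircuitBreaker(maxFailures: Int, resetMillis: Long) {
      private var failures = 0
      private var openedAt = 0L

      def run[A](op: => A): Try[A] = synchronized {
        if (failures >= maxFailures && System.currentTimeMillis() - openedAt < resetMillis)
          Failure(new IllegalStateException("circuit open"))
        else Try(op) match {
          case ok @ Success(_) => failures = 0; ok
          case err @ Failure(_) =>
            failures += 1
            if (failures >= maxFailures) openedAt = System.currentTimeMillis()
            err
        }
      }
    }

    object Retry {
      // Retry with exponential backoff: the delay doubles after every failure.
      def withBackoff[A](attempts: Int, delayMillis: Long)(op: => A): Try[A] =
        Try(op) match {
          case ok @ Success(_) => ok
          case Failure(_) if attempts > 1 =>
            Thread.sleep(delayMillis)
            withBackoff(attempts - 1, delayMillis * 2)(op)
          case err => err
        }
    }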

Technical Innovation:

- Contribute to open-source projects in the big data ecosystem (Spark, Kafka, Airflow).
- Research and prototype new technologies for industrial data challenges.
- Collaborate with product teams to translate complex requirements into scalable technical solutions.
- Participate in architectural reviews and technical design discussions.

Requirements:

- Distributed Systems Experience (2-6 years): production Spark experience, having built and optimized large-scale Spark applications with an understanding of their internals; streaming systems proficiency, having implemented real-time data processing using Kafka, Flink, or Spark Streaming; JVM language expertise, with strong programming skills in Java, Scala, or Kotlin and performance optimization experience.
- Data Platform Foundations (3+ years): big data storage systems, with hands-on experience in data lakes, columnar formats, and table formats (Iceberg, Delta Lake); OLAP query engines, having worked with Presto/Trino, ClickHouse, Pinot, or similar high-performance analytical databases; ETL/ELT pipeline development, having built robust data transformation pipelines using tools like dbt, Airflow, or custom frameworks.
- Infrastructure and Operations: Kubernetes production experience, having deployed and operated containerized applications in production environments; cloud platform proficiency, with hands-on experience in AWS, Azure, or GCP data services.
- Monitoring and Observability: implemented comprehensive logging, metrics, and alerting for data systems (see the sketch below).
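
As an example of the metrics side of this requirement, here is a minimal sketch using the Prometheus Java simpleclient from Scala; the metric name, label, and scrape port are assumptions for illustration.

    import io.prometheus.client.Counter
    import io.prometheus.client.exporter.HTTPServer

    object PipelineMetrics {
      // Counts records processed, labeled by pipeline stage.
      val recordsProcessed: Counter = Counter.build()
        .name("records_processed_total")
        .help("Records processed by the data pipeline")
        .labelNames("stage")
        .register()

      def main(args: Array[String]): Unit = {
        // Expose /metrics for Prometheus to scrape (port is an assumption).
        new HTTPServer(9095)
        recordsProcessed.labels("ingest").inc()
        recordsProcessed.labels("transform").inc(42.0)
      }
    }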

Technical Depth Indicators:

- Performance Engineering: system optimization experience; delivered measurable performance improvements (2x+ throughput gains).
- Resource efficiency: optimized systems for cost while maintaining performance requirements.
- Concurrency expertise: designed thread-safe, high-concurrency data processing systems.
- Data Engineering Best Practices: data quality frameworks; implemented validation, testing, and monitoring for data pipelines.
- Schema evolution: managed backward-compatible schema changes in production systems (see the sketch below).
- Data modeling expertise: designed efficient schemas for analytical workloads.
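
To show what "backward-compatible" means for the schema-evolution bullet above, here is a small Scala sketch using Apache Avro's compatibility checker; the Reading schemas are invented for the example. Adding a field with a default value is the canonical safe change: new readers can still decode records written under the old schema.

    import org.apache.avro.{Schema, SchemaCompatibility}

    object SchemaEvolutionCheck {
      // v1: the writer schema already in production.
      val v1: Schema = new Schema.Parser().parse(
        """{"type":"record","name":"Reading","fields":[
          |  {"name":"sensor_id","type":"string"},
          |  {"name":"value","type":"double"}
          |]}""".stripMargin)

      // v2: adds a field WITH a default, so records written as v1 still decode.
      val v2: Schema = new Schema.Parser().parse(
        """{"type":"record","name":"Reading","fields":[
          |  {"name":"sensor_id","type":"string"},
          |  {"name":"value","type":"double"},
          |  {"name":"unit","type":"string","default":"celsius"}
          |]}""".stripMargin)

      def main(args: Array[String]): Unit = {
        val result = SchemaCompatibility.checkReaderWriterCompatibility(v2, v1)
        println(result.getType) // COMPATIBLE: v2 readers can read v1 data
      }
    }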

Collaboration and Growth:

- Technical Collaboration: cross-functional partnership; worked effectively with product managers, ML engineers, and data scientists.
- Code review excellence: provided thoughtful technical feedback and maintained high code quality standards.
- Documentation and knowledge sharing: created technical documentation and participated in knowledge transfer.
- Continuous Learning: technology adoption; quickly learned and applied new technologies to solve business problems.
- Industry awareness: stayed current with big data ecosystem developments and best practices.
- Problem-solving approach: demonstrated a systematic approach to debugging complex distributed system issues.

Startup Mindset:

- Execution Excellence: rapid delivery; consistently shipped high-quality features within aggressive timelines.
- Technical pragmatism: made smart trade-offs between technical debt, velocity, and system reliability.
- End-to-end ownership: took responsibility for features from design through production deployment and monitoring.
- Ambiguity comfort: thrived in environments with evolving requirements and unclear specifications.
- Technology flexibility: adapted to new tools and frameworks based on project needs.
- Customer focus: understood how technical decisions impact user experience and business metrics.

Bonus Points:

- Open-source contributions to major Apache projects in the data space (e.g., Apache Spark or Kafka) are a big plus.
- Conference speaking or technical blog writing experience.
- Industrial domain knowledge: previous experience with IoT, manufacturing, or operational technology systems.

Primary Technologies (Technical Stack):

- Languages: Kotlin, Scala, Python, and Java.
- Big Data: Apache Spark, Apache Flink, Apache Kafka.
- Storage: PostgreSQL, ClickHouse, Elasticsearch, S3-compatible systems.
- Infrastructure: Kubernetes, Docker, Terraform.

Technologies You May Work With:

- Table Formats: Apache Iceberg, Delta Lake, Apache Hudi.
- Query Engines: Trino/Presto, Apache Pinot, DuckDB.
- Orchestration: Apache Airflow, Dagster.
- Monitoring: Prometheus, Grafana, Jaeger, and ELK Stack.

