Engineering Lead - Workflow Orchestration

Recruit Square
Multiple Locations
8 - 12 Years

Posted on: 24/11/2025

Job Description

We are seeking a seasoned Engineering Lead with deep expertise in workflow orchestration systems, stateful execution engines, and distributed task runtimes. You will architect the next generation of our DAG-based runtime, develop the infrastructure for agentic workflow composition, and drive execution excellence across engineering. This role combines hands-on systems design with technical leadership and team mentorship.

Key Responsibilities :

Architect & Build the Swarm Runtime :

- Design and implement a DAG-based orchestration engine using Temporal, Argo Workflows, or equivalent event-driven runtimes.

- Build a scalable primitive registry for tasks, operators, guards, and computational nodes.

- Architect a robust scheduler capable of handling event triggers, retries, backoffs, and distributed coordination.

Develop the Workflow Composition Layer :

- Define and build a YAML/JSON-based DSL for describing agentic workflows, dependencies, and execution semantics.

- Create a schema-driven rules engine that integrates validations, model calls, parallelism, conditional branching, and approval gates into every workflow.

Orchestration Logic & Runtime Intelligence :

- Implement orchestration logic that coordinates :

1. Validation layers

2. Model calls (LLMs, embedding engines, external APIs)


3. Human-in-the-loop approval gates

4. Stateful transitions and checkpointing

- Ensure deterministic execution, traceability, and safe rollback mechanisms.

Workflow Certification & Automated Testing :

- Define certification standards for every workflow type, including :

1. Functional correctness

2. Latency and concurrency thresholds

3. Error-handling expectations

4. Observability and trace coverage

- Build automated regression test suites validating workflow integrity before deployment.

Engineering Leadership & Delivery :

- Lead, mentor, and grow a team of backend and systems engineers.

- Drive sprint planning, reviews, engineering discipline, and roadmap execution.

- Own runtime delivery deadlines, cross-team coordination, and release quality.

Observability, Reliability & Production Excellence :

- Instrument the runtime with observability hooks : metrics, tracing, structured logs, and execution heatmaps.

- Build robust retry logic, distributed locks, idempotency guards, and failover strategies.

- Improve runtime stability, throughput, and scale characteristics.

Requirements :

Must-Have :


- 8+ years of backend engineering experience building high-scale systems.

- 3+ years leading teams focused on workflows, automation, orchestration, or distributed runtimes.

- Deep understanding of :

1. Stateful orchestration engines (Temporal, Step Functions, Argo, Airflow)

2. Message queues, pub/sub systems, and event-driven patterns

3. Retry logic, compensating transactions, and idempotent operations

4. Distributed tracing, observability pipelines, and health checks

- Strong background in concurrent programming, async task management, and execution models.

- Hands-on experience with at least one systems language or backend stack (Python, Go, Rust, Node.js).

Nice-to-Have :


- Experience building workflow DSLs or schema-driven interpreters.

- Familiarity with LLM pipelines, agentic runtimes, or AI-driven workflow automation.

- Knowledge of Kubernetes-based runtime environments and workflow controllers.

- Experience with pluggable architecture design, sandboxing, or execution policies.

What This Role Offers :


- Ownership of the core execution engine powering Perceive Now's intelligent automation platform.

- A high-impact leadership position shaping architectural strategy and engineering culture.

- The opportunity to solve cutting-edge problems at the intersection of orchestration, distributed systems, and AI.

