Description :

- First-level escalation for 24x7 monitoring and support of Databricks clusters, jobs, workflows, repos, and data pipelines

- SME-level troubleshooting of cluster failures, auto-scaling issues, job failures (PySpark/Scala/Spark SQL/Delta Live Tables), and workspace availability issues

- Work with application development teams to remediate pipeline failures

- Participate in resolution of Sev1/Sev2 incidents and perform root cause analysis

- Implement workspace governance, RBAC, cluster policies, and data security best practices

- Build custom dashboards for job performance, analytics, and cluster utilization

- Maintain SOPs, runbooks, and architecture diagrams

- Escalate recurring issues to L3/platform engineering

- Debug complex Spark issues, including OOM errors and long GC cycles

Required Skills & Experience :

- 6-10 years of experience in Big Data/Cloud Data Platform Support

- SME-level knowledge of the Databricks platform: Spark clusters, jobs, repos, MLflow, and warehouses

- Strong experience in UNIX, SQL, and shell scripting

- Experience with Spark UI debugging

- Hands-on experience with CI/CD pipelines (Azure DevOps preferred)

- Strong expertise in Apache Spark and Azure Cloud

- Educational Qualification : B.E/B.Tech/MCA

Work Details :

- 24x7 rotational support across Morning, Afternoon, and Night shifts

- Location : Navi Mumbai (Ghansoli)

- Work Mode : Work from Office

- Immediate Joiners Only

- Interview Process : 2-3 rounds (including Client Round)