Posted on: 17/03/2026
We are looking for a skilled Gen AI Platform Engineer to join our team. The ideal candidate will have experience in managing LLM-based systems, with expertise in infrastructure management, prompt versioning, fine-tuning, and deployment. This role requires a strong understanding of GenAI workloads, performance tuning, scalability, and governance in cloud environments such as AWS, Azure, and Google Cloud.
Key Responsibilities:
- Manage and oversee the infrastructure for LLM-based systems, ensuring seamless operation and scalability.
- Fine-tune, evaluate, and deploy LLMs and their prompts, leveraging industry-standard tools and platforms.
- Ensure the performance, scalability, and governance of GenAI workloads in cloud environments (AWS, Azure, Google Cloud).
- Build and deploy AI use cases and solutions using the respective platforms and tools.
- Collaborate with cross-functional teams to ensure effective deployment and performance optimization.
- Lead the evaluation and enhancement of LLM-based models through iterative testing and fine-tuning.
- Handle deployment pipelines, including CI/CD for LLM models.
- Contribute to setting up automated processes for model fine-tuning and versioning.
- Work on optimizing cloud-based infrastructure to support the growth of GenAI workloads.
Required Skills:
- Strong experience with cloud ML platforms such as AWS SageMaker, Google Vertex AI, or Azure AI.
- Proficiency in operating LLM systems, including prompt engineering, fine-tuning, and versioning.
- Hands-on experience with infrastructure management, model deployment, and optimization.
- Strong understanding of cloud architecture, performance, and scalability for GenAI workloads.
- Proficiency in Python, SQL, and Bash scripting.
- Experience with machine learning frameworks such as Hugging Face, TensorFlow, and PyTorch.
- Familiarity with CI/CD pipelines, Docker, Kubernetes, and MLOps workflows.
- Strong analytical skills and ability to troubleshoot complex infrastructure issues.
Nice-to-Have Skills:
- Familiarity with NLP frameworks and libraries such as Hugging Face, TensorFlow, and PyTorch.
- Experience working with large-scale data processing frameworks such as Apache Spark and Hadoop.
- Knowledge of model explainability and interpretability techniques for LLMs.
- Familiarity with containerization technologies (e.g., Docker, Kubernetes) for model deployment and orchestration.
- Hands-on experience with MLOps pipelines.
Tools & Technical Skills:
- Platforms: AWS SageMaker, Google Vertex AI, Azure AI.
- Tools: Docker, Kubernetes, Terraform, Jenkins (CI/CD), MLflow.
- Languages: Python, SQL, Bash scripting.
- Frameworks: Hugging Face, TensorFlow, PyTorch, Keras.
- Databases: MySQL, PostgreSQL, NoSQL (MongoDB, Cassandra).
- Other: Git, GitHub, Jenkins, CloudFormation.