hirist

Python Developer - Data Engineering

Jobs Capital
Bangalore
4 - 8 Years

Posted on: 12/08/2025

Job Description

The core responsibilities for this role include the following:

Data Engineering:

- Build and maintain scalable data preparation pipelines using Python and relevant tools.

- Optimize data workflows to handle large datasets efficiently.

- Work with structured, semi-structured, and unstructured data sources to preprocess and transform data for LLM consumption.
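A data preparation pipeline of the kind described above might look like the following minimal pandas sketch. The column names and the derived `document` field are illustrative assumptions, not part of the posting:

```python
import pandas as pd

def preprocess_for_llm(df: pd.DataFrame, text_cols: list[str]) -> pd.DataFrame:
    """Normalize raw records into a single cleaned text field for LLM consumption."""
    out = df.copy()
    out = out.dropna(subset=text_cols)                 # drop incomplete rows
    for col in text_cols:
        out[col] = out[col].astype(str).str.strip()    # trim stray whitespace
    # Flatten the selected columns into one document per row.
    out["document"] = out[text_cols].agg(" | ".join, axis=1)
    return out[["document"]]
```

In a real pipeline this step would typically run as an Airflow task or a PySpark job rather than in-process pandas, but the transformation logic is the same.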

LLM Prompt Development:

- Develop, test, and refine prompts to optimize LLM outputs for specific use cases.

- Collaborate with data scientists to fine-tune LLMs or utilize pre-trained models effectively.

- Research and implement best practices for prompt engineering to achieve high-quality results.
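One common prompt-engineering practice the bullets above allude to is keeping prompts as deterministic, versioned templates rather than ad-hoc strings. A minimal sketch (the template text and field names are hypothetical):

```python
PROMPT_TEMPLATE = """You are a data-quality assistant.

Task: {task}

Input record:
{record}

Respond with valid JSON containing the keys "label" and "reason"."""

def build_prompt(task: str, record: dict) -> str:
    """Render a deterministic prompt; few-shot examples could be appended here."""
    # Sort keys so the same record always yields the same prompt (easier to test/cache).
    lines = "\n".join(f"- {k}: {v}" for k, v in sorted(record.items()))
    return PROMPT_TEMPLATE.format(task=task, record=lines)
```

Keeping rendering pure like this makes prompts unit-testable and diffable, which helps when iterating on outputs with data scientists.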

API Hosting:

- Design and deploy RESTful APIs to host LLM-based solutions.

- Ensure APIs are secure, performant, and scalable to handle high traffic.

- Integrate APIs with monitoring and logging solutions to track performance and issues.

Collaboration and Documentation:

- Collaborate with cross-functional teams to gather requirements and deliver solutions.

- Write comprehensive technical documentation, including API specs and data pipeline designs.

Requirements:

- Strong Python programming skills with experience in building production-grade applications.

- Experience with data engineering tools such as Pandas, PySpark, Airflow, or similar.

- Hands-on experience with LLMs (e.g., OpenAI, Hugging Face) and prompt development.

- Proficiency in designing and deploying APIs using frameworks like FastAPI or Flask.

- Familiarity with cloud platforms (AWS, Azure, or GCP) and containerization tools like Docker.

- Knowledge of SQL and experience with relational databases like PostgreSQL, MySQL, etc.

- Basic understanding of API security (e.g., OAuth, JWT) and scalability strategies.
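To make the JWT requirement concrete, here is a standard-library sketch of HS256 signing and verification, the mechanism behind most JWT-protected APIs. The secret and payload are made-up examples:

```python
import base64
import hashlib
import hmac
import json

def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode().rstrip("=")

def _b64url_decode(part: str) -> bytes:
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))

def sign_hs256(payload: dict, secret: bytes) -> str:
    """Build an HS256-signed JWT: base64url(header).base64url(payload).base64url(sig)."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url(json.dumps(payload).encode())
    sig = _b64url(hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"

def verify_hs256(token: str, secret: bytes) -> dict:
    """Check the HMAC signature and return the payload; raise ValueError if forged."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("invalid signature")
    return json.loads(_b64url_decode(payload_b64))
```

In practice a maintained library such as PyJWT would be used instead, which also handles expiry claims and algorithm pinning.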

Nice to Have:

- Experience with vector databases like Pinecone, Weaviate, or FAISS for LLM embeddings.

- Familiarity with MLOps tools like MLflow or Kubeflow.

- Exposure to streaming technologies such as Kafka or Spark Streaming.

- Knowledge of LLM fine-tuning techniques using frameworks like PyTorch or TensorFlow.

- Experience with monitoring tools such as Prometheus, Grafana, or Datadog.

- Candidates from product-based SaaS or AI companies, with experience in enterprise software, cloud, AI-driven platforms, and B2B products, are preferred.
