We are seeking a highly skilled and experienced Senior Python & ML Engineer with a strong background in PySpark, machine learning, and large language models (LLMs).

The ideal candidate will be instrumental in designing, developing, and deploying scalable data pipelines, machine learning models, and LLM-powered applications.

This role requires a deep understanding of Python's ecosystem, distributed computing with PySpark, and practical experience in building and optimizing AI solutions.

Responsibilities :

Data Engineering & ETL :

- Design, develop, and maintain robust and scalable data pipelines using PySpark for data ingestion, transformation, and loading (ETL) from various sources.

- Optimize PySpark jobs for performance, efficiency, and cost-effectiveness on large datasets.

- Implement data quality checks and ensure data integrity throughout the pipeline.

Machine Learning Development :

- Develop, train, and deploy machine learning models (e.g., classification, regression, clustering) using Python's key ML libraries (scikit-learn, TensorFlow, PyTorch).

- Perform feature engineering, model selection, hyperparameter tuning, and model evaluation.

- Integrate ML models into production systems, ensuring scalability and reliability.

Large Language Model (LLM) Integration & Development :

- Research, experiment with, and integrate state-of-the-art Large Language Models (LLMs) into applications.

- Develop and implement solutions leveraging LLMs for tasks such as natural language understanding, text generation, summarization, and question answering.

- Fine-tune and adapt pre-trained LLMs for specific business needs and datasets.

- Explore and implement techniques for prompt engineering, RAG (Retrieval Augmented Generation), and LLM evaluation.

Technical Leadership & Collaboration :

- Collaborate closely with data scientists, other engineers, and product managers to understand requirements and translate them into technical solutions.

- Mentor junior team members and contribute to best practices for code quality, testing, and deployment.

- Participate in code reviews, design discussions, and architectural decisions.

- Stay up-to-date with the latest advancements in Python, PySpark, ML, and LLMs.

Deployment & Operations :

- Work with MLOps principles to ensure seamless deployment, monitoring, and maintenance of models and applications in production environments.

- Troubleshoot and resolve issues related to data pipelines, ML models, and LLM applications.

Required Skills & Qualifications :

Education : Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or a related quantitative field.

Python Expertise :

- Strong proficiency in Python programming, including object-oriented programming, data structures, and algorithms.

- In-depth knowledge of Python's scientific computing stack (NumPy, Pandas).

- Experience with testing frameworks (e.g., pytest) and version control (Git).

PySpark Proficiency :

- Extensive hands-on experience with PySpark for big data processing and analytics.

- Solid understanding of Spark architecture, RDDs, DataFrames, and Spark SQL.

- Experience with optimizing Spark jobs for performance and resource utilization.

Machine Learning :

- Proven experience in building and deploying machine learning models in a production environment.

- Proficiency with key ML libraries such as scikit-learn, TensorFlow, and/or PyTorch.

- Understanding of various machine learning algorithms, their strengths, and limitations.

Large Language Models (LLMs) :

- Practical experience working with and integrating LLMs (e.g., OpenAI GPT series, Llama, Hugging Face models).

- Familiarity with LLM frameworks (e.g., Hugging Face Transformers, LangChain, LlamaIndex).

- Understanding of concepts like embeddings, tokenization, prompt engineering, and fine-tuning.

Cloud Platforms (Preferred) :

- Experience with cloud platforms like AWS (S3, EMR, SageMaker), Azure (Databricks, Azure ML), or GCP (Dataproc, AI Platform).

Other Key Skills :

- Strong problem-solving and analytical skills.

- Excellent communication and teamwork abilities.

- Ability to work independently and as part of a collaborative team.

Nice-to-Have Skills :

- Familiarity with R and Shiny: understanding of the R programming language and experience developing interactive web applications using Shiny.

- Experience with streaming data technologies (e.g., Kafka, Spark Streaming).

- Familiarity with containerization technologies (Docker, Kubernetes).

- Knowledge of MLOps tools and practices (e.g., MLflow, Kubeflow).

- Experience with graph databases or other NoSQL databases.

- Contributions to open-source projects.

Posted by

Vijay Ananad

Resourcing specialist at Vivid Edge Corp

Last Active: 5 Dec 2025

Posted in

AI/ML

Functional Area

ML / DL Engineering

Job Code

1518186
