Vivid Edge Corp - Senior Machine Learning Engineer - Python

Posted on: 23/07/2025

Job Description

We are seeking a highly skilled and experienced Senior Python & ML Engineer with a strong background in PySpark, machine learning, and large language models (LLMs).

The ideal candidate will be instrumental in designing, developing, and deploying scalable data pipelines, machine learning models, and LLM-powered applications.

This role requires a deep understanding of Python's ecosystem, distributed computing with PySpark, and practical experience in building and optimizing AI solutions.


Responsibilities :


Data Engineering & ETL :


- Design, develop, and maintain robust and scalable data pipelines using PySpark for data ingestion, transformation, and loading (ETL) from various sources.

- Optimize PySpark jobs for performance, efficiency, and cost-effectiveness on large datasets.

- Implement data quality checks and ensure data integrity throughout the pipeline.


Machine Learning Development :


- Develop, train, and deploy machine learning models (e.g., classification, regression, clustering) using Python's key ML libraries (scikit-learn, TensorFlow, PyTorch).

- Perform feature engineering, model selection, hyperparameter tuning, and model evaluation.

- Integrate ML models into production systems, ensuring scalability and reliability.


Large Language Model (LLM) Integration & Development :


- Research, experiment with, and integrate state-of-the-art Large Language Models (LLMs) into applications.

- Develop and implement solutions leveraging LLMs for tasks such as natural language understanding, text generation, summarization, and question answering.

- Fine-tune and adapt pre-trained LLMs for specific business needs and datasets.

- Explore and implement techniques for prompt engineering, retrieval-augmented generation (RAG), and LLM evaluation.


Technical Leadership & Collaboration :


- Collaborate closely with data scientists, other engineers, and product managers to understand requirements and translate them into technical solutions.

- Mentor junior team members and contribute to best practices for code quality, testing, and deployment.

- Participate in code reviews, design discussions, and architectural decisions.

- Stay up-to-date with the latest advancements in Python, PySpark, ML, and LLMs.


Deployment & Operations :


- Work with MLOps principles to ensure seamless deployment, monitoring, and maintenance of models and applications in production environments.

- Troubleshoot and resolve issues related to data pipelines, ML models, and LLM applications.


Required Skills & Qualifications :


Education : Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or a related quantitative field.


Python Expertise :


- Strong proficiency in Python programming, including object-oriented programming, data structures, and algorithms.

- In-depth knowledge of Python's scientific computing stack (NumPy, Pandas).

- Experience with testing frameworks (e.g., pytest) and version control (Git).


PySpark Proficiency :


- Extensive hands-on experience with PySpark for big data processing and analytics.

- Solid understanding of Spark architecture, RDDs, DataFrames, and Spark SQL.

- Experience with optimizing Spark jobs for performance and resource utilization.


Machine Learning :


- Proven experience in building and deploying machine learning models in a production environment.

- Proficiency with key ML libraries such as scikit-learn, TensorFlow, and/or PyTorch.

- Understanding of various machine learning algorithms, their strengths, and limitations.


Large Language Models (LLMs) :


- Practical experience working with and integrating LLMs (e.g., OpenAI GPT series, Llama, Hugging Face models).

- Familiarity with LLM frameworks (e.g., Hugging Face Transformers, LangChain, LlamaIndex).

- Understanding of concepts like embeddings, tokenization, prompt engineering, and fine-tuning.


Cloud Platforms (Preferred) :


- Experience with cloud platforms like AWS (S3, EMR, SageMaker), Azure (Databricks, Azure ML), or GCP (Dataproc, AI Platform).


Other Key Skills :


- Strong problem-solving and analytical skills.

- Excellent communication and teamwork abilities.

- Ability to work independently and as part of a collaborative team.


Nice-to-Have Skills :


- Familiarity with R and Shiny: Understanding of the R programming language and experience with developing interactive web applications using Shiny.

- Experience with streaming data technologies (e.g., Kafka, Spark Streaming).

- Familiarity with containerization technologies (Docker, Kubernetes).

- Knowledge of MLOps tools and practices (e.g., MLflow, Kubeflow).

- Experience with graph databases or other NoSQL databases.

- Contributions to open-source projects.

