
Deep Learning Engineer - Machine Learning Models

Wise Craft Consulting Pvt. Ltd
Hyderabad
8 - 15 Years

Posted on: 05/11/2025

Job Description

Key Responsibilities :

- Model Porting & Deployment : Port and deploy complex deep learning models from various frameworks (e.g., PyTorch, TensorFlow) to proprietary or commercial ML accelerator hardware platforms (e.g., TPUs, NPUs, GPUs).

- Performance Optimization : Analyze and optimize the performance of ML models for target hardware, focusing on latency, throughput, and power consumption.

- Quantization : Lead the efforts in model quantization (e.g., INT8) to reduce model size and accelerate inference while preserving model accuracy.

- Profiling & Debugging : Utilize profiling tools to identify performance bottlenecks and debug issues in the ML inference pipeline on the accelerator.

- Collaboration : Work closely with the hardware and software teams to understand model requirements and hardware capabilities, providing feedback to improve both.

- Tooling & Automation : Develop and maintain tools and scripts to automate the model porting, quantization, and performance testing workflows.

- Research & Innovation : Stay current with the latest trends and research in ML hardware, model compression, and optimization techniques.


Required Qualifications :


- Experience : 8 to 10 years of professional experience in software engineering, with a focus on model deployment and optimization.

- Technical Skills :

- Deep expertise in deep learning frameworks such as PyTorch and TensorFlow.

- Proven experience in optimizing models for inference on NPUs, TPUs, or other specialized accelerators.

- Extensive hands-on experience with model quantization (e.g., Post-Training Quantization, Quantization-Aware Training).

- Strong proficiency in C++ and Python, with experience writing high-performance, low-level code.

- Experience with GPU programming in CUDA and with GPU libraries such as cuDNN.

- Familiarity with ML inference engines and runtimes (e.g., TensorRT, OpenVINO, TensorFlow Lite).

- Strong understanding of computer architecture principles, including memory hierarchies, SIMD/vectorization, and cache optimization.

- Version Control : Proficient with Git and collaborative development workflows.

- Education : Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field.


Preferred Qualifications :


- Experience with hardware-aware model design and co-design.

- Knowledge of compiler technologies for deep learning.

- Contributions to open-source ML optimization projects.

- Experience with real-time or embedded systems.

- Knowledge of cloud platforms (AWS, GCP, Azure) and MLOps best practices.

- Familiarity with CI/CD pipelines and automated testing for ML models.

- Domain knowledge in areas like computer vision, natural language processing, or speech recognition.

