Posted on: 17/09/2025
What You'll Own:
- Build and extend backend services that power AI-driven media search and metadata enrichment
- Develop, integrate, and deploy AI/ML inference pipelines (embeddings, vision/audio models, transcription, background removal, etc.)
- Fine-tune and optimize computer vision and generative models (e.g., U2Net, BiRefNet, CLIP, Whisper, YOLO, diffusion models)
- Work with large datasets (100k to 5M images): preprocessing, augmenting, and structuring for training/inference
- Contribute to building pipelines for tasks like background removal, inpainting/outpainting, banner generation, logo/face detection, and multimodal embeddings
- Integrate with vector databases (e.g., FAISS, Pinecone, Weaviate, Qdrant) for similarity and semantic search
- Collaborate with the engineering team to deploy scalable AI inference endpoints (Docker + GPU/EC2/SageMaker)
Skills & Experience We Expect:
- Core Python (Required): Solid programming and debugging skills in production systems
- AI/ML Libraries: Hands-on experience with PyTorch and/or TensorFlow, NumPy, OpenCV, Hugging Face Transformers
- Model Training/Fine-Tuning: Experience fine-tuning pre-trained models for vision, audio, or multimodal tasks
- Data Handling: Preprocessing and augmenting image/video datasets for training and evaluation
- Multimodal Workflows: Comfortable chaining or orchestrating multimodal inference workflows (e.g., image + audio + OCR unified embedding)
Bonus Points If You:
- Have worked with generative models (diffusion, inpainting, or outpainting)
- Understand large-scale media workflows (video, design files, time-coded metadata)
- Enjoy experimenting with new models and pushing them into production
- Care about making AI useful in real-world creative pipelines
- Are familiar with vector search tools (FAISS, Pinecone, or similar) for embeddings-based search