We are seeking an experienced Staff AI Engineer to join our AI and Data Platform team, where you'll play a pivotal role in building and scaling our next-generation AI workforce platform.

You'll work on cutting-edge agent-based systems that are transforming supply chain operations for Fortune 500 companies, delivering real business value through intelligent automation.

Key Responsibilities :

Technical Leadership :

- Design and implement production-scale AI agent systems and orchestration frameworks (LangGraph, LangChain, similar architectures)

- Lead architecture for multi-agent systems handling complex business workflows

- Optimize deployment strategies using both large language models (LLMs) and small language models (SLMs), chosen according to use case requirements

- Build natural language-configurable business process automation frameworks

- Implement multi-modal AI systems for document understanding (tables, charts, layouts)

AI/ML Implementation & Optimization :

- Deploy and optimize LLMs/SLMs in production with fine-tuning techniques (LoRA, QLoRA, DPO)

- Implement quantization strategies (INT8, INT4) and model distillation for edge deployment

- Build evaluation frameworks including LLM-as-judge systems and regression testing

- Design streaming architectures for real-time LLM responses (SSE, WebSockets)

- Create semantic caching and embedding-based retrieval systems

- Develop GraphRAG and long-context handling strategies (100k+ tokens)

System Architecture & Engineering :

- Design scalable microservices with comprehensive observability (LangSmith, Arize, custom telemetry)

- Build secure multi-tenant systems with prompt injection prevention and output validation

- Implement cost optimization through intelligent model routing and fallback strategies

- Develop document processing pipelines with OCR and layout understanding

- Create event-driven architectures for real-time shipment tracking and exception handling

Data & Infrastructure :

- Build data pipelines for training data curation, synthetic generation, and PII masking

- Implement RLHF/RLAIF feedback loops for continuous improvement

- Design experiment tracking and model registry systems (MLflow, DVC)

- Optimize inference costs through batch processing and spot instance utilization

- Establish model governance, audit trails, and compliance frameworks

Required Qualifications :

Technical Skills :

- 8+ years in software engineering, with 3+ years in production AI/ML systems

- Expertise in Python, PyTorch/JAX, and AI frameworks (LangChain, Transformers, PEFT)

- Experience with LLMs (GPT-4, Claude, Gemini) and SLMs (Phi, Llama, Mistral)

Hands-on experience with :

- Fine-tuning techniques (LoRA, QLoRA, DPO, RLHF)

- Model optimization (quantization, distillation, pruning)

- Vector databases and RAG architectures

- Streaming systems and real-time processing

- Security measures (prompt injection prevention, jailbreak detection)

- Strong background in distributed systems, Kubernetes, and cloud platforms

Domain Knowledge (nice to have) :

- Experience with document intelligence and multi-modal AI systems

- Understanding of supply chain operations, EDI/API integrations

- Knowledge of token economics and consumption-based pricing models

- Familiarity with enterprise compliance requirements (GDPR, CCPA, SOC 2)

Professional Skills :

- Track record of delivering complex projects with measurable business impact

- Experience with technical sales support, POCs, and customer success

- Strong communication for technical and non-technical audiences

- Data-driven decision making for model selection and cost optimization

Preferred Qualifications :

- Supply chain, logistics, or transportation management experience

- Experience with OCR pipelines and document extraction at scale

- Knowledge of GraphRAG and knowledge graph integration

- Contributions to open-source AI projects (Hugging Face, Ollama)

- Experience reducing inference costs by 50%+ through optimization

- Familiarity with MoE architectures and constitutional AI approaches

- Background in building usage-based billing and margin optimization

- Experience with specialized tools (vLLM, TGI, Triton, ONNX, TensorRT)

What You'll Work On :

- Building specialized AI agents solving supply chain problems

- Fine-tuning domain-specific models for supply chain terminology

- Implementing hybrid architectures combining cloud LLMs with edge SLMs

- Creating secure document intelligence systems for Fortune 500 clients

- Developing real-time exception handling for shipment tracking

- Building observability and evaluation frameworks for agent performance

- Designing fallback strategies and multi-provider redundancy

Technical Environment :

- Models : GPT-4, Claude, Gemini, Llama 3, Mistral, Phi-3, custom fine-tuned models

- Fine-tuning : LoRA/QLoRA, PEFT, DeepSpeed, bitsandbytes, Axolotl

- Infrastructure : Kubernetes, AWS SageMaker/Bedrock, GPU clusters, edge devices

- Frameworks : LangChain, LangGraph, vLLM, FastAPI, Transformers

- Observability : LangSmith, Weights & Biases, custom telemetry

- Data : PostgreSQL, Redis, Vector DBs, Kafka, feature stores

Impact & Growth :

You'll directly contribute to AI initiatives generating millions in revenue while shaping systems processing millions of transactions daily.

You'll lead technical decisions affecting 25+ engineers while mentoring the next generation of AI engineers.

You'll be at the forefront of production AI optimization, balancing performance, cost, and latency for enterprise customers.