
Generative AI Engineer - Prompt Engineering

BigRio
Multiple Locations
10 - 14 Years

Posted on: 27/10/2025

Job Description

Job Title : Generative AI Engineer (LLM Expert - AWS Focus)

Location : Remote

Employment Type : Ongoing Contract

About BigRio

BigRio is a Boston-based, remote-first technology consulting firm specializing in advanced data, cloud, and software engineering solutions. We partner with forward-thinking organizations to deliver scalable, secure, and high-performance technologies, with deep expertise in AI/ML, data engineering, and AWS-native architectures.

Our clients span healthcare, life sciences, government, and enterprise sectors, and we're known for tackling complex, high-impact challenges with cutting-edge innovation and measurable results.

About the Role :

We're seeking a hands-on Generative AI Engineer (LLM Expert) who combines strong AWS development experience with deep expertise in applied LLM engineering, in roughly a 70/30 split of the work.

This role is ideal for an engineer who has built real-world applications using OpenAI APIs and retrieval-augmented generation (RAG) - not someone focused on traditional ML or model training. You'll work with BigRio's internal AI team and client partners to design, build, and optimize LLM-powered features, integrating them into cloud-native, production-ready systems.

This is a senior technical role, not a research or experimental position. The focus is on building, shipping, and scaling LLM applications using OpenAI models, LangChain, and AWS infrastructure.

Key Responsibilities :

- Design, develop, and deploy AWS-based applications (Lambda, API Gateway, ECS, RDS, S3, Secrets Manager) that integrate LLM-powered features.

- Implement OpenAI-driven workflows, leveraging reasoning and non-reasoning models, temperature settings, and model versioning best practices.

- Apply prompt engineering and prompt chaining techniques to improve LLM accuracy and performance for production workloads.

- Build retrieval-augmented generation (RAG) pipelines using LangChain, ChromaDB, or similar frameworks.

- Develop FastAPI or Flask-based backends that connect to OpenAI APIs and vector databases (a minimal sketch of this pattern follows this list).

- Build interactive front-ends and tools using Gradio or Streamlit for rapid prototyping and testing.

- Ensure secure, containerized deployments using Docker and integrate SSO and role-based access controls.

- Automate data pipelines and document workflows via Google Drive, AWS SDKs, or REST APIs.

- Write production-grade Python code, following clean architecture, documentation, and CI/CD best practices.

- Collaborate closely with AI engineers, DevOps teams, and clients to deliver enterprise-ready LLM applications.
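To make the scope of these responsibilities concrete, here is a minimal, illustrative sketch of the kind of RAG-backed service described above: a FastAPI endpoint that retrieves context from an in-memory ChromaDB collection and passes it to an OpenAI chat model. The model name, endpoint path, collection name, and seed documents are placeholders chosen for the example rather than project specifics, and the snippet assumes an OPENAI_API_KEY in the environment.

# Illustrative sketch only: a minimal FastAPI endpoint answering questions over
# a small in-memory ChromaDB collection. Model name, path, and documents are
# placeholders for demonstration, not details from this role.
import chromadb
from fastapi import FastAPI
from openai import OpenAI
from pydantic import BaseModel

app = FastAPI()
openai_client = OpenAI()      # reads OPENAI_API_KEY from the environment
chroma = chromadb.Client()    # in-memory store; use a persistent client in production
docs = chroma.get_or_create_collection(name="docs")

# Seed a few documents so the retrieval step has something to return.
docs.add(
    ids=["1", "2"],
    documents=[
        "BigRio is a Boston-based, remote-first technology consulting firm.",
        "The role combines AWS development with applied LLM engineering.",
    ],
)

class Question(BaseModel):
    text: str

@app.post("/ask")
def ask(question: Question) -> dict:
    # Retrieve the most relevant documents for the question (the "R" in RAG).
    hits = docs.query(query_texts=[question.text], n_results=2)
    context = "\n".join(hits["documents"][0])

    # Ask the model to answer using only the retrieved context (the "G" in RAG).
    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        temperature=0.2,
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question.text}"},
        ],
    )
    return {"answer": response.choices[0].message.content}

In a production setting, the same pattern would typically sit behind API Gateway or a Lambda handler, use a persistent vector store, and pull credentials from AWS Secrets Manager rather than environment variables, in line with the deployment responsibilities listed above.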

Required Qualifications :

- 8+ years of experience in professional software development, with a strong focus on AWS cloud and backend systems.

- 3+ years of direct experience working with OpenAI APIs, GPT models, and LLM application development.

- Proven ability to build and deploy LLM-powered applications, not just experiment with models.

- Expertise in Python, FastAPI, and API-driven architecture.

- Strong practical experience with LangChain, ChromaDB, RAG, and prompt engineering.

- Proficiency in Docker, AWS IAM, and secure deployment practices.

- Excellent communication skills - ability to explain LLM behavior, tradeoffs, and reasoning clearly to both technical and non-technical teams.

- Comfortable working independently in a fast-paced, client-facing environment across time zones.

Nice to Have :

- Experience with LangGraph or other LLM orchestration frameworks.

- Knowledge of vector databases like Pinecone or FAISS.

- Familiarity with MLOps, CI/CD pipelines, and observability for LLM workloads.

- Exposure to healthcare, biotech, or regulated data environments.

- Demonstrated experience explaining and documenting AI system design and decision-making for non-AI stakeholders.
