Required Skills
- 8+ years of hands-on experience with AI/ML or Generative AI systems
- Strong understanding of AI evaluation techniques, including hallucination detection, factual accuracy, bias, and output consistency
- Knowledge of Responsible AI principles, including fairness, transparency, and explainability
- Python (must-have), plus experience with REST APIs and microservices
Key Responsibilities
- Design, build, and operate a secure, scalable, and cost-efficient enterprise Generative AI platform on AWS, supporting production-grade LLM applications in regulated environments.
- Own the full GenAI platform lifecycle, including architecture, deployment, operations, monitoring, incident management, and continuous reliability and performance improvements.
- Implement and run Amazon Bedrock-based solutions on AWS, enabling LLM inference, RAG, Agents, and Guardrails with high availability, fault tolerance, and SLA compliance.
- Establish strong operational and governance frameworks, covering observability, alerting, RCA, security controls, access management, compliance, and cost optimization.
- Bring deep expertise in cloud ML platforms and financial services, with strong Python skills, knowledge of AWS services, hands-on GenAI experience, and a background in production support, reliability engineering, and AI governance.