Lead the assessment and definition of model quality benchmarks
  Manage end-to-end evaluation infrastructure, including validation datasets and performance indicators
  Develop and maintain authoritative 'golden sets' for each product to ensure consistent quality standards
  Drive initiatives to enhance model accuracy, precision, and recall across all offerings
  Define and monitor AI performance KPIs and customer-facing SLAs during scaling phases
Design and implement automated quality assurance workflows
  Identify manual QA and data processing tasks and replace them with automated solutions
  Minimize reliance on human review by improving model confidence and consistency
Construct scalable pipelines for evaluating model outputs
  Develop standardized procedures to rapidly assess pilot programs and active customer implementations
  Maintain high accuracy levels despite increasing customer volume and product complexity
Integrate real-world user feedback into model improvement cycles
  Collect and examine instances of model failure in production environments
  Rank accuracy issues by impact and collaborate with engineering teams to implement fixes

Shape the evolution of AI Copilots in a rapidly expanding and transformative market
Exercise significant autonomy and influence as a foundational team member
Collaborate with proactive, high-performing colleagues and large enterprise clients

Remote — SF, NYC

NavigateAI is hiring an AI Product Manager