Remote Position in AI Performance Analysis
Role Overview
We are seeking a data-driven analyst to conduct comprehensive failure analysis of AI agent performance on finance-sector tasks. You will identify patterns, root causes, and systemic issues in our evaluation framework by analyzing task performance across dimensions such as task type, file type, and evaluation criteria.
Key Responsibilities
• Statistical Failure Analysis: Identify patterns in AI agent failures across task components (prompts, rubrics, templates, file types, tags)
• Root Cause Analysis: Determine whether failures stem from task design, rubric clarity, file complexity, or agent limitations
• Dimension Analysis: Analyze performance variations across finance sub-domains, file types, and task categories
• Reporting & Visualization: Create dashboards and reports highlighting failure clusters, edge cases, and improvement opportunities
• Quality Framework: Recommend improvements to task design, rubric structure, and evaluation criteria based on statistical findings
• Stakeholder Communication: Present insights to data labeling experts and technical teams
Required Qualifications
• Statistical Expertise: Strong foundation in statistical analysis, hypothesis testing, and pattern recognition
• Programming: Proficiency in Python (pandas, scipy, matplotlib/seaborn) or R for data analysis
• Data Analysis: Experience with exploratory data analysis and creating actionable insights from complex datasets
• AI/ML Familiarity: Understanding of LLM evaluation methods and quality metrics
• Tools: Comfortable working with Excel, data visualization tools (Tableau/Looker), and SQL
Preferred Qualifications
• Experience with AI/ML model evaluation or quality assurance
• Background in finance or willingness to learn finance domain concepts
• Experience with multi-dimensional failure analysis
• Familiarity with benchmark datasets and evaluation frameworks
• 2-4 years of relevant experience
We consider all qualified applicants without regard to legally protected characteristics and provide reasonable accommodations upon request.
Remote (Global)
