Remote (Global)

MagicSchool AI is hiring an Associate LLM Quality Analyst

About the Role

The role involves assessing the performance of large language models in educational contexts by identifying issues, categorizing errors, and providing structured feedback to improve model behavior and reliability.

Responsibilities

  • Evaluate outputs from language models for factual correctness and coherence
  • Identify harmful, biased, or inappropriate content in AI-generated text
  • Classify types of model errors including hallucinations and logical flaws
  • Follow detailed guidelines to score model responses consistently
  • Provide clear, actionable feedback to improve model training
  • Test model behavior across diverse educational prompts and scenarios
  • Document patterns in model failures for engineering review
  • Collaborate with researchers to refine evaluation criteria
  • Maintain high accuracy and attention to detail in assessments
  • Adapt quickly to updated instructions and testing protocols
  • Contribute to the development of new evaluation frameworks
  • Ensure alignment of model outputs with pedagogical goals
  • Report edge cases that reveal model limitations
  • Participate in calibration sessions with team members
  • Track and log evaluation results in shared systems
  • Support quality assurance across multiple AI features
  • Help prioritize issues based on severity and frequency
  • Review model updates for improvements or regressions
  • Maintain confidentiality of internal testing data
  • Engage in ongoing training to stay current with AI developments
  • Communicate findings clearly and concisely
  • Work independently while meeting deadlines
  • Contribute to a culture of continuous improvement
  • Follow ethical guidelines in all evaluations
  • Assist in creating realistic educational prompts for testing

Nice to Have

  • Master’s degree in education or related field
  • Experience working with large language models
  • Background in special education or diverse learning needs
  • Familiarity with K–12 curriculum frameworks
  • Prior work in AI ethics or content safety
  • Experience with annotation or labeling tasks
  • Knowledge of prompt engineering techniques
  • Exposure to educational technology products
  • Research experience in cognitive science or learning theory
  • Multilingual abilities

Compensation

$60,000–$80,000 annually, commensurate with experience

Work Arrangement

Remote with flexible hours; some real-time collaboration required

Team

Small, agile team focused on AI-driven educational tools

What You’ll Be Doing

  • Review and score AI-generated responses to classroom-related prompts
  • Flag content that violates safety or accuracy standards
  • Participate in weekly team discussions to align on evaluation standards

Why This Role Matters

  • Your work directly improves the reliability of AI tools used by educators and students
  • You help ensure AI outputs are safe, factual, and appropriate for learning environments

Required Skills

  • Test Case Design
  • Data Analysis
  • SQL
  • Python
  • Statistical Analysis
  • A/B Testing
  • API Testing
  • Quality Assurance
  • Documentation
  • Critical Thinking
  • Communication

About the Company

MagicSchool AI
MagicSchool is the premier generative AI platform for teachers. More than 6 million teachers from all over the world have joined our platform.

Job Details

Category: Other
Posted: 7 months ago