Germany Remote (Global)

Mindrift is hiring a Freelance AI Evaluation Engineer (Python/Full-Stack)

About the Role

You will help shape how AI coding systems are evaluated by creating meaningful programming challenges grounded in real-world development scenarios. Your work will directly influence how AI performance is measured, focusing on reasoning, implementation accuracy, and handling of complex requirements.

What You'll Do

  • Develop and refine coding tasks based on realistic production codebases, ensuring they reflect authentic development challenges
  • Write detailed functional tests that assess full behavior, including edge cases and integration points
  • Design problems that are fair but demanding, requiring AI to synthesize information across files and external sources
  • Examine AI-generated solutions to identify patterns in success and failure
  • Improve tasks based on structured feedback from expert reviewers using defined quality benchmarks

What We're Looking For

  • Computer Science or related degree
  • At least 5 years of hands-on software development with strong Python experience (pytest, async/await, subprocess, file operations)
  • Full-stack background with practical work in both React front-ends and backend systems
  • Proven experience writing tests, not just running them
  • Familiarity with Docker for running isolated evaluations
  • Working knowledge of CI/CD, particularly GitHub Actions (triggers, labels, interpreting results)
  • Functional English (B2 level or higher)

Work Environment

This is a freelance, project-based position open to candidates worldwide. You’ll have full control over your schedule as long as deadlines are met. Projects vary in scope and complexity, and compensation adjusts accordingly, with an equivalent base rate of $50 per hour.

Technology Stack

Python, pytest, async/await, subprocess, file operations, React, Docker, GitHub Actions, CI/CD

Required Skills

Python, pytest, async/await, subprocess, file operations, React, Docker, GitHub Actions, CI/CD, Full-Stack Development, Backend Systems, Functional Testing, Integration Testing, Test Case Design
About company
Mindrift
Mindrift connects specialists with project-based AI opportunities for leading tech companies, focused on testing, evaluating, and improving AI systems.
Job Details
Category fullstack
Posted a month ago