This role focuses on advancing enterprise AI capabilities by reengineering outdated text processing systems into robust, Python-driven natural language processing (NLP) solutions. You will lead the full lifecycle transition from rule-based platforms to scalable, maintainable architectures that support long-term innovation.
Key Responsibilities
- Lead the complete migration of legacy text processing systems to modern, Python-based NLP frameworks, enhancing scalability and adaptability.
- Refactor existing linguistic rules and grammar logic into efficient, production-grade NLP pipelines using standard libraries and best practices.
- Design and maintain high-precision, context-sensitive extraction models to identify entities, attributes, events, and relationships critical to product development and analytics.
- Develop and apply domain-specific ontologies and taxonomies to ensure consistent interpretation of language and improve the explainability of results.
- Translate evolving business needs into accurate, testable extraction logic, ensuring alignment with canonical data models and controlled vocabularies.
- Collaborate across technical and non-technical teams to communicate NLP capabilities, constraints, and optimization strategies clearly.
Qualifications
- Bachelor’s or Master’s degree in Computer Science, Data Science, Engineering, or a related field, or equivalent professional experience.
- Minimum of 7 years of experience in data transformation and development with a focus on NLP, particularly within insurance or similar domains.
- At least 5 years of hands-on experience with Python and SQL in production environments.
- Proven track record building rule-based and hybrid (rules + machine learning) text extraction systems.
- Direct experience creating and applying ontologies to structure unstructured text and improve semantic accuracy.
- Deep understanding of linguistic feature engineering, dependency parsing, negation detection, and section-aware processing.
- Experience debugging and optimizing complex text pipelines and evaluating performance through precision, recall, and error analysis.
- Familiarity with data integration patterns and how extracted features feed into analytics, ML models, and enterprise applications.
- Ability to adapt quickly in environments where domain models and terminology evolve alongside product development.
