Responsibilities
- Train and fine-tune large language models using supervised fine-tuning techniques.
- Work with open-source model architectures such as LLaMA, Mistral, Qwen, and comparable variants.
- Develop LoRA and QLoRA pipelines to enable efficient model adaptation.
- Design and improve data preprocessing workflows, including tokenization and long-context sequence handling.
- Use and extend the Hugging Face Transformers and Datasets libraries for training and inference tasks.
- Process structured and semi-structured data, including parsing XML and XSD files.
- Implement document parsing for Office file formats using tools like python-docx and direct Open XML parsing.
- Design and deploy end-to-end Retrieval-Augmented Generation (RAG) systems for document-based question answering.
- Construct and manage vector databases and embedding pipelines using FAISS, Chroma, Weaviate, or pgvector.
- Enhance retrieval performance through hybrid search methods, re-ranking strategies, and domain-specific chunking.
- Develop and maintain MCP server integrations to allow LLMs to interact securely with tools, APIs, and external data.
- Create agentic workflows using MCP for structured, auditable access to internal systems and contextual data.
- Deploy and operate models in fully offline and air-gapped environments.
- Apply model quantization and optimization methods and formats such as GPTQ, AWQ, bitsandbytes, and GGUF.
- Build and support inference systems using vLLM, TGI, and Ollama frameworks.
- Optimize GPU utilization through CUDA, cuDNN, and VRAM-aware batching strategies.
- Maintain on-premises CI/CD pipelines for machine learning models without reliance on cloud services.
- Manage local model registries, version control, and artifact storage.
- Ensure RAG and MCP components function reliably in disconnected or network-restricted settings.
- Develop backend services in Python to support machine learning training and inference operations.
- Integrate with relational databases such as PostgreSQL and MySQL, as well as vector databases for RAG storage.
- Utilize Docker and Git to ensure consistent development and deployment workflows.
- Implement CI/CD pipelines using Azure DevOps, including configurations with self-hosted agents.
Work Arrangement
Remote (Worldwide)