Toronto or Ajax

Bell Canada Enterprises is hiring a Senior AI DevOps Architect

About the Role

Bell Canada Enterprises is looking for a Senior AI DevOps Architect to serve as the architect and strategist for our AI/ML developer experience. You will define the vision, design the frameworks, and ensure the long-term success of our AI development lifecycle through innovative DevOps practices.

What You'll Do

  • Proactively identify pain points in the developer journey and architect solutions to streamline workflows and enhance productivity.
  • Design and implement AI-optimized CI/CD pipelines that automate build, test, and deployment processes.
  • Integrate AI-powered tools to automate code reviews and identify errors, vulnerabilities, and style inconsistencies.
  • Implement AI-driven systems for continuous security monitoring, enabling proactive threat detection.
  • Evaluate and recommend new AI capabilities and tools that can enhance developer experience and operational efficiency.
  • Collaborate with the Platform team to establish organizational standards, security policies, and governance frameworks.
  • Develop and execute strategies to ensure widespread adoption of MLOps best practices across engineering teams.
  • Guide and mentor intermediate engineers and provide expert consultation on complex MLOps challenges.

What We're Looking For

  • Ability to define and articulate a long-term vision for AI/ML developer experience and architect robust, scalable solutions.
  • Deep understanding of the end-to-end machine learning lifecycle, including data management, model development, training, deployment, monitoring, and governance.
  • Proven ability to design, implement, and optimize sophisticated CI/CD pipelines specifically for AI/ML workloads.
  • Experience with major cloud platforms (AWS, GCP) and services relevant to AI/ML, including containerization (Docker, Kubernetes) and Infrastructure as Code (Terraform, Ansible).
  • Familiarity with a broad range of AI/ML frameworks, libraries, and platforms (e.g., TensorFlow, PyTorch, MLflow, Kubeflow, SageMaker, Vertex AI).
  • Expertise in integrating security best practices throughout the AI/ML lifecycle, including threat detection and vulnerability management.
  • Excellent ability to diagnose complex technical challenges, identify root causes, and develop innovative solutions.
  • Demonstrated capability to lead technical initiatives, guide junior engineers, and provide expert consultation.
  • Strong interpersonal and communication skills, with the ability to collaborate with engineering teams, platform teams, and stakeholders.
  • A proactive approach to staying abreast of the rapidly evolving landscape of AI, ML, DevOps, and cloud technologies.

Nice to Have

  • Bachelor's or Master's degree in Computer Science, Engineering, Data Science, or a related technical field.
  • 7-10 years of experience in DevOps, Site Reliability Engineering, or Software Engineering roles.
  • 3-5 years of direct experience implementing and managing MLOps practices for AI/ML projects.
  • Proven track record of designing scalable CI/CD pipelines for complex applications, preferably including ML models.
  • Hands-on experience with container orchestration platforms like Kubernetes.
  • Experience with Infrastructure as Code tools (e.g., Terraform, CloudFormation).
  • Experience evaluating, selecting, and integrating new tools to improve developer workflows.
  • Experience in defining and enforcing technical standards, policies, and governance frameworks.
  • Experience mentoring engineers and leading technical discussions.
  • Strong programming skills, particularly in Python.
  • Familiarity with monitoring and logging solutions (e.g., Prometheus, Grafana, ELK Stack).
  • Knowledge of security best practices in cloud and DevOps environments.

Technical Stack

  • Cloud: AWS, GCP
  • Infrastructure & Orchestration: Docker, Kubernetes, Terraform, Ansible
  • AI/ML Frameworks & Platforms: TensorFlow, PyTorch, MLflow, Kubeflow, SageMaker, Vertex AI
  • Programming & Monitoring: Python, Prometheus, Grafana, ELK Stack

Team & Environment

This role is part of the Customer Experience team and involves close collaboration with the Platform team.

Bell Canada Enterprises is an equal opportunity employer.

Required Skills
AWS, GCP, Docker, Kubernetes, Terraform, Ansible, TensorFlow, PyTorch, MLflow, Kubeflow, CI/CD, MLOps, Infrastructure as Code, Cloud Architecture, Machine Learning

About the Company
Bell Canada Enterprises
Bell builds world-class networks, develops innovative services, and creates original multiplatform media content. The Bell Mobility team offers mobile devices, wireless services, and Internet of Things solutions to consumer and business customers.
Job Details
Department: Information Technology
Category: Infrastructure
Posted: 2 months ago