DevOps Engineer - Machine Learning

Details of the offer

At CoMind, we are developing a non-invasive neuromonitoring technology that will result in a new era of clinical brain monitoring. In joining us, you will be helping to create cutting-edge technologies that will improve how we diagnose and treat brain disorders, ultimately improving and saving the lives of patients across the world.
The Role
CoMind is seeking a skilled DevOps Engineer to join our dynamic Research Data Science team and lead the orchestration of a robust ML training pipeline in AWS. This role is critical to enabling the scalable training and testing of a range of ML models on large volumes of an entirely new form of clinical neuromonitoring data.
Responsibilities:
Architect and implement a scalable solution to support the Research Data Science Team in running a large number of assorted machine learning pipelines, including model training, evaluation, and inference
Create a CI/CD pipeline for building containers from in-house Python packages, running integration tests, and publishing to AWS ECR
Set up ECS or AWS Batch Tasks to run containers stored in AWS ECR
Establish a robust configuration management system to store, version, and retrieve configurations associated with multiple machine learning workflows
Implement robust error handling and monitoring solutions to ensure timely debugging across the pipeline with centralised logging and error reporting
Implement cost monitoring solutions to track and manage compute costs across different runs, building dashboards to provide insights into resource usage and cost optimization
Ensure security and data protection are integrated into the pipelines by applying AWS best practices for security protocols and data management
Monitor and manage the team's compute resources, including both cloud (AWS) and on-premise GPU nodes, ensuring efficient use and scalability
Implement Infrastructure as Code (IaC) to set up and manage the pipeline architecture, using Terraform, AWS CloudFormation, or similar tools.
Skills & Experience:
Git or Bitbucket for version control, including experience with managing versioned infrastructure-as-code (IaC) repositories
CI/CD pipelines for automating workflows, including experience with integration testing and containerization pipelines
Experience managing and orchestrating complex cloud workflows (e.g., ECS Tasks, AWS Batch), with a focus on event-driven and parallel processing
Infrastructure as Code (IaC) experience (e.g., Terraform, AWS CloudFormation) for creating, maintaining, and scaling cloud infrastructure
Docker for containerization, including experience with containerizing machine learning workflows and publishing containers to repositories like AWS ECR.
Benefits:
Company equity plan
Company pension scheme
Private medical, dental and vision insurance
Group life assurance
Comprehensive mental health support and resources
Unlimited holiday allowance (+ bank holidays)
Hybrid working (3 days in-office)
Quarterly work-from-anywhere policy
Weekly lunches
Breakfast and snacks provided.

Nominal Salary: To be agreed

Source: Jobleads

