Details of the offer

Hayward Hawk is working with a long-standing client of ours to find an SRE to join their expanding team.
Responsibilities:
Collaborate with the Network Automation Team to develop and deploy infrastructure for a new automation platform.
Implement Site Reliability Engineering (SRE) principles, focusing on measurement (SLI/SLO/SLA), reduction of manual tasks, and reliability modeling.
Define and monitor metrics to drive data-based decisions aimed at improving availability, reliability, and operational speed.
Create, maintain, and refine SLO and SLI baselines for network, system, and application performance.
Support go/no-go decision-making processes, validation/verification, and review of current and upcoming products/services.
Conduct proactive data analysis and testing to ensure optimal performance of production applications and services.
Troubleshoot and resolve business-impacting issues in collaboration with internal stakeholders.
Manage escalations, incident response, and root cause analysis, and conduct blameless postmortems.
Participate in a 24x7 on-call rotation.
Qualifications:
Minimum of 3 years of experience with cloud/web/CDN scale infrastructure.
Proficiency in Python and Go; experience with C/C++ is advantageous.
In-depth knowledge of Linux systems, network programming, and protocols such as TCP, UDP, DNS, TLS/SSL, and HTTP.
Familiarity with BGP and Anycast routing is a plus.
Hands-on experience with DevOps practices and tools, including Infrastructure as Code (Ansible/SaltStack), CI/CD (GitLab, Jenkins, Git), and monitoring/visualization tools (Prometheus, Grafana).
Exposure to data technologies such as NoSQL and relational databases, Redis, Elasticsearch, and Kafka.
Experience with containerization and container orchestration (Docker, Kubernetes).
Skilled in building and analyzing data telemetry, modeling, pipelines, and UI visualization.
Proven experience in software development, troubleshooting, and monitoring of large-scale distributed systems.
Adherence to software engineering best practices, standards, and the software development life cycle.
Knowledgeable about Agile software development methodologies.
Strong collaboration, communication, and documentation skills, with a demonstrated ability to work across functional teams.
Bachelor's or Master's degree in computer science, engineering, or a related technical field, or equivalent experience.
For more information, please contact Alice Armstrong at Hayward Hawk.
Skills: Site Reliability Engineering, Python, CI/CD, Linux


Nominal Salary: To be agreed

Source: Talent_Dynamic-Ppc

