Company Description: Camlin is a global technology leader that operates with the vision of bringing revolutionary products to life for a wide range of industries, including power and rail, and also has interests in a number of R&D projects in a variety of scientific sectors.
At Camlin we believe in high quality engineering and design, allowing us to develop market leading products and services.
In short, we love creating value for our customers by solving difficult problems.
As of today, the Camlin operation spans over 20 countries across the globe.
Job Overview: We are seeking a dedicated and experienced Site Reliability Engineer (SRE) to join our dynamic team.
The SRE will be responsible for ensuring the reliability, performance, and availability of our critical systems and services.
This role requires a blend of software engineering and operations skills to build and run large-scale, distributed, fault-tolerant systems.
Key Responsibilities: System Reliability and Performance: Design, implement, and maintain scalable and reliable infrastructure.
Monitor system performance, detect issues, and ensure maximum uptime.
Develop and implement strategies for disaster recovery and data backup.
Automation and Tooling: Automate repetitive tasks to improve efficiency and reduce human error.
Build and maintain tools for deployment, monitoring, and operations.
Create and maintain CI/CD pipelines to streamline application delivery.
Incident Management: Respond to and resolve incidents, minimizing impact on customers.
Conduct post-incident reviews to identify root causes and prevent recurrence.
Develop and maintain incident response protocols and playbooks.
Collaboration and Communication: Work closely with development teams to integrate reliability into the software development lifecycle.
Communicate effectively with stakeholders about system status and health.
Provide guidance and mentorship to junior team members.
Security and Compliance: Ensure systems comply with security standards and best practices.
Implement and maintain security measures, including patch management and vulnerability assessments.
Assist in audits and compliance initiatives as required.
Essential Criteria: Bachelor's degree in Computer Science, Engineering, or a related field.
4+ years of hands-on experience in a Site Reliability Engineering or DevOps role.
Strong experience in maintaining cloud platforms (e.g., AWS, Azure).
Proficiency in programming and scripting languages (e.g., Python, Go, Bash).
Experience with infrastructure automation and container orchestration tools (e.g., Docker, Kubernetes, Terraform, Ansible, Helm).
Familiarity with continuous integration and deployment tools (e.g., GitLab CI, Argo Workflows, Argo CD).
Experience in managing distributed systems such as Kafka.
Experience with monitoring/logging solutions (e.g., Datadog, ELK, Prometheus).
Good understanding of concepts related to computer architecture, data structures and programming practices.
Solid understanding of networking, databases, and security principles.
Excellent problem-solving skills and attention to detail.
Strong debugging / troubleshooting skills.
Strong communication and collaboration skills.
Success at Camlin demands demonstrable cultural traits such as being a fast learner, adaptable to a changing landscape and, most importantly, a strong believer in being hands-on.
Desirable: AWS Certifications.
Kubernetes Certification.
Linux Certifications - RHCE.
Open Source Contributions.
Data Platform Operations.
Our Values: We work together - We know that working collaboratively will help us reach our shared goals faster, so we always look for ways to help each other.
We believe in people - Here at Camlin, our people are central to what we do and what we can achieve.
And as we move towards becoming industry and customer 'partners', that's even more important.
We trust our team members to do their best and be supportive.
We won't accept the 'way it's always been done' - Since Camlin's inception, we've been curious and inquisitive, and we always want to improve.
Thinking differently is in our DNA and we love solving tough challenges.
We listen to learn - Whether it's our customers, our markets, or each other, we ask questions and listen to the answers so we can learn and improve.
We're trying to do the right thing - We take responsibility for our actions and take decisions based on what's right for people, profit, and planet.
Benefits: Competitive salary.
Company Pension & Life Assurance Schemes.
On-site parking.
Hybrid Working.
Subsidised Gym Membership.
Wellness programmes.
Equal Employment Opportunity Statement: Individuals seeking employment at Camlin are considered without regard to race, colour, religion, national origin, age, sex, marital status, ancestry, physical or mental disability, gender identity, or sexual orientation.