About the Job

Our client is a global quantitative and systematic hedge fund that applies software engineering, data engineering and financial engineering to crypto trading.
They are searching for a Site Reliability Engineer (SRE) to play a key role in ensuring the reliability, performance, and scalability of their high-frequency cryptocurrency trading systems.
This role is fully remote but based in London.
Day to day, the role focuses on system health, performance monitoring, issue resolution, and process automation.
The ideal candidate will have a passion for maintaining high-performance and mission-critical systems in a challenging 24/7 trading environment.
They will be detail-orientated and take a proactive approach to problem-solving.
Examples of projects you could be working on include performance optimization, reliability engineering, incident management and data integrity.
Responsibilities:
- Monitor and maintain trading system health
- Identify, triage, and resolve issues in real-time
- Develop automation scripts and tools
- Implement and maintain monitoring and alerting systems
- Conduct root cause analysis and implement preventive measures
- Optimize trading system performance through analysis and tuning
- Maintain documentation for operational procedures and system architecture
- Assist in planning and scaling trading infrastructure

Requirements:
- Bachelor's degree in Computer Science, Software Engineering, or related field
- Experience in financial trading systems or high-frequency environments
- Strong proficiency in Python for scripting and automation
- Linux systems administration
- Network protocols and troubleshooting techniques
- Familiar with SQL/NoSQL databases and real-time data processing
- Experience with monitoring tools (e.g., Prometheus, Grafana)