Platform Reliability Engineer

London, Greater London

Ncounter is supporting a highly sophisticated, technology-driven trading environment in its search for a Platform Reliability Engineer to help operate, engineer, and continuously improve a large-scale distributed production platform used by researchers and software engineers. This role sits at the intersection of software engineering, infrastructure engineering, and production operations, with a strong focus on reliability, automation, observability, and operational excellence across mission-critical systems. You will work closely with developers and infrastructure teams to maintain resilient services, diagnose complex production issues, and build tooling and automation that reduce operational toil while improving platform stability and performance.

Key Responsibilities
• Improve reliability and resilience of production platform services
• Build automation and internal tooling to streamline operational workflows
• Design observability across metrics, logging, tracing, and alerting
• Diagnose complex production issues and improve system performance
• Contribute to operational runbooks, incident reviews, and reliability standards

Experience Required
• Background in SRE, Production Engineering, or platform operations supporting large-scale systems
• Strong Linux troubleshooting experience across distributed or containerised environments
• Programming capability in Python, with Git-based workflows and CI/CD pipelines
• Hands-on experience with observability platforms and monitoring systems
• Experience operating high-availability infrastructure and improving system resilience

Exposure to technologies such as Kubernetes, Prometheus, Grafana, ELK, Kafka, PostgreSQL, Redis, Terraform, or Ansible would be beneficial.

If you enjoy solving complex reliability challenges and building the tooling that keeps large-scale platforms operating smoothly, we would welcome a conversation.

Job Info

Job Title: Platform Reliability Engineer
Company: CV-Library
Location: London, Greater London
Salary: £150,000 - £160,000 per annum plus Bonus & Package
Posted: Mar 4th 2026
Closes: Apr 4th 2026
Sector: Accounting, Financial & Insurance
Contract: Permanent
Hours: Full Time
