Site Reliability Engineer - Digital SaaS Platform, .NET, Cloud, CI/CD, Infrastructure, Scalability

Site Reliability Engineer - Digital SaaS Platform, .NET, Cloud, CI/CD, Infrastructure, Scalability - London/Hybrid - Perm - 65k plus benefits

My client, a global e-commerce company, is seeking to recruit an experienced SRE/Web Operations Engineer to join their team. This is an exciting time to join: the company is expanding, so there is opportunity to progress.
In this role you will be responsible for the reliability, scalability and performance of the company's digital platform and infrastructure. You will lead a small team of engineers and help manage the company's external Azure Managed Service Providers (MSPs).
Reporting to the Head of Engineering, you will be responsible for improving and optimising the company's Azure platform. The role covers incident and problem management, disaster recovery (DR), release management, managing observability tools (Datadog), and improving the developer experience and tooling. It would be ideal for a .NET developer who wants to get more involved in cloud/DevOps.

Duties include:

SRE Strategy and Vision: Define and implement the overall strategy for SRE to align with organisational goals, balancing reliability, scalability and development velocity.
Service Uptime: Ensure systems and services meet agreed service level agreements (SLAs) and service level objectives (SLOs) for uptime and performance.
Incident Management: Lead efforts to establish effective incident response protocols, including detection, triage, resolution, and post-incident reviews.
Disaster Recovery: Oversee the development and testing of disaster recovery plans and procedures.
Infrastructure as Code: Drive adoption and best practices for automation, ensuring repeatability and consistency in infrastructure provisioning.
CI/CD Pipeline Optimisation: Ensure seamless integration and delivery pipelines to support development and deployment at scale.
Observability: Ensure comprehensive monitoring, logging, and alerting systems are in place to track the health and performance of systems.
Incident Resolution: Lead and coordinate major incident responses, ensuring swift recovery while minimising impact.
Root Cause Analysis: Oversee post-mortem processes to identify root causes, document lessons learned, and implement preventive measures.

We are looking for candidates with experience in the following:

Ideally, a background in .NET development, as you will be resolving incidents and working with the developers to fix code problems
SRE/Web operations engineering experience
Experience of working with third-party infrastructure management providers (Azure MSPs)
Experienced with .NET technology - ideally
Experienced at working on platforms with large-scale codebases
SRE Practices and Principles
Automation and Tooling
Monitoring & Observability
Performance Optimisation
Incident & DR Management
Proven experience in scaling infrastructure
Excellent communication skills, both verbal and written
Strong organisational expertise and the ability to effectively multi-task
Strategic thinker
Data-driven decision maker
Security & Compliance - ideally
Cloud Native Architectures (e.g. Kubernetes, Docker) - ideally
Cloud Certifications - ideally

Excellent benefits, training and career progression.

TSP Group
