Site Reliability Engineer

Our client is a global e-commerce company providing an online platform where businesses can easily create and order customised marketing materials. They're active in multiple international markets across Europe and North America.

They’re looking for a Site Reliability Engineer (SRE) to lead their monitoring and observability efforts. You'll define and improve SLOs and SLIs, guide teams on best practices, and help maintain a stable, reliable platform through modern monitoring solutions.

Key Responsibilities

  • Lead Monitoring & Observability Strategy: Develop and lead the implementation of the company’s monitoring and observability approach.
  • Define & Maintain SLOs/SLIs: Set, implement, and manage Service Level Objectives (SLOs) and Service Level Indicators (SLIs) for critical services.
  • Mentor Product Managers & Engineering Leads: Guide teams on the definition and optimisation of SLOs/SLIs.
  • Collaborate Across Teams: Work closely with engineering, product, quality, and monitoring teams to manage incidents and maintain system health.
  • Set Up Monitoring Tools: Configure and manage tools like Datadog, Cloudflare, and Azure Cloud to monitor platform performance.
  • Improve Incident Management: Continuously improve processes to identify and resolve performance bottlenecks.
  • Optimise CI/CD Processes: Enhance CI/CD pipelines for better performance, reliability, and incident prevention.
  • Integrate Observability in Testing: Collaborate with QA teams to incorporate observability into testing processes for early issue detection.
  • Ensure High Availability & Security: Implement best practices to maintain high availability, performance, and security across the infrastructure.
  • Evolve SRE Practices: Drive the evolution of SRE practices and foster a culture of observability within the team.

What You Bring

  • Site Reliability Engineering Experience: Mid-level to senior experience in an SRE role, with a solid background as a developer.
  • E-commerce Experience: Experience working on high-traffic, customer-facing platforms such as e-commerce.
  • Monitoring & Observability Expertise: Strong experience with monitoring tools, observability frameworks, and related technologies.
  • Experience with Datadog or Similar Tools: Hands-on experience with Datadog or similar monitoring tools.
  • Cloud Experience: Experience working in a cloud-focused environment (e.g., Azure or similar).
  • Scripting Proficiency: Proficient in scripting for automation and system management.
  • SLO/SLI Implementation: Proven experience defining and implementing SLOs and SLIs for large-scale systems.
  • Incident Management & Collaboration: Deep understanding of incident management and effective collaboration with engineering teams.
  • Passion for System Reliability: Monitoring-focused and passionate about enhancing system reliability and visibility.
  • Mentorship Experience: Previous experience in mentoring and guiding teams on observability best practices.

Why Apply Now?

Don’t miss the opportunity to make a significant impact in a dynamic environment. This role allows you to mentor teams, implement best practices, and drive system improvements. Enjoy a flexible 4-day workweek and 100% remote work (Portugal-based).

Are you ready to take the next step in your career? Send your CV to ari.kilab@robertwalters.com

Contract type: Permanent

Specialisation: Information Technology

Area: DevOps and Cloud

Industry: Marketing

Salary: Negotiable

Work type: Remote

Experience level: Manager

Location: Lisbon

Job reference: 7RHWBR-024BCFDD

Date posted: 8 May 2025

Consultant: Ari Kilab