Site Reliability Engineer

Our client is a global marketing services company whose online platform helps SMEs create and manage their marketing more easily and affordably, with fast production and delivery. They’re active in multiple markets across Europe and North America.

They’re looking for a Site Reliability Engineer (SRE) to lead their monitoring and observability efforts. You’ll define and improve SLOs and SLIs, guide teams on best practices, and help maintain a stable, reliable platform through modern monitoring solutions.

Key Responsibilities

  • Lead Monitoring & Observability Strategy: Develop and lead the implementation of the company’s monitoring and observability approach.
  • Define & Maintain SLOs/SLIs: Set, implement, and manage Service Level Objectives (SLOs) and Service Level Indicators (SLIs) for critical services.
  • Mentor Product Managers & Engineering Leads: Guide teams on the definition and optimisation of SLOs/SLIs.
  • Collaborate Across Teams: Work closely with engineering, product, quality, and monitoring teams to manage incidents and maintain system health.
  • Set Up Monitoring Tools: Configure and manage tools like Datadog, Cloudflare, and Azure Cloud to monitor platform performance.
  • Improve Incident Management: Continuously improve processes to identify and resolve performance bottlenecks.
  • Optimise CI/CD Processes: Enhance CI/CD pipelines for better performance, reliability, and incident prevention.
  • Integrate Observability in Testing: Collaborate with QA teams to incorporate observability into testing processes for early issue detection.
  • Ensure High Availability & Security: Implement best practices to maintain high availability, performance, and security across the infrastructure.
  • Evolve SRE Practices: Drive the evolution of SRE practices and foster a culture of observability within the team.

What You Bring

  • Site Reliability Engineering Experience: Mid-level to senior experience in an SRE role, with a solid background as a developer.
  • E-commerce Experience: Experience working on high-traffic, customer-facing platforms such as e-commerce.
  • Monitoring & Observability Expertise: Strong experience with monitoring tools, observability frameworks, and related technologies.
  • Experience with Datadog or Similar Tools: Hands-on experience with Datadog or similar monitoring tools.
  • Cloud Experience: Experience working in a cloud-focused environment (e.g., Azure or similar).
  • Scripting Proficiency: Proficient in scripting for automation and system management.
  • SLO/SLI Implementation: Proven experience defining and implementing SLOs and SLIs for large-scale systems.
  • Incident Management & Collaboration: Deep understanding of incident management and effective collaboration with engineering teams.
  • Passion for System Reliability: Monitoring-focused and passionate about enhancing system reliability and visibility.
  • Mentorship Experience: Previous experience in mentoring and guiding teams on observability best practices.

Why Apply Now?
Don’t miss the opportunity to make a significant impact in a dynamic environment. This role allows you to mentor teams, implement best practices, and drive system improvements. Enjoy a flexible 4-day workweek and 100% remote work (Portugal-based).

Are you ready to take the next step in your career? Send your CV to ari.kilab@robertwalters.com

Contract type: Full-time

Specialisation: Information technology

Area: DevOps and Cloud

Industry: Marketing

Salary: Negotiable

Work type: Remote

Experience level: Manager

Location: Lisbon

Job reference: 7RHWBR-024BCFDD

Date posted: 8 May 2025

Consultant: Ari Kilab