Travix - Job details


Site Reliability Engineer

Are you a Site Reliability Engineer who likes to work in a fast-moving and highly challenging e-commerce environment? Do you love the autonomy to introduce your own ideas and creativity to build a strong and solid technical environment? We are looking for people like you who know their way around the web!

For this role, we are looking for legendary, super-smart, can-do engineers who take pride in creating and running our infrastructure. The team’s core focus is on performance, reliability, scalability, and security. Together with the Systems Support Engineers and Software Engineers, there is no problem you cannot fix.

What You Do

  • Work with the System Engineering team to help improve, maintain, monitor and scale Travix infrastructure, which runs almost entirely on Kubernetes and has over a thousand pods
  • Work closely with the product developers to understand their application needs regarding scalability, availability, and security, then help them deliver their microservices to our cloud in GitOps style, making sure each service integrates smoothly with all other microservices and with our monitoring and alerting systems
  • Reverse-engineer our microservices mesh to help detect and fix issues
  • Participate in the on-call schedule to support the system in a 24/7 environment
  • Enhance alert accuracy and create new alerts to detect issues before they impact the business
  • Stay open and tech-minded, and adapt to new tools, setups, design patterns, and platforms
  • Be agile with development teams, yet "non-agile" enough to drop what you are doing to fix a production issue

What You Have

Hands-on experience in most of the following:

  • Kubernetes
  • Cloud platform: Google Cloud, AWS or Azure
  • Deployment: Docker, Packer
  • CI/CD
  • Monitoring and logging: Prometheus, ELK, Stackdriver, BigQuery
  • GitOps (infrastructure as code): Terraform, Helm
  • Programming/scripting languages: Bash, Python or Go
  • Alerting: ElastAlert, Prometheus Alertmanager, OpsGenie
  • Web servers: nginx, OpenResty, IIS
  • Caching: Varnish, Cloudflare
  • Database: MSSQL, MySQL, PostgreSQL, Couchbase
  • Linux or Windows Core systems administration

What You Get

  • Technically challenging greenfield projects
  • A multinational team of specialists to inspire and support you
  • Software that is used every day by thousands of consumers around the world
  • Employer contributions into a personal pension
  • Life insurance equating to 4 x salary
  • London zone 1 travel support
  • 25 days of paid leave, plenty of time to enjoy your global travel adventures

Note: No visa sponsorship is provided for this role in the UK.


Do you recognize yourself in the above profile? Of course you do! Apply now!

