Site Reliability Engineer

AB Trav och Galopp, Stockholm

Java, Python, C++, UNIX, Linux, Perl, Ruby

We are looking for a software engineer to join our cloud-native team as a Site Reliability Engineer.
As an SRE at ATG, you will be an integral part of an evolving team tasked with ensuring that our platform and services have reliability, observability and availability built in, that they support a high rate of improvement, and that they keep scaling according to the needs of our customers.

SRE as a mindset involves finding engineering solutions to run a world-class production system. This includes building programmable infrastructure and eliminating manual work through automation. As an SRE you are responsible for the big picture of how our systems relate to each other, how they are observed and how they evolve over time.

You will be part of our culture of diversity, curiosity and problem solving, which we believe are vital aspects of our organisation, not only to make work enjoyable but also to deliver consistently good results. We encourage collaboration, broad perspectives and proactive work towards constant development in a blame-free environment.

Change will always be part of our DNA, and we recognise that we are not yet where we want to be in terms of organisation, technology or improvements. We would therefore like you to embrace this challenge and flourish in an environment where you will be able to really make a difference.

Responsibilities

• Engage in the evolution of our cloud-native platform that supports mission-critical internal and external services, through automatic deployment and provisioning
• Engage in the full lifecycle of a service, from inception and design to deployment, operation and continuous refinement
• Engage in service capacity planning and forecasting, performance analysis and tuning
• Maintain services after go-live by measuring and monitoring key objectives such as availability, latency and overall system health
• Scale systems through automation and push for changes that improve reliability and release velocity
• Conduct periodic on-call duties

We would like you to have

• Experience in one or more of the following: C/C++, Java, Python, Go, Perl, Ruby or shell scripting
• Experience in one or more of the following: AWS, Kubernetes, Prometheus, Grafana, Terraform
• Experience with Unix/Linux operating systems administration from kernel to shell

It is also a plus if you have

• MS or BS degree in Computer Science or related technical field, e.g. physics or mathematics, or equivalent practical experience
• A systematic approach to problem-solving coupled with strong communication skills, a sense of ownership and drive
• Experience in designing, analyzing and debugging dynamic distributed systems
• In-depth knowledge of operating system internals (threads, concurrency, mutexes) or networking (e.g. TCP/IP, routing, SDN)

Your application
Contact [email protected]
Please feel free to attach links to projects you have worked on, to your GitHub profile or anything else that you would like to share. We are engineers at heart and we would love to learn more about you and your interests!

This is a job advertisement titled "Site Reliability Engineer" from the company AB Trav och Galopp, published on webbjobb.io on 7 January 2019 at 11:46.
