Note that the application deadline for this ad may have passed. Read the ad carefully before proceeding with your application.

We're looking for a software engineer to join our cloud-native team as a Site Reliability Engineer at ATG.

As an SRE at ATG, you will be an integral part of an evolving team tasked with ensuring that our platform and services have reliability, observability and availability built in, support a high rate of improvement, and keep scaling according to the needs of our customers.

SRE as a mindset involves finding engineering solutions to run a world-class production system. This includes building programmable infrastructure and eliminating manual work through automation. As an SRE you are responsible for the big picture of how our systems relate to each other, how they are observed and how they evolve over time.

You will be part of our culture of diversity, curiosity and problem solving, which we believe are vital aspects of our organisation, not only to make work enjoyable but also to deliver consistently good results. We encourage collaboration, broad perspectives and proactive work towards constant development in a blame-free environment.

Change will always be part of our DNA, and we recognize that we are not yet where we want to be in terms of organisation, technology or improvements. We would therefore like you to embrace this challenge and flourish in an environment where you can really make a difference.

Responsibilities

- Engage in the evolution of our cloud-native platform that supports mission-critical internal and external services, through automatic deployment and provisioning

- Engage in the full life cycle of a service, from inception and design to deployment, operation and continuous refinement

- Engage in service capacity planning and forecasting, performance analysis and tuning

- Maintain services after go-live by measuring and monitoring key objectives such as availability, latency and overall system health

- Scale systems through automation and push for changes that improve reliability and release velocity

- Conduct periodic on-call duties

Minimum qualifications

- Experience in one or more of the following: C/C++, Java, Python, Go, Perl, Ruby or shell scripting

- Experience in one or more of the following: AWS, Kubernetes, Prometheus, Grafana, Terraform

- Experience with Unix/Linux operating systems administration from kernel to shell

Preferred qualifications

- MS or BS degree in Computer Science or a related technical field, e.g. physics or mathematics, or equivalent practical experience

- A systematic approach to problem-solving coupled with strong communication skills, a sense of ownership and drive

- Experience in designing, analyzing and debugging dynamic distributed systems

- In-depth knowledge of operating system internals (threads, concurrency, mutexes) or networking (e.g. TCP/IP, routing, SDN)

Your application
Contact [email protected]
Please feel free to attach links to projects you have worked on, to your GitHub profile or to anything else that you would like to share. We are engineers at heart and we would love to learn more about you and your interests!

This is a job ad titled "Site Reliability Engineer" from the company Ants, published on webbjobb.io on 27 November 2018 at 15:51.
