• Note that the application deadline for this ad may have passed. Read the ad carefully before proceeding with your application.

A few words about us

Peltarion provides an operational AI platform for producing real-world AI applications at scale and at speed. Our goal is to make machine learning widely available: be it rapid prototyping for data scientists, using our wizard to build an image classifier for a hobby app project, or a sentiment analysis system for a business case. It is the first platform to provide fast, efficient and scalable production of commercially viable AI applications without extensive prior knowledge about machine learning.

We have barely scratched the surface of what is possible, and AI will change the world fundamentally. At Peltarion we have been helping doctors fight cancer, carmakers optimize battery power, curators identify moods in music, farmers keep their crops secure... The platform plays a key role in this, and the opportunities to do more expand every day. Today, AI and deep learning are certainly not for everyone. We want to change that and enable a wide audience, even without prior AI experience, to solve new classes of problems that were previously hard or impossible to solve.


About the role

As a Site Reliability Engineer in our Infrastructure team, you'll provide support on application design and development. You’ll help us manage and maintain all of Peltarion's infrastructure and make sure that we have all the tools, automation and monitoring in place to make our platform reliable and scalable. We work closely with developers and architects, and in the last few years we have moved from running our services natively on VMs to a hybrid container/Kubernetes setup, with the goal of moving as many services as possible to cloud Kubernetes solutions.

To succeed, you will need to be an experienced problem solver with a generalist mindset and a genuine interest in cloud environments, so that you can design and build stellar infrastructure spanning multiple clouds. You enjoy debugging complex problems and finding ways to fix them. You will work in a team of like-minded people where innovation is at the core. You will also share responsibility for on-call with the team.

Our ever-evolving tech stack currently consists of (to mention a few) Ubuntu, Terraform, SaltStack, GCP, GKE, Cloud Build, Azure, AKS, Docker, Prometheus, Grafana, Graylog, PostgreSQL, Nginx, HAProxy, Python and Java.

We need you to have/be:

  • A good understanding of Linux
  • Programming skills in Bash, Python and/or Go
  • Experience with configuration management and CI/CD
  • Knowledge of logging, monitoring & alerting
  • Experience with Kubernetes and container orchestration
  • An understanding of cloud and traditional networking
  • Experience administering cloud environments such as GCP and Azure
  • A desire to automate manual and repetitive tasks
  • Able to work both independently and as part of a team
  • Fluent in English

And if you also are/have any of the following, even better!

  • Experience working in a remote team
  • A background in maintaining machine learning and/or GPU systems
  • Experience building tools and services
  • Experience with security principles
  • Skills in Java

If you have a peculiar hobby, a kind heart, and a dedicated mind, and are looking for a new challenge, there’s a good chance we’ll be a great match. Please reach out if this piques your interest.

This is a job ad titled "Site reliability engineer" at the company Peltarion AB, published on webbjobb.io on 29 June 2021 at 11:42.
