Site Reliability Engineer Lead

Job description

PLEASE NOTE: This position is based at our Swiss HQ in Mendrisio, Switzerland, just 7 km over the border from Como, Italy. Mendrisio is an easy commute from Milan, Como, Varese, or Lugano, and Cloud Academy provides the train ticket for you!

Cloud Academy is the leading digital skills development platform that enables every enterprise to become a tech company through guided Learning Paths, Hands-on Labs, and Skill Assessment. Cloud Academy delivers role-specific training on leading cloud technologies (AWS, Azure, Google Cloud Platform), essential methodologies needed to operate on and between clouds (DevOps, security, containers), and capabilities that are unlocked by the cloud (big data, machine learning).

Companies like Turner, Cognizant, SAS, and ThermoFisher customize Cloud Academy to contextualize learning and leverage the platform to assign, manage, and measure cloud enablement at scale.

 

Role description:

SREs are responsible for creating and maintaining a fully automated pipeline where developers can release features when they are ready without any changes required by an operations team.

In this role, you’ll be responsible for creating the architecture that makes the above a reality and for leading a team to make it all work.

This will require you to actively drive collaboration, mentor other team members, and create an atmosphere of success where we can achieve goals quickly without breaking things in production.

This team works closely with our engineers to make sure they have the tools they need for success.

Automation will start with CI/CD, and extend to a fully serviced container delivery system where an engineer can build, test, and release on demand.

This team will also cover Chaos Engineering, which includes everything from production load testing to vulnerability injection to site and regional disaster simulation.

All of this requires close attention to monitoring and alerting, so that potential problems are visible early and we can head them off before they impact customers. You’ll define this solution and lead its implementation.

Requirements

  • Passion for coding, web technologies, and shipping features that drive user adoption
  • Strong understanding of server-side technologies such as Python, Django, Flask, and Celery, and of both relational and NoSQL databases such as PostgreSQL, MongoDB, and Redis
  • Experience with distributed version control systems, mostly Git (GitHub and/or Bitbucket)
  • Expert-level familiarity with cloud provider systems in either AWS or Google Cloud
  • Experience with container automation technologies such as ECS and Kubernetes
  • Experience building CI/CD pipelines using technologies such as Jenkins or CircleCI
  • Experience managing an on-call rotation using PagerDuty or similar tools
  • Familiarity with cloud architecture patterns and best practices for designing highly available, scalable, and secure systems
  • High level of English proficiency, both spoken and written
  • Ability to work independently and as part of a team, with a sense of urgency and integrity
  • Background leading small teams

 

Bonus Points:

  • Passion for and experience with e-learning projects are a strong bonus.
  • Able to find creative solutions to interesting problems.
  • Curious with a constant desire to learn and collaborate.

 

Benefits

  • Competitive salary including a bonus plan
  • Train ticket paid for by the company
  • Budget for professional development
  • 4 weeks of paid vacation and 15 paid holidays per year
  • Great company culture and work environment!
  • Highly skilled teammates and lots of opportunities for growth and development