Site Reliability Engineer

Job description

PLEASE NOTE: This position is based in our Swiss HQ in Mendrisio, Switzerland, just 7 km over the border from Como, Italy. Mendrisio is easily commutable from Milan, Como, Varese, or Lugano, and Cloud Academy provides the train ticket for you!


Cloud Academy is the leading digital skills development platform that enables every enterprise to become a tech company through guided Learning Paths, Hands-on Labs, and Skill Assessment. Cloud Academy delivers role-specific training on leading cloud technologies (AWS, Azure, Google Cloud Platform), essential methodologies needed to operate on and between clouds (DevOps, security, containers), and capabilities that are unlocked by the cloud (big data, machine learning).


Companies like Turner, Cognizant, SAS, and ThermoFisher customize Cloud Academy to contextualize learning and leverage the platform to assign, manage, and measure cloud enablement at scale.

We are looking for a strong SRE to add to our current SRE team. The ideal candidate is someone who is excited about working for a growing, international company that is building an amazing product.


Role description:

SREs are responsible for creating and maintaining fully automated pipelines that enable developers to release features when they are ready. They are also responsible for architecting, implementing, and maintaining a performant and elastic infrastructure.

In this role, you will contribute to defining and implementing the architecture of both the infrastructure and the CI/CD pipelines to make the above a reality. This will require you to proactively collaborate with other team members and contribute to an atmosphere of success, where we can quickly achieve goals without breaking things in production in the process.

This team works closely with our engineers to make sure they have the tools they need for success. Automation will start with CI/CD, and extend to a fully-serviced container delivery system where an engineer can build, test, and release on demand.

Another area this team covers is Chaos Engineering. This includes everything from production load testing to vulnerability injection to site and regional disaster simulation. Additional focus will be dedicated to implementing an effective monitoring-and-alerting strategy to identify potential problems as soon as possible, and troubleshooting them before they impact customers.

Requirements

  • Passion for coding, web technologies, and shipping features that drive user adoption
  • Strong understanding of server-side technologies such as Python, Django, Flask, and Celery, and of both relational and NoSQL databases such as PostgreSQL, MongoDB, and Redis
  • Experience with distributed version control systems, primarily Git (GitHub and/or Bitbucket)
  • Expert-level familiarity with cloud provider systems in either AWS or Google Cloud
  • Experience with container automation technologies such as AWS ECS and Kubernetes
  • Experience with building CI/CD pipelines using technologies such as Jenkins or CircleCI
  • Experience with managing an on-call rotation using PagerDuty or similar tools
  • Familiarity with cloud architecture patterns and best practices for designing highly available, scalable, and secure systems
  • High level of English proficiency, both spoken and written
  • Ability to work independently and as part of a team, with a sense of urgency and integrity

    Bonus Points:

    • Passion for and experience with e-learning projects is a strong bonus.
    • Able to find creative solutions to interesting problems.
    • Curious with a constant desire to learn and collaborate.

Benefits

  • Competitive salary including a bonus plan
  • Train ticket paid for by the company
  • Budget for professional development
  • 4 weeks of paid vacation and 15 paid holidays per year
  • Great company culture and work environment!
  • Highly skilled teammates and lots of opportunities for growth and development