Reliability Engineer


Wattpad is a mobile social app that connects people all over the world with stories that matter to them. It enhances the storytelling experience, and makes it possible for people to be captivated by something they love. We’re proudly based in Toronto, but our reach is global. Every month, 45 million people spend over 15 billion minutes on Wattpad to create and discover stories they can’t find anywhere else.

Come write your next chapter with us!

You have heard the terms DevOps, System Admin, and Test Engineer throughout our industry. As a Wattpad Reliability Engineer, you will be involved in aspects of all three disciplines in one unique position. We strive to reduce silos between teams and responsibilities while remaining flexible and agile. As part of the Reliability team, you will have full access to the code so you can fix and strengthen it. You will design systems that validate and run code from other teams, as well as tools that monitor the state of our systems. You have a passion for operations and software development, along with the dedication to improve how things are done to get the best results.

The Reliability Engineering Team combines production and testing systems and aids in development. Our approach is to consider the pipeline required to ship code into production, so that we can build, test and release reliable Wattpad services (Web, iOS and Android apps, microservices, API) using creative and ever-evolving solutions.

What you will be doing:

Design and develop software:

  • Code testing automation
  • Deployment mechanisms
  • Test and production infrastructure
  • Resource management
  • Monitor latency and availability of services
  • Perform ongoing performance monitoring, resource management and optimization
  • Solve problems regarding critical systems
  • Influence software design and architecture at Wattpad
  • Be part of the on-call rotation

What we are looking for:

  • Experience developing in any of the following languages: Python, Go, PHP, JavaScript, or Java
  • Strong grasp of security, privacy and monitoring concepts
  • Understanding of Unix/Linux systems from kernel to shell and beyond
  • Experience designing and implementing tasks in Continuous Integration systems (Jenkins, Travis, etc.)
  • 5 years of overall experience, with at least 3 years as a DevOps/Site Reliability Engineer
  • Excellent communication and English language skills (oral and written)
  • High-energy, self-starting individual with an entrepreneurial spirit
  • Experience troubleshooting systems and software
  • Knowledge of Unix system internals and networking (DNS, TCP/IP, UDP, etc.)
  • Experience with cloud-based services such as Amazon Web Services, Azure, or Google Cloud Platform
  • Experience with Docker or similar container services

What we offer:

  • Competitive salary & career growth
  • Meaningful equity (stock options)
  • Health benefits, fully covered by us!
  • Transit pass, choice of technology, flexible hours
  • A beautiful office in downtown Toronto, with easy access to transit
  • And a whole lot more!

Posted: Nov 1, 2016