As a Site Reliability Engineer (SRE), you'll help build a meaningful engineering discipline, combining software and systems to develop creative engineering solutions to operations problems. Much of our support and software development focuses on optimizing existing systems, building infrastructure and reducing work through automation. You’ll join a team of curious problem solvers with a diverse set of perspectives who are thinking big and taking risks. In this environment, you’ll take the lead on relevant projects, supported by an organization that provides the support and mentorship you need to learn and grow. As an SRE, you’ll be focused on running better production applications and systems.
- Develop, test, and debug automated tasks (Apps, Systems, Infrastructure)
- Troubleshoot priority incidents and facilitate blameless post-mortems
- Work with development teams throughout the software life cycle to ensure sustainable software releases
- Perform analytics on previous incidents and usage patterns to better predict issues and take proactive actions
- Build and drive adoption for greater self-healing and resiliency patterns
- Lead and participate in performance tests; identify bottlenecks, opportunities for optimization, and capacity demands
- Participate in 24x7 support coverage as needed
- Bachelor's degree in Computer Science, Information Technology, or equivalent technical field
- 8 or more years of relevant engineering experience
- Great communication and soft skills
- In-depth OS experience (RHEL, Ubuntu, Windows Server) with strong debugging, troubleshooting, and problem-solving skills
- Site reliability engineering experience using at least one of the following languages: Python or Java
- Hands-on experience with cloud-based technologies and tools, especially in deployment, monitoring, and operations, such as Datadog, Prometheus, Splunk, Elasticsearch, or Grafana
- Strong working knowledge of modern development technologies and tools such as Agile, CI/CD, Git, Terraform, and Jenkins
Additional Preferred Skills:
- AWS/Kubernetes certification is highly desirable
- 3 or more years of enterprise cloud infrastructure experience (AWS, Azure, or GCP) in a mission-critical environment
- Experience in Go, PowerShell, or shell scripting
- Deep knowledge of Internet protocols and web services technologies such as HTTP, DNS, TCP/UDP, SOAP, JSON and REST
- Good understanding of networking protocols and cybersecurity best practices in cloud environments
We recognize that our people are our strength and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. In accordance with applicable law, we make reasonable accommodations for applicants’ and employees’ religious practices and beliefs, as well as any mental health or physical disability needs.
About the Team
The Chief Technology Office oversees enabling components, including top-quality engineering and architecture tools and practices, key program management and processes, and the technology workforce strategy required to make us a leading technology company for our customers, clients, and colleagues around the world.
High Risk Roles (HRR) are sensitive roles within the technology organization that require high assurance of the integrity of staff by virtue of 1) sensitive cybersecurity and technology functions they perform within systems or 2) information they receive regarding sensitive cybersecurity or technology matters. Users in these roles are subject to enhanced pre-hire screening which includes both criminal and credit background checks (as allowed by law). The enhanced screening will need to be successfully completed prior to commencing employment or assignment.