As a Site Reliability Engineer (SRE), you'll help build a meaningful engineering discipline, combining software and systems to develop creative engineering solutions to operations problems. Much of our support and software development focuses on optimizing existing systems, building infrastructure and reducing work through automation. You'll join a team of curious problem solvers with a diverse set of perspectives who are thinking big and taking risks. In this environment, you'll take the lead on relevant projects, supported by an organization that provides the support and mentorship you need to learn and grow. As an SRE, you'll be focused on running better production applications and systems.
You'll be a member of a global support team that operates from the United States, Singapore, and India. In this Tier 3 role, you will handle day-to-day incident and problem management for major data-impacting issues worldwide, along with proactive infrastructure management. You will attend bridge calls, address tickets, and work hand in hand with the GNS Tier 4/PE organizations on items such as bug fixes, design changes, and code upgrades.
This role requires a wide variety of strengths and capabilities, including:
- Bachelor's degree or equivalent experience.
- 5 or more years of experience supporting large-scale global network infrastructure running mission-critical applications and services.
- Experience in large-scale software development in one or more programming languages such as C, C++, Perl, Python, Java, C#, or .NET.
- Ability to communicate complex technical findings to management clearly, concisely, and in understandable terms.
- Experience in these key areas:
  - Hardware Architecture (performance testing, monitoring, operations)
  - Hardware Benchmarking (Agile, program management, network management)
  - Design (compliance, security)
  - Network Engineering (planning, provisioning)
- Experience in more than one specific infrastructure technology, for example network, server, Linux, Windows, cloud, or storage.
- Experience in one of the following infrastructure automation technologies: Ansible, Puppet, or Chef. In addition, experience building APIs and services using REST, SOAP, or similar.
- Ability to execute while keeping best practices and policies top of mind.
- Ability to collaborate with high-performing teams and individuals throughout the firm to accomplish common goals.
- Proven experience managing various large-scale enterprise network topologies, including LAN, WAN, Wireless, and Network Security & Services. Examples include Routers (Cisco ASR/GSR/Nexus, Juniper), Switches (Cisco), Wireless (Cisco), Transport (Cisco/Ciena), Firewalls (Check Point and Cisco), Load Balancers (Cisco and F5), and Proxies (Blue Coat). Certifications in these areas are preferred.
- Experience with packet capture and analysis tools such as Niksun, ARX/AIX, Endace, and cPacket.
- Ability to identify manual operational tasks and develop automation to solve problems in a modern site reliability engineering support model.
We recognize that our people are our strength and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. In accordance with applicable law, we make reasonable accommodations for applicants' and employees' religious practices and beliefs, as well as any mental health or physical disability needs.