Roles & Responsibilities
1. Work with engineers around the world to define strategy and drive continuous improvement in application reliability
2. Engineer solutions to monitor the health, availability, and capacity of our environment and software using industry-standard tools and practices
3. Troubleshoot and diagnose hardware and software issues; propose and implement solutions that reduce how often those issues recur
4. Collaborate in vendor engagements to solve problems and meet application reliability requirements
5. Partner with Product Management and Application Engineering to align on operational needs and product roadmap
6. Standardize and automate change, validation, and deployment processes
7. Participate in the support rotation to maintain uptime and meet SLOs
Qualifications
1. BS degree in Systems Engineering, Computer Science, or Information Technology, or equivalent related work experience, required
2. Strong hands-on experience with Linux and TCP/IP networking
3. Prior experience with configuration and maintenance of common applications such as Nginx, MySQL, SQL Server, DNS, etc.
4. Working knowledge of scripting languages including shell and Python
5. Experience supporting infrastructure and services ranging from on-premises to public cloud environments (GCP or AWS)
6. Available around the clock when needed for production-impacting incidents or key customer events
7. A team player, fast learner, with a focus on getting work done
8. Excellent written and oral communication skills in Chinese and English