The mission of the Shopee Tech Ops MRE (Machine Reliability Engineering) team is to keep the Shopee network and hardware running efficiently and sustainably 24x7. The team builds and maintains massive hardware clusters for SRE, with attention to capacity, cost and hardware performance, and provides sustainable hardware resources and stable network support services. MRE works with the data center team to design and optimize the network architecture; provides suitable hardware configurations through hardware testing and selection according to business requirements; customizes a stable and efficient OS; optimizes traditional operations through engineering and service-oriented approaches; and builds a complete hardware monitoring system to improve the efficiency of fault handling.
- Maintain and take responsibility for the capacity, stability, availability and serviceability of Shopee virtual networks.
- Design and develop the Shopee virtual networking platform, including the SDN control plane, networking data plane, Network Function Virtualization, etc.
- Develop network functionalities on demand, including VPC, NAT, LB, QoS, etc.
- Develop and apply networking acceleration technologies such as DPDK, tc-flower offload, P4, eBPF, and RDMA.
- Bachelor's or higher degree in Computer Science or related fields.
- Passionate about coding and programming, innovation, and solving challenging problems.
- Understanding of TCP/IP protocols and VXLAN/BGP/OpenFlow protocols.
- Understanding of common networking utilities in Linux, such as Open vSwitch, iptables, tc, and eBPF.
- Strong hands-on experience with at least one of the following programming languages: Go, Python, C.
- Strong logical thinking abilities.
The following skills are optional but preferred:
- Proficiency in TCP/IP, OSPF, BGP, IS-IS, MPLS VPN/TE and other protocols.
- Experience with a common SDN control plane software, such as ODL, ONOS, or Ryu.
- Experience with DPDK, tc, SmartNIC or eBPF.
- Experience in the design and development of large-scale systems and platforms.
- Contributions to open-source projects.
- Papers published at top conferences such as NSDI and SIGCOMM.
- Experience with network operation and maintenance.