Job Description

The team is composed of extremely talented (but humble) multidisciplinary individuals with unrestricted access across a large environment. We believe that one cannot build a truly great service without the ability to make changes across the stack. We take great care to focus on solving real business problems, reducing operational overhead, and working together as a team.

The platform and infrastructure team is responsible for the following areas – this includes both engineering and operations:
  • data modelling, database tuning & query optimization for SQL/NoSQL/columnar databases
  • real-time message bus (high volume, not ultra-low latency)
  • stream processing
  • HPC job scheduling
  • workflow management and batch processing
  • container orchestration
  • service discovery
  • POSIX and object storage systems
  • on-premises:
    • bare metal compute (Linux)
    • system tuning
    • configuration management and drift management
    • performance tuning
    • network configuration management
    • compute, storage, network system purchases / evaluations
  • cloud(s):
    • environment provisioning and management
Qualifications/Skills Required

The team needs individuals like you with competencies in two or more of the following areas:
  • HPC job scheduling
    • Experience in environments at scale (e.g. billions of jobs per week/month)
    • Understanding of cost metrics, preemption, job types, queuing, scheduler and optimizations
    • Experience with products like HTCondor, Slurm, Spectrum LSF, Nomad, AWS Batch
  • Workflow management and batch processing
    • Experience in the challenges of workflow management in heavily multi-tenant environments
    • Mature approach to dealing with/avoiding task failure and system failure
    • Experience with products like Airflow, NiFi, GNUbatch, GCP Cloud Composer, AWS SageMaker
  • Container Orchestration (Kubernetes)
    • Experience with: PSPs, Helm, admission/mutation controllers, PVs/PVCs, kube-router, BGP – generally, a demonstrated ability to dig deep into the k8s projects to solve hard problems
    • Experience with Docker & registries (e.g. Harbor, Artifactory, GCP container registry, AWS container registry)
    • Mature approach to dealing with operational complexities and gaps of the k8s platform
  • Storage Systems
    • Experience deploying and managing petabyte scale systems supporting varied workloads
    • Mature approach to assessing price/performance, tiering and backup requirements
    • Experience with products like GPFS, Lustre, Ceph, GCP PDs or other NVMe-specific products
    • Familiarity with NVMe-oF, POSIX, object storage and various modes of permissioning data
  • Real time messaging & stream processing
    • Experience in environments with 24/7 reliance on messaging systems
    • Understanding of the messaging domain, congestion, delays, trade-offs and tech landscape
    • Experience having deployed and/or used stream processing systems
    • Experience with products like Kafka, Pulsar, Pravega, AWS SQS, GCP Pub/Sub
  • Linux
    • Experience using configuration management systems (e.g. SaltStack, Ansible)
    • Understanding of Linux kernel components (e.g. VFS, scheduler, memory management, network)
    • Solid troubleshooting experience using gdb, OS & application tracing/profiling mechanisms
    • Experience with some of Docker, LXD/LXC, Kerberos, eBPF and virtualization technologies
  • Cloud
    • Experience deploying cloud infrastructure (e.g. Terraform)
    • Keen understanding of security considerations, multi-tenant deployments, financial optimization and cloud products
  • Software Engineering
    • Proficient in OO development (we use Python), Git and CI/CD concepts
    • Comfortable contributing to a large code-base with varied technologies
And In General, The Following Qualifications Always Apply

  • Ability to review and/or extend open source platforms to satisfy business requirements
  • A passion for technology and automation, deep sense of curiosity and willingness to always question
  • A passion for in-depth understanding of technology, and building large-scale systems.
  • Excellent verbal and written communication skills.

