
Job Summary


Job Type
Permanent

Seniority
Senior

Years of Experience
Information not provided

Tech Stacks
Docker
Container
POSIX
Metal
CI
SQS
Helm
Composer
Git
Airflow
Kafka
Ansible
Terraform
SQL
Linux
Python
AWS

Job Description


The team is made up of extremely talented (but humble) multidisciplinary individuals with unrestricted access across a large environment. We believe that one cannot build a truly great service without the ability to make changes across the stack. We take great care to focus on solving real business problems, reducing operational overhead and working together as a team.

The platform and infrastructure team is responsible for the following areas – this includes both engineering and operations:
  • data modelling, database tuning & query optimization for SQL/NoSQL/columnar databases
  • real-time message bus (high volume, not ultra-low latency)
  • stream processing
  • HPC job scheduling
  • workflow management and batch processing
  • container orchestration
  • service discovery
  • POSIX and object storage systems
  • on-premises:
    • bare metal compute (Linux)
    • system tuning
    • configuration management and drift management
    • performance tuning
    • network configuration management
    • compute, storage, network system purchases / evaluations
  • cloud(s):
    • environment provisioning and management
Qualifications/Skills Required

The team needs individuals like you with competencies in two or more of the following areas:
  • HPC job scheduling
    • Experience in environments at scale (e.g. billions of jobs per week/month)
    • Understanding of cost metrics, preemption, job types, queuing, scheduler and optimizations
    • Experience with products like HTCondor, Slurm, Spectrum LSF, Nomad, AWS Batch
  • Workflow management and batch processing
    • Experience in the challenges of workflow management in heavily multi-tenant environments
    • Mature approach to dealing with/avoiding task failure and system failure
    • Experience with products like Airflow, NiFi, GNUbatch, GCP Cloud Composer, AWS sagema
  • Container Orchestration (Kubernetes)
    • Experience with: PSPs, Helm, admission/mutation controllers, PVs/PVCs, kube-router, BGP – generally, a demonstrated ability to dig deep into the k8s projects to solve hard problems
    • Experience with Docker & registries (e.g. Harbor, Artifactory, GCP container registry, AWS container registry)
    • Mature approach to dealing with operational complexities and gaps of the k8s platform
  • Storage Systems
    • Experience deploying and managing petabyte scale systems supporting varied workloads
    • Mature approach to assessing price/performance, tiering and backup requirements
    • Experience with products like GPFS, Lustre, Ceph, GCP PDs or other NVMe-specific products
    • Familiarity with NVMe-oF, POSIX, object storage and various modes of permissioning data
  • Real time messaging & stream processing
    • Experience in environments with 24/7 reliance on messaging systems
    • Understanding of the messaging domain, congestion, delays, trade-offs and tech landscape
    • Experience having deployed and/or used stream processing systems
    • Experience with products like Kafka, Pulsar, Pravega, AWS SQS, GCP Pub/Sub
  • Linux
    • Experience using configuration management systems (e.g. SaltStack, Ansible)
    • Understanding of Linux kernel components (e.g. VFS, scheduler, memory management, network)
    • Solid troubleshooting experience using gdb, OS & application tracing/profiling mechanisms
    • Experience with some of Docker, LXD/LXC, Kerberos, eBPF and virtualization technologies
  • Cloud
    • Experience deploying cloud infrastructure (e.g. Terraform)
    • Keen understanding of security considerations, multi-tenant deployments, financial optimization and cloud products
  • Software Engineering
    • Proficient in OO development (we use Python), Git and CI/CD concepts
    • Comfortable contributing to a large code-base with varied technologies
And In General, The Following Qualifications Always Apply

  • Ability to review and/or extend open source platforms to satisfy business requirements
  • A passion for technology and automation, deep sense of curiosity and willingness to always question
  • A passion for in-depth understanding of technology, and building large-scale systems.
  • Excellent verbal and written communication skills.
