Posted on Feb 5, 2025

Senior Systems Engineer, OS Automation

Mid-Senior ICs
Engineering, IT

Job Description

About the Role:

SysEng HAVOCK (Hardware - Acceleration - Virtualization - Operating Systems - Containerization - Kernel)

CoreWeave is seeking a highly skilled and motivated Senior Linux OS Automation Engineer to join our SysEng HAVOCK team. Reporting to the Engineering Manager for Systems Engineering, you will play a crucial part in the design, development, and optimization of our bare-metal systems from POST through joining a Kubernetes cluster. The team’s primary responsibilities include maintaining custom Linux kernels, various OS images, the virtualization stack, and the container/pod runtime stack. You will collaborate closely with cross-functional teams, upstack engineering teams, and stakeholders to ensure the successful delivery of highly performant and reliable software solutions.

 

Responsibilities:

  • Design, develop, and maintain automated tooling to reproducibly build, test, and release artifacts that support a variety of hardware platforms in complex environments
    • Custom Linux kernels
    • OS images
    • cloud-init modules
  • Define artifact release-cycle roadmaps in coordination with the rest of the organization and its business needs
  • Leverage Kubernetes to automate the testing of OS images and cloud-init configurations
  • Document the tested and supported OS images and cloud-init configurations
  • Effectively communicate artifact releases to the rest of the organization
  • Automate packaging of critical components (drivers, microcode, components with out-of-tree patches, etc.)
  • Serve as a primary point-of-contact for boot-time issue escalation and troubleshooting
  • Collaborate with cross-functional teams to define Linux and OS requirements, specifications, and system architecture
  • Contribute improvements to code quality

Minimum Qualifications:

  • Must have 6+ years of professional experience maintaining large fleets of physical Linux systems
  • Proficiency with Golang, Bash, and Python
  • Experience with the following:
    • API development using protobufs (ideally using Golang)
    • Developing custom modules for cloud-init
    • GitHub Actions or GitLab CI/CD
    • Building the Linux kernel/complex C compilation toolchains
    • Packaging software into Docker containers
    • Packaging software into Debian packages (.deb)
    • Deploying containerized software using Kubernetes
    • Using semantic releases to support LTS alongside non-LTS versioned artifacts
    • Developing frameworks for complex software tests
  • Demonstrated experience working collaboratively on shared codebases
  • Excellent documentation skills and attention to detail
  • Strong analytical and problem-solving abilities
  • Experience serving in an on-call rotation supporting production services

Nice-to-haves:

  • Experience with the following:
    • Supporting both amd64 and arm64 architectures
    • Public Key Infrastructure (PKI)
    • Different boot formats/mechanisms/tools (UEFI, iPXE, ISO, GRUB, U-Boot, etc.)
    • Ansible/AWX
    • Aptly
    • Packer

Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $160,000-$185,000. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience.

In-Person