Senior Systems Engineer, OS Automation
Job Description
About the Role:
SysEng HAVOCK (Hardware - Acceleration - Virtualization - Operating Systems - Containerization - Kernel)
CoreWeave is seeking a highly skilled and motivated Senior Linux OS Automation Engineer to join our SysEng HAVOCK team. Reporting to the Engineering Manager for Systems Engineering, you will play a crucial part in the design, development, and optimization of our bare-metal systems from POST through joining a Kubernetes cluster. The team’s primary responsibilities include maintaining custom Linux kernels, various OS images, the virtualization stack, and the container/pod runtime stack. You will collaborate closely with cross-functional teams, upstack engineering teams, and stakeholders to ensure the successful delivery of highly performant and reliable software solutions.
Responsibilities:
- Design, develop, and maintain automated tooling to reproducibly build, test, and release artifacts that support a variety of hardware platforms in complex environments
+ Custom Linux kernels
+ OS images
+ cloud-init modules
- Define artifact release-cycle roadmaps that align with the organization’s business needs
- Leverage Kubernetes to automate the testing of OS images and cloud-init configurations
- Document the tested and supported OS images and cloud-init configurations
- Effectively communicate artifact releases to the rest of the organization
- Automate packaging of critical components (drivers, microcode, components with out-of-tree patches, etc.)
- Serve as a primary point-of-contact for boot-time issue escalation and troubleshooting
- Collaborate with cross-functional teams to define Linux and OS requirements, specifications, and system architecture
- Contribute improvements to code quality
Minimum Qualifications:
- Must have 6+ years of professional experience maintaining large fleets of physical Linux systems
- Proficiency with Golang, Bash, and Python
- Experience with the following:
+ API development using protobufs (ideally using Golang)
+ Developing custom modules for cloud-init
+ CI/CD with GitHub Actions or GitLab CI/CD
+ Building the Linux kernel/complex C compilation toolchains
+ Packaging software into Docker containers
+ Packaging software into Debian packages (.deb)
+ Deploying containerized software using Kubernetes
+ Using semantic releases to support LTS alongside non-LTS versioned artifacts
+ Developing frameworks for complex software tests
- Demonstrated experience working collaboratively on shared codebases
- Excellent documentation skills and attention to detail
- Strong analytical and problem-solving abilities
- Experience serving in an on-call rotation supporting production services
Nice-to-haves:
- Experience with the following:
+ Supporting both amd64 and arm64 architectures
+ Public Key Infrastructure (PKI)
+ Different boot formats/mechanisms/tools (UEFI, iPXE, ISO, GRUB, U-Boot, etc.)
+ Ansible/AWX
+ Aptly
+ Packer
Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $160,000-$185,000. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience.