The Sr Systems Engineer HPC is responsible for designing and maintaining HPC infrastructure, optimizing performance, and collaborating with scientists to meet computational needs.
Job Summary: Rackspace is seeking a highly skilled and motivated HPC System Engineer to join our team. You’ll work directly for one of our flagship clients, designing, implementing, maintaining, and optimizing their high-performance computing (HPC) infrastructure. You will work closely with researchers, scientists, and other engineers to ensure the efficient and reliable operation of the HPC systems.
Work Location: 100% Remote. Because this role supports a customer in the Seattle area, we prefer to hire in the PST or MST time zones.
Travel: There may be minimal travel to either San Antonio, TX or Seattle, WA.
Discover your inner Racker: Racker Life
Information on benefits offered is here.
Responsibilities:
- Install, configure, and maintain HPC clusters, including hardware and software components.
- Monitor system performance, identify bottlenecks, and implement solutions to optimize performance.
- Manage user accounts, permissions, and resource allocation.
- Perform regular system maintenance, updates, and patching.
- Troubleshoot and resolve hardware and software issues in a timely manner.
- Participate in the design and planning of HPC infrastructure upgrades and expansions.
- Evaluate and recommend hardware and software solutions to meet evolving computational needs.
- Implement and manage storage systems, networking infrastructure, and interconnects (e.g., InfiniBand).
- Optimize system configurations and application performance for HPC workloads.
- Profile and analyze application performance to identify areas for improvement.
- Implement and utilize performance monitoring tools and techniques.
- Provide technical support and training to HPC users.
- Collaborate with researchers and scientists to understand their computational requirements.
- Work closely with HPC architects and engineers to ensure that research needs are met.
- Document system configurations, procedures, and best practices.
- Assist HPC engineers and architects with day-to-day operations and ticket management.
- Implement and maintain security measures to protect HPC infrastructure and data.
- Ensure compliance with relevant security policies and regulations.
- Manage data backups and disaster recovery procedures.
Qualifications:
- Bachelor's degree in computer science, engineering, or a related field. Experience may substitute for the degree.
- Minimum of 10 years of experience working with systems; 5 years specifically with HPC.
- Strong knowledge of Linux operating systems (e.g., Rocky, Ubuntu).
- Experience with cluster management tools (e.g., Slurm, PBS).
- Familiarity with high-speed interconnects (e.g., InfiniBand, Ethernet).
- Knowledge of parallel file systems (e.g., Lustre, Ceph, GPFS).
- Proficiency in scripting languages (e.g., R, Python, Bash).
- Understanding of HPC hardware architectures and technologies (e.g., CPUs, GPUs, memory).
- Strong demonstrated experience with a major configuration management tool (e.g., Terraform, Ansible), including application packaging and installation.
- Must have strong knowledge of Linux security and Linux shell scripting.
- Strong communication and interpersonal skills.
- Knowledge of data transfer protocols and large-scale storage solutions.
The following information is required by pay transparency legislation in the following states: CA, CO, HI, NY, and WA. This information applies only to individuals working in these states.
· The anticipated starting pay range for Colorado is: $116,100 - $170,280.
· The anticipated starting pay range for the states of Hawaii and New York (not including NYC) is: $123,600 - $181,280.
· The anticipated starting pay range for California, New York City and Washington is: $135,300 - $198,440.
Unless already included in the posted pay range and based on eligibility, the role may include variable compensation in the form of bonus, commissions, or other discretionary payments. These discretionary payments are based on company and/or individual performance and may change at any time. Actual compensation is influenced by a wide array of factors including but not limited to skill set, level of experience, licenses and certifications, and specific work location.
About Rackspace Technology
We are the multicloud solutions experts. We combine our expertise with the world’s leading technologies, across applications, data and security, to deliver end-to-end solutions. We have a proven record of advising customers based on their business challenges, designing solutions that scale, building and managing those solutions, and optimizing returns into the future. Named a best place to work year after year by Fortune, Forbes and Glassdoor, we attract and develop world-class talent. Join us on our mission to embrace technology, empower customers and deliver the future.
More on Rackspace Technology
Though we’re all different, Rackers thrive through our connection to a central goal: to be a valued member of a winning team on an inspiring mission. We bring our whole selves to work every day. And we embrace the notion that unique perspectives fuel innovation and enable us to best serve our customers and communities around the globe. We welcome you to apply today and want you to know that we are committed to offering equal employment opportunity without regard to age, color, disability, gender reassignment or identity or expression, genetic information, marital or civil partner status, pregnancy or maternity status, military or veteran status, nationality, ethnic or national origin, race, religion or belief, sexual orientation, or any legally protected characteristic. If you have a disability or special need that requires accommodation, please let us know.
Top Skills
Ansible
Bash
Ethernet
GPFS
InfiniBand
Linux
Lustre
PBS
Python
R
Rocky
Ceph
Slurm
Terraform
Ubuntu