Senior High Performance Computing Systems Administrator
I. JOB OVERVIEW
Job Description Summary:
The Division of Information Technology (it.gwu.edu) is the chief provider of technology infrastructure, services and applications at GW. The Division partners with stakeholders across GW to equip students, staff and faculty with the technology know-how and tools necessary to achieve academic excellence.
Reporting to the Director of Research Technology Services, the Senior High Performance Computing Systems Administrator supports the Division's Planning & Strategic Initiatives department. This role offers the incumbent an opportunity to collaboratively establish and grow a shared high performance computing (HPC) service in support of a fast-growing research profile at a major research university. The Sr. HPC Systems Administrator is responsible for the following:
- System administration and service ownership of HPC clusters and infrastructure to include specialized servers, storage, and networking. Monitors HPC infrastructure health and utilization and is responsible for both proactively and reactively addressing operational issues. Executes regular system maintenance and enhancement activities to include diagnosing and solving various system operational problems and automating common processes when possible.
- By working with HPC Specialists, develops and manages service level agreements for the HPC services, and implements operational procedures including web-based access to HPC resources.
- In collaboration with other service and research partners, takes part in augmenting traditional HPC models to create flexible hybrid HPC infrastructures by integrating them with public and private cloud services, high-performance networks, and distributed storage systems.
- Focuses on improving the big-data processing characteristics of the HPC cluster and provides a VM system both for user front ends and for the management of node images and updates.
- Coordinates with the HPC user community to establish service roadmap recommendations and service enhancements.
- Establishes and manages training materials for the service desk, local support partners and the HPC user community; delivery of training will be coordinated with training and development groups.
- Collects, analyzes, and reports usage data to relevant parties including the HPC user communities and interested administrators.
- Manages service related vendor relationships and contracts.
- Provides technical and functional supervision over other team members.
The omission of specific duties does not preclude the supervisor from assigning duties that are logically related to the position.
This position is primarily located at the data center on the GW Virginia Science and Technology Campus (VSTC) in Ashburn, Virginia; however, time may be split between this location and the Foggy Bottom Campus in Washington, DC, as required.
Minimum Qualifications:
Bachelor's degree in an appropriate area of specialization plus 5 years of relevant professional experience. Degree requirements may be substituted with an equivalent combination of education, training and experience.
Required Licenses/Certifications/Posting Specific Minimum Qualifications:
Preferred Qualifications:
Experience in a large-scale production high performance computing environment.
Familiarity with a variety of HPC concepts and practices in the context of academic research, including a basic understanding of sponsored research compliance requirements.
Excellent oral and written communication skills; ability to prepare and present comprehensive presentations to IT and business executives.
Demonstrated experience working in an environment with rapidly changing job priorities.
Strong analytical and troubleshooting skills.
Ability to creatively improve workflows and processes.
Experience scripting in Perl, Python, or bash.
Experience with Linux kernel modules, preferably for Lustre, NVIDIA GPUs, and Mellanox InfiniBand cards.
Familiarity with the Simple Linux Utility for Resource Management (Slurm) workload manager, or other job schedulers, including the setup and maintenance of a multi-factor fair-share priority scheme.
Familiarity with virtualization environments for front-end and maintenance image management.
Familiarity with ticket tracking systems and service level management.
II. JOB DETAILS
Campus Location: Ashburn
College/School/Department: Division of IT
Family: Information Technology
Sub-Family: High Performance Computing
Stream: Individual Contributor
Level: Level 3
Full-Time/Part-Time: Full-Time
Hours Per Week: 40
Work Schedule: Monday-Friday
Position Designation: Essential: Employees who perform functions that have been deemed essential to maintaining business or academic operations. Employees are generally expected to work from home during an event and may be asked to physically report to work.
Telework: No
Required Background Check: Criminal History Screening, Education/Degree/Certifications Verification, Social Security Number Trace, and Sex Offender Registry Search, Credit
Special Instructions to Applicants:
Internal Applicants Only? No
Posting Number: S006247
Job Open Date: 04/04/2017
Job Close Date:
If temporary, grant funded or limited term appointment, position funded until:
Background Screening: Successful Completion of a Background Screening will be required as a condition of hire.
EEO Statement:
The university is an Equal Employment Opportunity/Affirmative Action employer that does not unlawfully discriminate in any of its programs or activities on the basis of race, color, religion, sex, national origin, age, disability, veteran status, sexual orientation, gender identity or expression, or on any other basis prohibited by applicable law.
Posting Specific Questions
Required fields are indicated with an asterisk (*).
- * What is your expected salary range?
(Open Ended Question)
Documents needed to Apply
- Cover Letter