The KillDevil cluster is a Linux-based computing system available to researchers across the campus. With more than 9,500 computing cores across 774 servers and a large scratch disk space, it provides an environment that can accommodate many types of computational problems. The compute nodes are interconnected with a high-speed InfiniBand network, making the cluster especially appropriate for large parallel jobs. KillDevil is a heterogeneous cluster with at least 48 GB of memory per node. In addition, there are nodes with extended memory, extremely large memory, and GPGPU computing capability.

You may request an account on KillDevil by choosing Subscribe to Services on the Onyen Services page and selecting KillDevil Cluster. Accounts on KillDevil are primarily available to faculty, graduate students, and staff, as well as to research team members of current KillDevil faculty patrons.


Operating System

  • Red Hat Enterprise Linux 5.6

System Maintenance Guidelines

  • Maintenance will be scheduled and announced in advance with a Change Notice and an email to all KillDevil account holders. Depending on the nature of the maintenance, outages may last from several hours to several days.
  • Unscheduled maintenance may involve little or no advance notice depending on the nature of the problem. An Emergency or Follow Up Change Notice will be issued as soon as possible after unscheduled outages.

Technical Specifications

  • Dell blade-based Linux Cluster
  • Largely focused on parallel and GPGPU computing
  • Machine Name: killdevil.its.unc.edu
  • Two Login Nodes
  • 604 Compute Nodes: 48GB RAM
  • 68 Compute Nodes: 96GB RAM
  • 2 Compute Nodes: 1TB RAM
  • 32 GPU compute nodes with 64 Nvidia Tesla GPU cards
  • Access to 42TB NetApp NAS RAID array used for scratch space, mounted as /netscr (see the example after this list)
  • Access to 42TB NetApp NAS FC array used for departmental space, mounted as /nas01
  • Access to 36TB NetApp NAS SATA array used for home directories, mounted as /nas02/home
  • Access to 36TB NetApp NAS SATA array for departmental space, mounted as /nas02/depts
  • Access to 138TB Lustre file system, mounted as /lustre
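
Jobs with heavy input/output should stage their work in scratch space rather than in home directories. Below is a minimal sketch; the per-user subdirectory layout under /netscr is an assumption used for illustration, not a documented convention:

    # Create a personal working directory in scratch space and run there
    mkdir -p /netscr/$USER/myjob
    cd /netscr/$USER/myjob

    # Check how much scratch space is currently available
    df -h /netscr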

Job Management

We use LSF (Load Sharing Facility) from Platform Computing, Inc. for job management. It helps us balance the workload on our central computational servers while giving you access to the software and hardware you need to get your work done, regardless of where you are logged in. We are available to assist users in submitting jobs through LSF in a fashion that makes optimal use of cluster resources.
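
As an illustration, a batch job is typically described in a short script of #BSUB directives and handed to the bsub command. The sketch below is a minimal example rather than a site-specific template; the queue name, core count, and program name are assumptions, and the actual queues available on KillDevil can be listed with bqueues:

    #!/bin/bash
    #BSUB -J my_analysis    # job name
    #BSUB -q week           # queue (assumed name; run bqueues for the real list)
    #BSUB -n 8              # number of cores requested
    #BSUB -o out.%J         # standard output file (%J expands to the job ID)
    #BSUB -e err.%J         # standard error file
    ./my_program

Submit the script and check its status:

    bsub < myjob.sh
    bjobs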