
...

  1. 24-core 512GB
  2. 32-core 64GB
  3. 32-core 256GB
  4. 32-core 512GB
  5. 40-core 96GB
  6. 40-core 192GB
  7. 56-core 128GB
  8. 56-core 256GB
  9. 56-core 512GB
  10. 64-core 192GB
  11. 64-core 384GB
  12. 64-core 768GB
  13. 80-core 96GB
  14. 80-core 192GB
  15. 80-core 384GB
  16. 80-core 768GB
  17. 80-core 1.5TB
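
A quick way to see this mix of configurations on the running system is to query the scheduler itself. As a minimal sketch, assuming the SGE-based scheduler referenced by the job submission pages (the host name in the second command is a placeholder):

  # List every execution host with its core count (NCPU) and installed memory (MEMTOT)
  qhost

  # Limit the listing to a single host of interest (host name is hypothetical)
  qhost -h argon-compute-1-01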

The Argon cluster is split between two data centers:

  • ITF → Information Technology Facility
  • LC → Lindquist Center

Most of the nodes in the LC data center are connected with the OmniPath high-speed interconnect fabric, while most of those in the ITF data center are connected with the InfiniPath fabric, with the latest nodes having a Mellanox InfiniBand EDR fabric. There are many machines with varying types of GPU accelerators:

  1. 21 machines with Nvidia P100 accelerators
  2. 2 machines with Nvidia K80 accelerators
  3. 11 machines with Nvidia K20 accelerators
  4. 2 machines with Nvidia P40 accelerators
  5. 17 machines with Nvidia 1080Ti accelerators
  6. 19 machines with Nvidia Titan V accelerators
  7. 4 machines with Nvidia V100 accelerators
  8. 28 machines with Nvidia 2080Ti accelerators
Info

The Titan V is now considered a supported configuration in Argon phase 1 GPU-capable compute nodes, but is restricted to a single card per node. Staff have completed the qualification process for the 1080Ti and concluded that it is not a viable solution to add to phase 1 Argon compute nodes.
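
Because the GPU model varies from node to node, it can be worth confirming which accelerator a job actually landed on. A minimal check from within a job, assuming the Nvidia driver utilities are on the node's default path:

  # Print the model name of each GPU visible on the current node
  nvidia-smi --query-gpu=name --format=csv,noheader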

Info

The Rpeak needs to be updated.

The Rpeak (theoretical peak FLOPS) is 385.0 TFLOPS, not including the accelerators, and the system has 112 TB of memory in total. In addition, there are 2 login nodes of the Broadwell architecture, each with 256 GB of memory.
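
For reference, the theoretical peak is the sum over all compute nodes of cores × clock rate × floating point operations per cycle (see the table in the Heterogeneity section below). As a sketch only, using an assumed 2.4 GHz base clock that is not stated on this page, a single 40-core Skylake Gold node would contribute roughly

  R_{peak,node} = 40 \,\text{cores} \times 2.4 \,\text{GHz} \times 32 \,\tfrac{\text{FLOPs}}{\text{cycle}} \approx 3.07 \,\text{TFLOPS}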

While on the backend Argon was a completely new architecture in Feb. 2017, the frontend should be very familiar to those who have used previous-generation HPC systems at the University of Iowa. There are, however, a few key differences, which are discussed on this page.


Heterogeneity

While previous HPC cluster systems at UI have been very homogeneous, the Argon HPC system has a heterogeneous mix of compute node types. In addition to the variability in the GPU accelerator types listed above, there are also differences in CPU architecture. We generally follow Intel marketing names, with the most important distinction being the AVX (Advanced Vector Extensions) unit on the processor. The following table lists the processors in increasing generational order.

Architecture             AVX level   Floating Point Operations per cycle
Sandybridge, Ivybridge   AVX         8
Haswell, Broadwell       AVX2        16
Skylake Silver           AVX512      16 (1 AVX unit per processor core)
Skylake Gold             AVX512      32 (2 AVX units per processor core)
Cascade Lake Gold        AVX512      32
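
To see which of these instruction sets a given node actually supports, the CPU feature flags can be inspected directly. A minimal sketch, run on the compute node in question:

  # Report the AVX-related feature flags advertised by this node's processors
  grep -o -w -E 'avx|avx2|avx512f' /proc/cpuinfo | sort -u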

Note that code must be optimized during compilation to take advantage of AVX instructions. The CPU architecture is important to keep in mind both in terms of potential performance and in terms of compatibility. For instance, code optimized for AVX2 instructions will not run on the Sandybridge/Ivybridge architecture, because those processors support only AVX, not AVX2. However, each successive generation is backward compatible, so code optimized with AVX instructions will run on Haswell/Broadwell systems.
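
As an illustration of compiling for a specific AVX level, the sketch below uses common compiler flags; the compilers, flag choices, and the mysolver.c source file are examples, not a site recommendation:

  # GCC: target Broadwell (AVX2); the resulting binary will not run on AVX-only nodes
  gcc -O3 -march=broadwell -o mysolver mysolver.c

  # Intel compiler: AVX baseline plus optimized AVX2 and AVX-512 code paths chosen at
  # run time, so one binary can run across all of the architectures listed above
  icc -O3 -xAVX -axCORE-AVX2,CORE-AVX512 -o mysolver mysolver.c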

...


Node memory (GB)   Job slots    Memory (GB) per slot
64                 32           2
96                 40           2
96                 80           1
128                56           2
192                40           4
192                64           3
192                80           2
256                32           8
256                56           4
384                64           6
384                80           4
512                24 (no HT)   20
512                32 (no HT)   16
512                56           8
768                64           12
768                80           9
1536               80           18
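
If memory is effectively granted in proportion to job slots, as the per-slot column suggests, then a job's slot request can be sized from its memory need. For example (a sketch only), a job needing about 24 GB on a 192 GB node scheduled at 64 slots (3 GB per slot) would request

  \lceil 24 / 3 \rceil = 8 \ \text{slots}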


Using the Basic Job Submission and Advanced Job Submission pages as a reference, how would one submit jobs taking HT into account? For single-process, high-throughput jobs it probably does not matter; just request one slot per job. For multithreaded or MPI jobs, request one job slot per thread or process. So if your application runs best with 4 threads, then request something like the following.
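
A minimal sketch of such a request, assuming the SGE-style batch script conventions and the smp parallel environment described on the Basic Job Submission page; the queue name and program name are placeholders:

  #!/bin/bash
  #$ -pe smp 4        # one job slot per thread: 4 slots for a 4-thread run
  #$ -cwd             # run from the directory the job was submitted from
  #$ -q UI            # placeholder queue name; use the queue appropriate for your work

  # SGE sets NSLOTS to the number of slots actually granted to the job
  export OMP_NUM_THREADS=$NSLOTS
  ./my_threaded_app   # placeholder for the actual multithreaded application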

...