
Exploring Strategies to Improve Locality Across Many-Core Affinities

EasyChair Preprint no. 7471

12 pages
Date: February 15, 2022

Abstract

Several recent rank-one systems in the Top500 include many-core chips with complex memory systems, including intermediate levels of memory, multiple memory channels, and explicit affinity of specific memory channels to specific sub-blocks of cores. Creating codes that use these features efficiently is thus a significant challenge. This paper uses Intel's Knights Landing (KNL) processor as a testbed, as it includes both intermediate memory and multiple architectural knobs to adjust affinity. It also uses a 2D Fast Fourier Transform (FFT) as a test case to explore which combination of architectural and algorithmic techniques is of most benefit. Several codes are used, including the state-of-the-art FFT codes FFTW and MKL, along with two additional simple parallel 2D FFT codes that explore explicit options. The conclusions are that intermediate memory does provide a significant boost, that some architectural modes in the memory subsystem are better suited to FFT than others, and that a cache-oblivious FFT performs consistently across affinity modes.

Keyphrases: affinity, buffering, cache-oblivious, FFT, multilevel memory

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@Booklet{EasyChair:7471,
  author = {Neil Butcher and Peter Kogge},
  title = {Exploring Strategies to Improve Locality Across Many-Core Affinities},
  howpublished = {EasyChair Preprint no. 7471},
  year = {EasyChair, 2022}}