Showing 1 entry
We report our experiences in porting and tuning the Apache Spark data analytics framework on the Cray XC30 (Edison) and XC40 (Cori) systems, installed at NERSC. We find that design decisions made in the development of Spark are based on the assumption that Spark is constrained primarily by network latency, and that disk I/O is comparatively cheap. These assumptions are not valid on Edison or Cori, which feature advanced low-latency networks but have diskless compute nodes. Lustre metadata access latency is a major bottleneck, severely constraining scalability. We characterize this problem with benchmarks run on a system with both Lustre and local disks, and show how to mitigate high metadata access latency by using per-node loopback filesystems for temporary storage. With this technique, we reduce the shuffle time and improve application scalability from O(100) to O(10, 000) cores on Cori. For shuffle-intensive machine learning workloads, we show better performance than clusters with local disks.
Modified: Fri Jan 20 13:49:49 2017
Created: Fri Jan 20 13:36:29 2017
Return to the ParaDucks Research Group Publications page.