Session: High Performance Computing – Algorithms and Applications
- “Fast Parallel Tensor Times Same Vector for Hypergraphs”
- “Reduce, Reuse and Adapt — Accelerating Graph Processing on GPUs”
- “Reduce Computational Complexity for Convolutional Layers by Skipping Zeros”
- “A Lossless Compression Pipeline for Petabyte-scale Whole Genome Sequencing Data”
- “SpikeNC: An Accurate and Scalable Simulator for Spiking Neural Network on Multi-Core Neuromorphic Hardware”
- “DAGit: A Platform For Enabling Serverless Applications”
- “Efficient GPU Implementation of Automatic Differentiation for Computational Fluid Dynamics”
Session: High Performance Computing – Architecture
- “DNA-TEQ: An Adaptive Exponential Quantization of Tensors for DNN Inference”
- “A 118 GOPS/mm^2 3D eDRAM TensorCore Architecture for Large-scale Matrix Multiplication”
- “PARAG: PIM Architecture for Real-time Acceleration of GCNs”
- “Hybrid CUDA Unified Memory Management in Fully Homomorphic Encryption Workloads”
- “Mobile Gaming Experience: An Approach based on Thread Scheduler & Thread Priority Manager”
- “Optimized All-to-all Connection Establishment for High-Performance MPI Libraries over InfiniBand”
- “Data Locality Aware Computation Offloading in Near Memory Processing Architecture for Big Data Applications”
Session: High Performance Computing – Systems
- “Towards Efficient I/O Pipelines using Accumulated Compression”
- “Quartermaster: A Reinforcement-Learning, Resource-Recommendation System for Cloud HPC”
- “TOOL – Programming Model and Parallel Runtime for Optimization Problems”
- “Towards Enhanced I/O Performance of NVM File Systems”
- “MOSAIC: A Multi-Objective Optimization Framework for Sustainable Datacenter Management”
- “Benesh: Choreographic Coordination for In-situ Workflows”
- “Profit Maximization using Collaborative Storage Management in Multi-tier Cloud System”
Session: Data Science – Scalable Algorithms and Analytics
- “Contour Algorithm for Connected Components”
- “CAPTURE: Memory-Centric Partitioning for Distributed DNN Training with Hybrid Parallelism”
- “MiCRO: Near-Zero Cost Gradient Sparsification for Scaling and Accelerating Distributed DNN Training”
- “Patterns of Model Evolution in Network Architecture Search”
- “Flover: A Temporal Fusion Framework for Efficient Autoregressive Model Parallel Inference”
- “Performance Characterization of Containerized DNN Training and Inference on Edge Accelerators”
Session: Data Science – Scalable Systems and Software
- “SECRE: Surrogate-based Error-controlled Lossy Compression Ratio Estimation Framework”
- “Fast Algorithms for Scientific Data Compression”
- “CAPIO: a Middleware for Transparent I/O Streaming in Data-Intensive Workflows”
- “Multi-Streamed Metadata-Integrity Verification For Cloud Migration In Deduplication Systems”
- “CPU-GPU Tuning to Improve Scientific Applications run on Heterogeneous Nodes”
- “JASS: A Tunable Checkpointing System for NVM-based Systems”
- “DDIOSim: A Microarchitecture Simulator for Data Direct I/O Technology”
- “FPGA Accelerated Bi-Cubic Convolution for Image Interpolation”
Session: Best Paper Nominees
- “DeltaSPARSE: High-Performance Sparse General Matrix-Matrix Multiplication on Multi-GPU Systems”
- “Strategies for Fast I/O Throughput in Large-scale Climate Modeling Applications”
- “ME-ViT: A Single-Load Memory-Efficient FPGA Accelerator for Vision Transformers”
- “Graph Pattern Mining Paradigms: Consolidation and Renewed Bearing”
- “Accelerating Time to Science using CRADLE: A Framework for Materials Data Science”
- “Optimizing the Training of Co-Located Deep Learning Models Using Cache-Aware Staggering”