Monday, Dec 19, 2022 – Day 2
Technical Session 1: Neural Networks (10:00-12:00)
(Session chair: TBD)
Split-Knit Convolution: Enabling Dense Evaluation of Transpose and Dilated Convolutions on GPUs
Arjun Menon Vadakkeveedu, Debabrata Mandal, Pradeep Ramachandran and Nitin Chandrachoodan
Low-latency Mini-batch GNN Inference on CPU-FPGA Heterogeneous Platform
Bingyi Zhang, Hanqing Zeng and Viktor Prasanna
Accelerating Broadcast Communication with GPU Compression for Deep Learning Workloads
Qinghua Zhou, Quentin Anthony, Aamir Shafi, Hari Subramoni and Dhabaleswar K. Panda
AccDP: Accelerated Data-Parallel Distributed DNN Training for Modern GPU-Based HPC Clusters
Nawras Alnaasan, Arpan Jain, Aamir Shafi, Hari Subramoni and Dhabaleswar K. Panda
Joint Partitioning and Sampling Algorithm for Scaling Graph Neural Networks
Manohar Lal Das, Vishwesh Jatala and Gagan Raj Gupta
Building a Performance Model for Deep Learning Recommendation Model Training on GPUs
Zhongyi Lin, Louis Feng, Ehsan K. Ardestani, Jaewon Lee, John Lundell, Changkyu Kim, Arun Kejariwal and John D. Owens
Technical Session 2: HPC Architecture & Communication (13:00-15:00)
(Session chair: TBD)
Accelerating Prefix Scan with in-network computing on Intel PIUMA
Kartik Lakhotia, Fabrizio Petrini, Rajgopal Kannan and Viktor Prasanna
memwalkd: Accelerating Key-value stores using Page Table Walkers
Ravi Shreyas Anupindi, Swaroop Kotni and Arkaprava Basu
Energy Consumption Evaluation of Optane DC Persistent Memory for Indexing Data Structures
Manolis Katsaragakis, Christos Baloukas, Lazaros Papadopoulos, Verena Kantere, Francky Catthoor and Dimitrios Soudris
LDT: Lightweight Dirty Tracking of Memory Pages for x86 Systems
Rohit Singh, Arun Kp and Debadatta Mishra
Designing Efficient Pipelined Communication Schemes using Compression in MPI Libraries
Bharath Ramesh, Qinghua Zhou, Aamir Shafi, Mustafa Abduljabbar, Hari Subramoni and Dhabaleswar K. Panda
Efficient Personalized and Non-Personalized Alltoall Communication for Modern Multi-HCA GPU-Based Clusters
Kaushik Kandadi Suresh, Akshay Paniraja Guptha, Benjamin Michalowicz, Bharath Ramesh, Mustafa Abduljabbar, Aamir Shafi, Hari Subramoni and Dhabaleswar K. Panda
Technical Session 3: HPC Algorithms & Solvers (15:30-17:10)
Productive Truss Analytics and Performance Optimization in Arkouda
Zhihui Du, David A. Bader, Joseph Patchett and Oliver Alvarado Rodriguez
Parallel Vertex Color Update on Large Dynamic Networks
Arindam Khanda, Sanjukta Bhowmick, Xin Liang and Sajal K. Das
IMpart: A Partitioning-based Parallel Approach to Accelerate Influence Maximization
Reet Barik, Marco Minutoli, Mahantesh Halappanavar and Ananth Kalyanaraman
Leveraging GPU Tensor Cores for Double Precision Euclidean Distance Calculations
Benoit Gallet and Michael Gowanlock
A Portable Sparse Solver Framework for Large Matrices on Heterogeneous Architectures
Md Fazlay Rabbi, Hasan Metin Aktulga, Christopher Daley and Umit Catalyurek
Performance analysis of GPU accelerated meshfree q-LSKUM solvers in Fortran, C, Python, and Julia
Nischay Mamidi, Dhruv Saxena, Kumar Prasun, Anil Nemili, Bharatkumar Sharma and Suresh Deshpande
Tuesday, Dec 20, 2022 – Day 3
Technical Session 4: High Performance and Data Science Applications (10:00-12:00)
(Session chair: TBD)
A Deep Learning-Based In Situ Analysis Framework for Tropical Cyclogenesis Prediction
Abir Mukherjee and Preeti Malakar
HiBGT: High-Performance Bayesian Group Testing for COVID-19
Weicong Chen, Xiaoyi Lu and Curtis Tatsuoka
Customer Churn Prediction in Telecommunications Industry Based on Conditional Wasserstein GAN
Chang Su, Linglin Wei and Xianzhong Xie
A Real-time Flood Inundation Prediction on SX-Aurora TSUBASA
Yoichi Shimomura, Akihiro Musa, Yoshihiko Sato, Atsuhiko Konja, Guoqing Cui, Rei Aoyagi, Keichi Takahashi and Hiroyuki Takizawa
Precise Parallel FEM-based Interactive Cutting Simulation of Deformable Bodies
Harshvardhan Das, Suraj Kumar and Subodh Kumar
Scaling the SOO Global Blackbox Optimizer on a 128-core Architecture
David Redon, Bilel Derbel and Pierre Fortin
Wednesday, Dec 21, 2022 – Day 4
Technical Session 5: HPC System Software and Libraries (10:00-12:00)
(Session chair: TBD)
A GPU-accelerated Data Transformation Framework Rooted in Pushdown Transducers
Tri Nguyen and Michela Becchi
An Algorithmic and Software Pipeline for Very Large Scale Scientific Data Compression with Error Guarantees
Tania Banerjee, Jong Choi, Jaemoon Lee, Qian Gong, Ruonan Wang, Scott Klasky, Anand Rangarajan and Sanjay Ranka
COMPLACE: Automated Thread Placement via Dynamic Binary Instrumentation for Shared-Memory Systems
Ryan Kirkpatrick, Christopher Brown and Vladimir Janjic
LuxIO: Intelligent Resource Provisioning and Auto-Configuration for Storage Services
Keith Bateman, Neeraj Rajesh, Jaime Cernuda Garcia, Luke Logan, Jie Ye, Stephen Herbein, Anthony Kougkas and Xian-He Sun
IRIS-BLAS: Towards a Performance Portable and Heterogeneous BLAS Library
Narasinga Rao Miniskar, Mohammad Alaul Haque Monil, Pedro Valero Lara, Frank Liu and Jeffrey Vetter
Towards Efficient Cache Allocation for High-Frequency Checkpointing
Avinash Maurya, Bogdan Nicolae, M. Mustafa Rafique, Amr M. Elsayed, Thierry Tonellot and Franck Cappello
Technical Session 6: Data Science Methods (13:00-15:00)
(Session chair: TBD)
1-bit LAMB: Communication Efficient Large-Scale Large-Batch Training with LAMB’s Convergence Speed
Conglong Li, Ammar Ahmad Awan, Hanlin Tang, Samyam Rajbhandari and Yuxiong He
Input Feature Pruning for Accelerating GNN Inference on Heterogeneous Platforms
Jason Yik, Sanmukh Kuppannagari, Hanqing Zeng and Viktor Prasanna
Dynamic Density based Anomaly Detection
Yash Verma
ProvScope: Efficient Differential of Distributed Provenance
Yuta Nakamura, Tanu Malik, Iyad Kanj and Ashish Gehani
Efficient Edge-Computing based Anonymous Authentication Protocol for IoV
Himani Sikarwar and Debasis Das