  2023 (1)
xCCL: A Survey of Industry-Led Collective Communication Libraries for Deep Learning. Weingram, A.; Li, Y.; Qi, H.; Ng, D.; Dai, L.; and Lu, X. J. Comput. Sci. Technol., 38(1): 166-195. 02 2023.
  2022 (2)
A Study of Database Performance Sensitivity to Experiment Settings. Wang, Y.; Yu, M.; Hui, Y.; Zhou, F.; Huang, Y.; Zhu, R.; Ren, X.; Li, T.; and Lu, X. Proc. VLDB Endow., 15(7): 1439-1452. 2022.
Arcadia: A Fast and Reliable Persistent Memory Replicated Log. Gugnani, S.; Guthridge, S.; Schmuck, F.; Anderson, O.; Bhagwat, D.; and Lu, X. CoRR, abs/2206.12495. 2022.
  2021 (1)
Towards Offloadable and Migratable Microservices on Disaggregated Architectures: Vision, Challenges, and Research Roadmap. Lu, X.; and Kashyap, A. CoRR, abs/2104.11272. 2021.
  2020 (4)
On mass conservation and solvability of the discretized variable-density zero-Mach Navier-Stokes equations. Lu, X.; and Pantano, C. J. Comput. Phys., 404. 2020.
Understanding the Idiosyncrasies of Real Persistent Memory. Gugnani, S.; Kashyap, A.; and Lu, X. Proc. VLDB Endow., 14(4): 626-639. 2020.
Workshop 7: HPBDC High-Performance Big Data and Cloud Computing. Lu, X.; and Zhan, J. In IPDPS Workshops, pages 385, 2020. IEEE
CirroData: Yet Another SQL-on-Hadoop Data Analytics Engine with High Performance. Jin, Z.; Shi, H.; Hu, Y.; Zha, L.; and Lu, X. J. Comput. Sci. Technol., 35(1): 194-208. 2020.
  2019 (9)
Exploiting Hardware Multicast and GPUDirect RDMA for Efficient Broadcast. Chu, C.; Lu, X.; Awan, A. A.; Subramoni, H.; Elton, B.; and Panda, D. K. IEEE Trans. Parallel Distributed Syst., 30(3): 575-588. 2019.
Performance analysis of deep learning workloads using roofline trajectories. Javed, M. H.; Ibrahim, K. Z.; and Lu, X. CCF Trans. High Perform. Comput., 1(3-4): 224-239. 2019.
Early Experience in Benchmarking Edge AI Processors with Object Detection Workloads. Hui, Y.; Lien, J.; and Lu, X. In Gao, W.; Zhan, J.; Fox, G. C.; Lu, X.; and Stanzione, D., editor(s), Bench, volume 12093, of Lecture Notes in Computer Science, pages 32-48, 2019. Springer
SimdHT-Bench: Characterizing SIMD-Aware Hash Table Designs on Emerging CPU Architectures. Shankar, D.; Lu, X.; and Panda, D. K. In IISWC, pages 178-188, 2019. IEEE
SCOR-KV: SIMD-Aware Client-Centric and Optimistic RDMA-Based Key-Value Store for Emerging CPU Architectures. Shankar, D.; Lu, X.; and Panda, D. K. In HiPC, pages 257-266, 2019. IEEE
TriEC: tripartite graph based erasure coding NIC offload. Shi, H.; and Lu, X. In Taufer, M.; Balaji, P.; and Peña, A. J., editor(s), SC, pages 44:1-44:34, 2019. ACM
C-GDR: High-Performance Container-Aware GPUDirect MPI Communication Schemes on RDMA Networks. Zhang, J.; Lu, X.; Chu, C.; and Panda, D. K. In IPDPS, pages 242-251, 2019. IEEE
Introduction to HPBDC 2019. Lu, X.; Zhan, J.; and Panda, D. K. In IPDPS Workshops, pages 394, 2019. IEEE
UMR-EC: A Unified and Multi-Rail Erasure Coding Library for High-Performance Distributed Storage Systems. Shi, H.; Lu, X.; Shankar, D.; and Panda, D. K. In Weissman, J. B.; Butt, A. R.; and Smirni, E., editor(s), HPDC, pages 219-230, 2019. ACM
  2018 (15)
MR-Advisor: A comprehensive tuning, profiling, and prediction tool for MapReduce execution frameworks on HPC clusters. ur Rahman, M. W.; Islam, N. S.; Lu, X.; Shankar, D.; and Panda, D. K. J. Parallel Distributed Comput., 120: 237-250. 2018.
DLoBD: A Comprehensive Study of Deep Learning over Big Data Stacks on HPC Clusters. Lu, X.; Shi, H.; Biswas, R.; Javed, M. H.; and Panda, D. K. IEEE Trans. Multi Scale Comput. Syst., 4(4): 635-648. 2018.
Designing a Micro-Benchmark Suite to Evaluate gRPC for TensorFlow: Early Experiences. Biswas, R.; Lu, X.; and Panda, D. K. CoRR, abs/1804.01138. 2018.
Networking and communication challenges for post-exascale systems. Panda, D. K.; Lu, X.; and Subramoni, H. Frontiers Inf. Technol. Electron. Eng., 19(10): 1230-1235. 2018.
HPC AI500: A Benchmark Suite for HPC AI Systems. Jiang, Z.; Gao, W.; Wang, L.; Xiong, X.; Zhang, Y.; Wen, X.; Luo, C.; Ye, H.; Lu, X.; Zhang, Y.; Feng, S.; Li, K.; Xu, W.; and Zhan, J. In Zheng, C.; and Zhan, J., editor(s), Bench, volume 11459, of Lecture Notes in Computer Science, pages 10-22, 2018. Springer
A Survey on Deep Learning Benchmarks: Do We Still Need New Ones? Zhang, Q.; Zha, L.; Lin, J.; Tu, D.; Li, M.; Liang, F.; Wu, R.; and Lu, X. In Zheng, C.; and Zhan, J., editor(s), Bench, volume 11459, of Lecture Notes in Computer Science, pages 36-49, 2018. Springer
EC-Bench: Benchmarking Onload and Offload Erasure Coders on Modern Hardware Architectures. Shi, H.; Lu, X.; and Panda, D. K. In Zheng, C.; and Zhan, J., editor(s), Bench, volume 11459, of Lecture Notes in Computer Science, pages 215-230, 2018. Springer
Accelerating TensorFlow with Adaptive RDMA-Based gRPC. Biswas, R.; Lu, X.; and Panda, D. K. In HiPC, pages 2-11, 2018. IEEE
Analyzing, Modeling, and Provisioning QoS for NVMe SSDs. Gugnani, S.; Lu, X.; and Panda, D. K. In Sill, A.; and Spillner, J., editor(s), UCC, pages 247-256, 2018. IEEE Computer Society
OC-DNN: Exploiting Advanced Unified Memory Capabilities in CUDA 9 and Volta GPUs for Out-of-Core DNN Training. Awan, A. A.; Chu, C.; Subramoni, H.; Lu, X.; and Panda, D. K. In HiPC, pages 143-152, 2018. IEEE
Spark-uDAPL: Cost-Saving Big Data Analytics on Microsoft Azure Cloud with RDMA Networks. Lu, X.; Shankar, D.; Shi, H.; and Panda, D. K. In Abe, N.; Liu, H.; Pu, C.; Hu, X.; Ahmed, N. K.; Qiao, M.; Song, Y.; Kossmann, D.; Liu, B.; Lee, K.; Tang, J.; He, J.; and Saltz, J. S., editor(s), BigData, pages 321-326, 2018. IEEE
Cutting the Tail: Designing High Performance Message Brokers to Reduce Tail Latencies in Stream Processing. Javed, M. H.; Lu, X.; and Panda, D. K. In CLUSTER, pages 223-233, 2018. IEEE Computer Society
High-Performance Multi-Rail Erasure Coding Library over Modern Data Center Architectures: Early Experiences. Shi, H.; Lu, X.; Shankar, D.; and Panda, D. K. In SoCC, pages 530-531, 2018. ACM
Multi-Threading and Lock-Free MPI RMA Based Graph Processing on KNL and POWER Architectures. Li, M.; Lu, X.; Subramoni, H.; and Panda, D. K. In EuroMPI, pages 4:1-4:10, 2018. ACM
Introduction to HPBDC 2018. Lu, X.; Zhan, J.; and Panda, D. K. In IPDPS Workshops, pages 446, 2018. IEEE Computer Society
  2017 (21)
Scalable and Distributed Key-Value Store-based Data Management Using RDMA-Memcached. Lu, X.; Shankar, D.; and Panda, D. K. IEEE Data Eng. Bull., 40(1): 50-61. 2017.
A Comprehensive Study of MapReduce Over Lustre for Intermediate Data Placement and Shuffle Strategies on HPC Clusters. ur Rahman, M. W.; Islam, N. S.; Lu, X.; and Panda, D. K. IEEE Trans. Parallel Distributed Syst., 28(3): 633-646. 2017.
Research on Millimeter Wave Communication Interference Suppression of UAV Based on Beam Optimization. Zhong, W.; Xu, L.; Lu, X.; and Wang, L. In Gu, X.; Liu, G.; and Li, B., editor(s), MLICOM (2), volume 227, of Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, pages 472-481, 2017. Springer
Building Efficient HPC Cloud with SR-IOV-Enabled InfiniBand: The MVAPICH2 Approach. Lu, X.; Zhang, J.; and Panda, D. K. In Chaudhary, S.; Somani, G.; and Buyya, R., editor(s), Research Advances in Cloud Computing, pages 115-140. Springer, 2017.
Performance characterization and acceleration of big data workloads on OpenPOWER system. Lu, X.; Shi, H.; Shankar, D.; and Panda, D. K. In Nie, J.; Obradovic, Z.; Suzumura, T.; Ghosh, R.; Nambiar, R.; Wang, C.; Zang, H.; Baeza-Yates, R.; Hu, X.; Kepner, J.; Cuzzocrea, A.; Tang, J.; and Toyoda, M., editor(s), BigData, pages 213-222, 2017. IEEE Computer Society
Characterizing and accelerating indexing techniques on distributed ordered tables. Gugnani, S.; Lu, X.; Qi, H.; Zha, L.; and Panda, D. K. In Nie, J.; Obradovic, Z.; Suzumura, T.; Ghosh, R.; Nambiar, R.; Wang, C.; Zang, H.; Baeza-Yates, R.; Hu, X.; Kepner, J.; Cuzzocrea, A.; Tang, J.; and Toyoda, M., editor(s), BigData, pages 173-182, 2017. IEEE Computer Society
NVMD: Non-volatile memory assisted design for accelerating MapReduce and DAG execution frameworks on HPC systems. ur Rahman, M. W.; Islam, N. S.; Lu, X.; and Panda, D. K. In Nie, J.; Obradovic, Z.; Suzumura, T.; Ghosh, R.; Nambiar, R.; Wang, C.; Zang, H.; Baeza-Yates, R.; Hu, X.; Kepner, J.; Cuzzocrea, A.; Tang, J.; and Toyoda, M., editor(s), BigData, pages 369-374, 2017. IEEE Computer Society
High-Performance Virtual Machine Migration Framework for MPI Applications on SR-IOV Enabled InfiniBand Clusters. Zhang, J.; Lu, X.; and Panda, D. K. In IPDPS, pages 143-152, 2017. IEEE Computer Society
Introduction to HPBDC Workshop. Lu, X.; Zhan, J.; and Panda, D. K. In IPDPS Workshops, pages 1020, 2017. IEEE Computer Society
Swift-X: Accelerating OpenStack Swift with RDMA for Building an Efficient HPC Cloud. Gugnani, S.; Lu, X.; and Panda, D. K. In CCGrid, pages 238-247, 2017. IEEE Computer Society / ACM
HPC Meets Cloud: Building Efficient Clouds for HPC, Big Data, and Deep Learning Middleware and Applications. Panda, D. K.; and Lu, X. In Anjum, A.; Sill, A.; Fox, G. C.; and Chen, Y., editor(s), UCC, pages 189-190, 2017. ACM
Characterization of Big Data Stream Processing Pipeline: A Case Study using Flink and Kafka. Javed, M. H.; Lu, X.; and Panda, D. K. In Anjum, A.; Sill, A.; Zhao, X.; Farid, M. M.; Pallickara, S.; and Cao, J., editor(s), BDCAT, pages 1-10, 2017. ACM
Is Singularity-based Container Technology Ready for Running MPI Applications on HPC Clouds? Zhang, J.; Lu, X.; and Panda, D. K. In Anjum, A.; Sill, A.; Fox, G. C.; and Chen, Y., editor(s), UCC, pages 151-160, 2017. ACM
Scalable reduction collectives with data partitioning-based multi-leader design. Bayatpour, M.; Chakraborty, S.; Subramoni, H.; Lu, X.; and Panda, D. K. In Mohr, B.; and Raghavan, P., editor(s), SC, pages 64:1-64:11, 2017. ACM
Designing Locality and NUMA Aware MPI Runtime for Nested Virtualization based HPC Cloud with SR-IOV Enabled InfiniBand. Zhang, J.; Lu, X.; and Panda, D. K. In VEE, pages 187-200, 2017. ACM
High-Performance and Resilient Key-Value Store with Online Erasure Coding for Big Data Workloads. Shankar, D.; Lu, X.; and Panda, D. K. In Lee, K.; and Liu, L., editor(s), ICDCS, pages 527-537, 2017. IEEE Computer Society
A Scalable Network-Based Performance Analysis Tool for MPI on Large-Scale HPC Systems. Subramoni, H.; Lu, X.; and Panda, D. K. In CLUSTER, pages 354-358, 2017. IEEE Computer Society
Designing Registration Caching Free High-Performance MPI Library with Implicit On-Demand Paging (ODP) of InfiniBand. Li, M.; Lu, X.; Subramoni, H.; and Panda, D. K. In HiPC, pages 62-71, 2017. IEEE Computer Society
MPI-LiFE: Designing High-Performance Linear Fascicle Evaluation of Brain Connectome with MPI. Gugnani, S.; Lu, X.; Pestilli, F.; Caiafa, C. F.; and Panda, D. K. In HiPC, pages 213-222, 2017. IEEE Computer Society
Characterizing Deep Learning over Big Data (DLoBD) Stacks on RDMA-Capable Networks. Lu, X.; Shi, H.; Javed, M. H.; Biswas, R.; and Panda, D. K. In Hot Interconnects, pages 87-94, 2017. IEEE Computer Society
Efficient and Scalable Multi-Source Streaming Broadcast on GPU Clusters for Deep Learning. Chu, C.; Lu, X.; Awan, A. A.; Subramoni, H.; Hashmi, J. M.; Elton, B.; and Panda, D. K. In ICPP, pages 161-170, 2017. IEEE Computer Society
  2016 (19)
Characterizing and benchmarking stand-alone Hadoop MapReduce on modern HPC clusters. Shankar, D.; Lu, X.; ur Rahman, M. W.; Islam, N. S.; and Panda, D. K. J. Supercomput., 72(12): 4573-4600. 2016.
Efficient data access strategies for Hadoop and Spark on HPC cluster with heterogeneous storage. Islam, N. S.; ur Rahman, M. W.; Lu, X.; and Panda, D. K. In Joshi, J.; Karypis, G.; Liu, L.; Hu, X.; Ak, R.; Xia, Y.; Xu, W.; Sato, A.; Rachuri, S.; Ungar, L. H.; Yu, P. S.; Govindaraju, R.; and Suzumura, T., editor(s), BigData, pages 223-232, 2016. IEEE Computer Society
Can Non-volatile Memory Benefit MapReduce Applications on HPC Clusters? ur Rahman, M. W.; Islam, N. S.; Lu, X.; and Panda, D. K. In PDSW-DISCS@SC, pages 19-24, 2016. IEEE Computer Society
Mizan-RMA: Accelerating Mizan Graph Processing Framework with MPI RMA. Li, M.; Lu, X.; Hamidouche, K.; Zhang, J.; and Panda, D. K. In HiPC, pages 42-51, 2016. IEEE Computer Society
Impact of HPC Cloud Networking Technologies on Accelerating Hadoop RPC and HBase. Lu, X.; Shankar, D.; Gugnani, S.; Subramoni, H.; and Panda, D. K. In CloudCom, pages 310-317, 2016. IEEE Computer Society
Designing Virtualization-Aware and Automatic Topology Detection Schemes for Accelerating Hadoop on SR-IOV-Enabled Clouds. Gugnani, S.; Lu, X.; and Panda, D. K. In CloudCom, pages 152-159, 2016. IEEE Computer Society
Experiences and Benefits of Running RDMA Hadoop and Spark on SDSC Comet. Tatineni, M.; Lu, X.; Choi, D. J.; Majumdar, A.; and Panda, D. K. In XSEDE, pages 23:1-23:5, 2016. ACM
Performance characterization of Hadoop workloads on SR-IOV-enabled virtualized InfiniBand clusters. Gugnani, S.; Lu, X.; and Panda, D. K. In Anjum, A.; and Zhao, X., editor(s), BDCAT, pages 36-45, 2016. ACM
High Performance Design for HDFS with Byte-Addressability of NVM and RDMA. Islam, N. S.; ur Rahman, M. W.; Lu, X.; and Panda, D. K. In Ozturk, O.; Ebcioglu, K.; Kandemir, M. T.; and Mutlu, O., editor(s), ICS, pages 8:1-8:14, 2016. ACM
Performance Characterization of Hypervisor- and Container-Based Virtualization for HPC on SR-IOV Enabled InfiniBand Clusters. Zhang, J.; Lu, X.; and Panda, D. K. In IPDPS Workshops, pages 1777-1784, 2016. IEEE Computer Society
High Performance MPI Library for Container-Based HPC Cloud on InfiniBand Clusters. Zhang, J.; Lu, X.; and Panda, D. K. In ICPP, pages 268-277, 2016. IEEE Computer Society
Slurm-V: Extending Slurm for Building Efficient HPC Cloud with SR-IOV and IVShmem. Zhang, J.; Lu, X.; Chakraborty, S.; and Panda, D. K. In Dutot, P.; and Trystram, D., editor(s), Euro-Par, volume 9833, of Lecture Notes in Computer Science, pages 349-362, 2016. Springer
Designing MPI library with on-demand paging (ODP) of InfiniBand: challenges and benefits. Li, M.; Hamidouche, K.; Lu, X.; Subramoni, H.; Zhang, J.; and Panda, D. K. In West, J.; and Pancake, C. M., editor(s), SC, pages 433-443, 2016. IEEE Computer Society
High-performance design of Apache Spark with RDMA and its benefits on various workloads. Lu, X.; Shankar, D.; Gugnani, S.; and Panda, D. K. In Joshi, J.; Karypis, G.; Liu, L.; Hu, X.; Ak, R.; Xia, Y.; Xu, W.; Sato, A.; Rachuri, S.; Ungar, L. H.; Yu, P. S.; Govindaraju, R.; and Suzumura, T., editor(s), BigData, pages 253-262, 2016. IEEE Computer Society
Boldio: A hybrid and resilient burst-buffer over Lustre for accelerating big data I/O. Shankar, D.; Lu, X.; and Panda, D. K. In Joshi, J.; Karypis, G.; Liu, L.; Hu, X.; Ak, R.; Xia, Y.; Xu, W.; Sato, A.; Rachuri, S.; Ungar, L. H.; Yu, P. S.; Govindaraju, R.; and Suzumura, T., editor(s), BigData, pages 404-409, 2016. IEEE Computer Society
HPBDC Introduction and Committees. Panda, D. K.; Zhan, J.; and Lu, X. In IPDPS Workshops, pages 1596, 2016. IEEE Computer Society
High-Performance Hybrid Key-Value Store on Modern Clusters with RDMA Interconnects and SSDs: Non-blocking Extensions, Designs, and Benefits. Shankar, D.; Lu, X.; Islam, N. S.; ur Rahman, M. W.; and Panda, D. K. In IPDPS, pages 393-402, 2016. IEEE Computer Society
INAM2: InfiniBand Network Analysis and Monitoring with MPI. Subramoni, H.; Augustine, A. M.; Arnold, M. D.; Perkins, J. L.; Lu, X.; Hamidouche, K.; and Panda, D. K. In Kunkel, J. M.; Balaji, P.; and Dongarra, J. J., editor(s), ISC, volume 9697, of Lecture Notes in Computer Science, pages 300-320, 2016. Springer
MR-Advisor: A Comprehensive Tuning Tool for Advising HPC Users to Accelerate MapReduce Applications on Supercomputers. ur Rahman, M. W.; Islam, N. S.; Lu, X.; Shankar, D.; and Panda, D. K. In SBAC-PAD, pages 198-205, 2016. IEEE Computer Society
  2015 (16)
Modeling and Designing Fault-Tolerance Mechanisms for MPI-Based MapReduce Data Computing Framework. Lin, J.; Liang, F.; Lu, X.; Zha, L.; and Xu, Z. In BigDataService, pages 176-183, 2015. IEEE Computer Society
Accelerating k-NN Algorithm with Hybrid MPI and OpenSHMEM. Lin, J.; Hamidouche, K.; Zhang, J.; Lu, X.; Vishnu, A.; and Panda, D. K. In Venkata, M. G.; Shamis, P.; Imam, N.; and Lopez, M. G., editor(s), OpenSHMEM, volume 9397, of Lecture Notes in Computer Science, pages 164-177, 2015. Springer
High-Performance Coarray Fortran Support with MVAPICH2-X: Initial Experience and Evaluation. Lin, J.; Hamidouche, K.; Lu, X.; Li, M.; and Panda, D. K. In IPDPS Workshops, pages 225-234, 2015. IEEE Computer Society
Accelerating Apache Hive with MPI for Data Warehouse Systems. Chao, L.; Li, C.; Liang, F.; Lu, X.; and Xu, Z. In ICDCS, pages 664-673, 2015. IEEE Computer Society
Performance characterization and acceleration of in-memory file systems for Hadoop and Spark applications on HPC clusters. Islam, N. S.; ur Rahman, M. W.; Lu, X.; Shankar, D.; and Panda, D. K. In BigData, pages 243-252, 2015. IEEE Computer Society
Benchmarking key-value stores on high-performance storage and interconnects for web-scale workloads. Shankar, D.; Lu, X.; ur Rahman, M. W.; Islam, N. S.; and Panda, D. K. In BigData, pages 539-544, 2015. IEEE Computer Society
MVAPICH2 over OpenStack with SR-IOV: An Efficient Approach to Build HPC Clouds. Zhang, J.; Lu, X.; Arnold, M. D.; and Panda, D. K. In CCGRID, pages 71-80, 2015. IEEE Computer Society
High Performance OpenSHMEM Strided Communication Support with InfiniBand UMR. Li, M.; Hamidouche, K.; Lu, X.; Zhang, J.; Lin, J.; and Panda, D. K. In HiPC, pages 244-253, 2015. IEEE Computer Society
A Plugin-Based Approach to Exploit RDMA Benefits for Apache and Enterprise HDFS. Bhat, A.; Islam, N. S.; Lu, X.; ur Rahman, M. W.; Shankar, D.; and Panda, D. K. In Zhan, J.; Han, R.; and Zicari, R. V., editor(s), BPOE, volume 9495, of Lecture Notes in Computer Science, pages 119-132, 2015. Springer
High-Performance Design of YARN MapReduce on Modern HPC Clusters with Lustre and RDMA. ur Rahman, M. W.; Lu, X.; Islam, N. S.; Rajachandrasekar, R.; and Panda, D. K. In IPDPS, pages 291-300, 2015. IEEE Computer Society
High-Performance and Scalable Design of MPI-3 RMA on Xeon Phi Clusters. Li, M.; Hamidouche, K.; Lu, X.; Lin, J.; and Panda, D. K. In Träff, J. L.; Hunold, S.; and Versaci, F., editor(s), Euro-Par, volume 9233, of Lecture Notes in Computer Science, pages 625-637, 2015. Springer
Can RDMA benefit online data processing workloads on memcached and MySQL? Shankar, D.; Lu, X.; Jose, J.; ur Rahman, M. W.; Islam, N. S.; and Panda, D. K. In ISPASS, pages 159-160, 2015. IEEE Computer Society
High Performance MPI Datatype Support with User-Mode Memory Registration: Challenges, Designs, and Benefits. Li, M.; Subramoni, H.; Hamidouche, K.; Lu, X.; and Panda, D. K. In CLUSTER, pages 226-235, 2015. IEEE Computer Society
Triple-H: A Hybrid Approach to Accelerate HDFS on HPC Clusters with Heterogeneous Storage Architecture. Islam, N. S.; Lu, X.; ur Rahman, M. W.; Shankar, D.; and Panda, D. K. In CCGRID, pages 101-110, 2015. IEEE Computer Society
Accelerating I/O Performance of Big Data Analytics on HPC Clusters through RDMA-Based Key-Value Store. Islam, N. S.; Shankar, D.; Lu, X.; ur Rahman, M. W.; and Panda, D. K. In ICPP, pages 280-289, 2015. IEEE Computer Society
Accelerating Iterative Big Data Computing Through MPI. Liang, F.; and Lu, X. J. Comput. Sci. Technol., 30(2): 283-294. 2015.
  2014 (22)
Performance Characterization of Hadoop and Data MPI Based on Amdahl's Second Law. Liang, F.; Feng, C.; Lu, X.; and Xu, Z. In NAS, pages 207-215, 2014. IEEE Computer Society
Performance Benefits of DataMPI: A Case Study with BigDataBench. Liang, F.; Feng, C.; Lu, X.; and Xu, Z. CoRR, abs/1403.3480. 2014.
Performance Benefits of DataMPI: A Case Study with BigDataBench. Liang, F.; Feng, C.; Lu, X.; and Xu, Z. In Zhan, J.; Han, R.; and Weng, C., editor(s), BPOE@ASPLOS/VLDB, volume 8807, of Lecture Notes in Computer Science, pages 111-123, 2014. Springer
DataMPI: Extending MPI to Hadoop-Like Big Data Computing. Lu, X.; Liang, F.; Wang, B.; Zha, L.; and Xu, Z. In IPDPS, pages 829-838, 2014. IEEE Computer Society
HOMR: a hybrid approach to exploit maximum overlapping in MapReduce over high performance interconnects. ur Rahman, M. W.; Lu, X.; Islam, N. S.; and Panda, D. K. In Bode, A.; Gerndt, M.; Stenström, P.; Rauchwerger, L.; Miller, B. P.; and Schulz, M., editor(s), ICS, pages 33-42, 2014. ACM
SOR-HDFS: a SEDA-based approach to maximize overlapping in RDMA-enhanced HDFS. Islam, N. S.; Lu, X.; ur Rahman, M. W.; and Panda, D. K. In Plale, B.; Ripeanu, M.; Cappello, F.; and Xu, D., editor(s), HPDC, pages 261-264, 2014. ACM
Designing Scalable Out-of-core Sorting with Hybrid MPI+PGAS Programming Models. Jose, J.; Potluri, S.; Subramoni, H.; Lu, X.; Hamidouche, K.; Schulz, K. W.; Sundar, H.; and Panda, D. K. In Malony, A. D.; and Hammond, J. R., editor(s), PGAS, pages 7:1-7:9, 2014. ACM
Scalable MiniMD Design with Hybrid MPI and OpenSHMEM. Li, M.; Lin, J.; Lu, X.; Hamidouche, K.; Tomko, K.; and Panda, D. K. In Malony, A. D.; and Hammond, J. R., editor(s), PGAS, pages 24:1-24:4, 2014. ACM
In-memory I/O and replication for HDFS with Memcached: Early experiences. Islam, N. S.; Lu, X.; ur Rahman, M. W.; Rajachandrasekar, R.; and Panda, D. K. In Lin, J. J.; Pei, J.; Hu, X.; Chang, W.; Nambiar, R.; Aggarwal, C. C.; Cercone, N.; Honavar, V. G.; Huan, J.; Mobasher, B.; and Pyne, S., editor(s), BigData, pages 213-218, 2014. IEEE
On Big Data Benchmarking. Han, R.; and Lu, X. CoRR, abs/1402.5194. 2014.
On Big Data Benchmarking. Han, R.; Lu, X.; and Xu, J. In Zhan, J.; Han, R.; and Weng, C., editor(s), BPOE@ASPLOS/VLDB, volume 8807, of Lecture Notes in Computer Science, pages 3-18, 2014. Springer
HAND: A Hybrid Approach to Accelerate Non-contiguous Data Movement Using MPI Datatypes on GPU Clusters. Shi, R.; Lu, X.; Potluri, S.; Hamidouche, K.; Zhang, J.; and Panda, D. K. In ICPP, pages 221-230, 2014. IEEE Computer Society
High performance OpenSHMEM for Xeon Phi clusters: Extensions, runtime designs and application co-design. Jose, J.; Hamidouche, K.; Lu, X.; Potluri, S.; Zhang, J.; Tomko, K.; and Panda, D. K. In CLUSTER, pages 10-18, 2014. IEEE Computer Society
Can Inter-VM Shmem Benefit MPI Applications on SR-IOV Based Virtualized Infiniband Clusters? Zhang, J.; Lu, X.; Jose, J.; Shi, R.; and Panda, D. K. In Silva, F. M. A.; de Castro Dutra, I.; and Costa, V. S., editor(s), Euro-Par, volume 8632, of Lecture Notes in Computer Science, pages 342-353, 2014. Springer
High performance MPI library over SR-IOV enabled infiniband clusters. Zhang, J.; Lu, X.; Jose, J.; Li, M.; Shi, R.; and Panda, D. K. In HiPC, pages 1-10, 2014. IEEE Computer Society
In-memory I/O and replication for HDFS with Memcached: Early experiences. Islam, N. S.; Lu, X.; ur Rahman, M. W.; Rajachandrasekar, R.; and Panda, D. K. In Lin, J. J.; Pei, J.; Hu, X.; Chang, W.; Nambiar, R.; Aggarwal, C. C.; Cercone, N.; Honavar, V. G.; Huan, J.; Mobasher, B.; and Pyne, S., editor(s), BigData Conference, pages 213-218, 2014. IEEE
A Micro-benchmark Suite for Evaluating Hadoop MapReduce on High-Performance Networks. Shankar, D.; Lu, X.; ur Rahman, M. W.; Islam, N. S.; and Panda, D. K. In Zhan, J.; Han, R.; and Weng, C., editor(s), BPOE@ASPLOS/VLDB, volume 8807, of Lecture Notes in Computer Science, pages 19-33, 2014. Springer
MapReduce over Lustre: Can RDMA-Based Approach Benefit? ur Rahman, M. W.; Lu, X.; Islam, N. S.; Rajachandrasekar, R.; and Panda, D. K. In Silva, F. M. A.; de Castro Dutra, I.; and Costa, V. S., editor(s), Euro-Par, volume 8632, of Lecture Notes in Computer Science, pages 644-655, 2014. Springer
Scalable Graph500 design with MPI-3 RMA. Li, M.; Lu, X.; Potluri, S.; Hamidouche, K.; Jose, J.; Tomko, K.; and Panda, D. K. In CLUSTER, pages 230-238, 2014. IEEE Computer Society
Accelerating Spark with RDMA for Big Data Processing: Early Experiences. Lu, X.; ur Rahman, M. W.; Islam, N. S.; Shankar, D.; and Panda, D. K. In Hot Interconnects, pages 9-16, 2014. IEEE Computer Society
Performance Modeling for RDMA-Enhanced Hadoop MapReduce. ur Rahman, M. W.; Lu, X.; Islam, N. S.; and Panda, D. K. In ICPP, pages 50-59, 2014. IEEE Computer Society
Initial study of multi-endpoint runtime for MPI+OpenMP hybrid programming model on multi-core systems. Luo, M.; Lu, X.; Hamidouche, K.; Kandalla, K. C.; and Panda, D. K. In Moreira, J. E.; and Larus, J. R., editor(s), PPOPP, pages 395-396, 2014. ACM
  2013 (8)
Does RDMA-based enhanced Hadoop MapReduce need a new performance model? ur Rahman, M. W.; Lu, X.; Islam, N. S.; and Panda, D. K. In Lohman, G. M., editor(s), SoCC, pages 45:1-45:2, 2013. ACM
High-Performance RDMA-based Design of Hadoop MapReduce over InfiniBand. ur Rahman, M. W.; Islam, N. S.; Lu, X.; Jose, J.; Subramoni, H.; Wang, H.; and Panda, D. K. In IPDPS Workshops, pages 1908-1917, 2013. IEEE
A Micro-benchmark Suite for Evaluating Hadoop RPC on High-Performance Networks. Lu, X.; ur Rahman, M. W.; Islam, N. S.; and Panda, D. K. In Rabl, T.; Jacobsen, H.; Nambiar, R.; Poess, M.; Bhandarkar, M. A.; and Baru, C. K., editor(s), WBDB, volume 8585, of Lecture Notes in Computer Science, pages 32-42, 2013. Springer
High-Performance Design of Hadoop RPC with RDMA over InfiniBand. Lu, X.; Islam, N. S.; ur Rahman, M. W.; Jose, J.; Subramoni, H.; Wang, H.; and Panda, D. K. In ICPP, pages 641-650, 2013. IEEE Computer Society
Can Parallel Replication Benefit Hadoop Distributed File System for High Performance Interconnects? Islam, N. S.; Lu, X.; ur Rahman, M. W.; and Panda, D. K. In Hot Interconnects, pages 75-78, 2013. IEEE Computer Society
Tutorials. Panda, D. K.; and Lu, X. In Hot Interconnects, 2013. IEEE Computer Society
A scalable and portable approach to accelerate hybrid HPL on heterogeneous CPU-GPU clusters. Shi, R.; Potluri, S.; Hamidouche, K.; Lu, X.; Tomko, K.; and Panda, D. K. In CLUSTER, pages 1-8, 2013. IEEE Computer Society
SR-IOV Support for Virtualization on InfiniBand Clusters: Early Experience. Jose, J.; Li, M.; Lu, X.; Kandalla, K. C.; Arnold, M. D.; and Panda, D. K. In CCGRID, pages 385-392, 2013. IEEE Computer Society
  2012 (1)
A Micro-benchmark Suite for Evaluating HDFS Operations on Modern Clusters. Islam, N. S.; Lu, X.; ur Rahman, M. W.; Jose, J.; and Panda, D. K. In Rabl, T.; Poess, M.; Baru, C. K.; and Jacobsen, H., editor(s), WBDB, volume 8163, of Lecture Notes in Computer Science, pages 129-147, 2012. Springer
  2011 (2)
Can MPI Benefit Hadoop and MapReduce Applications? Lu, X.; Wang, B.; Zha, L.; and Xu, Z. In Sheu, J.; and Wang, C., editor(s), ICPP Workshops, pages 371-379, 2011. IEEE Computer Society
Vega LingCloud: A Resource Single Leasing Point System to Support Heterogeneous Application Modes on Shared Infrastructure. Lu, X.; Lin, J.; Zha, L.; and Xu, Z. In ISPA, pages 99-106, 2011. IEEE Computer Society
  2010 (3)
VegaWarden: A Uniform User Management System for Cloud Applications. Lin, J.; Lu, X.; Yu, L.; Zou, Y.; and Zha, L. In NAS, pages 457-464, 2010. IEEE Computer Society
JAMILA: A Usable Batch Job Management System to Coordinate Heterogeneous Clusters and Diverse Applications over Grid or Cloud Infrastructure. Peng, J.; Lu, X.; Cheng, B.; and Zha, L. In Ding, C.; Shao, Z.; and Zheng, R., editor(s), NPC, volume 6289, of Lecture Notes in Computer Science, pages 412-422, 2010. Springer
Investigating, Modeling, and Ranking Interface Complexity of Web Services on the World Wide Web. Lu, X.; Lin, J.; Zou, Y.; Peng, J.; Liu, X.; and Zha, L. In SERVICES, pages 375-382, 2010. IEEE Computer Society
  2009 (2)
A Model of Message-Based Debugging Facilities for Web or Grid Services. Yue, Q.; Lu, X.; Shan, Z.; Xu, Z.; Yu, H.; and Zha, L. In SERVICES I, pages 155-162, 2009. IEEE Computer Society
ICOMC: Invocation Complexity Of Multi-Language Clients for Classified Web Services and its Impact on Large Scale SOA Applications. Lu, X.; Zou, Y.; Xiong, F.; Lin, J.; and Zha, L. In PDCAT, pages 186-194, 2009. IEEE Computer Society
  2008 (1)
An Experimental Analysis for Memory Usage of GOS Core. Lu, X.; Yue, Q.; Zou, Y.; and Wang, X. In PDCAT, pages 33-36, 2008. IEEE Computer Society