  2024 (4)
[ICCD'24] PCCL: Energy-efficient LLM Training with Power-aware Collective Communication.
Ziyang Jia, Laxmi Bhuyan, & Daniel Wong.
In Proceedings of the 2024 IEEE International Conference on Computer Design (ICCD), 2024. (Acceptance Rate: 25.0%)
[ISPASS'24] Characterizing In-Kernel Observability of Latency-Sensitive Request-level Metrics with eBPF.
Mohammadreza Rezvani, Ali Jahanshahi, & Daniel Wong.
In Proceedings of the 2024 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2024. (Best Paper Nominee)
[ECRTS'24] GCAPS: GPU Context-Aware Preemptive Priority-based Scheduling for Real-Time Tasks.
Yidi Wang, Cong Liu, Daniel Wong, & Hyoseung Kim.
In Proceedings of the 36th Euromicro Conference on Real-Time Systems (ECRTS), 2024.
[HotCarbon'24] Geographical Server Relocation: Opportunities and Challenges.
Yejia Liu, Pengfei Li, Daniel Wong, & Shaolei Ren.
In Proceedings of the 3rd Workshop on Sustainable Computer Systems (HotCarbon), 2024.
  2023 (4)
[HPCA'23] KRISP: Enabling Kernel-wise Right-sizing for Spatial Partitioned GPU Inference Servers.
Marcus Chow, Ali Jahanshahi, & Daniel Wong.
In Proceedings of the 29th IEEE International Symposium on High Performance Computer Architecture (HPCA), 2023. (Acceptance Rate: 25.0%)
[IGSC'23] CoFRIS: Coordinated Frequency and Resource Scaling for GPU Inference Servers.
Marcus Chow, & Daniel Wong.
In Proceedings of the 14th International Green and Sustainable Computing Conference (IGSC), 2023.
[IGSC'23] WattWiser: Power Resource-Efficient Scheduling for Multi-Model Multi-GPU Inference Servers.
Ali Jahanshahi, Mohammadreza Rezvani, & Daniel Wong.
In Proceedings of the 14th International Green and Sustainable Computing Conference (IGSC), 2023.
[AI4Dev'23] VSCuda: LLM based CUDA extension for Visual Studio Code.
Brian Chen, Nafis Mustakin, Alvin Hoang, Sakib Fuad, & Daniel Wong.
In First Workshop on AI Assisted Software Development for HPC (AI4Dev), 2023.
  2022 (3)
[ACM TACO'22] PowerMorph: QoS-Aware Server Power Reshaping For Data Center Regulation Service.
Ali Jahanshahi, Nanpeng Yu, & Daniel Wong.
ACM Transactions on Architecture and Code Optimization (TACO), Volume 19(Issue 3): 1–27. September 2022.
[ISPASS'22] GPUCalorie: Floorplan Estimation for GPU Thermal Evaluation.
Marcus Chow, Ali Jahanshahi, Ana Cardenas Beltran, Sheldon Tan, & Daniel Wong.
In Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2022. (Poster)
[GPGPU'22] ScaleServe: A Scalable Multi-GPU Machine Learning Inference System And Benchmarking Suite.
Ali Jahanshahi, Marcus Chow, & Daniel Wong.
In Proceedings of the 14th Workshop on General Purpose Processing Using GPU (GPGPU), 2022. (Short paper)
  2021 (7)
[ISCA'21] BlockMaestro: Enabling Programmer-Transparent Task-Based Execution In GPU Systems.
AmirAli Abdolrashidi, Hodjat Asghari Esfeden, Ali Jahanshahi, Kaustubh Singh, Nael Abu-Ghazaleh, & Daniel Wong.
In Proceedings of the 48th ACM/IEEE International Symposium on Computer Architecture (ISCA), 2021. (Acceptance Rate: 18.7%)
[SC'21] MAPA: Multi-Accelerator Pattern Allocation Policy For Multi-Tenant GPU Servers.
Kiran Ranganath, Joshua D Suetterlein, Joseph B Manzano, Shuaiwen Leon Song, & Daniel Wong.
In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2021. (Acceptance Rate: 26.8%)
[ACM TACO'21] PAVER: Locality Graph-Based Thread Block Scheduling For GPUs.
Devashree Tripathy, Amirali Abdolrashidi, Laxmi Narayan Bhuyan, Liang Zhou, & Daniel Wong.
ACM Transactions on Architecture and Code Optimization (TACO), Volume 18(Issue 3): 1–26. June 2021.
[NAS'21] LocalityGuru: A PTX Analyzer For Extracting Thread Block-Level Locality In GPGPUs.
Devashree Tripathy, Amirali Abdolrashidi, Quan Fan, Daniel Wong, & Manoranjan Satpathy.
In Proceedings of the 15th IEEE International Conference on Networking, Architecture and Storage (NAS), 2021.
[NAS'21] ICAP: Designing Inrush Current Aware Power Gating Switch For GPGPU.
Hadi Zamani, Devashree Tripathy, Ali Jahanshahi, & Daniel Wong.
In Proceedings of the 15th IEEE International Conference on Networking, Architecture and Storage (NAS), 2021.
[LCPC'21] LC-MEMENTO: A Memory Model for Accelerated Architectures.
Kiran Ranganath, Jesun Firoz, Joshua Suetterlein, Joseph Manzano, Andres Marquez, Mark Raugas, & Daniel Wong.
In Languages and Compilers for Parallel Computing (LCPC), 2021.
[RSDHA'21] Energy Efficient Task Graph Execution Using Compute Unit Masking In GPUs.
Marcus Chow, Kiran Ranganath, Robert Lerias, Mika Shanela Carodan, & Daniel Wong.
In Workshop on Redefining Scalability for Diversely Heterogeneous Architectures (RSDHA), 2021.
  2020 (3)
[MICRO'20] BOW: Breathing Operand Windows To Exploit Bypassing In GPUs.
Hodjat Asghari Esfeden, Amirali Abdolrashidi, Shafiur Rahman, Daniel Wong, & Nael Abu-Ghazaleh.
In Proceedings of the 53rd IEEE/ACM International Symposium on Microarchitecture (MICRO), 2020. (Acceptance Rate: 19.4%)
[IEEE CAL'20] GPU-NEST: Characterizing Energy Efficiency Of Multi-GPU Inference Servers.
Ali Jahanshahi, Hadi Zamani Sabzi, Chester Lau, & Daniel Wong.
IEEE Computer Architecture Letters, Volume 19(Issue 2): 139–142. 2020.
[FCCM'20] High-Performance Parallel Radix Sort On FPGA.
Bashar Romanous, Mohammadreza Rezvani, Junjie Huang, Daniel Wong, Evangelos E Papalexakis, Vassilis J Tsotras, & Walid Najjar.
In Proceedings of the 28th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2020. (poster)
  2019 (6)
[ASPLOS'19] CORF: Coalescing Operand Register File For GPUs.
Hodjat Asghari Esfeden, Farzad Khorasani, Hyeran Jeon, Daniel Wong, & Nael Abu-Ghazaleh.
In Proceedings of the 24th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2019. (Acceptance Rate: 21.1%)
[HPCA'19] μDPM: Dynamic Power Management For The Microsecond Era.
Chih-Hsun Chou, Laxmi N Bhuyan, & Daniel Wong.
In Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture (HPCA), 2019. (Acceptance Rate: 19.7%)
[IEEE CAL'19] Speeding Up Collective Communications Through Inter-GPU Re-Routing.
Kiran Ranganath, AmirAli Abdolrashidi, Shuaiwen Leon Song, & Daniel Wong.
IEEE Computer Architecture Letters, Volume 18(Issue 2): 128–131. 2019.
[IEEE CAL'19] Locality-Aware GPU Register File.
Hyeran Jeon, Hodjat Asghari Esfeden, Nael Abu-Ghazaleh, Daniel Wong, & Sindhuja Elango.
IEEE Computer Architecture Letters, Volume 18(Issue 2): 153–156. 2019.
[Applied Energy'19] Frequency Regulation Service Provision In Data Center With Computational Flexibility.
Wei Wang, Amirali Abdolrashidi, Nanpeng Yu, & Daniel Wong.
Applied Energy, Volume 251. October 2019. (IF: 8.4)
[SMACD'19] Long-Term Reliability Management For Multitasking GPGPUs.
Zeyu Sun, Taeyoung Kim, Marcus Chow, Shaoyi Peng, Han Zhou, Hyoseung Kim, Daniel Wong, & Sheldon X-D Tan.
In Proceedings of the 16th International Conference on Synthesis, Modeling, Analysis and Simulation Methods and Applications to Circuit Design (SMACD), 2019.
  2018 (2)
[IPDPS'18] Joint Server And Network Energy Saving In Data Centers For Latency-Sensitive Applications.
Liang Zhou, Chih-Hsun Chou, Laxmi N Bhuyan, KK Ramakrishnan, & Daniel Wong.
In Proceedings of the 32nd IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2018. (Acceptance Rate: 24.5%)
[ISLPED'18] Load-Triggered Warp Approximation On GPU.
Zhenhong Liu, Daniel Wong, & Nam Sung Kim.
In Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED), 2018. (Acceptance Rate: 23.3%)
  2017 (1)
[MICRO'17] Wireframe: Supporting Data-Dependent Parallelism Through Dependency Graph Execution In GPUs.
AmirAli Abdolrashidi, Devashree Tripathy, Mehmet Esat Belviranli, Laxmi Narayan Bhuyan, & Daniel Wong.
In Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2017. (Acceptance Rate: 18.6%)
  2016 (6)
[ISCA'16] Peak Efficiency Aware Scheduling for Highly Energy Proportional Servers.
Daniel Wong.
In Proceedings of the 43rd ACM/IEEE International Symposium on Computer Architecture (ISCA), 2016. (Acceptance Rate: 19.5%)
[HPCA'16] Approximating Warps with Intra-Warp Operand Value Similarity.
Daniel Wong, Nam Sung Kim, & Murali Annavaram.
In Proceedings of the 22nd IEEE International Symposium on High Performance Computer Architecture (HPCA), 2016. (Acceptance Rate: 22%)
[ICS'16] Origami: Folding Warps For Energy Efficient GPUs.
Mohammad Abdel-Majeed, Daniel Wong, Justin Kuang, & Murali Annavaram.
In Proceedings of the ACM International Conference on Supercomputing (ICS), 2016. (Acceptance Rate: 24%)
[ISLPED'16] DynSleep: Fine-Grained Power Management For A Latency-Critical Data Center Application.
Chih-Hsun Chou, Daniel Wong, & Laxmi N Bhuyan.
In Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED), 2016. (Acceptance Rate: 23%)
[DAC'16] Invited - Cross-Layer Modeling And Optimization For Electromigration Induced Reliability.
Taeyoung Kim, Zeyu Sun, Chase Cook, Hengyang Zhao, Ruiwen Li, Daniel Wong, & Sheldon X-D Tan.
In Proceedings of the 53rd Annual Design Automation Conference (DAC), 2016.
[SBAC-PAD'16] STOMP: Statistical Techniques For Optimizing And Modeling Performance Of Blocked Sparse Matrix Vector Multiplication.
Steena Monteiro, Forrest Iandola, & Daniel Wong.
In Proceedings of the 28th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 2016.
  2015 (1)
[IISWC'15] A Retrospective Look Back On The Road Towards Energy Proportionality.
Daniel Wong, Julia Chen, & Murali Annavaram.
In Proceedings of the 2015 IEEE International Symposium on Workload Characterization (IISWC), 2015. (Short paper with presentation)
  2014 (1)
[HPCA'14] Implications of High Energy Proportional Servers on Cluster-Wide Energy Proportionality.
Daniel Wong, & Murali Annavaram.
In Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture (HPCA), 2014. (Acceptance Rate: 25.6%)
  2013 (2)
[Top Picks'13] Scaling The Energy Proportionality Wall With KnightShift.
Daniel Wong, & Murali Annavaram.
IEEE Micro's "Top Picks from the Computer Architecture Conferences of 2012", Volume 33(Issue 3): 28–37. 2013.
[MICRO'13] Warped Gates: Gating Aware Scheduling and Power Gating For GPGPUs.
Mohammad Abdel-Majeed*, Daniel Wong*, & Murali Annavaram.
In Proceedings of the 46th IEEE/ACM International Symposium on Microarchitecture (MICRO), 2013. (Acceptance Rate: 16.3%)
* Authors contributed equally
  2012 (2)
[MICRO'12] KnightShift: Scaling the Energy Proportionality Wall through Server-Level Heterogeneity.
Daniel Wong, & Murali Annavaram.
In Proceedings of the 45th IEEE/ACM International Symposium on Microarchitecture (MICRO), 2012. (Acceptance Rate: 17.5%)
Selected as 1 of 11 IEEE Micro Top Picks in Computer Architecture, 2013
[WEED'12] Evaluating A Prototype KnightShift-enabled server.
Daniel Wong, & Murali Annavaram.
In Workshop on Energy-Efficient Design (WEED), 2012.
  2010 (4)
[MICRO'10] Adaptive and Speculative Slack Simulations of CMPs on CMPs.
Jianwei Chen, Lakshmi Kumar Dabbiru, Daniel Wong, Murali Annavaram, & Michel Dubois.
In Proceedings of the 43rd IEEE/ACM International Symposium on Microarchitecture (MICRO), 2010. (Acceptance Rate: 17.4%)
[FDG'10] Implementing Games On Pinball Machines.
Daniel Wong, Darren Earl, Fred Zyda, Ryan Zink, Sven Koenig, Allen Pan, Selby Shlosberg, Jaspreet Singh, & Nathan Sturtevant.
In Proceedings of the Fifth International Conference on the Foundations of Digital Games (FDG), 2010. (Acceptance Rate: 34%)
[AAAI Spring'10] Teaching Robotics And Computer Science With Pinball Machines.
Daniel Wong, Darren Earl, Fred Zyda, & Sven Koenig.
In Papers of the 2010 AAAI Spring Symposium Series, 2010.
[EAAI'10] Teaching Artificial Intelligence and Robotics Via Games.
Daniel Wong, Ryan Zink, & Sven Koenig.
In Proceedings of the First AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI), 2010.