publications

publications by category in reverse chronological order. * denotes equal contribution.

2024

  1. APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference
    Bowen Zhao, Hannaneh Hajishirzi, and Qingqing Cao
    Jan 2024
  2. BTR: Binary Token Representations for Efficient Retrieval Augmented Language Models
    Qingqing Cao, Sewon Min, Yizhong Wang, and Hannaneh Hajishirzi
    In The Twelfth International Conference on Learning Representations, Jan 2024

2023

  1. Efficiency Pentathlon: A Standardized Arena for Efficiency Evaluation
    Hao Peng, Qingqing Cao, Jesse Dodge, Matthew E. Peters, Jared Fernandez, Tom Sherborne, Kyle Lo, Sam Skjonsberg, Emma Strubell, Darrell Plessas, Iz Beltagy, Evan Pete Walsh, Noah A. Smith, and Hannaneh Hajishirzi
    Jul 2023
  2. AdANNS: A Framework for Adaptive Semantic Search
    Aniket Rege, Aditya Kusupati, Sharan Ranjit S, Alan Fan, Qingqing Cao, Sham M. Kakade, Prateek Jain, and Ali Farhadi
    In Thirty-Seventh Conference on Neural Information Processing Systems, Nov 2023
  3. PuMer: Pruning and Merging Tokens for Efficient Vision Language Models
    Qingqing Cao, Bhargavi Paranjape, and Hannaneh Hajishirzi
    In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Jul 2023
  4. A Survey for Efficient Open Domain Question Answering
    Qin Zhang, Shangsi Chen, Dongkuan Xu, Qingqing Cao, Xiaojun Chen, Trevor Cohn, and Meng Fang
    In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Jul 2023
  5. Efficient Methods for Natural Language Processing: A Survey
    Marcos Treviso, Ji-Ung Lee, Tianchu Ji, Betty Aken, Qingqing Cao, Manuel R. Ciosici, Michael Hassid, Kenneth Heafield, Sara Hooker, Colin Raffel, Pedro H. Martins, André F. T. Martins, Jessica Zosa Forde, Peter Milder, Edwin Simpson, Noam Slonim, Jesse Dodge, Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, and Roy Schwartz
    Transactions of the Association for Computational Linguistics, Jul 2023

2022

  1. MobiVQA: Efficient On-Device Visual Question Answering
    Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., Jul 2022

2021

  1. IrEne-viz: Visualizing Energy Consumption of Transformer Models
    In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Nov 2021
  2. IrEne: Interpretable Energy Prediction for Transformers
    In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Aug 2021
  3. Are Mobile DNN Accelerators Accelerating DNNs?
    Qingqing Cao, Alexandru E. Irimiea, Mohamed Abdelfattah, Aruna Balasubramanian, and Nicholas D. Lane
    In Proceedings of the 5th International Workshop on Embedded and Mobile Deep Learning, Aug 2021

2020

  1. Towards Accurate and Reliable Energy Measurement of NLP Models
    In Proceedings of SustaiNLP: Workshop on Simple and Efficient Natural Language Processing, Nov 2020
  2. DeFormer: Decomposing Pre-trained Transformers for Faster Question Answering
    In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Jul 2020

2019

  1. DeQA: On-Device Question Answering
    Qingqing Cao, Noah Weber, Niranjan Balasubramanian, and Aruna Balasubramanian
    In Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services, Jul 2019

2017

  1. UIWear: Easily Adapting User Interfaces for Wearable Devices
    Jian Xu*, Qingqing Cao*, Aditya Prakash, Aruna Balasubramanian, and Donald E. Porter
    In Proceedings of the 23rd Annual International Conference on Mobile Computing and Networking, Jul 2017
  2. Demo: UIWear: Easily Adapting User Interfaces for Wearable Devices
    In Proceedings of the 23rd Annual International Conference on Mobile Computing and Networking, Jul 2017
  3. MobiRNN: Efficient Recurrent Neural Network Execution on Mobile GPU
    In Proceedings of the 1st International Workshop on Deep Learning for Mobile Systems and Applications, Jul 2017