2025
FACTORY: A Challenging Human-Verified Prompt Set for Long-Form Factuality
Mingda Chen, Yang Li, Xilun Chen, Adina Williams, Gargi Ghosh, Scott Wen-tau Yih
arXiv Preprint, 2025
arXiv / Dataset
😈ImpRAG: Retrieval-Augmented Generation with Implicit Queries
Wenzheng Zhang, Victoria Lin, Karl Stratos, Scott Wen-tau Yih, Mingda Chen
Findings of EMNLP, 2025
arXiv
Improving Factuality with Explicit Working Memory
Mingda Chen, Yang Li, Karthik Padthe, Rulin Shao, Alicia Sun, Luke Zettlemoyer, Gargi Ghosh, Scott Wen-tau Yih
Proceedings of ACL, 2025
arXiv / BibTex
Characterizing and Efficiently Accelerating Multimodal Generation Model Inference
Yejin Lee, Alicia Golden, Anna Sun, Basil Hosmer, Bilge Acun, Can Balioglu, Changhan Wang, Charles David Hernandez, Christian Puhrsch, Daniel Haziza, Driss Guessous, Francisco Massa, Jacob Kahn, Jeffrey Wan, Jeremy Reizenstein, Jiaqi Zhai, Joe Isaacson, Joel Schlosser, Juan Pino, Kaushik Ram Sadagopan, Leonid Shamis, Linjian Ma, Min-Jae Hwang, Mingda Chen, Mostafa Elhoushi, Pedro Rodriguez, Ram Pasunuru, Samuel Hsia, Scott Yih, Sravya Popuri, Xing Liu, and Carole-Jean Wu
IEEE Micro, 2025
arXiv
2024
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Meta FAIR Chameleon Team
arXiv Preprint, 2024
arXiv
Few-Shot Data Synthesis for Open-Domain Multi-Hop Question Answering
Mingda Chen, Xilun Chen, Scott Wen-tau Yih
Proceedings of EACL, 2024 (Oral)
arXiv / BibTex
2023
RA-DIT: Retrieval-Augmented Dual Instruction Tuning
Victoria Lin*, Xilun Chen*, Mingda Chen*, Weijia Shi, Maria Lomeli, Rich James, Pedro Rodriguez, Jacob Kahn, Gergely Szilvasy, Mike Lewis, Luke Zettlemoyer, Scott Wen-tau Yih
Proceedings of ICLR, 2024
arXiv / BibTex
Findings of the IWSLT 2023 Evaluation Campaign
Milind Agarwal, Sweta Agrawal, Antonios Anastasopoulos, Luisa Bentivogli, Ondrej Bojar, Claudia Borg, Marine Carpuat, Roldano Cattoni, Mauro Cettolo, Mingda Chen, William Chen, Khalid Choukri, Alexandra Chronopoulou, Anna Currey, Thierry Declerck, Qianqian Dong, Kevin Duh, Yannick Estève, Marcello Federico, Souhir Gahbiche, Barry Haddow, Benjamin Hsu, Phu Mon Htut, Hirofumi Inaguma, Dávid Javorský, John Judge, Yasumasa Kano, Tom Ko, Rishu Kumar, Pengwei Li, Xutai Ma, Prashant Mathur, Evgeny Matusov, Paul McNamee, John P. McCrae, Kenton Murray, Maria Nadejde, Satoshi Nakamura, Matteo Negri, Ha Nguyen, Jan Niehues, Xing Niu, Atul Kr. Ojha, John E. Ortega, Proyag Pal, Juan Pino, Lonneke van der Plas, Peter Polák, Elijah Rippeth, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Yun Tang, Brian Thompson, Kevin Tran, Marco Turchi, Alex Waibel, Mingxuan Wang, Shinji Watanabe, and Rodolfo Zevallos
Proceedings of IWSLT, 2023
PDF / BibTex
BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric
Mingda Chen, Paul-Ambroise Duquenne, Pierre Andrews, Justine Kao, Alexandre Mourachko, Holger Schwenk, Marta R. Costa-jussà
Proceedings of ACL, 2023
arXiv / BibTex
xSIM: An Improved Proxy to Bitext Mining Performance for Low-Resource Languages
Mingda Chen*, Kevin Heffernan*, Onur Çelebi, Alexandre Mourachko, Holger Schwenk
Proceedings of ACL, 2023 (Oral)
arXiv / BibTex
2022
Leveraging Natural Supervision for Language Representation Learning and Generation
Mingda Chen
PhD Thesis, 2022
arXiv / BibTex
Improving In-Context Few-Shot Learning via Self-Supervised Training
Mingda Chen, Jingfei Du, Ramakanth Pasunuru, Todor Mihaylov, Srini Iyer, Veselin Stoyanov, Zornitsa Kozareva
Proceedings of NAACL, 2022
arXiv / Poster / Slides / BibTex
SummScreen: A Dataset for Abstractive Screenplay Summarization
Mingda Chen, Zewei Chu, Sam Wiseman, Kevin Gimpel
Proceedings of ACL, 2022 (Oral)
arXiv / Poster / Slides / Data / BibTex
2021
TVRecap: A Dataset for Generating Stories with Character Descriptions
Mingda Chen, Kevin Gimpel
arXiv Preprint, 2021
arXiv / Data / BibTex
WikiTableT: A Large-Scale Data-to-Text Dataset for Generating Wikipedia Article Sections
Mingda Chen, Sam Wiseman, Kevin Gimpel
Findings of ACL, 2021
arXiv / Code / BibTex
2020
Exemplar-Controllable Paraphrasing and Translation using Bitext
Mingda Chen, Sam Wiseman, Kevin Gimpel
arXiv Preprint, 2020
arXiv / Data / BibTex
Mining Knowledge for Natural Language Inference from Wikipedia Categories
Mingda Chen*, Zewei Chu*, Karl Stratos, Kevin Gimpel
Findings of EMNLP, 2020
arXiv / Code / BibTex
Learning Probabilistic Sentence Representations from Paraphrases
Mingda Chen, Kevin Gimpel
Proceedings of RepL4NLP at ACL, 2020
arXiv / Slides / BibTex
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut
Proceedings of ICLR, 2020 (Spotlight)
arXiv / Code / BibTex
How to Ask Better Questions? A Large-Scale Multi-Domain Dataset for Rewriting Ill-Formed Questions
Zewei Chu, Mingda Chen*, Jing Chen*, Miaosen Wang*, Kevin Gimpel, Manaal Faruqui, Xiance Si
Proceedings of AAAI, 2020 (Oral)
arXiv / Data / BibTex
2019
EntEval: A Holistic Evaluation Benchmark for Entity Representations
Mingda Chen*, Zewei Chu*, Yang Chen, Karl Stratos, Kevin Gimpel
Proceedings of EMNLP, 2019
arXiv / Poster / Code / BibTex
Evaluation Benchmarks and Learning Criteria for Discourse-Aware Sentence Representations
Mingda Chen*, Zewei Chu*, Kevin Gimpel
Proceedings of EMNLP, 2019 (Oral)
arXiv / Slides / Code / BibTex
Controllable Paraphrase Generation with a Syntactic Exemplar
Mingda Chen, Qingming Tang, Sam Wiseman, Kevin Gimpel
Proceedings of ACL, 2019
arXiv / Poster / Code / Train and Eval Data / BibTex
A Multi-Task Approach for Disentangling Syntax and Semantics in Sentence Representations
Mingda Chen, Qingming Tang, Sam Wiseman, Kevin Gimpel
Proceedings of NAACL-HLT, 2019
arXiv / Poster / 1-Minute Slides / Code / Train Data / Eval Data / BibTex
2018
Variational Sequential Labelers for Semi-Supervised Learning
Mingda Chen, Qingming Tang, Karen Livescu, Kevin Gimpel
Proceedings of EMNLP, 2018 (Oral)
PDF / Appendix / Slides / Code / BibTex
Smaller Text Classifiers with Discriminative Cluster Embeddings
Mingda Chen, Kevin Gimpel
Proceedings of NAACL-HLT, 2018
PDF / Poster / Code / BibTex