Yizhe Zhang 张轶哲



Senior Researcher at Microsoft Research

Company: yizhe.zhang (at) microsoft.com

Personal: jeremy071242044 (at) gmail.com

LinkedIn profile

Google Scholar


Room 3936, Building 99

One Microsoft Way, Redmond, WA

98052, United States.

Research interest

I work at the intersection of natural language processing (NLP), deep generative models (DGM) and reinforcement learning (RL). I am particularly interested in text generation and conversational systems. I also have broad interests in other natural language generation (NLG) tasks such as text style transfer, paraphrasing, question generation, summarization and machine translation.

Specifically, my recent research focuses on:

1). Large-scale transformer-based pretraining for text.

2). Non-autoregressive text generation.

3). Adversarial attacks for NLP.

4). Constrained/controllable/knowledge-based text generation.

5). Self-play for open-domain conversational agents.

6). Toxicity detection and prevention for NLG.

7). Long-form text understanding/generation/reasoning.

8). Other connections among deep generative models, RL, MCMC and NLP.

Please send me your CV if you are interested in the above topics and are looking for an internship.


Senior Researcher @ MSR NLP group. 2018 - present

Microsoft Research, Redmond, WA


Ph.D. in Machine Learning. 2013-2018

Advisor: Lawrence Carin

M.S. in Statistical Science. 2016-2018

Advisors: David Dunson, Scott Schmidler and Katherine Heller

Duke University, Durham, NC


Consistent Dialogue Generation with Self-supervised Feature Learning [code]

Yizhe Zhang, Xiang Gao, Sungjin Lee, Chris Brockett, Michel Galley, Jianfeng Gao, Bill Dolan

Unsupervised Common Question Generation from Multiple Documents using Reinforced Contrastive Coordinator

Woon Sang Cho, Yizhe Zhang, Sudha Rao, Asli Celikyilmaz, Chenyan Xiong, Jianfeng Gao, Mengdi Wang, Bill Dolan

POINTER: Constrained Text Generation via Insertion-based Generative Pre-training

Yizhe Zhang*, Guoyin Wang*, Chunyuan Li, Zhe Gan, Chris Brockett, Bill Dolan

A Controllable Model of Grounded Response Generation

Zeqiu Wu, Michel Galley, Chris Brockett, Yizhe Zhang, Xiang Gao, Chris Quirk, Rik Koncel-Kedziorski, Jianfeng Gao, Hannaneh Hajishirzi, Mari Ostendorf, Bill Dolan

Contextual Text Style Transfer

Yu Cheng, Zhe Gan, Yizhe Zhang, Oussama Elachqar, Dianqi Li, Jingjing Liu

Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space

Chunyuan Li, Xiang Gao, Yuan Li, Xiujun Li, Baolin Peng, Yizhe Zhang, Jianfeng Gao

Selected conference and workshop publications


DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation [code]

Yizhe Zhang, Siqi Sun, Michel Galley, Yen-Chun Chen, Chris Brockett, Xiang Gao, Jianfeng Gao, Jingjing Liu, Bill Dolan  —  system demonstration, ACL 2020

INSET: Sentence Infilling with Inter-sentential Generative Pre-training

Yichen Huang*, Yizhe Zhang*, Oussama Elachqar, Yu Cheng  —  ACL 2020

Improving Disentangled Text Representation Learning with Information Theoretical Guidance

Pengyu Cheng, Renqiang Min, Dinghan Shen, Christopher Malon, Yizhe Zhang, Yitong Li and Lawrence Carin  —  ACL 2020

Adaptive Correlated Monte Carlo for Contextual Categorical Sequence Generation

Xinjie Fan, Yizhe Zhang, Zhendong Wang, Mingyuan Zhou  —  ICLR 2020

Complementary Auxiliary Classifiers for Label-Conditional Text Generation

Yuan Li, Chunyuan Li, Yizhe Zhang, Xiujun Li, Guoqing Zheng, Lawrence Carin, Jianfeng Gao  —  AAAI 2020

Sequence Generation with Optimal-Transport-Enhanced Reinforcement Learning

Liqun Chen, Ke Bai, Chenyang Tao, Yizhe Zhang, Guoyin Wang, Wenlin Wang, Ricardo Henao, Lawrence Carin  —  AAAI 2020

Contextual Re-Ranking with Behavior Aware Transformers

Chen Qu, Chenyan Xiong, Yizhe Zhang, Corby Rosset, W. Bruce Croft and Paul Bennett  —  SIGIR 2020


Structuring latent spaces for stylized response generation [code]

Xiang Gao, Yizhe Zhang, Sungjin Lee, Michel Galley, Chris Brockett, Jianfeng Gao and Bill Dolan  —  EMNLP 2019

Domain Adaptive Text Style Transfer

Dianqi Li, Yizhe Zhang, Zhe Gan, Yu Cheng, Chris Brockett, Ming-Ting Sun and Bill Dolan  —  EMNLP 2019

Generating a Common Question from Multiple Documents using Multi-source Encoder-Decoder Models

Woon Sang Cho, Yizhe Zhang, Sudha Rao, Chris Brockett and Sungjin Lee  —  WNGT, EMNLP 2019

Towards coherent and cohesive long-form text generation

Woon Sang Cho, Pengchuan Zhang, Yizhe Zhang, Xiujun Li, Michel Galley, Chris Brockett, Mengdi Wang, Jianfeng Gao  —  Workshop on Narrative Understanding, NAACL 2019

Unsupervised Dialogue Spectrum Generation for Log Dialogue Ranking

Xinnuo Xu, Yizhe Zhang, Lars Liden and Sungjin Lee  —  SIGDIAL 2019 (Best paper nomination)

Self-Enhanced Inverse Reinforcement Learning for Text Generation

Ping Yu, Ruiyi Zhang, Chunyuan Li, Yizhe Zhang, Changyou Chen  —  Imitation, Intent, and Interaction(I3), ICML 2019

Microsoft ICECAPS: An Open-Source Toolkit for Conversation Modeling

Vighnesh Leonardo Shiv, Chris Quirk, Anshuman Suri, Xiang Gao, Khuram Shahid, Nithya Govindarajan, Yizhe Zhang, Jianfeng Gao, Michel Galley, Chris Brockett, Tulasi Menon, Bill Dolan  —  system demonstration, ACL 2019

Improving Textual Network Embedding with Global Attention via Optimal Transport

Liqun Chen, Guoyin Wang, Chenyang Tao, Dinghan Shen, Yizhe Zhang and Lawrence Carin  —  ACL 2019

Towards Generating Long and Coherent Text with Multi-Level Latent Variable Models

Dinghan Shen, Asli Celikyilmaz, Yizhe Zhang, Liqun Chen, Xin Wang, Jianfeng Gao, Lawrence Carin  —  ACL 2019

Jointly Optimizing Diversity and Relevance in Neural Response Generation. [code]

Xiang Gao, Sungjin Lee, Yizhe Zhang, Chris Brockett, Michel Galley, Jianfeng Gao, Bill Dolan  —  NAACL 2019

Improving Sequence-to-Sequence Learning via Optimal Transport. [code]

Liqun Chen, Yizhe Zhang, Ruiyi Zhang, Chenyang Tao, Zhe Gan, Haichao Zhang, Bai Li, Dinghan Shen, Changyou Chen, Lawrence Carin  —  ICLR 2019


Generating Informative and Diverse Conversational Responses via Adversarial Information Maximization. [code]

Yizhe Zhang, Michel Galley, Jianfeng Gao, Zhe Gan, Xiujun Li, Chris Brockett, Bill Dolan  —  NIPS 2018

Adversarial Text Generation via Feature-Mover's Distance.

Liqun Chen, Shuyang Dai, Chenyang Tao, Dinghan Shen, Zhe Gan, Haichao Zhang, Yizhe Zhang, Lawrence Carin  —  NIPS 2018

Multi-Domain Joint Distribution Learning with Generative Adversarial Nets.

Yunchen Pu, Shuyang Dai, Zhe Gan, Weiyao Wang, Guoyin Wang, Yizhe Zhang, Ricardo Henao, Lawrence Carin  —  ICML 2018

On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms.

Dinghan Shen, Guoyin Wang, Wenlin Wang, Martin Renqiang Min, Qinliang Su, Yizhe Zhang, Chunyuan Li, Ricardo Henao and Lawrence Carin.  —  ACL 2018

Joint Embedding of Words and Labels for Text Classification.

Guoyin Wang, Chunyuan Li, Wenlin Wang, Yizhe Zhang, Dinghan Shen, Xinyuan Zhang, Ricardo Henao and Lawrence Carin.  —  ACL 2018

Deconvolutional Latent-Variable Model for Text Sequence Matching

Dinghan Shen, Yizhe Zhang, Ricardo Henao, Qinliang Su, Lawrence Carin.  —  AAAI 2018

Zero-Shot Learning via Class-Conditioned Deep Generative Models

Wenlin Wang, Yunchen Pu, Vinay Kumar Verma, Kai Fan, Yizhe Zhang, Changyou Chen, Piyush Rai, Lawrence Carin.  —  AAAI 2018


Deconvolutional Paragraph Representation Learning [supplements] [code] [data]

Yizhe Zhang, Dinghan Shen, Guoyin Wang, Ricardo Henao, Zhe Gan, Lawrence Carin  —  NIPS 2017.

Triangle Generative Adversarial Networks [code]

Zhe Gan, Liqun Chen, Weiyao Wang, Yunchen Pu, Yizhe Zhang, Lawrence Carin  —  NIPS 2017.

Stochastic Gradient Monomial Gamma Sampler [supplements] [code]

Yizhe Zhang, Changyou Chen, Zhe Gan, Ricardo Henao, Lawrence Carin.  —  ICML 2017.

Adversarial Feature Matching for Text Generation [supplements] [code] [model] [data]

Yizhe Zhang, Zhe Gan, Kai Fan, Zhi Chen, Ricardo Henao, Lawrence Carin.  —  ICML 2017.


Towards Unifying Hamiltonian Monte Carlo and Slice Sampling [supplements] [code]

Yizhe Zhang, Xiangyu Wang, Changyou Chen, Lawrence Carin.  —  NIPS 2016.

Distributed Bayesian Learning with Stochastic Gradient MCMC.

Changyou Chen, Nan Ding, Chunyuan Li, Yizhe Zhang, Lawrence Carin.  —  NIPS 2016.

Generating Text via Adversarial Training.

Yizhe Zhang, Zhe Gan, Lawrence Carin.  —  Workshop on Adversarial Training, NIPS, 2016.

Learning a Hybrid Architecture for Sequence Regression and Annotation. [supplements]

Yizhe Zhang, Ricardo Henao, Jianling Zhong, Lawrence Carin, Alexander Hartemink  —  AAAI 2016.

Bayesian Dictionary Learning with Gaussian Processes and Sigmoid Belief Networks. [code]

Yizhe Zhang, Ricardo Henao, Chunyuan Li, Lawrence Carin.  —  IJCAI 2016.

Triply Stochastic Variational Inference for Non-linear Beta Process Factor Analysis.

Kai Fan, Yizhe Zhang, Lawrence Carin, Katherine Heller.  —  ICDM 2016.

Dynamic Poisson Factor Analysis [code]

Yizhe Zhang, Ricardo Henao, Lawrence Carin.  —  ICDM 2016

Laplacian Hamiltonian Monte Carlo

Yizhe Zhang, Changyou Chen, Ricardo Henao, Lawrence Carin  —  ECML 2016.


Learning Dictionary with Spatial and Inter-dictionary Dependency.

Yizhe Zhang, Ricardo Henao, Chunyuan Li, Lawrence Carin.  —  Workshop on representation learning, NIPS, 2015.

Journal publications

MOST+: a Motif Finding Approach Combining Genomic Sequence and Heterogeneous Genome-wide Signatures. [Source code in C++]

Yizhe Zhang, Yupeng He and Chaochun Wei. BMC Genomics, 2015.

CRF-based Transcription Factor Binding Site Finding System. [Source code in C++]

Yupeng He, Yizhe Zhang, Guangyong Zheng and Chaochun Wei. BMC Genomics, 2012.

Composition-based Classification of Short Metagenomic Sequences Elucidates the Landscapes of Taxonomic and Functional Enrichment of Microorganisms.

Jiemeng Liu, Haifeng Wang, Hongxing Yang, Yizhe Zhang, Jinfeng Wang, Fangqing Zhao and Ji Qi. Nucleic Acids Research, 2012.

Other projects

Learning Infinite Mixture of Directed Acyclic Graphs

Understanding Regulatory Element Topic via Relational Topic Modeling


Towards Improving the Efficiency and Scalability of MCMC inference

Ph.D. defense presentation at Duke. Unifying HMC and slice sampling and beyond.


  • STA 561@Duke Probabilistic Machine Learning

  • STA 571@Duke Advanced Machine Learning

Professional services

  • Meta-reviewer of AAAI (2019)

  • PC member of NIPS (2015-2019), ICML (2016-2019), AAAI (2016-2019), IJCAI (2016, 2017), AISTATS (2018), ACL (2018-2019), EMNLP (2018-2019), NAACL (2019), SIAM (2018).


Deng Cai, CUHK, Research intern 2020

Jungo Kasai, University of Washington, Research intern 2020

Guoyin Wang, Duke University, Research intern 2019

Ziyu Yao, Ohio State University, Research intern 2018

Woon Sang Cho, Princeton University, Research intern 2018 (co-mentoring)

Dinghan Shen, Duke University, Research intern 2018 (co-mentoring)


  • [Apr. 2020] Our DialoGPT paper was accepted to the ACL 2020 demo track.

  • [Apr. 2020] Two papers accepted to ACL 2020.

  • [Aug. 2019] Serving as a local web chair for ACL 2020.

  • [Aug. 2019] Our SIGDIAL paper was nominated for Best Paper.

  • [May 2019] Two papers accepted to ACL 2019.

  • [Oct. 2018] Going to Montreal for NeurIPS 2018 this December. Reach out to me if you will be there too!

  • [Aug. 2018] Our recent papers “Adversarial Text Generation via Feature-Mover’s Distance” and “Generating Informative and Diverse Conversational Responses via Adversarial Information Maximization” were accepted to NIPS 2018.

  • [Mar. 2018] Joined Microsoft Research as a full-time researcher.