Curriculum Vitae

Research Interests

Code LLM & Agents: Building intelligent coding assistants and autonomous agents that understand and generate code

Long-Horizon Planning: Enabling LLMs to perform complex, multi-step reasoning and planning over extended sequences

RAG & Reasoning: Retrieval-augmented generation and reasoning systems using continuous token representations

Text Diffusion Models: Advancing non-autoregressive generation through diffusion-based approaches

Coding-Based AI Scientist: Developing AI systems that can autonomously discover knowledge through code

Professional Experience

May 2022 – Present

Staff Research Scientist

Apple MLR, Cupertino, CA

Manager: Navdeep Jaitly

Using the code domain as a testbed for planning and reasoning. Recent work includes DiffuCoder (diffusion-based code generation) and SWE-Agent (autonomous code agents).

September 2021 – May 2022

Research Scientist

Facebook AI (Meta AI), Menlo Park, CA

Manager: Yashar Mehdad

Worked on retrieval-augmented generation (RAG) and summarization systems.

February 2018 – September 2021

Senior Researcher

Microsoft Research, Redmond, WA

Manager: Bill Dolan

Worked on generative dialogue models. Created DialoGPT, an open-sourced pretrained chat model.

Recent Talks

October 2025
Towards Understanding and Building Intuition for Language Model
• University of Pennsylvania, Guest lecturer
• BAIR NLP workshop, UC Berkeley, Invited speaker
• University of Washington, NLP seminar, Invited speaker
July 2025
Bidirectional Language Model
• Apple Natural Language Understanding workshop, Invited speaker

Selected Preprints

Jie He, Richard He Bai, Sinead Williamson, Jeff Z. Pan, Navdeep Jaitly, Yizhe Zhang. CLaRa: Bridging Retrieval and Generation with Continuous Latent Reasoning. arXiv (2025)

Haoqiang Kang, Yizhe Zhang, Nikki Lijing Kuang, Nicklas Majamaki, Navdeep Jaitly, Yi-An Ma, Lianhui Qin. LaDiR: Latent Diffusion Enhances LLMs for Text Reasoning. arXiv (2025)

Huangjie Zheng, Shansan Gong, Ruixiang Zhang, Tianrong Chen, Jiatao Gu, Mingyuan Zhou, Navdeep Jaitly, Yizhe Zhang. Continuously Augmented Discrete Diffusion Model for Categorical Generative Modeling. arXiv (2025)

Amin Karimi Monse, Nikhil Bhendawade, Manuel Rafael Ciosici, Dominic Culver, Yizhe Zhang, Irina Belousova. FS-DFM: Fast and Accurate Long Text Generation with Few-Step Diffusion Language Models. arXiv (2025)

Shansan Gong, Ruixiang Zhang, Huangjie Zheng, Jiatao Gu, Navdeep Jaitly, Lingpeng Kong, Yizhe Zhang. DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation. arXiv (2025)

Wei Liu, Ruochen Zhou, Yiyun Deng, Yuzhen Huang, Junteng Liu, Yuntian Deng, Yizhe Zhang, Junxian He. Learn to Reason Efficiently with Adaptive Length based Reward Shaping. arXiv (2025)

Deepro Choudhury, Sinead Williamson, Adam Goliński, Ning Miao, Freddie Bickford Smith, Michael Kirchhof, Yizhe Zhang, Tom Rainforth. BED-LLM: Intelligent Information Gathering with LLMs and Bayesian Experimental Design. arXiv (2025)

Yizhe Zhang, Richard Bai, Zijin Gu, Ruixiang Zhang, Jiatao Gu, Emmanuel Abbe, Samy Bengio, Navdeep Jaitly. What makes the preferred thinking direction for LLM in Multi-choice Questions? arXiv (2025)

Xiaogeng Liu, Zhiyuan Yu, Yizhe Zhang, Ning Zhang, Chaowei Xiao. Automatic and Universal Prompt Injection Attacks Against Large Language Models. arXiv (2024)

Professional Services

Area Chair / Senior PC

NeurIPS (since 2020)
ICML (since 2022)
ICLR (since 2023)
ACL (2020-2021)
EMNLP (2022)
NAACL (2023)
AAAI (2018-2021)

Editorial Roles

Action Editor, TMLR (since 2023)
Action Editor, ARR (since 2023)

Organization

Organizing Committee Member, ACL 2020

Reviewer

NeurIPS, ICML, ICLR
ACL, EMNLP, NAACL
AISTATS, AAAI

Awards & Honors

Stanford Top 2% Scientists (since 2023)
NeurIPS Top 5% Reviewer Award (2018)
Department Fellowship, Duke University (2013-2014)
National Excellent Graduate Scholarship - Top 1% (2012)
Travel Awards: NeurIPS (2015, 2016), ICML (2017), ICDM (2016), IJCAI (2016), AAAI (2016)

Teaching Experience

Advanced Machine Learning (STA571)

Duke University

Instructor: Katherine Heller

Probabilistic Machine Learning (CS571)

Duke University

Instructor: Cynthia Rudin

Technical Skills

PyTorch, TensorFlow, Python, C/C++, Java, Lua, MATLAB, R