Dustin Tran

Research Scientist at Google Brain

I am a research scientist at Google Brain. I am broadly interested in advancing science and intelligence, particularly where the ideas involve probability, programs, and/or neural nets.

I like to work simultaneously on fundamental research and on systems that accelerate it. On the systems side, this includes Edward2 for specifying probabilistic models as programs, Mesh TensorFlow for distributed computation, and Tensor2Tensor for deep learning research. Previously, I was a Ph.D. student at Columbia advised by David Blei and Andrew Gelman. I developed the original Edward language and was a member of the Stan development team.

Recently, I have been giving the following talk:

  • What Might Deep Learners Learn From Probabilistic Programming? (Slides, Video)

Curriculum Vitae



Some of my work is available as preprints on arXiv.

Analyzing the role of model uncertainty in electronic health records
How parameter uncertainty affects clinical decision-making.
Michael Dusenberry, Dustin Tran, Edward Choi, Jonas Kemp, Jeremy Nixon, Ghassen Jerfel, Katherine Heller, Andrew Dai

Discrete Flows: Invertible Generative Models for Discrete Data
How to model with discrete invertible functions.
Dustin Tran, Keyon Vafa, Kumar Krishna Agrawal, Laurent Dinh, Ben Poole
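The core idea of an invertible map on discrete data can be sketched with a modular shift, which is one of the transformations discussed in the paper. This is a minimal illustrative sketch (the function names and the fixed shift values are my own, and the paper learns the shift rather than fixing it):

```python
def discrete_flow_forward(x, shift, k):
    # One discrete-flow-style layer: an invertible map on {0, ..., k-1}
    # given by a modular shift, y_i = (x_i + shift_i) mod k.
    return [(xi + si) % k for xi, si in zip(x, shift)]

def discrete_flow_inverse(y, shift, k):
    # Exact inverse: subtract the shift modulo k.
    return [(yi - si) % k for yi, si in zip(y, shift)]
```

Because the map is a bijection on the discrete set, the change-of-variables formula needs no Jacobian term: the log-likelihood of y is just the base model's log-likelihood of the inverse image.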

Measuring calibration in deep learning
How to measure accuracy of predicted probabilities.
Jeremy Nixon, Michael Dusenberry, Linchuan Zhang, Ghassen Jerfel, Dustin Tran
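A standard starting point for measuring calibration is the expected calibration error (ECE): bin predictions by confidence and compare each bin's average confidence to its accuracy. A minimal sketch (the paper studies this metric's shortcomings and alternatives; the function name and binning scheme here are a common convention, not the paper's code):

```python
def expected_calibration_error(confidences, correct, num_bins=10):
    # ECE: bin-size-weighted average of |accuracy - avg. confidence| per bin.
    bins = [[] for _ in range(num_bins)]
    for conf, ok in zip(confidences, correct):
        # Clamp so confidence 1.0 falls in the last bin.
        idx = min(int(conf * num_bins), num_bins - 1)
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        acc = sum(o for _, o in b) / len(b)
        ece += len(b) / n * abs(acc - avg_conf)
    return ece
```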

NeuTra-lizing Bad Geometry in Hamiltonian Monte Carlo Using Neural Transport
Perform HMC over difficult geometries by transforming the space to unit Gaussian.
Matthew Hoffman, Pavel Sountsov, Joshua V. Dillon, Ian Langmore, Dustin Tran, Srinivas Vasudevan
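The transport idea can be shown with the change-of-variables formula: instead of sampling x directly, run HMC on z where x = f(z), using the pulled-back log density log p(f(z)) + log|det df/dz|. A hypothetical sketch (the helper names are my own; the paper learns f with an inverse autoregressive flow, whereas here f is a fixed affine map for illustration):

```python
import math

def neutra_logprob(logp, f, log_det_jac):
    # Pull the target density back through the transport map f:
    # HMC then runs on z, whose geometry is close to a unit Gaussian.
    def logp_z(z):
        return logp(f(z)) + log_det_jac(z)
    return logp_z

# Example: target N(mu=3, sigma=2). The map f(z) = 3 + 2z transports a
# unit Gaussian onto it, so the pulled-back density is standard normal.
target = lambda x: -0.5 * ((x - 3.0) / 2.0) ** 2 - math.log(2.0)
f = lambda z: 3.0 + 2.0 * z
pulled_back = neutra_logprob(target, f, lambda z: math.log(2.0))
```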

Bayesian Layers: A module for neural network uncertainty
A neural net-stylized primitive for distributions over functions.
Dustin Tran, Michael Dusenberry, Mark van der Wilk, Danijar Hafner

TensorFlow Distributions
A backend for efficient, composable manipulation of probability distributions.
Joshua V. Dillon, Ian Langmore, Dustin Tran, Eugene Brevdo, Srinivas Vasudevan, Dave Moore, Brian Patton, Alex Alemi, Matt Hoffman, Rif A. Saurous

Expectation propagation as a way of life: A framework for Bayesian inference on partitioned data
How to distribute inference with massive data sets and how to combine inferences from many data sets.
Andrew Gelman, Aki Vehtari, Pasi Jylänki, Tuomas Sivula, Dustin Tran, Swupnil Sahai, Paul Blomstedt, John P. Cunningham, David Schiminovich, Christian Robert

Edward: A library for probabilistic modeling, inference, and criticism
Everything and anything about probabilistic models.
Dustin Tran, Alp Kucukelbir, Adji B. Dieng, Maja Rudolph, Dawen Liang, David M. Blei

Model criticism for Bayesian causal inference
How to validate inferences from causal models.
Dustin Tran, Francisco J. R. Ruiz, Susan Athey, David M. Blei

Stochastic gradient descent methods for estimation with large data sets
Fast and statistically efficient algorithms for generalized linear models and M-estimation.
Dustin Tran, Panos Toulis, Edoardo M. Airoldi
Journal of Statistical Software, to appear
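The numerical-stability idea behind implicit (proximal) SGD can be illustrated on the simplest case, estimating a mean: the update theta_{n+1} = theta_n + gamma_n (y_n - theta_{n+1}) is solved for theta_{n+1} in closed form, so it cannot diverge for any step size. A minimal sketch under these assumptions (function name and step-size schedule are my own, not the package's API):

```python
def implicit_sgd_mean(ys, lr0=1.0):
    # Implicit SGD for a running mean: the update appears on both sides,
    #   theta_{n+1} = theta_n + gamma_n * (y_n - theta_{n+1}),
    # and solving gives theta_{n+1} = (theta_n + gamma_n * y_n) / (1 + gamma_n),
    # a shrinkage step that stays stable for any gamma_n > 0.
    theta = 0.0
    for n, y in enumerate(ys, start=1):
        gamma = lr0 / n
        theta = (theta + gamma * y) / (1.0 + gamma)
    return theta
```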


Reliable uncertainty estimates in deep neural networks using noise contrastive priors
A prior for neural networks in data space.
Danijar Hafner, Dustin Tran, Alex Irpan, Timothy Lillicrap, James Davidson
Uncertainty in Artificial Intelligence, 2019


Simple, distributed, and accelerated probabilistic programming
Probabilistic programs on TPUs.
Dustin Tran, Matthew D. Hoffman, Dave Moore, Christopher Suter, Srinivas Vasudevan, Alexey Radul, Matthew Johnson, Rif A. Saurous
Neural Information Processing Systems, 2018

Autoconj: Recognizing and exploiting conjugacy without a domain-specific language
The autointegrate analog of autodiff.
Matthew D. Hoffman, Matthew Johnson, Dustin Tran
Neural Information Processing Systems, 2018

Mesh-TensorFlow: Deep learning for supercomputers
Model parallelism made easier.
Noam Shazeer, Youlong Cheng, Niki Parmar, Dustin Tran, Ashish Vaswani, Penporn Koanantakool, Peter Hawkins, HyoukJoong Lee, Mingsheng Hong, Cliff Young, Ryan Sepassi, Blake Hechtman
Neural Information Processing Systems, 2018

Image Transformer
An image autoregressive model using only attention.
Niki Parmar, Ashish Vaswani, Jakob Uszkoreit, Lukasz Kaiser, Noam Shazeer, Alexander Ku, Dustin Tran
International Conference on Machine Learning, 2018

Implicit causal models for genome-wide association studies
Generative models applied to causality in genomics.
Dustin Tran, David M. Blei
International Conference on Learning Representations, 2018

Flipout: Efficient pseudo-independent weight perturbations on mini-batches
How to make weight perturbations in evolution strategies and variational BNNs as mini-batch-friendly as activation perturbations in dropout and batch norm.
Yeming Wen, Paul Vicol, Jimmy Ba, Dustin Tran, Roger Grosse
International Conference on Learning Representations, 2018
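The trick above can be sketched directly: every example in the mini-batch shares one Gaussian perturbation delta_w, but each gets pseudo-independent random sign vectors r and s, so its effective perturbation is (r s^T) ∘ delta_w. A hypothetical pure-Python sketch (argument names are my own, and real implementations vectorize this as ((x ∘ r) · ΔW) ∘ s):

```python
import random

def flipout_perturb(x_batch, delta_w, rng):
    # x_batch: list of input vectors; delta_w: shared perturbation matrix.
    # Each example n draws sign vectors r_n, s_n, decorrelating its
    # perturbation from the rest of the mini-batch.
    d_in, d_out = len(delta_w), len(delta_w[0])
    outs = []
    for x in x_batch:
        r = [rng.choice([-1.0, 1.0]) for _ in range(d_in)]
        s = [rng.choice([-1.0, 1.0]) for _ in range(d_out)]
        # y_j = s_j * sum_i x_i * r_i * delta_w[i][j]
        y = [s[j] * sum(x[i] * r[i] * delta_w[i][j] for i in range(d_in))
             for j in range(d_out)]
        outs.append(y)
    return outs
```

This perturbation output is added to the deterministic forward pass x · W; because sign flips leave a symmetric zero-mean Gaussian's distribution unchanged, each example's perturbation has the correct marginal distribution.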


Hierarchical implicit models and likelihood-free variational inference
Combining the idea of implicit densities with hierarchical Bayesian modeling and deep neural networks.
Dustin Tran, Rajesh Ranganath, David M. Blei
Neural Information Processing Systems, 2017

Variational inference via $\chi$-upper bound minimization
Overdispersed approximations and upper bounding the model evidence.
Adji B. Dieng, Dustin Tran, Rajesh Ranganath, John Paisley, David M. Blei
Neural Information Processing Systems, 2017

Comment, "Fast approximate inference for arbitrarily large semiparametric regression models via message passing"
The role of message passing in automated inference.
Dustin Tran, David M. Blei
Journal of the American Statistical Association, 112(517):156–158, 2017

Automatic differentiation variational inference
An automated tool for black box variational inference, available in Stan.
Alp Kucukelbir, Dustin Tran, Rajesh Ranganath, Andrew Gelman, David M. Blei
Journal of Machine Learning Research, 18(14):1–45, 2017

Deep probabilistic programming
How to build a language with rich compositionality for modeling and inference.
Dustin Tran, Matthew D. Hoffman, Rif A. Saurous, Eugene Brevdo, Kevin Murphy, David M. Blei
International Conference on Learning Representations, 2017


Operator variational inference
How to formalize computational and statistical tradeoffs in variational inference.
Rajesh Ranganath, Jaan Altosaar, Dustin Tran, David M. Blei
Neural Information Processing Systems, 2016

Hierarchical variational models
A Bayesian formalism for constructing expressive variational families.
Rajesh Ranganath, Dustin Tran, David M. Blei
International Conference on Machine Learning, 2016

Spectral M-estimation with application to hidden Markov models
Applying M-estimation for sample efficiency and robustness in moment-based estimators.
Dustin Tran, Minjae Kim, Finale Doshi-Velez
Artificial Intelligence and Statistics, 2016

Towards stability and optimality in stochastic gradient descent
A stochastic gradient method combining numerical stability and statistical efficiency.
Panos Toulis, Dustin Tran, Edoardo M. Airoldi
Artificial Intelligence and Statistics, 2016

The variational Gaussian process
A powerful variational model that can universally approximate any posterior.
Dustin Tran, Rajesh Ranganath, David M. Blei
International Conference on Learning Representations, 2016


Copula variational inference
Posterior approximations using copulas, which find meaningful dependence between latent variables.
Dustin Tran, David M. Blei, Edoardo M. Airoldi
Neural Information Processing Systems, 2015