Hi! I am a PhD student in the Department of Statistics, University of Michigan, Ann Arbor. I am fortunate to be advised by Professor Long Nguyen and Professor Ya'acov Ritov. Prior to coming to Michigan, I finished my Bachelor's degree in Mathematics and Computer Science at Ho Chi Minh University of Science, Vietnam. My undergraduate majors were Partial Differential Equations (PDE) and Numerical Analysis.

At the moment, I am interested in clustering problems, especially their theoretical properties. The two major clustering approaches that I am currently working on are mixture models and K-means. I have also begun studying the contraction of the order of mixture models, a topic that has attracted a considerable amount of attention.

Additionally, I am working with Mikhail Yurochkin to develop interesting models from a Bayesian nonparametric viewpoint. The target applications are clustering functional data and predicting new links in the Twitter network.

Moreover, I have started collaborating with Aritra Guha on challenging problems regarding mixtures of exchangeable random variables, as well as Dirichlet Process mixture models in which the kernel density includes both mean and covariance parameters. The main goal is to develop efficient algorithms as well as rigorous theories for such practical models. Finally, I have also begun working with Can Le on the convergence rates of the EM algorithm for Gaussian mixture models when both the mean and covariance are of interest. I feel very motivated and lucky to have them as my good friends and collaborators.

For a recent version of my CV, please email me.

Educational background:

- 2012 - present : PhD in Statistics, University of Michigan, Ann Arbor
- 2007 - 2011 : BS in Mathematics and Computer Science (highest distinction), Ho Chi Minh University of Science, Vietnam.

Current research interests:

- Bayesian nonparametrics
- Mixture models, clustering analysis
- Statistical learning theory
- Robust statistics

Preprints:

- Nhat Ho and XuanLong Nguyen. Identifiability and optimal rates of convergence for parameters of multiple types in finite mixtures.
*95 pages, Technical report 536, Department of Statistics, University of Michigan, Ann Arbor, January 2015*

Journal Publications:

- Nhat Ho and XuanLong Nguyen. On strong identifiability and convergence rates of parameter estimation in finite mixtures.
*Electronic Journal of Statistics*, 10(1), 271-307, 2016.
__Summary__: This paper studies identifiability and convergence behaviors for parameters of multiple types, including matrix-variate ones, that arise in finite mixtures, and the effects of model fitting with extra mixing components. We consider several notions of strong identifiability in a matrix-variate setting, and use them to establish sharp inequalities relating the distance of mixture densities to the Wasserstein distances of the corresponding mixing measures. Characterization of identifiability is given for a broad range of mixture models commonly employed in practice, including location-covariance mixtures and location-covariance-shape mixtures, for mixtures of symmetric densities, as well as some asymmetric ones. Minimax lower bounds and rates of convergence for the maximum likelihood estimates are established for such classes, which are also confirmed by simulation studies.
- Nhat Ho and XuanLong Nguyen. Convergence rates of parameter estimation for some weakly identifiable finite mixtures.
*Annals of Statistics*, 44(6), 2726-2755, 2016.
__Summary__: We establish minimax lower bounds and maximum likelihood convergence rates of parameter estimation for mean-covariance multivariate Gaussian mixtures, shape-rate Gamma mixtures, and some variants of finite mixture models, including the setting where the number of mixing components is bounded but unknown. These models belong to what we call "weakly identifiable" classes, which exhibit specific interactions among mixing parameters driven by the algebraic structures of the class of kernel densities and their partial derivatives. Accordingly, both the minimax bounds and the maximum likelihood parameter estimation rates in these models are shown to be typically much slower than the usual $n^{-1/2}$ or $n^{-1/4}$ rates of convergence.
- Nhat Ho and XuanLong Nguyen. Singularity structures and impacts on parameter estimation behavior in finite mixtures of distributions.
*Under review.*
- Technical report version that contains all the missing results in the journal version: Singularity structures and impacts on parameter estimation behavior in finite mixtures of distributions.
*85 pages.*

__Summary__: Singularities of a statistical model are the elements of the model's parameter space which make the corresponding Fisher information matrices degenerate. These are the points for which standard estimation techniques such as the maximum likelihood estimator do not admit the root-n parametric rate of convergence. We propose a general framework for the identification of singularity structures of the parameter space of finite mixtures, and study the impact of the singularity levels on minimax lower bounds and rates of convergence for the maximum likelihood estimator over a compact parameter space. Our investigation makes explicit the deep links between model singularities, parameter estimation rates and minimax bounds, and the algebraic geometry of the parameter space for mixtures of continuous distributions. The theory is applied to establish concrete convergence rates of parameter estimation for finite mixtures of skew normal distributions. This rich and increasingly popular model class is shown to exhibit a remarkably complex range of asymptotic behaviors that have not hitherto been reported in the literature.
- Nhat Ho, XuanLong Nguyen, Mikhail Yurochkin, Hung Bui, Viet Huynh, and Dinh Phung.
__Multilevel clustering via Wasserstein means__.
*To appear, Proceedings of the ICML, 2017.*
__Summary__: We propose a novel approach to the problem of multilevel clustering, which aims to simultaneously partition data in each group and discover grouping patterns among groups in a large hierarchically structured corpus of data. Our method involves a joint optimization formulation over several spaces of discrete probability measures, which are endowed with the Wasserstein distance metric. We propose a number of variants of this problem, which admit fast optimization algorithms, by exploiting the connection to the problem of finding Wasserstein barycenters. We also establish consistency properties enjoyed by our estimates of both local and global clusters. Finally, we present experimental results with both synthetic and real data to demonstrate the flexibility and scalability of the proposed approach.

Conference, Seminar, and Workshop Presentations:

- Singularity structures and parameter estimation behavior in finite mixtures of distributions.
*Nonparametric Statistics Workshop: Integration of Theory, Methods, and Applications, October, 2016, Ann Arbor, Michigan.*
- Singularity structures and impacts on parameter estimation in finite mixtures of distributions.
*Shannon Centennial Symposium, September, 2016, Ann Arbor, Michigan.*
- Singularity structures and parameter estimation behavior in finite mixtures of distributions.
*Joint Statistical Meetings (JSM), August, 2016, Chicago, Illinois.*
- Singularity structures and parameter estimation behavior in finite mixtures of distributions.
*Conference on Statistical Learning and Data Science, June, 2016, University of North Carolina at Chapel Hill.*
- Singularity structures and parameter estimation behavior in finite mixtures of distributions.
*Statistical Machine Learning Student Workshop, June, 2016, University of Michigan, Ann Arbor.*
- Singularity structures and parameter estimation in mixtures of skew normal distributions.
*Michigan Student Symposium for Interdisciplinary Statistical Sciences (MSSISS), March, 2016, Ann Arbor, MI.*
- Weak identifiability and convergence rate of mixing measures in over-fitted Gaussian mixture models.
*Student Seminar, Department of Statistics, University of Michigan, January, 2016, Ann Arbor, Michigan.*
- Intrinsic difficulties for the inference of mixing measures in finite mixtures of univariate skew normal distributions.
*From Industrial Statistics to Data Science, October, 2015, Ann Arbor, Michigan.*
- Posterior concentration of mixing parameters in some weakly identifiable finite mixture models.
*10th Conference on Bayesian Nonparametrics, June, 2015, Raleigh, North Carolina.*
- Weak identifiability and optimal rate of convergence of mixing measures in over-fitted Gaussian mixture models.
*Statistical Machine Learning Student Workshop, June, 2015, University of Michigan, Ann Arbor.*
- Weak identifiability and optimal rate of convergence of mixing measures in over-fitted Gaussian mixture models.
*NSF Conference - Statistics for Complex Systems, June, 2015, Madison, Wisconsin.*
- Optimal convergence rate of parameter estimation in overfitted finite Gaussian mixture models.
*Michigan Student Symposium for Interdisciplinary Statistical Sciences (MSSISS), March, 2015, Ann Arbor, MI.*
- Identifiability and convergence rate of parameter estimations in exact-fitted finite mixture models.
*Statistical Machine Learning Student Workshop, June, 2014, University of Michigan, Ann Arbor.*

Selected Honors and Awards:

- Conference on Statistical Learning and Data Science Travel Award, UNC Chapel Hill, 2016.
- Best Poster Award, Michigan Student Symposium for Interdisciplinary Statistical Sciences, 2016.
- NSF Conference for Complex Systems Poster Award, 2015.
- Rackham School of Graduate Studies Conference Travel Award, 2015, 2016.
- Departmental Fellowship, University of Michigan, Ann Arbor, 2012-2013.
- Highest Distinction Graduation, Ho Chi Minh University of Science, 2011.
- Odon Vallet Scholarship, Ho Chi Minh University of Science, 2008-2011.
- Outstanding Student Scholarship, Department of Mathematics and Computer Science, Ho Chi Minh University of Science, 2008-2011.

Reviewer for:

- Annals of Statistics.
- International Conference on Machine Learning (ICML).
- Neural Information Processing Systems (NIPS).

Professional Service:

- Student Assistant of the Nonparametric Statistics Workshop: Integration of Theory, Methods, and Applications, Ann Arbor, Michigan, October 2016.
- Student Assistant of the Extreme Value Analysis Conference (EVA), Ann Arbor, Michigan, June 2015.

If you are interested in my research or want to chat with me about academic or non-academic matters, you can reach me by email.
