IMS Workshop on Statistics and Computing

Speaker: Xiyun Jiao (Southern University of Science and Technology, China)

Time: 8:40-9:10

Title: Super-Efficient Markov Chain Monte Carlo Algorithms for Bayesian Inference in Population Genomics

Abstract: Bayesian inference in population genomics and phylogenomics under the multispecies coalescent (MSC) model involves intense computation. The mixing efficiency of Markov chain Monte Carlo (MCMC) algorithms is inversely proportional to the computational effort required to achieve a certain precision. We describe a few super-efficient MCMC algorithms, which achieve mixing efficiency higher than that of the independent sampler. The first, called the mirror-Metropolized-Gibbs (MMG) move, proposes new values of the ‘driver’ parameters around the mirror image of the current values, while the ‘passenger’ parameters are sampled from their conditional distribution. The second, called Bactrian-uniform-reflection (BUR), combines a Bactrian sliding window for the uniform target with a variable transform to achieve super efficiency in the driver parameter. We apply the new algorithms to a problem in population genomics, where they perform favourably compared with other state-of-the-art MCMC algorithms.
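
The idea of proposing around the mirror image of the current value can be illustrated with a minimal one-dimensional sketch. This is not the MMG or BUR move from the talk: it is a plain Metropolis sampler with a Gaussian mirror proposal, and the names `mirror_metropolis`, `mu_star`, and `sigma` are hypothetical, chosen here for illustration. The sketch assumes a rough guess `mu_star` of the target mean is available.

```python
import math
import random


def mirror_metropolis(log_target, x0, mu_star, sigma, n_samples, seed=0):
    """1-D Metropolis sampler with a Gaussian mirror proposal (illustrative).

    mu_star is a rough guess of the target mean; this is an assumption of
    the sketch, not something taken from the abstract.
    """
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        # Propose around the mirror image of x about mu_star.
        x_new = rng.gauss(2.0 * mu_star - x, sigma)
        # The proposal density is symmetric in (x, x_new), so the
        # acceptance ratio reduces to target(x_new) / target(x).
        log_ratio = log_target(x_new) - log_target(x)
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x = x_new
        samples.append(x)
    return samples


def lag1_autocorrelation(xs):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((v - mean) ** 2 for v in xs)
    cov = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    return cov / var


if __name__ == "__main__":
    # Toy target: standard normal, log density up to an additive constant.
    draws = mirror_metropolis(lambda x: -0.5 * x * x, x0=0.5,
                              mu_star=0.0, sigma=0.5, n_samples=20000)
    print("mean:", sum(draws) / len(draws))
    print("lag-1 autocorrelation:", lag1_autocorrelation(draws))
```

On this toy target, consecutive draws tend to fall on opposite sides of the mean, so the lag-1 autocorrelation is negative. Negatively correlated draws are what allow an estimate of the mean to be more precise than one based on the same number of independent draws.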

Bio: Xiyun Jiao has been an assistant professor in the Department of Statistics and Data Science at Southern University of Science and Technology, China, since September 2020. She obtained her Ph.D. degree in Statistics in 2017 from Imperial College London, and worked as a postdoctoral research associate in the Bioscience Division, University College London, from 2017 to 2020. Dr. Jiao's research interests include computational statistics, Bayesian statistics, and probabilistic and statistical methods in population genetics and phylogenetics.

Speaker: Kai Kang (Sun Yat-sen University, China)

Time: 9:10-9:40

Title: A Blockwise Mixed Membership Model for Multivariate Longitudinal Data: Discovering Clinical Heterogeneity and Identifying Parkinson’s Disease Subtypes

Abstract: Current diagnosis and prognosis for Parkinson’s disease (PD) face formidable challenges due to the heterogeneous nature of the disease course: (i) the impairment severity varies hugely between patients, (ii) whether a symptom occurs independently or co-occurs with related symptoms differs significantly, and (iii) repeated symptom measurements exhibit substantial temporal dependence. To tackle these challenges, we propose a novel blockwise mixed membership model (BM3) to systematically unveil between-patient, between-symptom, and between-time clinical heterogeneity within PD. The key idea behind BM3 is to partition multivariate longitudinal measurements into distinct blocks, enabling measurements within each block to share a common latent membership while allowing latent memberships to vary across blocks. Consequently, the heterogeneous PD-related measurements across time are divided into clinically homogeneous blocks consisting of correlated symptoms and consecutive time points. From the analysis of Parkinson’s Progression Markers Initiative data (n = 1,531), we discover three typical disease profiles (stages), four symptom groups (i.e., autonomic function, tremor, left-side and right-side motor function), and two periods, advancing the comprehension of PD heterogeneity. Moreover, we identify several clinically meaningful PD subtypes by summarizing the blockwise latent memberships, paving the way for developing more precise and targeted therapies to benefit patients. Our findings are validated using external variables, successfully reproduced in validation datasets, and compared with existing methods. Theoretical results on model identifiability further ensure the reliability and reproducibility of latent structure discovery in PD.

Bio: Kai Kang is an Associate Professor at the School of Mathematics, Sun Yat-sen University. He received his undergraduate degree from Sun Yat-sen University in 2015, and his Ph.D. degree from the Chinese University of Hong Kong in 2020. He held a postdoctoral fellowship at Columbia University from 2020 to 2022 and joined the School of Mathematics, Sun Yat-sen University in July 2022. His main research interests include latent variable modeling and Bayesian analysis. He has published papers as first/corresponding author in Annals of Applied Statistics, Journal of Computational and Graphical Statistics, Journal of Multivariate Analysis, and Statistics in Medicine.

Speaker: Jingsi Ming (East China Normal University, China)

Time: 9:40-10:10

Title: Spatial transcriptomics deconvolution using graph convolutional networks with adversarial discriminative domain adaptation

Abstract: The rapid advancement of spatially resolved transcriptomics has substantially improved our understanding of the spatial architecture and gene expression heterogeneity within tissues. However, many spatial transcriptomics techniques cannot reach single-cell resolution, instead measuring gene expression profiles from mixtures of potentially heterogeneous cell types. Here we propose AddaGCN, a robust deconvolution method to infer cell type composition from spatial transcriptomic data. AddaGCN leverages graph convolutional networks to incorporate spatial information and employs an adversarial discriminative domain adaptation approach to mitigate batch effects between spatial and single-cell reference data. Comprehensive real data analyses demonstrate AddaGCN's flexibility across diverse datasets generated by various technology platforms, and underscore its superior performance and robustness in cell-type deconvolution compared to other methods.

Bio: Jingsi Ming is an assistant professor at the School of Statistics, East China Normal University. She received her PhD from Hong Kong Baptist University in 2018, and worked as a postdoctoral researcher at the Hong Kong University of Science and Technology from 2018 to 2020. She joined East China Normal University in September 2020. Her main research interests include statistical genetics, bioinformatics, and statistical machine learning. Her research has been published in Briefings in Bioinformatics, Bioinformatics, Annals of Applied Statistics, and Nature Computational Science.

Speaker: Bohai Zhang (BNU-HKBU United International College, China)

Time: 10:30-11:00

Title: Functional Bayesian Additive Regression Trees with Shape Constraints

Abstract: Motivated by the great success of Bayesian additive regression trees (BART) in regression, we propose a nonparametric Bayesian approach for the function-on-scalar regression problem, termed Functional BART (FBART). Utilizing a spline-based function representation and a tree-based domain partition model, FBART offers great flexibility in characterizing the complex and heterogeneous relationship between the response curve and scalar covariates. We devise a tailored Bayesian backfitting algorithm for estimating the parameters in the FBART model. Furthermore, we introduce an FBART model with shape constraints on the response curve, enhancing estimation and prediction performance when prior shape information of response curves is available. Our proposed FBART model and its shape-constrained version are new advances in BART models for functional data. Under certain regularity conditions, we derive posterior convergence results for both FBART and its shape-constrained version. Finally, the superiority of the proposed methods over competing approaches is validated through simulation experiments under various settings and analyses of two real datasets.

Bio: Dr. Bohai Zhang is currently an Associate Professor in the Department of Statistics and Data Science at BNU-HKBU United International College (UIC). He obtained his Ph.D. degree in Statistics from Texas A&M University, and worked as a postdoctoral research fellow at the University of Wollongong, Australia. Prior to joining UIC, he was an Assistant Professor at Nankai University. His research interests include spatial and spatio-temporal statistics, Bayesian statistics, and statistical applications to remote sensing and geophysical datasets. He has published several papers in international statistical and geoscience journals.

Speaker: Xiaotian Zheng (University of Wollongong, Australia)

Time: 11:00-11:30

Title: Bayesian geostatistical modeling for discrete-valued processes

Abstract: In this talk, I will introduce a flexible and scalable class of Bayesian geostatistical models for discrete data, based on nearest-neighbor mixture processes (NNMP), referred to as discrete NNMP. To define the joint probability mass function (pmf) over a set of spatial locations, we build from local mixtures of conditional pmfs using a directed graphical model, with a directed acyclic graph that summarizes the nearest neighbor structure. The approach supports direct, flexible modeling for multivariate dependence through specification of general bivariate discrete distributions that define the conditional pmfs. In particular, we develop a modeling and inferential framework for copula-based NNMPs that can attain flexible dependence structures, motivating the use of bivariate copula families for spatial processes. Moreover, the framework allows for construction of models given a pre-specified family of marginal distributions that can vary in space, facilitating covariate inclusion. Compared to the traditional class of spatial generalized linear mixed models, where spatial dependence is introduced through a transformation of response means, our process-based modeling approach provides both computational and inferential advantages. We illustrate the methodology with synthetic data examples and an analysis of North American Breeding Bird Survey data. This is joint work with Athanasios Kottas and Bruno Sansó from UC Santa Cruz. A paper related to this talk received the 2023 Wiley-TIES Best Environmetrics Paper Award.

Bio: Dr Xiaotian Zheng is a Research Fellow at the University of Wollongong, as part of Securing Antarctica's Environmental Future (SAEF), an Australian Research Council Special Research Initiative. Before moving to Wollongong, Xiaotian received his Ph.D. in Statistical Science from the University of California, Santa Cruz. He has a broad interest in parametric and nonparametric methods for complex and dependent data. His current research at SAEF involves climate downscaling, spatial modeling of biodiversity, and spatial transfer learning.

Speaker: Shijia Wang (ShanghaiTech University, China)

Time: 11:30-12:00

Title: Improving approximate Bayesian computation methods for complex data

Abstract: Approximate Bayesian computation (ABC) is a class of Bayesian inference algorithms that targets problems with intractable or unavailable likelihood functions. It uses synthetic data drawn from the simulation model to approximate the posterior distribution. First, we propose an early rejection Markov chain Monte Carlo (ejMCMC) sampler based on Gaussian processes to accelerate inference. Second, we propose a novel Global-Local ABC-MCMC algorithm that combines the "exploration" capabilities of global proposals with the "exploitation" finesse of local proposals.
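
The basic mechanism described above, approximating the posterior with synthetic data from a simulator, can be illustrated with a minimal rejection-ABC sketch. This is the textbook rejection sampler, not the ejMCMC or Global-Local ABC-MCMC algorithms from the talk. The toy model (normal data with unknown mean and unit variance, a uniform prior on the mean, the sample mean as summary statistic) and all function names are assumptions made here for illustration.

```python
import random
import statistics


def abc_rejection(observed, prior_sampler, simulator, summary,
                  epsilon, n_draws, seed=0):
    """Keep prior draws whose synthetic-data summary is close to the observed one."""
    rng = random.Random(seed)
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)                        # draw from the prior
        synthetic = simulator(theta, len(observed), rng)  # simulate data given theta
        # No likelihood evaluation: compare summaries instead.
        if abs(summary(synthetic) - s_obs) <= epsilon:
            accepted.append(theta)
    return accepted


# Hypothetical toy model used only for this illustration.
def make_toy_data(true_mean=2.0, n=50, seed=1):
    rng = random.Random(seed)
    return [rng.gauss(true_mean, 1.0) for _ in range(n)]


def toy_prior(rng):
    return rng.uniform(-5.0, 5.0)


def toy_simulator(theta, n, rng):
    return [rng.gauss(theta, 1.0) for _ in range(n)]


if __name__ == "__main__":
    observed = make_toy_data()
    accepted = abc_rejection(observed, toy_prior, toy_simulator,
                             statistics.fmean, epsilon=0.2, n_draws=5000)
    print("accepted draws:", len(accepted))
    print("approximate posterior mean:", statistics.fmean(accepted))
```

Most prior draws are rejected in this sketch, and each one costs a full simulation. That cost is the kind of inefficiency that early-rejection and better-proposal schemes aim to reduce.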

Bio: Shijia Wang is an assistant professor in the Institute of Mathematical Sciences, ShanghaiTech University. He received his PhD degree from Simon Fraser University, Canada. His main research interests include Bayesian statistics, machine learning, and genetics.