Seminar: Stable Subsampling of Big Data under Covariate Shift
Seminar | Institute of Mathematical Sciences
Time: Tuesday, July 15, 2025, 10:00-11:00
Location: IMS, RS408
Speaker: Yongdao Zhou, Nankai University
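Yongdao Zhou is a professor and doctoral supervisor in the School of Statistics and Data Science at Nankai University, where he chairs the Department of Statistics. He has been selected for the national high-level talent program for young scholars, the Tianjin Leading Talent in Innovation program, and Nankai University's "100 Young Academic Leaders" program. His research interests are experimental design and the analysis of big data algorithms. He has led more than 20 government- and industry-funded projects, including five grants from the National Natural Science Foundation of China, and has visited five universities abroad, including the University of California, Los Angeles and the University of Manchester. He has published more than 80 papers in leading statistics and machine learning journals and other major Chinese and international journals, including JRSSB, JASA, Biometrika, JMLR, IEEE journals, and Science China, and has co-authored nine monographs and textbooks in Chinese and English. His honors include the First Prize of the National Award for Outstanding Achievements in Statistical Science Research, the Third Prize of the National Statistical Science and Technology Progress Award, the Grand Prize of the Tianjin Teaching Achievement Award, and the Huawei Spark Award. He currently serves as president of the Tianjin Society for Applied Statistics, as a council member of the Chinese Mathematical Society and vice president of its Uniform Design Branch, and is a permanent member of the International Chinese Statistical Association.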
Abstract: Data shift between training and test datasets, coupled with model misspecification, can make regression predictions unstable across diverse datasets. In this talk, we present a novel subsampling algorithm for stable prediction that employs uniform design and confounder balancing methods. Theoretical analyses show that the uniform measure minimizes the maximum integral mean square error (MIMSE), and that the global stability loss assesses the independence among variables in each candidate MIMSE-optimal subsampled set. Numerical experiments on synthetic and real-world datasets demonstrate the superiority of the proposed method over baseline approaches under model misspecification and covariate shift.