Institute of Mathematical Sciences

Colloquium: Sparse Deep Neural Networks Through L_{1,\infty}-Weight Normalization

Colloquium | Institute of Mathematical Sciences

Time: 16:00-17:00, Monday, June 24

Location: Room S407, IMS


Speaker: Zhouwang Yang 杨周旺 (University of Science and Technology of China)

    Zhouwang Yang is a professor and doctoral advisor at the School of Mathematical Sciences, University of Science and Technology of China (USTC), and currently serves as Deputy Dean of the School of Data Science. He received his bachelor's, master's, and doctoral degrees from the Department of Mathematics at USTC. He conducted postdoctoral research at Seoul National University in Korea and was a visiting researcher in the Department of Industrial and Systems Engineering at the Georgia Institute of Technology. His research lies in applied mathematics, where he combines data science, computational geometry, statistics, and optimization to build new mathematical models and develop new methods for applied problems. His main research interests include data-driven optimization modeling, sparse optimization, machine learning theory, big-data modeling and analysis of the macroeconomy, intelligent video analysis, and intelligent algorithms for industrial rating. He has led or participated in 8 projects funded by the National Natural Science Foundation of China, holds or has filed more than 20 invention patents, and has published more than 70 papers in international academic journals. He was selected for the Ministry of Education's Program for New Century Excellent Talents in 2012, received the second prize of the Ministry of Education Natural Science Award in 2014 (third contributor), won the Operations Research Application Award of the Science and Technology Prize of the Operations Research Society of China in 2016 (awarded once every four years), and received the second "Youth Innovation Prize" of the Computational Mathematics branch of the Chinese Mathematical Society in 2017. The data science team led by Professor Yang has carried out a number of joint big-data research projects with enterprises and institutions, and some of the results have been industrialized and put into practical industrial use.

  

Abstract: Deep neural networks have recently demonstrated an amazing performance on many challenging tasks. Overfitting is one of the notorious features of DNNs. Empirical evidence suggests that inducing sparsity can relieve overfitting, and that weight normalization can accelerate algorithm convergence. In this talk, we report our work on L_{1,\infty}-weight normalization for deep neural networks with bias neurons to achieve a sparse architecture. We theoretically establish generalization error bounds for both regression and classification under L_{1,\infty}-weight normalization. It is shown that the upper bounds are independent of the network width and depend on the network depth k only through a sqrt(k) factor, which are the best available bounds for networks with bias neurons. These results provide theoretical justification for using such weight normalization to reduce the generalization error. We also develop an easily implemented gradient projection descent algorithm to practically obtain a sparse neural network. We perform various experiments to validate our theory and demonstrate the effectiveness of the resulting approach.
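
For readers unfamiliar with the constraint, ||W||_{1,\infty} is the largest L1 norm among the rows of a weight matrix W. The following is a minimal sketch, not the speaker's implementation, of the projected-gradient idea mentioned above: after each gradient step, every row of the weight matrix is projected back onto an L1 ball of radius c, which enforces ||W||_{1,\infty} <= c and drives many entries exactly to zero. The single-layer least-squares problem, the radius c, the learning rate, and the random data are illustrative assumptions.

# Illustrative sketch of projected gradient descent under an L_{1,\infty}
# constraint (not the speaker's code).  After each gradient step, every row of
# the weight matrix is projected onto an L1 ball of radius c, so that
# max_i sum_j |W_ij| <= c.
import numpy as np

def project_l1_ball(v, c):
    # Euclidean projection of a vector onto the L1 ball of radius c
    # (sorting-based method of Duchi et al., 2008).
    if np.abs(v).sum() <= c:
        return v
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, v.size + 1) > css - c)[0][-1]
    theta = (css[rho] - c) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def project_l1_inf(W, c):
    # Enforce ||W||_{1,\infty} <= c by projecting each row (the incoming
    # weights of one neuron); the soft-thresholding zeroes out entries.
    return np.vstack([project_l1_ball(row, c) for row in W])

# Toy single-layer regression: minimize ||X W^T - Y||^2 s.t. ||W||_{1,\infty} <= c.
rng = np.random.default_rng(0)
X, Y = rng.normal(size=(200, 50)), rng.normal(size=(200, 10))
W, c, lr = np.zeros((10, 50)), 1.0, 1e-3
for _ in range(500):
    grad = 2.0 * (X @ W.T - Y).T @ X        # gradient of the squared loss w.r.t. W
    W = project_l1_inf(W - lr * grad, c)    # gradient step followed by projection
print("zero entries:", int((W == 0).sum()), "of", W.size)

In a deep network the same row-wise projection would be applied layer by layer; the sparsity comes from the soft-thresholding step inside the L1-ball projection.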

Address: 393 Middle Huaxia Road, Pudong New District, Shanghai
Postal code: 201210
Building 8, 319 Yueyang Road, Xuhui District, Shanghai
200031 (Yueyang Road Campus)