Institute of Mathematical Sciences

Applied Mathematical Seminar 94: Theoretical Foundations of Deep Selective State-Space Models


Time: Monday, November 25, 2024, 16:00-17:00

Location: IMS, RS408

Speaker: Nicola Muca Cirone, Imperial College London

 

Abstract: Structured state-space models (SSMs) are proving highly effective for sequential data. Recent advances show that SSMs with a selectivity mechanism, which enables multiplicative interactions between inputs and hidden states (e.g., in Mamba, GLA, Hawk/Griffin, HGRN2), can surpass attention-based models on language tasks in both accuracy and efficiency, even at billion-parameter scales. Using rough path theory, we provide a theoretical foundation for this selectivity mechanism, which is often linked to in-context learning.


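Address: 393 Middle Huaxia Road, Pudong New Area, Shanghai
Postal code: 201210
Building 8, 319 Yueyang Road, Xuhui District, Shanghai
200031 (Yueyang Road Campus)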