Seminar | Institute of Mathematical Sciences
Time: Monday, November 25th, 2024, 16:00-17:00
Location: IMS, RS408
Speaker: Nicola Muca Cirone, Imperial College London
Abstract: Structured state-space models (SSMs) are proving highly effective for sequential data. Recent advances show that SSMs with a selectivity mechanism, which enables multiplicative interactions between inputs and hidden states (e.g., in Mamba, GLA, Hawk/Griffin, HGRN2), can surpass attention-based models on language tasks in both accuracy and efficiency, even at billion-parameter scales. Using Rough Path Theory, we provide a theoretical foundation for this selectivity mechanism, which is often linked to in-context learning.