This repo provides efficient implementations of emerging model architectures, with a focus on efficient sequence modeling (e.g., linear attention, state space models, and their hybrids). All ...