Mamba Paper Options
Jamba is a novel architecture from AI21 Labs that combines Transformer layers with Mamba state space model (SSM) layers in a single hybrid design. With 52 billion parameters, it is the largest Mamba variant developed so far. It has a context window of 256k tokens.[12] Although the recipe for the forward pass needs to be defined within this function, one s