Article

Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity.

J. Mach. Learn. Res. (2022)
