Sinkformers: Transformers with Doubly Stochastic Attention

(2021). arXiv:2110.11773. Comment: Accepted at AISTATS.

Other publications of authors with the same name

Sparse Continuous Distributions and Fenchel-Young Losses. CoRR, (2021)

Learning with Differentiable Perturbed Optimizers. CoRR, (2020)

Online Passive-Aggressive Algorithms for Non-Negative Matrix Factorization and Completion. AISTATS, volume 33 of JMLR Workshop and Conference Proceedings, page 96-104. JMLR.org, (2014)

SVD-Based Screening for the Graphical Lasso. IJCAI, page 1682-1688. ijcai.org, (2017)

Implicit differentiation of Lasso-type models for hyperparameter optimization. ICML, volume 119 of Proceedings of Machine Learning Research, page 810-821. PMLR, (2020)

Block coordinate descent algorithms for large-scale sparse multiclass classification. Mach. Learn., 93 (1): 31-52 (2013)

Polynomial Networks and Factorization Machines: New Insights and Efficient Training Algorithms. ICML, volume 48 of JMLR Workshop and Conference Proceedings, page 850-858. JMLR.org, (2016)

Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res., (2011)

Routers in Vision Mixture of Experts: An Empirical Study. CoRR, (2024)

Implicit Diffusion: Efficient Optimization through Stochastic Sampling. CoRR, (2024)