Softmax Bottleneck Makes Language Models Unable to Represent Multi-mode Word Distributions

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 8048-8073. Dublin, Ireland, Association for Computational Linguistics, (May 2022)
DOI: 10.18653/v1/2022.acl-long.554


Other publications by persons with the same name

Optimizing the decomposition for multiple foreground cosegmentation. Comput. Vis. Image Underst., (2015)
Inorganic Materials Synthesis Planning with Literature-Trained Neural Networks. CoRR, (2019)
Efficient Graph-based Word Sense Induction by Distributional Inclusion Vector Embeddings. TextGraphs@NAACL-HLT, pp. 38-48. Association for Computational Linguistics, (2018)
To Copy, or not to Copy; That is a Critical Issue of the Output Softmax Layer in Neural Sequential Recommenders. WSDM, pp. 67-76. ACM, (2024)
Automatically Extracting Action Graphs from Materials Science Synthesis Procedures. CoRR, (2017)
Inorganic Materials Synthesis Planning with Literature-Trained Neural Networks. J. Chem. Inf. Model., 60 (3): 1194-1201 (2020)
Multi-CLS BERT: An Efficient Alternative to Traditional Ensembling. ACL (1), pp. 821-854. Association for Computational Linguistics, (2023)
Unsupervised Partial Sentence Matching for Cited Text Identification. SDP@COLING, pp. 95-104. Association for Computational Linguistics, (2022)
Overcoming Practical Issues of Deep Active Learning and its Applications on Named Entity Recognition. CoRR, (2019)
Superpixel-based large displacement optical flow. ICIP, pp. 3835-3839. IEEE, (2013)