
Together Yet Apart: Multimodal Representation Learning for Personalised Visual Art Recommendation

Bereket A. Yilma and Luis A. Leiva. In Proceedings of the 31st ACM Conference on User Modeling, Adaptation and Personalization (UMAP '23), pages 204–214. ACM, June 2023.
DOI: 10.1145/3565472.3592964

Abstract

With the advent of digital media, the availability of art content has greatly expanded, making it increasingly challenging for individuals to discover and curate works that align with their personal preferences and taste. The task of providing accurate and personalized Visual Art (VA) recommendations is thus a complex one, requiring a deep understanding of the intricate interplay of multiple modalities, such as images, textual descriptions, or other metadata. In this paper, we study the nuances of the modalities involved in the VA domain (image and text) and how they can be effectively harnessed to provide a truly personalized art experience to users. In particular, we develop four fusion-based multimodal VA recommendation pipelines and conduct a large-scale user-centric evaluation. Our results indicate that early fusion (i.e., joint multimodal learning of visual and textual features) is preferred over a late fusion of ranked paintings from unimodal models (state-of-the-art baselines), but only if the latent representation space of the multimodal painting embeddings is entangled. Our findings open a new perspective for better representation learning in the VA RecSys domain.
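The abstract contrasts early fusion (building one joint multimodal embedding per painting before ranking) with late fusion (running unimodal recommenders and merging their ranked lists afterwards). The following is a minimal sketch of that distinction, assuming random stand-in embeddings; the helper names (`recommend`, `rrf`), the concatenation used for early fusion, and the reciprocal-rank aggregation used for late fusion are illustrative choices, not the paper's exact pipelines:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unimodal painting embeddings (e.g. from an image encoder
# and a text encoder); random placeholders stand in for learned features.
n_paintings, d_img, d_txt = 100, 512, 384
img_emb = rng.normal(size=(n_paintings, d_img))
txt_emb = rng.normal(size=(n_paintings, d_txt))

def l2_normalise(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def recommend(emb, liked_ids, k=10):
    """Rank paintings by cosine similarity to the mean of liked items."""
    emb = l2_normalise(emb)
    profile = emb[liked_ids].mean(axis=0)
    scores = emb @ profile
    scores[liked_ids] = -np.inf          # do not re-recommend liked items
    return np.argsort(-scores)[:k]

liked = [3, 17, 42]                      # hypothetical user feedback

# --- Early fusion: one joint multimodal embedding per painting.
# Concatenation of normalised modalities is used here purely for
# illustration; the paper learns a joint latent representation space.
joint_emb = np.hstack([l2_normalise(img_emb), l2_normalise(txt_emb)])
early_fusion_top10 = recommend(joint_emb, liked)

# --- Late fusion: rank with each unimodal model, then merge the lists
# via reciprocal rank fusion (one common aggregation choice).
def rrf(rankings, k=10, c=60):
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking):
            scores[item] = scores.get(item, 0.0) + 1.0 / (c + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)[:k]

img_top = recommend(img_emb, liked, k=50)
txt_top = recommend(txt_emb, liked, k=50)
late_fusion_top10 = rrf([img_top, txt_top])

print("early fusion:", early_fusion_top10)
print("late fusion: ", late_fusion_top10)
```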
