Author(s):
Silva, Luís ; Oliveira, Francisco ; Gomes, Ivan ; Araújo, C. Mendes ; Oliveira, João
Date: 2025
Persistent Identifier: https://hdl.handle.net/1822/98210
Source: RepositóriUM - Universidade do Minho
Subject(s): Visual Search; Deep learning; Outfit; BiLSTM; CNN; Compatibility learning; Transformer; Similarity learning
Description
In the ever-evolving world of fashion, building the perfect outfit can be a challenge. We propose a fashion recommendation system, which we call Visual Search, that uses computer vision and deep learning to produce a coordinated set of fashion recommendations. Users upload a single photo of their outfit, and a pretrained YOLO model, fine-tuned on a dataset of labeled clothing items, detects and crops the individual clothing pieces. These pieces are then fed into a compatibility model, comprising a Convolutional Neural Network and a bidirectional Long Short-Term Memory network, which generates the most compatible candidate for a chosen or missing piece. To complete the recommendation, a similarity model based on a Vision Transformer compares the generated image with a given catalog of products and selects the product whose visual features match it most closely.
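
The final catalog-matching step can be illustrated with a minimal sketch. It assumes that the generated image and each catalog product have already been reduced to fixed-length feature vectors (for example, by a Vision Transformer encoder) and that closeness is measured with cosine similarity. The abstract does not state which similarity metric is used, so the metric, the function names `cosine_similarity` and `most_similar`, the product identifiers, and the toy vectors are all assumptions made for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as matching nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, catalog):
    """Return (product_id, score) for the catalog entry closest to the query.

    `catalog` maps a hypothetical product id to its feature vector.
    """
    if not catalog:
        raise ValueError("catalog is empty")
    best_id = max(catalog, key=lambda pid: cosine_similarity(query, catalog[pid]))
    return best_id, cosine_similarity(query, catalog[best_id])


# Toy 3-dimensional embeddings; real encoder outputs have hundreds of dimensions.
catalog = {
    "sku-001": [0.9, 0.1, 0.0],
    "sku-002": [0.1, 0.8, 0.3],
    "sku-003": [0.0, 0.2, 0.9],
}
query = [0.2, 0.7, 0.4]  # stands in for the embedding of the generated image

best_id, score = most_similar(query, catalog)
print(best_id, round(score, 3))
```

With these toy vectors the query points in nearly the same direction as `sku-002`, so that product is selected. For a large catalog, an exhaustive scan like this would typically be replaced by an approximate nearest-neighbor index.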