from qdrant_client import QdrantClient
from qdrant_client import models
from nicegui import ui
import pandas as pd
import os
(
    ui.label('🎧 Music Search with Qdrant')
    .style('color: #ab003c; font-size: 350%; font-weight: 450')
    .classes('self-center')
)
ui.markdown("## 🎻 A New Way to Find Music 🎶").classes('self-center')
ui.markdown(
    """
🎯 **The purpose** of this app is to showcase one of the many (cool 😎 and fun 💃🏻🕺🏽) ways in which you can use [Qdrant](https://qdrant.tech/)
to conduct [semantic search](https://en.wikipedia.org/wiki/Semantic_search) using music data.

🥁 **The dataset** used for this demo app is the [Ludwig Music Dataset (Moods and Subgenres)](https://www.kaggle.com/datasets/jorgeruizdev/ludwig-music-dataset-moods-and-subgenres),
which is freely available on Kaggle. It contains songs for the 9 genres shown below, as well as some metadata that was
used as the [payload](https://qdrant.tech/documentation/concepts/payload/) for this app.

🤖 **The model** used to extract the embeddings is [`panns_inference`](https://github.com/qiuqiangkong/panns_inference),
freely distributed as a Python library. Please note that, while these embeddings show remarkable quality on the Ludwig
data, they were used as-is, out of the box, and were not fine-tuned on the Ludwig dataset.

You can evaluate the semantic search capabilities of our app by searching for songs you know and retrieving the
most similar ones. In addition, if you want to see similar songs from a genre other than that of the song you
selected, pick one of the genres available below to see the results.

💽 **The songs** returned by Qdrant are downloaded on the fly from an S3 bucket.
Each result comes back as a card containing an **image** of the artist, the **name** of the artist and the song,
the **similarity** score, and the **genre** of the song (the dataset contains some genre inconsistencies, e.g. Adele songs
classified as "electronic").

🖼️ **The images** shown in the cards were looked up by artist name (~4400 unique artists in the dataset), taking
the first result of a query sent to Bing's Image Search API. Hence, some images might not be the correct ones.
    """
).style("max-width: 1000px; font-size: 120%").classes('self-center')
metadata = pd.read_csv("payload.csv")
artist_song = sorted(metadata['artist_song'].tolist())
collection = "music_vectors"
client = QdrantClient(
    "https://394294d5-30bb-4958-ad1a-15a3561edce5.us-east-1-0.aws.cloud.qdrant.io:6333",
    api_key=os.environ['QDRANT_API_KEY'],
)
def create_music_card(qdrant_results):
    """Render one card (image, metadata, audio player) per Qdrant result."""
    for song in qdrant_results:
        with ui.column():
            with ui.card().tight().style("height: 350px; width: 300px"):
                ui.image(song.payload['photos']).classes('w-[300px] h-[210px]')
                with ui.card_section():
                    ui.label(f"Artist: {song.payload['artist']}")
                    ui.label(f"Song Name: {song.payload['name']}")
                    ui.label(f"Genre: {song.payload['genre']}")
                    # Records returned by client.retrieve() have no score; only search hits do
                    score = getattr(song, 'score', None)
                    if score is not None:
                        ui.label(f"Score: {score}")
            # Audio player for the song, shown below the card
            player = ui.audio(song.payload['urls'])
            player.on('ended', lambda _: ui.notify('Audio playback completed!'))
def get_vectors():
    """Callback function for our search box"""
    # Match on the full "artist - song" entry so that titles shared by several
    # artists, or titles containing " - ", resolve to the right row
    song_id = int(metadata.loc[metadata['artist_song'] == song_selection.value, 'index'].iloc[0])

    # retrieve the vector and metadata associated with it
    song_vector = client.retrieve(
        collection_name=collection, ids=[song_id], with_payload=True, with_vectors=True
    )

    # Clear the result from the previous artist selected
    main_artist.clear()
    with main_artist:
        create_music_card(song_vector)

    # Optionally restrict the search to the selected genre
    genre_filter = None
    if filters.value:
        genre_filter = models.Filter(
            must=[models.FieldCondition(key="genre", match=models.MatchValue(value=filters.value))]
        )
    music = client.search(
        collection_name=collection,
        query_vector=song_vector[0].vector,
        query_filter=genre_filter,
        limit=num_songs.value,
    )
    # Drop the selected song itself; it only appears among the hits when it passes the filter
    music = [hit for hit in music if hit.id != song_id]

    # Clear the result from the previous search request
    results.clear()
    with results:
        create_music_card(music)
with ui.label("How many songs would you like to get back? 🤔").classes('w-200 self-center mt-10'):
    num_songs = ui.slider(min=1, max=30, step=1, value=11)
    ui.linear_progress().bind_value_from(num_songs, 'value')

with ui.label("Filters you can apply to your search 🔎").classes('w-200 self-center mt-5'):
    filters = ui.radio([None] + metadata.genre.unique().tolist(), value=None).props('inline color=#ab003c')

song_selection = ui.select(
    artist_song, value='Dave Van Ronk - Buckets of Rain', on_change=get_vectors
).style("width: 700px").classes('w-200 self-center mt-10 transition-all')

main_artist = ui.row().classes('w-full justify-center')
results = ui.row().classes('w-full justify-center')
ui.colors(
    primary='#ab003c',
    secondary='#2c387e',
    accent='#f50057',
    dark='#f73378',
    positive='#f73378',
    negative='#ba000d',
)

ui.run(
    title='Qdrant for Music',
    favicon='https://avatars.githubusercontent.com/u/73504361?s=280&v=4',
    # dark=True
)