You can download the raw source code for these lecture notes here.
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.0 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.1 ✔ tibble 3.2.0
## ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
## ✔ purrr 1.0.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(spotifyr)
library(compmus)
Look at one (or more) of the self-similarity matrices from somebody’s portfolio in your group. Discuss what you think a spectrum-based novelty function would look like for this track. Listen to (some of) the track and also discuss what you think an energy-based novelty function would look like. Which one do you think would be most useful for beat tracking, and why?
For novelty functions, we want to work directly with the segments rather than summarising them at higher levels such as Spotify’s own bar or beat estimates.
pata_pata <-
get_tidy_audio_analysis("3uy90vHHATPjtdilshDQDt") |>
select(segments) |>
unnest(segments)
We can compute an energy-based novelty function from Spotify’s loudness estimates. The tempo of this piece is about 126 BPM: how well does this technique work?
pata_pata |>
mutate(loudness_max_time = start + loudness_max_time) |>
arrange(loudness_max_time) |>
mutate(delta_loudness = loudness_max - lag(loudness_max)) |>
ggplot(aes(x = loudness_max_time, y = pmax(0, delta_loudness))) +
geom_line() +
xlim(0, 30) +
theme_minimal() +
labs(x = "Time (s)", y = "Novelty")
## Warning: Removed 576 rows containing missing values (`geom_line()`).
We can use similar approaches for chromagrams and cepstrograms. In the case of chromagrams, Aitchison’s clr transformation gives more sensible differences between time points than other normalisations. Even with these helpful transformations, however, self-similarity matrices tend to be more helpful visualisations of chroma and timbre from the Spotify API.
pata_pata |>
mutate(pitches = map(pitches, compmus_normalise, "clr")) |>
arrange(start) |>
mutate(pitches = map2(pitches, lag(pitches), `-`)) |>
slice(-1) |>
compmus_gather_chroma() |>
group_by(start, duration) |>
summarise(novelty = sum(log1p(pmax(value, 0)))) |>
ggplot(aes(x = start + duration / 2, y = novelty)) +
geom_line() +
xlim(0, 30) +
theme_minimal() +
labs(x = "Time (s)", y = "Novelty")
## `summarise()` has grouped output by 'start'. You can override using the
## `.groups` argument.
## Warning: Removed 575 rows containing missing values (`geom_line()`).
pata_pata |>
arrange(start) |>
mutate(timbre = map2(timbre, lag(timbre), `-`)) |>
slice(-1) |>
compmus_gather_timbre() |>
group_by(start, duration) |>
summarise(novelty = sum(log1p(pmax(value, 0)))) |>
ggplot(aes(x = start + duration / 2, y = novelty)) +
geom_line() +
xlim(0, 30) +
theme_minimal() +
labs(x = "Time (s)", y = "Novelty")
## `summarise()` has grouped output by 'start'. You can override using the
## `.groups` argument.
## Warning: Removed 575 rows containing missing values (`geom_line()`).
Listen to the first 30 seconds of ‘Pata Pata’ and discuss the three representations above. Do any of them seem useful?
Find a Spotify track that has a regular tempo but lacks percussion (e.g., much Western classical music), and compute the above three representations. (Change the xlim() line if you want to look at a different portion of your track.) How do they differ from what you see for ‘Pata Pata’? Be prepared to show your best ‘novelty-gram’ to the class.
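To make it easier to try different tracks, here is a minimal sketch (not part of the original notes or the compmus package) that wraps the energy-based novelty pipeline from above in a reusable helper. The track ID string is a placeholder: substitute the ID of your own percussion-free track, and adjust max_time to look at a different portion.
energy_novelty <- function(track_id, max_time = 30) {
  get_tidy_audio_analysis(track_id) |>
    select(segments) |>
    unnest(segments) |>
    # Shift each segment's loudness peak to absolute track time.
    mutate(loudness_max_time = start + loudness_max_time) |>
    arrange(loudness_max_time) |>
    # Half-wave-rectified first difference of peak loudness.
    mutate(delta_loudness = loudness_max - lag(loudness_max)) |>
    ggplot(aes(x = loudness_max_time, y = pmax(0, delta_loudness))) +
    geom_line() +
    xlim(0, max_time) +
    theme_minimal() +
    labs(x = "Time (s)", y = "Novelty")
}
# energy_novelty("your-track-id-here")  # placeholder ID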
Spotify does not make the novelty function underlying their own tempo analysis available to the public, but we can still use the onsets of every segment to generate Fourier tempograms. The tempogram() function from compmus generates this automatically from an audio analysis, ready to plot with geom_raster (a faster version of geom_tile for when every segment has the same length). Here is an example of ‘Samba do outro lugar’, a track from the Brazilian indie band Graveola that features several tempo and metre alternations. Be warned that computing tempograms can be slow!
graveola <- get_tidy_audio_analysis("6PJasPKAzNLSOzxeAH33j2")
graveola |>
tempogram(window_size = 8, hop_size = 1, cyclic = FALSE) |>
ggplot(aes(x = time, y = bpm, fill = power)) +
geom_raster() +
scale_fill_viridis_c(guide = "none") +
labs(x = "Time (s)", y = "Tempo (BPM)") +
theme_classic()
The textbook notes that Fourier-based tempograms tend to pick up strongly on tempo harmonics. Wrapping the tempo axis into a cyclic tempogram, which folds together tempi an octave apart (e.g., 60 and 120 BPM), can be more informative.
graveola |>
tempogram(window_size = 8, hop_size = 1, cyclic = TRUE) |>
ggplot(aes(x = time, y = bpm, fill = power)) +
geom_raster() +
scale_fill_viridis_c(guide = "none") +
labs(x = "Time (s)", y = "Tempo (BPM)") +
theme_classic()
Return to the track you discussed in Breakout 1 (or choose a new track that somebody in your group loves). Compute regular and cyclic tempograms for this track (a reusable helper is sketched below).
- How well do they work?
- Do you see more tempo harmonics or more tempo sub-harmonics? Is that what you expected? Why?
- Try other tracks as time permits, and be prepared to share your most interesting tempogram with the class.
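For convenience in the breakout, here is a small helper sketch (an addition for this exercise, not part of compmus) that parameterises the tempogram code above by track ID. The ID in the commented call is a placeholder.
plot_tempogram <- function(track_id, cyclic = FALSE) {
  get_tidy_audio_analysis(track_id) |>
    # Same settings as the Graveola example above; cyclic wraps tempo octaves.
    tempogram(window_size = 8, hop_size = 1, cyclic = cyclic) |>
    ggplot(aes(x = time, y = bpm, fill = power)) +
    geom_raster() +
    scale_fill_viridis_c(guide = "none") +
    labs(x = "Time (s)", y = "Tempo (BPM)") +
    theme_classic()
}
# plot_tempogram("your-track-id-here", cyclic = TRUE)  # placeholder ID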
Let’s try to identify some of the features that Spotify uses to designate playlists as ‘workout’ playlists. For a full analysis, we would need to delve deeper, but let’s start with a comparison of three playlists: Indie Pop, Indie Party, and Indie Running. For speed, this example will work with only the first 20 songs from each playlist, but you should feel free to use more if your computer can handle it.
pop <-
get_playlist_audio_features("spotify", "37i9dQZF1DWWEcRhUVtL8n") |>
slice(1:20) |>
add_audio_analysis()
party <-
get_playlist_audio_features("spotify", "37i9dQZF1DWTujiC7wfofZ") |>
slice(1:20) |>
add_audio_analysis()
workout <-
get_playlist_audio_features("spotify", "37i9dQZF1DWZq91oLsHZvy") |>
slice(1:20) |>
add_audio_analysis()
We bind the three playlists together using the trick from earlier in the course, transpose the chroma vectors to a common tonic using the compmus_c_transpose function, and then summarise the vectors like we did when generating chromagrams and cepstrograms. Again, Aitchison’s clr transformation can help with chroma.
indie <-
bind_rows(
pop |> mutate(playlist = "Indie Pop"),
party |> mutate(playlist = "Indie Party"),
workout |> mutate(playlist = "Indie Running")
) |>
mutate(playlist = factor(playlist)) |>
mutate(segments = map2(segments, key, compmus_c_transpose)) |>
mutate(
pitches =
map(segments,
compmus_summarise, pitches,
method = "mean", norm = "manhattan"
),
timbre =
map(
segments,
compmus_summarise, timbre,
method = "mean"
)
) |>
mutate(pitches = map(pitches, compmus_normalise, "clr")) |>
mutate_at(vars(pitches, timbre), map, bind_rows) |>
unnest(cols = c(pitches, timbre))
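One quick way to start looking for patterns is to plot the track-level features that get_playlist_audio_features returns alongside the playlist metadata, such as valence and energy. The sketch below (an addition, not from the original notes) colours tracks by playlist; you might expect the running playlist to sit toward the high-energy side, but check what you actually see.
indie |>
  ggplot(aes(x = valence, y = energy, colour = playlist)) +
  # One point per track, coloured by its source playlist.
  geom_point(alpha = 0.8) +
  theme_minimal() +
  labs(x = "Valence", y = "Energy", colour = "Playlist")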
Although the novelty-based transformations of chroma and timbre features are not always useful for visualisations, they can be very useful for classification (next week). Both ‘deltas’ and ‘delta-deltas’, especially for timbre features, are in regular use in music information retrieval. The code example below shows how to compute average delta chroma and timbre features instead of the ordinary average. Can you add delta-deltas, too? Can you use a visualisation to find any patterns in the data?
indie_deltas <-
pop |>
mutate(playlist = "Indie Pop") |>
bind_rows(
party |> mutate(playlist = "Indie Party"),
workout |> mutate(playlist = "Indie Workout")
) |>
mutate(playlist = factor(playlist)) |>
mutate(segments = map2(segments, key, compmus_c_transpose)) |>
mutate(
segments =
map(
segments,
mutate,
pitches = map(pitches, compmus_normalise, "manhattan")
)
) |>
mutate(
segments =
map(
segments,
mutate,
pitches = map2(pitches, lag(pitches), `-`)
)
) |>
mutate(
segments =
map(
segments,
mutate,
timbre = map2(timbre, lag(timbre), `-`)
)
) |>
mutate(
segments =
map(
segments,
slice,
-1
)
) |>
mutate(
pitches =
map(segments,
compmus_summarise, pitches,
method = "mean", na.rm = TRUE
),
timbre =
map(
segments,
compmus_summarise, timbre,
method = "mean", na.rm = TRUE
)
) |>
mutate_at(vars(pitches, timbre), map, bind_rows) |>
unnest(cols = c(pitches, timbre))
indie_deltas
## # A tibble: 60 × 101
## playlist…¹ playl…² playl…³ playl…⁴ playl…⁵ dance…⁶ energy key loudn…⁷ mode
## <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <int> <dbl> <int>
## 1 37i9dQZF1… Indie … https:… Spotify spotify 0.595 0.891 0 -3.93 1
## 2 37i9dQZF1… Indie … https:… Spotify spotify 0.601 0.37 10 -10.5 1
## 3 37i9dQZF1… Indie … https:… Spotify spotify 0.747 0.655 6 -6.90 1
## 4 37i9dQZF1… Indie … https:… Spotify spotify 0.687 0.768 9 -5.90 1
## 5 37i9dQZF1… Indie … https:… Spotify spotify 0.516 0.322 9 -11.8 1
## 6 37i9dQZF1… Indie … https:… Spotify spotify 0.533 0.74 2 -5.77 1
## 7 37i9dQZF1… Indie … https:… Spotify spotify 0.806 0.507 11 -6.43 1
## 8 37i9dQZF1… Indie … https:… Spotify spotify 0.548 0.456 9 -5.98 1
## 9 37i9dQZF1… Indie … https:… Spotify spotify 0.656 0.838 5 -8.41 1
## 10 37i9dQZF1… Indie … https:… Spotify spotify 0.392 0.541 1 -8.68 1
## # … with 50 more rows, 91 more variables: speechiness <dbl>,
## # acousticness <dbl>, instrumentalness <dbl>, liveness <dbl>, valence <dbl>,
## # tempo <dbl>, track.id <chr>, analysis_url <chr>, time_signature <int>,
## # added_at <chr>, is_local <lgl>, primary_color <lgl>, added_by.href <chr>,
## # added_by.id <chr>, added_by.type <chr>, added_by.uri <chr>,
## # added_by.external_urls.spotify <chr>, track.artists <list>,
## # track.available_markets <list>, track.disc_number <int>, …
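As one possible answer to the delta-delta question above, here is a sketch under the same pipeline assumptions as indie_deltas, focusing on timbre for brevity: apply the lag-difference step twice before summarising, and drop the two leading rows of NAs instead of one.
indie_delta_deltas <-
  pop |>
  mutate(playlist = "Indie Pop") |>
  bind_rows(
    party |> mutate(playlist = "Indie Party"),
    workout |> mutate(playlist = "Indie Running")
  ) |>
  mutate(playlist = factor(playlist)) |>
  # First difference: deltas.
  mutate(
    segments =
      map(
        segments,
        mutate,
        timbre = map2(timbre, lag(timbre), `-`)
      )
  ) |>
  # Second difference: delta-deltas.
  mutate(
    segments =
      map(
        segments,
        mutate,
        timbre = map2(timbre, lag(timbre), `-`)
      )
  ) |>
  # The first two segments now hold NAs; drop both.
  mutate(segments = map(segments, slice, -1:-2)) |>
  mutate(
    timbre =
      map(
        segments,
        compmus_summarise, timbre,
        method = "mean", na.rm = TRUE
      )
  ) |>
  mutate_at(vars(timbre), map, bind_rows) |>
  unnest(cols = timbre)
From here, the same kind of scatter plot as above can show whether delta-deltas help separate the playlists.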