We purchase data feeds from a number of commercial data providers, including Xperi (formerly All Music Guide), Songkick, and LyricFind, among others. Licensed data is generally quite detailed, although its breadth is often limited.
Some providers like MusicBrainz and Discogs rely on contributions from their users for data. The key advantage of the crowdsourced model is that it provides access to data that isn’t available from commercial sources; there may not be a good economic reason to collect or create certain data, so it’s up to enthusiasts to do it, and they frequently do an outstanding job.
Roon users contribute their knowledge just by listening to their favorite music. Because they’re highly opinionated listeners, they plumb the depths of their favorite genres and styles every day, and those connections form the basis of Valence’s models and maps.
Valence has a secret weapon when it comes to clean data: the expertise of the Roon community. Made up of music professionals, audiophiles, and self-professed music nerds, this passionate group has provided our localizations and internet radio directory, and is now beginning to create and curate new data used by Valence.
The data ingested by Valence come from rights holders (record labels), commercial data providers, not-for-profit data projects, and the Roon community. A piece of data in Valence can come into being from any one of those sources, growing and becoming more detailed as corroborating evidence of its accuracy is found in additional data sources. This unique approach to aggregation means that following a complete daily ingestion, Valence’s index is arguably the largest and most accurate in the world.
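The idea of a fact gaining confidence as more sources corroborate it can be illustrated with a minimal sketch. The source names, trust weights, and the `corroborate` function below are hypothetical and chosen only for illustration; they are not Valence’s actual sources, values, or method.

```python
from collections import defaultdict

# Hypothetical trust weights per source type (assumed values, not Valence's).
SOURCE_WEIGHT = {
    "label": 0.9,
    "commercial": 0.8,
    "community_db": 0.6,
    "roon_community": 0.7,
}

def corroborate(claims):
    """Score each fact by the distinct sources that assert it.

    Confidence is 1 - product(1 - weight) over the asserting sources, so
    every extra corroborating source raises confidence without exceeding 1.
    """
    sources_by_fact = defaultdict(set)
    for fact, source in claims:
        sources_by_fact[fact].add(source)

    scores = {}
    for fact, sources in sources_by_fact.items():
        doubt = 1.0
        for source in sources:
            doubt *= 1.0 - SOURCE_WEIGHT[source]
        scores[fact] = 1.0 - doubt
    return scores

# Two sources agree on 1959; a single source claims 1960.
claims = [
    (("Kind of Blue", "release_year", 1959), "label"),
    (("Kind of Blue", "release_year", 1959), "community_db"),
    (("Kind of Blue", "release_year", 1960), "community_db"),
]
scores = corroborate(claims)
```

In this sketch the corroborated value ends up with a higher score than the value asserted by a single source, which is the behavior the paragraph above describes.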
Valence uses an approach which takes into account file names, directory structure, tag data, and file length to deliver high-confidence album and track identifications to Roon.
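As a rough illustration of combining these signals, the sketch below scores a folder of local files against a candidate album using tag titles (falling back to file names) and track lengths. It assumes the files in one directory form one album and are listed in track order. The field names, the equal weighting, and the three-second tolerance are assumptions made for this example, not a description of Valence’s actual identification logic.

```python
from difflib import SequenceMatcher

def track_score(local, candidate, tolerance_s=3.0):
    """Score one local file against one candidate track, in [0, 1]."""
    # Prefer the tag title; fall back to the file name without its extension.
    name = local.get("tag_title") or local["file_name"].rsplit(".", 1)[0]
    title_sim = SequenceMatcher(None, name.lower(), candidate["title"].lower()).ratio()
    # Length agreement: 1.0 for an exact match, falling to 0.0 at the tolerance.
    diff = abs(local["length_s"] - candidate["length_s"])
    length_sim = max(0.0, 1.0 - diff / tolerance_s)
    return 0.5 * title_sim + 0.5 * length_sim

def album_score(local_files, candidate_tracks):
    """Average per-track score, pairing files and tracks in listed order."""
    if not local_files or len(local_files) != len(candidate_tracks):
        return 0.0
    pairs = zip(local_files, candidate_tracks)
    return sum(track_score(l, c) for l, c in pairs) / len(local_files)

# Hypothetical folder contents and two hypothetical candidate albums.
local_files = [
    {"file_name": "01 Opening Theme.flac", "tag_title": "Opening Theme", "length_s": 241.0},
    {"file_name": "02 Night Drive.flac", "tag_title": None, "length_s": 198.0},
]
good = [{"title": "Opening Theme", "length_s": 241.0}, {"title": "Night Drive", "length_s": 199.0}]
bad = [{"title": "Opening Theme (Live)", "length_s": 305.0}, {"title": "Encore", "length_s": 420.0}]

good_score = album_score(local_files, good)
bad_score = album_score(local_files, bad)
```

Here the studio candidate scores well above the live one, even though one file has no title tag, because the file name and the track length still agree with it.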
Most music players interact with streaming services using an API for searching, browsing, and playback; they also depend on those APIs for most or all of their music metadata. Valence actually ingests the full catalogs of integrated music services every day, then determines (in advance) which of its own data should be applied to every stream from each integrated service.
Conventionally, “metadata” implies the name of an album and the tracks it contains. Valence goes further by capturing album- and track-level credits, recording dates, release and reissue dates, label, rating, and review, as well as distinguishing between different versions of an album.
In addition to recording data, Valence aggregates information about performers, composers, producers, and conductors, including their vital stats, biographies, and also places they’ve lived, bands or ensembles they’ve joined, social links, and upcoming concert dates.
In Valence’s data model, recordings (tracks) are instances of compositions, which is an important notion particularly when one artist covers a song by another artist, or (as is frequently true not only in classical and jazz, but also in rock and pop) the composer and performer aren’t the same person.
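One way to picture the recording-versus-composition distinction is a small data model like the one below. The class and field names are hypothetical and are not Valence’s actual schema; the sketch only shows how separating the two makes covers easy to recognize.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Composition:
    title: str
    composers: tuple  # names of the people who wrote the work

@dataclass(frozen=True)
class Recording:
    composition: Composition  # the work this track is an instance of
    performer: str

    def performed_by_composer(self):
        """True when the performer also wrote the composition."""
        return self.performer in self.composition.composers

def recordings_of(composition, recordings):
    """All recordings that are instances of the given composition."""
    return [r for r in recordings if r.composition == composition]

hallelujah = Composition("Hallelujah", ("Leonard Cohen",))
catalog = [
    Recording(hallelujah, "Leonard Cohen"),
    Recording(hallelujah, "Jeff Buckley"),
]
covers = [r for r in recordings_of(hallelujah, catalog) if not r.performed_by_composer()]
```

Because both tracks point at the same composition, a cover is simply a recording whose performer is not among the composers.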
Valence produces internal popularity “charts” at all levels: artist, album, track, and composition. It also generates several more specific charts per genre, and a series dedicated to classical music.
What people do in music informs Valence’s model of the music world. For example, Trent Reznor has different relationships to music he creates when he is fronting Nine Inch Nails, collaborating on a film score with Atticus Ross, or producing an album for Halsey.
A sonata having three movements is no guarantee that there will be three corresponding tracks on an album which features that composition. Understanding the underlying works and their subdivisions lets Valence accurately portray covers of pop songs, American Songbook jazz standards, and multi-part classical works.
Valence’s similarity model maps how similar certain artists, albums, and tracks are to other artists, albums, and tracks. It is based on both user behavior (people who like X also like Y) and expert opinion or ground truth.
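A minimal sketch of blending the two signals might look like the following. The Jaccard overlap for listener behavior, the optional expert score, and the 50/50 weighting are all assumptions made for illustration; the actual model is not described in this document.

```python
def behavioral_similarity(listeners_a, listeners_b):
    """Jaccard overlap between the sets of listeners of two items."""
    union = listeners_a | listeners_b
    if not union:
        return 0.0
    return len(listeners_a & listeners_b) / len(union)

def blended_similarity(listeners_a, listeners_b, expert_score=None, expert_weight=0.5):
    """Combine listener overlap with an optional expert score in [0, 1]."""
    behavior = behavioral_similarity(listeners_a, listeners_b)
    if expert_score is None:
        # No editorial judgment available: rely on behavior alone.
        return behavior
    return expert_weight * expert_score + (1.0 - expert_weight) * behavior

# Hypothetical listener IDs for two artists.
artist_x = {"u1", "u2", "u3"}
artist_y = {"u2", "u3", "u4"}

behavior_only = blended_similarity(artist_x, artist_y)
with_expert = blended_similarity(artist_x, artist_y, expert_score=0.9)
```

In this example an expert score above the behavioral overlap pulls the blended similarity upward, while items with no expert opinion fall back to listener behavior.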
Over the arc of an artist or composer’s career, patterns often appear, illuminating ranges of particularly well-regarded material. Valence models these ranges to create a notion of the “heyday” – times which are especially notable in that person’s career.
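As a simple illustration of finding such ranges, the sketch below groups consecutive years whose ratings meet a threshold. The rating scale, the threshold, and the sample data are hypothetical; this is one plausible way to derive a “heyday”, not necessarily how Valence does it.

```python
def heyday_ranges(ratings_by_year, threshold=4.0):
    """Return (start, end) year ranges of consecutive highly rated years."""
    good_years = sorted(y for y, r in ratings_by_year.items() if r >= threshold)
    ranges = []
    for year in good_years:
        if ranges and year - ranges[-1][1] <= 1:
            # Extend the current range when the year is adjacent to it.
            ranges[-1] = (ranges[-1][0], year)
        else:
            ranges.append((year, year))
    return ranges

# Hypothetical average album ratings per release year for one artist.
ratings = {1965: 4.5, 1966: 4.8, 1967: 3.0, 1975: 4.6}
periods = heyday_ranges(ratings)
```

With this sample data the function yields two ranges, one covering 1965 to 1966 and a separate single-year range for 1975.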
Musicians often evolve over the course of their careers. Valence ascribes scores for “genreness” and “composerness” (among many other vectors) to help understand whether Taylor Swift is a Pop or Country artist, or whether Bob Dylan should be viewed as a performer or a songwriter.
Within the boundaries of metadata-driven connections, Valence uses its models to refine and produce a more nuanced and meaningful set of possible recommendations.
The context in which a recommendation is made can radically impact its accuracy. A great list of R&B albums would be completely different in the context of browsing Aretha Franklin than in the context of Frank Ocean; a list of notable pianists in the context of Chopin has nothing in common with one in the context of Post-bop Jazz.
Valence generates a private model, stored in each user’s profile, which documents the facets of the user’s taste based on their library and listening history. Recommendations are weighted taking this taste profile into account.
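The effect of weighting by a taste profile can be sketched as follows. The profile here is just the share of plays per genre, and the boost formula is an assumption chosen for simplicity; the real profile is described above only as capturing facets of taste from library and listening history.

```python
from collections import Counter

def build_taste_profile(listening_history):
    """Map each genre to its share of plays (one genre label per play)."""
    counts = Counter(listening_history)
    total = sum(counts.values())
    return {genre: n / total for genre, n in counts.items()} if total else {}

def rank(candidates, profile):
    """Order (title, genre, base_score) candidates by taste-weighted score."""
    def weighted(candidate):
        _title, genre, base_score = candidate
        # Boost the base score in proportion to the user's affinity for the genre.
        return base_score * (1.0 + profile.get(genre, 0.0))
    return sorted(candidates, key=weighted, reverse=True)

# Hypothetical listening history and candidate recommendations.
history = ["jazz", "jazz", "jazz", "rock"]
profile = build_taste_profile(history)
candidates = [
    ("Album A", "rock", 0.8),
    ("Album B", "jazz", 0.7),
    ("Album C", "classical", 0.9),
]
ranked = rank(candidates, profile)
```

Although Album C has the highest base score, the jazz-heavy history lifts Album B to the top of the ranking in this example.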
Factual information provides the foundation for recommendations; for example, knowing that two musicians have collaborated is the best evidence of a connection between them. Editorial information (like genre categorizations and album ratings) provides another dimension on which to weigh suggestions.
Irrespective of technical approaches to the underlying data science, domain knowledge in music is the backbone of Valence. What sets it apart are subtleties like the differences between 60s and contemporary R&B, or the fact that Taylor Swift started off country and ended up ruling the pop charts, or that you probably don’t want to hear Beethoven’s string quartets in between movements of his symphonies.
Among the most striking examples of Valence in Roon are the user interface elements that appear in response to context. For example, if you’re looking at an artist who has collaborated often, you may see “featured collaborators”, or if the artist is part of a scene, you may see “other artists from Glasgow”. Roon contains hundreds of user interface elements which are only rendered in contexts with relevant supporting data.
Because local and streaming content are identified in the same way, both are enriched with metadata and interleaved in Roon. This eliminates content silos and allows true intention-driven browsing – the focus is on what you want to hear, not where that thing lives.
New releases are displayed in various contexts in Roon, and in each context the results are completely different. New releases on the home screen are filtered using the taste profile, but those on an artist screen additionally incorporate genre and the artist’s active years for greater relevance.
Every day, Valence produces six unique 25-track mixes for each Roon user. Each mix is themed on an artist that features prominently in the user’s listening history, and combines selections that are likely to be familiar with others chosen to challenge the listener’s taste.
Like any large system, the music streaming supply chain invites abuse. Many “bad actor” labels release inauthentic recordings in an effort to get users to listen to their streams. Valence allows Roon to filter out this low-quality content and display only genuine releases.
In Roon, Focus exposes Valence’s capabilities directly to the user, and enables multi-dimensional filtration of artist discographies and entire user libraries.
Traditionally, searching a set of local files is one thing, and searching a remote data set by API is another. Roon does both with a unified user interface by using Valence’s aggregated database and context-aware search, improving both accuracy and relevance.