Valence
Roon adds value to music by using data, identification, and machine learning to provide rich context for music, and to help listeners engage with it more deeply. Valence is the technology stack that makes this possible.

The transition from physical to digital media holds huge promise for making music experiences richer, but so far that promise has mostly been unfulfilled. The reason (we think) is that there’s more to being a music fan than listening to audio. Fans want to know about music, and understand the people who make it; they want to form intellectual and emotional connections that enrich their own life experiences.

Physical formats like vinyl and CD had liner notes that contained a wealth of information, but downloads and streams have stripped those away. Our goal was to start by restoring all that data, and then go further by leveraging new technologies to accomplish more than was ever possible with physical media.

Valence allows us to do that in several steps. We begin by aggregating various data sources, including commercial metadata, crowdsourced contributions, and listening history from expert listeners. Next we identify every data object the system encounters, whether that’s a file on a hard drive, a stream from a music service, or a mention of a producer in an article about an album. Using these building blocks, Valence forms a database not just of artists, albums, and tracks, but also of composers, works, performances, conductors, ensembles, soloists, labels, and collaborators. This database allows us to build an understanding of the relationships between musical entities, their popularity, similarity, and categories into which they naturally fall. The result is a recommendation methodology that consistently takes into account authoritative metadata, expert opinion, popularity, context awareness, and taste profiles for individual users.

Aggregation

There are many sources of music metadata. Some offer breadth (coverage of a large number of releases) while others have depth (richly detailed or specific data), but no single data source has everything. Numerous specialized sources have just a single data type (concerts or lyrics, for example).

The foundation of Valence – its music database – is created by combining multiple data sets from different sources. Every day, terabytes of information are ingested, deduplicated, and disambiguated; the result is an exhaustive compendium of recordings, performers, composers, and compositions.
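
As a rough illustration of the deduplication step – not Valence’s actual pipeline – the sketch below groups album records from different feeds when they appear to describe the same release; the field names, string-matching method, and threshold are all assumptions.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher

@dataclass
class AlbumRecord:
    source: str            # e.g. "licensed", "crowdsourced"
    artist: str
    title: str
    year: int | None = None

def similarity(a: str, b: str) -> float:
    """Cheap string similarity; a real pipeline would also normalize
    punctuation, transliterations, 'Deluxe Edition' suffixes, etc."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def same_album(x: AlbumRecord, y: AlbumRecord, threshold: float = 0.9) -> bool:
    """Treat two source records as the same release when artist and title
    agree closely and the years (if both known) match."""
    if x.year and y.year and x.year != y.year:
        return False
    return (similarity(x.artist, y.artist) >= threshold
            and similarity(x.title, y.title) >= threshold)

def merge(records: list[AlbumRecord]) -> list[list[AlbumRecord]]:
    """Group records from different feeds into clusters that describe one
    canonical album, which later stages can enrich and disambiguate."""
    clusters: list[list[AlbumRecord]] = []
    for rec in records:
        for cluster in clusters:
            if same_album(cluster[0], rec):
                cluster.append(rec)
                break
        else:
            clusters.append([rec])
    return clusters
```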

Licensed data

We purchase data feeds from a number of commercial providers, including Xperi (formerly All Music Guide), Songkick, and LyricFind. Licensed data is generally detailed and reliable, although its breadth is often limited.

Crowdsourced data

Some providers, like MusicBrainz and Discogs, rely on their users to contribute data. The key advantage of the crowdsourced model is that it provides access to data that isn’t available from commercial sources; there may not be a good economic reason to collect or create certain data, so it’s up to enthusiasts to do it, and they frequently do an outstanding job.

Listening history

Roon users contribute their knowledge just by listening to their favorite music. Because they’re highly opinionated listeners, they plumb the depths of their favorite genres and styles every day, and those connections form the basis of Valence’s models and maps.

Community contribution

Valence has a secret weapon when it comes to clean data: the expertise of the Roon community. Made up of music professionals, audiophiles, and self-professed music nerds, this passionate group has provided our localizations and internet radio directory, and is now beginning to create and curate new data used by Valence.

Identification

Because Valence is designed to be a map of the entire corpus of music ever made, it must be able to identify every recording that someone might listen to. The music data available from any given source (even record labels) is inherently imperfect, so Valence treats no single source as authoritative.

Data identification

The data ingested by Valence come from rights holders (record labels), commercial data providers, not-for-profit data projects, and the Roon community. A piece of data in Valence can come into being from any one of those sources, growing and becoming more detailed as corroborating evidence of its accuracy is found in additional data sources. This unique approach to aggregation means that following a complete daily ingestion, Valence’s index is arguably the largest and most accurate in the world.

File identification

Valence uses an approach which takes into account file names, directory structure, tag data, and file length to deliver high-confidence album and track identifications to Roon.
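
A minimal sketch of this kind of multi-signal identification is shown below; the specific signals, weights, and thresholds are illustrative assumptions, not Roon’s actual scoring.

```python
from dataclasses import dataclass

@dataclass
class TrackEvidence:
    filename_match: float    # 0..1, how well the file name matches a candidate track title
    position_match: bool     # directory/track ordering agrees with the candidate album
    tag_match: float         # 0..1, agreement between embedded tags and candidate metadata
    duration_delta_s: float  # |file length - canonical track length| in seconds

# Illustrative weights; a real system would tune these against labeled data.
WEIGHTS = {"filename": 0.3, "position": 0.2, "tags": 0.35, "duration": 0.15}

def track_confidence(e: TrackEvidence) -> float:
    """Combine weak signals (names, ordering, tags, length) into one score."""
    duration_score = max(0.0, 1.0 - e.duration_delta_s / 5.0)  # within ~5s counts
    return (WEIGHTS["filename"] * e.filename_match
            + WEIGHTS["position"] * (1.0 if e.position_match else 0.0)
            + WEIGHTS["tags"] * e.tag_match
            + WEIGHTS["duration"] * duration_score)

def album_confidence(tracks: list[TrackEvidence]) -> float:
    """An album identification is only as strong as its tracks, taken together."""
    scores = [track_confidence(t) for t in tracks]
    return sum(scores) / len(scores) if scores else 0.0
```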

Stream identification

Most music players interact with streaming services using an API for searching, browsing, and playback; they also depend on those APIs for most or all of their music metadata. Valence instead ingests the full catalogs of integrated music services every day, then determines (in advance) which of its own data should be applied to every stream from each integrated service.
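
Conceptually, this amounts to precomputing a lookup from each service’s stream IDs to Valence’s own canonical IDs. The sketch below is a hypothetical illustration; the catalog shape and the resolve_track helper are assumptions.

```python
# Toy "canonical" index keyed by (artist, title); the real resolution step
# would use the same multi-signal matching described for file identification.
CANONICAL_TRACKS = {
    ("nine inch nails", "hurt"): "valence:track:0001",
}

def resolve_track(artist: str, title: str) -> str | None:
    return CANONICAL_TRACKS.get((artist.lower(), title.lower()))

# Precomputed mapping: (service, stream_id) -> canonical Valence track ID.
stream_to_valence: dict[tuple[str, str], str] = {}

def ingest_catalog(service: str, catalog: list[dict]) -> None:
    """Walk a service's full catalog (assumed to be a list of track dicts)
    once a day and record which canonical track each stream corresponds to."""
    for entry in catalog:
        valence_id = resolve_track(entry["artist"], entry["title"])
        if valence_id:
            stream_to_valence[(service, entry["id"])] = valence_id

def valence_id_for_stream(service: str, stream_id: str) -> str | None:
    """At playback time, rich metadata is a single lookup away rather than
    an on-demand API call."""
    return stream_to_valence.get((service, stream_id))
```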

A richer schema

Valence’s aggregated database is the holy grail: huge breadth of coverage with depth and richness of data.

Recording data

Conventionally, “metadata” implies the name of an album and the tracks it contains. Valence goes further by capturing album- and track-level credits, recording dates, release and reissue dates, label, rating, and review, as well as distinguishing between different versions of an album.

Biographical data

In addition to recording data, Valence aggregates information about performers, composers, producers, and conductors, including their vital statistics, biographies, places they’ve lived, bands or ensembles they’ve joined, social links, and upcoming concert dates.

Composition data

In Valence’s data model, recordings (tracks) are instances of compositions – an important distinction when one artist covers a song by another, or when (as is frequently true not only in classical and jazz, but also in rock and pop) the composer and performer aren’t the same person.
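
A toy version of that model might look like the sketch below, with Johnny Cash’s cover of Trent Reznor’s “Hurt” linking two recordings to one composition; the class and field names are illustrative, not Valence’s actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class Composition:
    """A work, independent of any particular performance of it."""
    title: str
    composers: list[str]
    parts: list[str] = field(default_factory=list)  # e.g. movements of a sonata

@dataclass
class Recording:
    """A specific performance: one track is an instance of a composition."""
    title: str
    performers: list[str]
    composition: Composition

# "Hurt" as written and performed by Trent Reznor, and as covered by Johnny Cash.
hurt = Composition(title="Hurt", composers=["Trent Reznor"])
nin_hurt = Recording("Hurt", ["Nine Inch Nails"], hurt)
cash_hurt = Recording("Hurt", ["Johnny Cash"], hurt)

# Both recordings point at the same composition, so a listener browsing either
# performance can discover the other and see who actually wrote the song.
assert nin_hurt.composition is cash_hurt.composition
```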

Modeling & synthesis

On a foundation of metadata, Valence builds models of relationships between pieces of music and the people who create them. These models make it possible to understand the music world from the perspective of a fan or expert listener.

Granular popularity

Valence produces internal popularity “charts” at all levels: artist, album, track, and composition. It also generates several more specific charts per genre, and a series dedicated to classical music.

Roles & relationships

What people do in music informs Valence’s model of the music world. For example, Trent Reznor has different relationships to music he creates when he is fronting Nine Inch Nails, collaborating on a film score with Atticus Ross, or producing an album for Halsey.

Composition-recording mappings

A sonata having three movements is no guarantee that there will be three corresponding tracks on an album that features that composition. Understanding the underlying works and their subdivisions lets Valence accurately portray covers of pop songs, American Songbook jazz standards, and multi-part classical works.

Similarity model

Valence’s similarity model maps how similar certain artists, albums, and tracks are to other artists, albums, and tracks. It is based on both user behavior (people who like X also like Y) and expert opinion or ground truth.
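
One way to picture this blend – purely as a sketch, not Valence’s actual model – is to compute a behavioral score from library co-occurrence and mix it with an editorial score; the Jaccard normalization and the blend weight are assumptions.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_similarity(libraries: list[set[str]]) -> dict[tuple[str, str], float]:
    """Behavioral signal: artists that appear together in many listeners'
    libraries are likely similar ("people who like X also like Y")."""
    pair_counts: Counter = Counter()
    artist_counts: Counter = Counter()
    for library in libraries:
        artist_counts.update(library)
        pair_counts.update(combinations(sorted(library), 2))
    sims = {}
    for (a, b), both in pair_counts.items():
        # Jaccard-style normalization so very popular artists don't dominate.
        sims[(a, b)] = both / (artist_counts[a] + artist_counts[b] - both)
    return sims

def blended_similarity(behavioral: float, editorial: float, alpha: float = 0.7) -> float:
    """Combine the behavioral score with expert opinion / ground truth
    (e.g. shared genre, documented influence). alpha is illustrative."""
    return alpha * behavioral + (1 - alpha) * editorial
```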

Artist "heyday"

Over the arc of an artist or composer’s career, patterns often emerge that illuminate ranges of particularly well-regarded material. Valence models these ranges to create a notion of a “heyday” – the periods that are especially notable in that person’s career.
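
As an illustrative sketch (the window length and scoring are assumptions, not Valence’s method), a heyday could be found by sliding a window over an artist’s releases weighted by acclaim.

```python
def heyday(releases: list[tuple[int, float]], window: int = 5) -> tuple[int, int]:
    """Given (year, acclaim score) pairs for an artist's releases, return the
    `window`-year span with the highest total acclaim."""
    if not releases:
        raise ValueError("no releases")
    years = [y for y, _ in releases]
    best_start, best_score = min(years), float("-inf")
    for start in range(min(years), max(years) + 1):
        score = sum(s for y, s in releases if start <= y < start + window)
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_start + window - 1
```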

X-ness

Musicians often evolve over the course of their careers. Valence assigns scores for “genreness” and “composerness” (among many other dimensions) to help determine whether Taylor Swift is a Pop or Country artist, or whether Bob Dylan should be viewed as a performer or a songwriter.
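
A toy version of such scoring might normalize weighted credits across descriptive facets; the facet names and weights below are purely illustrative inputs, not real data.

```python
from collections import defaultdict

def x_ness(credits: list[tuple[str, float]]) -> dict[str, float]:
    """credits: (facet, weight) pairs, e.g. ("pop", 1.0) for a pop album credit
    or ("songwriter", 0.5) for a co-writing credit. Returns normalized scores."""
    totals: dict[str, float] = defaultdict(float)
    for facet, weight in credits:
        totals[facet] += weight
    norm = sum(totals.values()) or 1.0
    return {facet: value / norm for facet, value in totals.items()}

# Illustrative only: score a catalog along several facets at once.
scores = x_ness([("country", 4.0), ("pop", 6.0), ("songwriter", 8.0)])
# -> both "pop-ness" and "country-ness" are nonzero, with one dominant
```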

Recommendation

Valence takes five factors into account when providing music recommendations. Used together, these allow for musically sensitive user-centric suggestions that are uncanny in their accuracy.

Relevance & popularity models

Within the boundaries of metadata-driven connections, Valence uses its models to refine and produce a more nuanced and meaningful set of possible recommendations.

Context-awareness

The context in which a recommendation is made can radically impact its accuracy. A great list of R&B albums looks completely different in the context of Aretha Franklin than in the context of Frank Ocean; a list of notable pianists in the context of Chopin has nothing in common with one in the context of Post-bop Jazz.

Taste profile

Valence generates a private model, stored in each user’s profile, which documents the facets of the user’s taste based on their library and listening history. Recommendations are weighted taking this taste profile into account.
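
A minimal sketch of that weighting, with assumed field names and an illustrative blend weight, might re-rank candidate recommendations against per-user facet affinities.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    album_id: str
    base_score: float         # relevance from the models described above
    facets: dict[str, float]  # e.g. {"jazz": 0.8, "vocal": 0.2}

def rerank(candidates: list[Candidate],
           taste: dict[str, float],
           taste_weight: float = 0.4) -> list[Candidate]:
    """Re-order recommendation candidates using a per-user taste profile.
    `taste` maps facets to affinities learned from library and listening
    history; the blending weight is an illustrative assumption."""
    def personalized(c: Candidate) -> float:
        affinity = sum(taste.get(f, 0.0) * w for f, w in c.facets.items())
        return (1 - taste_weight) * c.base_score + taste_weight * affinity
    return sorted(candidates, key=personalized, reverse=True)
```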

Authoritative metadata

Factual information provides the foundation for recommendations; for example, knowing that two musicians have collaborated is the best evidence of a connection between them. Editorial information (like genre categorizations and album ratings) provides another dimension on which to weigh suggestions.

Expert opinion

Irrespective of technical approaches to the underlying data science, domain knowledge in music is the backbone of Valence. What sets it apart are subtleties like the differences between 60s and contemporary R&B, the fact that Taylor Swift started off country and ended up ruling the pop charts, or that you probably don’t want to hear Beethoven’s string quartets in between movements of his symphonies.

Valence at work in Roon

Data-aware user interface

One of the most striking examples of Valence in Roon is the set of user interface elements that appear in response to context. For example, if you’re looking at an artist who has collaborated often, you may see “featured collaborators”, or if the artist is part of a scene, you may see “other artists from Glasgow”. Roon contains hundreds of user interface elements that are rendered only in contexts with relevant supporting data.

Interleaved library

Because local and streaming content are identified in the same way, both are enriched with metadata and interleaved in Roon. This eliminates content silos and allows true intention-driven browsing – the focus is on what you want to hear, not where that thing lives.

New Releases

New releases are displayed in various contexts in Roon, and in each context the results are completely different. New releases on the home screen are filtered using the taste profile, but those on an artist screen additionally incorporate genre and the artist’s active years for greater relevance.

Daily Mixes

Every day, Valence produces six unique 25-track mixes for each Roon user. Each mix is themed on an artist that features prominently in the user’s listening history, and combines selections that are likely to be familiar with some that will challenge the listener’s taste.
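
Purely as an illustration of that familiar/adventurous balance (the 25-track length comes from above; the 70/30 split and function shape are assumptions), one themed mix could be assembled like this.

```python
import random

def daily_mix(seed_artist: str,
              familiar: list[str],
              adventurous: list[str],
              length: int = 25,
              familiar_share: float = 0.7,
              rng: random.Random | None = None) -> list[str]:
    """Assemble one themed mix: mostly tracks the listener already knows
    (drawn from their history around `seed_artist`), plus a smaller share of
    similar-but-new tracks that stretch their taste."""
    rng = rng or random.Random(seed_artist)  # seeded for reproducibility in this toy version
    n_familiar = int(length * familiar_share)
    picks = (rng.sample(familiar, min(n_familiar, len(familiar)))
             + rng.sample(adventurous, min(length - n_familiar, len(adventurous))))
    rng.shuffle(picks)
    return picks[:length]
```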

Spam filter

Like any large system, the music streaming supply chain invites abuse. Many “bad actor” labels release inauthentic recordings in an effort to get users to listen to their streams. Valence allows Roon to filter out this low-quality content and display only genuine releases.

Focus

In Roon, Focus exposes Valence’s capabilities directly to the user, and enables multi-dimensional filtration of artist discographies and entire user libraries.

Search

Traditionally, searching a set of local files is one thing, and searching a remote data set by API is another. Roon does both with a unified user interface by using Valence’s aggregated database and context-aware search, improving both accuracy and relevance.