Integrating Chilean music websites and databases into MusicBrainz
Over the last five years, several initiatives have been developed in Chile to create databases and websites devoted to Chilean music. These projects provide public access to information about music, such as discographies, artist biographies, and album and concert reviews. This information gives musicians another way of promoting their work, especially artists and genres that are underrepresented in the Chilean media, such as classical and folk music. Furthermore, the information on these websites, mostly written by specialized music journalists, helps to preserve Chilean music history. However, most of these initiatives have been funded by the Government of Chile and the Sociedad Chilena del Derecho de Autor (SCD, the Chilean copyright society) as short-term projects, so their resources are limited and will last only a couple of years. In addition, the information in these websites and databases is neither shared nor linked among them, so it is not necessarily consistent, and the efforts of the people working on these projects are not optimized. As a result, there is currently no way to run a common query that finds and retrieves all the data for a specific artist, album, or song.
To overcome these problems, this project has two main goals:
- To assign a unique MusicBrainz identifier (MBID) to every Chilean musical work, author, and album.
- To create a common repository of Chilean music-related data.
Once these music entities have a unique identifier, they can be searched through publicly available application programming interfaces (APIs) that use MBIDs. With both goals achieved, all Chilean artists, songs, and albums will be searchable by MBID, and a single query on one website will retrieve all related data and entities.
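As a rough illustration of how an MBID can be used with a public API, the sketch below checks that a string is a well-formed MBID (MBIDs are UUIDs) and builds a lookup URL for the MusicBrainz web service (version 2, JSON output). The helper names `is_valid_mbid` and `lookup_url` and the placeholder MBID are assumptions made for this example; no request is actually sent.

```python
import uuid
from urllib.parse import urlencode

# Root of the MusicBrainz web service, version 2.
MB_WS_ROOT = "https://musicbrainz.org/ws/2"

# Entity types covered by this sketch (the web service supports more).
ENTITY_TYPES = {"artist", "release", "recording", "work"}


def is_valid_mbid(value):
    """Return True if value is a UUID written in canonical 8-4-4-4-12 form."""
    try:
        return str(uuid.UUID(value)) == value.lower()
    except (ValueError, AttributeError, TypeError):
        return False


def lookup_url(entity, mbid, inc=()):
    """Build a lookup URL for one entity; hypothetical helper, sends nothing."""
    if entity not in ENTITY_TYPES:
        raise ValueError(f"unsupported entity type: {entity!r}")
    if not is_valid_mbid(mbid):
        raise ValueError(f"not a well-formed MBID: {mbid!r}")
    params = {"fmt": "json"}
    if inc:
        # Sub-queries are space-separated; urlencode writes the space as '+'.
        params["inc"] = " ".join(inc)
    return f"{MB_WS_ROOT}/{entity}/{mbid}?{urlencode(params)}"


# Placeholder MBID, well-formed but not a real MusicBrainz entity.
EXAMPLE_MBID = "01234567-89ab-cdef-0123-456789abcdef"
print(lookup_url("artist", EXAMPLE_MBID, inc=("aliases", "tags")))
```

Any client that actually sends such requests should identify itself with a meaningful User-Agent header and respect the rate limits of the MusicBrainz service.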
Methodology and milestones
This project will be structured in five major parts:
- A survey of the major sources of valuable data
- Data collection and scraping
- The verification, correction and normalization of the data according to the MusicBrainz (MB) style guide
- The uploading of the corrected data to MB
- The development of a website for searching all the data in a centralized repository
The milestones for the project are:
- Survey of databases and websites with structured Chilean music data. Due on January 23.
- Data harvesting: in this stage, all the valuable information from the selected websites will be retrieved. Due on February 6.
- Data verification, correction and normalization according to the MusicBrainz style guide. Due on March 5.
- Corrected data upload to MB. Due on March 19.
- Website development: this last stage will be devoted to the creation of a website for searching and accessing the data for all Chilean music. The website will show all relevant data harvested from websites and databases in a centralized site. Due on April 16.
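The verification and normalization stage will have to detect when records harvested from different sites refer to the same entity. The sketch below shows one possible first step, assuming each harvested record is a dictionary with a `name` field: it builds a comparison key (accents removed, case folded, whitespace collapsed) and groups records that share it. The function names, the record layout, and the sample records are illustrative assumptions, and the MusicBrainz style guide would still govern the final form of each name.

```python
import unicodedata
from collections import defaultdict


def match_key(name):
    """Comparison key: strip accents, fold case, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.casefold().split())


def group_candidates(records):
    """Group records whose names share a key; groups are candidates for manual review."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record["name"])].append(record)
    return dict(groups)


# Illustrative records, as they might arrive from two of the source sites.
harvested = [
    {"source": "musicapopular.cl", "name": "Víctor Jara"},
    {"source": "portaldisc.cl", "name": "VICTOR  JARA"},
    {"source": "musicapopular.cl", "name": "Violeta Parra"},
]

for key, group in group_candidates(harvested).items():
    print(key, [record["source"] for record in group])
```

The key is used only for matching; the original spelling of each name is kept in the record, so no accents are lost in the data that is eventually uploaded.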
Availability of resources
The most valuable resource for the project is the “Base de Datos de la Música Chilena” (BDCH, the Chilean music database, http://bdch.musica.cl/login.php), a database administered and periodically updated by the SCD. Additional data sources are the websites http://musicapopular.cl, http://mus.cl, http://portaldisc.cl and http://www.vccl.tv/wp2/, all of them sites centred on Chilean music with mostly up-to-date information.
Materials to be submitted at the end of the project
The material to be submitted at the end of the project will be the code for harvesting all the valuable information from the databases and websites, the code for interacting with the MB API, and a running website for querying Chilean music entities that will show all the harvested information.
Substantial outcomes of this project will be:
- To assign a unique MBID to all authors, songs, and albums in the Chilean music corpus.
- To develop a website where a single query about any Chilean music author, song, or album retrieves the information harvested from several websites and databases. In addition, using the MBIDs with other sites that offer APIs will make it possible to retrieve extra information.
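The single-query behaviour described in the second outcome could be sketched as follows, assuming harvested records are stored under the MBID of the entity they describe. The class `ChileanMusicRepository`, its methods, and the sample data are hypothetical; a real implementation would sit on a persistent database rather than in memory.

```python
from collections import defaultdict


class ChileanMusicRepository:
    """In-memory stand-in for the centralized repository (hypothetical design)."""

    def __init__(self):
        self._records = defaultdict(list)  # MBID -> [(source, record), ...]
        self._names = defaultdict(set)     # case-folded name -> {MBID, ...}

    def add(self, mbid, source, record):
        """Store one harvested record under the MBID of its entity."""
        self._records[mbid].append((source, record))
        self._names[record["name"].casefold()].add(mbid)

    def query(self, name):
        """Return every stored record, from every source, for entities with this name."""
        mbids = self._names.get(name.casefold(), set())
        return {mbid: list(self._records[mbid]) for mbid in sorted(mbids)}


# Placeholder MBID and records, for illustration only.
MBID = "01234567-89ab-cdef-0123-456789abcdef"
repo = ChileanMusicRepository()
repo.add(MBID, "musicapopular.cl", {"name": "Artista de Ejemplo", "kind": "biography"})
repo.add(MBID, "portaldisc.cl", {"name": "Artista de Ejemplo", "kind": "discography"})

for mbid, entries in repo.query("artista de ejemplo").items():
    print(mbid, [source for source, _ in entries])
```

Keying the store by MBID means that records from different sites are merged as soon as they are linked to the same identifier, and the same identifier can then be passed to external APIs.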
Achieving these goals could be the starting point for future projects that build on an entire, correctly catalogued music corpus: that of Chilean music.