Extracting and parsing Spanish-formatted dates

We have collected birth and death dates for many of the Chilean artists in our databases. Most of this data comes from musicapopular.cl. However, the data follows no formatting rules, and it seems that different people entered it using different criteria. In other words, the Spanish dates come in several different styles.

As MB supports date fields in the form YYYY-MM-DD, we developed a script that uses regular expressions to convert all dates to this format. Done.
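The kind of conversion described above can be sketched as follows. This is only an illustration, not our actual script: the input styles it handles ("12 de marzo de 1985", "marzo de 1985", "12/03/1985", "1985") are assumptions about what such data might look like, and the function name parse_spanish_date is hypothetical. Partial dates are returned as YYYY-MM or YYYY.

```python
import re

# Spanish month names mapped to month numbers.
MONTHS = {
    "enero": 1, "febrero": 2, "marzo": 3, "abril": 4, "mayo": 5, "junio": 6,
    "julio": 7, "agosto": 8, "septiembre": 9, "setiembre": 9,
    "octubre": 10, "noviembre": 11, "diciembre": 12,
}
MONTH_ALT = "|".join(MONTHS)

# Assumed input styles (hypothetical, not taken from the real data set).
FULL = re.compile(r"(\d{1,2})\s+de\s+(%s)\s+(?:de\s+|del\s+)?(\d{4})" % MONTH_ALT, re.I)
MONTH_YEAR = re.compile(r"(%s)\s+(?:de\s+|del\s+)?(\d{4})" % MONTH_ALT, re.I)
NUMERIC = re.compile(r"(\d{1,2})[/.-](\d{1,2})[/.-](\d{4})")  # assumed day/month/year
YEAR = re.compile(r"\b(\d{4})\b")


def parse_spanish_date(text):
    """Return 'YYYY-MM-DD', 'YYYY-MM' or 'YYYY', or None if nothing matches."""
    m = FULL.search(text)
    if m:
        day, month, year = m.groups()
        return "%s-%02d-%02d" % (year, MONTHS[month.lower()], int(day))
    m = MONTH_YEAR.search(text)
    if m:
        month, year = m.groups()
        return "%s-%02d" % (year, MONTHS[month.lower()])
    m = NUMERIC.search(text)
    if m:
        day, month, year = m.groups()
        return "%s-%02d-%02d" % (year, int(month), int(day))
    m = YEAR.search(text)
    if m:
        return m.group(1)
    return None
```

The patterns are tried from most to least specific, so a full date is never reduced to its year by accident.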

Digging into MusicBrainz NGS webservices

I have been dealing with the following problem: when searching for the artist ‘Dogma’ on the MB website, I obtain several artists with that name.

For our project I am only interested in the Chilean artist, which is the one whose disambiguation comment field contains the note ‘Chilean artist’.

However, when searching for ‘Dogma’ using the musicbrainzngs.search_artists method, the output is:

{'artist-list': [{'alias-list': ['Dogma'],
'id': '87373e74-74ca-4a0e-af24-2e17ab83f6f5',
'name': 'Dogma',
'sort-name': 'Dogma',
'type': 'Group'},
{'alias-list': ['Dogma'],
'id': '30fad333-2d95-4650-b27e-7c3147254105',
'name': 'Dogma',
'sort-name': 'Dogma',
'type': 'Group'},
{'alias-list': ['Dogma'],
'id': '66b7aa34-3117-42d7-b108-942ba99ba30b',
'name': 'Dogma',
'sort-name': 'Dogma'},
{'alias-list': ['Dogma'],
'id': '2b582ed9-2776-4f9f-9895-3ee0e9962f8e',
'name': 'Dogma',
'sort-name': 'Dogma'},
{'alias-list': [u'D\xf8gma'],
'id': '02a66935-f631-43cf-9788-15ef1e19f28a',
'name': u'D\xf8gma',
'sort-name': u'D\xf8gma'},
{'alias-list': ['Dogma'],
'id': '5839ff7d-88af-45c6-be93-a8f29b276f70',
'name': 'Dogma',
'sort-name': 'Dogma'},
{'alias-list': ['Dogma'],
'id': 'a6746c54-bdbc-4691-b8f5-8dabfab788cd',
'life-span': {'begin': '1996', 'end': '2003'},
'name': 'Dogma',
'sort-name': 'Dogma',
'type': 'Group'},
{'alias-list': ['Dogma'],
'id': 'dfd4ed8a-5626-4826-97ba-22905a9e22ba',
'life-span': {'end': '1996'},
'name': 'Dogma',
'sort-name': 'Dogma',
'type': 'Group'},
{'alias-list': ['Dogma', 'Dogma Crew'],
'country': 'ES',
'id': 'c98ecf9f-5572-4317-b15a-79cde78698ac',
'name': 'Dogma Crew',
'sort-name': 'Dogma Crew',
'type': 'Group'},
{'alias-list': ['Dogma 3000'],
'id': '8712f8c3-8f82-4f5a-a1f1-5702651f497a',
'name': 'Dogma 3000',
'sort-name': 'Dogma 3000'},
{'alias-list': ['Dogma 1'],
'id': '1a328d22-a2d4-43c6-9a92-489c23e2e042',
'name': 'Dogma 1',
'sort-name': 'Dogma 1'},
{'alias-list': ['The Dogma'],
'country': 'IT',
'id': '87067d59-89cf-4549-8d8e-28f503a563fe',
'life-span': {'begin': '1999'},
'name': 'The Dogma',
'sort-name': 'Dogma, The',
'type': 'Group'},
{'alias-list': ['Dogma Hollow'],
'id': 'ba8833ab-56bc-4c1f-8a60-a077c30d8a51',
'name': 'Dogma Hollow',
'sort-name': 'Dogma Hollow'},
{'alias-list': ['Dogma Cats'],
'country': 'GB',
'id': '236df439-6f5c-4280-bf8a-40ac44448350',
'name': 'Dogma Cats',
'sort-name': 'Dogma Cats',
'tag-list': [{'count': '1', 'name': 'uk'},
{'count': '1', 'name': 'england'},
{'count': '1', 'name': 'cambridge'}],
'type': 'Group'},
{'alias-list': ['Hot Dogma'],
'id': 'f41d67c5-a2e5-4a25-af96-39a91b72693b',
'life-span': {'begin': '2010'},
'name': 'Hot Dogma',
'sort-name': 'Hot Dogma',
'type': 'Group'},
{'alias-list': ['Dogma and The Afro-Cubans Rhythms',
'Dogma & The Afro-Cuban Rhythms'],
'id': 'cf264d63-a810-4ce0-8357-3b6a513cd7a2',
'name': 'Dogma & The Afro-Cuban Rhythms',
'sort-name': 'Dogma & The Afro-Cuban Rhythms',
'tag-list': [{'count': '1', 'name': 'splitme'}],
'type': 'Group'},
{'id': '5a73a61e-a9bc-4dfe-83e1-756e842c616b',
'name': 'Falso Dogma',
'sort-name': 'Falso Dogma',
'type': 'Group'}]}

As the output shows, the MB NGS Python module does not return the disambiguation field by default, so I modified my copy of the distribution to retrieve it.
Now, when I query MB for ‘Dogma’:

m.search_artists('Dogma', limit = 1, offset = 2)
http://musicbrainz.org/ws/2/artist/?query=Dogma&limit=1&offset=2

I obtain

{'artist-list': [{'alias-list': ['Dogma'],
'disambiguation': 'Chilean artist',
'id': '66b7aa34-3117-42d7-b108-942ba99ba30b',
'name': 'Dogma',
'sort-name': 'Dogma'}]}

This is what I am looking for. From this point on, I just need to iterate over a number of artists and check whether any of them has:

  • a ‘country’: ‘CL’ value
  • or ‘chile’ within the value of any key:value pair, matched case-insensitively so that ‘Chilean artist’ is caught (re.search('chile', value, re.IGNORECASE))
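The two checks above could look like the sketch below. The helper names looks_chilean and chilean_artists are hypothetical, and the sample dictionary simply reuses two entries from the outputs shown earlier; only top-level string values are inspected, which is an assumption about where the word would appear.

```python
import re


def looks_chilean(artist):
    """True if the artist has country 'CL' or a string value mentioning 'chile'."""
    if artist.get("country") == "CL":
        return True
    for value in artist.values():
        # Only top-level strings are checked (assumption); lists such as
        # 'alias-list' or 'tag-list' are skipped in this sketch.
        if isinstance(value, str) and re.search("chile", value, re.IGNORECASE):
            return True
    return False


def chilean_artists(result):
    """Filter the 'artist-list' of a search result down to Chilean candidates."""
    return [a for a in result.get("artist-list", []) if looks_chilean(a)]


# Two entries copied from the outputs above.
sample = {"artist-list": [
    {"alias-list": ["Dogma"],
     "disambiguation": "Chilean artist",
     "id": "66b7aa34-3117-42d7-b108-942ba99ba30b",
     "name": "Dogma",
     "sort-name": "Dogma"},
    {"alias-list": ["Dogma", "Dogma Crew"],
     "country": "ES",
     "id": "c98ecf9f-5572-4317-b15a-79cde78698ac",
     "name": "Dogma Crew",
     "sort-name": "Dogma Crew",
     "type": "Group"},
]}

matches = chilean_artists(sample)
```

Note that a case-sensitive re.search('chile', 'Chilean artist') finds nothing, which is why the flag is passed.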

However, I have had a second problem when searching with the same search_artists method:

search_artists(query='', limit=None, offset=None, **fields)

Specifying these key:value pairs for **fields:
{'tags':'uk', 'tags':'england', 'country':'GB'}
and running this query:
m.search_artists('Dogma', {'tags':'uk', 'tags':'england', 'country':'GB'})
I get the same list as before, so these extra fields are not narrowing the search:

{'artist-list': [{'alias-list': [u'D\xf8gma'],
'id': '02a66935-f631-43cf-9788-15ef1e19f28a',
'name': u'D\xf8gma',
'sort-name': u'D\xf8gma'},
{'alias-list': ['Dogma'],
'id': '5839ff7d-88af-45c6-be93-a8f29b276f70',
'name': 'Dogma',
'sort-name': 'Dogma'},
{'alias-list': ['Dogma'],
'id': 'a6746c54-bdbc-4691-b8f5-8dabfab788cd',
'life-span': {'begin': '1996', 'end': '2003'},
'name': 'Dogma',
'sort-name': 'Dogma',
'type': 'Group'},
{'alias-list': ['Dogma'],
'id': 'dfd4ed8a-5626-4826-97ba-22905a9e22ba',
'life-span': {'end': '1996'},
'name': 'Dogma',
'sort-name': 'Dogma',
'type': 'Group'},
{'alias-list': ['Dogma'],
'id': '87373e74-74ca-4a0e-af24-2e17ab83f6f5',
'name': 'Dogma',
'sort-name': 'Dogma',
'type': 'Group'},
{'alias-list': ['Dogma'],
'id': '30fad333-2d95-4650-b27e-7c3147254105',
'name': 'Dogma',
'sort-name': 'Dogma',
'type': 'Group'},
{'alias-list': ['Dogma'],
'id': '66b7aa34-3117-42d7-b108-942ba99ba30b',
'name': 'Dogma',
'sort-name': 'Dogma'},
{'alias-list': ['Dogma'],
'id': '2b582ed9-2776-4f9f-9895-3ee0e9962f8e',
'name': 'Dogma',
'sort-name': 'Dogma'},
{'alias-list': ['Dogma', 'Dogma Crew'],
'country': 'ES',
'id': 'c98ecf9f-5572-4317-b15a-79cde78698ac',
'name': 'Dogma Crew',
'sort-name': 'Dogma Crew',
'type': 'Group'},
{'alias-list': ['Dogma 3000'],
'id': '8712f8c3-8f82-4f5a-a1f1-5702651f497a',
'name': 'Dogma 3000',
'sort-name': 'Dogma 3000'},
{'alias-list': ['Dogma Cats'],
'country': 'GB',
'id': '236df439-6f5c-4280-bf8a-40ac44448350',
'name': 'Dogma Cats',
'sort-name': 'Dogma Cats',
'tag-list': [{'count': '1', 'name': 'uk'},
{'count': '1', 'name': 'england'},
{'count': '1', 'name': 'cambridge'}],
'type': 'Group'},
{'alias-list': ['Hot Dogma'],
'id': 'f41d67c5-a2e5-4a25-af96-39a91b72693b',
'life-span': {'begin': '2010'},
'name': 'Hot Dogma',
'sort-name': 'Hot Dogma',
'type': 'Group'},
{'alias-list': ['Dogma 1'],
'id': '1a328d22-a2d4-43c6-9a92-489c23e2e042',
'name': 'Dogma 1',
'sort-name': 'Dogma 1'},
{'alias-list': ['The Dogma'],
'country': 'IT',
'id': '87067d59-89cf-4549-8d8e-28f503a563fe',
'life-span': {'begin': '1999'},
'name': 'The Dogma',
'sort-name': 'Dogma, The',
'type': 'Group'},
{'alias-list': ['Dogma Hollow'],
'id': 'ba8833ab-56bc-4c1f-8a60-a077c30d8a51',
'name': 'Dogma Hollow',
'sort-name': 'Dogma Hollow'},
{'alias-list': ['Dogma and The Afro-Cubans Rhythms',
'Dogma & The Afro-Cuban Rhythms'],
'id': 'cf264d63-a810-4ce0-8357-3b6a513cd7a2',
'name': 'Dogma & The Afro-Cuban Rhythms',
'sort-name': 'Dogma & The Afro-Cuban Rhythms',
'tag-list': [{'count': '1', 'name': 'splitme'}],
'type': 'Group'},
{'id': '5a73a61e-a9bc-4dfe-83e1-756e842c616b',
'name': 'Falso Dogma',
'sort-name': 'Falso Dogma',
'type': 'Group'}]}

A third problem is that if I do:
m.search_artists('Dogma', limit = 1, {'tags':'uk', 'tags':'england', 'country':'GB'})
I obtain this error:
SyntaxError: non-keyword arg after keyword arg (, line 1)
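This error is raised by Python itself rather than by the module: a positional argument may not follow a keyword argument, so the dictionary would have to be unpacked with ** (or its entries passed as keyword arguments) to reach **fields.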
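To see how Python binds these arguments without touching the network, here is a small sketch. The function below is a hypothetical stand-in that only copies the signature quoted above and reports what it received; it is not the module's code, and whether the real module accepts ‘tags’ as a search field is not checked here.

```python
def search_artists(query='', limit=None, offset=None, **fields):
    """Hypothetical stand-in: same signature as quoted above, no network access."""
    return {'query': query, 'limit': limit, 'offset': offset, 'fields': fields}


filters = {'tags': 'england', 'country': 'GB'}

# Passed positionally, the dictionary binds to `limit`, not to **fields.
positional = search_artists('Dogma', filters)

# Unpacked with **, each entry becomes a keyword argument collected by **fields.
unpacked = search_artists('Dogma', limit=1, **filters)

# A dict literal keeps only the last value given for a repeated key.
duplicated = {'tags': 'uk', 'tags': 'england', 'country': 'GB'}
```

In the positional call, fields stays empty, which would be consistent with the second problem: the filters never reached the search. The repeated ‘tags’ key is a separate issue, since only ‘england’ survives in the dictionary.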

I’ve been taking a closer look at the syntax for advanced queries in MB, and it is possible to create complex queries such as:

Advanced query syntax : dogma (comment:chile*) (country:CL)

or in the web browser:

http://musicbrainz.org/search?query=dogma+%28comment%3Achile*%29+%28country%3ACL%29&type=artist&limit=25&advanced=1

This returns:

So I will try to replicate this syntax in my queries within my scripts:

http://musicbrainz.org/search?query=supernova+(comment:chile*)+(country:CL)&type=artist&limit=5&advanced=1
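A minimal sketch of how such URLs could be generated from a script, using only the standard library (Python 3). The function names advanced_artist_query and search_url are hypothetical; the URL layout and parameters are copied from the browser examples above, and the asterisk is left unescaped only to match them.

```python
from urllib.parse import urlencode


def advanced_artist_query(name, comment=None, country=None):
    """Build a query string in the 'name (comment:...) (country:...)' style."""
    parts = [name]
    if comment:
        parts.append("(comment:%s)" % comment)
    if country:
        parts.append("(country:%s)" % country)
    return " ".join(parts)


def search_url(query, limit=25):
    """Return the website search URL for an advanced artist query."""
    params = {"query": query, "type": "artist", "limit": limit, "advanced": 1}
    # safe="*" keeps the wildcard readable, as in the URLs shown above.
    return "http://musicbrainz.org/search?" + urlencode(params, safe="*")


url = search_url(advanced_artist_query("dogma", comment="chile*", country="CL"))
```

Keeping the query builder separate from the URL builder means the same query string could later be handed to the web service instead of the website, if it turns out to accept this syntax.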