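[Posted]: 2012-12-29 14:52:56
[Question]:
I'm new to Python and am trying to fetch JSON data from musicbrainz with urllib. I can parse some of the data, but for some artists certain fields/keys are missing from the JSON, presumably because they are optional. I'm not sure how to filter the fetched JSON data on additional criteria.
In the example below, I need to keep only the entries where release-list>>release>>release-group>>type is "Single". The fetched JSON can contain up to 50 entries, and I want to narrow it down to just those in the "Single" category. Please let me know how to do this. Thanks!
Sample JSON: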
{
  "created": "2012-12-27T13:22:55.834Z",
  "recording-list": {
    "count": 3,
    "offset": 0,
    "recording": [{
      "score": "100",
      "title": "Stronger",
      "artist-credit": {
        "name-credit": [{
          "artist": {
            "name": "Britney Spears",
            "sort-name": "Spears, Britney"
          }
        }]
      },
      "release-list": {
        "release": [{
          "id": "13c5511f-1f99-4ffe-97d5-562c05e9d8d5",
          "title": "Hit Hammer 2001 (disc 1)",
          "status": "Official",
          "artist-credit": {
            "name-credit": [{
              "artist": {
                "id": "89ad4ac3-39f7-470e-963a-56509c546377",
                "name": "Various Artists"
              }
            }]
          },
          "release-group": {
            "id": "6c4c2cc3-3d8e-3a19-9d46-da076c34b6e9",
            "type": "Compilation",
            "primary-type": "Album",
            "secondary-type-list": {
              "secondary-type": ["Compilation"]
            }
          },
          "medium-list": {
            "track-count": 20,
            "medium": [{
              "position": 1,
              "track-list": {
                "count": 20,
                "offset": 0,
                "track": [{
                  "number": "1",
                  "title": "Stronger",
                  "length": 203266
                }]
              }
            }]
          }
        }]
      }
    }, {
      "id": "feb9acbf-1d3d-4395-9512-bfbdcfa72eb9",
      "score": "100",
      "title": "Stronger",
      "artist-credit": {
        "name-credit": [{
          "joinphrase": "",
          "artist": {
            "name": "Britney Spears",
            "sort-name": "Spears, Britney"
          }
        }]
      },
      "release-list": {
        "release": [{
          "id": "45e2a271-2f6b-4029-b11e-b6d94d169f9a",
          "title": "Stronger: The Remixes",
          "status": "Official",
          "release-group": {
            "id": "4d018ba8-f05e-4817-8c70-34307161a0fc",
            "type": "Single",
            "primary-type": "Single"
          },
          "date": "2000-12-12",
          "country": "US",
          "medium-list": {
            "track-count": 6,
            "medium": [{
              "position": 1,
              "format": "CD",
              "track-list": {
                "count": 6,
                "offset": 0,
                "track": [{
                  "number": "1",
                  "title": "Stronger",
                  "length": 203000
                }]
              }
            }]
          }
        }]
      },
      "puid-list": {
        "puid": [{
          "id": "28550845-c68a-314d-90c1-010dff730f4a"
        }]
      }
    }]
  }
}
Python code:
import urllib  # Python 2: provides quote_plus and urlopen
import simplejson

def get_mbid(artist, song):
    artist = urllib.quote_plus(artist)
    song = urllib.quote_plus(song)
    recording_url = 'http://search.musicbrainz.org/ws/2/recording/?&fmt=json&query=artist:"'+artist+'"%20AND%20recording:"'+song+'"'
    search_results = urllib.urlopen(recording_url)
    json = simplejson.loads(search_results.read())
    search_results.close()
    if json['recording-list']['count'] == 0:
        # get_mbid_artist is not shown in this snippet
        return get_mbid_artist(artist)
    else:
        recordings = json['recording-list']['recording']
        for recording in recordings:
            # Raises KeyError when an artist entry has no 'id' key
            mbid = recording['artist-credit']['name-credit'][0]['artist']['id']
            print mbid
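To illustrate the missing-key problem described above, here is a minimal Python 3 sketch of a lookup that tolerates absent fields. The helper name `dig` and the trimmed `recordings` data are assumptions made up for this illustration (modeled on the sample JSON); they are not part of musicbrainz or of the code above.

```python
def dig(data, *path, default=None):
    """Walk nested dicts/lists along `path`; return `default` if any step is missing.

    `dig` is a hypothetical helper written for this example.
    """
    current = data
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            # Key absent, list too short, or value is not indexable this way.
            return default
    return current


# Trimmed-down, illustrative recordings modeled on the sample JSON.
recordings = [
    {
        "title": "Stronger",
        # No "id" under "artist" here, as in the first sample recording.
        "artist-credit": {"name-credit": [{"artist": {"name": "Britney Spears"}}]},
    },
    {
        "title": "Stronger",
        "artist-credit": {
            "name-credit": [
                {"artist": {"id": "89ad4ac3-39f7-470e-963a-56509c546377",
                            "name": "Various Artists"}}
            ]
        },
    },
]

# Collect the artist id where present; None marks recordings without one.
artist_ids = [
    dig(rec, "artist-credit", "name-credit", 0, "artist", "id")
    for rec in recordings
]
```

With this approach the loop can skip entries whose id comes back as `None` instead of stopping on a `KeyError`.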
[Comments]:
-
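You really want to filter on primary-type; type is deprecated and will be replaced by primary-type and secondary-type. Why not add the filter to the query URL? Adding &primary-type=Single should work.
-
Hi Martijn, thanks for the suggestion, but it doesn't seem to work: even when I pass the primary type in the URL, the JSON dump still looks the same. musicbrainz.org/ws/2/…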
-
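Right, I suspect it is returning any recording that has at least one single release. What kind of filtering do you need? You can always loop over the releases and only process the ones that are singles, right?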
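A rough Python 3 sketch of that loop, assuming the response has already been parsed into a dict shaped like the sample JSON in the question. The function name `single_releases` and the trimmed `sample` data are made up for this illustration; it checks primary-type first and falls back to the deprecated type field.

```python
def single_releases(result):
    """Yield (recording, release) pairs whose release group is a 'Single'.

    `single_releases` is a hypothetical helper; `result` is assumed to be the
    parsed JSON dict from the recording search.
    """
    recordings = result.get("recording-list", {}).get("recording", [])
    for recording in recordings:
        releases = recording.get("release-list", {}).get("release", [])
        for release in releases:
            group = release.get("release-group", {})
            # Prefer primary-type; fall back to the deprecated type field.
            kind = group.get("primary-type") or group.get("type")
            if kind == "Single":
                yield recording, release


# Trimmed-down, illustrative data modeled on the sample JSON.
sample = {
    "recording-list": {
        "recording": [
            {
                "title": "Stronger",
                "release-list": {"release": [
                    {"title": "Hit Hammer 2001 (disc 1)",
                     "release-group": {"type": "Compilation",
                                       "primary-type": "Album"}},
                ]},
            },
            {
                "id": "feb9acbf-1d3d-4395-9512-bfbdcfa72eb9",
                "title": "Stronger",
                "release-list": {"release": [
                    {"title": "Stronger: The Remixes",
                     "release-group": {"type": "Single",
                                       "primary-type": "Single"}},
                    # A release with no release-group is simply skipped.
                    {"title": "Untyped release"},
                ]},
            },
        ]
    }
}

singles = [(rec.get("id"), rel["title"]) for rec, rel in single_releases(sample)]
```

Using `.get` with empty defaults at each level means recordings that lack a release-list, or releases that lack a release-group, are passed over rather than raising `KeyError`.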
-
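Is there a reason you aren't using a pre-made library for the web service (python-musicbrainz-ngs)? It won't help with this particular problem, but it might help in other ways.
Tags: python json musicbrainz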