By Sergio Oramas, Mohamed Sordo, and Luis Espinosa-Anke. In 2nd Workshop on Knowledge Extraction from Text, Proceedings of the International World Wide Web Conference (pp. 661-666), Florence, Italy. Universitat Pompeu Fabra Barcelona | MTG Music Technology Group
This paper presents a rule-based approach to extracting relations from unstructured music text sources. The proposed approach identifies and disambiguates musical entities in text, such as songs, bands, persons, albums and music genres. Candidate relations are then obtained by traversing the dependency parse tree of each sentence that contains at least two identified entities. A set of syntactic rules based on part-of-speech tags is defined to filter out spurious and irrelevant relations. The extracted entities and relations are finally represented as a knowledge graph. We test our method on texts from songfacts.com, a website that provides tidbits with facts and stories about songs. The extracted relations are evaluated intrinsically, by assessing their linguistic quality, and extrinsically, by assessing the extent to which they map to an existing music knowledge base. Both evaluations yield encouraging results: our system produces a high percentage of linguistically correct relations between entities and is able to replicate a large part of the knowledge base.
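The pipeline described above (dependency-path traversal between entity pairs, part-of-speech filtering, knowledge-graph output) can be illustrated with a minimal sketch. It is not the authors' implementation. The toy parse is written by hand rather than produced by a real parser, each entity is assumed to be already merged into a single token, and the filtering rule (the path must contain a verb and only tags from `ALLOWED_POS`) is a hypothetical stand-in for the paper's actual rule set. All function and variable names are illustrative.

```python
from collections import defaultdict
from itertools import combinations

# Hand-written toy dependency parse (an assumption, not parser output).
# "head" is the index of the syntactic parent; the root points to itself.
TOKENS = [
    {"text": "Yesterday", "pos": "PROPN", "head": 2},
    {"text": "was", "pos": "AUX", "head": 2},
    {"text": "written", "pos": "VERB", "head": 2},
    {"text": "by", "pos": "ADP", "head": 2},
    {"text": "Paul McCartney", "pos": "PROPN", "head": 3},
]

# Hypothetical filter: tags permitted on the path between two entities.
ALLOWED_POS = {"VERB", "AUX", "ADP", "NOUN", "PART"}


def path_to_root(tokens, i):
    """Return the token indices from token i up to the root, inclusive."""
    path = [i]
    while tokens[i]["head"] != i:
        i = tokens[i]["head"]
        path.append(i)
    return path


def dependency_path(tokens, a, b):
    """Return the tree path from token a to token b via their lowest common ancestor."""
    up_a = path_to_root(tokens, a)
    up_b = path_to_root(tokens, b)
    on_b = set(up_b)
    common = next(i for i in up_a if i in on_b)
    left = up_a[:up_a.index(common)]
    right = up_b[:up_b.index(common)]
    return left + [common] + right[::-1]


def passes_pos_rules(tokens, inner):
    """Keep a candidate only if its path has a verb and no disallowed tags."""
    tags = [tokens[i]["pos"] for i in inner]
    return "VERB" in tags and all(tag in ALLOWED_POS for tag in tags)


def extract_relations(tokens, entity_indices):
    """Yield (subject, label, object) triples for each pair of entities."""
    for a, b in combinations(sorted(entity_indices), 2):
        inner = dependency_path(tokens, a, b)[1:-1]  # drop the two entities
        if inner and passes_pos_rules(tokens, inner):
            label = " ".join(tokens[i]["text"] for i in sorted(inner))
            yield tokens[a]["text"], label, tokens[b]["text"]


def build_graph(triples):
    """Store triples as an adjacency mapping: subject -> [(label, object)]."""
    graph = defaultdict(list)
    for subject, label, obj in triples:
        graph[subject].append((label, obj))
    return dict(graph)


relations = list(extract_relations(TOKENS, [0, 4]))
graph = build_graph(relations)
print(relations)  # [('Yesterday', 'written by', 'Paul McCartney')]
```

In a fuller setting the `TOKENS` list would come from a dependency parser and an entity linker, and the adjacency mapping could be replaced by a graph library or a triple store. The sketch only shows how a path between two entities becomes a labeled edge.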