Publisher References in Bibliographic Entity Descriptions

by Jim Hahn, et al.

This paper describes a method for improving access to publisher references in linked data RDF editors using data mining techniques and a large set of library metadata encoded in the MARC21 standard. The corpus comprises clustered sets of publishers and publisher locations drawn from the library MARC21 records in the POD Data Lake, an Ivy+ Library Consortium metadata sharing initiative. The POD Data Lake contains seventy million MARC21 records, forty million of which are unique. The publisher entity sets discovered in this corpus form the basis for streamlined description of BIBFRAME Instance entities. The study produced two major outputs: 1) a prediction database and 2) sets of publisher location and name association rules. The association rules underpin a prototype autosuggestion feature for BIBFRAME Instance entity description properties, designed specifically to support the autopopulation of publisher entities in linked data RDF editors.
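
To make the idea of publisher location and name association rules concrete, the following is a minimal sketch of how rules of the form "place implies publisher" might be mined and then used for autosuggestion. It is an illustration under stated assumptions, not the authors' implementation: the sample pairs are invented, the function names mine_rules and suggest are hypothetical, the thresholds are arbitrary, and the assumption that place and publisher name come from MARC21 fields 260/264 subfields $a and $b is ours rather than something the abstract states.

```python
from collections import Counter

# Invented (place, publisher) pairs for illustration only. In a real
# pipeline these might come from MARC21 260/264 $a and $b (an assumption).
records = [
    ("New York", "Oxford University Press"),
    ("New York", "Oxford University Press"),
    ("New York", "Springer"),
    ("Berlin", "Springer"),
    ("Berlin", "Springer"),
    ("Berlin", "De Gruyter"),
]


def mine_rules(pairs, min_support=0.1, min_confidence=0.5):
    """Return (place, publisher, support, confidence) rules for place -> publisher.

    support    = share of all pairs that are exactly (place, publisher)
    confidence = share of pairs with this place that name this publisher
    """
    total = len(pairs)
    pair_counts = Counter(pairs)
    place_counts = Counter(place for place, _ in pairs)
    rules = []
    for (place, publisher), n in pair_counts.items():
        support = n / total
        confidence = n / place_counts[place]
        if support >= min_support and confidence >= min_confidence:
            rules.append((place, publisher, support, confidence))
    # Order by place, then by descending confidence within each place.
    rules.sort(key=lambda r: (r[0], -r[3]))
    return rules


def suggest(rules, place):
    """List publisher names suggested for a place, strongest rule first."""
    return [publisher for p, publisher, _, _ in rules if p == place]


rules = mine_rules(records)
suggestions = suggest(rules, "New York")
```

In an editor, a cataloger entering a place of publication could be offered the publishers returned by suggest for that place. The thresholds control how aggressive the suggestions are, and suitable values would need to be tuned against real data.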



