Thursday 13 September 2012

IGeLU 2012 - Plenary closing session

Bibliographic Framework Initiative Approach for MARC Data as Linked Data

Sally McCallum, Chief, Network Development and Standards Office, Library of Congress

MARC
Although MARC is 40 years old, it still dominates the environment. A record in MARC format can be shared in many ways. MARC has adjusted to various cataloguing norms, and it has a large number of data elements, even compared with other formats that may be more sophisticated in other respects. MARC has also adapted to technical change. However, it has structural limitations (for example when extending MARC in XML), so we need to move ahead.

RDA and more
There are new cataloguing norms, in particular RDA, but there are others too. RDA gives more options for parsing data. This is a double-edged sword, because it creates more data elements. There is a use of codes rather than terms, and an emphasis on relationships. RDA also offers more flexibility with authoritative headings. Is it possible to include the broader cultural community in library cataloguing norms? We say that we will be able to accommodate all the various cultural environments, but it is not yet clear that we can.

Transcriptions
There are pros and cons to transcriptions. As resources are published in more than one way, and therefore transcribed in more than one way, transcription is becoming less of a focus. In cataloguing, should we use headings or terms? At the LoC we use headings, but we don't know what the future will be. There is also more user-supplied information (crowdsourcing).

Type of resources
Production of printed resources doesn't seem to be going down, whilst e-resources are increasing, both from publishers and in collections. We'll be in a situation where the collection of printed resources is changing. Then there are casual resources, for example Twitter. We don't really know what to do with those: should we archive them?

Systems
There is a growing need for e-resource access management, which should take licensing and rights management into account. E-resource object management implies preservation. There is a lot of pressure on retrieval needs, both basic and scholarly. Libraries still have a role to play in this area.

So the main issue is flexibility. In the next 5 years all of this will have changed again.

Framework Initiative - the bold venture
We need to work together to share bibliographic description and save money. We've included people with broad perspectives and have defined the requirements and the approach for the Initiative.
Requirements:
- Broad accommodation of content norms and data models
- New views of different types of metadata: descriptive, authority, holdings / coded data, classification data, subject data / preservation, rights, technical, archival
- Reconsideration of the activity relationships: exchange, internal storage, input interfaces and techniques
- Enhanced linking: traditional = textual, identifiers / semantic technology = URIs
- Accommodate different types of libraries: large, small, research, public, specialised...
- MARC compatibility: maintenance of MARC21 continued / enable reuse of data from MARC / provision of transformations to new models

Approach
Orientation towards the web and linked data. Investigate the use of semantic web standards (the RDF data model and its various syntaxes: XML, JSON, N-Triples etc.). We want to work with high-level models and collaboration.
Linked data is important because of the amount of social media on the web, the way search engines work, the growing number of applications moving towards linked data, and the increased flexibility it gives for describing resources.
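
To make the idea of one data model with several syntaxes concrete, here is a minimal sketch in Python. It is not taken from the talk: the example.org URIs are invented, the choice of Dublin Core terms as predicates is an assumption, and the JSON layout is an ad hoc grouping rather than any standard RDF-in-JSON profile.

```python
import json

# Hypothetical resource URIs, invented for illustration only.
WORK = "http://example.org/work/1"
CREATOR = "http://example.org/person/1"
# Dublin Core term URIs, chosen here as an assumption.
DC_TITLE = "http://purl.org/dc/terms/title"
DC_CREATOR = "http://purl.org/dc/terms/creator"

# Each statement: (subject, predicate, object, object_is_uri).
triples = [
    (WORK, DC_TITLE, "Moby Dick", False),
    (WORK, DC_CREATOR, CREATOR, True),
]


def escape_literal(text):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_ntriples(statements):
    """Serialise the statements as N-Triples, one statement per line."""
    lines = []
    for s, p, o, is_uri in statements:
        obj = f"<{o}>" if is_uri else f'"{escape_literal(o)}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


def to_json(statements):
    """Serialise the same statements as JSON, grouped by subject and predicate."""
    grouped = {}
    for s, p, o, is_uri in statements:
        value = {"type": "uri" if is_uri else "literal", "value": o}
        grouped.setdefault(s, {}).setdefault(p, []).append(value)
    return json.dumps(grouped, indent=2)


if __name__ == "__main__":
    print(to_ntriples(triples))
    print(to_json(triples))
```

The same two statements come out in both forms; only the syntax changes, not the underlying subject-predicate-object model.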

Initial model development
We have a contract with Zepheira (May 2012), because we wanted someone who wasn't completely absorbed in MARC, RDA etc. and who had a broad approach, with a good understanding of DC (?) technologies. Our partner has long experience of MARC, as well as of W3C work and the practical application of RDF. We had 2 major tasks: a review of several related initiatives, and translating bibliographic data to a linked data form (evolution, not revolution / a basis for community discussion and dialogue).

Balancing factors
- MARC21 historical data and roles
- Previous efforts for modelling bibliographic information (FRBR - RDA, Indecs - Onix)
- Previous efforts to express bibliographic information as linked data (BL, Deutsche Nationalbibliothek, Library of Congress' ID, OCLC WorldCat, schema.org)
- Using the web as model for expressing and connecting information (URIs, decentralisation of data, annotation)
- Library community social and technical deployment probabilities
- Adoption outside the library community
- Flexibility for future cataloguing and use scenarios
- Leverage machine technology for the mechanical tasks while keeping librarian expertise in control

So we started by deconstructing MARC, which means identifying MARC resources (MARCR), for example people, places, institutions, subjects etc. Since you need a replacement for those, they have to be pulled out first.
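
As a rough illustration of what "pulling out" such resources could look like, here is a small Python sketch. It is only a toy under stated assumptions: the record is a simplified list of (tag, value) pairs rather than real MARC, the tag-to-kind mapping is a guess at a sensible subset (100 personal name, 710 corporate name, 650 topical subject, 651 geographic subject), and the URI scheme under example.org is invented. It is not the conversion tooling developed for the Initiative.

```python
import re

# A toy record: (tag, value) pairs loosely imitating MARC 21 fields.
record = [
    ("100", "Melville, Herman"),
    ("245", "Moby Dick"),
    ("650", "Whaling"),
    ("651", "Nantucket (Mass.)"),
    ("710", "Harper & Brothers"),
]

# Assumed mapping from field tag to the kind of resource to pull out.
TAG_TO_KIND = {
    "100": "person",
    "710": "institution",
    "650": "subject",
    "651": "place",
}


def slug(text):
    """Turn a heading into a URI-safe token (hypothetical URI scheme)."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def deconstruct(fields, base="http://example.org/"):
    """Split a record into extracted resources, links to them, and leftover fields."""
    resources = {}  # URI -> description of the extracted resource
    links = []      # (tag, URI) pairs that replace the textual headings
    remaining = []  # fields left in the description unchanged
    for tag, value in fields:
        kind = TAG_TO_KIND.get(tag)
        if kind is None:
            remaining.append((tag, value))
            continue
        uri = f"{base}{kind}/{slug(value)}"
        resources[uri] = {"kind": kind, "label": value}
        links.append((tag, uri))
    return resources, links, remaining


if __name__ == "__main__":
    resources, links, remaining = deconstruct(record)
    for uri, info in resources.items():
        print(info["kind"], uri)
    print("left in the record:", remaining)
```

The point of the sketch is the order of operations: the entities get their own identifiers first, and the record then refers to them by URI instead of by text string.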

Phase 1 - High level model
4 core classes (initially 2, but rare books / music librarians felt this was not enough):
- Work: resource reflecting the conceptual essence of the cataloguing item / roughly equivalent to FRBR work or expression
- Instance: resource reflecting an individual, material embodiment of the Work
- Authority: resource reflecting key authority concepts that have defined relationships
- Annotation: resource that "decorates" other MARCR resources (e.g. holdings, cover images, reviews)
Each of these is represented by a URI.

So subjects and creators relate to Work; publisher and format relate to Instance; Instance and Work are linked together. Annotations can relate to either Work or Instance, so there are lots of places where we can use URIs. See model photos here.
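
The relationships between the four classes can be sketched as plain Python data classes. This is an interpretation of the notes above, not the published model: every field name (`creators`, `instance_of`, `target` and so on) and every example.org URI is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Authority:
    uri: str
    label: str


@dataclass
class Work:
    uri: str
    title: str
    creators: List[str] = field(default_factory=list)  # Authority URIs
    subjects: List[str] = field(default_factory=list)  # Authority URIs


@dataclass
class Instance:
    uri: str
    instance_of: str  # URI of the Work this Instance embodies
    publisher: str    # Authority URI
    format: str


@dataclass
class Annotation:
    uri: str
    target: str  # URI of a Work or an Instance
    kind: str    # e.g. "holdings", "cover image", "review"
    body: str


def annotations_for(target_uri, annotations):
    """Return the annotations attached to a given Work or Instance URI."""
    return [a for a in annotations if a.target == target_uri]


# Hypothetical example data.
author = Authority("http://example.org/person/1", "Melville, Herman")
topic = Authority("http://example.org/subject/1", "Whaling")
publisher = Authority("http://example.org/org/1", "Harper & Brothers")

work = Work("http://example.org/work/1", "Moby Dick",
            creators=[author.uri], subjects=[topic.uri])
instance = Instance("http://example.org/instance/1", instance_of=work.uri,
                    publisher=publisher.uri, format="print")
notes = [
    Annotation("http://example.org/annotation/1", instance.uri,
               "holdings", "1 copy, main reading room"),
    Annotation("http://example.org/annotation/2", work.uri,
               "review", "A classic of American literature."),
]

if __name__ == "__main__":
    print(instance.instance_of == work.uri)
    print([a.kind for a in annotations_for(work.uri, notes)])
```

Because every link is a URI rather than an embedded object, any of these resources could in principle live in a different dataset from the one that refers to it.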

Phase 1.5 - early experimentation
- Preliminary work at the LoC
- Very small group of early experimenters
- Working with high level model, vocabularies, conversion tools
- Creative development of syntaxes and configurations
- Adjust model

Model development
- Make model, mappings and tools available and encourage broader experimentation?
- Parallel phase 2 to refine the model and keep folding in experience based changes
- Follow the progress: www.loc.gov/marc/transition
- Join the discussion: bibframe@listserv.loc.gov
