Version 0.15 October 2012 (working draft)
Intended audience: information managers and professionals (librarians, IT), and architecture/systems developers involved with research organisations.
There are many different types of research outputs, as described in the introduction of Group 2 Pathways. Many of them are also covered in the 2.1 Pathways on Open Publishing and Building and Developing Repositories. In the current web environment, most data is connected through hypertext links and made interoperable through metadata and exchange protocols. These links say little about how items of data relate to one another, so the linking of concepts remains quite limited. Linked Data, the concept at the heart of an ‘open data architecture’, is a set of best practices for publishing and connecting structured data on the Web. It aims to relate and connect data so that searching across Web resources gives the user richer and more relevant results.
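As a rough illustration of the Linked Data idea described above, the sketch below models a few statements as subject-predicate-object triples and finds every resource connected to the same concept. All identifiers under example.org, the property names and the function name are hypothetical and are used only for illustration; a real deployment would use published vocabularies and a proper RDF toolkit.

```python
# Minimal sketch of Linked Data as subject-predicate-object triples.
# All URIs below are hypothetical placeholders under example.org.

SUBJECT = "http://example.org/terms/subject"
BROADER = "http://example.org/terms/broader"
MAIZE = "http://example.org/concept/maize"

TRIPLES = [
    ("http://example.org/report/42", SUBJECT, MAIZE),
    ("http://example.org/dataset/7", SUBJECT, MAIZE),
    (MAIZE, BROADER, "http://example.org/concept/cereals"),
]


def resources_about(concept, triples):
    """Return, in sorted order, every resource stated to be about the concept."""
    return sorted(s for s, p, o in triples if p == SUBJECT and o == concept)


# A report and a raw dataset held in different places become related
# because both point at the same concept identifier.
related = resources_about(MAIZE, TRIPLES)
```

Because both resources refer to the same identifier rather than to a free-text keyword, a search for the concept can return the report and the dataset together, which a plain hypertext link between pages would not make explicit.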
The availability of raw data is also becoming more important to researchers. Almost all fields of research generate data that is then analysed and presented in different forms. Some fields, such as geospatial information, generate very large volumes of data. Raw data may remain in this form awaiting a suitable application. Working toward the open availability of raw data is now seen as an important way to enable innovation in many areas of agricultural science, and as an important element of an open data architecture.
The principles of institutional policy and capacity, infrastructure development and information management are described in the Pathways 2.1 section of this Group, particularly:
- Put in place policies that enable digitization of research materials and the sustainable development of a repository
- Digitize and preserve documents and other materials
- Develop a repository for digital materials
- Make the contents of your repository more visible on the web
These provide common ground between the current world of repositories and web sites and the requirements of moving toward an open data infrastructure. Much of the guidance in these Pathways is still relevant to an open data architecture. However, there are significant new concepts, differences and requirements, relating to Linked Data and the interconnection of resources, that require further explanation and guidance. These are:
- The Semantic Web and what it means for producers and users of data; how it aims to develop information exchange into a form of knowledge exchange – something that is meaningful to the individual and his or her specific interests and needs;
- Metadata (description of digital objects or resources) and the framework for its description;
- Ontologies (standard vocabularies) and their role in automatic linking between resources.
These themes are introduced, and their implications for the implementation of an open data architecture are explained, in further Pathways.
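To make the role of metadata and standard vocabularies in automatic linking more concrete, the following is a minimal sketch, not a definitive implementation. It assumes two repositories describe their items with different local keywords, and that a shared vocabulary maps those keywords to one concept identifier. The record fields, the identifiers and the function name are all hypothetical.

```python
from collections import defaultdict

# Hypothetical shared vocabulary: local keywords mapped to concept identifiers.
VOCABULARY = {
    "maize": "concept:maize",
    "corn": "concept:maize",
    "rice": "concept:rice",
}

# Hypothetical metadata records from two different repositories.
RECORDS = [
    {"id": "repoA:1", "title": "Maize yield trial data", "keyword": "maize"},
    {"id": "repoB:9", "title": "Corn pest survey", "keyword": "Corn"},
    {"id": "repoB:10", "title": "Rice irrigation report", "keyword": "rice"},
]


def link_by_concept(records, vocabulary):
    """Group record identifiers by the shared concept their keyword maps to."""
    groups = defaultdict(list)
    for record in records:
        concept = vocabulary.get(record["keyword"].lower())
        if concept is not None:  # keywords outside the vocabulary stay unlinked
            groups[concept].append(record["id"])
    return dict(groups)


links = link_by_concept(RECORDS, VOCABULARY)
```

In this sketch the records described as "maize" and "Corn" end up in the same group without anyone linking them by hand, which is the kind of automatic connection a standard vocabulary is meant to support.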