Metadata and documentation

Metadata describes other data. It provides information about an item and its relevance so that it can easily be found when needed. Library catalogue records and telephone directories are good examples of metadata.

“Metadata are a subset of core data documentation, which provides standardised structured information explaining the purpose, origin, time references, geographic location, creator, access conditions and terms of use of a data collection” (UK Data Archive).

Metadata is part of the broader contextual information that accompanies data to ensure it can be found and understood over time. What is recorded can range from a detailed description of the data to explanatory material about why the data was created and how it has been used.

Alongside the term metadata, the term documentation may therefore be used to refer to all the information necessary to interpret, understand and use a given dataset, a set of files or a single document. The two words are sometimes used interchangeably.

Within research, metadata is a significant and developing area. Specific descriptive standards such as the Common European Research Information Format (CERIF), the Dublin Core Metadata Initiative (DCMI) and the Data Documentation Initiative (DDI) are used to enable sharing, access, interpretation and re-use of research data. Research funders now require researchers to create metadata and make it openly available, notably to describe complex datasets, and thus facilitate access and re-use.

Why create metadata – what are the benefits?

“A crucial part of making data user-friendly, shareable and with long-lasting usability is to ensure they can be understood and interpreted by any user. This requires clear data description, annotation, contextual information and documentation.
Data documentation explains how data were created or digitised, what data mean, what their content and structure are, and any manipulations that may have taken place. It ensures that data can be understood during research projects, that researchers continue to understand data in the longer term and that re-users of data are able to interpret the data. Good documentation is also vital for successful data preservation.” (UK Data Archive).

Good documentation ensures your data can be:

  • Searched for and retrieved
  • Understood now and in the future
  • Properly interpreted, as relevant context is available.

When should metadata be created?

It is good practice to begin documenting your data at the earliest point of your work and to continue adding information as the work progresses. It is easier to capture details as you go than to try to remember them at a later date.

What metadata should be created?

Ask yourself, “What information would I need to understand and use this data in twenty years?”

Potentially useful information includes a basic description:

  • Title
  • Date
  • Author(s)
  • Format
  • File name/path
  • Storage location/URL
  • Subject
  • Rights
  • Access information
  • Keywords
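The basic description above can also be kept in a small machine-readable file stored next to the data. The following is a minimal Python sketch of that idea, not a prescribed format: the field names, the ".metadata.json" suffix and the example values are all assumptions made for illustration.

```python
import json
from pathlib import Path

# Hypothetical field names mirroring the basic description list above.
BASIC_FIELDS = [
    "title", "date", "authors", "format", "file_name",
    "storage_location", "subject", "rights", "access", "keywords",
]

# Placeholder values only; replace with details of your own data.
EXAMPLE_RECORD = {
    "title": "Example survey responses",
    "date": "2020-01-31",
    "authors": ["A. Researcher"],
    "format": "CSV",
    "file_name": "survey.csv",
    "storage_location": "project-share/data/survey.csv",
    "subject": "Example subject",
    "rights": "Example rights statement",
    "access": "Available on request",
    "keywords": ["survey", "example"],
}


def missing_fields(record):
    """Return the basic fields that are absent or empty in the record."""
    return [name for name in BASIC_FIELDS if not record.get(name)]


def write_sidecar(data_file, record):
    """Save the record as JSON beside the data file and return the new path."""
    sidecar = Path(str(data_file) + ".metadata.json")
    sidecar.write_text(
        json.dumps(record, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return sidecar
```

Calling missing_fields(record) before write_sidecar("survey.csv", record) gives a quick reminder of anything still to be filled in. JSON is used here only because it is plain text and widely readable; a simple text or spreadsheet file would serve the same purpose.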

Note: This guidance is currently limited to a basic introduction to metadata and, for the most part, to how it is applied in Microsoft Office documents and files.

How to create metadata in Microsoft Office files – “Properties”

There are a number of ways you can add documentation to your data:

  • Embedded documentation
  • Supporting documentation (in a separate file)
  • Catalogue metadata (usually structured according to an international standard, used to identify and locate the data that meet the user's requirements via a web browser or web based catalogue).

You can ensure that digital files are well structured internally by adding file names, creation date, author(s), version information, full referencing etc. In addition, Microsoft Office products use “Properties” to record common pieces of metadata such as title, author, organisation, subjects and keywords, and additional comments.

Not only will this help you keep your files organised and easy to interpret, it will also allow you to sort folders by the properties you have added and to search for documents with particular properties.

Some of these Properties fields can be edited whilst some are completed automatically by the software program.
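For readers who want to inspect these properties outside Office, the sketch below reads them with Python's standard library. It relies on the fact that .docx, .xlsx and .pptx files are zip packages containing a docProps/core.xml part. The mapping from Properties labels to XML elements is an assumption to be checked against your own files, and fields such as Company and Manager are held in a different part that this sketch does not read.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used in the core-properties part of Office Open XML files.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Assumed mapping: Properties label -> element in docProps/core.xml.
FIELDS = {
    "Title": "dc:title",
    "Subject": "dc:subject",
    "Author": "dc:creator",
    "Tags": "cp:keywords",
    "Comments": "dc:description",
    "Categories": "cp:category",
    "Status": "cp:contentStatus",
    "Last Modified By": "cp:lastModifiedBy",
    "Created": "dcterms:created",
    "Last Modified": "dcterms:modified",
}


def read_core_properties(path):
    """Return the non-empty core properties of an Office Open XML file.

    Raises KeyError if the package has no docProps/core.xml part.
    """
    with zipfile.ZipFile(path) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))
    found = {}
    for label, tag in FIELDS.items():
        element = root.find(tag, NS)
        if element is not None and element.text:
            found[label] = element.text
    return found
```

For example, read_core_properties("report.docx") (a hypothetical file name) returns a dictionary of whichever of these properties have been filled in, which makes it easy to spot documents with no title, author or tags.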

Microsoft Office Properties Fields

These examples are from Microsoft Word but other Microsoft Office programs have similar fields.

User Editable Fields     Automatically Completed Fields
Title                    File Size
Tags                     Pages
Comments                 Words
Status                   Edit Time
Categories               Templates
Subject                  Last Modified (time and date)
Hyperlink Base           Created (time and date)
Company                  Last Printed (time and date)
Manager                  Last Modified By
Author

This information is also available in a separate document: Good Practice and Guidance – Metadata for documents (PDF).

EPSRC

EPSRC has the following clear expectations of organisations in receipt of EPSRC research funding:

  1. Research organisations will ensure that appropriately structured metadata describing the research data they hold is published (normally within 12 months of the data being generated) and made freely accessible on the internet; in each case the metadata must be sufficient to allow others to understand what research data exists, why, when and how it was generated, and how to access it. Where the research data referred to in the metadata is a digital object it is expected that the metadata will include use of a robust digital object identifier (For example as available through the DataCite organisation).
  2. Where access to the data is restricted the published metadata should also give the reason and summarise the conditions which must be satisfied for access to be granted. For example ‘commercially confidential’ data, in which a business organisation has a legitimate interest, might be made available to others subject to a suitable legally enforceable non-disclosure agreement.
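
The two EPSRC expectations quoted here can be read as a checklist for a metadata record. The Python sketch below expresses that reading under stated assumptions: the field names are hypothetical, EPSRC does not prescribe any particular structure, and "within 12 months" is treated as one calendar year.

```python
from datetime import date

# Hypothetical field names for what, why, when, how and access information.
DESCRIPTIVE_FIELDS = ["what", "why", "when", "how", "access"]


def one_year_after(day):
    """Same calendar day one year later (29 February falls back to 28 February)."""
    try:
        return day.replace(year=day.year + 1)
    except ValueError:
        return day.replace(year=day.year + 1, day=28)


def check_record(record, generated, published):
    """Return a list of ways the record falls short of the two expectations."""
    problems = [
        f"missing '{name}'" for name in DESCRIPTIVE_FIELDS if not record.get(name)
    ]
    if published > one_year_after(generated):
        problems.append("metadata published more than 12 months after data generation")
    if record.get("digital") and not record.get("doi"):
        problems.append("digital object without a digital object identifier")
    if record.get("restricted"):
        if not record.get("restriction_reason"):
            problems.append("restricted access without a stated reason")
        if not record.get("access_conditions"):
            problems.append("restricted access without summarised conditions")
    return problems
```

An empty list means the record passes this simplified check. It is an aid to spotting gaps, not a statement of compliance with the funder's policy.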