Citing Data

Reasons for citing data

Data requires citations for the same reasons journal articles and other types of publications require citations: to acknowledge the original author/producer and to help other researchers find the resource.

Citing data is important because it:

  • Acknowledges and provides credit to the originator of the data
  • Supports verification of data and results, thereby facilitating their re-use in further research
  • Enables the impact of data to be tracked.

Benefits for researchers:

  • Makes data publications more acceptable for CVs and journals
  • Facilitates discovery of grey literature.

Data Citation Poster

How to cite data

A dataset citation includes the same components as any other citation:

  • author or creator,
  • title,
  • year of publication,
  • publisher (for data this is often the archive where it is housed),
  • edition or version, and
  • access information (a URL or other persistent identifier).
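As a rough illustration, the components of a dataset citation can be kept in a small record so that gaps are easy to spot before the citation is formatted. This is a minimal sketch, not part of any style manual: the class name `DatasetCitation`, its field names, and the helper `missing_components` are all hypothetical, and the sample values are placeholders.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DatasetCitation:
    """Hypothetical record of the components of a dataset citation."""
    author: Optional[str] = None
    title: Optional[str] = None
    year: Optional[int] = None
    publisher: Optional[str] = None  # for data, often the archive housing it
    version: Optional[str] = None    # edition or version
    access: Optional[str] = None     # URL or other persistent identifier


def missing_components(citation: DatasetCitation) -> List[str]:
    """Return the names of components that have not been filled in."""
    return [f.name for f in fields(citation) if not getattr(citation, f.name)]


# Placeholder values, for illustration only.
draft = DatasetCitation(
    author="Example Research Group",
    title="Example Survey Data",
    year=2020,
)
print(missing_components(draft))  # ['publisher', 'version', 'access']
```

Running the same check on every dataset you cite is one simple way to keep citations consistent.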

Consistency is important when citing data.

The links below provide guidelines on citing data:

Citation tools

Citation software tools help you to:

  • import citations from your favourite databases and websites.
  • build and organize bibliographies.
  • format citations for papers.
  • take notes on articles and save them in your collection of citations.
  • save and organize PDFs, screenshots, graphs, images, and other files for your research.

Examples include:





General rules

Some style manuals do provide instructions for the citation of data, and selected examples are listed on the Data Citations tab. If the style manual you are using does not address data citations, you can follow these general rules.

Usually a style manual will lay out basic rules for the order of citation elements, regardless of the type of work. This is what you will need to pay close attention to in order to format your citation correctly. If you can’t find a generic list of rules, then look at how the citation for a book is formatted.

These are the citation elements you need to consider when building a data citation:


Author

Who is the creator of the data set? This can be an individual, a group of individuals, or an organization.


Title

What is the data set called, or what is the name of the study?

Edition or Version

Is there a version or edition number associated with the data set?


Date

What year was the data set published? When was the data set posted online?


Editor

Is there a person or team responsible for compiling or editing the data set?

Publisher and Publisher Location

What entity is responsible for producing and/or distributing the data set? Also, is there a physical location associated with the publisher?

In some cases, the publisher of a data set differs from what we usually mean by the publisher of a book. A data set can have both a producer and a distributor.

The producer is the organization that sponsored the author’s research and/or the organization that made the creation of the data set possible, for example by codifying and digitizing the data.

The distributor is the organization that makes the data set available for downloading and use.

You may need to distinguish the producer and the distributor in a citation by adding explanatory brackets, e.g., [producer] and [distributor].
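The bracketed labels can be sketched as a small helper. This is only an illustration: the function name `publisher_statement` is hypothetical, the organization names are placeholders, and the semicolon separator is an assumption, since the actual punctuation depends on the style manual you follow.

```python
from typing import Optional


def publisher_statement(producer: Optional[str] = None,
                        distributor: Optional[str] = None) -> str:
    """Label the producer and distributor with explanatory brackets."""
    parts = []
    if producer:
        parts.append(f"{producer} [producer]")
    if distributor:
        parts.append(f"{distributor} [distributor]")
    # Assumed separator; check your style manual for the required punctuation.
    return "; ".join(parts)


print(publisher_statement("Example Institute", "Example Data Archive"))
# Example Institute [producer]; Example Data Archive [distributor]
```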

Some citation styles (e.g., APA) do not require listing the publisher if an electronic retrieval location is available. However, you may consider including the most complete citation information possible and retaining publisher information even in the case of electronic resources.

Material Designator

What type of file is the data set? Is it on CD-ROM or online?

This may or may not be a required field depending on the style manual. Often this information is added in explanatory brackets, e.g., [computer file].

Electronic Retrieval Location

At what web address is the data set available? Is there a persistent identifier available? If a DOI or other persistent identifier is associated with the data set, it should be used in place of the URL.
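The preference for a persistent identifier over a plain URL can be sketched as follows. The function name `retrieval_location` is hypothetical, and the DOI and URL in the example are made-up placeholders. Writing the DOI as an `https://doi.org/` link is one common presentation, but some style manuals ask for a different form, so treat the output format as an assumption.

```python
from typing import Optional


def retrieval_location(url: Optional[str] = None,
                       doi: Optional[str] = None) -> str:
    """Return the DOI as a resolvable link if one exists, else the URL."""
    if doi:
        bare = doi.strip()
        # Strip a leading "doi:" label or resolver address if one was pasted in.
        for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
            if bare.lower().startswith(prefix):
                bare = bare[len(prefix):].strip()
                break
        return f"https://doi.org/{bare}"
    if url:
        return url.strip()
    return ""


# Placeholder identifiers, for illustration only.
print(retrieval_location(url="https://example.org/data/123",
                         doi="doi:10.1234/example"))
# https://doi.org/10.1234/example
```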