What is FAIR data?

How to make Data Findable, Accessible, Interoperable, Reusable (FAIR)

How many gigabytes of data do you think were created, captured, copied, and consumed in 2020?

Think about the data created and consumed in your organisation, then add all the data from video, phones, security cameras, IoT devices, and thousands of sources worldwide.

We're now in the Zettabyte Era[1], with 2020 data estimated at 59 zettabytes[2]. That's 59 trillion gigabytes in one year!
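To check the conversion, here is a minimal sketch of the arithmetic. It assumes decimal (SI) prefixes, where a zettabyte is 10^21 bytes and a gigabyte is 10^9 bytes; the function name is ours, for illustration only.

```python
# Decimal (SI) prefixes: 1 zettabyte = 10**21 bytes, 1 gigabyte = 10**9 bytes.
BYTES_PER_ZETTABYTE = 10**21
BYTES_PER_GIGABYTE = 10**9


def zettabytes_to_gigabytes(zettabytes: int) -> int:
    """Convert a whole number of zettabytes to gigabytes (SI prefixes)."""
    return zettabytes * BYTES_PER_ZETTABYTE // BYTES_PER_GIGABYTE


gigabytes_2020 = zettabytes_to_gigabytes(59)
print(f"{gigabytes_2020:,} GB")  # 59,000,000,000,000 GB, i.e. 59 trillion
```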

Volume isn't the only issue: data velocity (how fast new data is created) is accelerating, and data variety (the types and sources of data) has never been greater.

Regardless, the expectation remains that this data can be ingested, processed, stored, and analysed, and that it will deliver value.

Effective and scalable data management is critical to satisfy the demand for business insights into data and to meet the rise of machine learning and AI.

Why you need FAIR principles for your data

  1. Is your data findable, accessible, interoperable, and reusable by both humans and computers?
  2. What is your plan for handling the exponential growth in data?
  3. How will you manage the increasing diversity of data sources?
  4. What is the quality of your data? Do you trust your data?
  5. Are data costs increasing without the promised ROI?

How FAIR data can help you meet these challenges

FAIR provides principles to deal with the increasing data volume, data complexity, and velocity of data creation.

FAIR data and the underlying standards, tools and practices should be a key part of your data management strategy.

Findable: Data is described with rich, semantic metadata so it is findable by humans and computers.
  Data and metadata are assigned globally unique and persistent identifiers (PIDs).
  Data and metadata are indexed in a searchable Data Catalogue, ideally DCAT-compliant.
  Metadata includes the identifier of the data it describes.

Accessible: Data and metadata can be retrieved by their PID using standard data protocols (e.g. HTTP).
  Data and metadata can be retrieved using open communication protocols, i.e. without proprietary tools.
  Data access has authentication and authorisation applied where necessary.
  Metadata should remain accessible even when the data is not, or is no longer, available.

Interoperable: Data is provided in commonly understood formats, preferably open formats.
  Metadata is defined in controlled vocabularies using knowledge representation languages, e.g. RDF, OWL, SKOS.
  Metadata includes qualified references to other metadata.
  The controlled vocabularies used for metadata themselves follow FAIR principles.

Reusable: Metadata has rich, detailed attributes so a user (human or machine) can decide if the data is useful.
  Data has a clear and accessible data usage licence (legal interoperability).
  Data provenance is recorded, e.g. the origin of the data, any processing, any recompilation.
  Metadata meets domain-relevant industry or community standards.
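
To make the principles above more concrete, here is a minimal sketch of a dataset description written as JSON-LD using DCAT, Dublin Core, and PROV terms. Everything under example.org is a placeholder we invented for illustration, and the fair_gaps helper is hypothetical: it only checks that certain fields are present, which is far weaker than a real FAIR assessment.

```python
import json

# Hypothetical dataset record. All example.org identifiers are placeholders.
dataset = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dct": "http://purl.org/dc/terms/",
        "prov": "http://www.w3.org/ns/prov#",
    },
    # Findable: a globally unique, persistent identifier, repeated in the metadata.
    "@id": "https://example.org/id/dataset/rainfall-2020",
    "@type": "dcat:Dataset",
    "dct:identifier": "https://example.org/id/dataset/rainfall-2020",
    "dct:title": "Daily rainfall observations 2020",
    "dct:description": "Daily rainfall totals recorded at weather stations.",
    # Interoperable: the theme points at a term in a controlled vocabulary.
    "dcat:theme": {"@id": "https://example.org/vocab/climate/rainfall"},
    # Reusable: an explicit licence and a provenance link to the source data.
    "dct:license": {"@id": "https://creativecommons.org/licenses/by/4.0/"},
    "prov:wasDerivedFrom": {"@id": "https://example.org/id/dataset/raw-gauge-readings"},
    # Accessible: retrievable over HTTP in an open format.
    "dcat:distribution": [
        {
            "@type": "dcat:Distribution",
            "dcat:accessURL": {"@id": "https://example.org/data/rainfall-2020.csv"},
            "dcat:mediaType": "text/csv",
        }
    ],
}

# Fields we assume, for this sketch only, as evidence for each principle.
REQUIRED_FIELDS = {
    "Findable": ["@id", "dct:identifier", "dct:title", "dct:description"],
    "Accessible": ["dcat:distribution"],
    "Interoperable": ["dcat:theme"],
    "Reusable": ["dct:license", "prov:wasDerivedFrom"],
}


def fair_gaps(record: dict) -> dict:
    """Return, per principle, the expected fields that are missing or empty."""
    return {
        principle: [key for key in keys if not record.get(key)]
        for principle, keys in REQUIRED_FIELDS.items()
    }


print(json.dumps(fair_gaps(dataset), indent=2))
```

An empty list against a principle means every field this sketch looks for is present. It says nothing about the quality of the metadata or whether the identifiers actually resolve.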

What steps lead to FAIR data?

Step 1: Do a FAIR self-assessment. This will give you a maturity rating and identify areas to improve.
Use the free tool at https://crosslateral.com.au/fair-tool.html

Step 2: Take an ontological approach to your data, creating semantic metadata and taxonomies. A shared ontology gives teams a common vocabulary for their data, which can lift data literacy across your organisation.

Step 3: Start to apply data standards and practices such as Data Catalog Vocabulary (DCAT), RDF, OWL, SKOS, and Linked Data.

Step 4: Architect your data platform around a DCAT-compliant Data Catalogue (such as ours), SKOS-based vocabularies for master data management (such as ours), and graph databases for Linked Data (we do this).
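
As a small illustration of the SKOS-based vocabularies mentioned in Steps 3 and 4, the sketch below builds a two-concept vocabulary and writes it out as Turtle using only the Python standard library. The Concept class, the to_turtle function, and the example.org URIs are our own placeholders, not part of any product or library. It assumes Python 3.9 or later and labels that contain no quotes or backslashes. In practice an RDF library would normally handle serialisation.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A single SKOS concept (hypothetical helper class for this sketch)."""
    uri: str
    pref_label: str
    broader: list[str] = field(default_factory=list)  # URIs of broader concepts


def to_turtle(scheme_uri: str, concepts: list[Concept]) -> str:
    """Serialise a concept scheme as Turtle. Labels are assumed to need no escaping."""
    lines = [
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        "",
        f"<{scheme_uri}> a skos:ConceptScheme .",
    ]
    for concept in concepts:
        lines.append("")
        lines.append(f"<{concept.uri}> a skos:Concept ;")
        lines.append(f'    skos:prefLabel "{concept.pref_label}"@en ;')
        for broader_uri in concept.broader:
            lines.append(f"    skos:broader <{broader_uri}> ;")
        lines.append(f"    skos:inScheme <{scheme_uri}> .")
    return "\n".join(lines)


# Placeholder vocabulary: "Rainfall" is a narrower concept of "Climate".
SCHEME = "https://example.org/vocab/climate-scheme"
vocabulary = [
    Concept("https://example.org/vocab/climate", "Climate"),
    Concept(
        "https://example.org/vocab/climate/rainfall",
        "Rainfall",
        broader=["https://example.org/vocab/climate"],
    ),
]

turtle = to_turtle(SCHEME, vocabulary)
print(turtle)
```

Because each concept has its own URI, a dataset record in a Data Catalogue can point at the concept directly, and a graph database can follow the skos:broader links to find related data.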

If you need help to make your data FAIR, please contact us at info@crosslateral.com.au.

References

[1] "Zettabyte Era" Wikipedia, 10 Mar. 2021, en.wikipedia.org/wiki/Zettabyte_Era.
[2] Statista. 2021. Total data volume worldwide 2010-2024. https://www.statista.com/statistics/871513/worldwide-data-created/.
Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016). https://doi.org/10.1038/sdata.2016.18
“FAIR Data – ARDC.” Australian Research Data Commons, 2020, ardc.edu.au/resources/working-with-data/fair-data.
"FAIR Principles". GO FAIR. Retrieved 2021-03-15. Material was copied from this source, which is available under a Creative Commons Attribution 4.0 International License.

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. For a full description of the license, please visit https://creativecommons.org/licenses/by/4.0/legalcode.