How many gigabytes of data do you think were created, captured, copied, and consumed in 2020?
Think about the data created and consumed in your organisation, then add all the data from video, phones, security cameras, IoT devices, and thousands of sources worldwide.
We're now in the Zettabyte Era[1], with 2020 data estimated at 59 zettabytes[2]. That's 59 trillion gigabytes in one year!
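As a quick check on that conversion, here is a minimal sketch assuming decimal (SI) units, where one zettabyte is 10^21 bytes and one gigabyte is 10^9 bytes. The function name is just for illustration.

```python
# Decimal (SI) units: 1 ZB = 10**21 bytes, 1 GB = 10**9 bytes.
BYTES_PER_ZETTABYTE = 10**21
BYTES_PER_GIGABYTE = 10**9


def zettabytes_to_gigabytes(zettabytes: int) -> int:
    """Convert a whole number of zettabytes to gigabytes using SI prefixes."""
    return zettabytes * BYTES_PER_ZETTABYTE // BYTES_PER_GIGABYTE


gigabytes_2020 = zettabytes_to_gigabytes(59)
print(f"{gigabytes_2020:,} GB")  # 59,000,000,000,000 GB, i.e. 59 trillion
```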
Volume isn't the only issue: data velocity (how fast new data is created) is accelerating, and data variety (the types and sources of data) has never been greater.
Regardless, the expectation remains that this data can be ingested, processed, stored, and analysed, and that it will provide value.
Effective and scalable data management is critical to satisfy the demand for business insights into data and to meet the rise of machine learning and AI.
FAIR (Findable, Accessible, Interoperable, Reusable) provides principles to deal with the increasing data volume, data complexity, and velocity of data creation.
FAIR data and the underlying standards, tools and practices should be a key part of your data management strategy.
Findable:
Data is described with rich, semantic metadata so it is findable by humans and computers. Data and metadata are assigned globally unique and persistent identifiers (PIDs). Data and metadata are indexed in a searchable Data Catalogue, ideally DCAT-compliant. Metadata includes the identifier of the data it describes.
Accessible:
Data and metadata can be retrieved by their PID using standard data protocols (e.g. HTTP). Data and metadata can be retrieved using open communication protocols, i.e. without proprietary tools. Data access has authentication and authorisation applied where necessary. Metadata should be accessible even when the data is not, or is no longer, available.
Interoperable:
Data is provided in commonly understood formats, preferably open formats. Metadata is defined in controlled vocabularies using knowledge representation, e.g. RDF, OWL, SKOS. Metadata includes qualified references to other metadata. Metadata controlled vocabularies follow FAIR principles.
Reusable:
Metadata has rich, detailed attributes so a user (human or machine) can decide if the data is useful. Data has a clear and accessible data usage licence (legal interoperability). Data provenance is recorded, e.g. the origin of the data, any processing, any recompilation. Metadata meets domain-relevant industry or community standards.
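To make the four principles concrete, here is a minimal sketch of a dataset description written as JSON-LD using DCAT, Dublin Core, and PROV terms. Everything under example.org, the dataset itself, and the `missing_properties` helper are assumptions made up for illustration. The spot check looks at only one property per principle, so treat it as a starting point and not a real FAIR assessment.

```python
import json

# All identifiers and URLs under example.org are hypothetical placeholders.
record = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dct": "http://purl.org/dc/terms/",
        "prov": "http://www.w3.org/ns/prov#",
    },
    # Findable: a globally unique, persistent identifier plus descriptive metadata.
    "@id": "https://example.org/id/dataset/rainfall-2020",
    "@type": "dcat:Dataset",
    "dct:identifier": "https://example.org/id/dataset/rainfall-2020",
    "dct:title": "Daily rainfall observations 2020",
    "dcat:keyword": ["rainfall", "weather"],
    # Interoperable: the theme points at a term in a controlled vocabulary.
    "dcat:theme": {"@id": "https://example.org/vocab/rainfall"},
    # Reusable: an explicit licence and a provenance link to the source data.
    "dct:license": {"@id": "https://creativecommons.org/licenses/by/4.0/"},
    "prov:wasDerivedFrom": {"@id": "https://example.org/id/dataset/rainfall-raw-2020"},
    # Accessible: retrievable over HTTP in an open format.
    "dcat:distribution": [
        {
            "@type": "dcat:Distribution",
            "dcat:accessURL": {"@id": "https://example.org/data/rainfall-2020.csv"},
            "dcat:mediaType": {
                "@id": "https://www.iana.org/assignments/media-types/text/csv"
            },
        }
    ],
}

# Hypothetical spot checks: one property per FAIR principle.
SPOT_CHECKS = {
    "Findable": "dct:identifier",
    "Accessible": "dcat:distribution",
    "Interoperable": "dcat:theme",
    "Reusable": "dct:license",
}


def missing_properties(rec: dict) -> list:
    """List the spot-check properties that are absent or empty in a record."""
    return [
        f"{principle}: {prop}"
        for principle, prop in SPOT_CHECKS.items()
        if not rec.get(prop)
    ]


print(json.dumps(record, indent=2))
print(missing_properties(record))  # [] because every spot check passes
```

A record with only an identifier would be reported as missing the Accessible, Interoperable, and Reusable properties, which is the kind of gap a fuller assessment would flag.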
Step 1: Do a FAIR self-assessment. This will give you a maturity rating and identify areas to improve. Use the free tool at https://crosslateral.com.au/fair-tool.html
Step 2: Take an ontological approach to your data, creating semantic metadata and taxonomies. Using ontologies will dramatically lift data literacy in your organisation.
Step 3: Start to apply data standards and practices such as Data Catalog Vocabulary (DCAT), RDF, OWL, SKOS, and Linked Data.
Step 4: Architect your data platform to take advantage of a DCAT-compliant Data Catalogue (such as ours), use SKOS-based vocabularies for master data management (such as ours), and use graph databases for Linked Data (we do this).
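To illustrate how the standards in Steps 3 and 4 fit together, the sketch below models a tiny SKOS vocabulary and one catalogued dataset as subject-predicate-object triples, then follows the links to find datasets by topic. It is a simplified stand-in: a real platform would use a graph database or an RDF library, and the example.org vocabulary, the dataset, and both helper functions are hypothetical.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
DCAT = "http://www.w3.org/ns/dcat#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
VOCAB = "https://example.org/vocab/"  # hypothetical vocabulary namespace
DATASET = "https://example.org/id/dataset/rainfall-2020"  # hypothetical PID

# Each triple is (subject, predicate, object), as in RDF.
triples = [
    (VOCAB + "weather", RDF_TYPE, SKOS + "Concept"),
    (VOCAB + "weather", SKOS + "prefLabel", "Weather"),
    (VOCAB + "rainfall", RDF_TYPE, SKOS + "Concept"),
    (VOCAB + "rainfall", SKOS + "prefLabel", "Rainfall"),
    (VOCAB + "rainfall", SKOS + "broader", VOCAB + "weather"),
    (VOCAB + "hail", RDF_TYPE, SKOS + "Concept"),
    (VOCAB + "hail", SKOS + "prefLabel", "Hail"),
    (VOCAB + "hail", SKOS + "broader", VOCAB + "weather"),
    # The catalogue entry is linked to the vocabulary through dcat:theme.
    (DATASET, DCAT + "theme", VOCAB + "rainfall"),
]


def concept_and_narrower(concept: str) -> set:
    """Return the concept plus every concept below it in the skos:broader hierarchy."""
    found = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if predicate == SKOS + "broader" and obj == current and subject not in found:
                found.add(subject)
                frontier.append(subject)
    return found


def datasets_about(concept: str) -> list:
    """Find datasets whose theme is the concept or any narrower concept."""
    wanted = concept_and_narrower(concept)
    return sorted(
        subject
        for subject, predicate, obj in triples
        if predicate == DCAT + "theme" and obj in wanted
    )


# Searching for the broad term still finds the dataset tagged with the narrower one.
print(datasets_about(VOCAB + "weather"))
```

The dataset is tagged only with the rainfall concept, yet a search on the broader weather concept returns it, because the query walks the vocabulary hierarchy. That kind of link-following is what a shared vocabulary and Linked Data make possible.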
If you need help to make your data FAIR, please contact us at info@crosslateral.com.au.