Automate records appraisal and classification

Why you need ontologies to automate records appraisal and classification

In the first of this three-part blog series, records management expert Conni Christensen shares insights from her experience with information governance and auto-classification methodology.

What is auto-classification?

Auto-classification is the process by which documents are classified (tagged with metadata) by a machine, i.e. a software tool. It is easy to assume that the machine can make informed classification decisions just by reading the document. In reality, the classification depends on the knowledge that you build into the auto-classification engine.

Classification which supports information governance is significantly different to classification for search. There is more complexity because there are more aspects to consider, such as:

  • users wanting to capture and classify their documents so they can find, use and share them,
  • information professionals having to classify with metadata that governs access, data protection (e.g. GDPR), retention and disposal, all in accordance with contemporary standards and legislation,
  • ICT professionals wanting to manage information infrastructure more effectively, and
  • the C-suite wanting everyone to be more efficient at information management, spending less money on consultants and more time delivering products and services, without exposing the business to unnecessary risks through poor governance practices.

Is it possible to achieve all this through auto-classification?

Yes, it is… but we need to develop fit-for-purpose, machine-readable data models, such as ontologies, that convey the requisite knowledge to the auto-classification platform.

What are ontologies?

Ontologies are linked data models that describe a domain. They list the types of objects and their instances, the relationships that connect them, and the constraints on how objects and relationships can be combined.

The term dictionary refers to an electronic vocabulary or lexicon, as used, for example, in spelling checkers. If a dictionary is arranged in a subtype-supertype hierarchy of concepts (or terms), it is called a taxonomy. If it also contains other relations between the concepts, it is called an ontology.
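To make the distinction concrete, here is a minimal sketch in Python. Every term and relation name in it (`is_a`, `documents`, "Lease", "Procurement" and so on) is invented for illustration and does not come from any real scheme; it also assumes each term has at most one supertype.

```python
# Illustrative only: all terms and relation names are made up for this sketch.

# A dictionary: a flat vocabulary of terms.
dictionary = {"Contract", "Lease", "Invoice", "Procurement", "Finance"}

# A taxonomy: the same terms arranged in a subtype -> supertype hierarchy.
taxonomy = {"Lease": "Contract"}  # a Lease is a kind of Contract

# An ontology: the hierarchy plus other typed relations, stored here as
# (subject, relation, object) triples.
ontology = {
    ("Lease", "is_a", "Contract"),
    ("Contract", "documents", "Procurement"),
    ("Invoice", "documents", "Finance"),
}


def supertypes(term, triples):
    """Walk up the is_a links from `term` (assumes one parent per term)."""
    found = []
    current = term
    while True:
        parents = sorted(o for s, r, o in triples if s == current and r == "is_a")
        if not parents or parents[0] in found:
            break  # reached the top, or hit a cycle
        current = parents[0]
        found.append(current)
    return found


def related(term, relation, triples):
    """Follow `relation` from the term itself or from any of its supertypes."""
    candidates = [term] + supertypes(term, triples)
    return sorted({o for s, r, o in triples if s in candidates and r == relation})


print(related("Lease", "documents", ontology))
```

The taxonomy alone can only say that a Lease is a kind of Contract. The extra `documents` relation is what lets a machine infer that a lease also relates to procurement.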

Unlike file plans, ontologies enable us to combine multiple concepts and define multiple types of relationships. An ontology can accommodate several taxonomies within the same scheme, so you can address the needs of all stakeholders. And because ontologies are built for machine application, scale is not an issue.

Ontologies hold the key to automated appraisal

Ontologies enable the automation of records appraisal. If we extract the knowledge built into contemporary disposal authorities, we can create data models that enable the auto-classifier to recognise significant concepts and then tag documents with the appropriate records class ID.
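The idea can be sketched in a few lines of Python. This is a toy illustration only, not how any particular auto-classification product works: the class IDs, retention periods and trigger terms are all hypothetical, and simple phrase matching stands in for whatever concept recognition a real engine performs.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordsClass:
    """One entry of a hypothetical disposal authority."""
    class_id: str
    label: str
    retention_years: int
    trigger_terms: tuple  # phrases that signal this class (assumed)


# Hypothetical records classes; real ones would come from your disposal authority.
RECORDS_CLASSES = [
    RecordsClass("FIN-01", "Accounts payable", 7, ("invoice", "purchase order")),
    RecordsClass("HR-02", "Recruitment", 2, ("job application", "interview")),
]


def classify(text, classes=RECORDS_CLASSES):
    """Return one tag per records class whose trigger terms appear in the text."""
    lowered = text.lower()
    tags = []
    for rc in classes:
        matched = [
            term for term in rc.trigger_terms
            if re.search(r"\b" + re.escape(term) + r"\b", lowered)
        ]
        if matched:
            tags.append({
                "class_id": rc.class_id,
                "retention_years": rc.retention_years,
                "matched_terms": matched,
            })
    return tags


print(classify("Please find attached the invoice for purchase order 1234."))
```

The point of the sketch is that the tagging logic itself is trivial. The value lies in the knowledge encoded in `RECORDS_CLASSES`, which is exactly what the ontology supplies.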

Are ontologies difficult to build?

Not in my experience. In fact, I find ontologies far easier to build than file plans because the logic is more explicit.

You start by defining your data model (i.e. what metadata you want to tag with) and the relationships between your concepts. Then slice and dice your existing controls: your metadata libraries, business classification schemes and disposal authorities.
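As a rough sketch of that slicing and dicing, the Python below turns rows from a made-up disposal authority into concept-relation-concept triples. The column names, relation names (`part_of`, `has_records_class`, `retain_for_years`) and sample rows are assumptions for illustration, not the structure of any real authority or tool.

```python
# Hypothetical rows standing in for an exported disposal authority.
disposal_authority = [
    {"class_id": "FIN-01", "function": "Finance",
     "activity": "Accounts payable", "retention_years": 7},
    {"class_id": "HR-02", "function": "Personnel",
     "activity": "Recruitment", "retention_years": 2},
]


def to_triples(rows):
    """Split each row into the separate relationships it implies."""
    triples = set()
    for row in rows:
        # Business classification: the activity belongs to a function.
        triples.add((row["activity"], "part_of", row["function"]))
        # Appraisal: the activity maps to a records class...
        triples.add((row["activity"], "has_records_class", row["class_id"]))
        # ...and the records class carries the retention rule.
        triples.add((row["class_id"], "retain_for_years", row["retention_years"]))
    return triples


for triple in sorted(to_triples(disposal_authority), key=str):
    print(triple)
```

Each row yields three explicit relationships, which is the sense in which the logic of an ontology is more explicit than that of a file plan.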

Ontology technology

Lastly you need an environment where you can utilise the outputs of auto-classification. We’ve loaded the ontology into the SharePoint Term Store and are using all the SharePoint functionality for exploiting managed metadata – creating views, setting up filters, refining searches and redirecting documents using workflows.

Of course, you need purpose-built tools. We used our own a.k.a.® software to build the ontology and Pingar’s DiscoveryOne auto-classification platform to apply the ontology and tag the documents in our SharePoint library.

Conni Christensen founded Synercon in 1998 and is the designer of a.k.a.® information governance software.

She has more than twenty years’ experience in records and information management, business consulting, training and software development. For many years, Conni has worked across the globe as a highly sought-after trainer, speaker and presenter.