Metadata, the data model and automating record-keeping
In the third of this three-part blog series, records management expert Conni Christensen provides insights from her experience with information governance and auto-classification methodology.
Information stockpiles are growing fast. Where is the silver bullet that will magically fix our records management problems? Where is the quick fix that will classify, appraise, protect, and dispose of our records?
Auto-classification is viewed by many as the solution.
Auto-classification will take the burden of classification off end users, who don’t want to classify or aren’t able to classify their ‘records’. There are eCM products providing auto-classification tools such as text analytics and predictive coding. Auto-classification is available now to classify your content in order to determine why it must be retained, how long it must be retained and when it can be legally disposed of.
With the panacea of auto-classification built in, why is there continued resistance by business users to use eCM systems?
Probably because business users still view eCM as a disruptive technology. The organisation needs records management but business users are looking to streamline their processes, save time, save energy and materials, improve quality, accuracy, precision and productivity. SharePoint’s popularity is, in part, due to its open development environment which enables business users to create information systems that enhance their business processes.
Auto-classification per se is not necessarily seen by business users as a means of automating their information management. Comparing different types of business systems such as Accounting and ERP, what are the characteristics that support the automation of business processes, including information management?
- Enterprise resource planning (ERP) systems comprise a suite of integrated applications designed to support and automate processes such as product planning, development, manufacturing, sales and marketing, and inventory.
- Accounting systems likewise integrate multiple processes such as ordering, invoicing, billing, payroll, time management, taxation, banking, also within a common environment.
Accounting and ERP systems show us how complex processes can be successfully automated. Features that support automation include:
- A standardised metadata environment. Metadata common to all processes is standardised and shared across all processes. Metadata labels are standardised. Data entry is controlled by lookup sets. Free text is virtually eliminated.
- Document management embedded into related processes. Document types are standardised. Documents are linked to the processes or tasks. Emails are captured as part of the workflow.
- Intelligent metadata capture. Information architecture that captures linked metadata into forms, minimising data entry by users. Interconnected fields that trigger auto-fill of data into other fields.
- The automation of secondary or consequential processes, such as taxation and stock control, enabled by the information architecture.
- Interoperability within systems and between systems is enabled by standardised metadata labels and values. Mapping tools provide the means of translation where necessary.
- Metadata searches are enabled. Complex searches can be saved. There are multiple paths to finding information.
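The lookup-set and auto-fill features listed above can be illustrated with a minimal Python sketch. Every name in it (`LOOKUPS`, `LINKED_FIELDS`, `capture_metadata`, and the field labels and values) is hypothetical and is not taken from any particular Accounting, ERP or eCM product.

```python
# Hypothetical lookup sets: data entry is restricted to controlled values.
LOOKUPS = {
    "document_type": {"Invoice", "Purchase Order", "Contract"},
    "customer_type": {"Retail", "Wholesale", "Government"},
}

# Hypothetical linked metadata: choosing a document type fills in other fields.
LINKED_FIELDS = {
    "Invoice": {"process": "Accounts Receivable", "security": "Internal"},
    "Purchase Order": {"process": "Procurement", "security": "Internal"},
    "Contract": {"process": "Contract Management", "security": "Confidential"},
}


def capture_metadata(**entered):
    """Validate entered values against the lookup sets, then auto-fill linked fields."""
    for field, value in entered.items():
        allowed = LOOKUPS.get(field)
        if allowed is None:
            raise KeyError(f"Unknown metadata field: {field}")
        if value not in allowed:
            raise ValueError(f"{value!r} is not a permitted value for {field}")
    record = dict(entered)
    # Auto-fill: fields derived from the document type need no user input.
    record.update(LINKED_FIELDS.get(entered.get("document_type"), {}))
    return record


record = capture_metadata(document_type="Contract", customer_type="Government")
print(record)
```

In this sketch the user supplies two values and the remaining fields follow from the document type, while a value outside a lookup set is rejected rather than stored as free text.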
Metadata and the data model
Within these systems metadata is the means by which information is classified for the purpose of arranging, sorting, grouping, filtering, and finding. Metadata is also used to apply access, security, and data protection rules. Through metadata and the data model, compliance is built in, enabling conformance to principles of integrity, consistency, reliability, authenticity, etc.
Everything is connected through metadata: entities, relationships, workflows.
Anyone who has worked with Accounting or ERP systems knows that these systems are incredibly similar. Entities and relationships are the same, processes are the same, and data flows in the same predictable ways. This is because Accounting and ERP systems are built to the same canonical data model.
The canonical data model is the accepted standard for logical data models and it provides software developers with a standard template to build to.
In the absence of industry standards, developers will devise their own data models. And this is what has happened with eCM development. The absence of standardised recordkeeping models has caused chaos within the industry, illustrated by patently different approaches to retention and disposal. We still don’t have standardised definitions for labelling recordkeeping elements such as aggregations (i.e. files), records classes, document types, disposition events, and disposition actions. Nor do we have standard data values to enable interoperability and data exchange between systems.
We lack the reference models that clearly define recordkeeping processes and document the intersection of recordkeeping with business operations, applying the rules to the process, transaction, document type, subject, agent etc.
And while industry standards such as ISO 16175 (Principles and Functional Requirements for Records in Electronic Office Environments) and ISO 15489 (Standard for Records Management) require organisations to undertake an analysis of business activity, neither standard delivers useful models that system developers can use to integrate recordkeeping into business models.
Let’s return to classification
There are multiple forms of classification built into accounting and ERP systems. In my business accounting system we classify by:
- document type
- income and expense items
and multiple types of user-defined classifications such as:
- customer type
- supplier type
But the burden of classification has been reduced by the interconnectedness of the system.
We have also built classification models in SharePoint where we leverage logical connections between related concepts to capture metadata. Interoperability is enabled between Accounting and SharePoint by using standardised labels and metadata values. Tedious browsing through the file plan has been completely eliminated with the use of faceted classification and linked metadata models.
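As a rough illustration of why faceted classification removes the need to browse a file plan, the sketch below tags each document with independent facets and retrieves it by any combination of them. The documents, facet names and the `find` helper are hypothetical; this is not how SharePoint itself stores or queries metadata.

```python
# Hypothetical documents, each described by independent facets
# rather than by a single position in a hierarchical file plan.
DOCUMENTS = [
    {"title": "Acme invoice 1041", "document_type": "Invoice",
     "customer_type": "Wholesale", "year": 2014},
    {"title": "Acme supply contract", "document_type": "Contract",
     "customer_type": "Wholesale", "year": 2013},
    {"title": "City Council invoice 1042", "document_type": "Invoice",
     "customer_type": "Government", "year": 2014},
]


def find(documents, **facets):
    """Return the titles of documents that match every requested facet value."""
    return [
        doc["title"]
        for doc in documents
        if all(doc.get(name) == value for name, value in facets.items())
    ]


# Two different paths to information, neither requiring folder browsing.
print(find(DOCUMENTS, document_type="Invoice"))
print(find(DOCUMENTS, customer_type="Wholesale", year=2014))
```

Because each facet is a standardised label with standardised values, the same query could in principle be run against any system that shares those labels.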
Nevertheless, we are developing models to integrate auto-classification into our information systems: to search for keywords and entities, to map documents into classifications, and to link data protection, access and security rules to documents.
Likewise, we are currently refining and testing metadata models that will support the automated appraisal and disposition of records, using auto-classification to identify keywords (and their synonyms) and mapping them to metadata-based retention schedules.
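The idea of matching keywords and their synonyms to a metadata-based retention schedule can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the models described above: the records classes, synonym sets, retention periods, triggers and function names are all invented for the example and carry no legal authority.

```python
import re

# Hypothetical synonym sets: each records class is recognised by keywords and synonyms.
KEYWORDS = {
    "Invoicing": {"invoice", "tax invoice", "bill"},
    "Recruitment": {"job application", "resume", "cv", "interview"},
}

# Hypothetical retention schedule keyed by records class (illustrative values only).
RETENTION = {
    "Invoicing": {"retain_years": 7, "trigger": "end of financial year", "action": "Destroy"},
    "Recruitment": {"retain_years": 2, "trigger": "position filled", "action": "Destroy"},
}


def classify(text):
    """Return the records class with the most keyword hits, or None if nothing matches."""
    lowered = text.lower()
    scores = {}
    for records_class, terms in KEYWORDS.items():
        hits = sum(
            len(re.findall(rf"\b{re.escape(term)}\b", lowered)) for term in terms
        )
        if hits:
            scores[records_class] = hits
    return max(scores, key=scores.get) if scores else None


def appraise(text):
    """Attach retention metadata to a document based on its classification."""
    records_class = classify(text)
    if records_class is None:
        # No keyword matched: leave the decision to a person.
        return {"records_class": None, "action": "Refer for manual review"}
    return {"records_class": records_class, **RETENTION[records_class]}


print(appraise("Please find attached the tax invoice for March."))
print(appraise("Lunch on Friday?"))
```

Simple keyword counting like this is far cruder than text analytics or predictive coding; the point is only that once a document is mapped to a records class, the retention rule follows from metadata rather than from a user's judgement.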
What’s driving this model of recordkeeping automation forward is:
- A thorough analysis of business processes, data flow, inputs and outputs;
- Identification of recordkeeping requirements;
- All connected into logical data models;
- Supported by an enterprise taxonomy and metadata framework;
- Augmented with auto-classification tools.
To put it more succinctly, automation is the point at which system design, records management, and business analysis meet.
Find out how to improve findability in your ECMS from our eBook The search for meaning: why ‘findability’ can help maximize your ECMS investment.
Conni Christensen founded Synercon in 1998 and is the designer of a.k.a.® information governance software.
She has more than twenty years’ experience in records and information management, business consulting, training and software development. For many years, Conni has worked across the globe as a highly sought trainer, speaker and presenter.