The Challenges of Building Enterprise Content Taxonomies and the Role of
Classification Technologies in Maintaining Their Effectiveness
Agenda
The Challenge of Unstructured Content
Key Concepts and Terms
Taxonomy, Classification and ECM Adoption
Classification Technologies for ECM
80% of Enterprise Data is Unstructured
Document
Image
Report
Other
Billing statements, claims images, customer correspondence, mortgage docs, contracts, signed BOLs, healthcare EOBs, marketing collateral, website content, voice authorizations, signature cards, credit enrollments, Material Safety Data Sheets, ISO 9000 docs, plant schematics, product images, spec sheets ... and much more!
The Challenge of Managing Unstructured Content
What is Enterprise Content?
Organizing the explosion of unstructured content becomes critical:
We’ve got 600 GB of content from basic content services all over the enterprise. How can we get this content efficiently mapped into our ECM taxonomy?
We’ve been managing our content without classifying it for a few years now. How can our users navigate amongst this existing content in a way that’s intuitive for our business?
The lawyers have to review 400,000 electronic documents for their case. How can we make sure they don’t waste their time?
Where do I start?
Business Value of Classification for ECM
Ability to Structure Content with Databases
Multiple Repositories Make Access Difficult
And Then There’s SharePoint, File Shares and . . .
Key Concepts and Terms
Metadata: a means of describing, locating, cataloging, and activating content as objects in a software ecosystem (literally, data about data).
Enterprise Catalog: a centralized and normalized metadata model for unstructured content for the purposes of providing consistent services across all ECM applications.
Taxonomy: a hierarchical structure of information components, any part of which can be used to classify a content item in relation to other items in the structure.
Classification: a coding of content items as members of a group for the purposes of cataloging them or associating them with a taxonomy.
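The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `TaxonomyNode`, the function `classify`, the `catalog` dictionary, and the sample category names are all hypothetical and do not come from any ECM product.

```python
from dataclasses import dataclass, field


@dataclass
class TaxonomyNode:
    """One node in a hierarchical taxonomy (hypothetical model)."""
    name: str
    children: dict = field(default_factory=dict)

    def add_path(self, path):
        """Create any missing nodes along a path such as ["Legal", "Contracts"]."""
        node = self
        for part in path:
            node = node.children.setdefault(part, TaxonomyNode(part))
        return node

    def has_path(self, path):
        """Return True if every step of the path exists below this node."""
        node = self
        for part in path:
            if part not in node.children:
                return False
            node = node.children[part]
        return True


def classify(catalog, taxonomy, item_id, path):
    """Code a content item as a member of a taxonomy node.

    The class is stored as metadata about the item in a catalog
    (here just a dict), mirroring the Enterprise Catalog idea.
    """
    if not taxonomy.has_path(path):
        raise ValueError("unknown taxonomy path: " + "/".join(path))
    catalog[item_id] = {"class_path": "/".join(path)}


# Assumed sample taxonomy; the category names are illustrative only.
root = TaxonomyNode("Enterprise Content")
root.add_path(["Finance", "Billing statements"])
root.add_path(["Legal", "Contracts"])

catalog = {}
classify(catalog, root, "doc-001", ["Legal", "Contracts"])
```

In this sketch the taxonomy is the tree, classification is the act of attaching an item to a node, and the resulting `class_path` is metadata held in one central catalog.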
Taxonomy and Classification in ECM
Classification Examples:
– Document Classing
– Foldering
Taxonomy Examples:
– Enterprise Content Catalog
– Industry Standard Document Taxonomies (ISO, XMI)
Methods:
– Rules-Based: Applies pre-determined rules for “if, then” classification of text and properties
– Analytics-Based: Applies algorithms to interpret classes in order to apply classification rules to them
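The contrast between the two methods can be sketched as follows. Everything here is an assumption made for illustration: the rule set, the class labels, the example texts, and the use of bag-of-words cosine similarity as a stand-in for "analytics". Real classification products use far richer techniques.

```python
import math
import re
from collections import Counter

# Hypothetical rules: (class label, condition over text and properties).
RULES = [
    ("Contract", lambda text, props: "agreement" in text.lower()),
    ("Billing statement", lambda text, props: props.get("source") == "billing"),
]


def classify_by_rules(text, props):
    """Rules-based: return the first class whose 'if' condition holds, else None."""
    for label, condition in RULES:
        if condition(text, props):
            return label
    return None


def _vector(text):
    """Bag-of-words term counts for a piece of text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def _cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify_by_similarity(text, examples):
    """Analytics-based (toy): pick the class whose example text is most similar."""
    doc = _vector(text)
    return max(examples, key=lambda label: _cosine(doc, _vector(examples[label])))


# Assumed example text per class, standing in for a trained model.
EXAMPLES = {
    "Contract": "this agreement is made between the parties and signed",
    "Billing statement": "amount due invoice balance payment statement",
}

rule_hit = classify_by_rules("Master services agreement", {"source": "email"})
rule_miss = classify_by_rules("Your balance and payment due", {"source": "email"})
by_similarity = classify_by_similarity("Your balance and payment due", EXAMPLES)
```

The second document matches no rule, so the rules-based method returns nothing, while the similarity-based method still assigns it to the closest class. That gap is one way to read the deck's later point that rules alone are not enough.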
ECM Taxonomy Illustrated
Criteria For ECM Classification Management Solutions
Integrate with and support the ECM metadata model
Interpret a highly-federated content ecosystem
Go beyond search to catalog and manage content
Build on advanced analytic technologies – rules alone are not enough
Lessons Learned From ERP Adoption
Getting Classification Right: ‘Garbage in = garbage out’ is often used in metadata management projects to describe the problem of building a metadata model on inconsistent sources.
Driving Process on Taxonomies: ERP systems depend on 3 master taxonomies – material, vendor and customer. These taxonomies drive events, workflow definition and the development of transaction-centric business process applications.
Mastering Metadata: The ability to deploy new enterprise applications depends upon the re-usability, scalability and integrity of the metadata model.
System of Record is Required for Standardization:
– Establishes an enterprise standard that can be audited
– Forms the foundation for building demonstrable best practices
– Enforces consistency of data capture and output
Most organizations face content taxonomy pain – especially as they standardize around ECM
– Mapping content to taxonomy during ingestion
– Reclassifying content under management
– Evolving taxonomies as new types of content emerge
– Integrating folksonomies (SharePoint) into a master taxonomy
Classification is Hard Work
Business Drivers for ECM Taxonomy Management
Proliferating departmental solutions
– Content Management
– Collaboration (SP, Quickr, Team Rooms, Wikis)
User-based classification and high workforce turnover
– Productivity declines as knowledge disappears
– Legal discovery is a secondary concern
Mergers and Acquisitions – need to reconcile disparate content management practices, repositories and processes