Date posted: 01-Nov-2014
Category: Technology
Uploaded by: dhirajgaur
Content Management System
What is Content?
• The concept of
– structured vs. unstructured data
– Data vs. Content
• Structured data fits neatly into well-defined buckets.
• “Unstructured” data, which does not fit so predictably into well-defined buckets, has become known as “content.”
Business Process | Structured Data | Unstructured Data
Sales | Contact Information | Cover Letters, Proposals, Contracts, RFPs
Marketing | Product Numbers and Prices | Brochures, Specifications, FAQs, Web Banner Ads
Production | Bills of Materials, Inventory Levels | Engineering Drawings, Process Specifications
Customer Support | Customer Lists, Phone Logs, Contact History | Customer Correspondence, Troubleshooting, FAQ
Purchasing | Vendor ID, Item Number, Price, Discount | Product Specifications, Vendor Catalogs
Human Resources | Employee Lists, Payroll Benefits Information | Employee Policies, Resumes, Performance
Finance and Administration | General Ledger, Financial Projections | Annual Reports, Board Minutes, Compliance Reporting, Accounting Policies
Enterprise Content Management – sample user requirements (from a large Financial Svcs Company)
• “If a new bond comes into inventory, then we should get a message, an alert...and be able to refine to say that I only have California, Oregon and Washington clients...."
• “In the month of July, I received 95 e-mails from my subscriptions. These e-mails included 61 that had 143 attachments that had 67 more attachments. In total therefore, I received almost 400 documents including 5 different types (HTML, PDF, Word, Rich Media, …). Even with this volume, I had subscribed to only 10 categories in the Equities area. There are a total of 26 Equity Subscription areas and a total of 166 categories to which a user can subscribe across all Product Areas.”
Professional users of a traditional Content Management Product/Solution
Enterprise Content Management – sample user requirements (from a large Financial Svcs Company)
• The real question is, "Which sales ideas may have significant relevance to my book of business?" For example, an earnings warning on an equity rated Hold or Lower and not owned by any of my clients may not be of high relevance to me. Ideally, a relevance analysis would:
– Greatly reduce the volume of Product Area Ideas sent to every FA, hopefully to perhaps 10% to 20% or less of today's volume with ideas that are potentially actionable for that FA and his/her client
– Result in FAs reading and evaluating the Product Area Ideas, taking appropriate actions, and generating sales because the Product Area Ideas would be relevant
– Result in customer satisfaction because clients would understand FAs are paying attention to their needs and developing focused ideas
Professional users of a traditional Content Management Product/Solution
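The relevance requirement quoted above can be illustrated with a very small filter. This is only a sketch under stated assumptions: the idea and holdings structures, the ticker symbols, and the rule "an idea is relevant if at least one client holds the security" are all hypothetical simplifications, not the company's actual relevance analysis.

```python
def relevant_ideas(ideas, client_holdings):
    """Keep only ideas about securities held by at least one client.

    ideas: list of dicts with a "symbol" key (hypothetical structure).
    client_holdings: dict mapping client name -> set of symbols held.
    """
    held = set().union(*client_holdings.values())
    return [idea for idea in ideas if idea["symbol"] in held]


# Made-up example data for one financial advisor (FA).
ideas = [
    {"symbol": "AAA", "headline": "Earnings warning"},
    {"symbol": "DDD", "headline": "Rating downgrade"},
    {"symbol": "CCC", "headline": "New bond in inventory"},
]
holdings = {"client1": {"AAA", "BBB"}, "client2": {"CCC"}}

filtered = relevant_ideas(ideas, holdings)
```

Here the idea about "DDD" is dropped because no client in this FA's book of business holds it, which is the kind of volume reduction the requirement asks for.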
Enterprise Content Management – sample product requirements (from a large Financial Svcs Company)
• “Content generation is a more complex and probably costly problem to solve ... we reportedly create about 9 million messages a month for field delivery. On average, this would mean 1,000 messages per month per ‘big user’ or perhaps only 500 to 600 per ‘little user’.…I strongly believe an analysis is in order of the nature and necessity of generated content, the establishment of content generation standards, the movement towards development and implementation of a relevance engine, … ”
Director (Product Management) of a large company that uses a leading Content Management Product
How is Content managed?
Author
Edit / Update
Publish
Content management is significantly more complex than management of structured relational data.
A content management system pieces content together so that it can be viewed on a web-based device.
Action | Data | Content
Create | Created automatically by applications or manually via a forms-based interface | Requires creative skills and often collaboration between multiple contributors
Review and Edit | If manual review is required, normally a quick double-check via a forms-based interface or audit report | Requires a complex iterative cycle in which multiple parties make comments and annotations that are factored into the next updated version
Link to Related Information | Through foreign keys and/or relational JOIN operations | Requires a combination of hyperlinks, metadata, and “virtual document” parent-child relationships
Format and Deliver | Typically handled through standard reporting tools, Visual Basic interfaces or ASP/JSP tools on the Web | Requires complex formatting specifications and transformations between file formats, XML
Update | Typically handled at either a field or record level in a well-defined application environment | Changes may occur at any level (e.g. a word, an entire chapter, etc.), requiring complex change management, including controlling and tracking the specific items that were changed
Index | Handled through a well-defined relational schema | Requires a combination of structured hierarchy (e.g. cabinet-folder structure) and flexible relational metadata
Search and Retrieval | Typically handled through SQL queries using the defined relational schema | Often requires a complex combination of metadata, full text and structural elements, and sometimes even more exotic techniques such as Query-by-Image-Content
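The contrast between the two columns can be sketched in code. The first half links structured records with a foreign key and a JOIN; the second half searches content by combining metadata, full text, and a parent-child link. The table names, documents, and the `search` helper are hypothetical, chosen only for illustration, and a real content system would use a proper full-text index rather than substring matching.

```python
import sqlite3

# Structured data: related records are linked by a foreign key and a JOIN.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vendor (vendor_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE item (item_number INTEGER PRIMARY KEY,
                       vendor_id INTEGER REFERENCES vendor(vendor_id),
                       price REAL);
    INSERT INTO vendor VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO item VALUES (100, 1, 9.50), (101, 2, 4.25), (102, 1, 12.00);
""")
rows = conn.execute(
    "SELECT v.name, i.item_number, i.price "
    "FROM item AS i JOIN vendor AS v ON v.vendor_id = i.vendor_id "
    "WHERE v.name = ? ORDER BY i.item_number",
    ("Acme",),
).fetchall()

# Content: retrieval combines metadata, full text, and structural links.
documents = [
    {"id": "d1", "metadata": {"type": "catalog", "vendor": "Acme"},
     "parent": None, "text": "Acme vendor catalog: bolts, nuts, washers."},
    {"id": "d2", "metadata": {"type": "specification", "vendor": "Acme"},
     "parent": "d1", "text": "Specification for stainless steel bolts."},
    {"id": "d3", "metadata": {"type": "specification", "vendor": "Globex"},
     "parent": None, "text": "Specification for copper washers."},
]


def search(docs, metadata=None, words=(), parent=None):
    """Return ids of documents matching all metadata fields, all words,
    and (if given) the parent document link."""
    hits = []
    for doc in docs:
        if any(doc["metadata"].get(k) != v for k, v in (metadata or {}).items()):
            continue
        text = doc["text"].lower()
        if not all(w.lower() in text for w in words):
            continue
        if parent is not None and doc["parent"] != parent:
            continue
        hits.append(doc["id"])
    return hits


content_hits = search(documents, metadata={"type": "specification"}, words=["bolts"])
```

The structured query needs only the schema, while the content query has to combine three different kinds of evidence, which is the point the table makes.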
What Makes Content Management Difficult?
• The flexibility and unpredictability of content
• Lack of well-defined, industry-standard application infrastructure for handling content
• Complex creation, update and change management cycles
• Complex reuse and repurposing issues
• Complex cross-referencing and indexing schemes
• Complex formatting and transformation requirements
• Complex search and retrieval issues
A Brief History of Content Management
• Content has existed for at least 5,000 years, since the invention of written language.
• Formal content management probably didn’t begin until the founding of the Library of Alexandria in the third century B.C.
• For at least the last 100 years, content has been playing a big role in business, in the form of brochures, catalogs, contracts, correspondence, invoices, purchase orders, billings and so forth.
• As the 1990s dawned, personal computers were increasingly becoming linked by local area networks. With the realization that this provided a means to re-establish control over electronic content, the age of document management was born.
• By 1998, the Web had evolved from an interesting phenomenon to serious business, and was now composed of billions of individual Web pages. Suddenly “document management” began to go out of vogue, and “web content management” became the central focus.
• The Web frenzy hit its crescendo in 1999, but with the dot.com and NASDAQ crash in the year 2000, attention has again turned to a more balanced combination of print and web-based content. Also, while the rush to B2C e-commerce has slowed somewhat, there is now a renewed focus on automatically communicating electronic business content through XML-based B2B commerce networks.
Variation | Business Purpose | Example
Web Content Management | Ensure that complex Web site content is complete and up to date | Managing all the content behind Amazon.com
Knowledge Management | Archive and index critical organizational knowledge so that employees can take advantage of it | Extensive knowledge base used by service technicians at a telecommunications company
Document Management | Manage complex document-based information so common elements can be reused, and documents can be dynamically assembled for publishing | Management of overlapping and constantly changing information in automobile user manuals, dealer service manuals, and technical specifications
Imaging Management | Replace costly and error-prone paper processing with electronic storage and workflows | Insurance claims processing
Digital Asset Management | Allow a mass of multimedia electronic content (photos, audio, video, etc.) to be stored in a multimedia database | Finding artwork for developing advertising creative, archiving news video clips at CNN
Records Management | Ensure that critical records are secure but accessible, and are deleted when they should be | Management of required documentation at a nuclear power plant
The Role of XML in Content Management
• XML blurs the distinction between structured and unstructured data, allowing data items buried inside an unstructured document to be explicitly tagged.
• XML plays at least three key roles in content management:
– As a source format for content publishing
– As a delivery format to the web
– As a universal data interchange format
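As a small illustration of how XML tags data items inside otherwise unstructured text, the sketch below parses a made-up press release with Python's standard `xml.etree.ElementTree` module. The element names (`press-release`, `company`, `eps`, `date`), the attributes, and the values are hypothetical and not taken from any real schema.

```python
import xml.etree.ElementTree as ET

# A made-up document: ordinary prose with explicitly tagged data items.
document = """
<press-release>
  <para>
    <company ticker="ACME">Acme Corp.</company> reported earnings of
    <eps currency="USD">1.25</eps> per share on
    <date>2000-07-14</date>, ahead of analyst expectations.
  </para>
</press-release>
"""

root = ET.fromstring(document)
para = root.find("para")

# The tagged items can be read like structured fields...
fields = {
    "company": para.find("company").text,
    "ticker": para.find("company").get("ticker"),
    "eps": float(para.find("eps").text),
    "date": para.find("date").text,
}

# ...while the surrounding prose is still available as ordinary text.
plain_text = " ".join("".join(para.itertext()).split())
```

The same source can therefore feed both a database-style consumer (via `fields`) and a publishing pipeline (via `plain_text`).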
New Enterprise Content Management Challenges
1. More variety and complexity
– More formats (MPEG, PDF, MS Office, WM, Real, AVI, etc.)
– More types (Docs, Images -> Audio, Video, variety of text: structured, unstructured)
– More sources (internal, extranet, internet, feeds)
2. Information Overload
– Too much data, precious little information (Relevance)
3. Creating Value from Content
– How to distribute the right content to the right people as needed? (Personalization -- book of business)
– Customized delivery for different consumption options (mobile/desktop, devices)
– Insight, Decision Making (Actionable)
New Enterprise Content Management Technical Challenges
1. Aggregation
– Feed handlers/Agents that understand content representation and media semantics
– Push-pull, Web-DB-Files, Structured-Semi-structured-Unstructured data of different types
2. Homogenization and Enhancement
– Enterprise-wide common view
– Domain model, taxonomy/classification, metadata standards
– Semantic Metadata, created automatically if possible
3. Semantic Applications
– Search, personalization, directory, alerts, etc. using metadata and semantics (semantic association and correlation), for improved relevance, intelligent personalization, customization
Semantic Web – Intelligent Content (supported by Taalee Semantic Engine)
[Figure. Nodes: COMPANY, Related Stock News, Industry News, Technology Products, SEC, EPA, Regulations, Competition. Link labels: "COMPANIES in Same or Related INDUSTRY", "COMPANIES in INDUSTRY with Competing PRODUCTS", "Impacting INDUSTRY or Filed By COMPANY", "Important to INDUSTRY or COMPANY".]
Intelligent Content = What You Asked for + What you need to know!
• Focused relevant content organized by topic (semantic categorization)
• Automatic Content Aggregation from multiple content providers and feeds
• Related news not specifically asked for (Semantic Associations)
• Competitive research inferred automatically
• Automatic 3rd party content integration
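The idea of "what you asked for plus what you need to know" can be sketched as a lookup over a small knowledge base. Everything below is a hypothetical illustration: the company names, the single company-to-industry relation, and the `intelligent_content` function are assumptions, and this is not how the Taalee engine is implemented.

```python
# Hypothetical knowledge base: company -> industry.
COMPANY_INDUSTRY = {
    "AlphaSoft": "software",
    "BetaSoft": "software",
    "GammaOil": "energy",
}

# Hypothetical aggregated news items, already tagged with a company.
NEWS = [
    {"headline": "AlphaSoft ships new release", "company": "AlphaSoft"},
    {"headline": "BetaSoft cuts prices", "company": "BetaSoft"},
    {"headline": "GammaOil opens refinery", "company": "GammaOil"},
]


def intelligent_content(company, news=NEWS, kb=COMPANY_INDUSTRY):
    """Return news about the company plus news about other companies
    in the same industry (a simple semantic association)."""
    industry = kb[company]
    asked_for = [n["headline"] for n in news if n["company"] == company]
    related = [
        n["headline"]
        for n in news
        if n["company"] != company and kb.get(n["company"]) == industry
    ]
    return {"asked_for": asked_for, "related": related}


result = intelligent_content("AlphaSoft")
```

The "related" list is content the user never requested explicitly; it is reached only through the industry relation stored in the knowledge base.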
Semantic Application – Equity Dashboard
Technologies for Organizing Content
• Information Retrieval/Document Indexing
• TF-IDF/statistical, Clustering, LSI
• Statistical learning/AI: Machine learning, Bayesian, Markov Chains, Neural Network
• Lexical, Natural language
• Thesaurus, Reference data, Domain models (Ontology)
• Information Extractors
• Reasoning/Inferencing: Logic based, Knowledge-based, Rule processing
The most powerful solutions combine several of these, addressing more of the objectives.
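Of the techniques listed, TF-IDF is the simplest to show in a few lines. The sketch below uses one common formulation, term frequency times log(N / document frequency), with whitespace tokenization; the function name and the sample documents are assumptions made for illustration, and production systems typically add smoothing, stemming, and stop-word handling.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


docs = [
    "bond prices fall as rates rise",
    "equity prices rise on earnings",
    "earnings warning for equity rated hold",
]
weights = tf_idf(docs)
```

In the first document, "bond" gets a higher weight than "prices" because "prices" also appears in another document, so it discriminates less between documents.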
Ontology
• Standardizes meaning, description, representation of involved concepts/terms/attributes
• Captures the semantics involved via domain characteristics, resulting in semantic metadata
• “Ontological Commitment” forms basis for knowledge sharing and reuse
Ontology provides semantic underpinning.
An Ontology
[Figure: example ontology]
Hierarchies: Disaster > Natural Disaster (Volcano, Earthquake); Disaster > Man-made Disaster (NuclearTest)
Terms/Concepts (Attributes): eventDate, description, site, latitude, longitude, damage, numberOfDeaths, damagePhoto, magnitude, bodyWaveMagnitude, conductedBy, explosiveYield
Functional Dependencies (FDs): site => latitude, longitude
Domain Rules: 0 < magnitude < 10; 0 < bodyWaveMagnitude < 10
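One way to make the four ingredients of the example ontology concrete (hierarchies, attributes, functional dependencies, domain rules) is a small class sketch. The assignment of attributes to classes is an assumption, since the figure's layout does not survive in this text, and the site name and coordinates are made up.

```python
from dataclasses import dataclass


@dataclass
class Disaster:
    # Attributes assumed to belong to the top-level concept.
    event_date: str
    description: str
    site: str


@dataclass
class NaturalDisaster(Disaster):
    pass


@dataclass
class ManMadeDisaster(Disaster):
    pass


@dataclass
class Earthquake(NaturalDisaster):
    magnitude: float

    def __post_init__(self):
        # Domain rule: 0 < magnitude < 10
        if not 0 < self.magnitude < 10:
            raise ValueError("magnitude must be between 0 and 10")


@dataclass
class NuclearTest(ManMadeDisaster):
    body_wave_magnitude: float
    conducted_by: str
    explosive_yield: float

    def __post_init__(self):
        # Domain rule: 0 < bodyWaveMagnitude < 10
        if not 0 < self.body_wave_magnitude < 10:
            raise ValueError("bodyWaveMagnitude must be between 0 and 10")


# Functional dependency site => latitude, longitude, modeled as a lookup.
# The site and its coordinates are hypothetical.
SITE_COORDINATES = {"Site A": (10.0, 20.0)}


def coordinates(disaster):
    """Derive latitude and longitude from the site alone."""
    return SITE_COORDINATES[disaster.site]


quake = Earthquake("2000-01-01", "example event", "Site A", magnitude=6.5)
```

The class hierarchy carries the "is-a" relations, the constructor checks enforce the domain rules, and the lookup table expresses that a site determines its coordinates.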
Controlled Vocabularies/Classifications/Taxonomies/Ontologies
• WordNet
• Cyc
• The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Semantic Technology Features
• Unstructured Text Content
• Semi-Structured Content
• Structured Content
• Audio/Video Content with associated text (transcript, journalist notes)
• Create a Customized "World Model" (Taxonomy Tree with customized domain attributes)
• Automatically homogenize content feed tags
• Automatically categorize unstructured text
• Automatically create tags based on the text itself
• Create and maintain a Customized Knowledge Base for any domain
• Automatically enhance content tags based on information beyond text
• Build contextually relevant custom research applications
• Contextual Search (an order of magnitude better than keyword-based search)
• Support push or pull delivery/ingestion of content
• Personalization/Alerts/Notifications
• Real Time Indexing (stories indexed for search/personalization within a minute)
• Provide the user with relevant information not explicitly asked for (Semantic Associations)
Along with the evolution of metadata and semantic technologies enabling the next generation of the Web, Content Management has entered the next generation of Enhanced Content Management.