Phonotonetic Chinese Language System and the
Semantic Web
Phonotonetic Chinese Language Institute
April 29, 2010
Douglas R. Donahue
Orthographic Types
Global Writing System Methods
Alphabetic based
Logographic based
Syllabary based
Abjad based
Abugida based
“A different language is a different vision of life.”
Federico Fellini
Linguistic Word Order Patterns
SVO: Subject-Verb-Object (English, Mandarin Chinese)
VSO: Verb-Subject-Object (Celtic languages, Hawaiian)
SOV: Subject-Object-Verb (Hindi, Persian, Latin, Korean)
VOS: Verb-Object-Subject (Fijian, Malagasy)
OVS: Object-Verb-Subject (Hixkaryana)
OSV: Object-Subject-Verb (Xavante, Warao)
Chinese Orthographies
Logographs
− Traditional
− Simplified
Transliteration Methods
− Hanyu Pinyin, Zhuyin Fuhao, Wade-Giles
− Logographic searches take place according to phonetic soundings
Radical Collation: Primary Chinese character dictionary ordering rules, which enable users to locate, compare, sort, merge, etc. written logographs
Stroke Order: Minor primary ordering classifier
Logographic Transliterative Side Effects
Homophones: Words that are pronounced the same as another word but differ in meaning.
Heterographs: Homophones that are spelled differently.
Homographs: Words that share the same spelling but have different meanings.
Homonyms: Words that share the same spelling and the same pronunciation but have different meanings.
Heterophones: Identically written words that have different pronunciations and meanings.
Phonotonetic Chinese Language System
• A Chinese-language-specific alphabet and dictionary
• 1-to-1 correspondence between the spoken sounds and the written symbols for words of the Mandarin Chinese language
• A bridge between the phonological (alphabetic) and ideographic orthographic types
• Provides lexicographical (a.k.a. dictionary) ordering of both Mandarin words and logographs (via transliteration)
• Provides utilitarian collation of written Mandarin Chinese, for both human and machine
• Enables software written in Western languages to work directly with Mandarin
• Extends existing Chinese orthographies, and works in union with them
Lexicographical Order
Alphabetic order; Dictionary order
A natural order resulting from the Cartesian product of two ordered sets, i.e.
(a,b) ≤ (a′,b′) if and only if a < a′ or (a = a′ and b ≤ b′)
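The ordering rule for pairs can be sketched in Python, whose built-in tuple comparison follows the same lexicographic rule. The function name `lex_le` and the sample pairs (a syllable plus a tone number) are illustrative assumptions, not part of the original slides.

```python
def lex_le(p, q):
    """Return True if pair p precedes or equals pair q in lexicographic order."""
    a, b = p
    a2, b2 = q
    # Compare first components; fall back to the second only on a tie.
    return a < a2 or (a == a2 and b <= b2)


# Hypothetical sample data: (syllable, tone number) pairs.
pairs = [("ma", 3), ("li", 4), ("ma", 1), ("li", 2)]

# Python compares tuples lexicographically, so sorted() agrees with lex_le.
ordered = sorted(pairs)
print(ordered)
```

Every adjacent pair in `ordered` satisfies `lex_le`, which is what makes dictionary-style lookup possible.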
WWW Temporal Characteristic
• The Internet
• WWW v1 & v2
• Social Web
• Semantic Web
• Ubiquitous Web
• (International) Domain Name System
• RDF
• OWL
• RSS
• I18N Ontologies
• I18N Datasets
Finding Information
Collation: The assembly of written information into a standard order. Collating lists of words or names into alphabetical order is the basis of most office filing systems, library catalogs and reference books.
Classification: Concerned with arranging information into logical categories, whereas collation is concerned with the ordering of those categories.
Collation algorithm: A process that defines the order resulting from the comparison of two values (e.g. the "Unicode Collation Algorithm").
Sorting algorithm: A procedure that puts a list of items in the order specified by the associated rules of collation.
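The split between a collation rule and a sorting algorithm can be illustrated with a minimal Python sketch. Here the collation rule is a key function and the sorting algorithm is Python's built-in `sorted`. The syllable format (lowercase letters followed by a tone digit, as in numbered Hanyu Pinyin) and the name `collation_key` are assumptions made for illustration; this is not the PCLS ordering itself.

```python
import re


def collation_key(syllable):
    """Split a syllable such as 'ma3' into (letters, tone) for comparison."""
    match = re.fullmatch(r"([a-z]+)([1-5])", syllable)
    if match is None:
        raise ValueError(f"unexpected syllable format: {syllable!r}")
    letters, tone = match.groups()
    # The collation rule: order by letters first, then by tone number.
    return (letters, int(tone))


# Hypothetical input list.
words = ["ma3", "li4", "ma1", "bei3", "li2"]

# The sorting algorithm only applies the order that the key defines.
collated = sorted(words, key=collation_key)
print(collated)
```

Swapping in a different key function changes the resulting order without touching the sorting procedure, which is the separation the two definitions describe.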
Structured Data
Structured Data: Particular ways of organizing data in a computer for efficient use. These are essential ingredients that make the management of huge amounts of data possible, i.e. databases and (Internet) indexing services.
Lists, Sets, Queues, Maps
DiGraph: An abstract data type that represents relationships or connections. It consists of two types of elements, namely vertices and edges.
Edges: Connectors between vertices (endpoints).
Vertices: Graph element endpoints.
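A directed graph of this kind can be sketched as an adjacency map. The class below is a minimal illustration under assumed names (`DiGraph`, `add_edge`, and the sample vertex labels are hypothetical), not a reference implementation.

```python
from collections import defaultdict


class DiGraph:
    """Directed graph stored as a map from each vertex to its successors."""

    def __init__(self):
        self._adjacent = defaultdict(set)

    def add_edge(self, source, target):
        """Add a directed edge source -> target, creating vertices as needed."""
        self._adjacent[source].add(target)
        # Touch the target so it is registered even with no outgoing edges.
        self._adjacent[target]

    def vertices(self):
        return set(self._adjacent)

    def edges(self):
        return {(s, t) for s, targets in self._adjacent.items() for t in targets}


graph = DiGraph()
graph.add_edge("logograph", "transliteration")
graph.add_edge("transliteration", "collation order")
print(sorted(graph.vertices()))
print(sorted(graph.edges()))
```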
Structured Web
Information on the Web is becoming more structured.
Information on the Semantic Web MUST become more structured!
Movements toward structure include:
• The rise of APIs
• The proliferation of vertical applications that run on top of existing data
• An increase in classic Semantic Technologies and Microformats
• The spread of RSS as an information delivery mechanism
Resource Description Framework (RDF)
• A method for describing conceptual models of information represented by Web-associated Resources
• Employs URIs to make statements about Web-associated Resources
• Employs URIs to make statements about Resources that are of interest in the real world
• Standard Data Model for the exchange of data on the Web
• Various languages are used to express the modelled data
• A Directed Graph is employed; each statement consists of a pair of vertices (subject and object) joined by an edge (predicate)
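The graph model of RDF can be sketched without any RDF library by treating each statement as a (subject, predicate, object) triple. The FOAF namespace and its `name` and `knows` properties are real; the `example.org` resource URIs, the `Triple` type, and the `objects` helper are hypothetical names introduced for this illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # vertex: the Resource being described
    predicate: str  # edge: the property, itself named by a URI
    obj: str        # vertex: another Resource or a literal value


FOAF = "http://xmlns.com/foaf/0.1/"
PERSON = "http://example.org/people#alice"  # hypothetical resource URI

graph = {
    Triple(PERSON, FOAF + "name", "Alice"),
    Triple(PERSON, FOAF + "knows", "http://example.org/people#bob"),
}


def objects(graph, subject, predicate):
    """Return every object linked from subject by the given predicate."""
    return {t.obj for t in graph if t.subject == subject and t.predicate == predicate}


names = objects(graph, PERSON, FOAF + "name")
print(names)
```

A real application would normally use an RDF toolkit and a serialization such as RDF/XML or Turtle; the set of tuples here only shows the shape of the data model.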
Semantic Web
• Storing data on the network is not sufficient
• Ability to sieve information from network resident datastores
• Transforming network resident datastores to knowledge
• Converting network resident knowledge to action
Internationalized RDF
http://odoncaoa.orgfree.com/foaf.rdf#odoncaoa ifoaf:name_cn http://唐道革.中国
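The object of the statement above uses an internationalized domain name. As a minimal sketch of how such a host name maps onto the ASCII-only Domain Name System, Python's standard `idna` codec (which implements the older IDNA 2003 rules) can convert it to its Punycode form and back. The variable names are illustrative.

```python
# Host name taken from the statement above.
host = "唐道革.中国"

# Encode each label to its ASCII-compatible ("xn--") form.
ascii_host = host.encode("idna").decode("ascii")

# Decoding the ASCII form should recover the original Unicode labels.
round_trip = ascii_host.encode("ascii").decode("idna")

print(ascii_host)
print(round_trip == host)
```

Software that only handles ASCII host names can carry the encoded form, while users continue to read and type the Chinese form.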
Data models
• Description of how data are represented and accessed via a computational system
• Formal definition of the computational system's data elements and their relations, within the domain
• Wayfinding mechanism that employs a set of symbols and text to explain an information subset and improve communication
Giant Global Graph
• Not a separate Web but an extension of the current WWW
• Information is given well-defined structure and meaning
• Facilitates a more automated networking environment, where computers can operate more autonomously on behalf of people
• A Structured Web, akin to an immense DB
• A common framework enabling data to be shared and reused across application, enterprise and community boundaries
Hash Tables/Maps
• Data structure for collections involving unique IDs
• Most prevalent data structure in use to perform common key/value paired searches
• Structured data mechanism that supports the management of the concepts and relationships used to describe and represent knowledge areas via ontologies
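Key/value lookup depends on keys being unique, which is where transliteration causes trouble. The sketch below uses Python dictionaries (hash maps) and four common characters that share the syllable "ma" with different tones. Keying on the toneless syllable yields several candidates, while a key that includes the tone is unique within this small sample. The key scheme is an assumption for illustration, not the PCLS encoding, and toned pinyin still has homophones in general.

```python
# (character, syllable, tone) for four well-known Mandarin words.
entries = [("妈", "ma", 1), ("麻", "ma", 2), ("马", "ma", 3), ("骂", "ma", 4)]

toneless_index = {}  # syllable -> list of candidate characters
toned_index = {}     # (syllable, tone) -> single character

for char, syllable, tone in entries:
    # Without tone, every entry collides on the same key.
    toneless_index.setdefault(syllable, []).append(char)
    # With tone, each key in this sample identifies exactly one character.
    toned_index[(syllable, tone)] = char

print(toneless_index["ma"])
print(toned_index[("ma", 3)])
```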
PCL System
• PCLS ameliorates the indeterminism generated as a transliterative side effect
• Sorting
• Searching
• Indexing
• Lexicographical Order
Orthographic Relations
Conclusion
Metcalfe's Law
• Formulated by Robert Metcalfe in regard to Ethernet
• Explains many network effects involving human use of communication technologies, e.g. the Internet, social networking, and the World Wide Web
• More of a heuristic or metaphor than an iron-clad empirical rule
• The social utility of a network depends upon the number of nodes in contact
• If English-speaking Americans and Mandarin-speaking Chinese users don't understand each other, the utility each group gains from the network of users speaking the other language is zero, and the heuristic has to be calculated for the two networks separately
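The closing point can be made concrete with a small calculation. Metcalfe's Law is commonly stated as network value being proportional to the square of the number of connected nodes; the sketch below assumes that form with a proportionality constant of 1. The user counts are made-up illustrative numbers, not data from the presentation.

```python
def metcalfe_value(nodes):
    """Heuristic network value, assumed proportional to nodes squared."""
    return nodes ** 2


# Hypothetical user counts for two language communities.
english_users = 300
mandarin_users = 700

# Two networks that cannot understand each other are valued separately.
separate = metcalfe_value(english_users) + metcalfe_value(mandarin_users)

# A single network in which every user can reach every other user.
combined = metcalfe_value(english_users + mandarin_users)

print(separate, combined)
```

Under these assumptions the combined network is valued well above the sum of the two separate ones, which is the gap a bridge between the two writing systems is meant to close.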