Post on 21-Nov-2014
From Data Structures to Databases
Prof. Alvarado
MDST 3703
5 February 2013
Business
• Quiz 1
– To be posted this evening
– Due Thursday evening
– Covers content before Databases
– End-of-week reflections still due
• Blogging
– Please remember to be timely
• Safari Resources
– If you can’t access them, try going through the Library page
Review
• Building as knowing
– Ramsay’s point in “On Building”
• DH as cultural reverse engineering
– Finding the rules in the patterns
– Texts and images are the patterns in question
• Reverse engineering is like building
– Same process in reverse (deconstruction)
– Also requires building other things, like databases to store stuff
For example, in Studio on Thursday we began to reverse engineer Plato’s Republic. The next step in our exercise was to parse the text into “words” and organize them in a list using an array.
By the way, were we actually grabbing words?
Not really: we were finding substrings, letter patterns that could also exist within other words (e.g. “cavern”)
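The substring problem can be sketched in a few lines of PHP. This assumes the search term was "cave" and uses an invented sample sentence; neither comes from the Studio exercise itself. It compares a plain substring count with a regular expression that insists on word boundaries.

```php
<?php
// Hypothetical sample line; not taken from the Studio exercise.
$line = 'The cavern was dark, but the cave beyond it was darker.';

// Substring search: also counts "cave" where it sits inside "cavern".
$substringHits = substr_count($line, 'cave');

// Word-boundary search: \b only matches where a word starts or ends.
$wordHits = preg_match_all('/\bcave\b/', $line);

echo "substring hits: $substringHits, whole-word hits: $wordHits\n";
```

Even the word-boundary version is still purely syntactic: it will not find pronouns or synonyms that refer to the cave.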
Also, these patterns did not match synonyms or pronouns (e.g. “this”) that stand for the same thing as the
word in question
This is the difference between SYNTAX and SEMANTICS
Syntax = sequences of signs
Semantics = meanings of signs
Semantics is much harder for computers to grasp than syntax
In fact, some think that semantics is beyond the capacity of any computer
…
Getting back to PHP
We can use arrays to model the text. So, within a FOREACH loop iterating through the lines of a text and parsing each line for “words,” we could do the following:
$words[$word]++;
$words[] = $word;
$lines[$lineNumber][] = $word;
Each method suggests a different model
More about PHP Arrays
• Arrays can be added to like so:
$myArray[] = $newItem;
• Arrays can also use strings instead of numbers as indices, e.g.
$myArray[3] = 'foo';
$myArray['person'] = 'Bob';
• Array items may also point to arrays, creating multidimensional arrays
$myArray['person'] = array();
$myArray['person']['Bob'] = $something;
Arrays with string indices are called “associative arrays” in PHP
Arrays of arrays can be used to create data structures like trees and grids
Read Chapter 5 of PHP: The Good Parts to learn more about arrays (see link in Resources on the course blog)
Also, the PHP manual is always a good place to look: http://php.net/manual/en/language.types.array.php
Arrays as Data Structures
• PHP arrays can be used to create data structures to model things, like texts, e.g.
$words[$word]++;
$words[] = $word;
$lines[$lineNumber][] = $word;
• These three create the following:
1. A simple list of word types (and their counts)
2. A list of each word in order (position and word)
3. A grid of line numbers and words
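The first two models can be sketched side by side. This is a minimal illustration that assumes a made-up two-line text and splits on single spaces, which is cruder than real word parsing; the variable names $counts and $tokens are hypothetical and are used so both models can live in one script.

```php
<?php
// Hypothetical two-line text standing in for the lines of a real file.
$text = ['the cave and the fire', 'the fire and the shadows'];

$counts = [];  // model 1: word types and their counts
$tokens = [];  // model 2: every word in order of appearance

foreach ($text as $line) {
    foreach (explode(' ', $line) as $word) {
        // Initialize before incrementing so PHP does not warn about an undefined key.
        if (!isset($counts[$word])) {
            $counts[$word] = 0;
        }
        $counts[$word]++;
        $tokens[] = $word;
    }
}

print_r($counts);
print_r($tokens);
```

The first array forgets word order but makes counting trivial; the second keeps order but needs a loop to count anything.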
Here is an example of how we would create the third kind of data structure. This would store a grid of words.
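A minimal sketch of the grid model follows. It assumes an invented two-line text and splits on single spaces; it is an illustration, not the code shown in class.

```php
<?php
// Hypothetical two-line text standing in for the lines of a real file.
$text = ['the cave and the fire', 'the fire and the shadows'];

$lines = [];
foreach ($text as $lineNumber => $line) {
    foreach (explode(' ', $line) as $word) {
        // First index = line number, second index = position within the line.
        $lines[$lineNumber][] = $word;
    }
}

// Look a word up by its two coordinates: line 1, position 4.
echo $lines[1][4], "\n";
```

Because each word has two coordinates, any word can be fetched directly without looping, as long as its line and position are known.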
And it would store the text in a grid something like this one …
These vertical numbers are the first dimension of the array (Y)
These horizontal numbers are the second dimension of the array (X)
In this model, a text is a grid of words, each with an X and Y coordinate
Is this the only way to represent a text?
Is it the most accurate?
Texts can also be represented as trees
Document Elements and Structures
Play
– Act +
  • Scene +
    – Line +
Book
– Chapter +
  • Verse +
Letter
– Heading
  • Return Address
  • Date
  • Recipient Info
    – Name
    – Title
    – Address
– Content
  • Salutation
  • Paragraph +
  • Closing
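A tree like the Play outline can be sketched with nested PHP arrays. The title and the lines below are invented for illustration, and the key names ('acts', 'scenes', 'lines') are assumptions rather than a standard.

```php
<?php
// Hypothetical play: the title and lines are invented for illustration.
$play = [
    'title' => 'Sample Play',
    'acts' => [
        [
            'scenes' => [
                ['lines' => ['First line.', 'Second line.']],
                ['lines' => ['Third line.']],
            ],
        ],
        [
            'scenes' => [
                ['lines' => ['Fourth line.']],
            ],
        ],
    ],
];

// Walk the tree: acts contain scenes, and scenes contain lines.
$lineCount = 0;
foreach ($play['acts'] as $act) {
    foreach ($act['scenes'] as $scene) {
        $lineCount += count($scene['lines']);
    }
}

echo "Lines in the play: $lineCount\n";
```

Each "+" in the outline becomes a list that can hold any number of items, which a fixed grid of rows and columns does not express as naturally.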
XML is designed to represent text
What are some differences between trees and tables?
Tables are more rigid
Trees allow for indefinite depth
But tables are easier to manipulate
In any case, tables and trees are two major kinds of data structure that you will encounter …
Speaking of trees … what is this?
“. . . the tree of nature and logic by the thirteenth-century poet, philosopher, and missionary Ramon Lull. The main trunk supports a version of the tree of Porphyry, which illustrates Aristotle's categories. The ten leaves on the right represent ten types of questions, and the ten leaves on the left are keyed to a system of rotating disks for generating answers. Such diagrams and disks comprise Lull's Ars Magna (Great Art), which was the first attempt to develop mechanical aids to reasoning. It served as an inspiration to the pioneer in symbolic logic, Gottfried Wilhelm Leibniz.”
John Sowa, explaining the cover art for Knowledge Representation
Tree of Logic (and a primitive computer)
What is this tree an example of?
The tree is a “knowledge representation” (KR)
A KR is a model that comprises
1. A set of categories (aka Ontology)
   Names and relationships between names
2. A set of inference rules (aka Logic)
   A method of traversing names and relations
3. A medium for computation
   A medium for producing inferences
4. A language for expressing these things
   Such as a programming or markup language
Ontologies are systems of categories rooted in world views
Ontologies consist of categories and their relationships
These are often mapped onto physical things – the human body, or trees – as part of our cognitive model
The tree as body as society among the Umeda of New Guinea
Logic is a name for the systematic unpacking of ontologies in discourse …
Here is a sample ontology, one very similar to Aristotle’s
And this is a syllogism, the basic unit of reasoning in classical logic
How is it related to the tree?
The sentences in the syllogism stand for the traversal of the tree that represents an implicit ontology
Reasoning always implies an ontology
Ontologies are often unexpressed
Ontologies often conflict with each other
(Digital) Humanists excavate or reverse engineer these ontologies
Now, a KR for a computer has to be an operationalized KR
How would we express a syllogism in PHP?
One way is to convert the tree into an array
But, given such an array, how can we find out if Socrates is mortal?
How do we find if the following is set:
We’d have to do some complicated nested looping to find the answer …
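To make the difficulty concrete, here is a sketch of an ontology stored as a nested array, with a search for Socrates. The category names are assumptions, not the tree from the slide, and the helper findPath() is hypothetical; it uses recursion in place of hand-written nested loops, but it still has to visit branch after branch until it finds the name.

```php
<?php
// Hypothetical ontology; the category names are assumptions, not the slide's tree.
$ontology = [
    'Being' => [
        'Immortal' => [],
        'Mortal' => [
            'Animal' => [],
            'Human' => [
                'Socrates' => [],
                'Plato' => [],
            ],
        ],
    ],
];

// Return the chain of categories from the root down to $name, or null if absent.
function findPath(array $tree, string $name, array $path = []): ?array
{
    foreach ($tree as $category => $children) {
        $here = array_merge($path, [$category]);
        if ($category === $name) {
            return $here;
        }
        $found = findPath($children, $name, $here);
        if ($found !== null) {
            return $found;
        }
    }
    return null;
}

$path = findPath($ontology, 'Socrates');

// Socrates is mortal if "Mortal" appears among his ancestors.
$socratesIsMortal = $path !== null && in_array('Mortal', $path, true);

echo implode(' > ', $path), "\n";
echo $socratesIsMortal ? "Socrates is mortal\n" : "Not established\n";
```

If the full path were known in advance, a single isset() on the nested keys would do; the searching is needed precisely because the question does not tell us where in the tree Socrates sits.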
So, PHP gives us tools to create an ontology, but not a way to reason efficiently with it
To create more effective KRs, we need the services of a database
A database is “a system that allows for the efficient storage and retrieval of information”
But beyond this, it also allows us to “represent knowledge”
Given Unsworth’s definition, how must it do this?
Databases provide a language to define ontologies (schema) and to “unpack” these ontologies via a query language that lets us efficiently search and retrieve data organized by the schema
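As a rough preview, the Socrates syllogism can be posed as a query. This sketch assumes PHP's PDO extension with the SQLite driver and uses an in-memory SQLite database as a stand-in for a MySQL server; the table name is_a and its columns are hypothetical.

```php
<?php
// Assumption: the pdo_sqlite extension is available. SQLite in memory
// stands in here for the MySQL server used in the course.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Schema: each row states that "child is a (kind of) parent".
$db->exec('CREATE TABLE is_a (child TEXT, parent TEXT)');
$db->exec("INSERT INTO is_a (child, parent) VALUES ('Socrates', 'Human')");
$db->exec("INSERT INTO is_a (child, parent) VALUES ('Human', 'Mortal')");

// The join chains the two premises: X is a Y, and Y is a Z.
$sql = 'SELECT COUNT(*)
        FROM is_a AS minor
        JOIN is_a AS major ON minor.parent = major.child
        WHERE minor.child = :who AND major.parent = :what';
$stmt = $db->prepare($sql);
$stmt->execute([':who' => 'Socrates', ':what' => 'Mortal']);

$socratesIsMortal = ((int) $stmt->fetchColumn()) > 0;
echo $socratesIsMortal ? "Socrates is mortal\n" : "Not established\n";
```

The table plays the role of the ontology and the query plays the role of the inference; this version only chains two premises, so deeper trees would need more joins or a recursive query.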
In this course, we are going to use a relational database to store and access information
Relational databases use a language known as SQL
(pronounced S-Q-L, although some say “sequel”)
SQL
• SQL stands for “Structured Query Language”
– NOT invented by Microsoft
• Invented in the 1970s and commercialized in the 1980s
– Probably responsible for new business models like JIT inventories
• Built on Codd’s relational model (1970)
– Implements set theory and formal logic
– Around the time of SGML
SQL
• A language used by relational databases
– Oracle, SQL Server, Access, etc.
MySQL
• A very fast, simplified, and easy-to-use relational database
• A client/server app
– Runs on the internet
– Not a desktop app like Access
• Created by Monty Widenius in the mid-1990s
– Open Source
– A Finn living in Sweden
– Same time as PHP
• Powered the Web 2.0 revolution
phpMyAdmin
• A PHP interface to MySQL
• Relatively easy to use
– No need to know SQL
• Great to manage databases that your PHP programs will use
• Today you will get started using UVA’s free MySQL server
The role of PHP