Mdst3705 2013-02-05-databases

Post on 21-Nov-2014


From Data Structures to Databases

Prof. Alvarado
MDST 3703

5 February 2013

Business

• Quiz 1
– To be posted this evening
– Due Thursday evening
– Covers content before Databases
– End-of-week reflections still due

• Blogging
– Please remember to be timely

• Safari Resources
– If you can’t access, try going through the Library page

Review

• Building as knowing
– Ramsay’s point in “On Building”

• DH as cultural reverse engineering
– Finding the rules in the patterns
– Texts and images are the patterns in question

• Reverse engineering is like building
– Same process in reverse (deconstruction)
– Also requires building other things, like databases to store stuff

For example, in Studio on Thursday we began to reverse engineer Plato’s Republic. The next step in our exercise was to parse the text into “words” and organize them in a list using an array

By the way, were we actually grabbing words?

Not really – we were finding substrings, letter patterns that could also exist within words (e.g. “cavern”)

Also, these patterns did not match synonyms or pronouns (e.g. “this”) that stand for the same thing as the word in question
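The gap between a substring and a word can be seen in a few lines of PHP. This is a minimal sketch, not the Studio code: the sample sentence and the search term “cave” are assumptions chosen for illustration.

```php
<?php
// Hypothetical sample line; not taken from the Studio text file.
$line = 'The cavern was darker than the cave itself.';

// Substring search: counts the letters "cave" wherever they occur,
// including the occurrence hidden inside "cavern".
$substringHits = substr_count($line, 'cave');

// Word search: \b marks a word boundary, so "cavern" no longer matches.
$wordHits = preg_match_all('/\bcave\b/', $line);
```

Even the word-boundary version is still purely syntactic: it would not find “this” or “it” when they stand for the cave.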

This is the difference between SYNTAX and SEMANTICS

Syntax = sequences of signs
Semantics = meanings of signs

Semantics is much harder for computers to grasp than syntax

In fact, some think that semantics is beyond the capacity of any computer

Getting back to PHP

We can use arrays to model the text. So, within a FOREACH loop iterating through the lines of a text and parsing each line for “words,” we could do the following:

$words[$word]++;
$words[] = $word;
$lines[$lineNumber][] = $word;

Each method suggests a different model
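Put together, the three statements might sit inside the loop roughly like this. It is a sketch under assumptions: the two sample lines and the whitespace-based splitting are made up, and the models are kept in separate variables ($counts and $sequence are names not in the slides) so that the counts and the ordered list do not collide in one $words array.

```php
<?php
// Hypothetical two-line sample standing in for the text file.
$text = ['the cave is dark', 'the fire is bright'];

$counts   = []; // model 1: word type => number of occurrences
$sequence = []; // model 2: position in the text => word
$lines    = []; // model 3: line number => position in line => word

foreach ($text as $lineNumber => $line) {
    // Split on runs of whitespace; a real parser would also handle punctuation.
    foreach (preg_split('/\s+/', trim($line)) as $word) {
        // Same effect as $words[$word]++, without a warning on first sight.
        $counts[$word] = ($counts[$word] ?? 0) + 1;
        $sequence[] = $word;
        $lines[$lineNumber][] = $word;
    }
}
```

The first model forgets word order, the second forgets line breaks, and the third keeps both at the cost of a second index.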

More about PHP Arrays

• Arrays can be added to like so:
$myArray[] = $newItem;

• Arrays can also use strings instead of numbers as indices, e.g.
$myArray[3] = 'foo';
$myArray['person'] = 'Bob';

• Array items may also point to arrays, creating multidimensional arrays
$myArray['person'] = array();
$myArray['person']['Bob'] = $something;

Arrays with string indices are called “associative arrays” in PHP

Arrays of arrays can be used to create data structures like trees and grids

Read Chapter 5 of PHP: The Good Parts to learn more about arrays (see link in Resources on the course blog)

Also, the PHP manual is always a good place to look:
http://php.net/manual/en/language.types.array.php

Arrays as Data Structures

• PHP arrays can be used to create data structures to model things, like texts, e.g.
$words[$word]++;
$words[] = $word;
$lines[$lineNumber][] = $word;

• These three create the following
1. A simple list of word types (and their counts)
2. A list of each word in order (position and word)
3. A grid of line numbers and words

Here is an example of how we would create the third kind of data structure. This would store a grid of words.

And it would store the text in a grid something like this one …

These numbers are the first dimension of the array (Y)

These horizontal numbers are the second dimension of the array (X)

In this model, a text is a grid of words, each with an X and Y coordinate
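Reading from such a grid is then a matter of giving both coordinates. A small sketch with assumed sample words; the variable names $y and $x are illustrative only.

```php
<?php
// Hypothetical grid, built the same way as $lines[$lineNumber][] = $word.
$lines = [
    0 => ['I', 'went', 'down', 'yesterday'],
    1 => ['to', 'the', 'Piraeus'],
];

$y = 1; // first dimension: line number
$x = 2; // second dimension: position within the line
$word = $lines[$y][$x];

// Count the words in each row; rows need not be the same length.
$lengths = array_map('count', $lines);
```

Because the rows differ in length, the structure is really a ragged grid rather than a strict table.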

Is this the only way to represent a text?

Is it the most accurate?

Texts can also be represented as trees

Document Elements and Structures

Play
– Act +
  • Scene +
    – Line +

Book
– Chapter +
  • Verse +

Letter
– Heading
  • Return Address
  • Date
  • Recipient Info
    – Name
    – Title
    – Address
– Content
  • Salutation
  • Paragraph +
  • Closing
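A tree like these can also be held in nested PHP arrays. This is a minimal sketch, assuming a made-up play with two acts; the 'name' and 'children' keys and the depth() helper are illustrative and not from the slides.

```php
<?php
// Hypothetical play: each node has a name and a list of child nodes.
$play = [
    'name' => 'Play',
    'children' => [
        ['name' => 'Act 1', 'children' => [
            ['name' => 'Scene 1', 'children' => [
                ['name' => 'Line 1', 'children' => []],
                ['name' => 'Line 2', 'children' => []],
            ]],
        ]],
        ['name' => 'Act 2', 'children' => []],
    ],
];

// Count the levels from this node down to its deepest descendant.
function depth(array $node): int
{
    $deepest = 0;
    foreach ($node['children'] as $child) {
        $deepest = max($deepest, depth($child));
    }
    return 1 + $deepest;
}

$levels = depth($play);
```

Nothing in the structure fixes how deep it may go, so it only takes a few lines of recursion to walk it.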

XML is designed to represent text

What are some differences between trees and tables?

Tables are more rigid
Trees allow for indefinite depth

But tables are easier to manipulate

In any case, tables and trees are two major kinds of data structure that you will encounter …

Speaking of trees … what is this?

". . . the tree of nature and logic by the thirteenth-century poet, philosopher, and missionary Ramon Lull. The main trunk supports a version of the tree of Porphyry, which illustrates Aristotle's categories. The ten leaves on the right represent ten types of questions, and the ten leaves on the left are keyed to a system of rotating disks for generating answers. Such diagrams and disks comprise Lull's Ars Magna (Great Art), which was the first attempt to develop mechanical aids to reasoning. It served as an inspiration to the pioneer in symbolic logic, Gottfried Wilhelm Leibniz.”

John Sowa, explaining the cover art for Knowledge Representation

Tree of Logic (and a primitive computer)

What is this tree an example of?

The tree is a “knowledge representation” (KR)

A KR is a model that comprises

1. A set of categories (aka Ontology)
Names and relationships between names

2. A set of inference rules (aka Logic)
A method of traversing names and relations

3. A medium for computation
A medium for producing inferences

4. A language for expressing these things
Such as a programming or markup language

Ontologies are systems of categories rooted in world views

Ontologies consist of categories and their relationships

These are often mapped onto physical things – the human body, or trees – as part of our cognitive model

The tree as body as society among the Umeda of New Guinea

Logic is a name for the systematic unpacking of ontologies in discourse …

Here is a sample ontology, one very similar to Aristotle’s

And this is a syllogism, the basic unit of reasoning in classical logic

How is it related to the tree?

The sentences in the syllogism stand for the traversal of the tree that represents an implicit ontology

Reasoning always implies an ontology

Ontologies are often unexpressed

Ontologies often conflict with each other

(Digital) Humanists excavate or reverse engineer these ontologies

Now, a KR for a computer has to be an operationalized KR

How would we express a syllogism in PHP?

One way is to convert the tree into an array


But, given such an array, how can we find out if Socrates is mortal?

How do we find if the following is set:

We’d have to do some complicated nested looping to find the answer …
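One way to tame the looping is a recursive search. This is a sketch under assumptions: the slide’s array is not reproduced in the transcript, so the shape of $ontology and the helper names findPath() and isA() are hypothetical.

```php
<?php
// Hypothetical ontology: each key is a category, each value its subcategories.
$ontology = [
    'Being' => [
        'Mortal' => [
            'Animal' => [
                'Human' => ['Socrates' => [], 'Plato' => []],
                'Horse' => [],
            ],
        ],
        'Immortal' => ['God' => ['Zeus' => []]],
    ],
];

// Return the chain of categories leading down to $name, or null if absent.
function findPath(array $tree, string $name, array $path = []): ?array
{
    foreach ($tree as $category => $children) {
        $here = array_merge($path, [$category]);
        if ($category === $name) {
            return $here;
        }
        $found = findPath($children, $name, $here);
        if ($found !== null) {
            return $found;
        }
    }
    return null;
}

// "Socrates is mortal" holds if Mortal lies on the path down to Socrates.
function isA(array $tree, string $name, string $category): bool
{
    $path = findPath($tree, $name);
    return $path !== null && in_array($category, $path, true);
}

$socratesIsMortal = isA($ontology, 'Socrates', 'Mortal');
$zeusIsMortal = isA($ontology, 'Zeus', 'Mortal');
```

The recursion hides the nested loops rather than removing them: every question still walks the tree node by node.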

So, PHP gives us tools to create an ontology, but not a way to reason efficiently with it

To create more effective KRs, we need the services of a database

A database is “a system that allows for the efficient storage and retrieval of information”

But beyond this, it also allows us to “represent knowledge”

Given Unsworth’s definition, how must it do this?

Databases provide a language to define ontologies (schema) and to “unpack” these ontologies via a query language that lets us efficiently search and retrieve data organized by the schema
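As a rough illustration of schema plus query, the syllogism can be stored as rows and retrieved with SQL from PHP. The sketch makes several assumptions: it uses PDO with an in-memory SQLite database as a stand-in for the course’s MySQL server (so the pdo_sqlite extension must be available), and the table name category and its columns are invented for the example.

```php
<?php
// Assumption: in-memory SQLite stands in for a MySQL server.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Schema: the ontology expressed as a table of names and their parents.
$db->exec('CREATE TABLE category (name TEXT PRIMARY KEY, parent TEXT)');

$insert = $db->prepare('INSERT INTO category (name, parent) VALUES (?, ?)');
foreach ([['Mortal', null], ['Human', 'Mortal'], ['Socrates', 'Human']] as $row) {
    $insert->execute($row);
}

// Query: is the given name a child of something whose parent is the category?
$query = $db->prepare(
    'SELECT COUNT(*) FROM category AS c
     JOIN category AS p ON c.parent = p.name
     WHERE c.name = ? AND p.parent = ?'
);
$query->execute(['Socrates', 'Mortal']);
$socratesIsMortal = (int) $query->fetchColumn() === 1;
```

This join only looks exactly two levels up the hierarchy; it is meant to show the division of labor between schema and query, not a general inference engine.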

In this course, we are going to use a relational database to store and access information

Relational databases use a language known as SQL

(pronounced S-Q-L, although some say “sequel”)

SQL

• SQL stands for “Structured Query Language”
– NOT invented by Microsoft

• Invented in the 1970s and commercialized in the 1980s
– Probably responsible for new business models like JIT inventories

• Built on Codd’s relational model (1970)
– Implements set theory and formal logic
– Around the time of SGML

SQL

• A language used by relational databases
– Oracle, SQL Server, Access, etc.

MySQL

• A very fast, simplified, and easy to use relational database

• A client/server app
– Runs on the internet
– Not a desktop app like Access

• Created by Monty Widenius in the mid 1990s
– Open Source
– A Finn living in Sweden
– Same time as PHP

• Powered the Web 2.0 revolution

phpMyAdmin

• A PHP interface to MySQL

• Relatively easy to use
– No need to know SQL

• Great to manage databases that your PHP programs will use

• Today you will get started using UVA’s free MySQL server

The role of PHP