faculty.juniata.edu/rhodes/ida/exams/exam1f13a.docx

Date post: 18-Jan-2021

IT 241 Information Discovery Fall 2013 Exam 1 Page 1

Thursday, Sept. 26, 2013    Name _____________________________

[12 pts] 1. Below is one of the visualization pipelines from the text.

For each of the following concepts, list the pipeline stage, or the interaction/action, with which the concept is best associated.

Handling missing values _______________________ Normalizing values ________________________

Relational database tables ____________________________

A CSV file ______________________________ A scatterplot __________________________________

Selecting datapoints in a parallel coordinates graph ______________________________

2. Name four dimensions of data displayed in Minard’s map of Napoleon’s march to Moscow and back (see back page). [4 pts]

_______________________________ ______________________________________

_______________________________ ______________________________________

3. A graphic can be classified as an exploratory visualization, an explanatory visualization or an example of visual art. [8 pts]

a. Explain the difference between exploratory and explanatory visualization.

b. Classify each of these visualizations as one of the three categories.

Nightingale’s rose-petal diagram of the causes of death in the army _______________________________

A scatterplot of car horsepower versus gas efficiency _______________________________

The map of the internet (see last page) ___________________________________

A matrix of scatterplots from the US census _________________________________


[20 pts] 4. Data coding

a. The value 1010 0101 in binary is _______________ in decimal and its corresponding hexadecimal digits are _____.

Decimal 42 converted to binary is ______________ .

If the 8 bit ASCII codes in decimal for “A” and “a” are 65 and 97, respectively, then the decimal ASCII codes for the string “Bad” are _____________________________________.

If we want to store 42 unique values, we would need at least _______ bits to represent those values.

b. An 800 x 500 pixel color image coded in RGB (+ 1 alpha byte) format requires ________________ bytes.

c. Compression techniques are used for images, audio and video. What does it mean if the compression is “lossy”?

d. A 20 second stereo sound clip (2 channels) sampled at 48000 samples per second with a 16 bit depth will result in storing __________________ bytes.

e. What are the colors for these RGB hexadecimal encodings?

000000 = __________________ 555555 = ______________________

0000FF = __________________ 00FF00 = ______________________

f. If your data is simply a table of data with rows and columns, then what editable file structure is appropriate?

___________ (choose from XML, CSV, BMP, XLS)

If your data contains elements that have sub-elements, what editable data file structure would be appropriate?

_______________ (XML, CSV, BMP, XLS)
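
As a study aid, the arithmetic behind parts a, b and d of question 4 can be checked with a short Python sketch. It is not part of the original exam. The helper names `bits_needed`, `image_bytes` and `audio_bytes` are hypothetical names introduced here, and the size formulas assume uncompressed storage with no file header.

```python
def bits_needed(n_values: int) -> int:
    """Smallest bit count that can distinguish n_values unique values."""
    return max(1, (n_values - 1).bit_length())


def image_bytes(width: int, height: int, bytes_per_pixel: int) -> int:
    """Uncompressed image size: one fixed-size record per pixel."""
    return width * height * bytes_per_pixel


def audio_bytes(seconds: int, sample_rate: int, channels: int, bit_depth: int) -> int:
    """Uncompressed PCM size: samples x channels x bytes per sample."""
    return seconds * sample_rate * channels * (bit_depth // 8)


# Part a: binary, hexadecimal, and ASCII conversions.
value = int("1010 0101".replace(" ", ""), 2)   # parse base-2 text
hex_digits = format(value, "X")                # uppercase hexadecimal digits
binary_42 = format(42, "b")                    # decimal 42 as binary text
bad_codes = [ord(ch) for ch in "Bad"]          # code of each character
unique_42_bits = bits_needed(42)

# Parts b and d: raw storage sizes.
rgba_image = image_bytes(800, 500, 4)          # R, G, B and alpha bytes per pixel
stereo_clip = audio_bytes(20, 48000, 2, 16)

print(value, hex_digits, binary_42, bad_codes, unique_42_bits)
print(rgba_image, stereo_clip)
```

Using `(n - 1).bit_length()` instead of a floating-point logarithm keeps the bit count exact when the number of values is a power of two.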

5. Plot on the number line with small circles this set of 10 univariate numbers {45, 55, 63, 67, 68, 74, 80, 84, 90, 95} then superimpose a Tukey box plot representing the median and 25th and 75th percentiles.

[16 pts]

┼────┼────┼────┼────┼────┼────┼────┼────┼────┼────┼────┼
40   45   50   55   60   65   70   75   80   85   90   95

What is the mean of these data? _______

If a standard deviation spans the middle 70% of the data above and below the mean, estimate its value ________

If we normalize this data set to fall between 0 and 100, then these 4 values are recoded as…

45-> ______, 55-> _________, 90->_______ and 95-> _________
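
The summary statistics in question 5 can be sketched in Python as follows. This is an illustration under stated assumptions, not the exam's answer key: quartiles are computed as Tukey hinges (the medians of the lower and upper halves), and the course text may use a different percentile convention. The function names `tukey_hinges` and `min_max_scale` are hypothetical.

```python
from statistics import mean, median, stdev

data = [45, 55, 63, 67, 68, 74, 80, 84, 90, 95]


def tukey_hinges(values):
    """Return (lower hinge, median, upper hinge) of the values.

    For an odd count the median is included in both halves,
    which is the usual hinge convention.
    """
    ordered = sorted(values)
    n = len(ordered)
    lower_half = ordered[: (n + 1) // 2]
    upper_half = ordered[n // 2:]
    return median(lower_half), median(ordered), median(upper_half)


def min_max_scale(values, new_min=0.0, new_max=100.0):
    """Linearly rescale values so min -> new_min and max -> new_max.

    Assumes the values are not all identical (the span must be non-zero).
    """
    lo, hi = min(values), max(values)
    span = hi - lo
    return [new_min + (v - lo) * (new_max - new_min) / span for v in values]


q1, med, q3 = tukey_hinges(data)
scaled = min_max_scale(data)

print("box plot:", q1, med, q3)
print("mean:", mean(data), "sample std dev:", stdev(data))
print("normalized:", dict(zip(data, scaled)))
```

Note that `stdev` is the sample standard deviation; the question asks only for a visual estimate, so the two need not match exactly.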


6. We saw the following relational database SQL query in class. Fill in the blanks below regarding the query. [4 pts]

SELECT S.lastName, S.firstName, S.major
FROM Student S, Enroll E
WHERE E.grade<='B' AND E.stuId=S.stuId

Describe the result set of this query, i.e., what rows of data would come from the application of the query?
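
One way to explore the query is to run it against a small in-memory SQLite database from Python. The sketch below is not from the exam: the sample rows, the `classNumber` column, and the column types are all hypothetical. Only the table names, the columns named in the query, and the query text itself come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Student (stuId TEXT PRIMARY KEY, lastName TEXT,
                          firstName TEXT, major TEXT);
    CREATE TABLE Enroll (stuId TEXT, classNumber TEXT, grade TEXT);

    -- Hypothetical sample data, invented for illustration.
    INSERT INTO Student VALUES
        ('S1', 'Smith', 'Tom',   'History'),
        ('S2', 'Chin',  'Ann',   'Math'),
        ('S3', 'Lee',   'Perry', 'Art');
    INSERT INTO Enroll VALUES
        ('S1', 'HST205', 'A'),
        ('S1', 'ART103', 'B'),
        ('S2', 'MTH101', 'C'),
        ('S3', 'ART103', 'B');
    """
)

# The query from the exam, unchanged apart from whitespace.
rows = conn.execute(
    """
    SELECT S.lastName, S.firstName, S.major
    FROM Student S, Enroll E
    WHERE E.grade<='B' AND E.stuId=S.stuId
    """
).fetchall()

# SQL does not guarantee row order without ORDER BY, so sort for display.
for row in sorted(rows):
    print(row)

conn.close()
```

With this sample data, a student appears once per matching enrollment row rather than once overall, and the grade test is a text comparison, so it depends on how grades are encoded.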

7. Describe three possible ways to handle missing data in a data set. [6 pts]

a.

b.

c.
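
For illustration only, three commonly used strategies for missing values can be sketched in Python. These are general options, not necessarily the three the course text lists, and the `readings` column is hypothetical data invented for this sketch.

```python
from statistics import mean

# Hypothetical column in which None marks a missing value.
readings = [12.0, None, 15.5, None, 11.0]

# 1. Drop the records whose value is missing.
dropped = [r for r in readings if r is not None]

# 2. Impute: substitute the mean of the observed values.
fill_value = mean(dropped)
imputed = [fill_value if r is None else r for r in readings]

# 3. Keep every record and add a flag marking which values were missing.
flagged = [(r, r is None) for r in readings]

print(dropped)
print(imputed)
print(flagged)
```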

8. The following attributes are found in a daily weather data set in various towns or counties in some state or province. Associate the best descriptor for each attribute. If you do not understand the meaning of the attribute, please ask for clarification.

Choices: Nominal-Categorical (Cat), Nominal-Arbitrary (no categories) (Arb), Ordinal Continuous (Cont), Ordinal Discrete (Disc), Spatial/geographic (Geo), Temporal/Time (Time)

[8 pts]

Date

Day of Week (Mon, Tue, etc)

Latitude-Longitude

County/town name

High temperature in F

Rainfall in mm

Number of highway fatalities

Prominent cloud type (choice of 6 types + ‘clear’)


9. True/false miscellany.

[22 pts]

_____ Statistics can be computed only if the data columns are ordinal.

_____ Nominal-categorical can be converted to ordinal when the nominal data is ordered.

_____ Nominal-categorical unordered can be converted to a series of binary attributes.

_____ A Likert scale is unordered.

_____ Calculating a mode statistic can be directly applied to continuous data.

_____ Bins with the ranges [80,90) and [90,100] put the value 90 into the [80,90) bin.

_____ Frequency counts are appropriate only for data in discrete ordinal form.

_____ The correlation statistic is a value in the range [-1,1].

_____ A correlation of 0.9 says that the paired values of the two attributes roughly agree: when one is relatively high or low, so is the other.

_____ A correlation close to zero means that both attributes can be ignored.

_____ Linear regression attempts to fit a line to the data that maximizes the y-distance between the line and the data points.

_____ Linear regression is limited to 1 dependent variable and 1 independent variable.

_____ Attributes normally correspond to rows and tuples correspond to columns in relational databases.

_____ Relational databases relate data by common values between tables.

_____ CSV files should have the same number of values in each row.

_____ CSV files may not use the first row as column names.

_____ Since commas are used as separators in CSV, you cannot use commas in the data like “Name, Jr.”

_____ Spaces and tabs in CSV files are ignored unless used for word separation.

_____ XML looks like HTML (web page sources) except that we can define our own tags.

_____ Data can be represented in multiple ways in XML format.

_____ JSON is an alternative data file format to XML and CSV formats.

_____ Word document files (.doc, .docx) contain much binary data.
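
Two of the ideas above, interval notation for bins and commas inside CSV fields, can be tried out with a short Python sketch. It is a study aid rather than part of the exam; the function name `bin_label` and the sample row are hypothetical, and the CSV behavior shown is that of Python's standard `csv` module with its default settings.

```python
import csv
import io


def bin_label(value):
    """Assign a value to a bin: '[' or ']' includes the endpoint, ')' excludes it."""
    if 80 <= value < 90:
        return "[80,90)"
    if 90 <= value <= 100:
        return "[90,100]"
    return None


# Write a row whose first field contains a comma, then read it back.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["name", "score"])
writer.writerow(["Name, Jr.", 90])
csv_text = buffer.getvalue()

rows = list(csv.reader(io.StringIO(csv_text)))

print(bin_label(90))
print(csv_text)
print(rows)
```

The writer wraps the field containing a comma in double quotes, and the reader returns it as a single value. Numbers come back as text, since CSV carries no type information.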
