
DKAN - DemTools · 2015-12-07 · standards, including the White House’s Project Open Data and...

Date post: 18-Jul-2020
DKAN Data Warehousing, Visualization, and Mapping

DKAN

Data Warehousing, Visualization, and Mapping


Acknowledgements

We’d like to acknowledge the NuCivic team, led by Andrew Hoppin, which has done amazing work creating open-source tools to make data available to the world; it’s been a pleasure improving DKAN together over the past two years. Gemima Barlow and the NDI Nigeria team initially supported the development of color-shaded maps, teaching us the meaning of the word “choropleth” in the process, and NDI’s Gender, Women and Democracy team identified and funded important usability improvements.

This content is available under a Creative Commons Attribution-ShareAlike 4.0 International Public License. You are free to:

Share - copy and redistribute the material in any medium or format.

Adapt - remix, transform, and build upon the material for any purpose, even commercially.

The licensor cannot revoke these freedoms as long as you follow the license terms. The license terms include:

Attribution - You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.

ShareAlike - If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.

No additional restrictions - You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.

Table of Contents

Acknowledgements
Table of Contents
Introduction
    Purpose of DKAN
    Features
Adding Data to DKAN
    Adding a New Dataset and Resource(s)
        Step 1: Create the Dataset
        Step 2: Add one or more Resources to the Dataset
        Step 3: Adding Metadata to a Dataset
Visualizations
    Charts
    Choropleth Maps
    Publishing Visualizations
    Questions
    Data Stories


Introduction

Many governments, institutions, and organizations are now moving towards open data, collecting and publishing large quantities of information in an effort to increase transparency and use data to inform policy. However, open data alone is not enough to improve lives: the raw data has to be organized, processed, and presented in human-readable formats so that citizens, analysts, and policymakers can effectively use the information. Many organizations lack the resources and technical capability to use commercial data visualization services or develop platforms of their own. As a result, the organizations in the best position to collect data and work closely with the communities the data comes from often lack the ability to present and share this information effectively.

Purpose of DKAN

Spreadsheets of raw numbers are difficult for most of us to understand. With DKAN, organizations can take large amounts of data and quickly organize, display, analyze, and visualize this information. This data-driven storytelling can help policymakers understand the data and make better decisions, and each form of visualization can be created as needed. Choropleth maps show regional trends and variations at a glance, and a large dataset can be organized into multiple charts and graphs comparing changes over time, region, funding, or any number of variables. While other programs can easily be used to create individual graphs or sort lists of data, DKAN provides a comprehensive data warehousing, browsing, and visualization solution for large sets of data tagged with multiple variables, with highly customizable options based on the same set of data. DKAN is especially useful for rapidly prototyping multiple visualizations, aggregating data, and displaying changes over time or by geographic region. It has been particularly successful in releasing data from elections, censuses, health monitoring, and economic analysis.

Features

Ready to use out-of-the-box, DKAN boasts powerful data warehousing, publishing, and visualization capabilities. With this tool, users can quickly publish and display open data, creating powerful data narratives with charts, graphs, and maps. The content management system (CMS) can be integrated with blogs, and DKAN is compatible with major open data standards, including the White House’s Project Open Data and data.gov. Since DKAN is open source, users can download the source code from our GitHub or from Drupal for free and use the same tool that governments pursuing open data rely on and that NDI has used in multiple elections to publish and visualize data.

Adding Data to DKAN

DKAN’s data publishing model is based on the concept of datasets and resources. A dataset is a collection of one or more resources; a resource is the actual “data” being published, such as a CSV table or a GeoJSON data file.

Adding a New Dataset and Resource(s)

In our example, we’ll be adding a dataset with Wisconsin polling places to a DKAN site. The data may look familiar; it's one of the sample datasets provided with DKAN upon installation.

Step 1: Create the Dataset

By default, only authenticated (“logged-in”) users can add new Datasets and Resources to a DKAN website. Once logged in, we can use the "Add Dataset" link in the main navigation bar. Depending on your user permissions, you may have access to the administration menu; in that case, you may also navigate to Content >> Add Content >> Dataset to access the “Create Dataset” form.

The Dataset is simply the container or folder for the actual data resource files. It holds the higher-level information that applies across all the data, such as title, description, category tags, and license. Once we’ve entered information about the data, we can click the “Next: Add data” button to begin adding data.

Step 2: Add one or more Resources to the Dataset

After creating a dataset, we’re prompted to add one or more data resources to it. There are three types of Resources that can be added to a Dataset, depending on the type and location of the Resource:

Upload a file -- this option allows publishers to upload data files to the DKAN site. As in the “link to a file” option, the data within the file will be imported into your DKAN site’s Datastore for preview and analysis by your users. See The DKAN Datastore for more information.

Link to a file -- this option allows publishers to create a link to a data file published on another Internet website. Although the file itself will remain on the other site, the data within the file can be imported into your DKAN site’s Datastore for preview and analysis by your users. See The DKAN Datastore for more information.

Link to an API -- some data resources aren’t standalone files but queryable online databases; the interface to these databases is known as an API. Adding links to these types of online database interfaces to your DKAN data catalog can be very useful for developers interested in working with your data.

Typically, you’ll need to upload a file (almost always a .CSV), so please feel free to ignore the linking options if you don’t need them.
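Since most resources are CSV files, it can help to check a file before uploading it. The following is a minimal sketch, not part of DKAN itself: the helper name `check_csv` and the sample rows are made up for illustration. It reads CSV text and reports any row whose column count differs from the header's.

```python
import csv
import io


def check_csv(text):
    """Return (header, problems) for CSV text.

    problems lists every data row whose number of columns differs
    from the number of columns in the header row.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return [], ["file is empty"]
    header = rows[0]
    problems = []
    # Data rows start at row 2 because row 1 is the header.
    for row_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"row {row_no}: expected {len(header)} columns, found {len(row)}"
            )
    return header, problems


# Illustrative sample only; the second data row is missing a column.
sample = "name,address,city\nTown Hall,123 Main St,Madison\nLibrary,45 Oak Ave\n"
header, problems = check_csv(sample)
print(header)
print(problems)
```

A file that passes this kind of check is more likely to preview cleanly, although the site may apply its own rules on import.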

To continue with our Wisconsin Polling Places example, we’ll add one resource file to the Dataset we created in Step 1. Our resource file is a CSV (comma-separated values) file, a popular format for exchanging tabular data. Let’s explore the example resource shown here and the various fields within:

Resource / Choose File - upload a file from your local hard drive.

Resource / Recline Views - DKAN’s “Data Preview” feature allows visitors to preview published data in three views:

    Map - data with latitude and longitude coordinates can be previewed in a map interface

    Graph - tabular (spreadsheet) data can be graphed by users, letting them create their own meaningful visualizations (Please note this is a method for the data intake, not for rendering the graphs themselves)

    Grid - by default, tabular data is presented in a basic spreadsheet view, with filter, sort, and search capabilities

Title - this is the title of the individual data file, not the parent dataset container.

Description - a rich-text editor field is provided so publishers can offer detailed and useful descriptions.

Format - entering the file format here lets users search for data by specific format.

Dataset - this is the parent dataset container; this field should already be populated if you’re adding a Resource subsequent to adding a Dataset.

At the bottom of the Add Resource page, we can choose:

Save - Save progress on this resource and immediately return to it for further editing

Save and add another - Save this resource and add another resource to the same dataset

Next: Additional Info - Save this resource and enter optional metadata

In our example, we’re only adding a single resource, so we’ll click “Next: Additional Info” to move on to Step 3. If we had more than one resource to add to this dataset, we would choose the “Save and add another” option.
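The relationship between a dataset and its resources can be pictured as a small data structure. The sketch below is only an illustration of the model described in this section, not DKAN's internal schema: the class names, field names, and the file name `polling_places.csv` are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """One published data file or API link (hypothetical structure)."""
    title: str
    file_format: str      # e.g. "csv" or "geojson"
    source: str           # uploaded file name, remote file URL, or API URL
    kind: str = "upload"  # "upload", "link_file", or "link_api"


@dataclass
class Dataset:
    """Container holding the information shared by all of its resources."""
    title: str
    description: str = ""
    tags: List[str] = field(default_factory=list)
    license: str = ""
    resources: List[Resource] = field(default_factory=list)

    def add_resource(self, resource):
        """Attach a resource and return the dataset for chaining."""
        self.resources.append(resource)
        return self


# Step 1 creates the container; Step 2 adds a resource to it.
polling = Dataset(title="Wisconsin Polling Places", tags=["elections"])
polling.add_resource(Resource("Polling places", "csv", "polling_places.csv"))
print(len(polling.resources), polling.resources[0].file_format)
```

Adding a second resource with another `add_resource` call mirrors the “Save and add another” option.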

Step 3: Adding Metadata to a Dataset

Organizations may be interested in providing valuable information about their dataset to both human visitors to the website and machines discovering the dataset through one of DKAN's public APIs. All of the fields below are optional, but they provide important context on the data's type, kind, and function. Adding metadata to the dataset further clarifies how the data can be used by others.

Let's take a closer look at some of the metadata fields available on this form:

Author - The dataset's author, in plain text.

Spatial / Geographical Coverage Area - Lets us define what region the data applies to. In this case, the US State of Wisconsin. You can use the map widget to draw an outline around the state borders, or click the "Add data manually" button if you already have a GeoJSON string you can paste in.

Spatial / Geographical Coverage Location - The region the data applies to, written in plain text. This can be used instead of or in addition to the Coverage Area field.

Frequency - How often is this dataset updated? We might expect our list of polling places to be updated every year, so we could select "annually." However, often we don't expect the data to be updated (even in this case, perhaps we plan to post the next version of the data as a separate dataset), in which case we can leave this blank.

Temporal Coverage - Like Geographic Coverage, this field lets us give some context to the data, but now for the relevant time period. Here we could enter the year or years for which our polling places data is accurate.

Granularity - This is a somewhat open-ended metadata field that lets you describe the granularity or accuracy of your data. For instance: "Year".

Data Dictionary - Another open-ended field, this is a space for almost any kind of explanation for understanding the terminology, units, column names, and so on in our dataset. In most cases, this will be a simple URL to a Data Dictionary resource elsewhere on the web.

Additional Info - Lets us arbitrarily define other metadata fields. See Additional Info field for more information.

Resources - This field is a reference to the resources you have already added. You should generally leave this field alone and use the workflows outlined here and in Updating Datasets in DKAN to add, edit and remove resources from your Dataset.
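For the Coverage Area field, a GeoJSON string can be prepared ahead of time. The sketch below builds a simple rectangular polygon; the helper name `bounding_box_polygon` is hypothetical, the coordinates are a rough, illustrative box around Wisconsin rather than an authoritative boundary, and whether the form expects a bare geometry or a full Feature is an assumption to verify on your site.

```python
import json


def bounding_box_polygon(west, south, east, north):
    """Build a GeoJSON Polygon for a rectangle.

    GeoJSON positions are [longitude, latitude], and the exterior
    ring is listed counterclockwise.
    """
    ring = [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],  # a linear ring must end on its first point
    ]
    return {"type": "Polygon", "coordinates": [ring]}


# Rough, illustrative bounds only; not an official state boundary.
coverage = bounding_box_polygon(-92.9, 42.5, -86.8, 47.1)
geojson_string = json.dumps(coverage)
print(geojson_string)
```

The printed string is the kind of value that could be pasted in after clicking "Add data manually"; drawing the outline with the map widget remains the simpler route for irregular shapes.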

After you click "Save", the metadata we enter will appear on the page for this Dataset:

Visualizations

Charts

For numeric data that’s best rendered comparatively, you’ll want to make charts with your resources. You can make bar charts, pie charts, scatterplots, or line graphs.

Navigate to the dataset you want to base your chart on, then click the ‘Explore Data’ button.

Right-click (or on Macs, control-click) the download button to copy the URL of the resource file. Saving this link will allow you to directly revisit your resource in the future.

Now use the administration menu at the top to navigate to Structure » Entity types » Visualization » Chart » Add Chart.

Enter values for the title, description, categories and tags fields.

At the bottom of the form, paste the resource link you just copied into the ‘Source’ field.

Now, click the ‘Next’ button.

If the URL loaded properly, you will have two fields to fill in under the title 'Define Variables'. The first, 'Series', stands for the Y axis, and the second, 'X-Field', stands for the X axis. In these fields, choose the columns you want to display. Only the Series field can contain multiple values. If the column names are not displayed properly, check again that your source URL is correct. Leave the radio buttons set to 'auto'.

After making sure that everything is correct, click the ‘Next’ button.

Now you can select the type of chart you want to create. Click on the image of the chart type you would like to use.

The charts on this screen are generic images and not based on the data you loaded. To see the actual chart, click the ‘Next’ button.

If everything went well, you should see your chart displayed. The data might be slightly misplaced, so in the right-hand column you can edit the X Format for the labels (number, date, etc.), the Label Rotation, the color of the lines, columns, and so on, the X and Y labels for the axes themselves, and the margins, which move the chart as well as the labels.

If you would like to see what this data looks like in another type of chart or graph, click “Back” at the bottom of the page and repeat these steps with another chart or graph selection.

After editing and customizing the chart to your liking, click the ‘Finish’ button.

Now you have created your chart. On the chart’s page, there will be an “Embed” button. Click on it to reveal the HTML embed code, which you can add to any website to embed a live, dynamic chart that will update if you change the chart on your DKAN site. You can also set the height and width of the embedded chart by typing them into the Height and Width boxes above the embed code.
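The 'Series' and 'X-Field' choices described above amount to picking one column for the X axis and one or more columns for the Y axis. The following sketch shows that selection on CSV text; it is an illustration only, not DKAN's charting code, and the function name `extract_series`, the column names, and the sample values are made up.

```python
import csv
import io


def extract_series(csv_text, x_field, series_fields):
    """Return (x_values, series) from CSV text.

    x_values holds the X-axis labels as strings; series maps each
    chosen column name to its list of numeric Y values.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    x_values = []
    series = {name: [] for name in series_fields}
    for row in reader:
        x_values.append(row[x_field])
        for name in series_fields:
            series[name].append(float(row[name]))
    return x_values, series


# Made-up sample values, for illustration only.
sample = "year,registered,turnout\n2010,1200,640\n2012,1350,910\n"
x_values, series = extract_series(sample, "year", ["registered", "turnout"])
print(x_values)
print(series)
```

If a chosen column contains non-numeric text, `float` raises an error here, which is a useful hint that the column is a poor fit for the Series field.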

Choropleth Maps

A choropleth map is a thematic map in which areas are shaded or patterned in proportion to the measurement of the statistical variable being displayed on the map, such as population density or per-capita income. It provides an easy way to visualize how a measurement varies across a geographic area, or to show the level of variability within a region. Choropleth maps can be used to report area values at virtually any scale, from global to local, and the data can be examined at many different levels of analysis, from general overall patterns to the detection of details. They are especially helpful for finding intriguing hot spots.

To build one in DKAN, follow these steps:

1. Look for Content -> Add Content -> Resource in the admin menu and click on it.

2. Upload a CSV file for the resource.

3. Fill in the required fields and save the resource.

4. Look for Structure -> Entity Types -> Geo File -> geojson -> Add geojson in the admin menu and click on it.

GeoJSON is a widely used data format for displaying vectors in web maps. It is based on JavaScript Object Notation (JSON), a simple and minimalist format for expressing data structures using syntax from JavaScript. In GeoJSON, a vector feature and its attributes are represented as a JavaScript object, allowing for easy parsing of the geometry and fields.

5. Set the Title.

6. Upload a GeoJSON file.

7. Fill in the name attribute with the name of the column in the data (the CSV resource) whose values match the name property of the features in the GeoJSON file.

8. Click Save.

9. Look for Structure -> Entity Types -> Visualization -> Choropleth Visualization -> Add Choropleth Visualization in the admin menu and click on it.

10. Fill in the Title.

11. Select the GeoJSON file we created for the geojson field.

12. Select the resource file we created for the resource field.

13. Select the colors you would like to use for the choropleth map.

14. Fill in the data column field with the column or columns in your CSV that you want to display in the map. Separate multiple columns with a comma. The columns that you choose will appear as radio buttons on the side of your visualization, which you can then toggle between to see the effect of different data. If you leave this field blank, you'll get a list of radio buttons for all of the columns in your data sheet. Selecting certain columns can be helpful when, for instance, trying to show change over a certain time period: you could choose the April, May, and June columns but leave out July, August, and September.

15. Fill in the data breakpoints with comma-separated numbers. If you leave this field blank, breakpoints will be calculated for you based on the data. Breakpoints determine which data values are captured by each color on the visualization. For instance, if you use ‘25, 50, 75, 100’ as your data breakpoints, your visualization will display 4 different shades: one for values between 0 and 25, a slightly darker shade for values between 25 and 50, an even darker shade for values between 50 and 75, and the darkest shade for values between 75 and 100. Remember to choose your breakpoints wisely based upon the data that you want to display!

16. Click Save, and enjoy!
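The way breakpoints in step 15 divide values into shades can be sketched as a simple binning function. This is an illustration of the idea rather than DKAN's actual implementation: the function names are hypothetical, and two details are assumptions, namely that a value exactly on a breakpoint falls into the lower shade and that values above the last breakpoint get the darkest shade.

```python
from bisect import bisect_left


def parse_breakpoints(text):
    """Turn a string such as '25, 50, 75, 100' into a sorted list of floats."""
    return sorted(float(part) for part in text.split(",") if part.strip())


def shade_index(value, breakpoints):
    """Return the shade for a value: 0 is the lightest shade.

    Assumption: a value equal to a breakpoint belongs to the lower
    shade, and anything above the last breakpoint is clamped to the
    darkest shade.
    """
    index = bisect_left(breakpoints, value)
    return min(index, len(breakpoints) - 1)


breakpoints = parse_breakpoints("25, 50, 75, 100")
for value in (10, 25, 60, 100, 140):
    print(value, shade_index(value, breakpoints))
```

Running a few representative values through a function like this is a quick way to see whether the chosen breakpoints spread your data across all the shades or crowd most regions into one.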

Publishing Visualizations

After you finish creating the visualization, click the blue ‘Embed’ button to get an embed code for sharing it on other platforms. You can alter the height and width of the embedded visualization by entering the desired values in the corresponding text boxes. Once you’ve copied the code, you can embed your visualization anywhere that has a field for embedding an HTML element. Even on other sites, the visualization will automatically update to reflect any change made to the source data or settings on DKAN.
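To give a sense of what an embed code looks like, the sketch below assembles a generic HTML iframe snippet with a chosen width and height. Everything in it is an assumption for illustration: the function name, the URL, and the attributes are hypothetical, and the markup DKAN generates may differ, so always copy the real code from the ‘Embed’ button.

```python
from html import escape


def build_embed_code(visualization_url, width=960, height=600):
    """Return a generic <iframe> snippet for the given URL.

    This only illustrates the general shape of an embed code; it is
    not the markup produced by DKAN.
    """
    safe_url = escape(visualization_url, quote=True)
    return (
        f'<iframe src="{safe_url}" '
        f'width="{int(width)}" height="{int(height)}" frameborder="0"></iframe>'
    )


# Hypothetical address; replace with the code copied from your own site.
snippet = build_embed_code(
    "https://data.example.org/visualization/42/iframe", width=640, height=480
)
print(snippet)
```

Because the embedded page is loaded from the DKAN site each time, changes made there show up wherever the snippet has been pasted.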

Questions

DKAN not only renders data visualizations; it can also serve as a standalone data storytelling platform. The first function available for telling data stories is creating a “question,” which allows users to combine visualizations with companion text and images.

Fill in the fields as desired, attach files, and categorize the question as fits the content. Fields marked with a red asterisk (*) are required to create the question. Make sure the entity URL matches the one auto-generated for the question. Previously rendered visualizations can be added to the question by pasting the embed code into the corresponding field. Click ‘Save’ at the bottom and your question is ready for viewing.

Data Stories

Telling stories based on data is a primary goal of DKAN. Visualizations can be used to create a clear understanding of a complex situation. Furthermore, elements of storytelling can be used to illustrate what the findings actually mean. The best method for leveraging the narrative in your data with DKAN is creating a “data story”. Data stories consist of multiple elements and pieces of content, allowing you to build unique and engaging bulletins showcasing your data.

Title it and add any images, body text, or tags, then select the layout that best fits how you want to represent your data and content. Click ‘Save’ and you’ll be greeted with a screen prompting you to add and define your content. The functional icons do the following:

plus icons allow you to add content

gear icons permit you to modify formatting options

paintbrush icons allow you to change the style of the content’s pane

arrow icons enable you to change the position of the content

trash can icons allow you to delete the content

You can add all kinds of content, new or existing, and organize it as you see fit. When you’ve finished building and organizing content, click the save button at the bottom and your data story is ready!

