ESSENCE Advanced Skills – Using APIs
Slides adapted from: Amanda Dylina Morse, MPH
Washington State Department of Health
WA State DOH | 2
Agenda
API basics
o What they are
o Why they’re cracker
Creating an API in ESSENCE
o Types of APIs available
o Fields to include
o Issues to watch for
Pulling data into R using an API
o Credential manager
o Manipulating APIs
o Avoiding throwing errors
o Special functions
API Basics
What is an API?
o A method of computers talking to each other
o As a serious scientist, I used Wikipedia for a simple explanation → “An application
programming interface (API) is a computing interface which defines interactions
between multiple software intermediaries. It defines the kinds of calls or requests that can
be made, how to make them, the data formats that should be used, the conventions to
follow, etc.”
Advantages of using an API?
o Automation of repeated tasks → pulling data or graphics for reports and analyses
o Doing analyses you can’t do in ESSENCE
o Creating visualizations that aren’t available in ESSENCE or where a different look/feel is
needed
o Incorporating ESSENCE data into a report
o Preparing a dataset where limited fields or anonymized information is required
Getting Your API
Different types of APIs available in ESSENCE
o Table builder
o Time series
o Alert list
o Data details
Start by building your query—enter your limiters and parameters as you usually would
o We have resources to help you with this in the RHINO Guidebook
Field names are available in the NSSP Data Dictionary under the “ESSENCE API and Data Details” and “ESSENCE API Query Parameters” tabs
There’s more documentation on APIs available in ESSENCE here, if you’d like to dig into that
Hot Tips to Keep Wayne and AKP from Calling Us
Sometimes you need to pull down a lot of data—like, multiple years of raw data from the data details output, which can stress the system for everyone
o NSSP calls us when someone is stressing the system, so we’ll know it was you
1. First, evaluate if you really need all the raw data—can it be aggregated?
o If so, consider using the table builder instead of data details
o If not, consider pulling the data using multiple smaller timeframes
2. If you are testing/building out a query, use a very short timeframe
o ***Once you submit an API call, there is no way to kill it on the user end***
3. Consider which fields you need and only pull down that selection
4. Repeated data pulls:
o Save a “historical” data file and only pull more recent data and append
o Pull only records that have been updated recently using “LastUpdatedDateTime”
5. Run large data pulls after hours or on the weekend
6. To speed up data pulls, disconnect from VPN and turn off any other devices using your internet connection
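Tip 4 (keep a "historical" file, pull only recent data, and append) could look something like the sketch below. This is a minimal illustration, not an official workflow: `append_recent` is a hypothetical helper, the toy data frames stand in for real pulls, and it assumes your pull includes the C_BioSense_ID and LastUpdatedDateTime fields with the latter already parsed as a date-time.

```r
# Hypothetical helper: combine a saved "historical" pull with a fresh, shorter pull.
# id_col and updated_col are assumed column names; adjust them to the fields you pull.
append_recent <- function(historical, recent,
                          id_col = "C_BioSense_ID",
                          updated_col = "LastUpdatedDateTime") {
  combined <- rbind(historical, recent)
  # Sort newest-first so the most recently updated copy of each record is kept
  combined <- combined[order(combined[[updated_col]], decreasing = TRUE), ]
  combined <- combined[!duplicated(combined[[id_col]]), ]
  rownames(combined) <- NULL
  combined
}

# Toy data standing in for a saved file and a new pull
historical <- data.frame(
  C_BioSense_ID = c("a1", "a2"),
  LastUpdatedDateTime = as.POSIXct(c("2021-01-09 08:00:00", "2021-01-09 09:00:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)
recent <- data.frame(
  C_BioSense_ID = c("a2", "a3"),
  LastUpdatedDateTime = as.POSIXct(c("2021-01-11 10:00:00", "2021-01-11 11:00:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)
updated <- append_recent(historical, recent)
```

Record "a2" appears in both pulls, so only its more recently updated copy survives. In practice you would read the historical file from disk, pull only the last few days via the API, and save `updated` back out.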
Time Series API
After running your query, open the Query Options
box and click “API URLs”
o Data → the values associated with each
observation
o Graph → the actual graph image (single full size
or micrograph(s))
o Summary stats → what it says on the package
If you’re not using a stratified time series, you can
also set your graph title and axis labels to have
them saved in the API
o This will only save for stratifications if they’re
micrographs
Highlight the text and copy it
Data Details API
The process is the same for APIs to pull data details
outputs, but you get four choices
o CSV with raw values
o CSV with reference values
o JSON with raw values
o JSON with reference values
There’s not a “best format,” but I like CSV with
reference values; Cody likes JSON files
o It’s all about preference
When I initially run the query, I typically will run just a
day or two of data, then modify the dates in the
API URL when I put it in R
o This helps keep Wayne and AKP happy
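Modifying the dates in the API URL once it is in R might be sketched like this. `set_api_dates` is a hypothetical helper (not part of ESSENCE or any package), the URL below is a shortened stand-in, and the date format (day, three-letter month, year, e.g. 9Jan2021) is assumed from the style of ESSENCE API URLs.

```r
# Hypothetical helper: swap the startDate and endDate values in an API URL.
set_api_dates <- function(url, start_date, end_date) {
  # Format a date like 9Jan2021; month.abb avoids locale-dependent month names
  fmt <- function(d) {
    d <- as.Date(d)
    paste0(as.integer(format(d, "%d")),
           month.abb[as.integer(format(d, "%m"))],
           format(d, "%Y"))
  }
  url <- sub("startDate=[^&]*", paste0("startDate=", fmt(start_date)), url)
  url <- sub("endDate=[^&]*", paste0("endDate=", fmt(end_date)), url)
  url
}

# Shortened stand-in URL; paste your own API URL from ESSENCE here
test_url <- "https://example.org/api/dataDetails/csv?endDate=11Jan2021&datasource=va_er&startDate=9Jan2021&hasBeenE=1"
full_url <- set_api_dates(test_url, "2021-02-01", "2021-02-28")
```

This way the query you run in ESSENCE can stay tiny, and the longer timeframe only gets requested when you are ready for it.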
Table Builder API
Table builder APIs are a lot like the others and come in two flavors
o CSV
o JSON
You may find it helpful to set Site = Washington in your query so you can limit to a single column and put all stratifications of interest in rows (this can make working with the data much easier in R)
o HasBeenE can also do this if you’re pulling emergency department visits
If you use a percent query, your output will include columns for:
o Numerator (relevant counts), denominator (total counts), and percentage
You only need to design the table in ESSENCE to generate the API URL
o If the table has too many cells to display in table builder, you can still pull the data via API
If you stratify by facility or region, you will get ALL options in the system → explicitly select facilities/regions of interest to avoid this
Alert List API
Not premade by ESSENCE
Allows you to pull down a list of alerts, filter, sort, etc.
Sample URL provided in API Guidance doc
ESSENCE Credentials
There are multiple ways to get ESSENCE to
communicate with R to get your credentials, but the
“keyring” package in R is the easiest
o Hides your credentials (great if you share code)
o Option 1: manually enter password into pop-up
o Option 2: save credentials in Windows Credential
Manager
■ You don’t have to enter it every time you run the code
(gross)
■ You’ll need to open your Windows Credential Manager
setting and enter the information to look as it does on the
right (but with your NSSP credentials)
♦ Every time you change your ESSENCE password (every
90 days!) you’ll need to update it in the Credential
Manager too
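A rough sketch of looking up stored credentials with keyring, so the password never appears in your script. Assumptions to flag: "essence" is a hypothetical service name (use whatever you saved your credentials under), `essence_credentials` is a hypothetical helper, and the commented-out httr call assumes the API accepts a username/password login.

```r
# Hypothetical helper: look up a stored password without writing it in the script.
# By default it asks keyring; the lookup is passed in as an argument so the
# function can be tried out without a real credential store.
essence_credentials <- function(username, service = "essence",
                                get_password = function(service, username) {
                                  keyring::key_get(service, username)
                                }) {
  list(username = username, password = get_password(service, username))
}

# One-time setup (prompts for the password and stores it):
#   keyring::key_set("essence", "your_username")
# Then, in your script (assumes the httr package and a username/password login):
#   creds <- essence_credentials("your_username")
#   response <- httr::GET(url, httr::authenticate(creds$username, creds$password))

# Demonstration with a stand-in lookup, so nothing real is needed to run this
demo_creds <- essence_credentials("jdoe",
                                  get_password = function(service, username) "not-a-real-password")
```

Because the password is fetched at run time, the script itself is safe to share.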
API URLs
Pull all ED visits with Coronavirus DD CC and DD category between 1/9/21 – 1/11/21
https://essence2.syndromicsurveillance.org/nssp_essence/api/dataDetails/csv?endDate=11Jan2021&percentParam=noPercent&datasource=va_er&startDate=9Jan2021&medicalGroupingSystem=essencesyndromes&userId=520&hospFacilityType=emergency%20care&aqtTarget=DataDetails&ccddCategory=cdc%20coronavirus-dd%20v1&geographySystem=region&detector=probrepswitch&timeResolution=daily&hasBeenE=1
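An API URL is just a base address followed by name=value pairs joined with "&", so it can help to break one apart before editing it. A minimal base-R sketch follows; `parse_api_params` is a hypothetical helper and the URL is a shortened stand-in for a real one.

```r
# Hypothetical helper: split an API URL's query string into a named character vector.
parse_api_params <- function(url) {
  query <- sub("^[^?]*\\?", "", url)  # drop everything up to and including "?"
  pairs <- strsplit(strsplit(query, "&", fixed = TRUE)[[1]], "=", fixed = TRUE)
  values <- vapply(pairs, function(p) {
    if (length(p) > 1) utils::URLdecode(p[2]) else ""
  }, character(1))
  names(values) <- vapply(pairs, function(p) p[1], character(1))
  values
}

# Shortened stand-in URL
example_url <- "https://example.org/api/dataDetails/csv?endDate=11Jan2021&hospFacilityType=emergency%20care&hasBeenE=1"
params <- parse_api_params(example_url)
```

Printing `params` gives one entry per parameter, with encoded characters such as %20 turned back into spaces, which makes it easier to spot the dates, data source, and limiters you might want to change.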
Custom Functions
Using custom functions to pull down data:
o Pulling ESSENCE records for a specific list of ESSENCE IDs
■ Available here: https://github.com/PHSKC-APDE/DOHdata/blob/master/essence/essence_query_functions.R
■ Call function using:
library(dplyr) # provides bind_rows()
source("https://raw.githubusercontent.com/PHSKC-APDE/DOHdata/master/essence/essence_query_functions.R")
# Pull in data for linked records from ESSENCE
essenceresults <- bind_rows(event_query(event_id = api_data$C_BioSense_ID, bulk = TRUE, group_size = 1000))
o Pulling down data using date loops
■ Functions available here: https://github.com/sara-chronister/syndromic-surveillance/tree/master/API
# Call in set-dates function
source("https://raw.githubusercontent.com/sara-chronister/syndromic-surveillance/master/API/Set-Dates")
# Call in get csv function
source("https://raw.githubusercontent.com/sara-chronister/syndromic-surveillance/master/API/Get-CSV")
# Call in looping date function from GitHub repo
source("https://raw.githubusercontent.com/sara-chronister/syndromic-surveillance/master/API/Loop-Successive-Time-Periods")
lstartdate <- format(Sys.Date() - 90, "%Y-%m-%d") # ESSENCE query start date
lenddate <- format(Sys.Date(), "%Y-%m-%d") # ESSENCE query end date
# url2 should hold your ESSENCE API URL; run the query in 30-day chunks
vis <- return_longterm_query(url2, loop_start = lstartdate, loop_end = lenddate, by = 30)
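To give a feel for what pulling in 30-day chunks involves, here is a rough base-R sketch of splitting a long timeframe into successive periods. This is not the code behind return_longterm_query; `make_date_chunks` is a hypothetical helper written only for illustration.

```r
# Hypothetical helper: break a date range into consecutive chunks of `by` days.
make_date_chunks <- function(start, end, by = 30) {
  start <- as.Date(start)
  end <- as.Date(end)
  chunk_starts <- seq(start, end, by = by)
  # Each chunk ends the day before the next one starts; the last chunk stops at `end`
  chunk_ends <- pmin(chunk_starts + by - 1, end)
  data.frame(start = chunk_starts, end = chunk_ends)
}

chunks <- make_date_chunks("2021-01-01", "2021-03-31", by = 30)
```

Each row of `chunks` is one smaller pull: loop over the rows, put that row's dates into the API URL, pull the data, and stack the results at the end. The chunks do not overlap, so no visit date is requested twice.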
Watching Out for Issues
Issues to look out for:
o Incomplete/corrupted data pulls
■ Signs:
♦ Data details will have message in last record about query being interrupted
♦ Weirdo time series
o Failed data pull
■ Signs:
♦ No records in data pull
♦ Status code indicates failure
Resolution:
o Ensure strong/stable internet connection
o Run again
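The signs listed above can be checked automatically right after each pull. A minimal sketch, with assumptions flagged: `check_pull` is a hypothetical helper, 200 is taken as the "success" status code, and the search for the word "interrupt" in the last record is a guess at how the interruption message is worded.

```r
# Hypothetical helper: return a character vector of problems found (empty if none).
check_pull <- function(status_code, data) {
  problems <- character(0)
  if (status_code != 200) {
    problems <- c(problems, paste("status code", status_code))
  }
  if (is.null(data) || nrow(data) == 0) {
    problems <- c(problems, "no records returned")
  } else {
    # Collapse the last record into one string and look for an interruption message
    last_row <- paste(unlist(lapply(data[nrow(data), , drop = FALSE], as.character)),
                      collapse = " ")
    if (grepl("interrupt", last_row, ignore.case = TRUE)) {
      problems <- c(problems, "last record mentions an interruption")
    }
  }
  problems
}

# Toy examples standing in for real pulls
good <- check_pull(200, data.frame(id = 1:2, note = c("ok", "ok")))
failed <- check_pull(500, data.frame())
cut_off <- check_pull(200, data.frame(id = c("1", "Query interrupted"), stringsAsFactors = FALSE))
```

If the result is not empty, check your connection and run the pull again rather than analyzing partial data.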
Questions?
Amanda Dylina Morse, MPH
Epidemiologist | Syndromic Surveillance Coordinator
Office of Public Health Outbreak Coordination,
Informatics, and Surveillance (PHOCIS)
Washington State Department of Health
206.437.2045 | [email protected]
Natasha Close, PhD, MPH
Epidemiologist
Office of Public Health Outbreak Coordination,
Informatics, and Surveillance (PHOCIS)
Washington State Department of Health
206.430.0617 | [email protected]