Network Anomaly Analysis using the Microsoft HoloLens
Steve Beitzel (1), Josiah Dykstra (2), Paul Toliver (1), and Jason Youzwak (1)
(1) Vencore Labs, Basking Ridge, NJ, USA, {sbeitzel, ptoliver, jyouzwak}@vencorelabs.com
(2) Laboratory for Telecommunication Sciences, College Park, MD, USA, jdykstra@LTSnet.net
We investigate the feasibility of using Microsoft HoloLens, a mixed reality device, to visually analyze
network capture data and locate anomalies. We developed MINER, a prototype application to visualize
details from network packet captures as 3D stereogram charts. MINER employs a novel approach to time-
series visualization that extends the time dimension across two axes, thereby taking advantage of the
immersive 3D space available via the HoloLens. Users navigate the application through eye gaze and hand
gestures to view summary and detailed bar graphs. Callouts display additional detail based on the user’s
immediate gaze. In a user study, volunteers used MINER to locate network attacks in a dataset from the
2013 VAST Challenge. We compared the time and effort with a similar test using traditional tools on a
desktop computer. Our findings suggest that network anomaly analysis with the HoloLens achieved
comparable effectiveness, efficiency, and satisfaction. We describe user metrics and feedback collected from
these experiments, lessons learned, and suggested future work.
INTRODUCTION
The goal of this work is to investigate the feasibility of
using augmented reality (AR) and mixed reality (MR) devices
to assist in the day-to-day work of network operators.
Conventional tools for network analysis are not always well
suited to exploratory analytical tasks on large datasets. This
study explores alternatives to conventional approaches by using
3D applications developed for mixed reality devices such as the
Microsoft HoloLens to explore and analyze large datasets. We
capture a variety of metrics designed to inform the differences
in operator experience when using mixed reality tools in
comparison to conventional approaches.
Previous research has demonstrated that the use of the
HoloLens can lead to increased performance and lower
workload (Velamkayala, Zambrano, & Li, 2017). Other
research has shown that 3D visualizations can improve both the
speed and accuracy of power system tasks (Wiegmann,
Overbye, Hoppe, Essenberg, & Sun, 2006).
In our past research (Beitzel, et al., 2016), we explored the
use of Android-based augmented reality (AR) devices and
performed experiments to demonstrate the effect of AR devices
on cognitive load. These experiments showed that users
expressed a decrease in their cognitive load when using an AR
device with limited capabilities to monitor for emergent alerts.
In a subsequent effort (Beitzel, Dykstra, Toliver, & Youzwak,
2017) we explored the capabilities of the HoloLens and
developed prototype applications to display logical networks as
3D stereograms on different levels: globally on a sphere of the
earth and locally as logical network topology.
This work describes experiments performed with a new
HoloLens application for network traffic visualization called
MINER (MIxed reality NEtwork AnalyzeR), which provides
the user a capability to visually determine if there are issues in
their network that may require more detailed analysis, such as
evidence of network attacks or configuration issues. We use the
novel approach of displaying network traffic information as 3D
stereogram visualization charts, with gaze context-sensitive
callouts. We conducted a limited user study to compare the new
application with a similar desktop application. We discuss user
feedback, lessons learned, and advice to researchers and
practitioners.
PRACTICE INNOVATION
Network operators have numerous desktop network
visualization tools at their disposal, but relatively few for mixed
reality devices. Our goal is to design and build network
visualization tools that take advantage of the capabilities of a
mixed reality device, specifically the Microsoft HoloLens, and
to assess the value of using 3D immersive visualization.
HoloLens
The HoloLens is a mixed reality device developed by
Microsoft. The HoloLens contains an Intel 32-bit processor, a
custom-built Microsoft Holographic Processing Unit (HPU
1.0), 2 GB RAM, 64 GB flash memory, and network
connectivity via Wi-Fi 802.11ac (Microsoft HoloLens). Using
projection-based smart-glasses that utilize optical waveguide
technology, 2D and 3D images can be displayed on the
HoloLens, overlaid on top of the user’s field of view.
We believe the HoloLens provides advantages over
desktop tools with regard to data visualization, including:
• the ability to track user gaze and provide context-related information;
• a hand gesture interface;
• the capability of displaying 3D stereograms;
• an immersive 3D user environment, where the user can move physically to view different data points; and
• a larger display (effectively room-size).
PRACTICE APPLICATION
Not subject to U.S. copyright restrictions. DOI 10.1177/1541931218621472
Proceedings of the Human Factors and Ergonomics Society 2018 Annual Meeting 2094
To evaluate the potential use of a mixed reality device to
enhance network operations tasks, we developed a 3D app for
visualizing NetFlow-based IP traffic statistics called MINER.
The associated resources/tools we utilized in our effort included
the following: (i) a 3D-based visualization application running
on the Microsoft HoloLens, (ii) a collection of pre-processed
NetFlow traffic matrices imported into the HoloLens
application, and (iii) a set of Python-based tools for pre-
processing the NetFlow dataset. We discuss each of these
elements below, beginning with the network traffic dataset.
Dataset
MINER is designed to read an arbitrary dataset containing
various types of traffic matrix statistics. To develop and test the
application, we selected the publicly available VAST Challenge
2013 Mini-Challenge 3 dataset (VAST Challenge 2013: Mini-
Challenge 3), which contains a rich set of anomalous network
events including port scans, denial of service (DoS) attacks, and
data exfiltration. The dataset also includes a ground truth
spreadsheet that provides full details for the attacks present in
the dataset.
Dataset Processing Tools
The VAST Challenge 2013 NetFlow data consists of over
6 GB of synthetic raw comma-separated-value (CSV) data
collected over a two-week period. For performance and storage
reasons, we pre-processed the NetFlow data on a desktop
machine, and loaded the resulting smaller data matrices onto the
HoloLens.
We developed a Python-based data conversion utility to
preprocess the data. The utility takes NetFlow CSV data files as
input and creates a collection of data matrices using various
combinations of fields for the row heading, column heading,
and individual cell values. As discussed in detail below,
MINER plots these matrices on a 3D bar graph where rows
correspond to the X-axis, columns correspond to the Y-axis,
and cell values are plotted as bars scaled along the Z-axis. A
matrix could have, for example, time and IP address associated
with the X- and Y-axis while the Z-axis might correspond to
number of NetFlow bytes observed in the specified time
interval.
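As a minimal sketch of the matrix layout described above, the following Python groups flow records into a row-by-column matrix of summed values. The record fields (`time_bin`, `src_ip`, `bytes`) and the function name `build_matrix` are illustrative assumptions; they are not the field names of the VAST dataset or of the conversion utility.

```python
from collections import defaultdict


def build_matrix(records, row_key, col_key, value_key):
    """Sum value_key for every (row_key, col_key) pair.

    Returns (row_labels, col_labels, matrix), where matrix[i][j] is the
    total for row_labels[i] and col_labels[j].
    """
    totals = defaultdict(int)
    for rec in records:
        totals[(rec[row_key], rec[col_key])] += rec[value_key]
    row_labels = sorted({r for r, _ in totals})
    col_labels = sorted({c for _, c in totals})
    # Pairs that never occur in the data become zero-height cells.
    matrix = [[totals.get((r, c), 0) for c in col_labels] for r in row_labels]
    return row_labels, col_labels, matrix


# Hypothetical flow records that have already been assigned a time bin.
records = [
    {"time_bin": 0, "src_ip": "10.0.0.1", "bytes": 500},
    {"time_bin": 0, "src_ip": "10.0.0.2", "bytes": 200},
    {"time_bin": 1, "src_ip": "10.0.0.1", "bytes": 300},
    {"time_bin": 0, "src_ip": "10.0.0.1", "bytes": 100},
]
rows, cols, matrix = build_matrix(records, "time_bin", "src_ip", "bytes")
```

Here the rows would map to the X-axis, the columns to the Y-axis, and each cell to a bar height along the Z-axis.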
The conversion utility aggregates NetFlow data statistics
over five-minute intervals and outputs the data as a collection
of week-long summary reports and four-hour window detailed
reports in CSV format for use by MINER.
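The five-minute aggregation step can be sketched as follows. This is an illustration under stated assumptions rather than the utility itself: the CSV column names (`timestamp` in epoch seconds, `bytes`) and the function `bin_bytes` are hypothetical.

```python
import csv
import io
from collections import Counter

BIN_SECONDS = 5 * 60  # five-minute aggregation interval


def bin_bytes(csv_text, start_epoch):
    """Return {bin_index: total_bytes} for flows at or after start_epoch."""
    totals = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        offset = int(float(row["timestamp"])) - start_epoch
        if offset < 0:
            continue  # flow precedes the reporting period
        totals[offset // BIN_SECONDS] += int(row["bytes"])
    return dict(totals)


# Hypothetical input: four flows, timestamps in epoch seconds.
sample = """timestamp,bytes
1000,100
1299,50
1300,25
1900,10
"""
bins = bin_bytes(sample, start_epoch=1000)
```

The first two flows fall in the same five-minute bin and are summed; bins with no traffic are simply absent from the result.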
HoloLens application
We developed MINER using a toolchain consisting of
Unity 3D and Microsoft Visual Studio. The pre-processed
NetFlow data matrices are imported separately as resource
assets into the Unity project.
When MINER is started on the HoloLens, the user must
first select the appropriate parameter for plotting along the Z-
axis. The user selects parameters by gazing at a popup menu of
options based on NetFlow fields and selecting the desired field
using HoloLens hand gestures. Menu options include: the
number of bytes or packets transmitted between different
network source and destination nodes, and the number of
unique destination or source TCP/IP ports.
Summary reports. Once the user has selected the desired
Z-axis parameter, MINER displays a weeklong summary report
as a 3D bar graph. Weeklong summary reports use a novel
approach to displaying a time-series graph in that both the X-
and Y-axes initially represent time. Based upon configuration of
the NetFlow pre-processing utility, the range along the X-axis
is set to a 4-hour time window (240 minutes) with 5-minute
resolution for each bar. Adjacent 4-hour windows are plotted
sequentially along the Y-axis using the approach illustrated in
Figure 1. As opposed to a traditional 2D graph, which would
otherwise become very wide along the X-axis (assuming the
same 5-minute increments), this approach makes efficient use
of the third dimension available from the HoloLens.
Figure 1 - Illustration of how 2D time-series is represented as a
3D bar graph in MINER
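The folding of a one-dimensional time series onto two time axes can be sketched in a few lines of Python. The constants follow the 4-hour window and 5-minute resolution stated above; the function names `fold_series` and `locate` are illustrative assumptions.

```python
WINDOW_MINUTES = 240   # width of one X-axis window
BIN_MINUTES = 5        # resolution of each bar
SLOTS_PER_WINDOW = WINDOW_MINUTES // BIN_MINUTES  # 48 bars along X


def fold_series(series):
    """Split a flat list of per-bin values into consecutive 4-hour rows.

    A short final row is padded with zeros so every row has the same length.
    """
    rows = []
    for start in range(0, len(series), SLOTS_PER_WINDOW):
        row = list(series[start:start + SLOTS_PER_WINDOW])
        row += [0] * (SLOTS_PER_WINDOW - len(row))
        rows.append(row)
    return rows


def locate(bin_index):
    """Map a flat bin index to its (window, slot) position, i.e. (Y, X)."""
    return divmod(bin_index, SLOTS_PER_WINDOW)


week_bins = 7 * 24 * 60 // BIN_MINUTES   # number of 5-minute bins in a week
grid = fold_series(list(range(week_bins)))
```

With these settings, one week folds into 42 windows of 48 bars each, instead of a single row of 2016 bars.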
An example HoloLens snapshot of a weeklong summary
report after a Z-axis selection of number of “Packets in” is
shown in Figure 2. The 3D bars are normalized to the maximum
number observed over the weeklong dataset and shaded
according to a jet color palette. Additional text labels and titles
serve as references on the X, Y, and Z-axes. The 3D bar graph
is anchored to a stationary position within the user’s physical
environment, allowing the user to walk around the virtual image
for viewing the data at any perspective.
Figure 2 - Snapshot of MINER on the HoloLens with weeklong
summary report for number of “Packets in”
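The normalization and shading can be illustrated with the sketch below. It assumes a common piecewise-linear approximation of the jet palette, which is not necessarily the exact color map used in MINER; the function names are hypothetical.

```python
def clamp01(x):
    """Clamp x to the range [0, 1]."""
    return max(0.0, min(1.0, x))


def jet_rgb(v):
    """Piecewise-linear approximation of the jet palette for v in [0, 1]."""
    v = clamp01(v)
    return (
        clamp01(1.5 - abs(4.0 * v - 3.0)),  # red ramp centred at v = 0.75
        clamp01(1.5 - abs(4.0 * v - 2.0)),  # green ramp centred at v = 0.5
        clamp01(1.5 - abs(4.0 * v - 1.0)),  # blue ramp centred at v = 0.25
    )


def normalize(matrix):
    """Scale every cell by the largest value in the matrix."""
    peak = max(max(row) for row in matrix)
    if peak == 0:
        return [[0.0 for _ in row] for row in matrix]
    return [[cell / peak for cell in row] for row in matrix]


# Hypothetical 2 x 2 matrix of packet counts.
heights = normalize([[0, 50], [100, 200]])
colors = [[jet_rgb(h) for h in row] for row in heights]
```

Low bars come out dark blue and the tallest bar dark red, so peaks stand out by both height and color.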
Callouts. As the user visually scans the graph for potential
network anomalies (e.g. large spikes in traffic or excessive
number of TCP/IP ports in use), MINER displays a callout
containing additional details, such as bar amplitude and
timestamp for the 3D bar on which the user is currently focusing
their gaze. In Figure 3(a), we show an example of one such
callout. If the user suspects a 3D bar indicates an anomalous
event, he or she can isolate a single segment of the entire 3D
graph by gesture selecting the “Isolate 4-hour window” button
to drill down into the time window represented by the 3D bar in
question. Figure 3(b) shows the resulting “single window” bar
graph.
Figure 3 - (a) Snapshot of MINER illustrating how 3D bars are
selected by the user’s gaze, providing additional details such as
amplitude and timestamp. (b) Bar graph after isolating a single 4-
hour window.
Detailed reports. Upon isolating a specific 4-hour window
of interest, the user has the option to analyze further details on
the 3D bar graph. Specifically, MINER enables an additional
pop-up menu for the Y-axis, which allows for the selection of
additional axis parameters beyond time. For example, the data
values represented in the isolated 4-hour window can be plotted
against the source/destination subnet involved or against
individual IP addresses in a single source/destination subnet.
In Figure 4 we show a snapshot of a detailed 4-hour report.
Here, the “Bytes in” parameter was selected for the Z-axis and
“Source addr in 10.0.0.0/8” was selected for the Y-axis. The
large spike in “Bytes in” originating from source IP address
10.6.6.6 could be indicative of a potential anomalous network
event, such as a DoS attack.
Figure 4 - Snapshot of MINER with detailed 4-hour report for
number of “Bytes in” plotted against source IP address.
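A detailed report of this kind amounts to restricting flows to one subnet and totaling a metric per address. The sketch below uses Python's standard `ipaddress` module; the flow tuples and byte counts are made-up illustrative values, not figures from the VAST dataset, and `bytes_by_source` is a hypothetical name.

```python
import ipaddress
from collections import Counter


def bytes_by_source(flows, subnet):
    """Total bytes per source address, restricted to sources inside subnet."""
    network = ipaddress.ip_network(subnet)
    totals = Counter()
    for src, nbytes in flows:
        if ipaddress.ip_address(src) in network:
            totals[src] += nbytes
    return totals


# Hypothetical (source address, bytes) pairs for one 4-hour window.
flows = [
    ("10.6.6.6", 9_000_000),
    ("10.1.0.5", 1_200),
    ("172.16.0.9", 800),
    ("10.6.6.6", 1_000_000),
]
totals = bytes_by_source(flows, "10.0.0.0/8")
top_source, top_bytes = totals.most_common(1)[0]
```

The address with the largest total is the one that would appear as the dominant spike along the Y-axis.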
In Figure 5 we show a summary of the steps required to navigate
the graphs.
Figure 5 – MINER user navigation
DISCUSSION
After researching and developing the application, we sought
feedback and evaluations from users. For this reason, and to get
a better understanding of how well MINER compares to
traditional desktop applications, we performed user testing.
These tests revealed strengths and limitations of our approach
and how it may be applicable in practice and to future designs.
User Testing
We asked participants to identify network events that
occurred in the provided dataset. Participants had the
opportunity to perform this task using both MINER on the
HoloLens and a conventional baseline application on a
Windows desktop. Upon conclusion, we invited each
participant to give qualitative feedback on his or her
impressions of the user experience in each case.
Baseline application. For the baseline component of the
experiment, we selected the desktop application Kibana
(Kibana). Kibana is a commonly-used web application built
upon Elasticsearch (Elasticsearch) that allows a user to build
visual representations using various 2D graph types. It also
supports real-time dashboards that can be used for security
analytics. In addition, other researchers have specifically used
it to analyze NetFlow data (Netflow Analysis with
Elasticsearch). We selected Kibana due to its flexibility in
processing big data input files, the ability to manually search
and mine through data elements, and the ability to display
user-interactive graphs.
To facilitate participants identifying network anomalies
using Kibana, we created a set of time series bar graph
visualizations for the entire week of data using a bin size of
three hours. Using the mouse, participants could zoom into
specific time frames, and the application would automatically
adjust the graph to display finer time resolution. Once a time
frame was selected, users drilled down into specific detail about
IP subnets and addresses. In Figure 6 we show an example of
the graphs available for participants to use.
We allotted one hour for each participant to perform the
test, including time for training. On average, participants
completed the testing in approximately 40 minutes.
Figure 6 - Baseline Application (Kibana) graphs
Experiment design. We recruited ten study participants,
ranging in age from 30-60, each having a Master’s degree or
higher in computer science or electrical engineering. Cyber
security experience ranged greatly across participants, from
competent to expert.
For the initial phase of testing, we randomly assigned each
participant to a platform: five participants started with MINER
on the HoloLens and five participants started with the baseline
application. We prepared training materials that we presented
to users prior to beginning the task on their assigned platform.
Once familiar with the system, the users performed the actual
test and afterwards provided feedback on their experience.
For the second phase of user testing we selected the same
set of ten test candidates. We assigned each participant to the
platform he or she did not use in the first round. The second
round of testing used the same applications loaded with the
same training and testing data as the first round.
Experiment results. In the first test phase, we observed
that HoloLens participants spent 15% less time on average in
locating each event than Baseline participants, as shown in
Figure 7(a). However, according to individual results, as shown
in Figure 8, three Baseline users actually performed better
than the HoloLens users when detecting Denial of Service
attacks.
In the second test phase, on average, HoloLens
participants spent about the same time detecting two of the
event types (6% longer for Denial of Service and 10% longer
for Data Exfiltration) compared to the Baseline participants as
shown in Figure 7(b). HoloLens participants spent 22% less
time than the Baseline participants detecting one event type
(Port Scan).
Also, the average time to detect events decreased between
the first and second rounds of the testing. Baseline participants
in the second round spent an average of 36% less total time than
their first-round counterparts, and similarly HoloLens
participants in the second round spent an average of 27% less
time than those in the first round. This is to be expected, as
participants were conceptually familiar with the overall goal
and strategy of performing the test.
Results from this experiment suggest that MINER
achieves comparable effectiveness and efficiency to a desktop
counterpart. These results offer promise for practical
deployments of the technology.
Figure 7 - Round 1 and 2: Detection Comparison (Average)
in seconds, lower scores are better
Figure 8 - Round 1 and 2: Detection Comparison
(Individual) in seconds, lower scores are better
Subjective Workload. Participants rated their workload
experience by using NASA-TLX (TLX @ NASA Ames).
Scores from both test rounds are shown in Table 1.
Table 1 - Mean NASA Task Load Index (TLX) scores for Mental Demand (MD), Physical Demand (PD), Temporal Demand (TD), Performance (P), Effort (E), and Frustration (F)
Platform   MD    PD    TD    P     E     F     Composite
Baseline   43.5  15.5  33.5  22.5  38.5  35.5  31.5
HoloLens   36.0  43.5  21.0  28.0  38.0  32.5  33.2
The composite scores, Baseline (M=31.5) and HoloLens
(M=33.2), are less than the midpoint (50), which suggests that
participants did not find the task too demanding and that they
found the HoloLens only slightly more demanding than the
Baseline tool. Participants rated physical demand markedly higher
on the HoloLens (43.5) than on the baseline tool (15.5), a gap equal
to 64% of the HoloLens score, which is to be expected given the
weight of the device and the additional physical movement involved.
Participants rated temporal demand on the HoloLens an average of
37% lower than on the baseline tool. One possible reason might be the novelty of the
mixed reality experience lessening a sense of time pressure.
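The arithmetic behind these figures can be checked from the Table 1 values. The sketch below assumes the composite is the unweighted mean of the six subscales, which reproduces the reported composites, and expresses each percentage as the gap divided by the higher of the two scores, which reproduces the reported 64% and 37%. The function names are illustrative.

```python
# Mean subscale scores copied from Table 1 (MD, PD, TD, P, E, F).
SCORES = {
    "Baseline": [43.5, 15.5, 33.5, 22.5, 38.5, 35.5],
    "HoloLens": [36.0, 43.5, 21.0, 28.0, 38.0, 32.5],
}


def composite(scores):
    """Unweighted mean of the six subscales (an assumption, see above)."""
    return sum(scores) / len(scores)


def gap_fraction(high, low):
    """Gap between two scores as a fraction of the higher score."""
    return (high - low) / high


baseline_composite = round(composite(SCORES["Baseline"]), 1)
hololens_composite = round(composite(SCORES["HoloLens"]), 1)
physical_gap = round(100 * gap_fraction(43.5, 15.5))   # PD: HoloLens vs Baseline
temporal_gap = round(100 * gap_fraction(33.5, 21.0))   # TD: Baseline vs HoloLens
```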
User feedback. In addition to the metrics above, we
interviewed each participant after the experiment about their
satisfaction. We gained valuable insight on what features they
felt worked well and how we might improve the application.
Overall, participants were split on which platform they
preferred. Some participants preferred the HoloLens, stating
that it was easier than using desktop tools, while other
participants preferred the desktop, stating that it was a more
familiar environment.
Several participants felt that the third dimension
added significant value and that the data peaks stood out
visually, which made it easier to zero in on anomalies. In
addition, although there was a learning curve, many participants
quickly became familiar with the concept of Time x Time x
Metric graphs and said that the gestures felt natural and that the
application was surprisingly intuitive.
Participants also experienced several shortcomings with
the HoloLens hardware that are described in the next section.
Limitations
The HoloLens hardware presents several challenges. The
device weighs 1.28 pounds, and some users find it heavy and difficult to
wear for periods longer than 30 minutes. Some users found it
difficult to achieve a comfortable fit. Compared to the natural
world, the field of view is small at 35 degrees. The HoloLens
cursor can be difficult to keep steady, and hand gestures are
sometimes not recognized.
While implementing MINER for the HoloLens, we also
identified critical tradeoffs between the level of detail used in
our 3D bar graphs and app performance, such as responsiveness
to user interaction. Specifically, as the graphs are scaled to
increasing numbers of bars, processing time for rendering and
performing collision detection becomes significant relative to
the frame update period. This can result in erratic image
displays, poor responsiveness to gazing/gestures, or stalled
applications.
In order to mitigate these issues, we fixed the time
resolution to five minutes over the entire one-week period. This
resulted in limiting the dimensions of the data matrices that
were generated to less than 50 x 50. We believe this was a
reasonable implementation compromise since the time
increment is comparable to the maximum resolution we
assumed for the Kibana baseline application (after final
zooming).
In addition to a fixed time resolution, a limiting function
was used in MINER to display only those matrix elements that
rise above a given threshold, which was arbitrarily set at 5% of
the maximum value (per matrix). Given that the indicators for
network attacks considered here consist predominantly of
anomalous peaks (as opposed to lower amplitude data), we
believe this particular implementation compromise does not
significantly advantage or disadvantage the HoloLens user over
the baseline Kibana app.
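The limiting function can be sketched as a simple per-matrix filter. The 5% fraction comes from the text above; the strict comparison and the name `visible_cells` are assumptions made for illustration.

```python
THRESHOLD_FRACTION = 0.05  # 5% of the per-matrix maximum


def visible_cells(matrix, fraction=THRESHOLD_FRACTION):
    """Return (row, col, value) for cells strictly above fraction * max."""
    peak = max(max(row) for row in matrix)
    cutoff = fraction * peak
    return [
        (i, j, value)
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value > cutoff
    ]


# Hypothetical 2 x 3 matrix; the maximum is 100, so the cutoff is 5.
matrix = [
    [2, 40, 0],
    [100, 4, 6],
]
bars = visible_cells(matrix)
```

Only the cells above the cutoff would be instantiated as 3D bars, which reduces the number of objects that must be rendered and tested for collisions in each frame.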
PRACTITIONER TAKEAWAYS
We offer the following advice and takeaways:
• Compared with a desktop application, network anomaly analysis with HoloLens achieved comparable effectiveness, efficiency, and satisfaction.
• 3D stereogram bar charts may provide faster and more accurate visual recognition of data peaks than similar 2D charts.
• Callouts based on user gaze can be useful to provide additional context-sensitive detail.
• Time x Time x Metric graphs have a short learning curve and are a feasible approach to showing trends.
• Time x Time x Metric graphs may be more intuitive when using a full-day window for the X-axis rather than a four-hour interval.
• To minimize fatigue, choose interactions that do not require extended periods of physical concentration and focus, which may lead to strain.
• Minimize the use of air taps or consider other interface devices. Research in this area has shown issues with the ergonomics of the air tap gesture (Looker & Garvey, 2015).
ACKNOWLEDGEMENTS
This material is based upon work supported by the U.S.
Government under contract HR98230-13-D-0055. The views
and conclusions contained in this document are those of the
authors and should not be interpreted as representing the official
policies, either expressed or implied, of the U.S. Government.
REFERENCES
Beitzel, S., Dykstra, J., Huver, S., Kaplan, M., Loushine, M.,
& Youzwak, J. (2016). Cognitive Performance
Impact of Augmented Reality for Network
Operations Tasks. Advances in Intelligent Systems,
pp. 139-152.
Beitzel, S., Dykstra, J., Toliver, P., & Youzwak, J. (2017).
Exploring 3D Cybersecurity Visualization with the
Microsoft HoloLens. Advances in Human Factors in
Cybersecurity, pp. 197-207.
Elasticsearch. (n.d.). Retrieved from
https://www.elastic.co/products/elasticsearch
Kibana. (n.d.). Retrieved from
https://www.elastic.co/products/kibana
Looker, J., & Garvey, T. (2015). Reaching for Holograms.
Proceedings from International Design Congress,
504-511.
Microsoft HoloLens. (n.d.). Retrieved from
https://www.microsoft.com/microsoft-hololens/en-us
Netflow Analysis with Elasticsearch. (n.d.). Retrieved from
http://www.ojscurity.com/2015/02/netflow-analysis-with-elasticsearch.html
TLX @ NASA Ames. (n.d.). Retrieved from
https://humansystems.arc.nasa.gov/groups/TLX/
VAST Challenge 2013: Mini-Challenge 3. (n.d.). Retrieved from
http://vacommunity.org/VAST+Challenge+2013%3A+Mini-Challenge+3
Velamkayala, E. R., Zambrano, M. V., & Li, H. (2017,
September). Effects of HoloLens in Collaboration: A
Case in Navigation Tasks. In Proceedings of the
Human Factors and Ergonomics Society Annual
Meeting, Vol. 61, No. 1, pp. 2110-2114.
Wiegmann, D. A., Overbye, T. J., Hoppe, S. M., Essenberg,
G. R., & Sun, Y. (2006). Human factors aspects of
three-dimensional visualization of power system
information. Power Engineering Society General
Meeting, pp 7-pp.