New sequencing technologies have catapulted biologists into the realm of big data, where the need for insight has sparked some interesting new biological visualizations. The ability to sequence entire genomes allows biologists to ask questions at the gene systems level. However, understanding gene networks that contain tens of thousands of individual genes is challenging. Even when visualized, gene networks are so large and complex that the visualization itself may overwhelm the user, and useful knowledge may be concealed in the plethora of visual information. The gene Product Interaction Conserved Topology visualizer (gPICTviz) is a complex visualization software package that depicts individual genes as nodes and relationships between genes as edges displayed in a force-directed layout. This tool can illustrate thousands of nodes and exponentially more edges in a single overwhelming visual. The goal of this research was to identify clusters of gene nodes and represent them in an abstracted visual that a user can then choose to expand as needed in order to gain more detailed insight. This not only provides a more robust visualization of complex networks, but also saves compute time, because the graphical component of the visualization is the most computationally taxing. To accomplish the visualization, OpenGL and C were used for speed, and MCL-edge software was used for cluster identification. The result was a consolidated, more user-friendly gene network visualization that aids biological researchers in answering questions about or within the genome while promoting further integration of genetics and systems biology.
REU Site: Research Experience for Undergraduates in Collaborative Data Visualization Applications • June 1 – July 24, 2015 • Clemson University • Clemson, South Carolina
- REU funded by NSF ACI Award 1359223, Vetria L. Byrd, PI
- Sponsored by Clemson University NSF Award #2009134
- ACCAC Summer Research Scholar Program
Abstract
Detangling genetic “hairballs”: implementing an abstracted, gene cluster view for the gPICTviz visualization tool.
Kathleen E. Kyle1, Karan Sapra3, Amari Lewis2, Melissa C. Smith3, Jill Gemmill3, F. Alex Feltus4
1Florida State University, Tallahassee, FL, USA; 2Winston-Salem State University, Winston-Salem, NC, USA; 3Departments of Electrical and Computer Engineering and 4Genetics & Biochemistry, Clemson University, SC, USA.
Methods & Tools
MCL-edge • Identify gene clusters
Python • Cluster ID generation/file reformatting
C • Store into data structures
OpenGL/C • Visualize
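The cluster ID generation step can be sketched in Python as follows. This is a minimal illustration, not the poster's actual script: it assumes the clustering output is a text file with one cluster per line and tab-separated gene IDs (the label format that MCL can produce), and the function names `read_clusters` and `write_cluster_table`, as well as the placeholder gene names, are hypothetical.

```python
import io


def read_clusters(handle):
    """Map each gene ID to an integer cluster ID.

    Assumption: one cluster per line, gene IDs separated by whitespace.
    Blank lines are skipped so cluster IDs stay consecutive.
    """
    gene_to_cluster = {}
    next_id = 0
    for line in handle:
        genes = line.split()
        if not genes:
            continue
        for gene in genes:
            gene_to_cluster[gene] = next_id
        next_id += 1
    return gene_to_cluster


def write_cluster_table(gene_to_cluster, handle):
    """Write 'gene<TAB>cluster_id' rows, ordered by cluster and then gene.

    A flat two-column table like this is easy to load into fixed-size
    data structures on the C side.
    """
    rows = sorted(gene_to_cluster.items(), key=lambda kv: (kv[1], kv[0]))
    for gene, cluster_id in rows:
        handle.write(f"{gene}\t{cluster_id}\n")


if __name__ == "__main__":
    # Placeholder gene names, for illustration only.
    clusters = io.StringIO("geneA\tgeneB\tgeneC\ngeneD\tgeneE\n")
    mapping = read_clusters(clusters)
    out = io.StringIO()
    write_cluster_table(mapping, out)
    print(out.getvalue(), end="")
```

Keeping the reformatting in a separate script means the C and OpenGL code only has to read a simple two-column file.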
Figure 4: Original gPICTviz visualization with clusters highlighted.
This visualization provides a gene cluster view of the gene networks by depicting each gene cluster as a sphere whose size represents the number of genes within that cluster. The edge between two clusters represents all single-gene connections between those clusters through varying line width: a thicker edge represents more connections than a thinner edge. The three edge colors represent the different network relationships; for a two-network graph, these are two within-network relationships and one between-network relationship. Reducing the lighting in the visual dims the edges and brings the focus onto the clusters. The visualization tool also provides zoom and rotate capabilities, and a search feature to find clusters containing specific gene IDs or gene ontology (GO) terms.
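The aggregation described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the tool's C/OpenGL implementation: the function names, the relation labels, the linear width scaling, and the choice to skip edges inside a single cluster are all assumptions, and the search shown covers gene IDs only (GO terms are omitted).

```python
from collections import Counter


def build_cluster_view(gene_to_cluster, gene_edges):
    """Collapse gene-level edges into cluster-level edges.

    gene_edges holds (gene_a, gene_b, relation) tuples, where relation is a
    hypothetical label for the network relationship that sets the edge color.
    Returns (sizes, edge_counts): genes per cluster, and the number of gene
    connections per (cluster_a, cluster_b, relation) key.
    """
    sizes = Counter(gene_to_cluster.values())
    edge_counts = Counter()
    for gene_a, gene_b, relation in gene_edges:
        ca = gene_to_cluster.get(gene_a)
        cb = gene_to_cluster.get(gene_b)
        # Assumption: unclustered genes and intra-cluster edges are not drawn.
        if ca is None or cb is None or ca == cb:
            continue
        edge_counts[(min(ca, cb), max(ca, cb), relation)] += 1
    return sizes, edge_counts


def line_width(count, max_count, min_width=1.0, max_width=8.0):
    """Scale a connection count linearly into a line width (assumed mapping)."""
    if max_count <= 1:
        return min_width
    return min_width + (max_width - min_width) * (count - 1) / (max_count - 1)


def find_cluster(gene_to_cluster, gene_id):
    """Return the cluster ID containing gene_id, or None if it is absent."""
    return gene_to_cluster.get(gene_id)


if __name__ == "__main__":
    # Placeholder genes and relation labels, for illustration only.
    mapping = {"g1": 0, "g2": 0, "g3": 1, "g4": 1}
    edges = [("g1", "g3", "between"), ("g2", "g4", "between"),
             ("g1", "g2", "within_a"), ("g4", "g1", "between")]
    sizes, edge_counts = build_cluster_view(mapping, edges)
    widest = max(edge_counts.values())
    for key, count in edge_counts.items():
        print(key, count, line_width(count, widest))
```

Counting connections once, before rendering, means the drawing loop touches one edge per cluster pair and relationship rather than one edge per gene pair.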
Conclusion
Networks     # of Clusters    # of Edges    FPS
Zma - Osa    667              67,039        ~30
Ath - Ath    1,491            11,611        ~10
- Improve lighting and force-directed layout
- Add interactivity, such as clicking on a cluster to expand it into a nodal view
- Adjust code to utilize the GPU
- Integrate with Amari Lewis’ project to include gene ontology information and improve the user interface and overall functionality
- Scale to more than two gene networks
- Performance evaluation:
Future Work
Interaction and Vizualization of Multiple layers of High Dimensional biological data. Amari Lewis1, Karan Sapra2, Kathleen Kyle3, Alex Feltus2, Melissa Smith2, Jill Gemmill2. 1Winston-Salem State University, 2Clemson University, 3Florida State University.
G3NA: A GPU Optimized Global Gene Network Alignment Tool. Karan Sapra1, Melissa C. Smith1, F. Alex Feltus2, Asher Sampong3, Joshua A. Levine4, Anagha Joshi1. Departments of 1Electrical and Computer Engineering and 2Genetics & Biochemistry, Clemson University, SC 29634, USA; 3Fort Valley State, Fort Valley, GA 31030, USA; 4School of Computing, Clemson University, SC 29634, USA.
Acknowledgements
Advanced Visualization Division
Related Work
Research Question
Visualization Explained
Figure 2: Dark side of maize-rice gene networks visual
Figure 1: Two A. thaliana gene networks
Will an abstracted, gene cluster view for gPICTviz facilitate expert investigation of multispecies genetic systems?
Figure 3: Search feature highlights a cluster that contains “GRMZM2G161306”
The cluster view for gPICTviz shows promise as a higher-level interface for exploring complex genetic systems. Scaling this visualization tool to more than two networks will require abstracted viewing levels, and this cluster view is the first step in that direction.