RESULT
ChipSeq Analysis
Team Members: Wen-Han Pang, Natalia Shatokina
Faculty Advisors: Dr. Russ Abbott, Dr. Sandra Sharp
California Institute of Technology Liaison: Department of Computer Science
College of Engineering, Computer Science, and Technology
California State University, Los Angeles
INSIGHTS STEP BY STEP
STEP BY STEP INTO DNA!
Find the intersection between any number of regulator binding site data files, which can scale up to 23,000 records per file
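The intersection task above can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes each record is a hypothetical `(chrom, start, end)` tuple with inclusive coordinates, and that sites within a single file do not overlap each other. The function names are illustrative.

```python
from collections import defaultdict
from functools import reduce

def by_chrom(sites):
    """Group (chrom, start, end) records by chromosome, sorted by start."""
    groups = defaultdict(list)
    for chrom, start, end in sites:
        groups[chrom].append((start, end))
    for intervals in groups.values():
        intervals.sort()
    return groups

def intersect_two(a, b):
    """Return the regions shared by two site lists (assumed record format)."""
    ga, gb = by_chrom(a), by_chrom(b)
    out = []
    for chrom in sorted(ga.keys() & gb.keys()):
        xs, ys = ga[chrom], gb[chrom]
        i = j = 0
        while i < len(xs) and j < len(ys):
            start = max(xs[i][0], ys[j][0])
            end = min(xs[i][1], ys[j][1])
            if start <= end:
                out.append((chrom, start, end))
            # Advance whichever interval ends first.
            if xs[i][1] < ys[j][1]:
                i += 1
            else:
                j += 1
    return out

def intersect_all(site_lists):
    """Intersect any number of site lists (at least one is required)."""
    return reduce(intersect_two, site_lists)

file_a = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 50, 80)]
file_b = [("chr1", 150, 350), ("chr2", 90, 120)]
file_c = [("chr1", 180, 320)]
shared = intersect_all([file_a, file_b, file_c])
```

Because each file is sorted once and then swept with two pointers, the work per pair of files grows roughly linearly with the record count after sorting, which matters at tens of thousands of records per file.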
DNA MODEL Regulator Protein
Find the spatial relationship between regulator binding sites and gene models, over ranges up to the WHOLE GENOME
downstream
upstream
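One way to express the upstream/downstream relationship is sketched below. It assumes inclusive integer coordinates and a strand-aware definition of "upstream" (before the gene start on the `+` strand, after the gene end on the `-` strand). The strand parameter and the function name `relative_position` are assumptions for illustration and do not come from the poster.

```python
def relative_position(site_start, site_end, gene_start, gene_end, strand="+"):
    """Classify a binding site relative to a gene model.

    Returns (relation, distance), where relation is "upstream",
    "downstream", or "overlap", and distance is in base pairs.
    """
    if site_end < gene_start:
        side, distance = "left", gene_start - site_end
    elif site_start > gene_end:
        side, distance = "right", site_start - gene_end
    else:
        return ("overlap", 0)
    # On the + strand the left side is upstream; on the - strand it is reversed.
    upstream = (side == "left") == (strand == "+")
    return ("upstream" if upstream else "downstream", distance)

example = relative_position(100, 150, 200, 500, strand="+")
```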
Find the gene models above or below a given RNA strength threshold throughout the genome
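The threshold query can be sketched as a simple filter. The example assumes RNA strength is available as a mapping from gene identifier to a numeric value; the gene names and values are made up for illustration.

```python
def genes_by_threshold(expression, threshold, above=True):
    """Return gene identifiers strictly above (or strictly below) a threshold."""
    if above:
        return sorted(g for g, v in expression.items() if v > threshold)
    return sorted(g for g, v in expression.items() if v < threshold)

# Hypothetical RNA strength values keyed by hypothetical gene identifiers.
rna_strength = {"geneA": 12.5, "geneB": 0.8, "geneC": 5.0}
strong = genes_by_threshold(rna_strength, 5.0, above=True)
weak = genes_by_threshold(rna_strength, 5.0, above=False)
```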
RDBMS
Schema
JavaScript Dynamic Uploading
JSP/JDBC
Real-Time / Interactive Interface
Create/Delete Tables on the Fly
AJAX
Real Time Input Feedback
Raw Data
.xls, .txt format
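The idea of loading raw text data into tables that are created and dropped on demand can be sketched as below. This uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for the JSP/JDBC stack named above. The table layout (`chrom`, `start`, `stop`) and the tab-delimited input are assumptions for illustration.

```python
import csv
import io
import sqlite3

def check_name(table):
    # Table names cannot be bound as SQL parameters, so reject anything
    # that is not a plain identifier before interpolating it.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")

def load_table(conn, table, text):
    """Create a table from tab-delimited text and return the row count."""
    check_name(table)
    rows = [r for r in csv.reader(io.StringIO(text), delimiter="\t") if r]
    conn.execute(f"DROP TABLE IF EXISTS {table}")
    conn.execute(f"CREATE TABLE {table} (chrom TEXT, start INTEGER, stop INTEGER)")
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?)", rows)
    return len(rows)

def drop_table(conn, table):
    """Remove a table that is no longer needed."""
    check_name(table)
    conn.execute(f"DROP TABLE IF EXISTS {table}")

conn = sqlite3.connect(":memory:")
raw = "chr1\t100\t200\nchr1\t300\t400\n"
loaded = load_table(conn, "sites", raw)
late_sites = conn.execute(
    "SELECT COUNT(*) FROM sites WHERE start >= 300"
).fetchone()[0]
```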
AN RDBMS APPROACH
A CLOUD COMPUTING APPROACH
Hadoop File System
Multiple Nodes Cluster
Map-Reduce Program
Functional Programming
Pig Data Flow Framework
Query-Based Map-Reduce
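The map-reduce pattern listed above can be illustrated with a small single-process sketch. It is plain Python, not Hadoop or Pig code: the map, shuffle, and reduce phases run in memory, and the job (counting binding sites per chromosome over assumed `(chrom, start, end)` records) is a hypothetical example.

```python
from collections import defaultdict

def map_phase(records, mapper):
    """Apply the mapper to every record and stream out (key, value) pairs."""
    for record in records:
        yield from mapper(record)

def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups, reducer):
    """Apply the reducer to each key and its list of values."""
    return {key: reducer(key, values) for key, values in groups.items()}

def count_mapper(record):
    # Emit one count per binding site, keyed by chromosome.
    chrom, _start, _end = record
    yield (chrom, 1)

def sum_reducer(_key, values):
    return sum(values)

def run_job(records):
    return reduce_phase(shuffle(map_phase(records, count_mapper)), sum_reducer)

counts = run_job([("chr1", 1, 2), ("chr2", 5, 9), ("chr1", 10, 20)])
```

On a multi-node cluster the same mapper and reducer logic would run in parallel over file splits, with the framework handling the shuffle step across nodes.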
Picture Credits: Gordon Kwan, UTMB Cell Biological Graduate Program