
Weekly Report Start learning GPU

Page 1: Weekly Report Start learning GPU

Weekly Report: Start learning GPU

Ph.D. Student: Leo Lee
Supervisor: Dr. Xiaowen Chu
Date: Sep. 11, 2009

Page 2: Weekly Report Start learning GPU

Outline

Protein identification and pFind

GPU and data mining

Research Plan

Page 3: Weekly Report Start learning GPU

Protein identification and pFind

Background

Identification flow

Challenges

Could GPU be used?

Page 5: Weekly Report Start learning GPU

The Human Genome Project: China contributed 1%

Page 6: Weekly Report Start learning GPU

Same gene, different proteins

Page 7: Weekly Report Start learning GPU

Human Plasma Proteome Project, USA

Human Disease Glycomics/Proteome Initiative (HGPI), Japan

Human Proteome Program: China is in charge of the liver

Page 8: Weekly Report Start learning GPU

Characteristics of the proteome

Page 9: Weekly Report Start learning GPU
Page 10: Weekly Report Start learning GPU

Protein identification and pFind

Background

Identification flow

Challenges

Could GPU be used?

Page 11: Weekly Report Start learning GPU

Mass Spectrometry Based Protein Identification

Mixed proteins, e.g.
>ipi|IPI00243451|IPI00243451.6 MDQHQHLNKTAESASSEKKKTRRCNGFKMFLAALSFSYIAKALGGIIMKISITQIERRFD…

Digest

Mixed peptides, e.g. TAESASSEK, MFLAALSFSYIAK, …

LC-MS/MS

Data

Analyze: peptide sequence

Merge: protein sequence

[Figure: tandem MS spectrum (FTMS + p NSI, full ms2, scan range m/z 500.00-1600.00). x-axis: m/z, 600-1400; y-axis: relative abundance, 0-100. Labeled peaks include 928.6396, 929.9735 and 1008.5148.]

Tandem MS

Page 12: Weekly Report Start learning GPU

Web search engine

Page 13: Weekly Report Start learning GPU

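Protein identification SE (search engine)

[Figure: pFind drawn as a search engine with a "Go pFind" button. The query is a spectrum (m/z axis 200-1200). It is searched against a sequence database (…KFDTGIPDGFAGFFGHYAQGGITFRH EWTRJQIDF…), and each candidate peptide (TAESA…, MFLAALS…, …FSYIAK) gets a score.]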

Page 14: Weekly Report Start learning GPU

Protein identification SE

Protein sequence database
>IQPSKANMETEPDQ…
>DEAVPPPALQLQFN…
……

Digestion gives a peptide list sorted by mass:
400.15 EVDG
400.15 AAEE
400.15 PSTD
…
698.48 SVKKKK
699.78 TLKHLK
699.78 WDRDL
……

Query spectrum (m/z axis 200-1200)
Lower bound of mass: 699.70
Upper bound of mass: 699.90

Query results
699.78 TLKHLK
699.78 WDRDL
699.82 ELDGER
...
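The mass-window query on this slide can be sketched as a range lookup in a mass-sorted peptide list. This is a minimal illustration, not pFind's actual code: the function name `candidates_in_window` and the list layout are assumptions, and the masses are the illustrative values shown on the slide.

```python
from bisect import bisect_left, bisect_right

# Peptide list sorted by mass. The (mass, peptide) pairs are the
# illustrative values from the slide, not computed peptide masses.
PEPTIDE_INDEX = [
    (400.15, "EVDG"),
    (400.15, "AAEE"),
    (400.15, "PSTD"),
    (698.48, "SVKKKK"),
    (699.78, "TLKHLK"),
    (699.78, "WDRDL"),
    (699.82, "ELDGER"),
]


def candidates_in_window(index, lower, upper):
    """Return the peptides whose mass lies in [lower, upper].

    `index` must be sorted by mass so that binary search applies.
    """
    masses = [mass for mass, _ in index]
    start = bisect_left(masses, lower)   # first entry with mass >= lower
    end = bisect_right(masses, upper)    # one past the last entry with mass <= upper
    return [peptide for _, peptide in index[start:end]]


print(candidates_in_window(PEPTIDE_INDEX, 699.70, 699.90))
```

Because the list is sorted, each query costs two binary searches plus the size of the result, which is why the digested peptides are ordered by mass before matching.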

Page 15: Weekly Report Start learning GPU

Protein identification SE

MS
[Figure: three spectra, each with an m/z axis from 200 to 1200]
……

Protein database
>IQPSKANMETEPDQ…
>DEAVPPPALQLQFN…
>RQRAILKVMNTIGGE…
……

Page 16: Weekly Report Start learning GPU

Protein identification SE

MS

Protein database
>IQPSKANMETEPDQ…
>DEAVPPPALQLQFN…
>RQRAILKVMNTIGGE…
……

Digest

Peptide
400 EVDG
400 AAEE
400 PSTD
698 SVKKKK
699 TLKHLK
699 WDRDL
……

Matching
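The digest step can be sketched as follows. The slides do not name the enzyme, so the cleavage rule here (cut after every K or R, as trypsin roughly does, ignoring the proline exception and missed cleavages) is an assumption, as is the function name `digest`. The input is the visible prefix of the example protein from the earlier slide; the original sequence is truncated there.

```python
def digest(protein, cleave_after="KR"):
    """Split a protein sequence after every residue in `cleave_after`.

    Assumed simplification of tryptic digestion: no proline rule,
    no missed cleavages.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        if residue in cleave_after:
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # Trailing residues after the last cleavage site.
        peptides.append(protein[start:])
    return peptides


# Visible prefix of the slide's example protein (truncated in the source).
PROTEIN = "MDQHQHLNKTAESASSEKKKTRRCNGFKMFLAALSFSYIAKALGGIIMKISITQIERRFD"
peptides = digest(PROTEIN)
print(peptides)
```

Under this rule the two example peptides from the slide, TAESASSEK and MFLAALSFSYIAK, both appear in the output.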

Page 17: Weekly Report Start learning GPU

Protein identification and pFind

Background

Identification flow

Challenges

Could GPU be used?

Page 18: Weekly Report Start learning GPU

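Challenges of PISE

Data generation speed keeps increasing

The number of proteins increases exponentially

PTM (post-translational modification) leads to a huge number of peptides

[Figure: the identification flow from the previous slides. MS and the protein database (>IQPSKANMETEPDQ…, >DEAVPPPALQLQFN…, >RQRAILKVMNTIGGE…) go through Digest to peptides (EVDG, AAEE, PSTD, SVKKKK, TLKHLK, WDRDL, ……) and then Matching.]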

Page 19: Weekly Report Start learning GPU

E.g. Phosphorylation

Amino acids S, T and Y (HPO3, 80 Da)

- Each site may or may not be modified
- 2^5 = 32 possibilities for the peptide below, which has five S/T/Y sites

PO3 PO3 PO3 PO3 PO3

EMSVPSCQYILSATNR
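The count on this slide can be checked by enumeration: the peptide has five S/T/Y residues and each one is either phosphorylated or not, giving 2^5 = 32 forms. The sketch below is illustrative only; `phospho_variants` is a hypothetical helper, and the 80 Da shift is the nominal value quoted on the slide.

```python
from itertools import product

PHOSPHO_RESIDUES = "STY"
PHOSPHO_SHIFT_DA = 80.0  # nominal HPO3 mass shift, as given on the slide


def phospho_variants(peptide):
    """Enumerate every on/off phosphorylation pattern of a peptide.

    Returns a list of (modified_positions, mass_shift) pairs, where
    positions are 0-based indices into the peptide.
    """
    sites = [i for i, aa in enumerate(peptide) if aa in PHOSPHO_RESIDUES]
    variants = []
    for pattern in product((False, True), repeat=len(sites)):
        modified = tuple(site for site, on in zip(sites, pattern) if on)
        variants.append((modified, PHOSPHO_SHIFT_DA * len(modified)))
    return variants


variants = phospho_variants("EMSVPSCQYILSATNR")
print(len(variants))  # 32
```

Every additional modifiable site doubles the number of candidate peptides to score, which is the growth the challenges slide refers to.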

Page 20: Weekly Report Start learning GPU

Identification of PTM

Peptide
400 EVDG
400 AAEE
400 PSTD
631 EMSVPS
699 TLKHLK
699 WDRDL
……

Protein
>IQPSKANMETEPDQ…
>DEAVPPPALQLQFN…
>RQRAILKVMNTIGGE…
……

Page 21: Weekly Report Start learning GPU

Protein identification and pFind

Background

Identification flow

Challenges

Could GPU be used?
http://bioinformatics.oxfordjournals.org/cgi/content/full/25/15/1937

Page 22: Weekly Report Start learning GPU

Protein identification on GPU

One thread per MS spectrum

One thread per score

One thread per “query”: match V1 against V2

Seems worth thinking about further!
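The "one thread per score" option could look like the sketch below, written as plain Python that emulates the thread grid with nested loops. Several things are assumptions: that V1 and V2 are equal-length binned intensity vectors, that the match is a dot product, and the names `match_score` and `score_all`. The slide does not define the scoring function.

```python
def match_score(v1, v2):
    """Assumed match function: dot product of two binned vectors."""
    return sum(a * b for a, b in zip(v1, v2))


def score_all(spectra, candidates):
    """Score every (spectrum, candidate) pair.

    Each pair is independent of all the others, so on a GPU each
    iteration of the inner loop could be assigned to its own thread.
    """
    return [[match_score(s, c) for c in candidates] for s in spectra]


# Toy binned vectors (hypothetical data, 4 bins each).
spectra = [[1, 0, 1, 1], [0, 1, 0, 1]]
candidates = [[1, 0, 0, 1], [0, 1, 1, 0], [1, 1, 1, 1]]
scores = score_all(spectra, candidates)
print(scores)
```

The point of the sketch is the independence of the pairs: no score depends on another, which is what makes the problem a candidate for data-parallel hardware.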

Page 23: Weekly Report Start learning GPU

Outline

Protein identification and pFind

GPU and data mining

Research Plan

Page 24: Weekly Report Start learning GPU

Google search results, 2009.09.11

Query                        Results
CPU                          133,000,000
GPU                          13,800,000
GPGPU                        621,000
CUDA                         6,040,000
Data mining on GPU           77,700
Genome GPU                   45,600
Proteomic GPU                7,830
Protein GPU                  85,300
Protein identification GPU   3,450

Page 25: Weekly Report Start learning GPU

GPU and data mining

Characteristics of GPU

GPU vs. CPU

CUDA

Data mining on GPU

Page 26: Weekly Report Start learning GPU

GPU VS CPU

[Figure: peak GFLOPS (y-axis, 0-600) from Jan 2003 to Jul 2007. GPUs: NV30, NV35, NV40, G70, G70-512, G71, GeForce 8800 GTX, Quadro FX 5600, Tesla C870. CPUs: 3.0 GHz Pentium 4, 3.0 GHz Core 2 Duo, 3.0 GHz Core 2 Quad.]

Based on slide 7 of S. Green, “GPU Physics,” SIGGRAPH 2007 GPGPU Course. http://www.gpgpu.org/s2007/slides/15-GPGPU-physics.pdf

Page 27: Weekly Report Start learning GPU

Design philosophies are different.

The GPU is specialized for compute-intensive, massively data-parallel computation (exactly what graphics rendering is about).

So, more transistors can be devoted to data processing rather than data caching and flow control.

The fast-growing video game industry exerts strong economic pressure for constant innovation.

[Figure: CPU and GPU block diagrams built from Control, ALU, Cache and DRAM blocks.]

Page 28: Weekly Report Start learning GPU

What is the GPU Good at?

The GPU is good at data-parallel processing
- The same computation executed on many data elements in parallel, with low control flow overhead

with high SP floating point arithmetic intensity
- Many calculations per memory access
- Currently also need high floating point to integer ratio

High floating-point arithmetic intensity and many data elements mean that memory access latency can be hidden with calculations instead of big data caches. Still need to avoid bandwidth saturation!
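"Many calculations per memory access" can be made concrete as arithmetic intensity, the number of floating-point operations per byte moved. The two cases below are textbook examples chosen for illustration, not workloads from the slides, and the matrix figure assumes ideal data reuse (each matrix crosses the memory bus once).

```python
BYTES_PER_FLOAT = 4  # single precision


def saxpy_intensity():
    """FLOPs per byte for y[i] = a * x[i] + y[i]."""
    flops = 2                           # one multiply, one add per element
    bytes_moved = 3 * BYTES_PER_FLOAT   # read x[i], read y[i], write y[i]
    return flops / bytes_moved


def matmul_intensity(n):
    """FLOPs per byte for an n x n matrix multiply with ideal reuse."""
    flops = 2 * n ** 3                          # n multiply-adds per output element
    bytes_moved = 3 * n * n * BYTES_PER_FLOAT   # read A, read B, write C once each
    return flops / bytes_moved


print(saxpy_intensity())       # about 0.17 FLOP per byte
print(matmul_intensity(1024))  # about 171 FLOPs per byte
```

SAXPY is limited by memory bandwidth whatever the processor's peak GFLOPS, while matrix multiply has enough arithmetic per byte to keep many ALUs busy; this is the distinction the slide draws.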

Page 29: Weekly Report Start learning GPU

CUDA - No more shader functions.

CUDA integrated CPU+GPU application C program
- Serial or modestly parallel C code executes on CPU
- Highly parallel SPMD kernel C code executes on GPU

CPU Serial Code
GPU Parallel Kernel (Grid 0): KernelA<<< nBlk, nTid >>>(args);
CPU Serial Code
GPU Parallel Kernel (Grid 1): KernelB<<< nBlk, nTid >>>(args);
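What a launch such as `KernelA<<< nBlk, nTid >>>(args)` means can be emulated on the CPU: the same kernel body runs once per (block, thread) pair, and each instance derives its own element index. The sketch below is plain Python, not CUDA; `launch` and `scale_kernel` are hypothetical names, and a real GPU would run the instances in parallel rather than in a loop.

```python
def launch(kernel, n_blk, n_tid, *args):
    """Emulate a 1-D launch: run `kernel` once per (block, thread) pair."""
    for block_idx in range(n_blk):
        for thread_idx in range(n_tid):
            kernel(block_idx, n_tid, thread_idx, *args)


def scale_kernel(block_idx, block_dim, thread_idx, data, factor, out):
    """SPMD body: every instance handles at most one element."""
    i = block_idx * block_dim + thread_idx  # global index of this instance
    if i < len(data):                       # guard: the grid may overshoot the data
        out[i] = data[i] * factor


data = [1.0, 2.0, 3.0, 4.0, 5.0]
out = [0.0] * len(data)
launch(scale_kernel, 2, 4, data, 2.0, out)  # 2 blocks x 4 threads = 8 instances
print(out)
```

The bounds check matters because the launch covers 8 indices for 5 elements; in CUDA the same guard is needed whenever the grid size is rounded up to a multiple of the block size.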

Page 30: Weekly Report Start learning GPU

CUDA

Basic

Memory

Threads

Application performance

Page 31: Weekly Report Start learning GPU

Data mining on GPU

K-means

k-NN

Apriori

SVM

Page 32: Weekly Report Start learning GPU

K-means on GPU

A team at University of Virginia, led by Professor Skadron

HKUST and MSRA: GPUMiner

HP Labs

Page 33: Weekly Report Start learning GPU

Experiments - GPUMiner

Page 34: Weekly Report Start learning GPU

Experiments - HPL

Page 35: Weekly Report Start learning GPU

Data mining on GPU

The speed-up depends heavily on the implementation:
- Data transfer
- Memory
- CPU-GPU cooperation
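The effect of data transfer on the observed speed-up is simple arithmetic. The timings below are hypothetical numbers chosen for illustration, not measurements from GPUMiner or HPL, and the model assumes transfer and kernel execution do not overlap.

```python
def end_to_end_speedup(cpu_time, kernel_time, transfer_time):
    """Speed-up over the CPU when transfer and kernel run back to back."""
    return cpu_time / (kernel_time + transfer_time)


# Hypothetical timings in seconds.
cpu_time = 10.0
kernel_time = 0.5
transfer_time = 1.5

print(end_to_end_speedup(cpu_time, kernel_time, 0.0))            # 20.0, kernel only
print(end_to_end_speedup(cpu_time, kernel_time, transfer_time))  # 5.0, with transfer
```

With these numbers a kernel that is 20 times faster than the CPU gives only a 5 times end-to-end gain, so reported speed-ups are hard to compare unless it is clear whether transfer time was counted.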

Page 36: Weekly Report Start learning GPU

Outline

Protein identification and pFind

GPU and data mining

Research Plan

Page 37: Weekly Report Start learning GPU

Research Plan

Keep reading related papers
- GPU, data mining

Development
- Read our k-means program
- Try to speed it up
- Try protein identification on GPU

Page 38: Weekly Report Start learning GPU

Time schedule

Courses
- Thu. 6.30-9.30pm, data mining

TA
- Tue. 11.30am-12.20pm, Network security
- Fri. 9.30-11.30am, Network security

Page 39: Weekly Report Start learning GPU

Thank you for listening

