Protein Prediction II Exercise

Date post: 23-Feb-2016
T. Hamp & L. Richter Protein Prediction II Exercise
Transcript
Page 1: Protein Prediction II Exercise


Page 2: Protein Prediction II Exercise

Exercise – Project Layout

General remarks (recap): Report 60 pts, exam 40 pts

Weekly presentations by each group; one bad presentation is allowed

Groups of 3-4 students

Contact & Questions: [email protected] only!

The exercise is taken from the CAFA competition

Prediction of HPO terms

HPO: Human Phenotype Ontology

Page 3: Protein Prediction II Exercise

Terms – Definitions and Explanations

Amino acids (aa): Building blocks of proteins; 20 different aa are found in proteins

Protein sequence: String of characters representing a sequence of amino acids (a string over a 20-letter alphabet)

The protein sequence defines the protein structure and the protein function (within some limits)

Protein sequences are stored in large, publicly available repositories

One of the best-known repositories is UniProt (http://www.uniprot.org/) and its section Swiss-Prot

Besides the sequence, these databases also hold additional information about the protein
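To make the "string over a 20-letter alphabet" definition concrete, here is a minimal Python sketch. The function names and the toy sequence are invented for illustration and are not part of the exercise files. It assumes the usual one-letter codes for the 20 standard amino acids; real database entries may occasionally contain further letters (for example X for an unknown residue), which this sketch would reject.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def is_valid_protein(sequence: str) -> bool:
    """Return True if the sequence is non-empty and uses only the 20 standard letters."""
    return bool(sequence) and set(sequence.upper()) <= AMINO_ACIDS


def composition(sequence: str) -> dict[str, float]:
    """Return the relative frequency of each amino acid that occurs in the sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


if __name__ == "__main__":
    toy = "MKTAYIAKQR"  # made-up example sequence, not a real protein
    print(is_valid_protein(toy))
    print(composition(toy))
```

A composition vector like this is one of the simplest ways to turn a sequence into numbers, which is why it is shown here; whether it is a useful feature for the exercise is left open.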

Page 4: Protein Prediction II Exercise

Ontology (in information science)

Ontology: An ontology represents knowledge as a set of concepts within a domain, using a shared vocabulary to denote the types, properties and interrelationships of those concepts

Human Phenotype Ontology (HPO): Set of concepts describing the human appearance (shape, health, and so forth)

HPO concepts are hierarchically ordered, i.e. there is an “is-a” relationship between them

They are arranged in a tree-like fashion
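The "is-a" hierarchy can be sketched as a mapping from each term to its direct parents. The snippet below is only an illustration: all term names are invented and are not real HPO identifiers, and the helper names are hypothetical. It allows a term to have more than one parent, which is one reading of "tree-like" as opposed to a strict tree.

```python
# Toy "is-a" hierarchy: each term maps to the list of its direct parents.
# All term names are invented for illustration; they are not real HPO terms.
IS_A = {
    "root": [],
    "eye_abnormality": ["root"],
    "nervous_abnormality": ["root"],
    "lens_abnormality": ["eye_abnormality"],
    "optic_nerve_abnormality": ["eye_abnormality", "nervous_abnormality"],
}


def ancestors(term, is_a):
    """Return every term reachable from `term` by following is-a links upward."""
    seen = set()
    stack = list(is_a[term])
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a[parent])
    return seen


def propagate(terms, is_a):
    """Close a set of terms under is-a: each term implies all of its ancestors."""
    closed = set(terms)
    for term in terms:
        closed |= ancestors(term, is_a)
    return closed


if __name__ == "__main__":
    print(sorted(propagate({"optic_nerve_abnormality"}, IS_A)))
```

The point of `propagate` is that a protein annotated with a specific term is implicitly annotated with every more general term above it.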

Page 5: Protein Prediction II Exercise

Our competition

Proteins are annotated (described) with experimentally determined information

As time goes by, proteins become associated with information about experimentally confirmed effects on the human phenotype

The associated terms are taken from the Human Phenotype Ontology

Experimental determination is slow and expensive

=> We try to predict the associated HPO terms for the proteins that are not yet annotated

Page 6: Protein Prediction II Exercise

More formal steps

Find a function that assigns a set of HPO terms T to a sequence s so that the number of false assignments is minimal and the number of true assignments is maximal

Remember: The true evaluation is done after submission, when sequences that are not yet annotated receive experimentally determined annotations
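Counting true and false assignments for one sequence can be sketched as follows. The slides do not say which score is used in the evaluation, so precision, recall and F1 are an assumption here, chosen as a common way to combine the two counts; the function names and the term labels t1 to t5 are hypothetical.

```python
def count_assignments(predicted: set[str], true: set[str]) -> tuple[int, int, int]:
    """Return (true positives, false positives, false negatives) for one sequence."""
    tp = len(predicted & true)   # predicted terms that are correct
    fp = len(predicted - true)   # predicted terms that are wrong
    fn = len(true - predicted)   # correct terms that were missed
    return tp, fp, fn


def precision_recall_f1(predicted: set[str], true: set[str]) -> tuple[float, float, float]:
    """Combine the counts into precision, recall and their harmonic mean (F1)."""
    tp, fp, fn = count_assignments(predicted, true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    predicted = {"t1", "t2", "t3"}      # hypothetical predicted term set T for a sequence s
    true = {"t2", "t3", "t4", "t5"}     # hypothetical experimentally confirmed terms
    print(count_assignments(predicted, true))
    print(precision_recall_f1(predicted, true))
```

Minimising false assignments alone would favour predicting nothing, and maximising true assignments alone would favour predicting every term, so some combined measure of this kind is needed.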

Page 7: Protein Prediction II Exercise

Tasks

Download the files from www.rostlab.org/~richter/pp2_files.tgz

Get familiar with the provided files, especially the column names (look them up at UniProt and HPO)

Read: http://biofunctionprediction.org/sites/default/files/IntroductionCAFA_pedja.pdf
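One possible way to get a first look at the provided files is to print the first line of every file in the archive, since that is where column names usually appear. This is only a sketch: it assumes the archive has already been downloaded to a local file (called pp2_files.tgz below) and that its members are text files; the names and formats of the files inside are not known from the slides, and `peek_archive` is a hypothetical helper.

```python
import tarfile


def peek_archive(path: str, n_lines: int = 1) -> dict[str, list[str]]:
    """Map each regular file in a tar archive to its first `n_lines` lines of text."""
    preview = {}
    with tarfile.open(path, "r:*") as archive:  # "r:*" detects the compression itself
        for member in archive.getmembers():
            if not member.isfile():
                continue  # skip directories and links
            handle = archive.extractfile(member)
            lines = []
            for _ in range(n_lines):
                raw = handle.readline()
                if not raw:
                    break  # file has fewer lines than requested
                lines.append(raw.decode("utf-8", errors="replace").rstrip("\r\n"))
            preview[member.name] = lines
    return preview


if __name__ == "__main__":
    # Assumes the archive was saved under this name in the working directory.
    for name, lines in peek_archive("pp2_files.tgz").items():
        print(name, lines)
```

Reading the first lines directly from the archive avoids unpacking everything just to see what is there.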

