UNIVERSIDADE DE LISBOA

FACULDADE DE CIÊNCIAS

DEPARTAMENTO DE BIOLOGIA ANIMAL

CHRONIC OBSTRUCTIVE PULMONARY DISEASE: A PROTEOMICS APPROACH

BRUNO MIGUEL COELHO ALEXANDRE

DOCTORATE IN BIOLOGY

(MOLECULAR BIOLOGY)

Thesis supervised by Dr. Deborah Penque and Professor Ana Maria Viegas Gonçalves Crespo

2011

Declaration

The research described in this thesis was performed at the Laboratório de Proteómica, Departamento de Genética, Instituto Nacional de Saúde Dr. Ricardo Jorge (INSA IP), Lisboa, Portugal; at the Laboratory of Proteomics and Analytical Technologies, National Cancer Institute at Frederick (NCI-Frederick), Frederick, Maryland, United States of America; and at the Clinical Proteomics Facility, University of Pittsburgh Cancer Institute (UPCI), Pittsburgh, Pennsylvania, United States of America.

This work was supported by Fundação para a Ciência e Tecnologia (FCT) PhD grant SFRH/BD/2006/31415 and research grant POCI/SAU-MMO/56163/2004, and also by the FCT/Polyannual Funding Program and FEDER/Saúde XXI (Portugal).

The following chapters are based on articles published or submitted during the PhD:

Chapter II is based on the article:

Alexandre, B. M., Proteomic mining of the red blood cell: focus on the membrane proteome. Expert review of proteomics 2010, 7, 165-168.

During the PhD the following articles were also published:

Cox, J., R, M. A. H., James, P., Jorrin-Novo, J. V., et al., Facing challenges in Proteomics today and in the coming decade: Report of Roundtable Discussions at the 4th EuPA Scientific Meeting, Portugal, Estoril 2010. Journal of proteomics 2011.

Bruno Miguel Coelho Alexandre

(Graduate in Chemistry, Faculdade de Ciências da Universidade de Lisboa)

Acknowledgments

My first words go to Dr. Deborah Penque. First, for having chosen me from among more than 100 candidates for a research fellowship on one of her projects and, during the first year of that fellowship, for inviting me to pursue a PhD in her laboratory. I also thank her for all the support she gave me throughout these years, and especially for being present whenever necessary while giving me the freedom to follow my own path according to my own ideas.

I would equally like to thank Professor Ana Crespo for all the support she gave me during my PhD. Because we share the same background, Chemistry, and because she had previously worked on COPD, I felt great empathy and understanding on her part from the very beginning. I always felt that she understood what it was like for me to work in the field of Biology, which helped me a great deal in dealing with the new knowledge and the difficulties inherent in this new learning.

I would like to thank Dr. Thomas Conrads for having me as a PhD student in his labs (there were 3 different locations!) and for his help in every single step of the projects. This includes acquiring the samples that made the serum project possible. I’d also like to thank Dr. Brian Hood for his contribution to my education as a scientist in the field of proteomics and for his support at all times. I’d like to thank both for all their support over the years at the professional level, always trying to provide me with everything I needed during my time in the US and teaching me how to use new tools, new instruments, etc. Above all, I’d like to thank both for giving me the chance to become their friend and to share unique moments such as Thanksgiving, which we don’t celebrate in Portugal. Thank you very much for all the good moments.

I’d like to thank Dr. Josip Blonder for his contribution to the red blood cell project, along with Drs. Haleem Issaq and Timothy Veenstra.

I’d like to thank all the volunteers for their collaboration, both those recruited in Portugal and those recruited in the US. Without them, this work would not have been possible.

I would like to thank the group of Prof. Dr. Bugalho de Almeida, and especially Dr. Carlos Lopes, for their collaboration in recruiting COPD patients in Portugal and in sharing clinical information, which is crucial for carrying out work in clinical proteomics.

A very special word goes to the people with whom I shared my day-to-day life in the laboratory, especially Patrícia Alves, Nuno Charro, João Banha and Isabel Oliveira, who were later joined by Fátima Vaz, Sofia Neves, Tânia Simões and, recently, Vukosava Torres.

I also want to thank all my friends, they know who they are, for all the moments that contributed so much to preserving my sanity.

My greatest thanks in this thesis go to my partner, the woman of my life, Filipa. The high point, not only of these 4 years of the PhD but of my life, was our wedding day. Thank you for all your unconditional support, even through less happy times that demanded great sacrifice, such as the periods I spent working in the USA. My life only makes sense if I can walk beside you, and that is how I want to spend all my days.

To my parents, I owe thanks for everything: all the support they have always given me throughout my life, the love and affection they have always shown me, and the upbringing and sense of responsibility they instilled in me from the very start while giving me the freedom to chart my own path. We do not choose our parents, but I could not have been luckier.

To finish, I would like to express my thanks for the support I received from my family, especially my grandparents, my godmother, Zé and my lovely cousins Inês and Joana. I would also like to thank the “new” members of my family for the support they have always given me, especially my parents-in-law Vitor and Lai, and my sister-in-law Joana.

Abstract

Chronic obstructive pulmonary disease (COPD) is characterized by chronic airflow limitation that is not fully reversible even under the effect of bronchodilators and is caused by a mixture of small airway disease (obstructive bronchiolitis) and parenchymal destruction (emphysema). At present, COPD is the fourth leading cause of death, and its prevalence and mortality are expected to continue increasing over the next decade.

Spirometry is the most reproducible way to measure lung function and is currently the best tool for diagnosing airflow limitation and, consequently, COPD itself. Biomarkers for diagnosis and/or prognosis, as well as novel targets for the development of more effective therapies for COPD, are still needed.

Proteomics can provide large-scale information and therefore has the potential to expand previous knowledge of COPD. Surprisingly, given the need for new biomarkers in COPD and the power of proteomics, it has been largely neglected so far. To date, only 50 reports (14 of them reviews) match a PubMed search (http://www.ncbi.nlm.nih.gov/pubmed, accessed June 23, 2011) for COPD proteomics. Hence, there is a clear need for clinically valuable proteomics studies to meet the need for new biomarkers in COPD.

In the last decade, the shotgun proteomics approach has become the method of choice for identifying and quantifying proteins in most large-scale studies. Compared with 2DE, shotgun proteomics allows higher data throughput and better protein detection sensitivity. This strategy is based on trypsin digestion of proteins into peptides, producing a complex peptide mixture that is then separated by one- or multidimensional liquid chromatography (LC) and subjected to peptide sequencing by tandem mass spectrometry (MS/MS), followed by automated database searching. In the present work we employed different methodologies within shotgun proteomics to generate solid and comprehensive data on COPD.
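
As a rough illustration of the digestion step described above, the following sketch performs an in-silico tryptic digest. It assumes the commonly used rule that trypsin cleaves on the C-terminal side of lysine (K) or arginine (R) unless the next residue is proline (P). The function name `tryptic_digest`, its parameters and the example sequence are hypothetical; they are not taken from this thesis or from the search software it used.

```python
import re
from typing import List


def tryptic_digest(sequence: str, missed_cleavages: int = 0) -> List[str]:
    """Return peptides from an in-silico tryptic digest (illustrative only).

    Assumed rule: cleave after K or R, except when the next residue is P.
    `missed_cleavages` is the maximum number of skipped cleavage sites
    allowed within one peptide.
    """
    # Zero-width split: position preceded by K/R and not followed by P.
    fragments = [f for f in re.split(r"(?<=[KR])(?!P)", sequence) if f]

    peptides = []
    for start in range(len(fragments)):
        # Join the starting fragment with up to `missed_cleavages` neighbours.
        last = min(start + missed_cleavages, len(fragments) - 1)
        for end in range(start, last + 1):
            peptides.append("".join(fragments[start:end + 1]))
    return peptides


# Hypothetical sequence: the R at position 7 is followed by P, so it is not cut.
print(tryptic_digest("MAAKGGRPCCKDE"))  # ['MAAK', 'GGRPCCK', 'DE']
```

In a real shotgun workflow this step is carried out by the database search engine, which compares the theoretical peptides with the acquired MS/MS spectra.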

Many biological materials can be used to investigate biomarkers for this disease. Although COPD is now known to have a systemic inflammation component that affects other organs, it is in the lung that the events leading to breathlessness take place. Investigating the lung directly is therefore an optimal strategy for identifying proteins that may not be detectable elsewhere, either because they are not present or because they are diluted to undetectable concentrations. However, this means that lung tissue has to be collected by biopsy, which is an extremely invasive technique. Besides tissue biospecimens, other sources of biological material used to study COPD are biofluids, which include sputum, bronchoalveolar and nasal lavage fluid, exhaled breath condensate, and blood.

It had previously been observed by microscopy that red blood cells (RBCs) from COPD patients showed deformations in their shape. RBCs are crucial to the transport of oxygen from the lungs to the cells, and this transport depends on their ability to change shape rapidly while navigating through blood vessels. In addition, RBCs play a crucial role in antioxidant defense against oxidative stress, which has long been recognized as a feature of COPD. In this work we used an RBC membrane fractionation procedure, stable isotope labeling and two-dimensional liquid chromatography (strong cation exchange / reverse phase) before sample acquisition on a high-resolution Fourier transform ion cyclotron resonance (FT-ICR) mass spectrometer (Chapter III). A total of 4697 peptides, corresponding to 1083 proteins, were quantified as present in both COPD and control spectra. Three hundred and fourteen proteins were identified by at least two peptides, 46% of which were annotated as membrane proteins. Golgin-245/p230 (GOLGA4) was identified as overexpressed in COPD. This protein is reported to be essential for intracellular trafficking and cell surface delivery of tumor necrosis factor-α (TNF), the main proinflammatory cytokine made and secreted by inflammatory macrophages, which enhances activation and recruitment of T-cells and ensures robust innate and acquired immune responses. Chorein, or vacuolar protein sorting-associated protein 13A (VPS13A), is reported to play a role in cytoskeleton organization and has been associated with thorny deformations of circulating erythrocytes, possibly due to deformation of red cell membranes. This protein was found by MS to be underexpressed in COPD patients compared with controls, and this underexpression was confirmed by WB. Consequently, underexpression of chorein may play an important role in the deformation of COPD RBCs. Many other interesting proteins were identified in the context of COPD and, additionally, a considerable number of proteins were described in RBCs for the first time (Chapter III).

To overcome the difficulty of acquiring fresh biopsies from well-characterized patients, we established in our laboratory a procedure to collect fresh human nasal epithelial cells. We have shown previously that these cells have a proteome similar to that of epithelial cells from the lower airway. Here, two different types of studies using these cells are presented: a study of the effects of cigarette smoke, which is the main risk factor for developing COPD (Chapter IV), and a comparative proteomic study between COPD patients and healthy individuals (Chapter V). Both were pioneering studies in investigating the proteome of fresh nasal epithelial cells from cigarette smokers (Chapter IV) or COPD patients (Chapter V). In both studies a high-resolution mass analyzer, the Orbitrap, was employed, increasing the number of confident peptide/protein identifications. In Chapter IV, ninety-six proteins were found to be differentially expressed between the proteomes of healthy smokers and nonsmokers. These proteins were related to antigen presentation, cell-to-cell signaling and interaction, cell morphology, drug metabolism, DNA repair, energy production or mitochondrial dysfunction. Although requiring further orthogonal validation, our data were consistent with previous evidence showing differential modulation of CD44, MUC5AC or SOD2 in smokers due to inflammatory response pathways.

In Chapter V, 89968 peptides and 1475 proteins were identified in total, of which 1173 proteins were identified by at least two peptides. We were able to confirm previous evidence that the unfolded protein response (UPR) is activated in COPD patients, since we observed overexpression of a considerable number of proteins belonging to different protein complexes involved in the UPR. These include VCP, both components of the Hsp10/Hsp60 chaperone complex (HSPD1 and HSPE1), CALR, and two members of a large ER-localized multiprotein complex of at least 11 proteins, PPIB and ERP29. We also observed increased expression of proteins related to the Nrf2-mediated oxidative stress response, such as GSTP1, TXNRD1 and GSR. Additionally, we report an increase in drug metabolism, as all significantly differentially expressed proteins related to this biofunction were overexpressed in COPD: GSTP1, GSR, AKR1C3 and ANXA2. Further validation by orthogonal methods is needed to fully confirm the activation of the UPR and the Nrf2-mediated oxidative stress response, and the increase in drug metabolism, in the nasal epithelial cells of COPD patients.

In Chapter VI, serum collected from COPD patients was divided into 4 groups covering all combinations of presence/absence of the two main features of COPD, chronic bronchitis and emphysema, to study their impact on the serum proteome. Because serum is a complex protein mixture, it was first immunodepleted of its most abundant proteins, which comprise about 94% of total protein content, before being analyzed by 1D-PAGE – LC-MS/MS (GeLC-MS/MS) in a linear ion trap mass spectrometer. This powerful strategy identified as many as 2856 proteins, of which 929 were identified by two or more peptides. Plasminogen was found to be underexpressed in COPD patients suffering simultaneously from emphysema and chronic bronchitis, while its expression level remained about the same across the three other groups of COPD patients. This differential expression was successfully validated by ELISA. Other interesting proteins were also identified, such as TRAF3IP2, which is associated with innate immunity in response to pathogens, inflammatory signals and stress and has also been implicated in airway hyperresponsiveness, and isoform 1 of phosphatidylinositol-glycan-specific phospholipase D (GPLD1), a GPI-degrading enzyme described as responsible for the secretion of prostasin, the first of several membrane serine peptidases found to activate the epithelial sodium channel (ENaC). Prostasin has also been reported to have a critical role in regulating epithelial sodium transport in normal and pathological conditions in the lung.

The work presented here confirmed a few findings that had already been reported and, more importantly, revealed new insights into COPD disease mechanisms and provided new candidate biomarkers for this disease. Further validation and integration of all the data obtained into a systems biology approach will certainly contribute to increasing knowledge of COPD and, ultimately, to improving the well-being of patients.

Resumo (Summary)

Chronic Obstructive Pulmonary Disease (COPD) is characterized by an obstructive limitation of airflow from the outside to the alveoli and from the alveoli to the outside. This ventilatory limitation is not fully reversible even after administration of a bronchodilator and tends to be progressive. The causes of this limitation are a set of respiratory pathologies that includes chronic bronchitis and emphysema. In chronic bronchitis, airflow obstruction results from inflammation of the lower airways, scarring of their walls, edema of their lining, mucus and smooth muscle spasm. Chronic bronchitis manifests as frequent cough and increased sputum production. However, not all patients with chronic bronchitis have or will develop chronic airflow limitation. In emphysema, the alveolar walls are destroyed, so the bronchioles lose their structural support and therefore collapse during expiration. Consequently, in emphysema the reduction in airflow is permanent and structural in origin. Currently, no treatment is able to slow the progression of COPD or suppress the inflammation of the small airways and lung parenchyma. Today, COPD is the fourth leading cause of death worldwide, and its prevalence and mortality are expected to continue increasing over the next decade. In Portugal, according to a report released by the Observatório Nacional das Doenças Respiratórias (ONDR), about 540 000 people suffered from COPD in 2007, which means that 5.2% of the population has this condition. Recently, a report released by the Burden of Obstructive Lung Disease (BOLD) initiative revealed that the prevalence of COPD in Portugal is 14.2%.

Spirometry is fundamental to the diagnosis and assessment of COPD because it is the most objective, standardized and easily reproducible means of measuring the degree of airway obstruction. Bronchial obstruction, and therefore COPD, is considered to be present when, after administration of a bronchodilator, the FEV1/FVC ratio is less than 70% (FEV1: Forced Expiratory Volume in 1 second; FVC: Forced Vital Capacity).
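
The spirometric criterion above (FEV1/FVC below 70% after bronchodilator) can be written as a simple check. This is a minimal sketch, assuming post-bronchodilator FEV1 and FVC are both given in litres; the function name `has_airflow_obstruction` and the example values are hypothetical.

```python
def has_airflow_obstruction(fev1_l: float, fvc_l: float,
                            threshold: float = 0.70) -> bool:
    """Return True when the FEV1/FVC ratio falls below the threshold.

    fev1_l: post-bronchodilator forced expiratory volume in 1 second (litres)
    fvc_l:  post-bronchodilator forced vital capacity (litres)
    """
    if fvc_l <= 0:
        raise ValueError("FVC must be positive")
    return fev1_l / fvc_l < threshold


# Hypothetical measurements: 1.8 / 3.0 = 0.60 and 3.2 / 4.0 = 0.80.
print(has_airflow_obstruction(1.8, 3.0))  # True
print(has_airflow_obstruction(3.2, 4.0))  # False
```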

Proteomics is able to provide large-scale information and, consequently, to expand current knowledge of COPD. Surprisingly, given the need to find new biomarkers in COPD and the potential of proteomics, it has so far been neglected in research on the disease. Currently, only 50 scientific publications, 14 of which are review articles, match a search on the PubMed site (http://www.ncbi.nlm.nih.gov/pubmed, accessed June 23, 2011) for the term “COPD Proteomics”. However, when searched separately, the total number of articles with the terms “COPD” and “Proteomics” is approximately 30 000. There is therefore a clear need to produce new proteomics studies to fill this gap and find new biomarkers for COPD. In the last decade, the approach known as shotgun proteomics has become the approach of choice for identifying and quantifying proteins in large-scale studies. Compared with two-dimensional electrophoresis (2DE), this approach yields a larger number of peptide and protein identifications because of its higher sensitivity. The shotgun proteomics strategy is based on the tryptic digestion of proteins into peptides, which produces a complex mixture of peptides that are separated by one or multiple chromatographic dimensions. The peptides are then sequenced in a mass spectrometer and searched automatically against a database. In the present work, different methodologies within the shotgun proteomics approach were used with the aim of generating robust results in COPD.

A wide range of biological materials can be used to search for biomarkers for this disease. Although COPD has a systemic inflammation component that is responsible for affecting other organs, it is in the lung that the mechanisms giving rise to breathlessness are present. Investigating the lung directly is therefore the ideal strategy for finding the proteins responsible for those mechanisms, which in other biological materials may be absent or, when present, occur at concentrations so low that they go undetected. This requires access to lung tissue, which is obtained by biopsy, an extremely invasive technique. Besides lung tissue, other sources of biological material have been used to study COPD, such as sputum, nasal lavage and bronchoalveolar lavage, exhaled breath condensate, and blood and its components. In the present work, erythrocytes, nasal epithelial cells and serum were used to study the disease.

It has been reported, by means of microscopy, that erythrocytes from COPD patients exhibited alterations in their morphology. Erythrocytes are fundamental to the transport of oxygen from the lungs to the cells, and this transport depends on their ability to change shape rapidly while navigating the blood capillaries. Erythrocytes play a fundamental role in combating oxidative stress, which is a characteristic of COPD. Because of their hypothesized importance to the disease, erythrocytes were isolated from whole blood of COPD patients and healthy individuals. In this work (Chapter III), a membrane fractionation technique was used together with 18O/16O isotopic labeling and subsequent separation by two-dimensional chromatography (cation exchange and reverse phase chromatography). The reverse phase chromatography was directly coupled to a high-resolution mass spectrometer (Fourier transform ion cyclotron resonance mass spectrometer, FT-ICR). It was possible to quantify a total of 4697 peptides present in both groups studied, corresponding to 1083 proteins. 314 proteins were identified by two or more peptides, 46% of which are annotated as membrane proteins. The protein GOLGA4 was identified as overexpressed in COPD. This protein is described as essential for the intracellular trafficking and cell surface delivery of tumor necrosis factor-α (TNF), the main proinflammatory cytokine produced and secreted by macrophages for the activation and recruitment of T cells. It was also possible to identify the protein VPS13A, which is associated with deformations in circulating erythrocytes, possibly due to deformations in the membrane. The study concluded that this protein is underexpressed in COPD patients, a result that was confirmed by Western blot.

Because of the difficulty of acquiring lung biopsy samples from COPD patients, a procedure to obtain nasal epithelial cells was developed in our laboratory. Previous work demonstrated that these cells behave similarly to the epithelial cells of the lower airways. Two different types of study were carried out with these cells: a study of the effects of tobacco smoke, which is the main risk factor for developing COPD (Chapter IV), and a comparative study between COPD patients and healthy individuals (Chapter V). Both are pioneering studies, since the proteome of nasal epithelial cells had never been described either in healthy smokers or in COPD patients. In the first study, 96 proteins were identified as differentially expressed between healthy smokers and nonsmokers. These proteins are related to antigen presentation, cell signaling and interaction, cell morphology, xenobiotic metabolism, DNA repair, energy production and mitochondrial dysfunction. The results obtained were consistent with previous evidence showing different modulation of CD44, MUC5AC or SOD2 in smokers due to inflammatory response processes.

In the second study, a total of 89968 peptides corresponding to 1475 proteins were identified, of which 1173 were identified by two or more peptides. It was possible to confirm results from previous studies reporting that Unfolded Protein Response (UPR) mechanisms are activated in COPD patients, since overexpression was observed for a considerable number of proteins involved in different UPR-related protein complexes. Examples of these proteins include VCP, the two components of the Hsp10/Hsp60 complex, CALR, and two members of a large complex of at least 11 proteins located in the endoplasmic reticulum: PPIB and ERP29. An increase in the expression of proteins related to Nrf2-mediated oxidative stress response mechanisms, such as GSTP1, TXNRD1 and GSR, was also observed.

For the study carried out with serum (Chapter VI), four groups were formed to study the different combinations of presence/absence of the two main components of the disease: chronic bronchitis and emphysema. Since serum is a complex mixture of proteins, serum from COPD patients was first immunodepleted of its most abundant proteins, which represent about 94% of its total protein content. The immunodepleted serum samples were analyzed by 1D-PAGE – LC-MS/MS (GeLC-MS/MS). This powerful strategy allowed the identification of 33049 peptides corresponding to 2856 proteins, of which 929 were identified by two or more peptides. Underexpression of PLG was observed in COPD patients suffering simultaneously from emphysema and chronic bronchitis. This underexpression was validated by ELISA. Other proteins of interest were found to be differentially expressed in each of the COPD patient groups, among them TRAF3IP2, which is associated with the innate immune response to pathogens and with airway hyperresponsiveness.

The work described here confirmed previously reported evidence and at the same time formulated new hypotheses for the mechanisms of the disease and revealed potential new biomarkers. Although some results need further validation by orthogonal techniques, the present work made possible the discovery of proteins, and even metabolic pathways, that had not previously been associated with COPD. This work also emphasizes the use of nasal epithelial cells in investigating the pathogenesis of the disease, since it can lead to the identification of new biomarkers specific to this disease.

List of Symbols and Abbreviations

2DE 2-Dimensional Electrophoresis

ACN AcetoNitrile

AMB AMmonium Bicarbonate

CID Collision-Induced Dissociation

COPD Chronic Obstructive Pulmonary Disease

DNA DeoxyriboNucleic Acid

ELISA Enzyme-Linked ImmunoSorbent Assay

ER Endoplasmic Reticulum

ERAD Endoplasmic Reticulum-Associated Degradation

FEV1 Forced Expiratory Volume in 1 second

FVC Forced Vital Capacity

GOLD Global Initiative for Chronic Obstructive Lung Disease

IPA Ingenuity Pathway Analysis

IPI International Protein Index

LC Liquid Chromatography

LIT Linear Ion Trap

MARS Multiple Affinity Removal System

MGG May-Grünwald-Giemsa

MS Mass Spectrometry

PAGE PolyAcrylamide Gel Electrophoresis

RBC Red Blood Cell

RNS Reactive Nitrogen Species

ROS Reactive Oxygen Species

RSD Relative Standard Deviation

RT Room Temperature

SCX Strong Cation exchange

SDS Sodium Dodecyl Sulfate

TBST Tris-Buffered Saline Tween 20

TFA TriFluoroacetic Acid

UPR Unfolded Protein Response

WB Western Blot

WHO World Health Organization

Xcorr charge state dependent cross correlation

ΔCn delta correlation

Contents

Declaration
Acknowledgments
Abstract
Resumo (Summary)
List of Symbols and Abbreviations
Contents
List of Figures
List of Tables
Preface
Chapter I - General Introduction
    CHRONIC OBSTRUCTIVE PULMONARY DISEASE
        History
        Definition and prevalence
        Diagnosis and classification
        Risk factors
        Economic burden
    PROTEOMICS
        Background and state-of-the-art
        Proteomics in COPD
Chapter II - Proteomic Mining Of The Red Blood Cell: Focus On The Membrane Proteome
Chapter III - Quantitative Profiling of the Erythrocyte Membrane Proteome Isolated from Patients Diagnosed with Chronic Obstructive Pulmonary Disease
Chapter IV - A comparative, Global Proteomic Analyses of Human Nasal Epithelial Cells Obtained by Nasal Brushing in Nonsmoking versus Smoking Healthy Individuals
Chapter V - Proteomic Profiling of Nasal Epithelial Cells in Chronic Obstructive Pulmonary Disease
Chapter VI - Serum Proteomics of Chronic Obstructive Pulmonary Disease Patients
Chapter VII - Concluding Remarks and Future Perspectives

List of Figures

Figure I.1: Changes in lung parenchyma of COPD patients. Source: Barnes, PJ. Adapted from www.goldcopd.org.

Figure I.2: Trends in mortality rates for the six leading causes of death in the US, 1970-2002 [13].

Figure I.3: Inflammatory cells involved in COPD. Adapted from [30].

Figure I.4: Pathogenesis of COPD. Source: Barnes PJ. Adapted from www.goldcopd.org.

Figure I.5: Pulmonary and systemic inflammatory events associated with COPD [25].

Figure I.6: Proteomics timeline indicating important scientific contributions to proteomics development for the past five decades [46].

Figure I.7: General view of the experimental steps and flow of data in shotgun proteomics analysis [48].

Figure I.8: Workflow illustrating different proteomics-based approaches and major steps required for proteomic biomarker discovery.

Figure III.1: Basic scheme of methodology showing main steps of sample preparation.

Figure III.2: SCX chromatogram displaying sample separation into ten fractions.

Figure III.3: Subcellular location of the 314 proteins identified by at least two peptides according to both gene ontology annotations and the Ingenuity Systems knowledgebase.

Figure III.4: Biological processes (panels A and C) and molecular functions (panels B and D) for all proteins identified in both COPD patients and control subjects (panels A and B) and for differentially (above 1.5-fold) expressed proteins only (panels C and D). Information gathered from PANTHER software.

Figure III.5: Proteins identified in both samples within the two main RBC membrane protein complexes. Adapted from [20].

Figure III.6: Main protein-protein interaction network comprising 43 members generated by Cytoscape 2.6.3 using PINA database.

Figure III.7: Overrepresented biological processes (GO) for the differentially (above 1.5-fold) expressed proteins in COPD patients.

Figure III.8: Western blot validation showing both representative close-up views of each Ab reaction and graphic representation of the relative normalized abundance of (A) Acylamino-acid-releasing enzyme (AARE), (B) ALDOA, (C) VPS13A and (D) CYB5R3, using the full intensity of the respective Ponceau-stained lane in the nitrocellulose membrane for normalization (n=3 independent replicates/each Ab reaction). The antigen–antibody complex was detected by ECL (GE Healthcare) and Progenesis PG200v2006 software (Nonlinear Dynamics) was used for densitometry analysis.

Figure IV.1: MGG-staining of nasal cells collected by brushing. Magnification:

80x 71

Figure IV.2: Basic workflow of the methodology employed for the study of

the nasal epithelial cells proteome of healthy smokers and nonsmokers. 72

Figure IV.3: Venn diagram showing the overlap in proteins identified by at

least two peptides between the two groups under analysis. 73

.

xxvii

Figure IV.4: Hierarchical cluster of the significantly differentially expressed

proteins. Protein abundances are displayed as normalized expression. X-axis

labels refer to information displayed in Table IV.1. 75

Figure IV.5: Top protein network as obtained from Ingenuity Pathway

Analysis. 84

Figure V.1: Hierarchical clustering of the significantly differentially expressed

proteins between COPD patients and healthy individuals. Protein

abundances are displayed as normalized expression. 104

Figure V.2: Top protein network as obtained from Ingenuity pathway

analysis. 110

Figure V.3: Merged network comprising all 44 proteins found eligible for

networks analysis by Ingenuity pathway analysis. 111

Figure VI.1: Basic scheme of the methodology employed to study COPD

patients’ proteome. 132

Figure VI.2: Cellular location of proteins identified by two or more peptides. 133

Figure VI.3: Hierarchical clustering exhibiting relative abundance of the

eighty-six significantly differentially expressed proteins across the four

groups under analysis. 135

Figure VI.4: ELISA determination of serum plasminogen in COPD patients. X-

axes labels refer to nomenclature displayed in Table VI.1. 137

Figure VI.5: Western blot analysis of APOE. Primary antibody dilution:

1:1,000. Secondary antibody dilution: 1:50,000. 138


List of Tables

Table I.1: Classification of COPD stages based on spirometry [9]. FEV1 = Forced expiratory volume in 1 sec; FVC = forced vital capacity; PaO2 = arterial partial pressure of oxygen.

Table III.1: Main characteristics of both control and patient groups

Table III.2: Profile of the COPD patients.

Table III.3: Predominant pathways associated to COPD patients when compared to healthy smokers as provided by PANTHER.

Table III.4: Ten most overexpressed proteins in COPD erythrocyte ghost as provided by Ingenuity systems knowledgebase. a) Swiss-Prot/Uniprot accession number.

Table III.5: Proteins associated to oxidative stress present in top-10 networks. a) Swiss-Prot/Uniprot accession number; b) According to ingenuity pathways analysis.

Table IV.1: Main characteristics of the biological replicates of the samples under analysis.

Table IV.2: Differentially expressed proteins in smokers (S) when compared to nonsmokers (NS) exhibiting a >95% confidence interval. Cellular location and functional type were retrieved by Ingenuity knowledgebase (Ingenuity Systems).

Table IV.3: Top 5 protein interaction networks generated from proteins found to be significantly differentially expressed proteins between smokers and nonsmokers.

Table IV.4: Top 10 significant biofunctions in disease and disorders observed in differentially expressed proteins of smokers when compared to nonsmokers.

Table V.1: Demographics of biological replicates.

Table V.2: Smoking history of biological replicates.

Table V.3: Differentially expressed proteins in COPD patients when compared to healthy individuals exhibiting a >95% confidence interval. Fold change along with cellular location and functional type retrieved by Ingenuity knowledgebase (Ingenuity Systems) are also provided.

Table V.4: Protein interaction networks generated by IPA from 44 proteins found to be eligible for network analysis among the 46 significantly differentially expressed proteins between COPD patients and healthy individuals.

Table V.5: Top 25 significant biofunctions generated from significantly differentially expressed proteins on COPD patients when compared to healthy individuals.

Table V.6: Top 10 significant biofunctions within diseases and disorders together with proteins involved in each biofunction.

Table V.7: Top 10 significant biofunctions within molecular and cellular functions together with proteins involved in each biofunction.

Table V.8: Top 10 significant biofunctions within physiological system development and function together with proteins involved in each biofunction.

Table VI.1: Main characteristics of the biological replicates for each of the groups under analysis. “No features” (Group A) refers to emphysema and chronic bronchitis only. (Biol Rep- Biological Replicate; BMI- body mass index).

Table VI.2: ELISA determination of serum PLG in COPD patients.


Preface

The goal of the work herein presented is to identify new biomarkers for chronic

obstructive pulmonary disease and to provide new insights into its pathogenesis and

pathology. A huge amount of data was generated in the different studies, and it was

not possible to include all of it due to space constraints. Therefore, information mentioned as

Supporting Information is provided in the CD attached to this thesis.

Chapter I (“General Introduction”) is a general introduction where several aspects of

COPD are discussed and where the state of the art of proteomics is described with the aim

of explaining the advantages of applying proteomics to meet the need for the

discovery of new biomarkers in COPD.

Chapter II (“Proteomic Mining Of The Red Blood Cell: Focus On The Membrane

Proteome”) is an introduction to the red blood cell membrane proteome that has been

published in Expert Review of Proteomics (see “List of Publications”) and although the

author of this manuscript and of the present thesis is the same, this chapter is herein

reproduced after written authorization provided by Expert Reviews Ltd., London, UK.

Chapter II acts as an extended introduction to Chapter III (“Quantitative Profiling of the

Erythrocyte Membrane Proteome Isolated from Patients Diagnosed with Chronic

Obstructive Pulmonary Disease”), since Chapter III describes work performed on the

red blood cell of COPD patients.

Chapters IV and V address work done on the nasal epithelial cell proteome. In Chapter

IV (“A comparative, global proteomic analyses of human nasal epithelial cells obtained

by nasal brushing in non-smoking versus smoking healthy individuals”), the effects of

cigarette smoke, the main risk factor for developing COPD, are addressed and in

Chapter V (“Proteomic profiling of nasal epithelial cells in chronic obstructive

pulmonary disease”) the nasal epithelial proteome of COPD patients is compared to

the one of healthy individuals.

Chapter VI (“Serum proteomics of chronic obstructive pulmonary disease patients”)

reports the proteome investigation of immunodepleted serum samples and compares

the proteome among four different groups of COPD patients, divided by


presence/absence of the two main clinical features of COPD: chronic bronchitis and

emphysema.

Finally, Chapter VII (“Concluding Remarks and Future Perspectives”) presents the

main conclusions and achievements of this work and, at the same time, points to

future work.


Chapter I

General Introduction


CHRONIC OBSTRUCTIVE PULMONARY DISEASE

History

Chronic obstructive pulmonary disease (COPD) has certainly long existed, but the first

reports that may be traced to this disease date only from the seventeenth century. The

first presumed report of COPD cases dates from 1679, when Bonet

described emphysema as a condition of “voluminous lungs” [1]. Almost a century

later, in 1769, Giovanni Morgagni described 19 cases of “turbid” lungs, and 20 years

later an emphysematous lung was illustrated by Matthew Baillie [1]. Early reports of

chronic bronchitis were produced in 1814 by Badham, who used the word catarrh to

refer to the chronic cough and mucus hypersecretion that are key symptoms. He also

described bronchiolitis and chronic bronchitis as disabling disorders [1]. The

emphysema component of the disease was beautifully described by Laënnec (1821) in his

Treatise of diseases of the chest. He recognized that emphysematous lungs were

hyperinflated and did not empty well [1]. The spirometer was invented in 1846 by John

Hutchinson [1]. Today this device is essential for the correct diagnosis and

management of COPD. However, Hutchinson’s instrument only measured vital

capacity. A century went by until Tiffeneau added the concept of timed vital

capacity as a measure of airflow [2]. Gaensler introduced the concept of the air

velocity index based on Tiffeneau’s work and later the forced vital capacity [3], which is

the foundation of the FEV1 and FEV1/FVC percent; with these, spirometry became complete as

a COPD diagnostic instrument. The condition once called chronic obstructive

bronchopulmonary disease, chronic airflow obstruction, chronic obstructive lung

disease, nonspecific chronic pulmonary disease, and diffuse obstructive pulmonary

syndrome was named COPD in 1965 by William Briscoe [4].

Definition and prevalence

COPD is characterized by chronic airflow limitation that is not fully reversible even

under the effect of bronchodilators, caused by a mixture of small airway disease –

obstructive bronchiolitis – and parenchymal destruction – emphysema (Figure I.1).

Indeed, the main components of COPD are chronic bronchitis and emphysema. Chronic

bronchitis is defined by the presence of a chronic recurrent increase in bronchial

secretions sufficient to cause expectoration. These secretions must be present on most

days for a minimum of three months per year for at least two consecutive years and

cannot be attributed to other disorders [5]. Notably, not every patient with

chronic bronchitis has or will develop chronic airflow limitation [6]. Emphysema is

defined anatomically by permanent, destructive enlargement of airspaces distal to the

terminal bronchioles without obvious fibrosis [5].

Figure I.1: Changes in lung parenchyma of COPD patients. Source: Barnes, PJ. Adapted

from www.goldcopd.org

Associated chronic inflammation causes changes and narrowing of the small airways

leading to airway remodeling. Parenchymal destruction is responsible for the loss of

alveolar attachments and decrease of lung elastic recoil [7]. These changes reduce the

ability of the airways to remain open during expiration. No currently available

treatments reduce the progression of COPD or suppress the inflammation in small

airways and lung parenchyma [8]. According to the latest report of the Global Initiative for Chronic

Obstructive Lung Disease (GOLD), COPD is defined as a preventable and

treatable disease with some significant extrapulmonary effects that may contribute to

the severity in individual patients [9].

COPD is a major cause of morbidity and mortality in adults, and its incidence has been

increasing worldwide. In 2000, approximately 2.7 million deaths were caused by COPD


[10]. At the present time, COPD is the fourth leading cause of death and its prevalence

and mortality are expected to continue increasing in the next decade [9-11]. Additionally,

COPD is the only major cause of death that is increasing in prevalence worldwide [12],

while other causes have been declining since 1970 [13] (Figure I.2). Not only mortality

but also morbidity associated with COPD are often underestimated by healthcare

providers and patients as COPD is frequently underdiagnosed and undertreated [14].

Figure I.2: Trends in mortality rates for the six leading causes of death in the US,

1970-2002 [13].

In Portugal and according to a report released by the national observatory for

respiratory diseases (Observatório Nacional das Doenças Respiratórias – ONDR), there

were about 540 000 people suffering from COPD in 2007, which means that 5.2% of

the population is estimated to suffer from this disease [15]. In contrast, a recent report

of the Burden of Obstructive Lung Disease (BOLD) initiative in Portugal revealed that

the prevalence rate of COPD in Portugal is 14.2% [16].

Diagnosis and classification

Spirometry is the most reproducible way to measure lung function and is nowadays

the best tool to diagnose airflow limitation and, consequently, diagnose COPD itself.

Spirometry should be performed post-bronchodilator administration to minimize

variability and if the forced expiratory volume in one second to forced vital capacity

ratio (FEV1/FVC) is lower than 0.7, airflow obstruction is present, and the severity of the

disease is then graded by the FEV1 value according to GOLD guidelines [9]. Classification of

the disease according to the four stages is documented in Table I.1.

Table I.1: Classification of COPD stages based on spirometry [9]. FEV1 = Forced

expiratory volume in 1 sec; FVC = forced vital capacity; PaO2 = arterial partial pressure

of oxygen.

Disease stage | Main characteristics

1: Mild COPD | FEV1/FVC < 70%; FEV1 ≥ 80% predicted; with or without symptoms

2: Moderate COPD | FEV1/FVC < 70%; 50% ≤ FEV1 < 80% predicted; with or without symptoms

3: Severe COPD | FEV1/FVC < 70%; 30% ≤ FEV1 < 50% predicted; with or without symptoms

4: Very severe COPD | FEV1/FVC < 70%; FEV1 < 30% predicted or < 50% predicted plus presence of chronic respiratory failure (PaO2 < 60 mm Hg while breathing room air at sea level)
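The staging rule in Table I.1 can be expressed as a short decision function. The following is only an illustrative sketch of the thresholds listed in the table; the function name `gold_stage` and its parameter names are hypothetical, the FEV1/FVC ratio is assumed to be given as a fraction (0.70 = 70%), and FEV1 as a percentage of the predicted value. It is not a clinical tool.

```python
from typing import Optional


def gold_stage(fev1_fvc: float,
               fev1_pct_predicted: float,
               chronic_respiratory_failure: bool = False) -> Optional[int]:
    """Return the spirometric stage (1-4) from Table I.1, or None when
    FEV1/FVC is not below 0.70 (no airflow obstruction by this criterion).

    fev1_fvc: post-bronchodilator FEV1/FVC as a fraction (assumed input format).
    fev1_pct_predicted: FEV1 as a percentage of the predicted value.
    chronic_respiratory_failure: True if PaO2 < 60 mm Hg on room air at sea level.
    """
    if fev1_fvc >= 0.70:
        return None
    if fev1_pct_predicted < 30:
        return 4  # very severe
    if fev1_pct_predicted < 50:
        # FEV1 < 50% predicted counts as very severe only with respiratory failure
        return 4 if chronic_respiratory_failure else 3
    if fev1_pct_predicted < 80:
        return 2  # moderate
    return 1  # mild


if __name__ == "__main__":
    print(gold_stage(0.62, 55))        # moderate obstruction
    print(gold_stage(0.55, 40, True))  # FEV1 < 50% with respiratory failure
```

Checking the ratio first mirrors the table, in which every stage shares the FEV1/FVC < 70% criterion and only the FEV1 value (and, for stage 4, chronic respiratory failure) separates the stages.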


Risk factors

Cigarette smoke is the most commonly encountered risk factor for COPD. Cigarette

smoking is the leading cause of preventable death worldwide and yet, despite anti-

smoking campaign efforts from such organizations as the European Respiratory Society

[12], American Thoracic Society [17] or the World Health Organization (WHO) [18], the

number of smokers keeps increasing. Thus, the global epidemic of tobacco-associated

diseases has progressively worsened.

Cigarette smokers have a higher prevalence of respiratory symptoms, lung function

abnormalities, a greater rate of decline in forced expiratory volume in the first second,

FEV1, and higher death rates for COPD than nonsmokers [9, 12]. A 25-year follow-up

study of the general population concluded that 92% of COPD deaths occurred in

subjects who were current smokers at the beginning of the follow-up period and that

after 25 years of smoking, at least 25% of smokers without initial disease will develop

clinically significant COPD and 30-40% will have COPD [19]. The fact that not all smokers

develop clinically significant COPD suggests that genetic factors may modify each

individual’s risk [12].

COPD is a polygenic disease and a classical example of gene-environment interaction

[9]. The only proven genetic risk factor for COPD is the hereditary deficiency of α1-

antitrypsin, a major circulating inhibitor of serine proteases; in affected individuals, smoking

considerably increases the risk of COPD [20]. Gene mutations and polymorphisms have

been studied and several candidate genes associated with COPD phenotypes have

been reported, but so far none has been validated [21-24]. Occupational dust, outdoor

and indoor pollution, socioeconomic status and genetic determinants are also

associated with the development of COPD [12].

Pathology, pathogenesis and pathophysiology

Cigarette smoke and other noxious particles cause amplified lung inflammation in

patients who develop COPD. This may induce parenchymal tissue destruction

(emphysema) and disturb normal repair and defense mechanisms resulting in small

airway inflammation [25, 26]. Emphysema and small airway inflammation and damage

lead to the enlargement of alveolar air spaces, airway wall fibrosis, loss of elastic recoil,

smooth muscle hypertrophy, goblet cell hyperplasia and mucus plugging. Inflammatory

exudates accumulate in the lumen of the small airways due to reduced mucociliary escalator

function [25]. The physiological consequences are airway collapse during expiration,

leading to airflow obstruction and hyperinflation (air trapping), which ultimately results

in the characteristic symptom of breathlessness and progressive airflow limitation that may

lead to death. In general, inflammatory and structural changes in the airways increase

with disease severity and persist after smoking cessation [27-29].

COPD is characterized by a specific pattern of inflammation which involves neutrophils,

macrophages and lymphocytes [7]. These cells release inflammatory mediators and

interact with structural cells in airways and lung parenchyma (Figure I.3).

Figure I.3: Inflammatory cells involved in COPD. Adapted from [30].

The wide variety of inflammatory mediators that have been shown to be increased in

COPD patients [30] attract inflammatory cells from the circulation, amplifying the

inflammatory process and inducing structural changes that may lead to emphysema

and mucus hypersecretion [31].


Lung inflammation is believed to be further augmented by oxidative stress and an

excess of proteinases in the lung. These two mechanisms are key players in COPD

pathology (Figure I.4).

Figure I.4: Pathogenesis of COPD. Source: Barnes PJ. Adapted from

www.goldcopd.org.

It has long been proposed that several proteases disrupt connective tissue

components, such as elastin, in the lung parenchyma to produce emphysema and that there is

an imbalance in COPD patients between proteases and endogenous antiproteases

which should protect the lung against protease-derived effects. In COPD,

exogenously and endogenously derived oxidants have been found to inactivate

antiproteinases such as α1-antitrypsin [32]. Evidence of elastin degradation in COPD

has been demonstrated, and although early attention was directed to neutrophil

elastase, many other proteases have been reported to be able to degrade elastin [7].


Oxidants are generated endogenously and exogenously, with cigarette smoke being

heavily implicated in the latter as it contains many oxygen free radicals [25]. Under

normal circumstances and despite permanent exposure to high oxygen levels, the lung

is able to manage oxidant species by neutralizing them with several antioxidant

mechanisms in the human respiratory tract [7, 25, 33]. Oxidative stress occurs when

reactive oxygen species (ROS) are produced in excess of the antioxidant defense

mechanisms resulting in harmful effects such as damage to lipids, proteins and

deoxyribonucleic acid (DNA) [7, 34]. Inflammatory and structural cells that are

activated in the airways of COPD patients produce ROS, including neutrophils,

eosinophils, macrophages and epithelial cells [7, 34, 35]. Alveolar macrophages are

activated by free radicals and react by producing high levels of mediators, some of

which are chemotactic for neutrophils and macrophages (Figure I.3), as well as ROS

and also reactive nitrogen species (RNS), with resultant local and systemic

inflammation [25].

It is increasingly recognized that the inflammatory response associated with COPD

extends beyond the lung [36]. Evidence of systemic inflammation includes activated

circulating inflammatory cells and elevated levels of both inflammatory cytokines and

acute phase proteins, such as C-reactive protein, fibrinogen, leukocytes and tumor necrosis

factor (TNF-α) in COPD patients when compared to healthy subjects [37].

The origin of systemic inflammation in COPD is still unclear and requires further

investigation, but it is likely to be a consequence of a number of factors, including

individual susceptibility and the direct effects of hypoxia and noxious substances such as

those in cigarette smoke on the peripheral vasculature and circulating inflammatory

cells [25]. Alternatively, the observed inflammation may be a consequence of

‘overspill’ from the lung to the peripheral circulation [25]. Systemic inflammation is

directly linked to a number of complications commonly encountered in COPD patients

including, but not limited to, cachexia, skeletal muscle dysfunction, depression,

osteoporosis, diabetes/glucose intolerance, autoimmune disorders and cardiovascular

diseases (Figure I.5) [25, 36].


Figure I.5: Pulmonary and systemic inflammatory events associated with COPD [25].

Economic burden

Among respiratory diseases, COPD is the leading cause of lost work days. In the United

States of America, medical costs credited to COPD were estimated at $32.1 billion [38].

In the European Union, productivity losses are estimated to amount to a total of €28.5

billion annually [12]. Total COPD-related expenses for outpatient care amount to €4.7

billion, while inpatient care generates costs of €2.9 billion, followed by expenses for

pharmaceuticals at €2.7 billion [12]. Therefore, according to these data provided by

the European Respiratory Society and the European Lung Foundation within the


European Lung White Book, indirect costs represent the major financial burden in

COPD and, more importantly, the total costs associated with the disease are quite

relevant. Another study compared the costs of COPD in Spain, USA, Sweden, Holland

and Italy. Global annual costs for each country ranged from €109 million to €541 million, whilst annual

costs per patient were €151-3,91 [39].

Economic burden is likely to be underestimated since, for example, the economic value

of the care provided by family members is not generally acknowledged. Long-term

home care provided by relatives for COPD patients has a negative impact on

professional careers for both patients and their family members [5]. Hence, COPD

represents a very important threat to global economies.

PROTEOMICS

The Proteome is, by definition, the total set of proteins expressed by a given cell,

tissue, organ or organism at a certain time and under certain conditions. Proteomics is

defined as the large-scale study of the proteome. The human genome was

sequenced a decade ago, and about 20,000 genes were identified. Genome

sequence data provide valuable information for proteomics, which can take this knowledge

to a higher level, providing new insights into the pathophysiology of many diseases that

may be translated into new prognostic and diagnostic tools, as well as novel therapeutic

treatments.

Background and state-of-the-art

The word proteome was coined at the Siena 2D electrophoresis meeting in 1994 [40].

The advent of proteomics has brought with it the hope of discovering novel

biomarkers that can be used to diagnose diseases, predict susceptibility and monitor

progression, among many other applications. This hope is built on the ability of

proteomic technologies, such as mass spectrometry (MS), to identify hundreds of

proteins in complex biofluids such as plasma and serum. Very few if any analytical

instruments surpass the mass spectrometer in the versatility of its application in both

basic and applied research, as is the case for biomarker discovery. To support this

statement, it is sufficient to mention that mass spectrometry can be used for

applications ranging from characterization of electronic excited states and vibrational

levels of simple molecules to the construction of protein interaction maps in

multicellular organisms. This is also the result of almost a hundred years of mass

spectrometry utilization since Sir J. J. Thomson was able to create the first mass

spectrometer in 1913 [41]. For over 80 years ionization methods had excluded the

study of large molecules, including peptides and proteins. In the 1980s, this paradigm

changed with the introduction of new ionization methods such as electrospray ionization

(ESI) [42] and matrix-assisted laser desorption ionization (MALDI) [43, 44]. These

simple and sensitive ionization methods have been coupled to different types of

analyzers such as triple quadrupoles, three-dimensional ion traps, and time of flight

(TOF), including its orthogonal version which allowed coupling of TOF to both pulsed

(MALDI) and continuous (ESI) ionization types [40].

A further impetus was given to the process of ion analysis through the

commercialization of hybrid configurations that have been intensively used in

proteomics including, but not limited to, TOF-TOF, ion trap-Fourier transform (FT)-ion

cyclotron resonance (ICR), and quadrupole-TOF. These combinations have a direct

impact on sensitivity and resolution of the sequence information that can be obtained

when performing tandem MS analysis [40].

While MS was evolving, there were many advances in other fields that were

crucial to the development of proteomics, such as sample preparation techniques and

bioinformatics tools. One of the most widely used separation procedures is two-

dimensional gel electrophoresis (2DE), which consists of the separation of a complex

protein mixture according to physicochemical properties of proteins. First, proteins are

separated in one dimension according to their isoelectric point through isoelectric

focusing (IEF) in immobilized pH gradient (IPG) strips, and then separated over a

second dimension according to their molecular weight in a sodium dodecyl sulfate

polyacrylamide gel electrophoresis (SDS-PAGE). This separation method, as it is used

today, was described in 1975 by O’Farrell [45]. SDS-PAGE is one of many achievements

that took place in the last 50 years and that established proteomics at the forefront of

clinical research at the present time (Figure I.6) [46].


Figure I.6: Proteomics timeline indicating important scientific contributions to

proteomics development for the past five decades [46].


Figure I.7: General view of the experimental steps and flow of data in shotgun

proteomics analysis [48].

In the last decade, the shotgun proteomics approach has become the method of choice for

identifying and quantifying proteins in most large-scale studies [47-50]. Compared with

2DE, shotgun proteomics allows higher data throughput and better protein detection

sensitivity. This strategy is based on digesting proteins (usually with trypsin) into

peptides. This produces a complex peptide mixture that is then separated by one- or

multidimensional liquid chromatography (LC) and subjected to peptide sequencing

using tandem mass spectrometry (MS/MS) before automated database searching

(Figure I.7). This strategy is compatible with the use of labeled samples for quantitative


purposes such as stable isotope labeling by amino acids in cell culture (SILAC) [51],

isotope coded affinity tags (ICAT) [52], isobaric tags for relative and absolute

quantitation (iTRAQ™) [53] or 16O/18O exchange [54].
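The digestion step of the shotgun workflow described above can be illustrated with a minimal in silico sketch. It assumes the commonly used trypsin rule (cleavage on the C-terminal side of lysine, K, or arginine, R, unless the next residue is proline, P), which is a convention of database search tools and is not stated in the text; the function name `tryptic_digest`, the `missed_cleavages` parameter and the example sequence are hypothetical.

```python
import re
from typing import List


def tryptic_digest(sequence: str, missed_cleavages: int = 0) -> List[str]:
    """Return in silico tryptic peptides of a protein sequence.

    Assumed rule: cleave after K or R, except when followed by P.
    missed_cleavages: maximum number of skipped cleavage sites per peptide.
    """
    # Zero-width split: position preceded by K/R and not followed by P
    # (splitting on empty matches requires Python 3.7 or later).
    fragments = [f for f in re.split(r"(?<=[KR])(?!P)", sequence) if f]
    peptides = []
    for i in range(len(fragments)):
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            # Joining consecutive fragments models missed cleavage sites
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real protein
    print(tryptic_digest("AKRPGKCD"))
    print(tryptic_digest("AKRPGKCD", missed_cleavages=1))
```

In a real workflow, peptide lists of this kind are what automated database searching compares against the acquired MS/MS spectra; allowing a small number of missed cleavages is a common way to account for incomplete digestion.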

The possibility of combining different techniques along sample preparation steps with

different separation methods and different types of mass spectrometers generates

multiple complementary approaches whose results can be combined to achieve a

higher level of understanding (Figure I.8).

Figure I.8: Workflow illustrating different proteomics-based approaches and major

steps required for proteomic biomarker discovery.

Proteomics in COPD

There is still some ambiguity concerning the disease-specific molecular

mechanisms of the inflammatory process and acute exacerbation of COPD. Potential

biomarkers which are specific for COPD have not been fully identified and validated,

even though there is a great need for such biomarkers [55]. Proteomic technologies

allow for identification of protein changes caused by the disease process and recent

advances, especially in mass spectrometry and bioinformatics, increase the chances

of identifying novel putative biomarkers. In a recent review, Chen and coworkers provide

information on putative biomarkers for COPD generated from several proteomics

studies in lung tissues, bronchoalveolar lavage fluid (BALF) and sputum [55].

Concerning lung tissues, two studies mentioned in this review yielded 12 differentially

expressed proteins when compared to healthy controls, including surfactant protein A

(SP-A) and matrix metalloproteinase-13 (MMP-13) [56, 57]. Regarding BALF, there

were four proteins reported to be differentially expressed in COPD (neutrophil

defensins 1 and 2 and calgranulin A and B) [58] and in sputum, Clara cell secretory

protein (CCSP) and again SP-A were the two proteins reported as potential biomarkers

[57, 59].

Surprisingly, to date only 50 reports (14 of which are reviews) match a PubMed search

(http://www.ncbi.nlm.nih.gov/pubmed, accessed June 23, 2011) for COPD proteomics,

while proteomics and COPD account for about 30,000 each when separately searched.

Hence, there is a clear need to undertake clinically valuable proteomics studies in order to

meet the need for new biomarkers in COPD.


Chapter II

Proteomic Mining Of The Red

Blood Cell: Focus On The

Membrane Proteome


Proteomic Mining Of The Red Blood Cell: Focus On The Membrane Proteome

Bruno M. Alexandre1*

1 Laboratório de Proteómica, Departamento de Genética, Instituto Nacional de Saúde

Dr. Ricardo Jorge (INSA-IP), Lisboa, Portugal

Keywords: Red blood cell; Membrane proteins; Membrane proteomics; Clinical

proteomics; Malaria.

*Corresponding author: Bruno M. Alexandre, Laboratório de Proteómica,

Departamento de Genética, Edifício INSA II, Instituto Nacional de Saúde Dr. Ricardo

Jorge, INSA, I.P., Avenida Padre Cruz, 1649-016 Lisboa, Portugal, Tel: +351 21750 8138,

Fax: +351 21752 6410, e-mail: [email protected]

Alexandre, B. M., Proteomic mining of the red blood cell: focus on the membrane

proteome. Expert review of proteomics 2010, 7, 165-168.

Manuscript reproduced under written authorization from Expert Reviews Ltd., London,

UK.


The plasma membrane is strategically located at the interface between the inside and

the outside of the cell. Membrane proteins, as part of the plasma membrane, act as

key players mediating diverse cellular functions including, but not limited to,

metabolite and ion transport, intercellular communication, cell adhesion and cell

movement [1].

The main function of the red blood cell (RBC) is to mediate O2/CO2 exchange between

cells/tissues and lungs and this is only achieved due to the morphology and mechanical

deformability of the RBC membrane, which is responsible for its capability to perfuse

across vessels and capillaries throughout its 120-day journey. The RBC membrane

combines distinctive features such as high elasticity (with little increase in

surface area) and robustness (stronger than steel in terms of structural resistance) [2].

These unique properties result from a composite structure in which cholesterol and

phospholipids, that compose the plasma membrane envelope, are anchored to a two-

dimensional elastic network of skeletal proteins through transmembrane proteins

embedded in the lipid bilayer [2, 3]. Appropriate function of integral membrane

proteins and their interaction with the cytoskeleton are vital for the maintenance of

structural stability and RBC shape. Failure of any of these events/constituents is the

cause of many red cell disorders [4, 5]. Hence, studying the membrane proteome is

important for comprehending the biology of disease states in the quest for novel

biomarkers and consequently is also important at the pharmacological level, as many

successful drugs known to date target membrane proteins and modulate their activity [6].

From the very beginning, the study of membrane proteomes has proved to be a major

challenge. In 1974, a review was published concerning the organization of the human

RBC membrane compiling several studies and hypothesizing on a possible arrangement

for the most abundant RBC integral membrane proteins and interactors [7]. Given their

importance, it is not surprising that membrane proteins have been studied by a variety

of biochemical techniques. One of those techniques is two-dimensional gel

electrophoresis (2-DE), which paved the way to the proteomics era [8]. In 1978, two

years after the application of the O’Farrell 2-DE system [9] to the study of membrane

proteins [10], over 200 spots were resolved from human RBC membranes [11]. But the

application of 2-DE to the study of membrane proteins was far from ideal. This is

due to unique characteristics of membrane proteins such as their amphiphilic nature

(poor solubility in the aqueous buffers used for isoelectric focusing), high isoelectric

points [pIs can be higher than the upper limit of the immobilized pH gradient (IPG)

strips] and low abundance, which greatly hinder their detection by 2-DE [1, 12-14].

To overcome the solubility issue, Rosenblum et al. used different concentrations of

urea, NP-40 detergent and mercaptoethanol to detect about 600 spots using silver

staining [15]. However, the improvements reported in

the large majority of publications released until the early 1990s lay essentially in the

number of additional spots and the reproducibility of the new/modified

methodologies rather than in the identification and classification of the hypothetical new

proteins found. One must bear in mind that before the development of an analytical

technique for naïve peptide/protein identification, in this case peptide mass

fingerprinting (PMF) [16-20], proteins could only be identified by means of targeted

approaches such as comigration with known proteins or immunoblotting, a more sensitive

technique [21]. Therefore, the first ‘serious’ proteomic study on RBC membranes, i.e.,

the first study in which modern mass spectrometry (MS) and database searching in the

post-genomic era were employed to study RBC membranes, was the one performed by

Low and co-workers in 2002 [22]. Using one-dimensional gel electrophoresis (1-DE)

and 2-DE, silver staining and in-gel trypsin digestion of selected spots followed by

matrix-assisted laser desorption ionization – time of flight (MALDI-TOF) MS, the

authors were able to identify 84 unique proteins: 59 proteins were identified by 2-DE

and 44 by 1-DE (19 proteins were common to both approaches). In addition, several

isoforms were found in the study. The first in-depth study on the RBC proteome was

conducted in 2004, where cytoplasmic and membrane fractions were further

fractionated and resulting sub-fractions were then analyzed (and classified) resulting in

the identification of 181 unique proteins, 91 of which were identified from the

membrane fraction [23]. In 2005, Tyan et al. were able to identify 272 proteins using a

trypsin-immobilized chip used for protein digestion prior to two-dimensional high

performance liquid chromatography – electrospray ionization (2D-HPLC-ESI)-MS/MS

[24]. In the same year, Bruschi et al. [25] presented an approach to improve the

analysis of high Mw proteins in 2-DE. The authors used diluted Immobiline gels

combined with sample delipidation generating gels with more than 500 spots,

including filamentous proteins such as spectrins and ankyrins and integral membrane

proteins such as bands 3, 4.1 and 4.2 [25]. Still in 2005, Kakhniashvili et al. used two-

dimensional fluorescence difference gel electrophoresis (2D-DIGE) to compare the RBC

membrane profile of one sickle cell disease (SCD) patient to one healthy individual and

came up with 49 differentially expressed spots using a threshold of 2.5-fold. Selected

spots were further analyzed by LC-MS after in-gel trypsin digestion to identify 44

protein forms from 22 unique proteins [26]. The same strategy was employed to

investigate the therapeutic action of hydroxyurea in SCD [27]. In 2006, Pasini and

colleagues published the most complete study on the human RBC membrane

proteome known to date [28]. By combining sample preparation techniques and top

quality MS instruments such as quadrupole time-of-flight (Q-TOF) and Fourier

transform – ion cyclotron resonance (FT-ICR) they were able to identify 314 membrane

proteins (and also 252 soluble proteins). A very promising gel-based approach for

analysis of membrane proteins is two-dimensional blue-native (BN)/SDS PAGE. This

novel approach was applied to the study of the RBC membrane by van Gestel et al.

who were able to detect 146 spots, from which 524 unique proteins were identified by

LC-MS. These data were compared with two other comprehensive datasets produced by

Pasini et al. [28] and Bosman et al. [29], and it was striking to observe that only 112

of a total of 1431 unique proteins were identified in all three studies. In

addition, the authors were able to use BN/SDS PAGE in combination with CyDye

labeling to quantitatively analyze samples from healthy volunteers and a patient

suffering from congenital anemia [30], an approach that can potentially be used for

biomarker discovery. Noteworthy, although only targeted to cytoplasmic proteins, is

the study carried out by Roux-Dalvai et al., since the use of a new technology, peptide

ligand library, was responsible for the identification of as many as 1578 proteins from a

highly purified preparation of RBCs.

Datasets from some of the studies presented here were gathered in a minireview

paper [31], but unfortunately the authors did not provide the

accession numbers, which makes it difficult to establish how many proteins

have been found and to compare newly obtained datasets with the data

collected so far. Indeed, no review produced to date compiles the

protein identifications (and classifications) together with the corresponding

accession numbers (e.g., UniProt, IPI) from the different studies, including very interesting

recently published reviews [32-35].
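Cross-study comparisons of the kind discussed above become straightforward once every dataset is keyed by accession number. The following is a minimal sketch, assuming each study's identifications are available as a set of accession strings; the study labels and accessions are placeholders invented for illustration and are not data from the cited studies.

```python
from itertools import combinations


def overlap_summary(datasets):
    """Summarise the overlap between protein-identification datasets.

    `datasets` maps a study label to a set of accession strings.
    """
    # Every accession reported by at least one study
    all_ids = set().union(*datasets.values())
    # Accessions reported by every study
    common = set.intersection(*datasets.values())
    # Number of shared accessions for each pair of studies
    pairwise = {
        (a, b): len(datasets[a] & datasets[b])
        for a, b in combinations(sorted(datasets), 2)
    }
    return {
        "total_unique": len(all_ids),
        "common_to_all": len(common),
        "pairwise": pairwise,
    }


# Placeholder accessions (hypothetical, for illustration only)
studies = {
    "study_A": {"P00001", "P00002", "P00003", "P00004"},
    "study_B": {"P00002", "P00003", "P00005"},
    "study_C": {"P00003", "P00004", "P00005", "P00006"},
}
summary = overlap_summary(studies)
```

The same set logic accounts for the counts reported by Low and co-workers: 59 proteins by 2-DE plus 44 by 1-DE, minus the 19 found by both, gives 84 unique proteins.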

Conclusion & future outlook

The RBC proteome was believed to be a simple one taking into account that RBCs are

enucleated cells and lack internal organelles and protein synthesis machinery. But in

the past decade, the knowledge on the RBC proteome increased dramatically and

changed this picture. This was due not only to developments in sample preparation

techniques (e.g., fractionation, depletion/enrichment), but mostly to the use of

sophisticated mass spectrometers, appropriate search algorithms and to

comprehensive human protein databases. In order to entirely comprehend the action

of the RBC, and particularly the different roles played by its membrane, it is necessary

to identify every single protein, their structure, function, posttranslational

modifications, interactions, location and abundance in the cell. Unraveling the RBC

membrane proteome will have a tremendous impact on medicine, including hot topics

such as transfusion medicine [33, 36, 37] and malaria. Malaria, which is caused by a

eukaryotic protist of the genus Plasmodium, is responsible for the death of about 3

million people worldwide [38]. Malaria parasites first invade hepatocytes of the human

host before traveling into the blood to infect RBCs. As circulating infected RBCs are

removed in the spleen, P. falciparum (responsible for 80% of malaria cases and 90% of

deaths from malaria) exhibits adhesion proteins at the RBC surface, causing RBCs to attach

to the blood vessels. These surface adhesion proteins, such as Plasmodium falciparum

erythrocyte membrane protein 1 (PfEMP1), are exposed to the immune system and

would therefore be an easy target. But remarkably, the parasites stay one step

ahead of the immune system by presenting extreme diversity in PfEMP1 isoforms:

there are over 60 variations of the protein within a single parasite and virtually

limitless versions within parasite populations [39]. Furthermore, there are other RBC

proteins of particular interest, such as glucose-6-phosphate dehydrogenase or Duffy

antigens (used by P. vivax to enter the cell), whose deficiency in RBCs

results in increased protection against P. vivax and severe malaria [40-42]. These facts

by themselves present major challenges for clinicians and researchers and set

important cases for the continued detailed study of the RBC membrane proteome. But

clinical proteomics applications are far from being limited to diseases that directly

affect the RBC; diseases that do not affect the RBC directly but nevertheless provoke

alterations in it are no less worth exploiting for diagnostic purposes.

For instance, when RBCs are depleted of critical enzymes needed for intermediary

metabolism and antioxidant activity, the result is oxidation of critical membrane

proteins, lipids and hemoglobin, which leads to distortion and rigidity of the RBC

membrane.

It is also important to acknowledge that RBCs are one of the most abundant cells in

humans and are involved in numerous processes through the interplay with other

blood cells and endothelial cells for a time period that may last as long as 4 months,

thus potentially accumulating modifications on their proteins (surface and membrane

proteins will likely be the most affected) that can indirectly report the underlying features

of a specific pathology, possibly even before symptoms manifest.

REFERENCES

[1] Rabilloud, T., Membrane proteins and proteomics: love is possible, but so difficult.

Electrophoresis 2009, 30 Suppl 1, S174-180.

[2] Mohandas, N., Gallagher, P. G., Red cell membrane: past, present, and future.

Blood 2008, 112, 3939-3948.

[3] Mohandas, N., Evans, E., Mechanical properties of the red cell membrane in

relation to molecular structure and genetic defects. Annual review of biophysics and

biomolecular structure 1994, 23, 787-818.

[4] An, X., Mohandas, N., Disorders of red cell membrane. British journal of

haematology 2008, 141, 367-375.

[5] Delaunay, J., The molecular basis of hereditary red cell membrane disorders. Blood

reviews 2007, 21, 1-20.

[6] Hopkins, A. L., Groom, C. R., The druggable genome. Nat Rev Drug Discov 2002, 1,

727-730.

[7] Steck, T. L., The organization of proteins in the human red blood cell membrane. A

review. The Journal of cell biology 1974, 62, 1-19.

[8] Penque, D., 2009, pp. 155-172.

[9] O'Farrell, P. H., High resolution two-dimensional electrophoresis of proteins. The

Journal of biological chemistry 1975, 250, 4007-4021.

[10] Ames, G. F., Nikaido, K., Two-dimensional gel electrophoresis of membrane

proteins. Biochemistry 1976, 15, 616-623.

[11] Rubin, R. W., Milikowski, C., Over two hundred polypeptides resolved from the

human erythrocyte membrane. Biochimica et biophysica acta 1978, 509, 100-110.

[12] Santoni, V., Molloy, M., Rabilloud, T., Membrane proteins and proteomics: un

amour impossible? Electrophoresis 2000, 21, 1054-1070.

[13] Rabilloud, T., Chevallet, M., Luche, S., Lelong, C., Fully denaturing two-dimensional

electrophoresis of membrane proteins: a critical update. Proteomics 2008, 8, 3965-

3973.

[14] Zhang, H., Lin, Q., Ponnusamy, S., Kothandaraman, N., et al., Differential recovery

of membrane proteins after extraction by aqueous methanol and trifluoroethanol.

Proteomics 2007, 7, 1654-1663.

[15] Rosenblum, B. B., Hanash, S. M., Yew, N., Neel, J. V., Two-dimensional

electrophoretic analysis of erythrocyte membranes. Clinical chemistry 1982, 28, 925-

931.

[16] Pappin, D. J., Hojrup, P., Bleasby, A. J., Rapid identification of proteins by peptide-

mass fingerprinting. Curr Biol 1993, 3, 327-332.

[17] Henzel, W. J., Billeci, T. M., Stults, J. T., Wong, S. C., et al., Identifying proteins

from two-dimensional gels by molecular mass searching of peptide fragments in

protein sequence databases. Proceedings of the National Academy of Sciences of the

United States of America 1993, 90, 5011-5015.

[18] Mann, M., Hojrup, P., Roepstorff, P., Use of mass spectrometric molecular weight

information to identify proteins in sequence databases. Biological mass spectrometry

1993, 22, 338-345.

[19] James, P., Quadroni, M., Carafoli, E., Gonnet, G., Protein identification by mass

profile fingerprinting. Biochemical and biophysical research communications 1993, 195,

58-64.

[20] Yates, J. R., 3rd, Speicher, S., Griffin, P. R., Hunkapiller, T., Peptide mass maps: a

highly informative approach to protein identification. Analytical biochemistry 1993,

214, 397-408.

[21] Towbin, H., Staehelin, T., Gordon, J., Electrophoretic transfer of proteins from

polyacrylamide gels to nitrocellulose sheets: procedure and some applications.

Proceedings of the National Academy of Sciences of the United States of America 1979,

76, 4350-4354.

[22] Low, T. Y., Seow, T. K., Chung, M. C., Separation of human erythrocyte membrane

associated proteins with one-dimensional and two-dimensional gel electrophoresis

followed by identification with matrix-assisted laser desorption/ionization-time of

flight mass spectrometry. Proteomics 2002, 2, 1229-1239.

[23] Kakhniashvili, D. G., Bulla, L. A., Jr., Goodman, S. R., The human erythrocyte

proteome: analysis by ion trap mass spectrometry. Mol Cell Proteomics 2004, 3, 501-

509.

[24] Tyan, Y. C., Jong, S. B., Liao, J. D., Liao, P. C., et al., Proteomic profiling of

erythrocyte proteins by proteolytic digestion chip and identification using two-

dimensional electrospray ionization tandem mass spectrometry. Journal of proteome

research 2005, 4, 748-757.

[25] Bruschi, M., Seppi, C., Arena, S., Musante, L., et al., Proteomic analysis of

erythrocyte membranes by soft Immobiline gels combined with differential protein

extraction. Journal of proteome research 2005, 4, 1304-1309.

[26] Kakhniashvili, D. G., Griko, N. B., Bulla, L. A., Jr., Goodman, S. R., The proteomics of

sickle cell disease: profiling of erythrocyte membrane proteins by 2D-DIGE and tandem

mass spectrometry. Experimental biology and medicine (Maywood, N.J.) 2005, 230, 787-

792.

[27] Ghatpande, S. S., Choudhary, P. K., Quinn, C. T., Goodman, S. R., Pharmaco-

proteomic study of hydroxyurea-induced modifications in the sickle red blood cell

membrane proteome. Experimental biology and medicine (Maywood, N.J.) 2008, 233,

1510-1517.

[28] Pasini, E. M., Kirkegaard, M., Mortensen, P., Lutz, H. U., et al., In-depth analysis of

the membrane and cytosolic proteome of red blood cells. Blood 2006, 108, 791-801.

[29] Bosman, G. J., Lasonder, E., Luten, M., Roerdinkholder-Stoelwinder, B., et al., The

proteome of red cell membranes and vesicles during storage in blood bank conditions.

Transfusion 2008, 48, 827-835.

[30] van Gestel, R. A., van Solinge, W. W., van der Toorn, H. W., Rijksen, G., et al.,

Quantitative erythrocyte membrane proteome analysis with Blue-Native/SDS PAGE.

Journal of proteomics 2009.

[31] Goodman, S. R., Kurdia, A., Ammann, L., Kakhniashvili, D., Daescu, O., The human

red blood cell proteome and interactome. Experimental biology and medicine

(Maywood, N.J.) 2007, 232, 1391-1408.

[32] Pasini, E. M., Lutz, H. U., Mann, M., Thomas, A. W., Red blood cell (RBC)

membrane proteomics - Part I: Proteomics and RBC physiology. Journal of proteomics

2009.

[33] Pasini, E. M., Lutz, H. U., Mann, M., Thomas, A. W., Red Blood Cell (RBC)

membrane proteomics - Part II: Comparative proteomics and RBC patho-physiology.

Journal of proteomics 2009.

[34] Liumbruno, G., D'Alessandro, A., Grazzini, G., Zolla, L., Blood-related proteomics.

Journal of proteomics 2009.

[35] D'Alessandro, A., Righetti, P. G., Zolla, L., The red blood cell proteome and

interactome: an update. Journal of proteome research, 9, 144-163.

[36] Liumbruno, G., D'Amici, G. M., Grazzini, G., Zolla, L., Transfusion medicine in the

era of proteomics. Journal of proteomics 2008, 71, 34-45.

[37] Bosman, G. J., Lasonder, E., Groenen-Dopp, Y. A., Willekens, F. L., et al.,

Comparative proteomics of erythrocyte aging in vivo and in vitro. Journal of proteomics

2009.

[38] Snow, R. W., Guerra, C. A., Noor, A. M., Myint, H. Y., Hay, S. I., The global

distribution of clinical episodes of Plasmodium falciparum malaria. Nature 2005, 434,

214-217.

[39] Chen, Q., Schlichtherle, M., Wahlgren, M., Molecular aspects of severe malaria.

Clin Microbiol Rev 2000, 13, 439-450.

[40] Foller, M., Bobbala, D., Koka, S., Huber, S. M., et al., Suicide for survival--death of

infected erythrocytes as a host mechanism to survive malaria. Cell Physiol Biochem

2009, 24, 133-140.

[41] Rowe, J. A., Opi, D. H., Williams, T. N., Blood groups and malaria: fresh insights

into pathogenesis and identification of targets for intervention. Curr Opin Hematol

2009, 16, 480-487.

[42] Chootong, P., Ntumngia, F. B., Vanbuskirk, K. M., Xainli, J., et al., Mapping epitopes

of the Plasmodium vivax Duffy binding protein with naturally acquired inhibitory

antibodies. Infection and immunity 2009.

Chapter III

Quantitative Profiling of the

Erythrocyte Membrane Proteome

Isolated from Patients Diagnosed

with Chronic Obstructive

Pulmonary Disease

Quantitative Profiling of the Erythrocyte Membrane Proteome Isolated from Patients

Diagnosed with Chronic Obstructive Pulmonary Disease

Bruno M. Alexandre1,2, Nuno Charro1,2, Carlos Lopes3, Pilar Azevedo3, António Bugalho

de Almeida3, King C. Chan2, Haleem Issaq2, Timothy D. Veenstra2, Josip Blonder2,

Deborah Penque1*

1 Laboratório de Proteómica, Departamento de Genética, Instituto Nacional de Saúde

Dr. Ricardo Jorge (INSA-IP), Lisboa, Portugal

2 Laboratory of Proteomics and Analytical Technologies, SAIC-Frederick Inc., National

Cancer Institute at Frederick, Frederick MD, USA

3 Clínica Universitária de Pneumologia, HSM, Universidade de Lisboa, Portugal

Keywords: 16O/18O stable isotopic labelling, Chronic obstructive pulmonary disease,

Membrane proteins, Red blood cells, LC-MS/MS

*Corresponding author: Deborah Penque, Ph.D., Laboratório de Proteómica,

Departamento de Genética, Edifício INSA II, Instituto Nacional de Saúde Dr. Ricardo

Jorge, INSA, I.P., Avenida Padre Cruz, 1649-016 Lisboa, Portugal, Tel: +351 21750 8137,

Fax: +351 21752 6410, E-mail: [email protected]

ABSTRACT

Structural alterations in erythrocyte shape/metabolism have been described as playing

an important role in the pathophysiology of COPD. Whether these

structural/metabolic dysfunctions alter the erythrocyte membrane proteome in patients

diagnosed with COPD remained to be determined. The goal of this study was the

comparative proteomic profiling of erythrocyte membranes isolated from the

peripheral blood of smokers diagnosed with COPD and of healthy smokers, using

differential 16O/18O stable isotope labeling followed by strong cation exchange (SCX)

fractionation and high resolution LC-LIT/FTICR-MS. Two hundred and nineteen

proteins were identified as significantly differentially expressed in COPD erythrocyte

membranes. Functional analysis indicates that the main pathway networks associated

with these proteins are related to cell-to-cell signaling and interaction, hematological

system development, function and immune response, oxidative stress and

cytoskeleton. Chorein, which is reported to play a role in the cytoskeleton and whose

defects have been associated with thorny deformations of circulating

erythrocytes, possibly due to deformation of the red cell membrane, was found to be

underexpressed in COPD patients. The potential relevance of this and other proteins to

COPD erythrocyte dysfunction and/or COPD itself is discussed.

INTRODUCTION

Chronic obstructive pulmonary disease (COPD) is characterized by chronic airflow

limitation that responds poorly to bronchodilators. COPD is mainly caused by small

airway disease (obstructive bronchiolitis) and parenchymal destruction (emphysema).

A chronic inflammatory response induces narrowing of the small airways that leads to

airway remodeling. Subsequent parenchyma destruction is responsible for the loss of

alveolar attachments and decrease of lung elastic recoil [1]. These pathological

processes reduce the ability of the airways to remain open during expiration. There is

no evidence that currently available treatments significantly reduce the progression of

COPD or suppress the inflammation in small airways and adjacent lung parenchyma

[2]. Spirometry is the tool of choice to assess airflow limitation, diagnose and classify

the severity of COPD [3]. In 2000, approximately 2.7 million deaths were caused by

COPD [4] placing this disease as the fourth leading cause of death in the world [5]. This

substantial morbidity associated with COPD is often underestimated by healthcare

providers and patients because COPD is frequently under-diagnosed and under-

treated [6]. Cigarette smoking is by far the most common and most significant

risk factor for COPD. Cigarette smokers have a higher prevalence of respiratory

symptoms and lung function abnormalities, including greater annual rate of decline in

FEV1 resulting in higher COPD incidence and mortality when compared to nonsmokers

[4]. According to Løkke et al. (2006) [7], approximately 90% of COPD patients are

smokers or ex-smokers.

COPD is primarily a lung disease that also displays significant systemic effects [8-10]. It

has been suggested that structural alterations in erythrocyte shape/metabolism and

changes in blood rheology play an important role in the pathophysiology of COPD.

For instance, when erythrocytes are depleted of critical enzymes and antioxidants (e.g., reduced

glutathione) needed for intermediary metabolism and antioxidant activity, the result is

oxidation of membrane proteins, lipids and hemoglobin, causing distortion and rigidity

of the cell membrane. In this respect, erythrocytes have been reported to act as biosensors

for monitoring the oxidative imbalance during the course of COPD [11]. Using scanning electron microscopy

(SEM), light fluorescence microscopy and electron paramagnetic resonance (EPR),

Santini et al. were able to show that the surface of erythrocytes from COPD patients is

greatly altered with respect to control RBCs. They also observed important alterations

in actin and spectrin distribution and an increase in membrane rigidity in the context

of COPD [12].

The goal of the present study was to examine global changes in protein expression

between the erythrocyte membrane proteomes isolated from the peripheral blood of

smokers diagnosed with COPD and of healthy smokers. For this purpose, 16O/18O stable

isotope labeling coupled with 2D-LC-MS/MS was employed for differential profiling of

erythrocyte microsomal fraction obtained from peripheral blood [13]. Relative changes

in protein concentrations were determined using quantitative proteomic profiling that

relies on trypsin-mediated 16O/18O stable isotope labeling, strong cation exchange

(SCX) fractionation and high resolution LC-LIT/FTICR-MS.

MATERIALS AND METHODS

The study was approved by the Ethics Committees of both Hospital de Santa Maria

(HSM)-Lisboa and Instituto Nacional de Saúde Dr. Ricardo Jorge (INSA)-Lisboa. After

informed consent, healthy subjects (n=28) and patients diagnosed with COPD (n=25)

according to the GOLD guidelines [3] were recruited from the Clínica Universitária de

Pneumologia, HSM. All individuals were matched for age (≥ 45 years old), gender and

smoking habits (Table III.1). Healthy subjects were current smokers presenting no signs

or symptoms of any respiratory or other chronic diseases. All COPD patients were

smokers or ex-smokers. Patients experiencing disease exacerbation and/or suffering

from any additional respiratory or chronic disease (e.g., asthma) were excluded. The

most relevant COPD clinical features observed in patients are listed in Table III.2.

Purification of Red Blood Cells (RBCs) from Whole Blood

Fresh peripheral blood was collected from patients and controls into 4.9 ml vacuum

blood-collection tubes containing 1.6 mg EDTA/ml blood and kept at 4 ºC for no longer

than 4 hours before RBC purification, to avoid degradation. To isolate RBCs, whole blood

was centrifuged for 5 minutes at 2,000 x g using a swinging bucket rotor (Heraeus

Multifuge general purpose tabletop centrifuge). Both the plasma and the fraction

containing mostly leukocytes and thrombocytes (buffy coat) were removed by careful

suction. The remaining packed RBCs were centrifuged for another 2 minutes for

complete removal of the buffy coat. The RBC fraction was washed three to four times

with 3 volumes of an isotonic buffer [0.9% NaCl (w/v) solution, pH 8.0] and centrifuged

between washes at 2,000 x g for 4 minutes at 4 °C to completely remove any remaining

trace of the buffy coat.

Table III.1: Main characteristics of both control and patient groups.

                                        Control group   COPD group
Subjects (n)                            28              25
Male (n)                                13              17
Age (yr), mean ± SD                     45 ± 12         61 ± 11
Current smokers (%)                     100             86
Smoking history (pack-yr), mean ± SD    20 ± 9          32 ± 14
FEV1/FVC (%), mean ± SD                 80 ± 9          66 ± 16
FEV1 (%), mean ± SD                     92 ± 12         81 ± 29

Table III.2: Profile of the COPD patients.

                                                      COPD (n=25)
Age of diagnosis (yr), mean ± SD                      55 ± 11
Average number of annual exacerbations, mean ± SD     2.4 ± 0.9
P(O2) (mm Hg), mean ± SD                              78.8 ± 15.1
P(CO2) (mm Hg), mean ± SD                             41.7 ± 8.7
Chronic bronchitis (%)                                76
Emphysema (%)                                         48
Respiratory insufficiency (%)                         36

RBC Ghost Preparation

RBC ghosts were prepared immediately from the enriched RBC pellets [14]. The cells were lysed

by incubation in 1 volume of 5 mM phosphate buffer, pH 7.4, containing protease

inhibitors, for 15 minutes at RT. The samples were then diluted to 20 volumes with the

same buffer and centrifuged at 4,500 x g for 10 minutes at 4 ºC. The resulting RBC ghosts

were washed three times in the same buffer and centrifuged for 10 minutes at 27,300

x g (Sorvall RC5C plus, SS-34 rotor). Finally, to ensure hemoglobin-free ghosts and retain

only the whitish pellet, pellets were centrifuged for 5 minutes at 25,000 x g in a

benchtop centrifuge (Eppendorf Centrifuge 5417 R, FA-45-24-11 rotor) and stored at

-80 ºC until further use.
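Because the centrifugation steps above are specified as relative centrifugal force (x g), reproducing them on a different rotor requires converting to rotor speed. The sketch below uses the standard relation RCF = 1.118 x 10^-5 x r x rpm^2 (r in cm); the 10 cm radius in the example is a hypothetical value, not a specification of any rotor named in this chapter.

```python
import math

# Standard conversion constant for radius in cm and speed in rpm
RCF_CONSTANT = 1.118e-5


def rcf_for_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) at a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_for_rcf(rcf_g, radius_cm):
    """Rotor speed (rpm) needed to reach `rcf_g` (x g) at radius `radius_cm`."""
    return math.sqrt(rcf_g / (RCF_CONSTANT * radius_cm))


# Example: speed for the 2,000 x g RBC isolation step,
# assuming a hypothetical 10 cm rotor radius
speed = rpm_for_rcf(2000, 10.0)
```

The radius to use is the one stated by the rotor manufacturer, usually the maximum radius.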

Microsomal Preparation

Isolated erythrocyte ghosts from the same group (control or patient) were pooled into

2 ml siliconized tubes in the presence of 50 mM ammonium bicarbonate and 1 mM

TCEP (final concentration). The two pooled ghost samples were lysed by sonication

(Bransonic 1510R-DTH, Danbury CT, USA) and then centrifuged at 100,000 x g for 1

hour (Beckman 50 Ti rotor). The membrane pellets were resuspended by sonication

(Branson Digital Sonifier 250, Danbury CT, USA) in 100 mM sodium carbonate,

incubated with agitation for 2 hours at 4 ºC and centrifuged at 100,000 x g for 90

minutes (OPTIMA TLX Ultracentrifuge) to pellet purified membranes. The pellets were

washed three times with d.d. H2O, resuspended in 50 mM ammonium bicarbonate,

followed by BCA protein assay (Pierce, Rockford IL, USA). Equal amounts of the control and

COPD samples were centrifuged for 1 hour at 100,000 x g, lyophilized to dryness

and resuspended in methanol (60% v/v, Omnisolv, EM Science, Gibbstown, NJ, USA)

buffered with 50 mM ammonium bicarbonate using intermittent sonication in a water

bath. Tryptic (Promega, Madison, WI, USA) digestion was carried out at 37 ºC in the

60% methanol/buffer solution using a 1:20 w/w trypsin-to-protein ratio as previously

described [15].

Differential, post-digestion 16O/18O Labeling.

The 18O labeling was performed by trypsin-catalyzed 18O exchange at a 1:20

trypsin/protein ratio in an organic/aqueous system consisting of 20% (v/v) methanol/

80% (v/v) 25 mM ammonium bicarbonate, pH 7.9, prepared in 18O-labeled water (H2 18O), as described in

detail by Blonder and co-workers [13]. After 4 hours of incubation, a second aliquot of

trypsin (containing 1 µg) was added to each sample in order to increase 18O

incorporation efficiency. Exchange reactions were quenched by boiling the samples for

10 minutes in a water bath and, after placing them on ice, the pH of each sample was

adjusted to 2.5 with TFA. Samples were then pooled and immediately lyophilized

to dryness.
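Trypsin-catalyzed exchange can replace both oxygen atoms of the C-terminal carboxyl group with 18O, which is the origin of the +4.008 Da modification used in the database search. The arithmetic can be sketched as follows, using standard monoisotopic masses for 16O and 18O; the function names are illustrative only.

```python
# Monoisotopic masses of the oxygen isotopes (Da)
MASS_16O = 15.9949146
MASS_18O = 17.9991610


def o18_mass_shift(n_atoms=2):
    """Mass increase (Da) when `n_atoms` 16O atoms are replaced by 18O."""
    return n_atoms * (MASS_18O - MASS_16O)


def mz_spacing(charge, n_atoms=2):
    """m/z distance between the light and heavy peptide at a given charge."""
    return o18_mass_shift(n_atoms) / charge


full_label = o18_mass_shift()      # both C-terminal oxygens exchanged
partial_label = o18_mass_shift(1)  # only one oxygen exchanged
doubly_charged_gap = mz_spacing(2)
```

Incomplete exchange leaves a singly labeled species about 2.004 Da above the light peptide, which is why the second addition of trypsin is used to drive incorporation towards the doubly labeled form.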

Peptide Fractionation by Strong Cation Exchange Liquid Chromatography

Lyophilized peptides were dissolved in 200 µL of 45% acetonitrile containing 0.1%

formic acid prior to strong cation exchange (SCX) chromatography. The sample was

resolved into 10 fractions using a microcapillary LC system (Model 1100, Agilent

Technologies Inc., Palo Alto, CA) as previously described [13]. Briefly, peptide fractions

were eluted with an ammonium formate multistep gradient at a flow rate of 200

µl/minute as follows: 0-1% B in 2 minutes, 1-10% B in 60 minutes, 10-62 % B in 20

minutes, 62-100% B in 3 minutes. Mobile phase A was 45% CH3CN and mobile phase B

was 45% CH3CN, 0.5 M ammonium formate, pH 3. The SCX-LC fractions were lyophilized

to dryness and reconstituted in 0.1% formic acid immediately prior to MS/MS analysis.
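The multistep SCX gradient above can be written as a piecewise-linear function of time. This is a minimal sketch that assumes each "X-Y% B in N minutes" entry is a consecutive segment of duration N (85 minutes in total) and that mobile phase A contains no ammonium formate; both are readings of the text, not statements from the cited protocol.

```python
# (start %B, end %B, duration in minutes) for each gradient segment
GRADIENT = [
    (0.0, 1.0, 2.0),
    (1.0, 10.0, 60.0),
    (10.0, 62.0, 20.0),
    (62.0, 100.0, 3.0),
]
FORMATE_IN_B_MOLAR = 0.5  # ammonium formate concentration in mobile phase B


def percent_b(t_min, steps=GRADIENT):
    """Percentage of mobile phase B at time `t_min` (minutes from start)."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for start, end, duration in steps:
        if t_min <= elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return start + fraction * (end - start)
        elapsed += duration
    return steps[-1][1]  # hold the final composition after the last segment


def formate_molar(t_min):
    """Ammonium formate concentration (M) delivered at time `t_min`."""
    return percent_b(t_min) / 100.0 * FORMATE_IN_B_MOLAR


total_minutes = sum(duration for _, _, duration in GRADIENT)
```

For example, halfway through the third segment (72 minutes) the sketch gives 36% B, corresponding to 0.18 M ammonium formate.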

Nanoflow RPLC–MS/MS Analysis

Nanoflow RPLC of each SCX fraction was carried out on an Agilent 1100 nanoflow LC

system (Palo Alto, CA) using a 75 µm (inner diameter) x 360 µm (outer diameter) x 10

cm long fused silica capillary column (Polymicro Technologies Inc.,

Phoenix, AZ) packed in house with 3 µm, 300 Å pore size C18 media (Vydac, Hesperia, CA). The column

was coupled to a hybrid linear ion trap-Fourier transform ion cyclotron resonance mass spectrometer

(LTQ-FT, ThermoElectron, San Jose, CA) using the nano-electrospray ionization source

supplied by the manufacturer. After injecting 5 µl of sample, the column was washed

for 30 minutes (at 0.5 µl/minute) with 2% B, and peptides were eluted (at 0.25 µl/minute)

using a linear gradient as follows: 2-60% B in 100 minutes, 60-98% B in 20 minutes,

98% B for 20 minutes. The column was re-equilibrated with 2% B for 30 minutes prior

to subsequent sample loading at a flow rate of 0.5 µl/minute. Mobile phase A

was 0.1% formic acid in H2O and mobile phase B was 0.1% formic acid in acetonitrile.

The MS was operated in a data-dependent mode where the five most intense ions

detected in each FTICR-MS scan (m/z 200-2000) were selected for MS/MS in the ion

trap (precursor selection from m/z 400-2000). Normalized collision energy of 36% was

employed for collision-induced dissociation (CID) along with dynamic exclusion of 90

seconds to reduce redundant selection of peptides for CID. The ESI voltage and the

heated capillary temperature were set at 1.6 kV and 160 °C, respectively.

Data Processing

CID spectra were analyzed using SEQUEST, on a Beowulf 20-node parallel virtual

machine cluster computer (ThermoElectron) against a non-redundant human

proteome database. A dynamic modification of +4.008 Da was set on the C-terminus for

18O labeled peptides. Required precursor ion mass tolerance was 0.08 Da in MS mode

and 0.5 Da in MS2 mode. Only peptides possessing tryptic termini (allowing for up to

two internal missed cleavages), delta-correlation scores (ΔCn) ≥0.08 and charge state-

dependent cross correlation (Xcorr) criteria: ≥1.9 for [M+H]1+ peptides, ≥2.2 for

[M+2H]2+ peptides, ≥3.5 for [M+3H]3+ and ≥4.5 for [M+4H]4+ peptides were considered

legitimate identifications. Relative abundances for differentially labeled isotopomeric

peptides were calculated from their mono-isotopic peaks and respective extracted ion

chromatogram areas calculated using XPRESS software (Thermoelectron, San Jose, CA)

and are reported as heavy-to-light 18O/16O ratio (i.e., 18O labeled COPD sample /16O

normal sample) for a particular peptide/protein.
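The identification filters and the heavy-to-light ratio described above can be sketched as follows. This is a minimal illustration and not the Bioworks/SEQUEST or XPRESS implementation: the `Psm` record, its field names and the example values are hypothetical, and reporting ratios below 1 as a negative reciprocal is an assumed convention.

```python
from dataclasses import dataclass

# Charge-state-dependent Xcorr minima and the delta-Cn cutoff quoted in the text.
XCORR_MIN = {1: 1.9, 2: 2.2, 3: 3.5, 4: 4.5}
DELTA_CN_MIN = 0.08
MAX_MISSED_CLEAVAGES = 2


@dataclass
class Psm:
    """Hypothetical peptide-spectrum match record (field names are assumptions)."""
    peptide: str
    charge: int
    xcorr: float
    delta_cn: float
    fully_tryptic: bool
    missed_cleavages: int
    area_light: float  # 16O extracted ion chromatogram area (control)
    area_heavy: float  # 18O extracted ion chromatogram area (COPD)


def passes_filters(psm: Psm) -> bool:
    """Apply the tryptic, delta-Cn and charge-dependent Xcorr criteria."""
    xcorr_min = XCORR_MIN.get(psm.charge)
    if xcorr_min is None:  # charge states without a stated threshold are rejected
        return False
    return (
        psm.fully_tryptic
        and psm.missed_cleavages <= MAX_MISSED_CLEAVAGES
        and psm.delta_cn >= DELTA_CN_MIN
        and psm.xcorr >= xcorr_min
    )


def heavy_to_light(psm: Psm) -> float:
    """18O/16O ratio from the two extracted ion chromatogram areas."""
    return psm.area_heavy / psm.area_light


def signed_fold_change(ratio: float) -> float:
    """Report ratios below 1 as the negative reciprocal (assumed convention)."""
    return ratio if ratio >= 1.0 else -1.0 / ratio


# Hypothetical record, for illustration only.
example = Psm("EXAMPLEPEPTIDEK", 2, 2.8, 0.21, True, 0, 1.0e6, 2.0e6)
if passes_filters(example):
    print(signed_fold_change(heavy_to_light(example)))
```

Keeping the thresholds in a single dictionary keyed by charge state makes it easy to see which charge states are covered by the stated criteria.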

Protein Annotation and Classification

Protein annotation properties were acquired using Protein Information and Knowledge

Extractor (PIKE) and Protein ANalysis THrough Evolutionary Relationships (PANTHER)

software (available at http://proteo.cnb.uam.es:8080/pike/ and

http://www.pantherdb.org/, respectively). Ingenuity Pathways Analysis (Ingenuity

Systems®, www.ingenuity.com) software was also used to retrieve protein information

through its knowledgebase and, importantly, to analyze potential protein-protein

interactions and group the identified proteins into signaling pathways. For the latter

purpose, we also used Cytoscape (v2.6.3, available on http://www.cytoscape.org/)

loaded with two different databases: human protein reference database (HPRD,

release date September 2007) and protein interaction network analysis (PINA, release

date April 2009). BiNGO, a Java-based tool that is implemented as a plug-in for

Cytoscape, was used to identify overrepresented gene ontology terms. For this

analysis we used the hypergeometric test, Benjamini & Hochberg's FDR correction for

multiple testing and a significance level of 0.05.
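The statistical settings above (hypergeometric test followed by Benjamini & Hochberg correction) can be illustrated with a short sketch. It is not BiNGO's own code; the function names are hypothetical and the numbers passed to them would come from the annotated dataset and a reference proteome.

```python
from math import comb


def hypergeom_sf(k: int, N: int, K: int, n: int) -> float:
    """P(X >= k) when drawing n proteins from N, of which K carry the GO term."""
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that monotonicity can be enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_terms(pvalues: dict[str, float], alpha: float = 0.05) -> list[str]:
    """GO terms whose adjusted p-value is below the significance level."""
    terms = list(pvalues)
    adjusted = benjamini_hochberg([pvalues[t] for t in terms])
    return [t for t, q in zip(terms, adjusted) if q < alpha]
```

A term is then called overrepresented when its adjusted p-value falls below 0.05, the significance level stated in the text.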

Transmembrane Domain Prediction and Hydropathicity Calculation

Alpha-helical transmembrane domains (TMD) were mapped using TMHMM available

at http://www.cbs.dtu.dk/services/TMHMM [16, 17] while protein grand average of

hydropathicity (GRAVY) scores [18] were calculated using ProtParam tool available at

the ExPASy Proteomics Server (http://www.expasy.org/tools/protparam.html).

Western Blot

Protein extracts obtained from lysed RBCs were quantified (BCA; Thermo Scientific

Pierce BCA protein assay) and 10 µg of each sample were separated in triplicate using

4-12% (w/v) polyacrylamide gels (NuPAGE Novex Bis-Tris, Invitrogen) and transferred onto

nitrocellulose membranes (Protran, Whatman). Membranes were probed with 1:6000

rabbit anti-CYB5R3 (Sigma-Aldrich), 1:250 mouse anti-ALDOA (Abcam), 1:1000 rabbit

anti-AARE (Abcam) or 1:200 rabbit anti-VPS13A (Abcam) for 2 h at room temperature and developed

using enhanced chemiluminescence - ECL (Thermo Scientific Pierce ECL Western

Blotting Substrate). Antibody dilutions were all made in PBS containing 5% (w/v) fat

free milk. All membranes were washed 5 times for 10 minutes with stripping buffer

[1.5 % (w/v) Glycine, 0.1 % SDS (w/v), 1 % Tween 20 (v/v), pH 2.2], twice with PBS and

once with PBS-T before reprobing with the next primary antibody. The abundance of

the tested proteins in RBCs was calculated from densitometry of immunoblots (n=3

replicates) using Progenesis PG200v2006 software (Nonlinear Dynamics). The

corresponding Ponceau-stained lane total intensity (nitrocellulose membrane) was

used for western blot normalization.
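The normalization step can be written out as a small sketch. It assumes that each replicate provides one band intensity and one Ponceau-stained lane total from the densitometry export; the function names and the pairing of values are hypothetical and do not reflect the Progenesis software interface.

```python
from statistics import mean


def normalized_abundance(band_intensity: float, lane_total_intensity: float) -> float:
    """Band densitometry signal divided by the Ponceau-stained lane total."""
    if lane_total_intensity <= 0:
        raise ValueError("lane total intensity must be positive")
    return band_intensity / lane_total_intensity


def relative_abundance(
    patient: list[tuple[float, float]],
    control: list[tuple[float, float]],
) -> float:
    """Mean normalized patient signal over mean normalized control signal.

    Each list holds (band intensity, lane total intensity) pairs, one per replicate.
    """
    patient_mean = mean(normalized_abundance(b, t) for b, t in patient)
    control_mean = mean(normalized_abundance(b, t) for b, t in control)
    return patient_mean / control_mean
```

A value below 1 from `relative_abundance` would indicate lower abundance in the patient pool than in the control pool.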

RESULTS

To search enriched erythrocyte membrane proteins for potential biomarkers

of COPD, erythrocyte microsomal fractions were prepared from pooled RBC ghost

samples isolated from peripheral blood of COPD patients (n=25) and healthy smoker

controls (n=28). Briefly, erythrocyte ghosts (cytological evaluation available on

Supplemental data – Figure 1 – SD_F1) were lysed by sonication in 50 mM ammonium

bicarbonate, centrifuged and resulting pellets incubated with sodium carbonate.

Samples were then normalized to the same protein amount and digested with trypsin before

they were labeled employing a trypsin catalyzed 18O exchange buffered system

containing 20% (v/v) methanol [13]. The COPD and control samples were pooled

together and separated into ten fractions through strong cation exchange

chromatography. Each of these fractions was loaded onto a reverse phase column

coupled online to a linear ion trap – ion cyclotron resonance mass spectrometer

operating in a data-dependent mode where the five most intense ions detected in

each FTICR-MS scan (m/z 200-2000) were selected for MS/MS in the ion trap. Resulting

data was searched via Bioworks (SEQUEST) employing standard filtering criteria (see

material and methods for details). Relative peptide/protein abundances between

COPD patients’ and controls’ samples were quantified using Xpress software

(Thermoelectron, San Jose, CA). The general workflow displaying the main steps

involved, ranging from RBC ghost purification, to mass spectrometry (MS) and

corresponding data analysis is shown in Figure III.1.

Figure III.1: Basic scheme of methodology showing main steps of sample preparation.

We estimated the false positive rate for this dataset to be less than 5%, in accordance with

probability-based evaluation of peptide and protein identifications from tandem mass

spectra and SEQUEST analysis of the human proteome [19]. A total of 4697 peptides

were quantified as present in both COPD and control spectra corresponding to 1083

proteins (Supplemental Table III.1, Supporting Information). Three-hundred and

fourteen proteins possessing at least two identified peptides were selected for relative

quantification by calculating the 18O/16O ratio using XPRESS software (Supplemental

Table III.1, Supporting Information). Figure III.2 shows the SCX chromatogram of

sample separation into ten fractions.

Figure III.2: SCX chromatogram displaying sample separation into ten fractions.

Two hundred and nineteen proteins were identified as significantly over- or

underexpressed in COPD samples when applying a 1.5-fold threshold to the dataset.

These proteins were further analyzed by bioinformatics tools for cellular location and

functional annotation. Due to inconsistency when classifying proteins according to

their subcellular location across different software tools, we chose to manually curate

each one of the identified proteins. In order to get more information, we used both

gene ontology annotations and ingenuity knowledgebase. We were able to classify 310

out of the 314 proteins. Sample preparation toward enrichment of membrane proteins

was successful as 46% of the identified proteins were membrane proteins as displayed

in Figure III.3.

Figure III.3: Subcellular location of the 314 proteins identified by at least two

peptides according to both gene ontology annotations and ingenuity systems

knowledgebase.

Moreover, some proteins categorized as cytoplasmic belong to the

cytoskeleton network, such as nebulin, spectrin or cytoskeleton-associated protein 5. We

also evaluated protein classification according to biological processes and molecular

functions for the whole dataset (proteins identified by at least two peptides) and for

differentially expressed proteins in COPD patients set by a threshold of 1.5-fold over-

or underexpression (Figure III.4) using PANTHER. This software was also used to get

the predominant pathways (Table III.3).

Transmembrane domain and hydrophobicity index analysis

We extended the analysis of our data using the TMHMM [16, 17] algorithm to map α-

helical integral membrane proteins and the GRAVY [18] index calculation to

characterize the hydropathic character of this dataset. The GRAVY index is a global

descriptor of protein solubility, and corresponds to the sum of hydrophobicity values

for each of the amino-acids in the protein, normalized according to protein length

(Note: proteins exhibiting positive GRAVY values were recognized as hydrophobic

while proteins exhibiting negative GRAVY values were recognized as hydrophilic).
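The GRAVY definition above can be made concrete with a short sketch. It assumes that the hydropathy scale of reference [18] is the Kyte-Doolittle scale, which is the scale used by the ProtParam tool; the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Sum of per-residue hydropathy values divided by the sequence length."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    # A KeyError is raised for residues outside the 20 standard amino acids.
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def is_hydrophobic(sequence: str) -> bool:
    """Positive GRAVY is read as hydrophobic, negative as hydrophilic."""
    return gravy(sequence) > 0
```

For example, a stretch rich in isoleucine, leucine and valine gives a positive score, whereas one rich in lysine, aspartate and glutamate gives a negative score.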

Figure III.4: Biological processes (panels A and C) and molecular functions (panels B

and D) for all proteins identified in both COPD patients and control subjects

(panels A and B) and for differentially (above 1.5-fold) expressed proteins only

(panels C and D). Information gathered from PANTHER software.

The TMHMM algorithm classified a total of 89 proteins as α-helical integral membrane

proteins possessing at least one transmembrane domain (Supplemental Table III.3,

Supporting Information). Of these, 40 proteins were characterized as hydrophobic

based on their positive GRAVY value. In addition, 16 proteins had a

positive GRAVY index although no transmembrane domain was predicted

(Supplemental Table III.3, Supporting Information). Therefore, according to GRAVY

index or TMHMM predictions, we were able to classify 34% of the dataset as potential

membrane proteins, which is consistent with what we found from information gathered

from protein databases (Figure III.3).

Table III.3: Predominant pathways associated with COPD patients when compared to

healthy smokers as provided by PANTHER.

Category name (PANTHER Accession) | # genes | Percent of gene hit against total # genes | Percent of gene hit against total # Pathway hits
Inflammation mediated by chemokine and cytokine signaling pathway (P00031) | 20 | 3.6% | 6.5%
Ubiquitin proteasome pathway (P00060) | 14 | 2.5% | 4.5%
Wnt signaling pathway (P00057) | 12 | 2.2% | 3.9%
Integrin signalling pathway (P00034) | 11 | 2.0% | 3.5%
Angiogenesis (P00005) | 10 | 1.8% | 3.2%
Parkinson disease (P00049) | 10 | 1.8% | 3.2%
Huntington disease (P00029) | 10 | 1.8% | 3.2%
PDGF signaling pathway (P00047) | 8 | 1.4% | 2.6%
B cell activation (P00010) | 8 | 1.4% | 2.6%
Cytoskeletal regulation by Rho GTPase (P00016) | 7 | 1.3% | 2.3%
Apoptosis signaling pathway (P00006) | 6 | 1.1% | 1.9%
T cell activation (P00053) | 6 | 1.1% | 1.9%
Nicotinic acetylcholine receptor signaling pathway (P00044) | 6 | 1.1% | 1.9%
Thyrotropin-releasing hormone receptor signaling pathway (P04394) | 6 | 1.1% | 1.9%
Oxytocin receptor mediated signaling pathway (P04391) | 6 | 1.1% | 1.9%

Hallmark Red Blood Cell Membrane Proteins

Figure III.5 highlights proteins that were identified in this study as part of the two

macromolecular complexes of membrane proteins of major importance to structural

integrity of RBC membrane [20]. As expected, the highest number of peptide counts

(1092), accounting for 23% of total peptide counts, belongs to the most abundant protein

in the RBC membrane, band 3, also termed anion exchanger 1. Proteins such as Glucose

transporter type 1, Glycophorin C, 55 kDa erythrocyte membrane protein, Kell blood

group glycoprotein, Erythrocyte band 7 integral membrane protein or Erythrocyte

phospholipid scramblase are examples of other important erythrocyte membrane

proteins that were also identified (Figure III.5). Additional information on relative

abundance of these RBC membrane proteins in COPD patients compared to healthy

controls is available in Supplemental Figure III.2 and Supplemental Table III.4,

Supporting Information.

Figure III.5: Proteins identified in both samples within the two main RBC membrane

protein complexes. Adapted from [20].

Differentially expressed proteins in COPD

Molecules showing large changes in their relative abundance between control and

patient groups were carefully evaluated. Top-ten overexpressed proteins identified in

COPD RBC microsomal fraction are displayed in Table III.4. Their functions are related

to transport (e.g., H+ transport across the cellular membranes), proteasomes, ATP-

binding cassettes, kinases, chemokines, among others. Underexpressed proteins are

associated with cytoskeleton networks [e.g., Chorein or Vacuolar protein sorting-

associated protein 13A (VPS13A), Kinesin-2 (KIF2A)] and xenobiotic metabolism [e.g.,

cytochrome b5 reductase 3 (CYB5R3), protein kinase C iota type (PRKCI)].

Table III.4: Ten most overexpressed proteins in COPD erythrocyte ghost as provided

by Ingenuity systems knowledgebase. a) Swiss-Prot/Uniprot accession number.

Accession number (a) | Description | Fold Change | Type | Location
P27449 | ATPase, H+ transporting, lysosomal 16kDa, V0 subunit c | 6.73 | transporter | Cellular membrane
Q15836 | Vesicle-associated membrane protein | 5.96 | other | Plasma membrane
Q13439 | golgi autoantigen, golgin subfamily a, 4 | 5.31 | other | Cellular membrane, Golgi membrane
P51665 | proteasome (prosome, macropain) 26S subunit, non-ATPase, 7 | 5.07 | other | Nucleus
P49721 | proteasome (prosome, macropain) subunit, beta type, 2 | 5.00 | peptidase | Nucleus
Q96L73 | nuclear receptor binding SET domain protein 1 | 4.44 | transcription regulator | Nucleus
Q9NRK6 | ATP-binding cassette, sub-family B (MDR/TAP), member 10 | 4.42 | transporter | Membrane fraction, mitochondria inner membrane
P02462 | collagen, type IV, alpha 1 | 4.04 | other | Basement membrane, extracellular space
Q9UL99 | hyaluronoglucosaminidase 4 | 3.53 | enzyme | Unknown
Q14146 | KIAA0133 | 3.36 | other | Cellular membrane, cytoplasm, nucleus

‘Interactome’ using Ingenuity Pathway Analysis

Data were also analyzed through the use of Ingenuity Pathway Analysis software

(Ingenuity® Systems, www.ingenuity.com) applying a 1.5-fold change (over or

underexpression) expression value cutoff. According to Ingenuity knowledgebase, from

the total 314 identified proteins submitted for analysis, 305 were found mapped and

159 molecules were found to be eligible for analysis. Eligible molecules were searched

against the Ingenuity knowledgebase and top-10 networks according to the statistic

score are displayed in Supplemental Figure III.3. Interestingly, the top-10 networks were

interconnected, so it was possible to merge them (Supplemental Figures

III.4 and III.5, Supporting Information, respectively). The top network (possessing the

highest statistic score) was related to cell-to-cell signaling and interaction,

hematological system development and function and immune response (Supplemental

Figure III.6, Supporting Information).

Four proteins associated with oxidative stress were found in the top-10

networks. Catalase (CAT) is associated with cellular movement, hematological system

development and function and immune response (network 2). CAT was found to be

overexpressed by about two-fold in COPD patients, as was peroxiredoxin 2

(PRDX2, network 1). In contrast, myeloperoxidase (MPO,

networks 1 and 4) and nuclear factor of kappa light polypeptide gene enhancer

(NFKB2, network 8) were found to be underexpressed by two fold in this study as

shown in Table III.5.

Table III.5: Proteins associated with oxidative stress present in top-10 networks. a)

Swiss-Prot/Uniprot accession number; b) According to ingenuity pathways analysis.

Gene Symbol | Description | Accession number (a) | Fold Change | Type | Networks (b)
CAT | Catalase | P04040 | 2.22 | enzyme | 2
MPO | Myeloperoxidase | P05164 | -2.22 | enzyme | 1, 4
NFKB2 | nuclear factor of kappa light polypeptide gene enhancer in B-cells 2 (p49/p100) | Q00653 | -2.86 | transcription regulator | 8
PRDX2 | peroxiredoxin 2 | P32119 | 2.31 | enzyme | 1

Interestingly, among the thirteen members of the solute carrier (SLC) family identified,

only two of them (SLC2A4, SLC30A1) were found overexpressed in COPD and this

overexpression was below the 1.5-fold threshold. All the other eleven members of this

family were quantified as underexpressed, six of which beyond the 1.5-fold threshold

(highlighted in Supplemental Table III.5, Supporting Information). A similar pattern was

observed for proteasome proteins, but with inverse expression. There were eleven

proteasome proteins quantified in this study and only two (PSMB1, PSMC2) were

downregulated in COPD (Supplemental Table III.6).

Protein-Protein Interactions using Cytoscape 2.6.3

Ingenuity Systems uses its own knowledge base to group proteins into pathways.

In contrast, Cytoscape is an open source bioinformatics software platform for

visualizing molecular interaction networks and biological pathways through an

interaction database that is selected by the user and that is matched to the dataset.

Figure III.6: Main protein-protein interaction network comprising 43 members

generated by Cytoscape 2.6.3 using PINA database. Red- Significantly differentially

overexpressed proteins in COPD; Green- Significantly differentially underexpressed

proteins in COPD; Purple- Not significantly differentially expressed proteins (1.5 fold

threshold).

In this work, two protein-protein interaction databases, human protein reference

database (HPRD) and protein interaction network analysis (PINA), were used to

investigate protein-protein interactions within our dataset alone, i.e. no first neighbors

were allowed.
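Restricting an interaction database to the dataset alone, with no first neighbors, amounts to keeping only edges whose two partners were both identified and then reading off the connected networks. The sketch below illustrates that idea; it is not how Cytoscape is implemented, the function name is hypothetical and the placeholder labels in the usage lines are not real interactions from HPRD or PINA.

```python
from collections import defaultdict


def dataset_networks(interactions, dataset):
    """Connected components of the interaction graph restricted to `dataset`."""
    members = set(dataset)
    adjacency = defaultdict(set)
    for a, b in interactions:
        # Keep an edge only if both partners were identified (no first neighbors).
        if a in members and b in members and a != b:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal of one network
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    # Largest network first.
    return sorted(components, key=len, reverse=True)


# Placeholder labels only; "X" is not in the dataset, so its edge is dropped.
edges = [("A", "B"), ("B", "C"), ("C", "X"), ("D", "E")]
networks = dataset_networks(edges, {"A", "B", "C", "D", "E"})
```

Merging two databases, as done with HPRD and PINA in this work, would correspond to passing the union of their edge lists to the same function.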

Figure III.7: Overrepresented biological processes (GO) for the differentially (above

1.5-fold) expressed proteins in COPD patients.

Using HPRD, we were able to generate two networks containing more than 3 proteins

(Supplemental Figure 7, Supporting Information). Of particular interest, there was a

group of interacting proteins comprising a network of 25 proteins (Supplemental

Figure 8, Supporting Information). However, using PINA it was possible to merge these

two networks through ANK1 and constitute a larger network (Supplemental Figure 9,

Supporting Information). Also, since PINA set the interaction between PSMD7 and

PSMD6 it was possible to add some more members constituting a network of about 43

proteins (Figure III.6). This network could reach at least 45 proteins by connecting

PLSCR1 to CRK and VAV1 as their interactions were shown when using HPRD

(Supplemental Figure 8, Supporting Information). PINA was the chosen approach for

evaluating differentially expressed proteins over 1.5-fold between patients and control

groups (Supplemental Figure 10, Supporting Information). A protein-protein

interaction network of 14 members was generated, from which only 3 proteins were

found to be overexpressed in COPD patients’ RBCs. It was interesting to observe that

generated protein-protein interactions exhibit the same pattern (over- or

underexpression) within their members. One of these networks grouped five

proteasome proteins, all found to be overexpressed in COPD patients – PSMD2,

PSMD3, PSMD6, PSMD7 and PSMA6.

BiNGO, a freely available Java-based tool that is implemented as a plug-in for

Cytoscape, was also employed. BiNGO determines which gene ontology (GO) categories

are statistically over- or underrepresented in the submitted gene list. Again, both the

whole dataset (Supplemental Figure 11, Supporting Information) and differentially

expressed proteins (over 1.5-fold, Figure III.7) subset were submitted. As expected,

overrepresented terms are quite similar when running both analyses. There were a

few terms that were only overrepresented when the whole dataset was submitted and

thus when looking into data generated from differentially expressed proteins only, two

branches were maintained: terms related to regulation of ubiquitin-protein ligase and

catalytic activity and to proteasomal ubiquitin-dependent protein catabolic process.

Western blot validation

Biochemical validation of the results obtained from MS was performed by WB analysis.

Equal amounts of total protein extracts obtained from either controls or patients were

used in three independent experiments to quantify the relative abundance of NADH-

cytochrome b5 reductase 3 (CYB5R3), fructose-bisphosphate aldolase A (ALDOA),

acylamino-acid-releasing enzyme (AARE) and vacuolar protein sorting-associated

protein 13A (VPS13A) by densitometric analysis of immunoblots (Figure III.8). In the

absence of an internal housekeeping control, normalization to Ponceau-stained full

lane membrane intensity was the employed method. Data obtained by WB presented

the same expression trend, therefore confirming the results previously obtained by

MS, except for AARE where WB showed decreased abundance of this protein in COPD

patients, while the opposite was observed by MS.

Figure III.8: Western blot validation showing both representative close-up views of

each Ab reaction and graphic representation of the relative normalized abundance of

(A) Acylamino-acid-releasing enzyme (AARE), (B) ALDOA, (C) VPS13A and (D)

CYB5R3, using the full intensity of the respective Ponceau-stained lane in the

nitrocellulose membrane for normalization (n=3 independent replicates/each Ab

reaction). The antigen–antibody complex was detected by ECL (GE Healthcare) and

Progenesis PG200v2006 software (Nonlinear Dynamics) was used for densitometry

analysis.

DISCUSSION

Human RBCs have a life-span of about 120 days and during this time they travel across

the body through the blood stream and communicate with different types of

metabolites, cells and tissues. COPD is not known to affect RBCs directly, but may be

responsible for alterations that could be exploited at the proteome level for diagnostic

purposes or at least to gain a better understanding of the disease and its implications

in RBCs. The RBC proteome was believed to be simple as RBCs lack internal organelles.

However, in the past years, the knowledge of the RBC proteome increased

dramatically and changed this idea [21-29]. This fact is due not only to the

development of sample preparation (fractionation/enrichment) techniques, but mostly

to the use of sophisticated mass spectrometers, appropriate search algorithms and to

comprehensive human protein databases.

The RBC membrane is a composite structure in which spectrin-actin based membrane

skeletal network is coupled to a lipid bilayer either by direct interaction with lipids or

by linker proteins which interact simultaneously with the cytoplasmic domain of

transmembrane proteins and spectrin. Linkage of membrane proteins such as stomatin

(identified by 339 peptides), glycophorin C (132 peptides identified) and band

3 (1092 peptides identified) to the spectrin-actin based skeleton by linker proteins 4.1R

and ankyrin has been well established for over two decades. These two major

complexes – ankyrin and 4.1R complexes – are shown in Figure III.5 where proteins

identified within the present study are reported as highlighted (see Supplemental

Figure III.2 and Supplemental Table III.4, Supporting Information for additional

information on relative abundance of these proteins). Erythrocyte morphology, their

mechanical deformability and elasticity are crucial for them to preserve their function

in oxygen uptake, especially as they pass through the circulation at lung level,

which must occur in an ordered and sequential manner. In the present study it

was possible to identify an important number of differentially expressed proteins

directly or indirectly linked to the erythrocyte plasma membrane as integral

membrane proteins or cytoskeletal proteins that may lead to alterations in the COPD

patients’ erythrocyte membrane.

Among the most overexpressed proteins found is ATPase, H+ transporting, lysosomal

16 kDa, V0 subunit c (ATP6V0C) which is related to oxidative stress. V1 domain is

cytosolic and it is the ATP catalytic site whereas V0 is the transmembrane domain. This

protein was reported to be a proton-transporting two-sector ATPase complex across

membranes [30] and is involved in oxidative phosphorylation pathway. Golgin-

245/p230 (GOLGA4), overexpressed in COPD, is reported to be essential for

intracellular trafficking and cell surface delivery of tumor necrosis factor-α (TNF) [31].

TNF is the main proinflammatory cytokine made and secreted by inflammatory

macrophages; it enhances the activation and recruitment of T-cells and ensures robust

innate and acquired immune responses. ATP-binding cassette, subfamily B (MDR/TAP)

member 10 (ABCB10) or ABC-me (ABC-mitochondrial erythroid), also overexpressed in

COPD, is located in the inner mitochondrial membrane and was suggested to play a

role in erythroid differentiation. ABC-me is induced during erythroid differentiation in

cell lines, where heme biosynthesis predominantly occurs, and its overexpression

enhances hemoglobin synthesis in erythroleukemia cells [32, 33]. Yet, its physiological

role in humans is still an open question. Hyaluronoglucosaminidase 4 (HYAL4), a

protein whose subcellular location is still unknown was also found to be overexpressed

in COPD patients and this protein’s activity is reported to be regulated by IL1B, an

interleukin that is associated with inflammation in airway diseases [34, 35].

Chorein or Vacuolar protein sorting-associated protein 13A (VPS13A) is a large protein

whose predicted molecular weight is approximately 360 kDa that is reported to play a

role in the cytoskeleton and intracellular transport (most likely Golgi to endosome

transport) and defects in this protein are associated with the presence of

acanthocytosis, thorny deformations of circulating erythrocytes, possibly due to red

cell membrane deformation [36, 37]. The mechanism by which acanthocytosis

arises is not known, but it is hypothesized to be due to expansion of the outer leaflet

of the lipid bilayer of RBC membranes, in contrast to stomatocytosis, which results

from expansion of the inner leaflet [38]. This protein was found to be underexpressed

in COPD patients when compared to controls by MS and this underexpression was

confirmed by WB. Consequently, changes in VPS13A may play a central role in the

deformation of COPD RBCs that has already been reported before [12]. Another

protein associated with the cytoskeleton found to be underexpressed in COPD in this

study was kinesin heavy chain member 2A (KIF2A). Kinesin-2 is one of the most

ubiquitously expressed of the molecular motors known as kinesin superfamily proteins

(KIFs) that have been implicated as key players in the intracellular transport system, which

is essential for cellular function and morphology [39]. Members of this superfamily

have been shown to transport membrane-bound organelles, protein complexes and

mRNAs to specific destinations along microtubules while hydrolyzing ATP for energy

[40]. Kinesin-2 interacts with tumor suppressor adenomatous polyposis coli (APC) and

this interaction is essential for transport of APC along microtubules to the tips of

membrane protrusions [41, 42]. Kinesin-2 is a heterotrimeric complex composed of a

KIF3A/3B heterodimer and an adaptor protein, the kinesin superfamily-associated

protein 3 (KAP3). APC interacts with KIF3A/3B via an association with KAP3. APC

activates APC-stimulated guanine nucleotide exchange factor (Asef) and regulates the

actin cytoskeletal networks, cell morphology, adhesion and migration [41]. In addition

to VPS13A underexpression, KIF2A underexpression may also be contributing to the changes in COPD

patients’ RBC shape, which in turn may be responsible for defective oxygen uptake

and delivery.

Cytochrome b5 reductase 3 (CYB5R3) was also found to be underexpressed in COPD

and its underexpression in COPD patients RBCs was confirmed by WB. The enzyme

cytochrome b5 reductase 3 catalyses the transfer of reducing equivalents from the

physiological electron donor, NADH, generated in the Embden-Meyerhof pathway, to

the small haemoprotein cytochrome b5. As indicated by its alternative name, methaemoglobin

reductase, one of the major activities of this protein is the reduction of

methaemoglobin. Hence, deficiency of this protein is associated with

methaemoglobinemia. When oxygen is released to the tissues, the iron atom remains

in the ferrous (Fe2+) state. In contrast, if the atom is oxidized to the ferric state (Fe3+) as the

result of oxidative stress (e.g. as a result of tobacco smoke), it lacks the electron

required to combine with oxygen and so, once the heme moiety is in the ferric state,

hemoglobin is incapable of transporting oxygen [45, 46]. The oxidative stress caused by

tobacco smoke leads to consequences in the oxidant and antioxidant balance [47],

which is one of the hallmarks of the disease [48].

Antioxidant proteins catalase and peroxiredoxin 2 were found 2.2 and 2.3-fold

overexpressed, respectively. Catalase is a homotetrameric antioxidant enzyme that

decomposes hydrogen peroxide into water and oxygen and is especially concentrated

in erythrocytes. Catalase has been shown to be increased in hyperoxia and its

significance in pulmonary defense, especially at the alveolar level is important [49].

Peroxiredoxins comprise a large group of proteins whose function is to catalyse the

degradation of lipid hydroperoxides and hydrogen peroxide [49, 50]. Peroxiredoxins are

a recently described family of nonselenoperoxidases that catalyse the reduction of a

broad spectrum of peroxides. Their function can be also associated with cellular

signaling mechanism during oxidative stress and, in human lungs, peroxiredoxins have

been implicated to have an important role in protection against exogenous as well as

endogenous oxidant challenge [51]. Peroxiredoxin 2 has been reported to be elevated

in lung carcinomas [52]. Another interesting fact is that amongst the thirteen members

of the solute carrier family, only two of them were found overexpressed in COPD and

this overexpression was below the 1.5-fold threshold. All the other members of this

family that were identified were underexpressed, six of which beyond the 1.5-fold

threshold (Supplemental Table 5, Supporting Information). A similar pattern was

observed for proteasome proteins, but with inverse expression. There were eleven

proteasome proteins quantified in this study and only two were downregulated in

COPD (Supplemental Table 6, Supporting Information). One of these is PSMB2 whose

activity is regulated by nicotine [53].

CONCLUSION

This work intended to provide new insights into which events may be leading to RBC

membrane deformation in COPD patients. These events have been reported before by

means of electron microscopy, but no information was available at the

molecular level. In the present study it was possible to find some

differentially expressed proteins that may be contributing to this process of membrane

deformation. Chorein (VPS13A), a large protein whose predicted molecular weight is

approximately 360 kDa, was reported to play a role in the cytoskeleton and

intracellular transport. Importantly, defects in this protein are associated with the

presence of acanthocytosis, thorny deformations of circulating erythrocytes, possibly

due to red cell membranes deformation. This protein was found to be underexpressed

in COPD patients when compared to controls by MS and this underexpression was

confirmed by WB. For a considerable number of proteins little or no information

was available, which made it difficult to understand from our dataset which

other proteins may be involved in this deformation. The

final step will be to set the connection between what is happening at RBCs’ membrane

level to the pathophysiology of COPD and to this extent new studies will be necessary

to assess whether RBCs can be a source of biomarkers for the prognosis/diagnosis of

this disease.

ACKNOWLEDGEMENTS

The authors would like to thank all patients and healthy individuals who voluntarily

participated in this study; Maria Teresa Seixas and colleagues from Laboratorio de

Hematologia, Instituto Nacional de Saude Dr. Ricardo Jorge (INSA) for complete blood

count (CBC); Pedro Loureiro, Arminda Vilares and Ana Cardoso for blood collection;

colleagues from Laboratorio de Proteomica (INSA) and National Cancer Institute at

Frederick (NCI-SAIC); Dr. Patricia Gomes-Alves for assistance with densitometry

including the respective statistical analysis. This work was partially supported by

Fundação para a Ciência e a Tecnologia (FCT)/FEDER (POCTI/SAU-MMO/56163/2004),

FCT/Poly-Annual Funding Program and FEDER/Saude XXI Program (Portugal). BMA and

NC are recipients of FCT doctoral fellowships (SFRH/BD/31415/2006 and

SFRH/BD/27906/2006).

REFERENCES

[1] Barnes, P. J., Shapiro, S. D., Pauwels, R. A., Chronic obstructive pulmonary disease:

molecular and cellular mechanisms. Eur Respir J 2003, 22, 672-688.

[2] Barnes, P. J., Stockley, R. A., COPD: current therapeutic interventions and future

approaches. Eur Respir J 2005, 25, 1084-1106.

[3] Global initiative for chronic obstructive lung disease 2008.

[4] Lopez, A. D., Shibuya, K., Rao, C., Mathers, C. D., et al., Chronic obstructive

pulmonary disease: current burden and future projections. Eur Respir J 2006, 27, 397-

412.

[5] Geneva 2000.

[6] Pauwels, R. A., Rabe, K. F., Burden and clinical features of chronic obstructive

pulmonary disease (COPD). Lancet 2004, 364, 613-620.

[7] Goodman, S. R., Kurdia, A., Ammann, L., Kakhniashvili, D., Daescu, O., The human

red blood cell proteome and interactome. Experimental biology and medicine

(Maywood, N.J.) 2007, 232, 1391-1408.

[8] Agusti, A., Systemic effects of chronic obstructive pulmonary disease: what we

know and what we don't know (but should). Proceedings of the American Thoracic

Society 2007, 4, 522-525.

[9] Agusti, A., Soriano, J. B., COPD as a systemic disease. Copd 2008, 5, 133-138.

[10] Agusti, A., Systemic effects of COPD: just the tip of the Iceberg. Copd 2008, 5, 205-

206.

[11] Lucantoni, G., Pietraforte, D., Matarrese, P., Gambardella, L., et al., The red blood

cell as a biosensor for monitoring oxidative imbalance in chronic obstructive

pulmonary disease: an ex vivo and in vitro study. Antioxidants & redox signaling 2006,

8, 1171-1182.

[12] Santini, M. T., Straface, E., Cipri, A., Peverini, M., et al., Structural alterations in

erythrocytes from patients with chronic obstructive pulmonary disease. Haemostasis

1997, 27, 201-210.

[13] Blonder, J., Chan, K. C., Issaq, H. J., Veenstra, T. D., Identification of membrane

proteins from mammalian cell/tissue using methanol-facilitated solubilization and

tryptic digestion coupled with 2D-LC-MS/MS. Nature protocols 2006, 1, 2784-2790.

[14] Dodge, J. T., Mitchell, C., Hanahan, D. J., The preparation and chemical

characteristics of hemoglobin-free ghosts of human erythrocytes. Archives of

biochemistry and biophysics 1963, 100, 119-130.

[15] Blonder, J., Yu, L. R., Radeva, G., Chan, K. C., et al., Combined chemical and

enzymatic stable isotope labeling for quantitative profiling of detergent-insoluble

membrane proteins isolated using Triton X-100 and Brij-96. Journal of proteome

research 2006, 5, 349-360.

[16] Sonnhammer, E. L., von Heijne, G., Krogh, A., A hidden Markov model for

predicting transmembrane helices in protein sequences. Proceedings / ... International

Conference on Intelligent Systems for Molecular Biology ; ISMB 1998, 6, 175-182.

[17] Krogh, A., Larsson, B., von Heijne, G., Sonnhammer, E. L., Predicting

transmembrane protein topology with a hidden Markov model: application to

complete genomes. Journal of molecular biology 2001, 305, 567-580.

[18] Kyte, J., Doolittle, R. F., A simple method for displaying the hydropathic character

of a protein. Journal of molecular biology 1982, 157, 105-132.

[19] Qian, W. J., Liu, T., Monroe, M. E., Strittmatter, E. F., et al., Probability-based

evaluation of peptide and protein identifications from tandem mass spectrometry and

SEQUEST analysis: the human proteome. Journal of proteome research 2005, 4, 53-62.

[20] Mohandas, N., Gallagher, P. G., Red cell membrane: past, present, and future.

Blood 2008, 112, 3939-3948.

[21] Low, T. Y., Seow, T. K., Chung, M. C., Separation of human erythrocyte membrane

associated proteins with one-dimensional and two-dimensional gel electrophoresis

followed by identification with matrix-assisted laser desorption/ionization-time of

flight mass spectrometry. Proteomics 2002, 2, 1229-1239.

[22] Tyan, Y. C., Jong, S. B., Liao, J. D., Liao, P. C., et al., Proteomic profiling of

erythrocyte proteins by proteolytic digestion chip and identification using

two-dimensional electrospray ionization tandem mass spectrometry. Journal of proteome

research 2005, 4, 748-757.

[23] Kakhniashvili, D. G., Bulla, L. A., Jr., Goodman, S. R., The human erythrocyte

proteome: analysis by ion trap mass spectrometry. Mol Cell Proteomics 2004, 3, 501-

509.

[24] Bruschi, M., Seppi, C., Arena, S., Musante, L., et al., Proteomic analysis of

erythrocyte membranes by soft Immobiline gels combined with differential protein

extraction. Journal of proteome research 2005, 4, 1304-1309.

[25] Pasini, E. M., Kirkegaard, M., Mortensen, P., Lutz, H. U., et al., In-depth analysis of

the membrane and cytosolic proteome of red blood cells. Blood 2006, 108, 791-801.

[26] Roux-Dalvai, F., Gonzalez de Peredo, A., Simo, C., Guerrier, L., et al., Extensive

analysis of the cytoplasmic proteome of human erythrocytes using the peptide ligand

library technology and advanced mass spectrometry. Mol Cell Proteomics 2008, 7,

2254-2269.

[27] Pasini, E. M., Lutz, H. U., Mann, M., Thomas, A. W., Red Blood Cell (RBC)

membrane proteomics - Part II: Comparative proteomics and RBC patho-physiology.

Journal of proteomics 2009.

[28] Liumbruno, G., D'Alessandro, A., Grazzini, G., Zolla, L., Blood-related proteomics.

Journal of proteomics 2009.

[29] Alexandre, B. M., Proteomic mining of the red blood cell: focus on the membrane

proteome. Expert review of proteomics 2010, 7, 165-168.

[30] Simckes, A. M., Swanson, S. K., White, R. A., Chromosomal localization of three

vacuolar-H+ -ATPase 16 kDa subunit (ATP6V0C) genes in the murine genome.

Cytogenetic and genome research 2002, 97, 111-115.

[31] Lieu, Z. Z., Lock, J. G., Hammond, L. A., La Gruta, N. L., et al., A trans-Golgi network

golgin is required for the regulated secretion of TNF in activated macrophages in vivo.

Proceedings of the National Academy of Sciences of the United States of America 2008,

105, 3351-3356.

[32] Herget, M., Tampe, R., Intracellular peptide transporters in human--

compartmentalization of the "peptidome". Pflugers Arch 2007, 453, 591-600.

[33] Shirihai, O. S., Gregory, T., Yu, C., Orkin, S. H., Weiss, M. J., ABC-me: a novel

mitochondrial transporter induced by GATA-1 during erythroid differentiation. The

EMBO journal 2000, 19, 2492-2502.

[34] Anderson, G. P., COPD, asthma and C-reactive protein. Eur Respir J 2006, 27, 874-

876.

[35] Rahman, I., Adcock, I. M., Oxidative stress and redox regulation of lung

inflammation in COPD. Eur Respir J 2006, 28, 219-242.

[36] Kurano, Y., Nakamura, M., Ichiba, M., Matsuda, M., et al., In vivo distribution and

localization of chorein. Biochemical and biophysical research communications 2007,

353, 431-435.

[37] Walker, R. H., Liu, Q., Ichiba, M., Muroya, S., et al., Self-mutilation in chorea-

acanthocytosis: Manifestation of movement disorder or psychopathology? Mov Disord

2006, 21, 2268-2269.

[38] Iolascon, A., Perrotta, S., Stewart, G. W., Red blood cell membrane defects.

Reviews in clinical and experimental hematology 2003, 7, 22-56.

[39] Scholey, J. M., Kinesin-II, a membrane traffic motor in axons, axonemes, and

spindles. The Journal of cell biology 1996, 133, 1-4.

[40] Miki, H., Okada, Y., Hirokawa, N., Analysis of the kinesin superfamily: insights into

structure and function. Trends in cell biology 2005, 15, 467-476.

[41] Akiyama, T., Kawasaki, Y., Wnt signalling and the actin cytoskeleton. Oncogene

2006, 25, 7538-7544.

[42] Jimbo, T., Kawasaki, Y., Koyama, R., Sato, R., et al., Identification of a link between

the tumour suppressor APC and the kinesin superfamily. Nature cell biology 2002, 4,

323-327.

[43] Liao, R., Sun, J., Zhang, L., Lou, G., et al., MicroRNAs play a role in the development

of human hematopoietic stem cells. Journal of cellular biochemistry 2008, 104, 805-

817.

[44] Sasaki, T., Shiohama, A., Minoshima, S., Shimizu, N., Identification of eight

members of the Argonaute family in the human genome. Genomics

2003, 82, 323-330.

[45] Percy, M. J., Lappin, T. R., Recessive congenital methaemoglobinaemia:

cytochrome b(5) reductase deficiency. British journal of haematology 2008, 141, 298-

308.

[46] Jaffe, E. R., Methemoglobin pathophysiology. Progress in clinical and biological

research 1981, 51, 133-151.

[47] Mak, J. C., Pathogenesis of COPD. Part II. Oxidative-antioxidative imbalance. Int J

Tuberc Lung Dis 2008, 12, 368-374.

[48] Santos, M. C., Oliveira, A. L., Viegas-Crespo, A. M., Vicente, L., et al., Systemic

markers of the redox balance in chronic obstructive pulmonary disease. Biomarkers

2004, 9, 461-469.

[49] Rahman, I., Biswas, S. K., Kode, A., Oxidant and antioxidant balance in the airways

and airway diseases. European journal of pharmacology 2006, 533, 222-239.

[50] Chae, H. Z., Robison, K., Poole, L. B., Church, G., et al., Cloning and sequencing of

thiol-specific antioxidant from mammalian brain: alkyl hydroperoxide reductase and

thiol-specific antioxidant define a large family of antioxidant enzymes. Proceedings of

the National Academy of Sciences of the United States of America 1994, 91, 7017-7021.

[51] Lehtonen, S. T., Markkanen, P. M., Peltoniemi, M., Kang, S. W., Kinnula, V. L.,

Variable overoxidation of peroxiredoxins in human lung cells in severe oxidative stress.

American journal of physiology 2005, 288, L997-1001.

[52] Lehtonen, S. T., Svensk, A. M., Soini, Y., Paakko, P., et al., Peroxiredoxins, a novel

protein family in lung cancer. International journal of cancer 2004, 111, 514-521.

[53] Rezvani, K., Teng, Y., Shim, D., De Biasi, M., Nicotine regulates multiple synaptic

proteins by inhibiting proteasomal activity. J Neurosci 2007, 27, 10508-10519.

Chapter IV

A Comparative, Global Proteomic

Analysis of Human Nasal

Epithelial Cells Obtained by Nasal

Brushing in Nonsmoking versus

Smoking Healthy Individuals

A comparative, global proteomic analysis of human nasal epithelial cells obtained by

nasal brushing in nonsmoking versus smoking healthy individuals

Bruno M. Alexandre1, Nicholas W. Bateman2, Brian L. Hood2, Mai Sun2, Thomas P.

Conrads2* and Deborah Penque1*

1Laboratorio de Proteómica, Departamento de Genética, Instituto Nacional de Saúde

Dr. Ricardo Jorge (INSA-IP), Av. Padre Cruz 1649-016 Lisboa, Portugal and the

2Department of Pharmacology & Chemical Biology, University of Pittsburgh Cancer

Institute, University of Pittsburgh

Keywords: Nasal epithelial cells, nasal brushing, tobacco, lung, cigarette smoke,

proteomics.

*Corresponding authors: Deborah Penque, Ph.D., Laboratório de Proteómica,

Departamento de Genética, Edifício INSA II, Instituto Nacional de Saúde Dr. Ricardo

Jorge, INSA, I.P., Avenida Padre Cruz, 1649-016 Lisboa, Portugal, Tel: +351 21750 8137,

Fax: +351 21752 6410, E-mail: [email protected] and Thomas P.

Conrads, Ph.D., 204 Craft Avenue, Suite B401, Pittsburgh, PA, 15213, Tel: 412-641-

7556, Fax: 412-641-2356, E-mail: [email protected]

ABSTRACT

Cigarette smoking is the leading cause of preventable death worldwide, and yet

premature tobacco-attributable deaths are projected to rise from 5.4 million in 2004

to 8.3 million in 2030, about 10% of all deaths worldwide, with more than 80% occurring

in developing countries. The nasal epithelium is the initial point of contact of the

respiratory tract with the external environment, and is continuously exposed to

irritating particles and chemicals such as those present in cigarette smoke.

To the best of our knowledge, this is the first study in which the proteome of nasal

epithelial cells obtained from smokers is described and compared with that of nonsmokers.

Samples were analyzed on a high-resolution mass spectrometer, which

yielded over 900 protein identifications supported by two or more peptides.

Ninety-six proteins were found to be differentially expressed between the proteomes

of healthy smokers and nonsmokers; these were related to antigen

presentation, cell-to-cell signaling and interaction, cell morphology, drug metabolism,

DNA repair, energy production and mitochondrial dysfunction. Previous findings, such as

the overexpression of CD44 and MUC5AC due to cigarette smoking, were confirmed;

importantly, many proteins related to these and other processes had

never before been associated with cigarette smoking.

INTRODUCTION

Cigarette smoking is the leading cause of preventable death worldwide and yet,

despite anti-smoking campaign efforts from the European Respiratory Society [1],

American Thoracic Society [2] or the World Health Organization (WHO) [3], the

number of smokers keeps increasing. Thus, the global epidemic of tobacco-associated

diseases has progressively grown. According to the 2008 WHO Report, premature

tobacco-attributable deaths from ischemic heart disease, cerebrovascular disease,

chronic obstructive pulmonary disease among others are projected to rise from 5.4

million in 2004 to 8.3 million in 2030, about 10% of the deaths worldwide, with more

than 80% in developing countries [4]. As troubling as the medical impact of

smoking is its economic burden: the recent estimate of 500 billion US dollars

spent globally each year on caring for and treating tobacco-related illnesses is

projected to reach 1 trillion by 2030 [3].

Cigarette smoke is a complex mixture of over 4000 substances, including antigenic,

cytotoxic, mutagenic and carcinogenic agents, that are inhaled directly or as products

of high temperature combustion on the end of the cigarette [5, 6]. This includes high

levels of oxidants and reactive oxygen species (ROS) detected in both mainstream and

sidestream smoke [5-7]. Cigarette smoke-mediated oxidative stress produces DNA

damage and activates survival signalling cascades resulting in uncontrolled cell

proliferation and transformation [7, 8]. The oxidative stress that ensues when the

antioxidant defenses are depleted is accompanied by a further increase in ROS

production in lung epithelial cells [8]. The nasal epithelium is the initial point of contact

of the respiratory tract with the external environment, and is continuously exposed to

irritating particles and chemicals such as those present in cigarette

smoke, viruses, bacteria, airborne allergens and other environmental pollutants. The

major function of the nasal epithelium has been regarded to be primarily that of a

physical barrier, but recent evidence strongly supports that epithelial cells are quite

active metabolically and capable of modulating a variety of inflammatory processes

and immune responses [9, 10].

Although investigations into the effects of cigarette smoke are plentiful, to the best of

our knowledge no work has so far described the proteomic alterations induced in nasal

epithelial cells by chronic smoking. Moreover, reported studies on

cigarette smoke are often based on animal models or on cultured cells treated with

cigarette smoke extract. The culture conditions of airway epithelial cells, and their

proliferation and immortalization, may influence their protein expression levels and

therefore their function. To overcome these issues, the present investigation was conducted

using freshly obtained epithelial cells collected by nasal brushing from nonsmokers and

smokers. Our group has demonstrated that nasal brushing is capable of

yielding numerous and well-preserved dissociated cells that are representative of the

human superficial respiratory mucosa [11], and has shown the utility of these cells in the

proteomic study of the monogenic disease cystic fibrosis [12, 13]. Nasal epithelial cells have

also been reported to constitute an accessible surrogate for studying lower airway

inflammation [14]. Liquid chromatography-tandem mass spectrometry and spectral

counting were used to derive the protein abundance changes in nasal epithelial cells

caused by chronic exposure to cigarette smoke in smokers as compared to

nonsmokers.

MATERIALS AND METHODS

Individuals and Sample Collection

The study was approved by the Ethics Committees of both Hospital de Santa Maria,

Lisbon, and Instituto Nacional de Saúde Dr. Ricardo Jorge (INSA), Lisbon. After informed

consent, nasal epithelial cells were collected by nasal brushing as previously described

[11, 15] from nonsmokers (n=10) and cigarette smokers (n=8). To be

included in the smoker group, individuals had to have smoked for at least 20 years,

at a rate of at least 10 cigarettes per day. No subject presented signs or symptoms of

any respiratory or other chronic disease. Lung function was evaluated by

spirometry, and FEV1/FVC > 0.7 was set as the criterion for normal lung function.

Nonsmokers and smokers were matched for age (54±3.1 and 51±5.4 years

old, respectively) and gender (20% and 12.5% males, respectively). Cell suspensions

from each individual were cytospun onto a microscopy slide, stained with

May-Grünwald-Giemsa (MGG) and examined for evidence of epithelial cells

(ciliated, goblet and basal cells) and for red blood cell contamination. Only samples

presenting about 0-1% red blood cell contamination were included in the study.

Nasal Epithelial Cells Lysis

Cell suspensions were centrifuged after collection and pelleted cells were resuspended

in the presence of 10 mM Tris-Cl pH 7.6 in 1 mM EDTA containing protease inhibitors.

Cells were lysed by intermittent sonication (10 cycles of a 10 s pulse followed

by a 30 s pause on dry ice). Lysates were centrifuged twice at 2000 x g for 3 min at 4 °C

to discard any unlysed cells or cell debris. Before storing at -80 °C, an aliquot of 10 μL

from each individual was removed to perform a BCA protein assay (Pierce, Rockford IL,

USA).

Sample Preparation for LC-MS/MS

Two biological replicates were constituted for each of the groups under analysis,

nonsmokers and smokers (Table IV.1). Each biological replicate, containing 30 μg of

total cell lysate, was spiked with 3 pmol of chicken

ovalbumin.

Table IV.1: Main characteristics of the biological replicates of the samples under

analysis.

Pool | Pool n | Biological Replicate | Replicate n | Age (y) | FVC (%) | FEF (%) | FEV1 (%) | FEV1/FVC (%)
Nonsmokers | 10 | 1 | 5 | 54 ± 2.9 | 97 ± 20.5 | 81 ± 17.9 | 98 ± 19.7 | 86 ± 4.1
Nonsmokers | 10 | 2 | 5 | 53 ± 3.6 | 98 ± 10.4 | 84 ± 17.6 | 98 ± 13.8 | 88 ± 11.8
Smokers | 8 | 1 | 4 | 52 ± 7.7 | 112 ± 19.5 | 68 ± 6.5 | 108 ± 15.6 | 79 ± 3.1
Smokers | 8 | 2 | 4 | 50 ± 1.7 | 103 ± 18.9 | 76 ± 24.0 | 98 ± 21.9 | 80 ± 1.9

Each sample was loaded into duplicate lanes of a 4-12% bis-tris 1D SDS-PAGE

gel (NuPAGE, Invitrogen, Carlsbad, CA) and electrophoresed for approximately 10 min

at a constant voltage of 150 V. Gels were stained with Coomassie blue (SimplyBlue

SafeStain, Invitrogen) and bands belonging to the same sample were excised, sliced

into small pieces and pooled into the same tube. Gel slices were destained in

50% acetonitrile (ACN) and 50 mM ammonium bicarbonate (AMB) overnight at 4 °C

and for another hour the next morning. Fully destained gel slices were dehydrated in

100% ACN. Gel slices were then rehydrated in 25 mM AMB containing 20 µg/mL

porcine sequencing grade modified trypsin (Promega, Madison, WI) on ice for 45 min.

This solution was discarded, a 25 mM AMB solution was added, and the gel slices were

incubated overnight at 37 °C. Tryptic peptides were extracted with 70% ACN and 5%

formic acid (FA) and dried by vacuum centrifugation. Each digest was resuspended in

60 µL of 0.1% trifluoroacetic acid (TFA).
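As a rough consistency check on the loading figures quoted in this chapter (30 μg of lysate and 3 pmol of ovalbumin per replicate, 60 µL resuspension volume, and the 2 µL injections reported in the LC-MS/MS section), the amount delivered per injection can be estimated. The sketch below assumes complete recovery through digestion and extraction, which is an idealization not stated in the text; the function and variable names are illustrative only.

```python
# Back-of-the-envelope load per LC-MS/MS injection.
# Assumption (not stated in the text): 100% recovery through in-gel digestion.

LYSATE_UG = 30.0        # total protein per biological replicate (ug)
OVALBUMIN_PMOL = 3.0    # chicken ovalbumin spike per replicate (pmol)
RESUSPENSION_UL = 60.0  # volume of 0.1% TFA used to resuspend the digest (uL)
INJECTION_UL = 2.0      # volume per injection (uL)


def per_injection(total_amount: float, total_volume_ul: float, injected_ul: float) -> float:
    """Amount delivered in one injection, assuming a homogeneous solution."""
    return total_amount * injected_ul / total_volume_ul


protein_ug = per_injection(LYSATE_UG, RESUSPENSION_UL, INJECTION_UL)
ovalbumin_fmol = per_injection(OVALBUMIN_PMOL, RESUSPENSION_UL, INJECTION_UL) * 1000.0

print(f"protein per injection: {protein_ug:.2f} ug")
print(f"ovalbumin per injection: {ovalbumin_fmol:.0f} fmol")
```

Under these assumptions the estimate agrees with the 1 μg of total protein per injection stated in the text, and it suggests roughly 100 fmol of the ovalbumin standard on column per injection.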

Proteomic analysis by liquid chromatography-tandem mass spectrometry

Peptide digests were resolved by nanoflow reverse-phase liquid chromatography

(Ultimate 3000, Dionex Inc.) coupled online via electrospray ionization to a hybrid

linear ion trap - Orbitrap mass spectrometer (LTQ-Orbitrap, ThermoFisher Scientific,

Inc., San Jose, CA). Five injections of 2 µL of peptide extracts corresponding to 1 μg

total protein were resolved on 100 μm i.d. by 360 μm o.d. by 200 mm long fused silica

capillary columns (Polymicro Technologies, Phoenix, AZ) slurry-packed in-house with 5

μm, 300 Å pore size C-18 silica-bonded stationary phase (Jupiter, Phenomenex,

Torrance, CA). After sample injection, peptides were eluted from the column using a

linear gradient of 2% mobile phase B (100% ACN and 0.1% formic acid) to 40% mobile

phase B over 125 min at a constant flow rate of 200 nL/min followed by a column wash

consisting of 95% B for an additional 30 min at a constant flow rate of 400 nL/min. The

LTQ-Orbitrap MS was configured to collect high resolution (R=60,000 at m/z 400)

broadband mass spectra (m/z 375-1800) from which the thirteen most abundant

peptide molecular ions dynamically determined from the MS scan were selected for

tandem MS using a relative CID energy of 30%. Dynamic exclusion was utilized to

minimize redundant selection of peptides for CID.
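The elution program described above can be written as a simple piecewise function of time. This is only an illustrative sketch: it assumes the gradient starts at t = 0 immediately after sample injection and that the change to the 95% B wash is instantaneous, neither of which is specified in the text. The function name and constants are hypothetical.

```python
# Mobile phase composition over the LC method described in the text.
START_B = 2.0         # % B at the start of the linear gradient
END_B = 40.0          # % B at the end of the linear gradient
GRADIENT_MIN = 125.0  # duration of the linear gradient (min)
WASH_B = 95.0         # % B during the column wash
WASH_MIN = 30.0       # duration of the column wash (min)


def percent_b(t_min: float) -> float:
    """Mobile phase B (%) at t_min minutes after the start of the gradient."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    if t_min <= GRADIENT_MIN:
        # Linear interpolation between the start and end compositions.
        return START_B + (END_B - START_B) * t_min / GRADIENT_MIN
    if t_min <= GRADIENT_MIN + WASH_MIN:
        return WASH_B
    raise ValueError("time is beyond the programmed method")


slope = (END_B - START_B) / GRADIENT_MIN  # % B per minute during the gradient
for t in (0, 62.5, 125, 140):
    print(f"t = {t:>5} min -> {percent_b(t):.1f}% B")
```

The gradient slope under these assumptions is about 0.3% B per minute, a shallow gradient of the kind typically used for complex peptide mixtures.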

Peptide Identification and Spectral Count Analysis

Peptide identifications were obtained by searching the LC-MS/MS data utilizing

SEQUEST (Thermo Scientific BioWorks 3.2) on a 72 node Beowulf cluster against a

UniProt-derived human proteome database (version 03/10) obtained from the

European Bioinformatics Institute (EBI). Search parameters consisted of enzyme:

trypsin (KR); enzyme limits: full enzymatic cleavage at both ends; missed cleavage

sites: 2; peptide tolerance: 20 ppm; fragment ion tolerance: 0.5 amu; and a variable

modification on methionine of 15.99492 Da. Resulting peptide identifications were

filtered according to specific SEQUEST scoring criteria [delta correlation (ΔCn) ≥ 0.08

and charge state dependent cross correlation (Xcorr) ≥ 1.9 for [M+H]1+ (mass+proton),

≥ 2.2 for [M+2H]2+, ≥ 3.5 for [M+3H]3+ and ≥ 3.0 for [M+4H]4+]. Differences in

protein abundance between the samples were derived by spectral counting (SC).

Peptides whose sequence mapped to multiple protein isoforms were grouped as per

the principle of parsimony [16]. A value of 0.5 was added to each spectral count value

prior to log2 transformation to enable ratio values to be calculated for proteins

identified in one group, but not another [17]. Proteins which exhibited a >95%

confidence interval from the mean for each comparison performed were considered

statistically significant.
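The score filter and the pseudocount-based ratio described in this section can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the thresholds are those listed in the text, but the function names, the rejection of charge states outside 1+ to 4+, and the example spectral counts are assumptions made for the example. How spectral counts were aggregated across injections and replicates before the ratio was taken is not specified in the text.

```python
import math

# Minimum Xcorr per precursor charge state, as listed in the text.
XCORR_MIN = {1: 1.9, 2: 2.2, 3: 3.5, 4: 3.0}
DELTA_CN_MIN = 0.08
PSEUDOCOUNT = 0.5  # added to every spectral count before the log2 transform


def passes_filter(charge: int, xcorr: float, delta_cn: float) -> bool:
    """True if a peptide-spectrum match meets the SEQUEST score thresholds."""
    threshold = XCORR_MIN.get(charge)
    if threshold is None:
        # Assumption: charge states not covered by the criteria are rejected.
        return False
    return delta_cn >= DELTA_CN_MIN and xcorr >= threshold


def log2_ratio(sc_smokers: float, sc_nonsmokers: float) -> float:
    """log2(S/NS) of spectral counts after adding the 0.5 pseudocount."""
    return math.log2((sc_smokers + PSEUDOCOUNT) / (sc_nonsmokers + PSEUDOCOUNT))


# Hypothetical spectral counts (smokers, nonsmokers), not taken from the thesis data.
for s, ns in [(0, 5), (3, 3), (7.5, 0.5)]:
    print(f"S={s}, NS={ns}: log2 ratio = {log2_ratio(s, ns):.2f}")
```

The pseudocount keeps the ratio finite when a protein is detected in only one group: a protein with zero spectra in smokers and five in nonsmokers gives log2(0.5/5.5), approximately -3.46, rather than an undefined value.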

Bioinformatic analyses

Uniprot accessions corresponding to proteins identified by at least two peptides were

mapped to HUGO (HGNC) gene symbols utilizing Ingenuity Pathway Analysis (IPA)

(Ingenuity® Systems, www.ingenuity.com). Accessions which failed to map were

converted to IPI identifiers with the mapping utility available at www.uniprot.org and

remapped to IPA to maximize protein identifications available for downstream

bioinformatic analyses.

Protein localization and subtype assignments were derived from IPA-mapped data

sets. Functional analyses of the significant protein lists were performed utilizing the “Core

Analysis” function in IPA using default parameters (p<0.05, Fisher’s exact test).

Network and protein interaction analyses of the significant proteins were also performed

utilizing IPA, allowing a maximum of 35 proteins per network

assignment. ProteinCenter (Thermo Fisher Scientific) was used to retrieve information on

gene ontology terms to annotate proteins according to biological process and

molecular function using default parameters.
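For readers unfamiliar with the enrichment statistic mentioned above, a right-tailed Fisher's exact test can be computed directly from the hypergeometric distribution. The sketch below is a generic illustration and does not reproduce the internals of IPA, whose reference set and annotation counts are not given in the text; all numbers in the example are hypothetical.

```python
from math import comb


def fisher_right_tail(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: proteins in the reference set
    K: reference proteins annotated to the function of interest
    n: proteins in the significant list
    k: significant proteins annotated to the function
    """
    upper = min(n, K)
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper, so
    # impossible tables contribute nothing to the sum.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Hypothetical example: 12 of 95 significant proteins carry an annotation
# that is held by 40 of 910 proteins in the reference set.
p_value = fisher_right_tail(k=12, n=95, K=40, N=910)
print(f"p = {p_value:.3g}", "(enriched at p < 0.05)" if p_value < 0.05 else "")
```

The choice of reference set (here, all proteins identified by at least two peptides) strongly affects the resulting p-value and is an assumption of this example.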

RESULTS AND DISCUSSION

Global proteomics analysis of nasal epithelial cells from smoker and nonsmoker

subjects.

It has been demonstrated that nasal brushing is capable of yielding numerous and

well-preserved dissociated cells, representative of the human superficial respiratory

mucosa [11]. To confirm this statement, nasal epithelial cell suspensions from each

individual were evaluated upon collection. Cells were harvested and the cytospins

were prepared and stained with MGG in order to evaluate the presence of ciliated,

goblet and basal cells and also for red cell contamination (Figure IV.1).

Figure IV.1: MGG-staining of nasal cells collected by brushing. Magnification: 80x.

After microscope examination, nasal epithelial samples presenting 0-1% of red blood

cell contamination were processed as described in Figure IV.2. Two biological

replicates were constituted for each of the groups under analysis, smokers and

nonsmokers. Five injections, each corresponding to 1 µg total protein, from each of

the biological replicates were analyzed by LC-MS/MS using a high-resolution mass

spectrometer, resulting in the identification of a total of 42666 peptides and 1190

proteins (Supplemental Table IV.1, Supporting Information), of which 910 were

identified by at least two peptides (Supplemental Table IV.2A, Supporting Information).

Figure IV.2: Basic workflow of the methodology employed for the study of the nasal

epithelial cells proteome of healthy smokers and nonsmokers.

Information on the respective Log2 Smokers/Nonsmokers Ratio of each of these

proteins, on cellular location and functional type, and on the profile of the number of

transmembrane domains can be found in Supplemental Tables IV.2B, IV.3, and IV.4,

Supporting Information, respectively. Information on the gene ontology terms for

biological process and molecular function is provided in Supplemental Figures IV.1

and IV.2, Supporting Information, respectively. Equivalency of the protein digests was

determined by comparing the total number of peptides identified in each of the

analytical samples, which resulted in a calculated relative standard deviation (RSD) of

8.5%. This was also evaluated through comparison of total peptides identified for

chicken ovalbumin, which was added in equal amounts to each of the analyzed

samples according to the workflow displayed in Figure IV.2. RSD for chicken ovalbumin

was calculated to be as low as 5.3%. Equivalency in peptide load was also determined

by comparison of the spectral count values for the “housekeeping” protein actin,

cytoplasmic 1 (ACTB, Uniprot Accession (Acc): P60709), commonly used to correct for

protein loading in western blot analysis [18], which revealed an RSD of 13.5%. When

performing the nasal brushing procedure, some local bleeding may occur, and this is

responsible for the identification of proteins such as hemoglobin subunits alpha, beta,

delta and gamma-1 (HBA1, HBB, HBD and HBG1, respectively) or erythrocyte band 7

integral membrane protein (STOM). Nonetheless, the epithelial origin of most cells

obtained by nasal brushing procedure was confirmed by the identification of proteins

such as keratin type I cytoskeletal 19 (KRT19, Acc: P08727), keratin type II cytoskeletal

8 (KRT8, Acc: P05787), palate, lung and nasal epithelium carcinoma (PLUNC, Acc:

Q9NP55), long palate, lung and nasal epithelium carcinoma (LPLUNC, Acc: Q8TDL5),

epithelial cell adhesion molecule (EPCAM, Acc: P16422), and a handful of mucins such

as mucin-1 (MUC1, Acc: P15941), mucin-2 (MUC2, Acc: Q02817), mucin-4 (MUC4, Acc:

Q99102), mucin-5AC (MUC5AC, Acc: P98088) or mucin-5B (MUC5B, Acc: Q9HC84),

which are reported to be expressed at relatively high levels in the human respiratory

tract when compared to other mucin genes [19]. The proteins identified in the two groups

under analysis showed a considerable overlap of 74.7%, as 680 of the 910 proteins

identified by at least two peptides were common to both groups

(Figure IV.3).

Figure IV.3: Venn diagram showing the overlap in proteins identified by at least two

peptides between the two groups under analysis.
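The equivalency and overlap figures reported above rest on two simple calculations: the relative standard deviation (RSD) of a set of counts and the share of proteins common to both groups. The sketch below illustrates both. It assumes the sample standard deviation was used, which the text does not state, and the peptide totals are hypothetical values chosen only to make the example runnable.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (sample SD divided by mean) as a percentage."""
    return 100.0 * stdev(values) / mean(values)


def overlap_percent(shared: int, total: int) -> float:
    """Share of proteins common to both groups, as a percentage of all proteins."""
    return 100.0 * shared / total


# Hypothetical peptide totals for four analytical samples (not the thesis data).
peptide_totals = [10200, 11050, 9800, 11600]
print(f"RSD of peptide totals: {rsd_percent(peptide_totals):.1f}%")

# Figures quoted in the text: 680 shared proteins out of 910.
print(f"overlap: {overlap_percent(680, 910):.1f}%")
```

With the figures quoted in the text, 680 of 910 proteins corresponds to the reported overlap of 74.7%.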

Among those proteins found in only one group is Hypoxia upregulated protein 1

(HYOU1), which was confidently identified in the smoker group only. Expression of this

protein is induced in a stress-dependent manner, resulting in its accumulation

in the endoplasmic reticulum (ER) under hypoxic conditions. Also, HYOU1 is

suggested to have an important cytoprotective role in hypoxia-induced cellular

perturbation, since suppression of this protein is associated with accelerated apoptosis

[20]. Hierarchical clustering was used to arrange proteins according to similarity in

protein expression across the samples under analysis [21, 22]. Not only does this method

retrieve information from the datasets without any a priori knowledge, but it also

delivers the output graphically in a very intuitive way. When this method was applied to

our data, biological replicates from nonsmoker samples clustered together and apart

from the smoker samples (Supplemental Figure IV.3, Supporting Information). This is

very important as we move towards comparative proteomics: because the biological replicates

present similar protein expression and therefore cluster together, the

comparative proteomics analysis is reinforced.
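The idea behind the sample clustering described above can be illustrated with a small agglomerative procedure. This is a sketch under stated assumptions: the text does not specify the software, distance metric or linkage rule used, so Euclidean distance with average linkage is chosen here purely for illustration, and the expression values are invented.

```python
from itertools import combinations
from math import dist


def average_linkage(profiles):
    """Agglomerative clustering of named profiles; returns the merge history.

    profiles: dict mapping sample name -> list of expression values.
    Each merge is recorded as (cluster_a, cluster_b, distance), where the
    clusters are tuples of sample names.
    """
    clusters = [(name,) for name in profiles]
    merges = []

    def cluster_distance(a, b):
        # Average pairwise Euclidean distance between members of a and b.
        pairs = [(x, y) for x in a for y in b]
        return sum(dist(profiles[x], profiles[y]) for x, y in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two closest clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: cluster_distance(*pair))
        merges.append((a, b, cluster_distance(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical log2-transformed spectral counts for five proteins (not thesis data).
samples = {
    "NS_1": [3.2, 1.0, 4.1, 0.5, 2.2],
    "NS_2": [3.0, 1.2, 4.3, 0.4, 2.0],
    "S_1": [1.1, 3.5, 4.0, 2.6, 0.3],
    "S_2": [0.9, 3.8, 4.2, 2.9, 0.5],
}

for a, b, d in average_linkage(samples):
    print(" + ".join(a), "joins", " + ".join(b), f"(distance {d:.2f})")
```

In this toy data set the two nonsmoker replicates merge first, then the two smoker replicates, and the two groups join only at a much larger distance, which is the pattern the text reports for the real samples.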

Comparative proteomics analysis of smoker and nonsmoker subjects.

After submitting spectral counts of proteins identified by at least two peptides in

smoker and nonsmoker groups to a t-test, 96 proteins exhibited a >95% confidence

interval and were therefore considered to be significantly differentially expressed.

However, one entry (C9JFA0) has since been removed from the Uniprot database,

and therefore 95 proteins were considered in the comparative analysis of the

smoker and nonsmoker groups. Figure IV.4 shows the hierarchical clustering of these

proteins according to their relative expression. Information on the identity of these

proteins, as well as on their cellular location and functional type, is given in Table IV.2.

To place these results in a biological context, these proteins

were analyzed through the “Core Analysis” of Ingenuity Pathway Analysis (IPA). When

this set was submitted to IPA, 87 proteins were found to be eligible for network analysis

and 86 for function and pathway analysis. Significantly differentially

expressed proteins were grouped into networks, and the five networks with the

highest scores are shown in Table IV.3. Functions associated with the top three

networks include antigen presentation, cell-to-cell signaling and interaction, cell

morphology, drug metabolism, DNA replication, recombination and repair, and energy

production, which is consistent with what has been reported as the main effects of

cigarette smoke [7, 8, 23].

Figure IV.4: Hierarchical cluster of the significantly differentially expressed proteins.

Protein abundances are displayed as normalized expression. X-axis labels refer to

information displayed in Table IV.1.

The top network, which comprises 24 proteins identified in the present work,

corresponding to 28% of the significantly differentially expressed proteins eligible for

network analysis, is displayed in Figure IV.5. Networks 2 and 3 are available as

Supplemental Figures IV.4 and IV.5, Supporting Information.

Table IV.2: Differentially expressed proteins in smokers (S) when compared to

nonsmokers (NS) exhibiting a >95% confidence interval. Cellular location and

functional type were retrieved from the Ingenuity knowledgebase (Ingenuity Systems).

| UniProt Acc. | HGNC Symbol | Entrez Gene Name | S/NS Log2 Ratio | Cellular Location | Functional Type |
|---|---|---|---|---|---|
| P24666 | ACP1 | acid phosphatase 1, soluble | -3.46 | Cytoplasm | phosphatase |
| Q9C0K3 | ACTR3C | ARP3 actin-related protein 3 homolog C (yeast) | -2.00 | unknown | other |
| Q8WXS8 | ADAMTS14 | ADAM metallopeptidase with thrombospondin type 1 motif, 14 | -2.81 | Extracellular Space | peptidase |
| P14550 | AKR1A1 | aldo-keto reductase family 1, member A1 (aldehyde reductase) | -0.68 | Cytoplasm | enzyme |
| Q13740 | ALCAM | activated leukocyte cell adhesion molecule | -0.51 | Plasma Membrane | other |
| P51649 | ALDH5A1 | aldehyde dehydrogenase 5 family, member A1 | 0.85 | Cytoplasm | enzyme |
| Q07960 | ARHGAP1 | Rho GTPase activating protein 1 | 2.00 | Cytoplasm | other |
| P25705 | ATP5A1 | ATP synthase, H+ transporting, mitochondrial F1 complex, alpha subunit 1, cardiac muscle | 0.60 | Cytoplasm | transporter |
| P36542 | ATP5C1 | ATP synthase, H+ transporting, mitochondrial F1 complex, gamma polypeptide 1 | 2.59 | Cytoplasm | transporter |
| Q13867 | BLMH | bleomycin hydrolase | -3.59 | Cytoplasm | peptidase |
| Q86WA6 | BPHL | biphenyl hydrolase-like (serine hydrolase) | -2.32 | Cytoplasm | enzyme |
| Q9HB07 | C12ORF10 | chromosome 12 open reading frame 10 | -2.00 | unknown | other |
| Q8TDL5 | C20ORF114 | chromosome 20 open reading frame 114 | 0.45 | Extracellular Space | other |
| P16070 | CD44 | CD44 molecule (Indian blood group) | 0.91 | Plasma Membrane | other |
| Q59FS9 | CD81 | CD81 molecule | 1.59 | Plasma Membrane | other |
| P60953 | CDC42 | cell division cycle 42 (GTP binding protein, 25kDa) | -3.00 | Cytoplasm | enzyme |
| Q07065 | CKAP4 | cytoskeleton-associated protein 4 | 3.32 | Cytoplasm | other |
| P17540 | CKMT2 | creatine kinase, mitochondrial 2 (sarcomeric) | 1.00 | Cytoplasm | kinase |
| Q9Y696 | CLIC4 | chloride intracellular channel 4 | 2.00 | Plasma Membrane | ion channel |
| P13073 | COX4I1 | cytochrome c oxidase subunit IV isoform 1 | 1.59 | Cytoplasm | enzyme |
| P50416 | CPT1A | carnitine palmitoyltransferase 1A (liver) | 2.00 | Cytoplasm | enzyme |
| P00387 | CYB5R3 | cytochrome b5 reductase 3 | 2.54 | Cytoplasm | enzyme |
| P08574 | CYC1 | cytochrome c-1 | 3.17 | Cytoplasm | enzyme |
| P39656 | DDOST | dolichyl-diphosphooligosaccharide--protein glycosyltransferase | 1.59 | Cytoplasm | enzyme |
| Q9NY33 | DPP3 | dipeptidyl-peptidase 3 | -0.97 | Cytoplasm | peptidase |
| P05198 | EIF2S1 | eukaryotic translation initiation factor 2, subunit 1 alpha, 35kDa | -2.00 | Cytoplasm | translation regulator |
| B7Z3Q9 | EML2 | echinoderm microtubule associated protein like 2 | -3.46 | Cytoplasm | other |
| P58107 | EPPK1 | epiplakin 1 | 1.87 | Cytoplasm | other |
| A2A2Y4 | FRMD3 | FERM domain containing 3 | -1.59 | unknown | other |
| P21217 | FUT3 | fucosyltransferase 3 (galactoside 3(4)-L-fucosyltransferase, Lewis blood group) | 2.00 | Cytoplasm | enzyme |
| P50395 | GDI2 | GDP dissociation inhibitor 2 | -1.46 | Cytoplasm | other |
| Q9HC38 | GLOD4 | glyoxalase domain containing 4 | -2.32 | Cytoplasm | enzyme |
| P63244 | GNB2L1 | guanine nucleotide binding protein (G protein), beta polypeptide 2-like 1 | 0.95 | Cytoplasm | enzyme |
| P15586 | GNS | glucosamine (N-acetyl)-6-sulfatase | -1.46 | Cytoplasm | enzyme |
| P17174 | GOT1 | glutamic-oxaloacetic transaminase 1, soluble (aspartate aminotransferase 1) | -0.78 | Cytoplasm | enzyme |
| P48637 | GSS | glutathione synthetase | -0.88 | Cytoplasm | enzyme |
| P46976 | GYG1 | glycogenin 1 | -2.00 | Cytoplasm | enzyme |
| P19367 | HK1 | hexokinase 1 | -0.79 | Cytoplasm | kinase |
| P00738 | HP | haptoglobin | 0.81 | Extracellular Space | peptidase |
| HSFY1 | HSFY1 | heat shock transcription factor, Y-linked 1 | -1.59 | Nucleus | transcription regulator |
| P34931 | HSPA1L | heat shock 70kDa protein 1-like | -0.56 | Cytoplasm | other |
| P38646 | HSPA9 | heat shock 70kDa protein 9 (mortalin) | -0.31 | Cytoplasm | other |
| Q92598 | HSPH1 | heat shock 105kDa/110kDa protein 1 | -2.00 | Cytoplasm | other |
| P23276 | KEL | Kell blood group, metallo-endopeptidase | -1.59 | Plasma Membrane | peptidase |
| Q6UXB3 | LYPD2 | LY6/PLAUR domain containing 2 | -1.42 | unknown | other |
| Q9HCC0 | MCCC2 | methylcrotonoyl-CoA carboxylase 2 (beta) | 2.59 | Cytoplasm | enzyme |
| P23368 | ME2 | malic enzyme 2, NAD(+)-dependent, mitochondrial | -0.56 | Cytoplasm | enzyme |
| Q16798 | ME3 | malic enzyme 3, NADP(+)-dependent, mitochondrial | -2.00 | Cytoplasm | enzyme |
| Q9Y2Q9 | MRPS28 | mitochondrial ribosomal protein S28 | -1.59 | Cytoplasm | other |
| Q96DH6 | MSI2 | musashi homolog 2 (Drosophila) | -3.00 | Cytoplasm | other |
| P26038 | MSN | moesin | -1.42 | Plasma Membrane | other |
| Q9Y6C9 | MTCH2 | mitochondrial carrier homolog 2 (C. elegans) | 2.59 | Cytoplasm | other |
| P98088 | MUC5AC | mucin 5AC, oligomeric mucus/gel-forming | 0.45 | Extracellular Space | other |
| P35580 | MYH10 | myosin, heavy chain 10, non-muscle | 1.59 | Cytoplasm | other |
| Q16795 | NDUFA9 (includes EG:4704) | NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, 9, 39kDa | 2.81 | Cytoplasm | enzyme |
| Q9NX14 | NDUFB11 | NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 11, 17.3kDa | -2.00 | Cytoplasm | enzyme |
| O75306 | NDUFS2 | NADH dehydrogenase (ubiquinone) Fe-S protein 2, 49kDa (NADH-coenzyme Q reductase) | 3.17 | Cytoplasm | enzyme |
| O75489 | NDUFS3 | NADH dehydrogenase (ubiquinone) Fe-S protein 3, 30kDa (NADH-coenzyme Q reductase) | 3.81 | Cytoplasm | enzyme |
| Q969S2 | NEIL2 | nei endonuclease VIII-like 2 (E. coli) | -1.59 | Nucleus | enzyme |
| Q5SPY9 | NPDC1 | neural proliferation, differentiation and control, 1 | -1.59 | Extracellular Space | other |
| B0ZBF1 | NPR1 | natriuretic peptide receptor A/guanylate cyclase A (atrionatriuretic peptide receptor A) | -2.00 | Plasma Membrane | enzyme |
| Q96PE5 | OPALIN | oligodendrocytic myelin paranodal and inner loop protein | -1.59 | Cytoplasm | other |
| O00764 | PDXK | pyridoxal (pyridoxine, vitamin B6) kinase | -0.50 | Cytoplasm | kinase |
| P30086 | PEBP1 | phosphatidylethanolamine binding protein 1 | -0.45 | Cytoplasm | other |
| P07737 | PFN1 | profilin 1 | -0.56 | Cytoplasm | other |
| O95336 | PGLS | 6-phosphogluconolactonase | -2.32 | Cytoplasm | enzyme |
| Q8IV08 | PLD3 | phospholipase D family, member 3 | -2.00 | Cytoplasm | enzyme |
| P28072 | PSMB6 | proteasome (prosome, macropain) subunit, beta type, 6 | 1.59 | Cytoplasm | peptidase |
| P62191 | PSMC1 | proteasome (prosome, macropain) 26S subunit, ATPase, 1 | 2.00 | Nucleus | peptidase |
| P18754 | RCC1 (includes EG:1104) | regulator of chromosome condensation 1 | -0.71 | Cytoplasm | other |
| IPI00815843 | RPL14 | ribosomal protein L14 | 3.32 | Cytoplasm | other |
| Q02878 | RPL6 | ribosomal protein L6 | 1.59 | Cytoplasm | other |
| Q96T51 | RUFY1 | RUN and FYVE domain containing 1 | -2.00 | Cytoplasm | transporter |
| Q9NP81 | SARS2 | seryl-tRNA synthetase 2, mitochondrial | -1.42 | Cytoplasm | enzyme |
| Q13228 | SELENBP1 | selenium binding protein 1 | -0.51 | Cytoplasm | other |
| P35237 | SERPINB6 | serpin peptidase inhibitor, clade B (ovalbumin), member 6 | -2.26 | Cytoplasm | other |
| Q9UJS0 | SLC25A13 | solute carrier family 25, member 13 (citrin) | 2.59 | Cytoplasm | transporter |
| P12236 | SLC25A6 | solute carrier family 25 (mitochondrial carrier; adenine nucleotide translocator), member 6 | 1.59 | Cytoplasm | transporter |
| P11166 | SLC2A1 | solute carrier family 2 (facilitated glucose transporter), member 1 | -0.29 | Plasma Membrane | transporter |
| SNRPEL1 | SNRPEL1 | small nuclear ribonucleoprotein polypeptide E-like 1 | -2.81 | Nucleus | other |
| P04179 | SOD2 | superoxide dismutase 2, mitochondrial | -0.39 | Cytoplasm | enzyme |
| P05455 | SSB | Sjogren syndrome antigen B (autoantigen La) | -1.54 | Nucleus | enzyme |
| Q9UNL2 | SSR3 | signal sequence receptor, gamma (translocon-associated protein gamma) | 3.32 | Cytoplasm | other |
| Q12846 | STX4 | syntaxin 4 | -2.81 | Plasma Membrane | transporter |
| P51687 | SUOX | sulfite oxidase | -2.32 | Cytoplasm | enzyme |
| O60506 | SYNCRIP | synaptotagmin binding, cytoplasmic RNA interacting protein | -0.95 | Nucleus | other |
| P09758 | TACSTD2 | tumor-associated calcium signal transducer 2 | -1.06 | Plasma Membrane | other |
| Q99805 | TM9SF2 | transmembrane 9 superfamily member 2 | 1.59 | Plasma Membrane | transporter |
| Q9HC07 | TMEM165 | transmembrane protein 165 | 1.59 | Plasma Membrane | other |
| Q9NS69 | TOMM22 | translocase of outer mitochondrial membrane 22 homolog (yeast) | 2.81 | Cytoplasm | transporter |
| P07437 | TUBB | tubulin, beta | 0.74 | Cytoplasm | other |
| Q16881 | TXNRD1 | thioredoxin reductase 1 | -0.85 | Cytoplasm | enzyme |
| Q9NVA1 | UQCC | ubiquinol-cytochrome c reductase complex chaperone | -1.59 | Cytoplasm | other |
| P21796 | VDAC1 | voltage-dependent anion channel 1 | 0.53 | Cytoplasm | ion channel |
| Q86UZ6 | ZBTB46 | zinc finger and BTB domain containing 46 | -1.59 | Nucleus | other |
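The S/NS Log2 Ratio column of Table IV.2 can be read as a fold change: a log2 ratio of x corresponds to a 2^x-fold abundance in smokers relative to nonsmokers. The following is a minimal Python sketch of that conversion, using three values taken from the table. The function name and the small dictionary are illustrative assumptions and are not part of the original analysis pipeline.

```python
# Illustrative only: convert S/NS log2 ratios (Table IV.2) to linear fold changes.
# The helper name below is hypothetical, not taken from the thesis.

def log2_ratio_to_fold_change(log2_ratio: float) -> float:
    """Return the smoker/nonsmoker abundance ratio for a given log2 ratio."""
    return 2.0 ** log2_ratio

# Three entries copied from Table IV.2 (HGNC symbol -> S/NS log2 ratio).
examples = {"ARHGAP1": 2.00, "CDC42": -3.00, "SOD2": -0.39}

for symbol, ratio in examples.items():
    fold = log2_ratio_to_fold_change(ratio)
    direction = "higher" if ratio > 0 else "lower"
    print(f"{symbol}: log2 ratio {ratio:+.2f} -> {fold:.3f}x ({direction} in smokers)")
```

Under this reading, a log2 ratio of 2.00 is a four-fold increase in smokers and a log2 ratio of -3.00 is an eight-fold decrease.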

The top 10 significant (p<0.05, Fisher’s exact test) biological functions (biofunctions) related to diseases and disorders, along with the proteins involved in each biofunction, were also derived from IPA and are displayed in Table IV.4. Infection mechanism, inflammatory response and cancer were the biofunctions with the lowest p-values, and two proteins, CD44 antigen (CD44) and CD81 antigen (CD81), were part of all three. CD44, which was found to be overexpressed in smokers when compared to nonsmokers, is a cell surface glycoprotein involved in cell-cell and cell-matrix interactions through its affinity for hyaluronic acid and possibly for other ligands such as osteopontin, collagens and matrix metalloproteinases. In the presence of cigarette smoke in vivo or reactive oxygen species (ROS) in vitro, CD44 was reported to mediate oxidative stress-induced mucus hypersecretion in airway epithelium from smokers or in primary cultures of human bronchial epithelial cells [24].

Table IV.3: Top 5 protein interaction networks generated from the proteins found to be significantly differentially expressed between smokers and nonsmokers.

| Score | Focus Molecules | Top Functions | Molecules in Network |
|---|---|---|---|
| 52 | 24 | Antigen Presentation, Cell-To-Cell Signaling and Interaction, Hematological System Development and Function | Actin, ARHGAP1, Beta Tubulin, Caspase, CD44, CD81, CDC42, Ck2, CLIC4, CYB5R3, Cytochrome c, DDOST, Erm, F Actin, FSH, GDI2, GOT1, HSPA9, HSPA1L, KEL, MSN, MYH10, NFkB (complex), PDXK, PEBP1, PFN1, Ras homolog, Rho gdi, RUFY1, SLC2A1, SSB, STX4, TUBB, TXNRD1, VDAC1 |
| 29 | 15 | Cell Morphology, Drug Metabolism, Molecular Transport | beta-estradiol, CD81, CKB, CKMT2, CLIC4, COX1, CYBA, CYC1, CYTB, EPHA2, FUT3, FUT7, GNS, GYG1, HOXA9, hydrogen peroxide, lipoxin A4, magnesium, MSI2, MUC5AC, PGLS, PHLDA1, PLD3, PLSCR1, RAB9A, RALA, RALBP1, RCC1 (includes EG:1104), SELENBP1, SLK, TACSTD2, TGFB1, TNF, UQCR10, UQCRH |
| 28 | 15 | DNA Replication, Recombination, and Repair, Energy Production, Nucleic Acid Metabolism | 19S proteasome, ADAMTS14, adenosine-tetraphosphatase, AKR1A1, ATP synthase, ATP5A1, ATP5C1, ATP5D, ATP5E, ATP5O, ATP6V1B2, BLMH, COX4I1, DPP3, ECHS1, ETFA, GLOD4, IKBKE, KLK2, MCCC2, NAPA, NCSTN, peptidase, Proteasome PA700/20s, PSMB6, PSMC, PSMC1, PSMD1, retinoic acid, RPL6, RPL11, RPL14, SERPINB6, SLC2A4, SYNCRIP |
| 24 | 13 | Cell-To-Cell Signaling and Interaction, Cellular Assembly and Organization, Cellular Movement | ACP1, Akt, ALCAM, Ap1, CPT1A, DEFA1 (includes EG:1667), EIF2S1, ERK, ERK1/2, ERRFI1, FUT7, GEFT, GNB2L1, GSS, HK1, HP, Hsp70, IL1, Insulin, Integrin, LOC290704, Mapk, MTCH2, NPR1, P38 MAPK, Pdgf, PDGF BB, Pdgfr, Pdgfra-Pdgfrb, PI3K, Pkc(s), PP2A, PVR, SLC25A6, SOD2 |
| 24 | 13 | Carbohydrate Metabolism, Hepatic System Development and Function, Small Molecule Biochemistry | ALDH5A1, BAT2, BPHL, BRF2, EIF2C1, EIF2C4, EPPK1, F7, GSTK1, HNF4A, KCNN2, malate dehydrogenase (oxaloacetate-decarboxylating) (NADP), ME2, ME3, MIR18A (includes EG:406953), MIR293 (includes EG:100049714), MIR34A (includes EG:407040), MOD2, MRPS28, NEDD8, NEIL2, PHB2, PINX1, PTP4A3, RAB11A, SARS2, SLC25A13, SSR3, TM9SF2, TMEM165, UQCC, USP15, WDR8 (includes EG:49856), YBX1 |

Table IV.4: Top 10 significant biofunctions in disease and disorders observed in

differentially expressed proteins of smokers when compared to nonsmokers.

| Category | p-value range | Molecules |
|---|---|---|
| Infection Mechanism | 2.52E-03 to 2.89E-02 | CD81, CD44, SSB |
| Inflammatory Response | 2.52E-03 to 3.58E-02 | CD81, HP, SOD2, CDC42, CD44, STX4 |
| Cancer | 5.84E-03 to 4.58E-02 | CD81, PEBP1, SSR3, SOD2, SLC2A1, DDOST, CD44, TM9SF2, KEL, FUT7, SERPINB6, SSB |
| Cardiovascular Disease | 5.84E-03 to 4.58E-02 | BLMH, PFN1, NPR1, CD44 |
| Dermatological Diseases and Conditions | 5.84E-03 to 5.84E-03 | BLMH |
| Developmental Disorder | 5.84E-03 to 5.84E-03 | CD44 |
| Genetic Disorder | 5.84E-03 to 4.8E-02 | CD81, MYH10, GNB2L1, UQCC, TM9SF2, FUT7, TUBB, GSS, SSR3, SLC25A6, SOD2, DPP3, C20ORF114, null, GOT1, PDXK, CPT1A, SLC2A1, ATP5A1, DDOST, ATP5C1, NPR1, ZBTB46, ACP1, ME2, ALCAM, TMEM165, CYC1, VDAC1, GNS, SUOX, MCCC2, PEBP1, CDC42, ADAMTS14, CKAP4, EIF2S1, CLIC4, PSMB6, SELENBP1, NDUFS2, MUC5AC, ALDH5A1, COX4I1, TACSTD2, SLC25A13, BLMH, HSPH1, GDI2, MSI2, NDUFS3, PSMC1, HP, CD44 |
| Immunological Disease | 5.84E-03 to 4.52E-02 | CD81, FUT7, GSS |
| Inflammatory Disease | 5.84E-03 to 3.46E-02 | SOD2, BLMH, ADAMTS14, CD44, MUC5AC, TUBB, SELENBP1 |
| Metabolic Disease | 5.84E-03 to 4.58E-02 | SLC25A13, SOD2, CPT1A, NDUFS2, GNS, SUOX, ALDH5A1, MCCC2 |
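The p-values in Table IV.4 were obtained from IPA with Fisher's exact test. As a rough illustration of how an enrichment p-value of this kind can be computed, the sketch below evaluates a right-tailed Fisher's exact test from the hypergeometric distribution using only the Python standard library. The function name and all counts in the example call are hypothetical placeholders, not values from this study or from the Ingenuity knowledge base, and the reference set and exact procedure used by IPA may differ.

```python
from math import comb

def right_tailed_fisher(overlap: int, dataset_size: int,
                        annotated: int, reference_size: int) -> float:
    """Upper-tail hypergeometric probability P(X >= overlap).

    X is the number of annotated proteins seen when dataset_size proteins
    are drawn at random from a reference set of reference_size proteins,
    of which `annotated` carry the biofunction of interest.
    """
    max_overlap = min(dataset_size, annotated)
    total = comb(reference_size, dataset_size)
    tail = sum(
        comb(annotated, k) * comb(reference_size - annotated, dataset_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total

# Hypothetical counts: 6 of 86 submitted proteins carry an annotation that is
# shared by 300 of 20000 proteins in an assumed reference set.
p = right_tailed_fisher(overlap=6, dataset_size=86,
                        annotated=300, reference_size=20000)
print(f"p = {p:.3g}")
```

A small p-value here indicates that the observed overlap is larger than would be expected by chance under these assumptions.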

Figure IV.5: Top protein network as obtained from Ingenuity Pathway Analysis.

Mucin-5AC (MUC5AC), a gel-forming glycoprotein of respiratory tract epithelia that protects the mucosa from infection and chemical damage by binding to inhaled microorganisms and particles, was, together with MUC5B, responsible for the reported mucus hypersecretion, which suggests that CD44 is involved in ROS-induced MUC5AC expression [24]. Indeed, in the present work, MUC5AC was also found to be overexpressed in the smoker group. CD44 was also reported to be involved in the activation of the NFkB complex, which is important in regulating the cellular response to harmful stimuli such as ROS [25]. This interaction is shown in the network displayed in Figure IV.5. CD44 also takes part in other aspects of the inflammatory response. The inflammatory response can be viewed as part of an innate tissue repair process following an insult, such as invading pathogens or cells damaged as a result of cigarette smoking. The ultimate fate of many recruited inflammatory cells is death, whether apoptotic or necrotic. Phagocytic removal of apoptotic cells by macrophages is an immunologically silent process that does not provoke the release of pro-inflammatory mediators [26]. Conversely, failure to remove apoptotic inflammatory cells can result in secondary necrosis and the consequent release of their toxic granule contents, causing further tissue damage and exacerbating the inflammatory response [27-29]. Hence, the removal of apoptotic cells by phagocytosis determines whether the inflammatory response resolves successfully [27-29]. Phagocytosis is controlled by a variety of factors, including cell surface molecules such as CD44 [28, 30].

Cigarette smoke is reported to contain a high concentration of oxidants that lead to oxidative stress, which in turn leads to mitochondrial dysfunction and DNA damage [7, 8, 31]. Among the significantly differentially expressed proteins, four are related to oxidative stress or to the oxidative stress response mediated by Nrf2: superoxide dismutase 2, mitochondrial (SOD2), aldo-keto reductase family 1 member A1 (AKR1A1), thioredoxin reductase 1, cytoplasmic (TXNRD1) and glutathione synthetase (GSS). SOD2 is a mitochondrial antioxidant enzyme that detoxifies superoxide anion radicals, a byproduct of mitochondrial respiration. SOD2 was found to be underexpressed in smokers, which is consistent with what has recently been reported [32]. The same trend was observed for GSS and for AKR1A1 and TXNRD1, which, like SOD2, are part of the Nrf2-mediated oxidative stress response. GSS catalyzes the second step of glutathione (GSH) biosynthesis, and GSH in turn is important for a variety of biological functions, including protection of cells from oxidative damage by free radicals and detoxification of xenobiotics [33-35]. Bleomycin also triggers excess production of ROS and DNA damage in the lung, and bleomycin hydrolase (BLMH) [36], an enzyme whose only known function is the metabolic inactivation of bleomycin, was found to be underexpressed in smokers. This may contribute to the action of bleomycin, which has long been associated with pulmonary fibrosis and lung emphysema in the presence of cigarette smoke [36-39]. Since the oxidative balance is impaired by the elevated levels of ROS present in smokers, the antioxidant response was expected to be increased in compensation. The lower levels of SOD2, AKR1A1, TXNRD1 and GSS found in smokers may therefore have a serious impact on the cellular environment, as reactions between cellular components and ROS lead to DNA damage, mitochondrial dysfunction, cell membrane damage and cell death [7, 8, 31]. Damaged DNA-binding protein 1 (DDB1) plays a key role in the normal cell cycle and in the response to DNA damage, and was also found to be underexpressed in the smoker group [40]. Cell division cycle 42 (CDC42) is a small GTPase of the Rho subfamily that is required for the establishment of the apical-basal axis in epithelial cells and in differentiating neurons [41-43]. It controls epithelial tissue morphogenesis by regulating spindle orientation during cell division [44] and modulates cell adhesion and polarity during embryonic morphogenesis by regulating the trafficking of key cell junction proteins [45, 46]. CDC42 was also found to be underexpressed in smokers. According to IPA, eleven proteins were related to the mitochondrial dysfunction canonical pathway, including cytochrome c1 (CYC1), cytochrome c oxidase subunit IV isoform 1 (COX4I1) and cytochrome b5 reductase 3 (CYB5R3), all overexpressed in the smoker group. The full list of proteins, including differential expression values, can be found in Supplemental Table 6, Supplemental Information. Cigarette smoke, as a mixture of a large number of substances, is expected to influence drug metabolism. Indeed, 11 of the differentially expressed proteins were found to be related to drug metabolism. The identity and differential expression of these proteins, and the type of action of each protein within drug metabolism, can be found in Supplemental Tables IV.7A and IV.7B, Supporting Information, respectively.

CONCLUSION

Cigarette smoking is a leading preventable cause of sickness and death worldwide. Chronic cigarette smoking contributes to cardiovascular diseases, oral cancers and even ocular diseases, but the respiratory tract is the most affected, and thus cigarette smoke is an important risk factor for lung diseases such as lung cancer, tuberculosis and chronic obstructive pulmonary disease. This is a pioneering work: for the first time, the proteome of nasal epithelial cells obtained from smokers is described and compared with that of nonsmokers. Moreover, samples were analyzed on a high-resolution mass spectrometer, which generated over 900 protein identifications by two or more peptides. Ninety-six proteins were found to be differentially expressed between the proteomes of healthy smokers and nonsmokers; these were related to antigen presentation, cell-to-cell signaling and interaction, cell morphology, drug metabolism, DNA repair, energy production and mitochondrial dysfunction. Although further orthogonal validation is required, our data were consistent with previous evidence showing differential modulation of CD44, MUC5AC and SOD2 in smokers due to inflammatory response pathways. In addition, the data presented here may provide new insights into processes such as drug metabolism, energy production and mitochondrial dysfunction, shedding light on proteins that had never been associated with cigarette smoke.


ACKNOWLEDGEMENTS

The authors thank all volunteers for their cooperation in this work. This work was partially supported by FCT-FEDER POCI/SAU-MMO/56163/2004, the FCT/Poly-Annual Funding Program and the FEDER-Saúde XXI Program (Portugal). BMA is the recipient of an FCT doctoral fellowship (SFRH/BD/31415/2006).


REFERENCES

[1] European Respiratory Society and European Lung Foundation 2003.

[2] in: Schraufnagel, D. E. (Ed.), American Thoracic Society 2010.

[3] Eriksen, D. J. M. D. M. (Ed.), The Tobacco Atlas, World Health Organization 2002.

[4] The world health report 2008: primary health care now more than ever, World

Health Organization 2008.

[5] Tobacco: a major international health hazard. Proceedings of an international

meeting. Moscow, 4-6 June 1985. IARC Sci Publ 1986, 1-319.

[6] Church, D. F., Pryor, W. A., Free-radical chemistry of cigarette smoke and its

toxicological implications. Environ Health Perspect 1985, 64, 111-126.

[7] Rahman, I., Biswas, S. K., Kode, A., Oxidant and antioxidant balance in the airways

and airway diseases. European journal of pharmacology 2006, 533, 222-239.

[8] Faux, S. P., Tai, T., Thorne, D., Xu, Y., et al., The role of oxidative stress in the

biological responses of lung epithelial cells to cigarette smoke. Biomarkers 2009, 14

Suppl 1, 90-96.

[9] Shaykhiev, R., Bals, R., Interactions between epithelial cells and leukocytes in

immunity and tissue homeostasis. Journal of leukocyte biology 2007, 82, 1-15.

[10] Pahl, A., Preclinical modelling using nasal epithelial cells for the evaluation of

herbal extracts for the treatment of upper airway diseases. Planta Med 2008, 74, 693-

696.

[11] Beck, S., Penque, D., Garcia, S., Gomes, A., et al., Cystic fibrosis patients with the

3272-26A-->G mutation have mild disease, leaky alternative mRNA splicing, and CFTR

protein at the cell membrane. Hum Mutat 1999, 14, 133-144.

[12] Roxo-Rosa, M., da Costa, G., Luider, T. M., Scholte, B. J., et al., Proteomic analysis

of nasal cells from cystic fibrosis patients and non-cystic fibrosis control individuals:

search for novel biomarkers of cystic fibrosis lung disease. Proteomics 2006, 6, 2314-

2325.

[13] Gomes-Alves, P., Imrie, M., Gray, R. D., Nogueira, P., et al., SELDI-TOF biomarker

signatures for cystic fibrosis, asthma and chronic obstructive pulmonary disease. Clin

Biochem, 43, 168-177.


[14] McDougall, C. M., Blaylock, M. G., Douglas, J. G., Brooker, R. J., et al., Nasal

epithelial cells as surrogates for bronchial epithelial cells in airway inflammation

studies. American journal of respiratory cell and molecular biology 2008, 39, 560-568.

[15] Penque, D., Mendes, F., Beck, S., Farinha, C., et al., Cystic fibrosis F508del patients

have apically localized CFTR in a reduced number of airway cells. Laboratory

investigation; a journal of technical methods and pathology 2000, 80, 857-868.

[16] Marengo, E., Robotti, E., Bobba, M., Gosetti, F., The principle of exhaustiveness

versus the principle of parsimony: a new approach for the identification of biomarkers

from proteomic spot volume datasets based on principal component analysis.

Analytical and bioanalytical chemistry, 397, 25-41.

[17] McDonald, J. H., Handbook of Biological Statistics, Sparky House Publishing,

Baltimore, MD 2009.

[18] Ferguson, R. E., Carroll, H. P., Harris, A., Maher, E. R., et al., Housekeeping

proteins: a preliminary study illustrating some limitations as useful references in

protein expression studies. Proteomics 2005, 5, 566-571.

[19] Di, Y. P., Harper, R., Zhao, Y., Pahlavan, N., et al., Molecular cloning and

characterization of spurt, a human novel gene that is retinoic acid-inducible and

encodes a secretory protein specific in upper respiratory tracts. The Journal of

biological chemistry 2003, 278, 1165-1173.

[20] Ozawa, K., Kuwabara, K., Tamatani, M., Takatsuji, K., et al., 150-kDa oxygen-

regulated protein (ORP150) suppresses hypoxia-induced apoptotic cell death. The

Journal of biological chemistry 1999, 274, 6397-6404.

[21] Eisen, M. B., Spellman, P. T., Brown, P. O., Botstein, D., Cluster analysis and display

of genome-wide expression patterns. Proceedings of the National Academy of Sciences

of the United States of America 1998, 95, 14863-14868.

[22] Ross, D. T., Scherf, U., Eisen, M. B., Perou, C. M., et al., Systematic variation in

gene expression patterns in human cancer cell lines. Nat Genet 2000, 24, 227-235.

[23] Lan, M. Y., Ho, C. Y., Lee, T. C., Yang, A. H., Cigarette smoke extract induces

cytotoxicity on human nasal epithelial cells. Am J Rhinol 2007, 21, 218-223.

[24] Casalino-Matsuda, S. M., Monzon, M. E., Day, A. J., Forteza, R. M., Hyaluronan

fragments/CD44 mediate oxidative stress-induced MUC5B up-regulation in airway


epithelium. American journal of respiratory cell and molecular biology 2009, 40, 277-

285.

[25] Lieberman, L. A., Hunter, C. A., Regulatory pathways involved in the infection-

induced production of IFN-gamma by NK cells. Microbes Infect 2002, 4, 1531-1538.

[26] Meagher, L. C., Savill, J. S., Baker, A., Fuller, R. W., Haslett, C., Phagocytosis of

apoptotic neutrophils does not induce macrophage release of thromboxane B2.

Journal of leukocyte biology 1992, 52, 269-273.

[27] Ward, C., Dransfield, I., Chilvers, E. R., Haslett, C., Rossi, A. G., Pharmacological

manipulation of granulocyte apoptosis: potential therapeutic targets. Trends in

pharmacological sciences 1999, 20, 503-509.

[28] Kirkham, P. A., Spooner, G., Rahman, I., Rossi, A. G., Macrophage phagocytosis of

apoptotic neutrophils is compromised by matrix proteins modified by cigarette smoke

and lipid peroxidation products. Biochemical and biophysical research communications

2004, 318, 32-37.

[29] Haslett, C., Granulocyte apoptosis and its role in the resolution and control of lung

inflammation. American journal of respiratory and critical care medicine 1999, 160, S5-

11.

[30] Hart, S. P., Dougherty, G. J., Haslett, C., Dransfield, I., CD44 regulates phagocytosis

of apoptotic neutrophil granulocytes, but not apoptotic lymphocytes, by human

macrophages. J Immunol 1997, 159, 919-925.

[31] Sies, H., Oxidative stress: oxidants and antioxidants. Exp Physiol 1997, 82, 291-295.

[32] Russo, M., Cocco, S., Secondo, A., Adornetto, A., et al., Cigarette smoke condensate causes a decrease of the gene expression of Cu-Zn superoxide dismutase, Mn superoxide dismutase, glutathione peroxidase, catalase, and free radical-induced cell injury in SH-SY5Y human neuroblastoma cells. Neurotox Res 2011, 19, 49-54.

[33] Meister, A., Anderson, M. E., Glutathione. Annual review of biochemistry 1983, 52,

711-760.

[34] Brown, L. A., Glutathione protects signal transduction in type II cells under oxidant

stress. The American journal of physiology 1994, 266, L172-177.

[35] Njalsson, R., Norgren, S., Physiological and pathological aspects of GSH

metabolism. Acta Paediatr 2005, 94, 132-137.


[36] Cho, H. Y., Kleeberger, S. R., Nrf2 protects against airway disorders. Toxicology and

applied pharmacology, 244, 43-56.

[37] Takada, K., Takahashi, K., Sato, S., Yasui, S., Cigarette smoke modifies bleomycin-

induced lung injury to produce lung emphysema. Tohoku J Exp Med 1987, 153, 137-

144.

[38] Decologne, N., Wettstein, G., Kolb, M., Margetts, P., et al., Bleomycin induces

pleural and subpleural fibrosis in the presence of carbon particles. Eur Respir J, 35,

176-185.

[39] Cisneros-Lira, J., Gaxiola, M., Ramos, C., Selman, M., Pardo, A., Cigarette smoke

exposure potentiates bleomycin-induced lung fibrosis in guinea pigs. American journal

of physiology 2003, 285, L949-956.

[40] Lv, X. B., Xie, F., Hu, K., Wu, Y., et al., Damaged DNA-binding protein 1 (DDB1) interacts with Cdh1 and modulates the function of APC/C(Cdh1). The Journal of biological chemistry 2010, 285, 18234-18240.

[41] Cappello, S., Attardo, A., Wu, X., Iwasato, T., et al., The Rho-GTPase cdc42

regulates neural progenitor fate at the apical surface. Nat Neurosci 2006, 9, 1099-

1107.

[42] Etienne-Manneville, S., Hall, A., Rho GTPases in cell biology. Nature 2002, 420,

629-635.

[43] Florian, M. C., Geiger, H., Concise review: polarity in stem cells, disease, and aging.

Stem cells (Dayton, Ohio), 28, 1623-1629.

[44] Jaffe, A. B., Kaji, N., Durgan, J., Hall, A., Cdc42 controls spindle orientation to

position the apical surface during epithelial morphogenesis. The Journal of cell biology

2008, 183, 625-633.

[45] Georgiou, M., Marinari, E., Burden, J., Baum, B., Cdc42, Par6, and aPKC regulate

Arp2/3-mediated endocytosis to control local adherens junction stability. Curr Biol

2008, 18, 1631-1638.

[46] Duncan, M. C., Peifer, M., Regulating polarity by directing traffic: Cdc42 prevents

adherens junctions from crumblin' aPart. The Journal of cell biology 2008, 183, 971-

974.


Chapter V

Proteomic Profiling of Nasal Epithelial Cells in Chronic Obstructive Pulmonary Disease


Proteomic profiling of nasal epithelial cells in chronic obstructive pulmonary disease

Bruno M. Alexandre1, Brian L. Hood2, Mai Sun2,

Thomas P. Conrads2* and Deborah Penque1*

1Laboratório de Proteómica, Departamento de Genética, Instituto Nacional de Saúde Dr. Ricardo Jorge (INSA-IP), Av. Padre Cruz 1649-016 Lisboa, Portugal and the 2Department of Pharmacology & Chemical Biology, University of Pittsburgh Cancer Institute, University of Pittsburgh

Keywords: Nasal epithelial cells, nasal brushing, chronic obstructive pulmonary

disease, lung, tobacco, cigarette smoke, proteomics.

*Corresponding authors: Deborah Penque, Ph.D., Laboratório de Proteómica, Departamento de Genética, Edifício INSA II, Instituto Nacional de Saúde Dr. Ricardo Jorge, INSA, I.P., Avenida Padre Cruz, 1649-016 Lisboa, Portugal, Tel: +351 21750 8137, Fax: +351 21752 6410, E-mail: [email protected]; and Thomas P. Conrads, Ph.D., 204 Craft Avenue, Suite B401, Pittsburgh, PA, 15213, Tel: 412-641-7556, Fax: 412-641-2356, E-mail: [email protected]


ABSTRACT

Chronic obstructive pulmonary disease (COPD), a chronic lung disease, is the fourth leading cause of death in the world. COPD is primarily characterized by the presence of airflow limitation resulting from inflammation and remodeling of small airways and is often associated with lung parenchymal destruction or emphysema.

Fresh nasal epithelial cells collected by a noninvasive brushing technique have been used as surrogates of lower airway cells in the investigation of respiratory diseases. Here, for the first time, a wider proteomic profiling of nasal epithelial cells from COPD patients in comparison with healthy smokers and nonsmokers was performed using a combination of 1D-PAGE and high-resolution liquid chromatography-tandem mass spectrometry. About 1173 proteins were confidently identified by at least two peptides and compared across conditions. Functional characterization by Ingenuity Pathway Analysis (IPA) revealed that about 40% of the significantly differentially expressed proteins in COPD nasal epithelial cells are related to cancer. The data also revealed that the unfolded protein response (UPR) is activated in COPD nasal epithelial cells, confirming previous evidence of the UPR in COPD. Upregulation of proteins related to drug metabolism and to the oxidative stress response, in particular the Nrf2-mediated oxidative stress response, was also observed. Further validation of these data by orthogonal methods will emphasize the value of using native nasal epithelial cells by showing primary molecular networks/pathways associated with COPD pathogenesis.


INTRODUCTION

Chronic obstructive pulmonary disease (COPD) is primarily characterized by the presence of airflow limitation resulting from inflammation and remodeling of small airways and is often associated with lung parenchymal destruction or emphysema [1]. Additionally, it has been recognized that COPD extends beyond the lung and that many patients have several systemic manifestations that can further impair functional capacity and health-related quality of life [2, 3]. COPD is a major cause of chronic morbidity and mortality throughout the world. Many people suffer from this disease for years and die prematurely from it or its complications. COPD is the fourth leading cause of death worldwide, and its morbidity and mortality are expected to rise as the population ages and mortality from cardiovascular and infectious diseases falls [1]. Among respiratory diseases, COPD is the leading cause of lost work days. In the United States of America, medical costs attributed to COPD were estimated at $32.1 billion [4]. In the European Union, productivity losses are estimated at €28.5 billion annually [5]. These figures indicate that the total costs associated with the disease, including indirect ones, are substantial.

Cigarette smoke is the most commonly encountered risk factor for COPD. Cigarette

smokers have a higher prevalence of respiratory symptoms and lung function

abnormalities, a greater rate of decline in forced expiratory volume in the first second

(FEV1), and higher death rates from COPD than nonsmokers [1, 5]. A 25-year follow-up

study of the general population concluded that 92% of COPD deaths occurred in

subjects who were current smokers at the beginning of the follow up period and that,

after 25 years of smoking, at least 25% of smokers without initial disease will develop

clinically significant COPD and 30-40% will have COPD of any degree [6].

The main functions of the nose are the sense of smell, the regulation of humidity and

temperature of inhaled air, and the removal of large particulates from the inhaled air.

Once it enters the nasal vestibule, inhaled air is forced to pass through the nasal valve,

and then expands as it travels further in the nasal cavity, which offers little airflow

resistance. This sudden change in speed and pressure produces turbulence and eddies

[7]. These currents allow adequate contact of inhaled air with respiratory epithelium

due to the presence of three shelf-like projections, the conchae or turbinates. The

inferior turbinate has a central osseous core surrounded by lamina propria covered

with pseudostratified ciliated columnar (respiratory) epithelium resting over a thick

basement membrane [7]. The respiratory epithelium is composed of four types of cells,

namely, nonciliated and ciliated columnar cells, basal cells and goblet cells. The major

function of the nasal epithelium has been regarded to be primarily that of a physical

barrier, but recent evidence strongly supports that epithelial cells are quite active

metabolically and capable of modulating a variety of inflammatory processes and

immune responses [8, 9].

Culture conditions of airway epithelial cells, their proliferation and immortalization

may influence their protein expression levels and therefore modify cellular processes.

In the present work, freshly obtained epithelial cells were collected by nasal brushing.

This technique has been shown to be capable of yielding numerous and well-preserved

dissociated cells that are representative of the human superficial respiratory mucosa

[10]. Our group has already applied proteomic approaches to nasal

epithelial cells to describe the proteomic profiles of other chronic lung diseases, such

as the monogenic disease cystic fibrosis, by means of two-dimensional gel

electrophoresis and surface-enhanced laser desorption/ionization time-of-flight (SELDI-

TOF)-MS [11, 12], but no studies on COPD have been reported so far. Nasal epithelial

cells were also reported to constitute an accessible surrogate for studying lower airway

inflammation [13]. Identification of disease-specific, severity-related biomarkers is an

essential step for diagnosis and monitoring of therapeutics in COPD patients. In the

past decade, sample preparation techniques have greatly evolved, but it was mass

spectrometry (MS), with new capabilities such as high resolution and high mass accuracy, that

provided proteomics with effective means for high-throughput, comprehensive,

comparative examination of protein expression in healthy and disease states.

Proteomics is now widely used and it is responsible for a better understanding of

biological processes that ultimately leads to the discovery of new biomarkers. In the

present work, four groups were constituted: healthy nonsmokers, healthy smokers, and

patients with mild or severe COPD. Protein extracts derived from nasal epithelial

cells collected by nasal brushing were loaded onto a 1D-PAGE gel, digested in-gel, and the resulting peptides were separated by

reverse-phase liquid chromatography prior to MS analysis on a high-resolution mass

spectrometer.

MATERIALS AND METHODS

Individuals and Sample Collection

The study was approved by the Ethics Committee of Hospital de Santa Maria, Lisbon

and Instituto Nacional de Saúde Dr. Ricardo Jorge (INSA-IP), Lisbon. After informed

consent, nasal epithelial cells were collected by nasal brushing as previously described

[10, 14], from healthy nonsmokers (n=10), healthy cigarette smokers (n=8), and patients with mild

COPD (n=9) and severe COPD (n=6). Lung function was evaluated by means of

spirometry and, according to the Global Initiative for Chronic Obstructive Lung Disease

(GOLD) guidelines, FEV1/FVC < 0.7 was set as the criterion for obstructed lung

function [1]. Each group was split into two to obtain two biological replicates. The main

characteristics of each of the biological replicates are displayed in Table V.1.
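The fixed-ratio criterion for airflow obstruction can be illustrated with a minimal sketch. The function name and the spirometry values below are hypothetical and are not taken from the study; only the 0.7 threshold comes from the text.

```python
def is_obstructed(fev1_l: float, fvc_l: float, threshold: float = 0.7) -> bool:
    """Return True when FEV1/FVC falls below the fixed threshold.

    fev1_l and fvc_l are volumes in litres (hypothetical inputs);
    the default threshold of 0.7 is the criterion quoted in the text.
    """
    if fvc_l <= 0:
        raise ValueError("FVC must be positive")
    return (fev1_l / fvc_l) < threshold


# Hypothetical examples, not study data
print(is_obstructed(1.8, 3.0))  # ratio 0.6, below the threshold
print(is_obstructed(2.4, 3.0))  # ratio 0.8, above the threshold
```

Note that Table V.1 reports FEV1/FVC as a percentage, so values from the table would need to be divided by 100 before being compared with 0.7.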

Table V.1: Demographics of biological replicates.

| Group | Group n | Biological replicate | Replicate n | Age (y) | FVC (%) | FEF (%) | FEV1 (%) | FEV1/FVC (%) |
|---|---|---|---|---|---|---|---|---|
| Nonsmokers | 10 | 1 | 5 | 54 ± 2.9 | 97 ± 20.5 | 81 ± 17.9 | 98 ± 19.7 | 86 ± 4.1 |
| | | 2 | 5 | 53 ± 3.6 | 98 ± 10.4 | 84 ± 17.6 | 98 ± 13.8 | 88 ± 11.8 |
| Smokers | 8 | 1 | 4 | 52 ± 7.7 | 112 ± 19.5 | 68 ± 6.5 | 108 ± 15.6 | 79 ± 3.1 |
| | | 2 | 4 | 50 ± 1.7 | 103 ± 18.9 | 76 ± 24.0 | 98 ± 21.9 | 80 ± 1.9 |
| COPD Mild | 9 | 1 | 4 | 67 ± 11.8 | 97 ± 11.4 | 45 ± 24.4 | 81 ± 20.9 | 65 ± 12.5 |
| | | 2 | 5 | 66 ± 9.8 | 122 ± 33.3 | 47 ± 13.2 | 102 ± 23.6 | 67 ± 7.6 |
| COPD Severe | 6 | 1 | 3 | 64 ± 6.6 | 63 ± 21.9 | 7 ± 2.3 | 24 ± 4.9 | 35 ± 12.1 |
| | | 2 | 3 | 64 ± 3.8 | 106 ± 50.1 | 33 ± 21.8 | 60 ± 49.9 | 40 ± 15.9 |

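Table V.2: Smoking history of biological replicates.

| Group | Group n | Biological replicate | Replicate n | Smoking history (y) | Cigarettes per day |
|---|---|---|---|---|---|
| Nonsmokers | 10 | 1 | 5 | - | - |
| | | 2 | 5 | - | - |
| Smokers | 8 | 1 | 4 | 31 ± 12.2 | 21 ± 6.3 |
| | | 2 | 4 | 29 ± 4.2 | 15 ± 4.1 |
| COPD Mild | 9 | 1 | 4 | 34 ± 9.7 | 35 ± 12.9 |
| | | 2 | 5 | 27 ± 8.2 | 34 ± 16.7 |
| COPD Severe | 6 | 1 | 3 | 39 ± 3.2 | 37 ± 11.5 |
| | | 2 | 3 | 41 ± 9.0 | 33 ± 25.7 |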

Except for the individuals included in the nonsmoker group, all subjects were required to

have a smoking history of at least 20 years, smoking a minimum of 10 cigarettes per

day. Information on smoking history can be found in Table V.2. Cell suspensions from

each individual were cytospun onto a microscope slide, stained with May-Grünwald-

Giemsa (MGG) and examined for evidence of epithelial cells (ciliated, goblet

and basal cells) and for red blood cell contamination (Figure IV.1). Individuals whose

cell preparation was contaminated were removed from the study.

Nasal Epithelial Cells Lysis

Cell suspensions were centrifuged after collection and pelleted cells were resuspended

in the presence of 10 mM Tris-Cl pH 7.6 in 1 mM EDTA containing protease inhibitors.

Cells were lysed by intermittent sonication (10 cycles of a 10 s pulse followed

by a 30 s pause on dry ice). Lysates were centrifuged twice at 2000 x g for 3 min at 4 °C

to discard any unlysed cells or cell debris. Before storing at -80 °C, an aliquot of 10 μL

from each individual was removed to perform a BCA protein assay (Pierce, Rockford IL,

USA).

Sample Preparation for LC-MS/MS

Two biological replicates were constituted within each of the groups under analysis

(Table V.1). Each biological replicate containing 30 μg of total cell lysate was spiked

with 3 pmol of chicken ovalbumin. Each sample was loaded in duplicate lanes

of a 4-12% bis-tris 1D SDS-PAGE gel (NuPAGE, Invitrogen, Carlsbad, CA) and

electrophoresed for approximately 10 min at a constant voltage of 150 V. Gels were

stained with Coomassie blue (SimplyBlue SafeStain, Invitrogen) and each pair of bands

belonging to the same sample was excised, sliced into small pieces and pooled

in the same tube. Gel slices were destained in 50% acetonitrile (AcN) and

50 mM ammonium bicarbonate (AMB) overnight at 4 °C and, the next morning, for

another hour. Fully destained gel slices were dehydrated in 100% AcN. Gel slices were

then rehydrated in 25 mM AMB containing 20 µg/mL porcine sequencing grade

modified trypsin (Promega, Madison, WI) on ice for 45 min. This solution was

discarded and a 25 mM AMB solution was added to the gel slices and incubated

overnight at 37 °C. Tryptic peptides were extracted with 70% AcN and 5% formic acid

(FA) and dried by vacuum centrifugation. Each digest was resuspended in 60 µL of 0.1%

trifluoroacetic acid (TFA).

Proteomic analysis by liquid chromatography-tandem mass spectrometry

Peptide digests were resolved by nanoflow reverse-phase liquid chromatography

(Ultimate 3000, Dionex Inc.) coupled online via electrospray ionization to a hybrid

linear ion trap-Orbitrap mass spectrometer (LTQ-Orbitrap, ThermoFisher Scientific,

Inc., San Jose, CA). Five injections of 2 µL of peptide extracts corresponding to 1 μg

total protein were resolved on 100 μm i.d. by 360 μm o.d. by 200 mm long fused silica

capillary columns (Polymicro Technologies, Phoenix, AZ) slurry-packed in-house with 5

μm, 300 Å pore size C-18 silica-bonded stationary phase (Jupiter, Phenomenex,

Torrance, CA). After sample injection, peptides were eluted from the column using a

linear gradient of 2% mobile phase B (100% AcN and 0.1% formic acid) to 40% mobile

phase B over 125 min at a constant flow rate of 200 nL/min followed by a column wash

consisting of 95% B for an additional 30 min at a constant flow rate of 400 nL/min. The

LTQ-Orbitrap MS was configured to collect high resolution (R=60,000 at m/z 400)

broadband mass spectra (m/z 375-1800) from which the thirteen most abundant

peptide molecular ions dynamically determined from the MS scan were selected for

tandem MS using a relative CID energy of 30%. Dynamic exclusion was utilized to

minimize redundant selection of peptides for CID.
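The elution programme described above can be written as a small piecewise function. This is only a sketch: it assumes the gradient starts at t = 0 immediately after sample injection and that the switch to the 95% B wash is instantaneous, neither of which is stated in the text. The function name is hypothetical.

```python
def percent_b(t_min: float) -> float:
    """Percentage of mobile phase B at t_min minutes after injection.

    Linear gradient from 2% to 40% B over 125 min (200 nL/min in the text),
    followed by a 30 min wash at 95% B (400 nL/min in the text).
    """
    gradient_end = 125.0             # min, from the text
    wash_end = gradient_end + 30.0   # min, from the text
    if t_min < 0 or t_min > wash_end:
        raise ValueError("time outside the described programme")
    if t_min <= gradient_end:
        # slope = (40 - 2) / 125 = 0.304 % B per minute
        return 2.0 + (40.0 - 2.0) * t_min / gradient_end
    return 95.0


for t in (0, 62.5, 125, 140):
    print(t, percent_b(t))
```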

Peptide Identification and Spectral Count Analysis

Peptide identifications were obtained by searching the LC-MS/MS data utilizing

SEQUEST (Thermo Scientific BioWorks 3.2) on a 72 node Beowulf cluster against a

UniProt-derived human proteome database (version 03/10) obtained from the

European Bioinformatics Institute (EBI). Search parameters consisted of enzyme:

trypsin (KR); enzyme limits: full enzymatic-cleavage at both ends; missed cleavages

sites: 2; peptide tolerance: 20 ppm; fragment ion tolerance: 0.5 amu; and variable

modification of methionine (oxidation, +15.99492 Da). Resulting peptide identifications were

filtered according to specific SEQUEST scoring criteria [delta correlation (ΔCn) ≥ 0.08

and charge state dependent cross correlation (Xcorr) ≥ 1.9 for [M+H]1+ (mass+proton),

≥ 2.2 for [M+2H]2+, ≥ 3.5 for [M+3H]3+ and ≥ 3.0 for [M+4H]4+]. Differences in

protein abundance between the samples were derived by spectral counting (SC).

Peptides whose sequence mapped to multiple protein isoforms were grouped as per

the principle of parsimony [15]. A value of 0.5 was added to each spectral count value

prior to log2 transformation to enable ratio values to be calculated for proteins

identified in one group, but not another [16]. Proteins which exhibited a >95%

confidence interval from the mean for each comparison performed were considered

statistically significant.
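The pseudocount step described above can be sketched as follows. The function name and the example counts are hypothetical; only the added value of 0.5 and the log2 transformation come from the text.

```python
import math


def log2_ratio(sc_a: float, sc_b: float, pseudocount: float = 0.5) -> float:
    """log2 ratio of two spectral counts after adding a pseudocount to each.

    Adding 0.5 keeps the ratio finite when a protein is identified
    in one group but has zero spectral counts in the other.
    """
    return math.log2(sc_a + pseudocount) - math.log2(sc_b + pseudocount)


# Hypothetical counts: 15 spectra in group A, none in group B
print(round(log2_ratio(15, 0), 2))
# Equal counts give a ratio of zero
print(log2_ratio(4, 4))
```

Without the pseudocount, the first example would require dividing by zero; with it, the protein receives a large but finite positive ratio.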

Bioinformatic analyses

UniProt accessions corresponding to proteins identified by at least two peptides were

mapped to HUGO (HGNC) gene symbols utilizing Ingenuity Pathway Analysis (IPA)

(Ingenuity® Systems, www.ingenuity.com). Accessions which failed to map were

converted to IPI identifiers with the mapping utility available at www.uniprot.org and

remapped to IPA to maximize protein identifications available for downstream

bioinformatic analyses. Protein localization and subtype assignments were derived

from IPA-mapped data sets. Functional analysis of significant protein lists was

performed utilizing the “Core Analysis” function in IPA using default parameters

(p<0.05, Fisher’s exact test). Network and protein interaction analyses were also

performed utilizing IPA for significant proteins, with a maximum of 35 proteins

allowed per network assignment. ProteinCenter (Thermo Fisher Scientific) was used

to retrieve information on gene ontology terms to annotate proteins according to

cellular component, biological process and molecular function using default

parameters.
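As an illustration of the kind of test behind these p-values, the sketch below computes a right-tailed Fisher's exact test from the hypergeometric distribution. It assumes the usual over-representation setup (a reference set, a subset annotated to a function, and a submitted list); how IPA defines its reference set is not described here, and the function name and all numbers are hypothetical.

```python
from math import comb


def right_tailed_fisher_p(overlap: int, list_size: int,
                          annotated: int, reference_size: int) -> float:
    """P(X >= overlap) for X ~ Hypergeometric(reference_size, annotated, list_size).

    overlap        : proteins in the submitted list annotated to the function
    list_size      : size of the submitted list
    annotated      : proteins in the reference set annotated to the function
    reference_size : size of the reference set
    """
    total = comb(reference_size, list_size)
    upper = min(list_size, annotated)
    tail = sum(
        comb(annotated, i) * comb(reference_size - annotated, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical toy numbers: 2 of 4 submitted proteins hit a function
# that annotates 5 of 20 reference proteins
print(round(right_tailed_fisher_p(2, 4, 5, 20), 4))
```

A function would be called significant at the threshold used in the text when this tail probability falls below 0.05.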

RESULTS AND DISCUSSION

Global proteomics analysis of nasal epithelial cells

Two biological replicates were constituted for each of the groups under analysis and

five injections corresponding to 1 µg total protein from each of the biological replicates

were analyzed by LC-MS/MS using a high-resolution mass spectrometer, resulting in the

identification of 89968 peptides and 1475 proteins in total (Supplemental Table V.1,

Supporting Information), of which 1173 were identified by at least two peptides across

the samples under analysis (Supplemental Table V.2, Supporting Information). The

equivalency of protein digests was determined by comparing the total number of peptides

identified (total spectral counts) in each of the analytical samples, which resulted in a

calculated relative standard deviation (RSD) of 9.3%. This was also evaluated through

comparison of total peptides identified for chicken ovalbumin, which was added in

equal amounts to each of the analyzed samples according to the workflow displayed in

Figure IV.2. The calculated RSD for ovalbumin was 7.7%, which is consistent with the RSD

obtained for total spectral counts. Equivalency in peptide load was also determined by

comparison of the spectral count values for the “housekeeping” protein actin,

cytoplasmic 1 (ACTB, UniProt Accession (Acc): P60709), commonly used to correct for

protein loading in western blot analysis [17], which revealed an RSD of 12.7%. As

reported in Chapter IV, when performing the nasal brushing procedure, some local

bleeding may occur, resulting in the identification of proteins such as hemoglobin

subunits alpha, beta, delta and gamma-1 (HBA1, HBB, HBD and HBG1, respectively),

band 3 anion transport protein (SLC4A1) or erythrocyte band 7 integral membrane

protein (STOM). Epithelial origin of the samples was confirmed by the identification of

proteins such as keratin type I cytoskeletal 19 (KRT19, Acc: P08727), keratin type II

cytoskeletal 8 (KRT8, Acc: P05787), palate, lung and nasal epithelium carcinoma

(PLUNC, Acc: Q9NP55), long palate, lung and nasal epithelium carcinoma (LPLUNC, Acc:

Q8TDL5), epithelial cell adhesion molecule (EPCAM, Acc: P16422), and a handful of

mucins such as mucin-1 (MUC1, Acc: P15941), mucin-2 (MUC2, Acc: Q02817), mucin-4

(MUC4, Acc: Q99102), mucin-5AC (MUC5AC, Acc: P98088) or mucin-5B (MUC5B, Acc:

Q9HC84), which are reported to be expressed at relatively high levels in the human

respiratory tract when compared to other mucin genes [18]. In an attempt to assess

proteins related to epithelium, a list of 111 proteins that matched human

epithelium when cross-referenced to the UniProt database was generated; it is

available in Supplemental Table V.3, Supporting Information. However, visual

inspection shows that this list is not as accurate as expected since, for

instance, it lacks proteins such as mucins 1, 2 and 4, which are expressed in the epithelium.

Cellular location and functional type of each of the proteins identified by two or more

peptides are listed as Supplemental Table V.4A, Supporting Information. These

proteins were also grouped according to their cellular components (Gene Ontology)

and to the number of transmembrane domains by Protein Center (Thermo Fisher

Scientific). This information is available as Supplemental Table V.4B and Supplemental

Figure V.1, Supporting Information, respectively. A very recent and comprehensive

work performed by our group on the nasal epithelium using 2D-LC-MS/MS generated

1482 protein identifications [19]. Comparing this set to all protein identifications in the

present work resulted in an overlap of about one third (702 proteins), while one third

of proteins were uniquely identified in each of the two studies. These

results are mainly due to differences in the strategies used in the two studies, especially with

regard to cell fractionation and separation of the samples before analysis by MS. Venn

diagrams displaying this comparison can be found in Supplemental Figures V.2A and

V.2B, Supporting Information, in percentage and in number of protein identifications,

respectively.
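The "one third" proportions quoted above follow from simple set arithmetic on the reported totals. The sketch below assumes that the comparison used all 1475 identifications of the present work and that identifiers matched one-to-one between the two studies; the function name is hypothetical.

```python
def venn_summary(n_a: int, n_b: int, n_shared: int) -> dict:
    """Sizes and fractions of the three regions of a two-set Venn diagram."""
    only_a = n_a - n_shared
    only_b = n_b - n_shared
    union = only_a + only_b + n_shared
    return {
        "only_a": only_a,
        "only_b": only_b,
        "shared": n_shared,
        "union": union,
        "frac_only_a": only_a / union,
        "frac_only_b": only_b / union,
        "frac_shared": n_shared / union,
    }


# 1482 identifications in the earlier 2D-LC-MS/MS study [19],
# 1475 in the present work, 702 in common (values from the text)
summary = venn_summary(1482, 1475, 702)
for key, value in summary.items():
    print(key, round(value, 3) if isinstance(value, float) else value)
```

Under these assumptions the three regions hold 780, 773 and 702 proteins out of 2255 in total, that is, roughly 35%, 34% and 31%.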

Comparative proteomics analysis of COPD patients vs. healthy individuals

In order to compare the proteome of COPD patients with that of healthy

individuals, all biological replicates from the mild and severe COPD groups were combined

and the same was done for biological replicates from smoker and nonsmoker groups.

After submitting spectral counts of proteins identified within these two groups (COPD

patients and healthy individuals) to a t-test, 47 proteins exhibited a >95% confidence

interval and were therefore considered to be significantly differentially expressed.

However, one entry, Ig kappa chain V-I region Wes (UniProt Acc: P01611),

has no HGNC symbol and was not recognized by Ingenuity pathway analysis

(IPA) upon submission; therefore, 46 proteins were considered in this comparative

analysis. Figure V.1 shows the hierarchical clustering of these proteins according to their

relative expression. Information on the identity of these proteins, on their relative

differential expression, as well as on their cellular location and functional type, is

presented in Table V.3. This list of 46 proteins was submitted to IPA’s Core Analysis,

where 44 proteins were found to be eligible for network analysis and 43 to be eligible

for functions and pathway analysis.

Figure V.1: Hierarchical clustering of the significantly differentially expressed

proteins between COPD patients and healthy individuals. Protein abundances are

displayed as normalized expression.

Grouping significantly differentially expressed proteins into networks generated 3

different networks. Statistical scores, number of significantly differentially expressed

proteins, as well as the biological functions associated with these networks are shown

in Table V.4. The network with the highest statistical score incorporates 18 proteins

and is related to cellular assembly and organization, lipid metabolism and small

molecule biochemistry (Figure V.2). Networks 2 and 3 are available as Supplemental

Figures V.3 and V.4, Supporting Information. Interestingly, the three networks

generated by IPA were interconnected (Supplemental Figure V.5, Supporting

Information), making it possible to merge them into a single network comprising all

44 proteins found to be eligible for network analysis (Figure V.3), that is, the

significantly differentially expressed proteins in COPD patients

when compared to healthy individuals. Information on all proteins displayed in this

network (Figure V.3) is provided in Supplemental Table V.5, Supporting Information.

Table V.3: Differentially expressed proteins in COPD patients when compared to healthy individuals exhibiting a >95% confidence interval. Fold change along with cellular location and functional type retrieved from the Ingenuity Knowledgebase (Ingenuity Systems) are also provided.

| UniProt Acc. | HGNC Symbol | Entrez Gene Name | COPD/Healthy Log2 Ratio | Cellular Location | Functional Type |
|---|---|---|---|---|---|
| P24752 | ACAT1 | acetyl-CoA acetyltransferase 1 | 0.85 | Cytoplasm | enzyme |
| Q99798 | ACO2 | aconitase 2, mitochondrial | 0.35 | Cytoplasm | enzyme |
| Q86TX2 | ACOT1 | acyl-CoA thioesterase 1 | 3.17 | Cytoplasm | enzyme |
| O43707 | ACTN4 | actinin, alpha 4 | -0.63 | Cytoplasm | other |
| P42330 | AKR1C3 | aldo-keto reductase family 1, member C3 (3-alpha hydroxysteroid dehydrogenase, type II) | 1.93 | Cytoplasm | enzyme |
| P07355-1 | ANXA2 | annexin A2 | 0.27 | Plasma Membrane | other |
| P08758 | ANXA5 | annexin A5 | 0.49 | Plasma Membrane | other |
| P30042-1 | C21orf33 | chromosome 21 open reading frame 33 | 1.56 | Cytoplasm | other |
| P27824 | CANX | calnexin | -1.00 | Cytoplasm | other |
| Q13938 | CAPS | calcyphosine | 0.58 | Cytoplasm | other |
| P13688-1 | CEACAM1 (includes others) | carcinoembryonic antigen-related cell adhesion molecule 1 (biliary glycoprotein) | -3.00 | Plasma Membrane | transmembrane receptor |
| O00748-1 | CES2 | carboxylesterase 2 | 2.03 | Cytoplasm | enzyme |
| P12277 | CKB | creatine kinase, brain | 1.94 | Cytoplasm | kinase |
| P30084 | ECHS1 | enoyl CoA hydratase, short chain, 1, mitochondrial | 0.93 | Cytoplasm | enzyme |
| P07099 | EPHX1 | epoxide hydrolase 1, microsomal (xenobiotic) | -1.00 | Cytoplasm | peptidase |
| O75477 | ERLIN1 | ER lipid raft associated 1 | -2.81 | Plasma Membrane | other |
| Q96BQ1 | FAM3D | family with sequence similarity 3, member D | -2.59 | Extracellular Space | cytokine |
| P22570-1 | FDXR | ferredoxin reductase | 3.20 | Cytoplasm | enzyme |
| P21333-1 | FLNA | filamin A, alpha | -2.00 | Cytoplasm | other |
| Q9HC38-1 | GLOD4 | glyoxalase domain containing 4 | 1.66 | Cytoplasm | enzyme |
| P43304-1 | GPD2 | glycerol-3-phosphate dehydrogenase 2 (mitochondrial) | -3.86 | Cytoplasm | enzyme |
| Q9UBQ7 | GRHPR | glyoxylate reductase/hydroxypyruvate reductase | 1.90 | Cytoplasm | enzyme |
| P00390-1 | GSR | glutathione reductase | 0.41 | Cytoplasm | enzyme |
| P09211 | GSTP1 | glutathione S-transferase pi 1 | 0.33 | Cytoplasm | enzyme |
| P51858 | HDGF | hepatoma-derived growth factor | 1.70 | Extracellular Space | growth factor |
| Q13151 | HNRNPA0 | heterogeneous nuclear ribonucleoprotein A0 | 2.42 | Nucleus | other |
| P61604 | HSPE1 | heat shock 10kDa protein 1 (chaperonin 10) | 1.84 | Cytoplasm | enzyme |
| P19013 | KRT4 | keratin 4 | -2.43 | Cytoplasm | other |
| P25325 | MPST | mercaptopyruvate sulfurtransferase | 1.09 | Cytoplasm | enzyme |
| P35579-1 | MYH9 | myosin, heavy chain 9, non-muscle | -2.18 | Cytoplasm | enzyme |
| Q9NR45 | NANS | N-acetylneuraminic acid synthase | 0.94 | Cytoplasm | enzyme |
| Q56VL3-1 | OCIAD2 | OCIA domain containing 2 | -2.32 | Cytoplasm | other |
| P05166 | PCCB | propionyl CoA carboxylase, beta polypeptide | 1.30 | Cytoplasm | enzyme |
| Q10713 | PMPCA | peptidase (mitochondrial processing) alpha | 2.64 | Cytoplasm | peptidase |
| P30044-1 | PRDX5 | peroxiredoxin 5 | 0.68 | Cytoplasm | enzyme |
| Q99873-1 | PRMT1 | protein arginine methyltransferase 1 | 2.00 | Nucleus | enzyme |
| P12724 | RNASE3 | ribonuclease, RNase A family, 3 | -3.17 | Cytoplasm | enzyme |
| Q15393-1 | SF3B3 | splicing factor 3b, subunit 3, 130kDa | 1.78 | Nucleus | other |
| P12235 | SLC25A4 | solute carrier family 25 (mitochondrial carrier; adenine nucleotide translocator), member 4 | -0.89 | Cytoplasm | transporter |
| P05141 | SLC25A5 | solute carrier family 25 (mitochondrial carrier; adenine nucleotide translocator), member 5 | -3.00 | Cytoplasm | transporter |
| P11166 | SLC2A1 | solute carrier family 2 (facilitated glucose transporter), member 1 | 0.59 | Plasma Membrane | transporter |
| Q13630 | TSTA3 | tissue specific transplantation antigen P35B | 0.78 | Plasma Membrane | enzyme |
| Q16881-1 | TXNRD1 | thioredoxin reductase 1 | 0.79 | Cytoplasm | enzyme |
| P31930 | UQCRC1 | ubiquinol-cytochrome c reductase core protein I | -2.24 | Cytoplasm | enzyme |
| P21796 | VDAC1 | voltage-dependent anion channel 1 | -2.04 | Cytoplasm | ion channel |
| P45880-3 | VDAC2 | voltage-dependent anion channel 2 | -2.08 | Cytoplasm | ion channel |

Table V.4: Protein interaction networks generated by IPA from 44 proteins found to be eligible for network analysis among the 46 significantly differentially expressed proteins between COPD patients and healthy individuals.

| Score | Focus Molecules | Top Functions | Molecules in Network |
|---|---|---|---|
| 43 | 18 | Cellular Assembly and Organization, Lipid Metabolism, Small Molecule Biochemistry | ACAT1, ACAT2, Actin, ACTN4, ANG, ANXA2, ANXA5, CANX, CEACAM1, DHRS2 (includes EG:10202), ERK1/2, F Actin, FER (includes EG:2241), FLNA, GAS8, GSTP1, HDGF, HSPE1, Insulin, MYH9, NFkB (complex), PCCB, PDGF BB, PDGF-AA, S100P, SLC25A4, SLC25A5, SLC2A1, TCR, TIE1, TIMM17A (includes EG:10440), Tropomyosin, TXNRD1, VDAC1, VDAC2 |
| 34 | 15 | Cellular Development, Hematological System Development and Function, Hematopoiesis | ALDH2, ASB9, BCKDHA, BCKDHB, beta-estradiol, C21ORF33, CBR1, CD209, CKB, CLEC4E, dehydroisoandrosterone, ECHS1, EPHX1, ERLIN1, FDXR, GLOD4, GPD2, GSR, Histone h3, HSP90AB1, HSPD1, HSPE1, IKBKE, IL4, KRT4, MPST, MT-CYB, NANS, NFATC2IP, NR6A1, SF3B3, TNF, TPM3, TRAF6, UQCRC1 |
| 34 | 15 | Carbohydrate Metabolism, Small Molecule Biochemistry, Post-Translational Modification | ACIN1, ACO2, ACOT1 (includes EG:641371), AKR1C3, ALDH2, BCKDHA, BTG1, C21ORF33, CAPRIN1, CCT3, CES2 (includes EG:8824), CSDA, F7, FAM3D, GRHPR, HNF4A, HNRNPA0, HNRNPR, IDH3B, palmitoyl-CoA hydrolase, PMPCA, PRDX5, PRMT1, PXR ligand-PXR-Retinoic acid-RXRα, RNASE3, SLC2A4, SRPRB, SSSCA1, SUPT5H, SYNCRIP, TSTA3, VDAC1, VDAC2, YBX2 |

Table V.5: Top 25 significant biofunctions generated from significantly differentially expressed proteins in COPD patients when compared to healthy individuals.

| Category | p-value range | Molecules |
|---|---|---|
| Cellular Assembly and Organization | 4.87E-05 - 3.97E-02 | CKB, PRMT1, SLC25A4, FLNA, ANXA5, MYH9, ANXA2, ACTN4, VDAC1 |
| Molecular Transport | 8.11E-05 - 4.8E-02 | SLC25A4, SLC2A1, PRDX5, GPD2, ACAT1, MYH9, VDAC1, EPHX1, VDAC2 |
| Nucleic Acid Metabolism | 8.11E-05 - 4.53E-02 | TSTA3, SLC25A4, GPD2, HSPE1, VDAC1 |
| Small Molecule Biochemistry | 8.11E-05 - 4.8E-02 | AKR1C3, PRDX5, ACO2, PCCB, CES2, CKB, PRMT1, GPD2, ANXA5, HSPE1, ACOT1, TSTA3, SLC25A4, ECHS1, SLC2A1, MPST, ANXA2, FDXR, TXNRD1, ACAT1, MYH9, NANS, VDAC1, EPHX1 |
| Cancer | 7.26E-04 - 3.69E-02 | RNASE3, ECHS1, SLC2A1, AKR1C3, CANX, ANXA2, SLC25A5, FDXR, VDAC2, Ceacam1, PRMT1, FLNA, ANXA5, HSPE1, ACAT1, EPHX1, GSTP1 |
| Lipid Metabolism | 7.26E-04 - 4.8E-02 | ECHS1, SLC2A1, AKR1C3, ANXA5, ACAT1, ACOT1, MYH9, ANXA2, PCCB, FDXR, EPHX1 |
| Reproductive System Disease | 7.54E-04 - 2.61E-02 | SLC2A1, PRDX5, FLNA, ANXA5, ANXA2, VDAC1 |
| Energy Production | 7.9E-04 - 4.53E-02 | SLC25A4, GPD2, HSPE1, ACO2, VDAC1, FDXR |
| Cellular Development | 8.36E-04 - 4.25E-02 | Ceacam1, PRMT1, RNASE3, ANXA5, MYH9, CES2, VDAC1 |
| Cellular Growth and Proliferation | 8.36E-04 - 4.25E-02 | RNASE3, AKR1C3, SLC2A1, CAPS, ANXA2, CES2, SF3B3, FDXR, TXNRD1, HDGF, Ceacam1, PRMT1, HNRNPA0, ACAT1, MYH9, ACTN4, VDAC1 |
| Respiratory System Development and Function | 8.36E-04 - 2.85E-02 | PRMT1, RNASE3 |
| Renal and Urological System Development and Function | 8.66E-04 - 4.25E-02 | Ceacam1, CES2, ACTN4, VDAC1 |
| Drug Metabolism | 2.16E-03 - 3.41E-02 | GSR, AKR1C3, ANXA2, CES2 |
| Endocrine System Development and Function | 2.16E-03 - 3.69E-02 | AKR1C3, FDXR |
| Amino Acid Metabolism | 2.89E-03 - 3.13E-02 | PRMT1, ANXA2, TXNRD1 |
| Antimicrobial Response | 2.89E-03 - 2.89E-03 | HSPE1 |
| Carbohydrate Metabolism | 2.89E-03 - 3.13E-02 | TSTA3, SLC2A1, GPD2, ANXA5, ACO2, MYH9, ANXA2, NANS |
| Cardiovascular System Development and Function | 2.89E-03 - 4.25E-02 | Ceacam1, SLC2A1, MYH9, ACTN4 |
| Cell Death | 2.89E-03 - 4.51E-02 | SLC25A4, RNASE3, SLC2A1, PRDX5, FDXR, TXNRD1, HDGF, VDAC2, GSR, Ceacam1, FLNA, HSPE1, MYH9, ACTN4, VDAC1, EPHX1 |
| Cell Morphology | 2.89E-03 - 4.25E-02 | SLC25A4, FLNA, ANXA5, MYH9, ANXA2, ACTN4 |
| Cell Signaling | 2.89E-03 - 3.69E-02 | FLNA, VDAC1 |
| Cell-To-Cell Signaling and Interaction | 2.89E-03 - 2.85E-02 | TSTA3, Ceacam1, RNASE3, ANXA5, MYH9, ANXA2, ACTN4, VDAC1 |
| Cellular Compromise | 2.89E-03 - 1.72E-02 | TSTA3, ANXA5 |
| Cellular Function and Maintenance | 2.89E-03 - 4.7E-02 | CKB, Ceacam1, PRMT1, SLC2A1, FLNA, MYH9, ACTN4, VDAC1, UQCRC1 |
| Cellular Movement | 2.89E-03 - 4.8E-02 | Ceacam1, SLC2A1, FLNA, ACAT1, MYH9 |

Figure V.2: Top protein network as obtained from Ingenuity pathway analysis.

The top 25 significant (p<0.05, Fisher’s exact test) biological functions (biofunctions), along

with the proteins involved in each biofunction, were also derived from IPA and are

displayed in Table V.5. Cellular assembly and organization was ranked first as it has

the lowest p-value. Complete information on the full list of biofunctions, including

details on function annotation, can be found in Supplemental Table V.6, Supporting

Information. According to the Ingenuity Knowledgebase, IPA divides biofunctions into

three branches: diseases and disorders, molecular and cellular

functions, and physiological system development and function. The top 10 significant

biofunctions for each of these three branches, along with the respective p-values

and the proteins involved in each biofunction, are displayed in Tables V.6, V.7 and

V.8, respectively. Within diseases and disorders, cancer was the top biofunction,

accounting for 17 proteins in total. This means that about 40% of the 43 significantly

differentially expressed proteins eligible for biofunction analysis by IPA are related to

cancer. This is not surprising, since cancer has been one of the most studied

subjects of the past decades.
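The "about 40%" figure can be checked directly from the cancer row of Table V.6 and the 43 proteins that IPA found eligible for functions and pathway analysis. The sketch below only recounts values given in the text; the variable names are hypothetical.

```python
# Molecules listed for the "Cancer" biofunction in Table V.6
cancer_proteins = (
    "RNASE3,ECHS1,SLC2A1,AKR1C3,CANX,ANXA2,SLC25A5,FDXR,VDAC2,"
    "Ceacam1,PRMT1,FLNA,ANXA5,HSPE1,ACAT1,EPHX1,GSTP1"
).split(",")

# Proteins eligible for functions and pathway analysis, as stated in the text
eligible_for_functions = 43

percent_cancer = 100 * len(cancer_proteins) / eligible_for_functions
print(len(cancer_proteins), round(percent_cancer, 1))
```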

Figure V.3: Merged network comprising all 44 proteins found eligible for networks

analysis by Ingenuity pathway analysis.

Table V.6: Top 10 significant biofunctions within diseases and disorders together with proteins involved in each biofunction.

| Category | p-value range | Molecules |
|---|---|---|
| Cancer | 7.26E-04 - 3.69E-02 | RNASE3, ECHS1, SLC2A1, AKR1C3, CANX, ANXA2, SLC25A5, FDXR, VDAC2, Ceacam1, PRMT1, FLNA, ANXA5, HSPE1, ACAT1, EPHX1, GSTP1 |
| Reproductive System Disease | 7.54E-04 - 2.61E-02 | SLC2A1, PRDX5, FLNA, ANXA5, ANXA2, VDAC1 |
| Antimicrobial Response | 2.89E-03 - 2.89E-03 | HSPE1 |
| Connective Tissue Disorders | 2.89E-03 - 2.89E-03 | FLNA |
| Developmental Disorder | 2.89E-03 - 4.8E-02 | FLNA |
| Genetic Disorder | 2.89E-03 - 4.8E-02 | SLC25A4, SLC2A1, ANXA2, PCCB, FDXR, GSR, GRHPR, FLNA, HSPE1, ACAT1, MYH9, ACTN4, EPHX1, KRT4 |
| Hematological Disease | 2.89E-03 - 3.69E-02 | RNASE3, GPD2, ANXA5, ACO2, MYH9, ANXA2, EPHX1 |
| Metabolic Disease | 2.89E-03 - 3.69E-02 | SLC2A1, ACAT1, PCCB |
| Renal and Urological Disease | 2.89E-03 - 4.8E-02 | GRHPR, MYH9, ACTN4 |
| Skeletal and Muscular Disorders | 2.89E-03 - 2.89E-03 | FLNA |

Table V.7: Top 10 significant biofunctions within molecular and cellular functions together with proteins involved in each biofunction.

| Category | p-value range | Molecules |
|---|---|---|
| Cellular Assembly and Organization | 4.87E-05 - 3.97E-02 | CKB, PRMT1, SLC25A4, FLNA, ANXA5, MYH9, ANXA2, ACTN4, VDAC1 |
| Molecular Transport | 8.11E-05 - 4.8E-02 | SLC25A4, SLC2A1, PRDX5, GPD2, ACAT1, MYH9, VDAC1, EPHX1, VDAC2 |
| Nucleic Acid Metabolism | 8.11E-05 - 4.53E-02 | TSTA3, SLC25A4, GPD2, HSPE1, VDAC1 |
| Small Molecule Biochemistry | 8.11E-05 - 4.8E-02 | AKR1C3, PRDX5, ACO2, PCCB, CES2, CKB, PRMT1, GPD2, ANXA5, HSPE1, ACOT1, TSTA3, SLC25A4, ECHS1, SLC2A1, MPST, ANXA2, FDXR, TXNRD1, ACAT1, MYH9, NANS, VDAC1, EPHX1 |
| Lipid Metabolism | 7.26E-04 - 4.8E-02 | ECHS1, SLC2A1, AKR1C3, ANXA5, ACAT1, ACOT1, MYH9, ANXA2, PCCB, FDXR, EPHX1 |
| Energy Production | 7.9E-04 - 4.53E-02 | SLC25A4, GPD2, HSPE1, ACO2, VDAC1, FDXR |
| Cellular Development | 8.36E-04 - 4.25E-02 | Ceacam1, PRMT1, RNASE3, ANXA5, MYH9, CES2, VDAC1 |
| Cellular Growth and Proliferation | 8.36E-04 - 4.25E-02 | RNASE3, AKR1C3, SLC2A1, CAPS, ANXA2, CES2, SF3B3, FDXR, TXNRD1, HDGF, Ceacam1, PRMT1, HNRNPA0, ACAT1, MYH9, ACTN4, VDAC1 |
| Drug Metabolism | 2.16E-03 - 3.41E-02 | GSR, AKR1C3, ANXA2, CES2 |
| Amino Acid Metabolism | 2.89E-03 - 3.13E-02 | PRMT1, ANXA2, TXNRD1 |

Table V.8: Top 10 significant biofunctions within physiological system development and function together with proteins involved in each biofunction.

| Category | p-value range | Molecules |
|---|---|---|
| Respiratory System Development and Function | 8.36E-04 - 2.85E-02 | PRMT1, RNASE3 |
| Renal and Urological System Development and Function | 8.66E-04 - 4.25E-02 | Ceacam1, CES2, ACTN4, VDAC1 |
| Endocrine System Development and Function | 2.16E-03 - 3.69E-02 | AKR1C3, FDXR |
| Cardiovascular System Development and Function | 2.89E-03 - 4.25E-02 | Ceacam1, SLC2A1, MYH9, ACTN4 |
| Embryonic Development | 2.89E-03 - 1.72E-02 | Ceacam1, HSPE1 |
| Nervous System Development and Function | 2.89E-03 - 1.44E-02 | MYH9, VDAC1 |
| Reproductive System Development and Function | 2.89E-03 - 3.97E-02 | Ceacam1, AKR1C3 |
| Skeletal and Muscular System Development and Function | 2.89E-03 - 4.25E-02 | MYH9, VDAC1, HDGF |
| Tissue Development | 2.89E-03 - 2.85E-02 | TSTA3, Ceacam1, MYH9, ACTN4 |
| Tumor Morphology | 2.89E-03 - 3.13E-02 | TSTA3, Ceacam1 |

Cigarette smoking exposes the lung to high concentrations of reactive oxygen species

(ROS) and it is the major risk factor for chronic obstructive pulmonary disease (COPD).

It is estimated that only 15-35% of chronic, continuous cigarette smokers develop

COPD [20-22]. Thus, the majority of long-term smokers do not develop COPD, which

suggests that failure of compensatory mechanisms that protect the lung from ROS or

xenobiotic materials contributes to the development of COPD. In this way, expression

of antioxidant proteins believed to be important in protection of the lung from

cigarette smoke-induced injuries such as peroxiredoxin, glutathione S-transferase or

glutathione reductase varies widely in airway epithelial cells harvested from chronic

cigarette smokers [23, 24]. Notably, results obtained from studies performed on

bronchial epithelial cells were found to be consistent with those obtained from

nasal epithelial cells [25]. Recent reports indicate that ROS interfere with protein

folding in the endoplasmic reticulum (ER), triggering a complex molecular cascade termed

the unfolded protein response (UPR) [26]. The UPR is one of the signaling pathways comprising

the proteostasis network, which regulates and maintains protein folding and function

in the face of many cellular challenges during cell lifetime [27, 28]. Activation of the

UPR compensates for abnormalities in protein folding by increasing the expression of

genes involved in protein chaperoning and folding, protein translation, and protein

degradation [26]. In the present work, a considerable number of proteins related to

UPR were confidently identified, although most of them did not pass the t-test and

were therefore not significantly differentially expressed. Transitional endoplasmic

reticulum ATPase, also known as valosin-containing protein (VCP), was found to be

overexpressed in COPD. VCP is associated with cellular functions comprising nuclear

envelope reconstruction, cell cycle, post-mitotic Golgi reassembly, suppression of

apoptosis, DNA damage response and endoplasmic reticulum-associated degradation

(ERAD) [29, 30]. VCP overexpression has been implicated in the chronic inflammation of

other lung diseases such as cystic fibrosis and lung cancer [31, 32]. It has been reported that

overexpression of VCP in COPD may induce protein aggregation, leading to chronic

oxidative stress, inflammation and apoptosis, which in turn may lead to severe

emphysema [30]. In the present study, both proteins from the calnexin/calreticulin

chaperone system were identified. Calnexin (CANX) is a 90 kDa type I ER membrane

protein and calreticulin (CALR) is a 60 kDa soluble ER lumen protein [33]. Both are

thought to play a central role in quality control of protein folding since they have the

capability of stabilizing nascent proteins until they are properly folded and assembled

or retaining incorrectly folded protein subunits within the ER for degradation by ERAD

mechanisms [22, 33-35]. CANX was found to be significantly underexpressed, while

CALR was found to be overexpressed in COPD patients. Endoplasmin, also known as

GRP94 (HSP90B1), and 78 kDa glucose-regulated protein, better known as GRP78 or BiP

(HSPA5), members of the GRP78/GRP94 chaperone system, were also identified in this

study, but no differential expression was observed between COPD patients and

healthy individuals [33]. Furthermore, we were able to confidently identify most of the

proteins that belong to a large ER-localized multiprotein complex which comprises

DNAJB11, HSP90B1, HSPA5, HYOU1, PDIA2, PDIA4, PDIA6, PPIB, SDF2L1, UGT1A1 and

ERP29 [36]. Of these, HSP90B1, HSPA5, HYOU1, PDIA4, PDIA6, PPIB and ERP29 were

identified. Although none of these proteins was found to be significantly differentially

expressed when submitted to a t-test, PPIB and ERP29 exhibited overexpression in COPD

patients based on their spectral counts, similar to what was observed for CALR.

Moreover, both components of the Hsp10/Hsp60 chaperone complex were also

identified in the present study. The 10 kDa heat shock protein, mitochondrial (HSPE1, also

known as Hsp10), a protein which functions as a chaperonin, was also found to be

overexpressed in COPD. Its structure consists of a heptameric ring which binds to

another HSP, the 60 kDa heat shock protein, mitochondrial (HSPD1, also known as Hsp60),

in order to form a symmetric functional heterodimer which enhances protein folding in

an ATP-dependent manner [37, 38]. Its partner chaperonin, HSPD1, was also

observed to be overexpressed in COPD patients, although it was not found to be

significantly differentially expressed according to the statistical test employed. Since

the processes involved in protein transport and folding consume ATP and generate

ROS, the UPR induces expression of a variety of genes involved in processes such as

antioxidant defense, inflammation, xenobiotic metabolism, energy metabolism,

protein synthesis and apoptosis [39-42].
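The distinction drawn above between proteins that "pass the t-test" and those that only show a trend in their spectral counts can be illustrated with a minimal sketch. It assumes Welch's unequal-variance form of the t-test (the text does not state which variant was used), and the spectral counts are hypothetical values, not data from this study. Only the t statistic and the degrees of freedom are computed; the p-value would be looked up from the t distribution with a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical spectral counts for one protein (illustration only).
copd_counts = [12, 15, 11, 14]
control_counts = [7, 9, 6, 8]

t_stat, dof = welch_t(copd_counts, control_counts)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

A protein would be called significantly differentially expressed only if the p-value derived from these two numbers falls below the chosen significance level.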

Glutathione S-transferase P (GSTP1), which was found to be overexpressed in the

proteome of nasal epithelial cells of COPD patients when compared to healthy

subjects, is known to play a role in detoxification by catalyzing the conjugation of many

hydrophobic and electrophilic compounds with reduced glutathione and has been

linked to cancer and other diseases including COPD. In fact, upregulation of GSTP1 was

found to be associated with lung carcinoma [43], while recently GSTP1 was found to be

significantly associated with emphysema severity in COPD patients [44]. Lung cancer

and COPD are leading causes of death (second and fourth, respectively), and both are

associated with cigarette smoke exposure. It has been shown that 50-70% of patients

diagnosed with lung cancer suffer from COPD, and reduced lung function is an

important event in lung cancer, suggesting an association between COPD and lung

cancer [45]. Simultaneous overexpression of GSTP1 and underexpression of epoxide

hydrolase 1 (EPHX1) has been associated with an increased risk of lung cancer among

smokers and, as a matter of fact, we also report underexpression of EPHX1 in COPD

patients, which is consistent with previous observations that lung cancer and COPD do

share some molecular events. Both GSTP1 and EPHX1 are part of the nuclear factor

erythroid 2-related factor 2 (Nrf2)-mediated oxidative stress response, an important

regulator of lung antioxidant defenses, which has been implicated in ER-stress induced

apoptosis in COPD [46]. Two other members of this pathway were found to be

among the significantly differentially expressed proteins: thioredoxin reductase 1,

cytoplasmic (TXNRD1) and glutathione-disulfide reductase (GSR), both found to be

overexpressed in COPD patients. Another antioxidant enzyme, peroxiredoxin-5,

mitochondrial (PRDX5), which does not belong to the Nrf2-mediated oxidative stress

response, was also observed to be overexpressed in COPD. As stated before, the UPR may

also induce xenobiotic metabolism, which has been associated with COPD as a result

of the amount of substances present in the cigarette smoke that are released into the

lungs. Besides GSTP1 and GSR, which were already mentioned, aldo-keto reductase

family 1 member 3 (AKR1C3), annexin A2 (ANXA2) and cocaine esterase (CES2) are also

associated with drug metabolism and they were all found to be overexpressed in

COPD.

CONCLUSION

COPD is one of the leading causes of death in the world and has been intensively

studied over the past decades. Biological samples currently used for the

assessment of COPD mechanisms include blood, sputum, bronchoalveolar fluid,

exhaled breath and bronchial biopsies. Nasal epithelial cells collected by nasal brushing

were shown to mimic the lower airway epithelium, with the advantage that their collection is

noninvasive and painless. Here, for the first time, freshly obtained nasal

epithelial cells were used to perform a proteomic study in COPD. In the present study,

we were able to identify 1475 proteins in total, expanding the knowledge

of the proteome of nasal epithelial cells, since we reported 769 proteins that had not

been described previously. Of the total 1475, 1173 proteins were identified by at least two

peptides, and only those were taken into account in the

comparative analysis between COPD patients and healthy individuals. Our data

confirmed previous evidence that the UPR is activated in COPD patients, since we were

able to observe overexpression of a considerable number of proteins belonging to

different protein complexes involved in the UPR. These include VCP, both

components of the Hsp10/Hsp60 chaperone complex (HSPD1 and HSPE1), CALR and

two members of a large ER-localized multiprotein complex of at least 11 proteins, PPIB

and ERP29. We also observed an increase in expression of proteins related to the

Nrf2-mediated oxidative stress response, such as GSTP1, TXNRD1 and GSR. Finally, we also

report an increase in drug metabolism, as all significantly differentially expressed

proteins related to this biofunction were overexpressed in COPD: GSTP1, GSR, AKR1C3

and ANXA2. These data need further validation by orthogonal methods so that the

activation of the UPR and the Nrf2-mediated oxidative stress response and the increase in

drug metabolism in the nasal epithelial cells of COPD patients can be fully confirmed. This

work also emphasizes the value of using nasal epithelial cells in the investigation of COPD

pathogenesis, which can lead to the identification of new candidate biomarkers for this

disease.

ACKNOWLEDGEMENTS

We thank all volunteers for their cooperation in this work. This work was partially supported by

FCT-FEDER POCI/SAU-MMO/56163/2004, FCT/Poly-Annual Funding Program and

FEDER-Saúde XXI Program (Portugal). BMA is the recipient of an FCT doctoral fellowship

(SFRH/BD/31415/2006).

REFERENCES

[1] Global initiative for chronic obstructive lung disease 2010.

[2] Barnes, P. J., Celli, B. R., Systemic manifestations and comorbidities of COPD. Eur

Respir J 2009, 33, 1165-1185.

[3] Barnes, P. J., Chronic obstructive pulmonary disease: effects beyond the lungs. PLoS

medicine 2010, 7, e1000220.

[4] Mannino, D. M., Buist, A. S., Global burden of COPD: risk factors, prevalence, and

future trends. Lancet 2007, 370, 765-773.

[5] European Respiratory Society and European Lung Foundation 2003.

[6] Lokke, A., Lange, P., Scharling, H., Fabricius, P., Vestbo, J., Developing COPD: a 25

year follow up study of the general population. Thorax 2006, 61, 935-939.

[7] Standring, S., Gray's Anatomy: The Anatomical Basis of Clinical Practice, 40th

Edition. Chapter 32. Nose, nasal cavity and paranasal sinuses, Churchill Livingstone

2008.

[8] Shaykhiev, R., Bals, R., Interactions between epithelial cells and leukocytes in

immunity and tissue homeostasis. Journal of leukocyte biology 2007, 82, 1-15.

[9] Pahl, A., Preclinical modelling using nasal epithelial cells for the evaluation of herbal

extracts for the treatment of upper airway diseases. Planta Med 2008, 74, 693-696.

[10] Beck, S., Penque, D., Garcia, S., Gomes, A., et al., Cystic fibrosis patients with the

3272-26A-->G mutation have mild disease, leaky alternative mRNA splicing, and CFTR

protein at the cell membrane. Hum Mutat 1999, 14, 133-144.

[11] Roxo-Rosa, M., da Costa, G., Luider, T. M., Scholte, B. J., et al., Proteomic analysis

of nasal cells from cystic fibrosis patients and non-cystic fibrosis control individuals:

search for novel biomarkers of cystic fibrosis lung disease. Proteomics 2006, 6, 2314-

2325.

[12] Gomes-Alves, P., Imrie, M., Gray, R. D., Nogueira, P., et al., SELDI-TOF biomarker

signatures for cystic fibrosis, asthma and chronic obstructive pulmonary disease. Clin

Biochem 2010, 43.

[13] McDougall, C. M., Blaylock, M. G., Douglas, J. G., Brooker, R. J., et al., Nasal

epithelial cells as surrogates for bronchial epithelial cells in airway inflammation

studies. American journal of respiratory cell and molecular biology 2008, 39, 560-568.

[14] Penque, D., Mendes, F., Beck, S., Farinha, C., et al., Cystic fibrosis F508del patients

have apically localized CFTR in a reduced number of airway cells. Laboratory

investigation; a journal of technical methods and pathology 2000, 80, 857-868.

[15] Marengo, E., Robotti, E., Bobba, M., Gosetti, F., The principle of exhaustiveness

versus the principle of parsimony: a new approach for the identification of biomarkers

from proteomic spot volume datasets based on principal component analysis.

Analytical and bioanalytical chemistry, 397, 25-41.

[16] McDonald, J. H., Handbook of Biological Statistics, Sparky House Publishing,

Baltimore, MD 2009.

[17] Ferguson, R. E., Carroll, H. P., Harris, A., Maher, E. R., et al., Housekeeping

proteins: a preliminary study illustrating some limitations as useful references in

protein expression studies. Proteomics 2005, 5, 566-571.

[18] Di, Y. P., Harper, R., Zhao, Y., Pahlavan, N., et al., Molecular cloning and

characterization of spurt, a human novel gene that is retinoic acid-inducible and

encodes a secretory protein specific in upper respiratory tracts. The Journal of

biological chemistry 2003, 278, 1165-1173.

[19] Simoes, T., Charro, N., Blonder, J., Faria, D., et al., Molecular profiling of the

human nasal epithelium: A proteomics approach. Journal of proteomics 2011.

[20] Rennard, S. I., Vestbo, J., COPD: the dangerous underestimate of 15%. Lancet

2006, 367, 1216-1219.

[21] Fletcher, C., Peto, R., Tinker, C., Speizer, F. E., The Natural History of Chronic

Bronchitis and Emphysema, Oxford University Press, Oxford, UK 1976.

[22] Kelsen, S. G., Duan, X., Ji, R., Perez, O., et al., Cigarette smoke induces an unfolded

protein response in the human lung: a proteomic approach. American journal of

respiratory cell and molecular biology 2008, 38, 541-550.

[23] Hackett, N. R., Heguy, A., Harvey, B. G., O'Connor, T. P., et al., Variability of

antioxidant-related gene expression in the airway epithelium of cigarette smokers.

American journal of respiratory cell and molecular biology 2003, 29, 331-343.

[24] Spira, A., Beane, J., Shah, V., Liu, G., et al., Effects of cigarette smoke on the

human airway epithelial cell transcriptome. Proceedings of the National Academy of

Sciences of the United States of America 2004, 101, 10143-10148.

[25] Sridhar, S., Schembri, F., Zeskind, J., Shah, V., et al., Smoking-induced gene

expression changes in the bronchial airway are reflected in nasal and buccal

epithelium. BMC genomics 2008, 9, 259.

[26] Gomes-Alves, P., Neves, S., Penque, D., Signaling pathways of proteostasis

network unrevealed by proteomic approaches on the understanding of misfolded

protein rescue. Methods in enzymology 2011, 491, 217-233.

[27] Powers, E. T., Morimoto, R. I., Dillin, A., Kelly, J. W., Balch, W. E., Biological and

chemical approaches to diseases of proteostasis deficiency. Annual review of

biochemistry 2009, 78, 959-991.

[28] Roth, D. M., Balch, W. E., Modeling general proteostasis: proteome balance in

health and disease. Curr Opin Cell Biol 2011, 23, 126-134.

[29] Vij, N., AAA ATPase p97/VCP: cellular functions, disease and therapeutic potential.

J Cell Mol Med 2008, 12, 2511-2518.

[30] Min, T., Bodas, M., Mazur, S., Vij, N., Critical role of proteostasis-imbalance in

pathogenesis of COPD and severe emphysema. J Mol Med 2011, 89, 577-593.

[31] Yamamoto, S., Tomita, Y., Hoshida, Y., Iizuka, N., et al., Expression level of valosin-

containing protein (p97) is correlated with progression and prognosis of non-small-cell

lung carcinoma. Annals of surgical oncology 2004, 11, 697-704.

[32] Vij, N., Fang, S., Zeitlin, P. L., Selective inhibition of endoplasmic reticulum-

associated degradation rescues DeltaF508-cystic fibrosis transmembrane regulator and

suppresses interleukin-8 levels: therapeutic implications. The Journal of biological

chemistry 2006, 281, 17369-17378.

[33] Ni, M., Lee, A. S., ER chaperones in mammalian development and human diseases.

FEBS letters 2007, 581, 3641-3651.

[34] Caramelo, J. J., Parodi, A. J., Getting in and out from calnexin/calreticulin cycles.

The Journal of biological chemistry 2008, 283, 10221-10225.

[35] Schroder, M., Kaufman, R. J., ER stress and the unfolded protein response.

Mutation research 2005, 569, 29-63.

[36] Meunier, L., Usherwood, Y. K., Chung, K. T., Hendershot, L. M., A subset of

chaperones and folding enzymes form multiprotein complexes in endoplasmic

reticulum to bind nascent proteins. Molecular biology of the cell 2002, 13, 4456-4469.

[37] Bross, P., Li, Z., Hansen, J., Hansen, J. J., et al., Single-nucleotide variations in the

genes encoding the mitochondrial Hsp60/Hsp10 chaperone system and their disease-

causing potential. J Hum Genet 2007, 52, 56-65.

[38] Hansen, J. J., Bross, P., Westergaard, M., Nielsen, M. N., et al., Genomic structure

of the human mitochondrial chaperonin genes: HSP60 and HSP10 are localised head to

head on chromosome 2 separated by a bidirectional promoter. Hum Genet 2003, 112,

71-77.

[39] Marciniak, S. J., Ron, D., Endoplasmic reticulum stress signaling in disease. Physiol

Rev 2006, 86, 1133-1149.

[40] Gorlach, A., Klappa, P., Kietzmann, T., The endoplasmic reticulum: folding, calcium

homeostasis, signaling, and redox control. Antioxidants & redox signaling 2006, 8,

1391-1418.

[41] Gregersen, N., Bross, P., Protein misfolding and cellular stress: an overview.

Methods in molecular biology (Clifton, N.J.) 2010, 648, 3-23.

[42] Harding, H. P., Zhang, Y., Zeng, H., Novoa, I., et al., An integrated stress response

regulates amino acid metabolism and resistance to oxidative stress. Molecular cell

2003, 11, 619-633.

[43] Hayes, J. D., Pulford, D. J., The glutathione S-transferase supergene family:

regulation of GST and the contribution of the isoenzymes to cancer chemoprotection

and drug resistance. Crit Rev Biochem Mol Biol 1995, 30, 445-600.

[44] Kim, W. J., Hoffman, E., Reilly, J., Hersh, C., et al., Association of COPD candidate

genes with computed tomography emphysema and airway phenotypes in severe

COPD. Eur Respir J 2011, 37, 39-43.

[45] Yao, H., Rahman, I., Current concepts on the role of inflammation in COPD and

lung cancer. Current opinion in pharmacology 2009, 9, 375-383.

[46] Malhotra, D., Thimmulappa, R., Vij, N., Navas-Acien, A., et al., Heightened

endoplasmic reticulum stress in the lungs of patients with chronic obstructive

pulmonary disease: the role of Nrf2-regulated proteasomal activity. American journal

of respiratory and critical care medicine 2009, 180, 1196-1207.

Chapter VI

Serum Proteomics of Chronic

Obstructive Pulmonary Disease

Patients

Serum proteomics of chronic obstructive pulmonary disease patients

Bruno M. Alexandre1, Pang-ning Teng2, Brian L. Hood2, Mai Sun2,

Deborah Penque1* and Thomas P. Conrads2*

1Laboratório de Proteómica, Departamento de Genética, Instituto Nacional de Saúde

Dr. Ricardo Jorge (INSA-IP), Av. Padre Cruz 1649-016 Lisboa, Portugal and the

2Department of Pharmacology & Chemical Biology, University of Pittsburgh Cancer

Institute, University of Pittsburgh

Keywords: Serum, blood, chronic obstructive pulmonary disease, emphysema, chronic

bronchitis, mass spectrometry, proteomics.

*Corresponding authors: Thomas P. Conrads, Ph.D., 204 Craft Avenue, Suite B401,

Pittsburgh, PA, 15213, Tel: 412-641-7556, Fax: 412-641-2356, E-mail:

[email protected] and Deborah Penque, Ph.D., Laboratório de Proteómica,

Departamento de Genética, Edifício INSA II, Instituto Nacional de Saúde Dr. Ricardo

Jorge, INSA, I.P., Avenida Padre Cruz, 1649-016 Lisboa, Portugal, Tel: +351 21750 8137,

Fax: +351 21752 6410, E-mail: [email protected]

ABSTRACT

Chronic obstructive pulmonary disease (COPD) is a major cause of morbidity and

mortality in adults, and its incidence is increasing worldwide. Patients may have

chronic bronchitis, emphysema, small airway disease or a combination of these that

modulate the course of the disease. There is still some ambiguity concerning the

disease-specific molecular mechanisms of the inflammatory process and acute

exacerbation of COPD. Therefore, potential biomarkers which are specific for COPD

have not been fully identified and validated, even though there is a great need for such

biomarkers. To date, no work has employed LC-MS methods to generate

comprehensive data on the serum proteome of COPD patients. A total of 33,049 peptides

corresponding to 2856 proteins were identified by the powerful shotgun approach

GeLC-MS/MS using a linear ion trap mass spectrometer. We were able to find proteins

potentially related to biological functions that have an impact on the pathophysiology of

COPD. These include TRAF3IP2, which is associated with innate immunity in response to

pathogens, inflammatory signals and airway hyperresponsiveness; PLG, reported to be

involved in mechanisms of wound healing and development of pulmonary diseases

such as asthma, cystic fibrosis and COPD itself; GPLD1 and APOE. Further complete

validation and study of some of these proteins will provide a better understanding of

molecular pathways underlying COPD pathogenesis.

INTRODUCTION

Chronic obstructive pulmonary disease (COPD) is a major cause of morbidity and

mortality in adults, and its incidence is increasing worldwide. The pathogenesis of

COPD is still poorly understood and is likely to be a complex interplay between genetic

and environmental factors. Persistently decreased forced expiratory volume (FEV1)

and increased forced expiratory time are major diagnostic features of COPD. COPD is

characterized by long-term progressive bronchial airflow obstruction (FEV1/FVC < 0.7

as measured by spirometry) that is poorly responsive to bronchodilators, as this obstruction is

not fully reversible [1]. Patients may have chronic bronchitis, emphysema, small airway

disease or a combination of these that modulate the course of the disease. There is no

treatment available that can stop or reverse the disease process and cure the

patient. At present, therapy aims to provide the patient with the best possible

quality of life. In 2000, approximately 2.7 million deaths were caused by

COPD, making it the fourth leading cause of death in the world [2].

Moreover, the Global Burden of Disease Study projected that COPD, which was ranked

sixth in 1990, would be the third leading cause of death in the world by the

year 2020 [2]. There is still some ambiguity concerning the disease-specific

molecular mechanisms of the inflammatory process and acute exacerbation of COPD.

Therefore, potential biomarkers which are specific for COPD have not been fully

identified and validated, even though there is a great need for such biomarkers [3].

Proteomic technologies allow for identification of protein changes caused by the

disease process, and recent advances, especially in mass spectrometry and

bioinformatics, increase the chances of identifying novel putative biomarkers. Serum is

known to perfuse tissues and therefore to be a source of biochemical products that

can be indicative of the physiological status of the individual and even of the disease

status of the patient. Surprisingly, few serum proteomic studies have been performed in

COPD. The first proteomic study was published in 2007 and consisted of measuring 143

pre-selected serum biomarkers by protein microarray platform [4]. Our group

conducted an extensive study on serum biomarker signatures of cystic fibrosis, asthma

and COPD patients by SELDI-TOF-MS where we were also able to find peaks

differentiating COPD from controls using Dunn’s comparison test (p<0.05) [5].

However, to date no work has employed LC-MS methods to generate

comprehensive data on the serum proteome of COPD patients. In the present study,

190 COPD patients were divided into four different groups according to the two main

clinical features of this disease – emphysema and chronic bronchitis – and analyzed by

the powerful shotgun approach which combines protein gel and liquid

chromatography separation methods before spectra acquisition by a linear ion trap

mass spectrometer (GeLC-MS/MS).

MATERIALS AND METHODS

Individuals, Sample Collection and Pooling Strategy

Serum samples (n=190 well characterized COPD patients) were collected from

peripheral blood and were provided by the Center for Clinical Pharmacology, Department

of Medicine, University of Pittsburgh, Pittsburgh, PA. Detailed clinical parameters

concerning each one of the patients under analysis were also provided in order to

minimize bias when pooling the samples into each of the groups under analysis. In

order to assess the effects on the serum proteome of the two main clinical features of COPD,

emphysema and chronic bronchitis, four different groups were constituted and each

group was further split into two to achieve two biological replicates per group. The main

characteristics of each of the biological replicates are displayed in Table VI.1.
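The text does not describe how each group was divided into its two replicate pools, only that clinical parameters were used to minimize bias. One simple way to obtain two pools with similar distributions of a clinical covariate is sketched below, purely as an assumption: patients are sorted by the covariate and dealt alternately into the two pools. The function name, the patient identifiers and the FEV1/FVC values are hypothetical.

```python
def split_into_replicates(samples, key):
    """Sort samples by a clinical covariate and deal them alternately into two pools."""
    ordered = sorted(samples, key=key)
    return ordered[0::2], ordered[1::2]


# Hypothetical patients as (identifier, FEV1/FVC in %); not data from this study.
group = [("p1", 35.0), ("p2", 41.2), ("p3", 30.5), ("p4", 38.9),
         ("p5", 33.1), ("p6", 44.0), ("p7", 29.8)]

replicate_1, replicate_2 = split_into_replicates(group, key=lambda s: s[1])
print([pid for pid, _ in replicate_1])
print([pid for pid, _ in replicate_2])
```

With an odd number of patients the two pools differ in size by one, as is also the case for some of the replicates in Table VI.1.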

Table VI.1: Main characteristics of the biological replicates for each of the groups

under analysis. “No Features” (Group A) refers only to the absence of emphysema and

chronic bronchitis. (Biol Rep: biological replicate; BMI: body mass index.)

Group                                 Group n  Emphysema  Chronic Bronchitis  Biol Rep  Rep n  Age (y)     BMI         FEV1/FVC (%)
No Features (A)                       123      No         No                  A1        62     68.1 ± 6.0  28.6 ± 3.5  64.0 ± 10.6
No Features (A)                       123      No         No                  A2        61     69.5 ± 6.3  28.0 ± 4.1  66.7 ± 8.1
Emphysema (B)                         32       Yes        No                  B1        16     66.2 ± 6.4  26.8 ± 3.4  58.7 ± 8.1
Emphysema (B)                         32       Yes        No                  B2        16     64.6 ± 6.0  25.0 ± 2.6  53.7 ± 10.7
Chronic Bronchitis (C)                20       No         Yes                 C1        10     65.0 ± 7.8  26.6 ± 5.7  55.8 ± 17.4
Chronic Bronchitis (C)                20       No         Yes                 C2        10     66.9 ± 6.6  27.7 ± 2.9  55.1 ± 14.5
Emphysema and Chronic Bronchitis (D)  15       Yes        Yes                 D1        8      60.1 ± 5.9  23.5 ± 5.0  35.3 ± 5.9
Emphysema and Chronic Bronchitis (D)  15       Yes        Yes                 D2        7      67.7 ± 5.4  25.5 ± 4.8  36.7 ± 6.8

Sample Preparation for LC-MS/MS

Once pools were constituted, a first spiking step was introduced in which

beta-galactosidase (E. coli) was added to a final concentration of 200 fmol/µL. Due to

sample complexity and high dynamic range, samples were immunodepleted of their

top-14 most abundant proteins through a Hu-14 multiple affinity removal system

(MARS, Agilent Technologies, Palo Alto CA, USA) column and resulting F1 and F2

fractions from each sample were pooled together. A second spiking step was

performed by adding chicken ovalbumin to a final concentration of 200 fmol/µL before

depleted samples were buffer-exchanged into 25 mM ammonium bicarbonate using

centrifugal ultrafiltration (3000 molecular weight cut-off) to a final volume of 500 µL.

Protein concentrations were determined by BCA (Pierce). Samples were deglycosylated

through incubation with PNGase F enzyme for 2 h at 37 °C. One-dimensional

polyacrylamide gel electrophoresis (1D-PAGE) was performed in the high-resolution pre-cast gel

system XCell SureLock™ Mini-Cell and NuPAGE® Novex® Bis-Tris using pre-cast

4-12% gels (Invitrogen, Carlsbad, CA, USA) at a constant voltage of 150 V for 1 h. After

staining with SimplyBlue™ SafeStain, gel images were acquired and evaluated in order

to divide each lane (sample) into 10 bands. Each of these bands was excised

accordingly and destained in a solution of 50% acetonitrile in 50 mM ammonium bicarbonate (AMB). Reactive cysteine

residues were reduced via rehydration of gel bands in 10 mM DTT and 25 mM AMB

followed by incubation at 56 °C for 45 min and alkylated via incubation in 55 mM

iodoacetamide and 25 mM AMB for 30 min at ambient temperature in the dark. Bands

were then dehydrated with acetonitrile, rehydrated with sequencing grade porcine

trypsin (Promega, Madison, WI, USA) in 25 mM ammonium bicarbonate and digested

at 37 °C for 16 h. Peptide digests were extracted with 70% acetonitrile, 5% formic acid,

dried by vacuum centrifugation and stored at -80 °C until further analysis.

Proteomic analysis by liquid chromatography-tandem mass spectrometry.

Peptide extracts (6 µg per injection) from each gel fraction were resuspended in 0.1%

TFA and separately analysed by liquid chromatography (LC) using a Dionex Ultimate

3000 nanoflow LC system (Dionex Corporation, Sunnyvale, CA, USA) coupled on-line

to a linear ion trap (LIT) mass spectrometer (MS) (LTQ; ThermoFisher Scientific Inc.,

San Jose, CA, USA). Separation of the sample was performed using a 75-µm inner

diameter × 360 µm outer diameter × 10 cm-long fused silica capillary column (Polymicro

Technologies, Phoenix, AZ, USA) packed in house with 5 µm, 300 Å pore size Jupiter C-

18 stationary phase (Phenomenex, Torrance, CA, USA). Following sample injection

onto a C-18 precolumn (Dionex), the column was washed for 3 min with mobile phase

A (2% acetonitrile, 0.1% formic acid) at a flow rate of 30 µL/min. Peptides were eluted

using a linear gradient of 0.33% mobile phase B (0.1% formic acid in

acetonitrile) per minute for 130 min, then to 95% B in an additional 15 min, all at a

constant flow rate of 200 nL/min. Column washing was performed at 95% B for 15 min

for all runs, after which the column was re-equilibrated in mobile phase A prior to

subsequent injections. The LIT-MS was operated in a data-dependent MS/MS mode in

which each full MS scan (precursor ion selection scan range of m/z 350–1,800) was

followed by seven MS/MS scans where the seven most abundant peptide molecular

ions dynamically determined from the MS scan were selected for tandem MS using a

relative collision-induced dissociation (CID) energy of 35%. Dynamic exclusion was

utilized to minimize redundant selection of peptides for CID.
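The data-dependent acquisition logic described above (one full MS scan followed by MS/MS of the seven most abundant precursors, with dynamic exclusion) can be sketched as follows. This is a simplified illustration and not the instrument's control software: the m/z matching tolerance is an assumed value, the exclusion list here never expires (real dynamic exclusion is time-limited), and all peak values are hypothetical.

```python
def select_precursors(ms1_peaks, excluded, top_n=7, tol=0.5):
    """Pick the top_n most intense in-range peaks that are not on the exclusion list."""
    candidates = [
        (mz, intensity) for mz, intensity in ms1_peaks
        if 350.0 <= mz <= 1800.0                        # scan range given in the text
        and all(abs(mz - ex) > tol for ex in excluded)  # tol is an assumption
    ]
    candidates.sort(key=lambda peak: peak[1], reverse=True)
    chosen = [mz for mz, _ in candidates[:top_n]]
    excluded.extend(chosen)  # ions already fragmented are skipped in later cycles
    return chosen


# Hypothetical full-scan peaks as (m/z, intensity).
peaks = [(400.2, 9e5), (512.7, 8e5), (623.3, 7e5), (701.9, 6e5), (845.4, 5e5),
         (910.1, 4e5), (1020.6, 3e5), (1200.8, 2e5), (1500.3, 1e5), (2000.0, 9.9e5)]

exclusion_list = []
first_cycle = select_precursors(peaks, exclusion_list)
second_cycle = select_precursors(peaks, exclusion_list)
print(first_cycle)
print(second_cycle)
```

In the second cycle the previously fragmented ions are skipped, so lower-intensity precursors get their turn, which is the purpose of dynamic exclusion.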

Peptide Identification and Spectral Count Analysis.

Tandem mass spectra were searched (Thermo Scientific BioWorks 3.3.1 software suite)

on a 72-node Beowulf cluster against the UniProt Homo sapiens proteome database

(October 2008 release, http://www.expasy.org) using SEQUEST (ThermoFisher

Scientific, Inc.). Additionally, peptides were searched for dynamic methionine

oxidation (+15.9949 Da) and cysteine carboxyamidomethylation (+57.0215 Da) modifications.

Peptides were considered legitimately identified if they achieved specific charge state- and

proteolytic cleavage-dependent cross-correlation (Xcorr) scores of 1.9 for [M+H]1+,

2.2 for [M+2H]2+, and 3.5 for [M+3H]3+, and a minimum delta correlation score (ΔCn)

of 0.08. The false positive rate for this dataset is estimated to be less than 5%, in

accordance with probability-based evaluation of peptide and protein identifications from

tandem mass spectra and SEQUEST analysis of the human proteome [6]. Results were

further filtered using software developed in-house, and differences in protein

abundance between samples were derived by summing the total CID events that

resulted in a positively identified peptide for a given protein accession across all

samples (spectral counting) [7].
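The filtering and spectral counting steps described above can be sketched as follows. The Xcorr and ΔCn thresholds are the ones quoted in the text; everything else is an assumption for illustration: the record layout, the use of "greater than or equal to" for the cut-offs, the rejection of charge states above 3+, and the peptide-spectrum matches themselves, which are hypothetical. This is not the in-house software mentioned in the text.

```python
from collections import Counter

XCORR_MIN = {1: 1.9, 2: 2.2, 3: 3.5}  # charge state -> minimum Xcorr (from the text)
DELTA_CN_MIN = 0.08                   # minimum delta correlation score (from the text)


def passes_filter(psm):
    """Return True if a peptide-spectrum match meets the charge-dependent criteria."""
    threshold = XCORR_MIN.get(psm["charge"])
    if threshold is None:  # charge states not covered by the criteria are rejected here
        return False
    return psm["xcorr"] >= threshold and psm["delta_cn"] >= DELTA_CN_MIN


def spectral_counts(psms):
    """Count the accepted CID events per protein accession."""
    return Counter(psm["protein"] for psm in psms if passes_filter(psm))


# Hypothetical peptide-spectrum matches (illustration only).
psms = [
    {"protein": "APOE", "charge": 2, "xcorr": 2.8, "delta_cn": 0.12},
    {"protein": "APOE", "charge": 3, "xcorr": 3.1, "delta_cn": 0.20},
    {"protein": "APOE", "charge": 1, "xcorr": 2.0, "delta_cn": 0.10},
    {"protein": "PLG", "charge": 2, "xcorr": 3.0, "delta_cn": 0.05},
    {"protein": "PLG", "charge": 3, "xcorr": 3.9, "delta_cn": 0.09},
]

counts = spectral_counts(psms)
print(dict(counts))
```

Comparing such counts for the same protein across samples gives the relative abundance estimate used in spectral counting.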

Western Blot Analysis

Primary antibodies used were mouse monoclonal anti-human apolipoprotein E, APOE

(Abcam) and phosphatidylinositol-glycan-specific phospholipase D, GPLD1 (Abcam).

The secondary antibody was horseradish peroxidase-conjugated goat anti-mouse IgG (H+L)

that was pre-adsorbed with serum (Pierce). The serum samples (20 µg) were resolved

by 1D-PAGE (Invitrogen) and transferred to Immobilon-PSQ PVDF membranes

(Millipore) using the Invitrogen Xcell II Blot Module according to the manufacturer’s

protocol. Membranes were blocked with 0.5% low-fat milk (w/v) and incubated in

primary antibody (1:1000 dilution for α-APOE, 1:500 for α-GPLD1) for 16 h at 4 °C.

Membranes were washed with TBS with 0.1% Tween (TBST) and incubated with

secondary antibody (1:50,000 dilution) for 1 h at ambient temperature. After washing

with TBST, the membranes were incubated with either Dura or Femto Super Signal ECL

(ThermoFisher) for 2 or 5 min prior to chemiluminescent exposure.

Enzyme-linked immunosorbent assay (ELISA)

Serum plasminogen (PLG) level was quantified using the Plasminogen (Human) ELISA

kit (ALPCO Immunoassays, Salem, NH) according to the manufacturer’s protocol. The

assay was performed in duplicate.

Bioinformatic analysis

UniProt accessions corresponding to proteins identified by at least two peptides were

mapped to HUGO (HGNC) gene symbols utilizing Ingenuity Pathway Analysis (IPA)

(Ingenuity® Systems, www.ingenuity.com). Protein localization and subtype

assignments were derived from IPA-mapped data sets. IPA and ProteinCenter (Thermo

Fisher Scientific) were used to retrieve information on protein annotation according

to cellular component, biological process and molecular function using default

parameters.
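The "identified by at least two peptides" criterion applied before mapping accessions to gene symbols can be sketched as below. The sketch assumes that "two peptides" means two distinct peptide sequences (the text does not say whether repeated identifications of the same sequence count), and the accessions and sequences are placeholders, not real UniProt entries.

```python
from collections import defaultdict


def proteins_with_min_peptides(peptide_hits, min_peptides=2):
    """Return accessions supported by at least min_peptides distinct peptide sequences."""
    peptides_by_protein = defaultdict(set)
    for accession, sequence in peptide_hits:
        peptides_by_protein[accession].add(sequence)
    return sorted(
        accession
        for accession, sequences in peptides_by_protein.items()
        if len(sequences) >= min_peptides
    )


# Placeholder (accession, peptide sequence) pairs for illustration only.
hits = [
    ("ACC_1", "AAK"), ("ACC_1", "AAK"), ("ACC_1", "LLR"),
    ("ACC_2", "GGK"), ("ACC_2", "GGK"),
    ("ACC_3", "TTR"), ("ACC_3", "VVK"), ("ACC_3", "SSR"),
]

retained = proteins_with_min_peptides(hits)
print(retained)
```

Here the second placeholder protein is dropped because both of its hits come from the same peptide sequence.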

RESULTS AND DISCUSSION

Serum Proteomics of COPD patients

Serum is a biological material of high interest to be used in proteomics studies,

especially clinical proteomics studies. Blood serum is readily accessible in contrast to

tissue or organ biospecimens. Serum constantly perfuses all tissues of the body in a

dynamic exchange of molecules, making serum a rich environment and therefore a

very attractive biological material for proteomics-based biomarker research [8-14].

Another advantage is that serum possesses a high protein content, i.e. 60–80 mg of

protein/mL [15]. The major protein constituents of serum include albumin, which

accounts for about 55% of total protein mass alone, immunoglobulins, transferrin,

haptoglobin, and lipoproteins [12, 15]. Thousands of different proteins are present in

serum, but their concentrations vary dramatically. In fact, the dynamic range of protein

concentrations spans ten orders of magnitude, from albumin to interleukin-6

[12, 14]. Hence, the ability to detect proteins that are present at low concentrations is

critical to the discovery of new biomarkers of disease. While mass spectrometry is an

ideal tool to identify and quantify proteins in blood, abundant peptides may be

detected with greater intensity and therefore mask less abundant proteins [16]. To

overcome this issue and investigate the low-abundance proteins, among which

tissue leakage-derived proteins reside, serum samples were immunodepleted of the

14 most abundant proteins, which together account for about 94% of serum protein mass.

An additional N-deglycosylation step was also introduced to reduce sample

complexity and allow a higher number of confident identifications by mass

spectrometry. Samples were separated first by 1D-PAGE on a precast 4-12%

gradient gel and then by LC coupled online to a LIT mass spectrometer. The main steps of

the whole workflow are shown in Figure VI.1.

Two biological replicates were constituted for each of the groups under analysis (Table

VI.1) and two injections corresponding to 6 µg total protein from each of the biological

replicates were analyzed by LC-LIT-MS/MS, resulting in the identification of a total of

33049 peptides and 2856 proteins (Supplemental Table VI.1, Supporting Information),

of which 929 were identified by at least two peptides across the samples under

analysis (Supplemental Table VI.2, Supporting Information).
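The "at least two peptides" criterion used above can be illustrated with a short Python sketch. The function name, the data layout and the peptide strings are hypothetical, and it is an assumption here that the criterion refers to distinct peptide sequences.

```python
def proteins_with_min_peptides(peptides_by_protein, min_peptides=2):
    """Keep accessions supported by at least `min_peptides` distinct peptides."""
    return {
        accession: peptides
        for accession, peptides in peptides_by_protein.items()
        if len(set(peptides)) >= min_peptides
    }


# Hypothetical peptide identifications (placeholder sequences, not real data).
example_ids = {
    "P00747": ["PEPTIDEA", "PEPTIDEB", "PEPTIDEA"],
    "P02649": ["PEPTIDEC"],
    "P80108": ["PEPTIDED", "PEPTIDEE", "PEPTIDEF"],
}
kept = proteins_with_min_peptides(example_ids)
print(sorted(kept))
```

Repeated observations of the same peptide raise the spectral count but do not count as additional peptide evidence in this sketch.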

Figure VI.1: Basic scheme of the methodology employed to study COPD patients’

proteome.

Protein digest equivalency was determined by comparing the total number of

peptides identified (total spectral counts) in each of the analytical samples, which

resulted in a calculated relative standard deviation (RSD) of 16.1%. This was also

evaluated through comparison of total peptides identified for chicken ovalbumin,

which was added in equal amounts to each of the analyzed samples according to the

workflow displayed in Figure VI.1. Calculated RSD for ovalbumin was 19.1%, which is

consistent with the RSD obtained for total spectral counts.
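As a reminder of how a relative standard deviation is obtained, the following Python sketch computes the RSD as 100 times the standard deviation divided by the mean. The total spectral counts below are hypothetical placeholders (the per-sample totals are not listed in this chapter), and the use of the sample standard deviation rather than the population one is an assumption.

```python
from statistics import mean, stdev


def relative_standard_deviation(values):
    """Return the RSD in percent: 100 * sample standard deviation / mean."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical total spectral counts for eight analytical samples.
hypothetical_totals = [4100, 3600, 4900, 3900, 4400, 3300, 5000, 3800]
print(f"RSD = {relative_standard_deviation(hypothetical_totals):.1f}%")
```

The same function could be applied to the spectral counts of the spiked ovalbumin to check that both measures of digest equivalency agree.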

Cellular location and functional type of each of the proteins identified by two or more

peptides were obtained from Ingenuity pathway analysis (IPA) and full information is

available as Supplemental Table VI.3, Supporting Information. Cellular location of

proteins identified by at least two peptides is also displayed in Figure VI.2.

Figure VI.2: Cellular location of proteins identified by two or more peptides.

Interestingly, the largest slice of the chart represents proteins whose cellular location is

unknown (30%). This reflects the lack of information available so far for a vast

number of proteins deposited in protein databases and reinforces the need for further

biochemical and proteomics studies to fill these annotation gaps. Among the

annotated proteins, most were assigned to the

cytoplasm (26%).

Information on the profile of the number of transmembrane domains obtained for

proteins identified by at least two peptides is displayed in Supplemental Figure VI.1,

Supporting Information.

Comparative proteomics analysis of COPD patients

COPD patients’ samples were pooled into four different groups (Table VI.1) according

to each patient’s clinical data concerning chronic bronchitis and emphysema: a group of

patients that exhibited neither of these two features prior to sample collection (Group

A); a group of patients that had emphysema, but not chronic bronchitis

(Group B); a group of patients with chronic bronchitis but no emphysema (Group C);

and finally, a group of patients that possessed both chronic bronchitis and emphysema

preceding sample collection (Group D).
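The four-group design above is a 2 x 2 combination of two clinical features. A minimal Python sketch of the assignment rule follows; the function name `copd_group` and the boolean inputs are illustrative assumptions, not part of the original clinical database.

```python
def copd_group(emphysema: bool, chronic_bronchitis: bool) -> str:
    """Map the two clinical features to the group labels of Table VI.1."""
    if emphysema and chronic_bronchitis:
        return "D"  # both features
    if chronic_bronchitis:
        return "C"  # chronic bronchitis only
    if emphysema:
        return "B"  # emphysema only
    return "A"      # neither feature


for emphysema in (False, True):
    for bronchitis in (False, True):
        print(emphysema, bronchitis, copd_group(emphysema, bronchitis))
```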

Using bioinformatics tools such as IPA (Ingenuity Systems) or ProteinCenter (ThermoFisher),

it was possible to identify proteins associated with diverse biological processes such as

cell communication, defense response, response to stimulus or response to wounding.

For instance, von Willebrand factor (VWF, UniProt acc: P04275), a glycoprotein that is

involved in the blood coagulation system, has been reported to be downregulated in

individuals with pulmonary adenocarcinoma [17]. In the present study, COPD patients with

both emphysema and chronic bronchitis showed underexpression of this protein

when compared to the other three groups under analysis.

N-methyl-D-aspartate receptor 2C subunit (NMDAR2C, also known as GRIN2C, UniProt

acc: O15398), which belongs to a class of ionotropic glutamate receptors, has been

associated with tobacco disorders and with dry cough. In fact, dextromethorphan, an

antagonist of the human GRIN2C protein, has been approved for the treatment of dry

cough (http://www.drugbank.ca/drugs/DB00514). We found GRIN2C underexpressed

in COPD patients with chronic bronchitis (group C) and with both chronic bronchitis and

emphysema (group D).

Eighty-six proteins were identified by analysis of variance (ANOVA) with statistically

significant (p<0.05) differential abundances in spectral counts across the four groups

under analysis (Supplemental Table VI.4, Supporting Information). A supervised

hierarchical cluster analysis was performed on these 86 significantly differentially

expressed proteins (Figure VI.3).
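To illustrate the kind of test involved, the sketch below computes a one-way ANOVA F statistic in plain Python for the plasminogen spectral counts listed in Table VI.2. It is only a sketch under stated assumptions: the exact replicate structure and software used for the ANOVA in this work are not specified here, so treating the two pooled samples per group as the replicates is an assumption, and the function name is hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# PLG spectral counts per pooled sample, transcribed from Table VI.2.
plg_counts = {"A": [87, 108], "B": [87, 79], "C": [78, 76], "D": [49, 40]}
f_stat, df_b, df_w = one_way_anova_f(list(plg_counts.values()))
# For 3 and 4 degrees of freedom, the tabulated 5% critical value is about 6.59.
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

An F value above the critical value for the given degrees of freedom corresponds to p<0.05, which is the threshold applied to the spectral counts in this chapter.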

Adapter protein CIKS isoform 3 (TRAF3IP2, UniProt acc: Q7Z6Q1) was found to be

overexpressed in COPD patients with emphysema (group B) and in patients with both

emphysema and chronic bronchitis (group D). TRAF3IP2 is involved in regulating

responses to cytokines by members of the Rel/NF-κB transcription factor family, which

play a central role in innate immunity in response to pathogens, inflammatory signals

and stress, and it has also been implicated in airway hyperresponsiveness [18].

Figure VI.3: Hierarchical clustering exhibiting relative abundance of the eighty-six

significantly differentially expressed proteins across the four groups under analysis.

Plasminogen (PLG, UniProt acc: P00747) was found to be underexpressed in COPD

patients that suffer simultaneously from emphysema and chronic bronchitis (group D),

while it maintained about the same expression level over the three other groups of

COPD patients. Activation of the fibrinolytic system is dependent on the conversion of

the plasma zymogen, PLG, to the serine protease plasmin by the physiological

activators urokinase-type PLG activator (uPA) or tissue-type plasminogen activator

(tPA) [19]. Besides regulating vascular patency by degrading fibrin-containing

thrombi, this system has been reported to be involved in other functions with impact

on important events such as embryogenesis, angiogenesis, tumor growth and dissemination,

and wound healing [19]. Moreover, impaired fibrinolytic activity is an underlying

feature in the development of pulmonary diseases [19], including COPD [20, 21].

Isoform 1 of phosphatidylinositol-glycan-specific phospholipase D (GPLD1, UniProt acc:

P80108) presented the same behavior described for PLG, i.e. it was found to be

underexpressed in COPD patients diagnosed with emphysema and chronic bronchitis

and it showed the same expression level across the three remaining groups. GPLD1 is a

GPI-degrading enzyme that hydrolyzes the inositol phosphate linkage in proteins

anchored by phosphatidylinositol glycans, thereby releasing the attached protein from

the plasma membrane. Prostasin is a trypsin-like serine peptidase, highly expressed in

prostate, bronchus, and kidney [22, 23]. It has been shown that prostasin secretion

depends on GPI anchor cleavage by endogenous GPLD1 [22]. Prostasin is also known as

channel-activating protease (CAP)-1 and was the first of several membrane serine

peptidases found to activate the epithelial sodium channel (ENaC) [24]. ENaC

function is tightly regulated and is critical for maintaining salt and fluid balance in the

lung [25]. Prostasin has been reported to have a critical role in regulating epithelial

sodium transport in normal and pathological conditions in the lung, since it has been

observed that prostasin is highly expressed in cystic fibrosis airways and is a strong

basal activator of ENaC in cystic fibrosis airway epithelial cells [26, 27].

Apolipoprotein E (APOE, UniProt acc: P02649) was found to be overexpressed in COPD

patients diagnosed with emphysema (group B) or chronic bronchitis (group C) or a

combination of both (group D), when compared to COPD patients that were not

diagnosed with emphysema or chronic bronchitis (group A). APOE plays a role in

cholesterol metabolism and is linked to cardiovascular diseases [28]. Cardiovascular

diseases are highly prevalent comorbidities of COPD [29]. Recently, a 2D gel-based

proteomics approach showed overexpression of plasma APOE in COPD

patients when compared with healthy controls [30]. Our data are consistent with this finding and also

suggest that overexpression of APOE may be associated with a more severe form of

COPD, i.e., one diagnosed with emphysema or chronic bronchitis.

Validation of protein abundances

Four proteins were selected for validation by western blot (WB), enzyme-linked

immunosorbent assay (ELISA) or selected reaction monitoring (SRM), a new MS-based

strategy for a robust quantification [31]. Selected proteins were PLG, APOE, GPLD1 and

FCGBP.

PLG underexpression in COPD patients simultaneously suffering from emphysema and

chronic bronchitis (group D) was successfully validated by a commercially available ELISA

kit (Table VI.2, Figure VI.4).

Table VI.2: ELISA determination of serum PLG in COPD patients.

Sample    Peptide spectral counts (discovery)    PLG protein concentration, µg/mL (ELISA)
A1        87                                     145.5
A2        108                                    121.4
B1        87                                     137.3
B2        79                                     121.5
C1        78                                     137.7
C2        76                                     149.0
D1        49                                     94.9
D2        40                                     110.2

Figure VI.4: ELISA determination of serum plasminogen in COPD patients. X-axes

labels refer to nomenclature displayed in Table VI.1.
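The agreement between the discovery data and the ELISA results in Table VI.2 can be inspected with a short Python sketch that computes the mean ELISA concentration per group and a Pearson correlation between spectral counts and ELISA values. The values are transcribed from Table VI.2; the variable and function names are illustrative, and this calculation is an addition for clarity rather than an analysis reported in this work.

```python
from math import sqrt

# Sample -> (peptide spectral counts, PLG concentration in µg/mL), from Table VI.2.
table_vi2 = {
    "A1": (87, 145.5), "A2": (108, 121.4),
    "B1": (87, 137.3), "B2": (79, 121.5),
    "C1": (78, 137.7), "C2": (76, 149.0),
    "D1": (49, 94.9), "D2": (40, 110.2),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))


# Mean ELISA concentration per group (first letter of the sample name).
group_means = {}
for group in "ABCD":
    values = [conc for name, (_, conc) in table_vi2.items() if name.startswith(group)]
    group_means[group] = sum(values) / len(values)

counts = [c for c, _ in table_vi2.values()]
elisa = [e for _, e in table_vi2.values()]
r = pearson_r(counts, elisa)

for group, value in group_means.items():
    print(f"Group {group}: mean PLG = {value:.2f} µg/mL")
print(f"Pearson r (spectral counts vs ELISA) = {r:.2f}")
```

With only two pooled samples per group, such summary figures are descriptive and should not be read as a formal statistical validation.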

APOE and GPLD1 were evaluated by WB analysis, but without success. The densitometry

analysis of the APOE immunoreactive signal at the expected molecular mass (38 kDa)

showed no concordance with MS spectral counts (Figure VI.5). One possible reason for

this is the presence of another variant of APOE below 28 kDa that could have

contributed to the mismatch between these two techniques. The antibody

immunoreaction for GPLD1 showed no signal by western blot (data not shown).

Validation of FCGBP, as well as of the three aforementioned proteins, is in

progress by SRM analysis. Selected peptides for this validation are displayed in

Supplemental Table 5, Supporting Information.

Figure VI.5: Western blot analysis of APOE. Primary antibody dilution: 1:1,000.

Secondary antibody dilution: 1:50,000.

CONCLUSION

Serum is widely used nowadays for the diagnosis and follow-up of many

diseases. It is readily available and it contains information derived from virtually every

part of the human body.

COPD is one of the leading causes of death in the world and has been the subject of

many studies. However, comprehensive studies of the serum proteome of COPD patients have been

lacking. In the present work, a powerful LC-MS approach was

employed for the first time to enlarge previous knowledge on the COPD serum proteome.

We were able to identify a total of 33049 peptides corresponding to 2856 proteins, of

which 929 were identified by two or more peptides across the four groups of COPD

patients under study, which were separated according to previous diagnosis of

emphysema and chronic bronchitis. However, many of the identified proteins still lack

crucial information related to cellular location or biological function in protein

databases. This is a major downside in proteomics research, especially in an era where

sophisticated mass spectrometers and powerful bioinformatics tools are on hand.

Nevertheless, we were able to find proteins related to biological functions that may

have impact in pathophysiology of COPD. This includes TRAF3IP2, which is associated

with innate immunity in response to pathogens, inflammatory signals and stress and

has also been implicated in airway hyperresponsiveness; PLG, reported to be involved

in mechanisms of wound healing and development of pulmonary diseases such as

asthma, cystic fibrosis and COPD itself; GPLD1, a GPI-degrading enzyme associated with

cell homeostasis balance; and APOE, a protein of cholesterol metabolism related to

cardiovascular diseases. PLG differential abundance was successfully validated by

ELISA. Additional validations by SRM, a robust quantitative MS-based approach, are

in progress. More focused studies on the relevant proteins

highlighted here should provide new insights into COPD pathological

mechanisms and/or provide therapeutic and/or diagnostic tools for COPD.

ACKNOWLEDGEMENTS

BMA would like to acknowledge the Center for Clinical Pharmacology, Department of

Medicine, University of Pittsburgh for providing serum samples and patient clinical

data. BMA is the recipient of an FCT doctoral fellowship (SFRH/BD/31415/2006).

REFERENCES

[1] Global initiative for chronic obstructive lung disease 2010.

[2] Lopez, A. D., Shibuya, K., Rao, C., Mathers, C. D., et al., Chronic obstructive

pulmonary disease: current burden and future projections. Eur Respir J 2006, 27, 397-

412.

[3] Chen, H., Wang, D., Bai, C., Wang, X., Proteomics-based biomarkers in chronic

obstructive pulmonary disease. Journal of proteome research 2010, 9, 2798-2808.

[4] Pinto-Plata, V., Toso, J., Lee, K., Park, D., et al., Profiling serum biomarkers in

patients with COPD: associations with clinical parameters. Thorax 2007, 62, 595-601.

[5] Gomes-Alves, P., Imrie, M., Gray, R. D., Nogueira, P., et al., SELDI-TOF biomarker

signatures for cystic fibrosis, asthma and chronic obstructive pulmonary disease. Clin

Biochem 2010, 43.

[6] Qian, W. J., Liu, T., Monroe, M. E., Strittmatter, E. F., et al., Probability-based

evaluation of peptide and protein identifications from tandem mass spectrometry and

SEQUEST analysis: the human proteome. Journal of proteome research 2005, 4, 53-62.

[7] Liu, H., Sadygov, R. G., Yates, J. R., 3rd, A model for random sampling and

estimation of relative protein abundance in shotgun proteomics. Anal Chem 2004, 76,

4193-4201.

[8] Adkins, J. N., Varnum, S. M., Auberry, K. J., Moore, R. J., et al., Toward a human

blood serum proteome: analysis by multidimensional separation coupled with mass

spectrometry. Mol Cell Proteomics 2002, 1, 947-955.

[9] Kennedy, S., Proteomic profiling from human samples: the body fluid alternative.

Toxicol Lett 2001, 120, 379-384.

[10] Schrader, M., Schulz-Knappe, P., Peptidomics technologies for human body fluids.

Trends Biotechnol 2001, 19, S55-60.

[11] Zhang, H., Liu, A. Y., Loriaux, P., Wollscheid, B., et al., Mass spectrometric

detection of tissue proteins in plasma. Mol Cell Proteomics 2007, 6, 64-71.

[12] Anderson, N. L., Anderson, N. G., The human plasma proteome: history, character,

and diagnostic prospects. Mol Cell Proteomics 2002, 1, 845-867.

[13] Anderson, N. L., The clinical plasma proteome: a survey of clinical assays for

proteins in plasma and serum. Clinical chemistry 2010, 56, 177-185.

[14] Surinova, S., Schiess, R., Huttenhain, R., Cerciello, F., et al., On the development of

plasma protein biomarkers. Journal of proteome research 2011, 10, 5-16.

[15] Burtis, C. A., Ashwood, E. R., Tietz Fundamentals of Clinical Chemistry, 5th Ed.,

W.B. Saunders Company, Philadelphia, PA 2001.

[16] Tucholska, M., Bowden, P., Jacks, K., Zhu, P., et al., Human serum proteins

fractionated by preparative partition chromatography prior to LC-ESI-MS/MS. Journal

of proteome research 2009, 8, 1143-1155.

[17] Stearman, R. S., Dwyer-Nield, L., Zerbe, L., Blaine, S. A., et al., Analysis of

orthologous gene expression between human pulmonary adenocarcinoma and a

carcinogen-induced murine model. The American journal of pathology 2005, 167,

1763-1775.

[18] Zhao, Z., Qian, Y., Wald, D., Xia, Y. F., et al., IFN regulatory factor-1 is required for

the up-regulation of the CD40-NF-kappa B activator 1 axis during airway inflammation.

J Immunol 2003, 170, 5674-5680.

[19] Castellino, F. J., Ploplis, V. A., Structure and function of the plasminogen/plasmin

system. Thromb Haemost 2005, 93, 647-654.

[20] Stewart, C. E., Sayers, I., Characterisation of urokinase plasminogen activator

receptor variants in human airway and peripheral cells. BMC Mol Biol 2009, 10, 75.

[21] Jiang, Y., Xiao, W., Zhang, Y., Xing, Y., Urokinase-type plasminogen activator

system and human cationic antimicrobial protein 18 in serum and induced sputum of

patients with chronic obstructive pulmonary disease. Respirology (Carlton, Vic) 2010,

15, 939-946.

[22] Verghese, G. M., Gutknecht, M. F., Caughey, G. H., Prostasin regulates epithelial

monolayer function: cell-specific Gpld1-mediated secretion and functional role for GPI

anchor. Am J Physiol Cell Physiol 2006, 291, C1258-1270.

[23] Verghese, G. M., Tong, Z. Y., Bhagwandin, V., Caughey, G. H., Mouse prostasin

gene structure, promoter analysis, and restricted expression in lung and kidney.

American journal of respiratory cell and molecular biology 2004, 30, 519-529.

[24] Vallet, V., Chraibi, A., Gaeggeler, H. P., Horisberger, J. D., Rossier, B. C., An

epithelial serine protease activates the amiloride-sensitive sodium channel. Nature

1997, 389, 607-610.

[25] Schild, L., Kellenberger, S., Structure function relationships of ENaC and its role in

sodium handling. Advances in experimental medicine and biology 2001, 502, 305-314.

[26] Donaldson, S. H., Hirsh, A., Li, D. C., Holloway, G., et al., Regulation of the

epithelial sodium channel by serine proteases in human airways. The Journal of

biological chemistry 2002, 277, 8338-8345.

[27] Tong, Z., Illek, B., Bhagwandin, V. J., Verghese, G. M., Caughey, G. H., Prostasin, a

membrane-anchored serine peptidase, regulates sodium currents in JME/CF15 cells, a

cystic fibrosis airway epithelial cell line. American journal of physiology 2004, 287,

L928-935.

[28] Martins, I. J., Hone, E., Foster, J. K., Sunram-Lea, S. I., et al., Apolipoprotein E,

cholesterol metabolism, diabetes, and the convergence of risk factors for Alzheimer's

disease and cardiovascular disease. Mol Psychiatry 2006, 11, 721-736.

[29] Dalal, A. A., Shah, M., Lunacsek, O., Hanania, N. A., Clinical and economic burden

of patients diagnosed with COPD with comorbid cardiovascular disease. Respiratory

medicine 2011.

[30] Bandow, J. E., Baker, J. D., Berth, M., Painter, C., et al., Improved image analysis

workflow for 2-D gels enables large-scale 2-D gel-based proteomics studies--COPD

biomarker discovery study. Proteomics 2008, 8, 3030-3041.

[31] Picotti, P., Rinner, O., Stallmach, R., Dautel, F., et al., High-throughput generation

of selected reaction-monitoring assays for proteins and proteomes. Nature methods

2010, 7, 43-46.

Chapter VII

Concluding Remarks and Future Perspectives

COPD is currently the fourth leading cause of death in the world and it is the

only major cause of death that has been increasing in the past decades while the

others have been decreasing. COPD also has a large economic impact due to

hospitalization and healthcare-related costs, as well as the costs of work absence, not

only of the patients but also of their relatives. Therefore, it has been the subject

of many studies and much has been understood in the past years. However, diagnosis

is still performed by measuring lung function through spirometry and, although many

lives could be saved if there were worldwide spirometry screening, patients are still

being diagnosed by nothing more than the evaluation of their lung function, without

any screening of the biomolecules responsible for their disease status.

Many biological materials can be used to investigate biomarkers for

this disease. Although COPD is now known to possess a systemic inflammation

component, which is responsible for affecting other organs, it is in the lung that the

events that lead to breathlessness take place. Investigating the lung directly is therefore

an optimal strategy to identify proteins that may not be detectable

elsewhere, either because they are not present or because they are diluted to undetectable

concentrations. However, this means that lung tissue has to be collected by biopsy, which is

an extremely invasive technique. Besides tissue biospecimens, other sources of

biological material used to study COPD are biofluids, which include sputum,

bronchoalveolar and nasal lavage fluid, exhaled breath condensate, and blood.

Proteomics has the capacity to provide large-scale information and consequently it has

the potential to expand previous knowledge on COPD. Surprisingly, given the need for

new biomarkers in COPD and the power of proteomics, proteomics has been rather

neglected to date. Hence, in this work we combined different biospecimens and

proteomics methodologies to provide new insight into the disease.

It had been observed before by means of microscopy that red blood cells (RBCs) from

COPD patients showed deformations in their shape. RBCs are crucial to the transport of

oxygen from the lungs to the cells and this transport is dependent on their ability to

change shape rapidly while navigating through blood vessels. In addition, RBCs play a

crucial role in antioxidant defense when fighting against oxidative stress, which has

long been recognized as a feature of COPD. In this work we made use of a membrane

fractionation procedure, stable isotope labeling and a high-resolution Fourier

transform ion cyclotron resonance (FT-ICR) mass spectrometer. Chorein, or

vacuolar protein sorting-associated protein 13A (VPS13A), is reported to play a role in

cytoskeleton organization and has been associated with thorny deformations of

circulating erythrocytes, possibly due to red cell membrane deformation. This protein

was found to be underexpressed in COPD patients when compared to controls by MS

and this underexpression was confirmed by WB. Consequently, underexpression of

chorein may play an important role in the deformation of COPD RBCs. Many other

interesting proteins were identified in the context of COPD and, additionally, there

were a considerable number of proteins described in RBC for the first time.

To overcome the difficulty of acquiring fresh biopsies from well-characterized patients,

our laboratory has established a procedure to collect nasal epithelial cells. In

fact, previous work has shown that these cells behave similarly to

the epithelial cells of the lower airway. Two different types of studies were performed

with these cells: a study of the effects of cigarette smoke, which is the

main risk factor for COPD, and a comparison between COPD patients and healthy

individuals. Both were pioneering studies, since the nasal epithelial cell proteome of

cigarette smokers or COPD patients had not been assessed before. Moreover, in both

studies a high-resolution mass analyzer, the Orbitrap, was employed, which increases

the number of confident peptide/protein identifications. In the study on the effect of

cigarette smoking, ninety-six proteins were found to be differentially expressed

between the proteomes of healthy smokers and nonsmokers. These proteins were

related to processes of antigen presentation, cell-to-cell signaling and interaction, cell

morphology, drug metabolism, DNA repair, energy production or mitochondrial

dysfunction. Although requiring further orthogonal validation, our data were consistent

with previous evidence showing differential modulation of CD44, MUC5AC or SOD2 in

smokers due to inflammatory response pathways. When studying the nasal epithelial

cell proteome of COPD patients compared to healthy individuals, previous evidence

that the unfolded protein response (UPR) is activated in COPD patients was confirmed, since we were able to observe

overexpression of a considerable number of proteins belonging to different protein

complexes involved in the UPR. This includes overexpression of VCP, both components of

the Hsp10/Hsp60 chaperone complex (HSPD1 and HSPE1), CALR and two members of a

large ER-localized multiprotein complex of at least 11 proteins, PPIB and ERP29. We

also observed an increase in expression of proteins related to Nrf2-mediated oxidative

stress response such as GSTP1, TXNRD1 and GSR. Finally, we also report an increase in

drug metabolism, as all significantly differentially expressed proteins related to this

biofunction were overexpressed in COPD: GSTP1, GSR, AKR1C3 and ANXA2. These data

need further validation by orthogonal methods so that the activation of the UPR and

Nrf2-mediated oxidative stress response and the increase in drug metabolism in the

nasal epithelial cells of COPD patients can be fully confirmed.

Serum collected from COPD patients was divided into four groups covering all

combinations of presence/absence of the two main features of COPD, chronic

bronchitis and emphysema, to study their impact on the serum proteome. Because it is a

complex protein mixture, serum was first immunodepleted of its most abundant

proteins, which comprise about 94% of total protein content, before being analyzed by

GeLC-MS/MS. This powerful strategy was able to identify as many as 2856 proteins, of

which 929 were identified by two or more peptides. Plasminogen was found to be

underexpressed in COPD patients that suffer simultaneously from emphysema and

chronic bronchitis, while it maintained about the same expression level over the three

other groups of COPD patients and this differential expression was successfully

validated by ELISA. It was possible to identify other interesting proteins such as TRAF3IP2,

which is associated with innate immunity in response to pathogens, inflammatory

signals and stress and has also been implicated in airway hyperresponsiveness, or

Isoform 1 of phosphatidylinositol-glycan-specific phospholipase D (GPLD1), which is a

GPI-degrading enzyme that was described to be responsible for the secretion of prostasin,

which was the first of several membrane serine peptidases found to activate the

epithelial sodium channel (ENaC). Prostasin was also reported to have a critical role

in regulating epithelial sodium transport in normal and pathological conditions in the

lung.

The work herein presented confirmed a few findings that had already been reported

and at the same time revealed many new possibilities for disease mechanisms and also

for new biomarkers. Some data require further validation by orthogonal techniques,

but there is no doubt that this work has shed light on proteins and even

processes that had not been associated with COPD before. This work also emphasizes

the value of using nasal epithelial cells in investigating COPD pathogenesis, since it

can lead to the identification of new candidate biomarkers for this disease.

REFERENCES

[1] Petty, T. L., The history of COPD. International journal of chronic obstructive

pulmonary disease 2006, 1, 3-14.

[2] Tiffeneau, R., Pinelli, [Not Available]. Paris Med 1947, 37, 624-628.

[3] Gaensler, E. A., Air velocity index; a numerical expression of the functionally

effective portion of ventilation. Am Rev Tuberc 1950, 62, 17-28.

[4] Briscoe, W. A., Nash, E. S., The Slow Space in Chronic Obstructive Pulmonary

Diseases. Annals of the New York Academy of Sciences 1965, 121, 706-722.

[5] Viegi, G., Pistelli, F., Sherrill, D. L., Maio, S., et al., Definition, epidemiology and

natural history of COPD. Eur Respir J 2007, 30, 993-1013.

[6] Vestbo, J., Lange, P., Can GOLD Stage 0 provide information of prognostic value in

chronic obstructive pulmonary disease? American journal of respiratory and critical

care medicine 2002, 166, 329-332.

[7] Barnes, P. J., Shapiro, S. D., Pauwels, R. A., Chronic obstructive pulmonary disease:

molecular and cellular mechanisms. Eur Respir J 2003, 22, 672-688.

[8] Barnes, P. J., Stockley, R. A., COPD: current therapeutic interventions and future

approaches. Eur Respir J 2005, 25, 1084-1106.

[9] Global initiative for chronic obstructive lung disease 2010.

[10] Lopez, A. D., Shibuya, K., Rao, C., Mathers, C. D., et al., Chronic obstructive

pulmonary disease: current burden and future projections. Eur Respir J 2006, 27, 397-

412.

[11] World health organization, Geneva 2000.

[12] European Respiratory Society and European Lung Foundation 2003.

[13] Jemal, A., Ward, E., Hao, Y., Thun, M., Trends in the leading causes of death in the

United States, 1970-2002. JAMA 2005, 294, 1255-1259.

[14] Pauwels, R. A., Rabe, K. F., Burden and clinical features of chronic obstructive

pulmonary disease (COPD). Lancet 2004, 364, 613-620.

[15] 2007.

[16] Novos Dados da DPOC em Portugal (BOLD initiative), Lisbon 2010.

[17] in: Schraufnagel, D. E. (Ed.), American Thoracic Society 2010.

[18] Eriksen, D. J. M. D. M. (Ed.), The Tobacco Atlas, World Health Organization 2002.

[19] Lokke, A., Lange, P., Scharling, H., Fabricius, P., Vestbo, J., Developing COPD: a 25

year follow up study of the general population. Thorax 2006, 61, 935-939.

[20] Stoller, J. K., Aboussouan, L. S., Alpha1-antitrypsin deficiency. Lancet 2005, 365,

2225-2236.

[21] Molfino, N. A., Current thinking on genetics of chronic obstructive pulmonary

disease. Curr Opin Pulm Med 2007, 13, 107-113.

[22] Molfino, N. A., Genetics of COPD. Chest 2004, 125, 1929-1940.

[23] Molfino, N. A., Coyle, A. J., Gene-environment interactions in chronic obstructive

pulmonary disease. International journal of chronic obstructive pulmonary disease

2008, 3, 491-497.

[24] Wood, A. M., Stockley, R. A., The genetics of chronic obstructive pulmonary

disease. Respiratory research 2006, 7, 130.

[25] Heaney, L. G., Lindsay, J. T., McGarvey, L. P., Inflammation in chronic obstructive

pulmonary disease: implications for new treatment strategies. Curr Med Chem 2007,

14, 787-796.

[26] Hogg, J. C., Chu, F., Utokaparch, S., Woods, R., et al., The nature of small-airway

obstruction in chronic obstructive pulmonary disease. The New England journal of

medicine 2004, 350, 2645-2653.

[27] Lapperre, T. S., Postma, D. S., Gosman, M. M., Snoeck-Stroband, J. B., et al.,

Relation between duration of smoking cessation and bronchial inflammation in COPD.

Thorax 2006, 61, 115-121.

[28] Lapperre, T. S., Sont, J. K., van Schadewijk, A., Gosman, M. M., et al., Smoking

cessation and bronchial epithelial remodelling in COPD: a cross-sectional study.

Respiratory research 2007, 8, 85.

[29] Roth, M., Pathogenesis of COPD. Part III. Inflammation in COPD. Int J Tuberc Lung

Dis 2008, 12, 375-380.

[30] Barnes, P. J., Mediators of chronic obstructive pulmonary disease.

Pharmacological reviews 2004, 56, 515-548.

[31] Barnes, P. J., The cytokine network in chronic obstructive pulmonary disease.

American journal of respiratory cell and molecular biology 2009, 41, 631-638.

[32] Mak, J. C., Pathogenesis of COPD. Part II. Oxidative-antioxidative imbalance. Int J

Tuberc Lung Dis 2008, 12, 368-374.

[33] MacNee, W., Pathogenesis of chronic obstructive pulmonary disease. Proceedings

of the American Thoracic Society 2005, 2, 258-266; discussion 290-291.

[34] Rahman, I., Adcock, I. M., Oxidative stress and redox regulation of lung

inflammation in COPD. Eur Respir J 2006, 28, 219-242.

[35] MacNee, W., Oxidative stress and lung inflammation in airways disease. European

journal of pharmacology 2001, 429, 195-207.

[36] Agusti, A., Soriano, J. B., COPD as a systemic disease. COPD 2008, 5, 133-138.

[37] Gan, W. Q., Man, S. F., Senthilselvan, A., Sin, D. D., Association between chronic

obstructive pulmonary disease and systemic inflammation: a systematic review and a

meta-analysis. Thorax 2004, 59, 574-580.

[38] Mannino, D. M., Buist, A. S., Global burden of COPD: risk factors, prevalence, and

future trends. Lancet 2007, 370, 765-773.

[39] Chapman, K. R., Mannino, D. M., Soriano, J. B., Vermeire, P. A., et al.,

Epidemiology and costs of chronic obstructive pulmonary disease. Eur Respir J 2006,

27, 188-207.

[40] Hamdan, M., Righetti, P. G., Proteomics today: protein assessment and biomarkers

using mass spectrometry, 2D electrophoresis, and microarray technology, John Wiley &

Sons, Inc., Hoboken, NJ 2005.

[41] Thomson, J. J., Rays Of Positive Electricity and Their Application to Chemical

Analysis, Longmans, Green and Co., London 1913.

[42] Yamashita, M., Fenn, J. B., Electrospray Ion Source. Another Variation on the Free-

Jet Theme. J. Phys. Chem. 1984, 88, 4451-4459.

[43] Karas, M., Hillenkamp, F., Laser desorption ionization of proteins with molecular

masses exceeding 10,000 daltons. Analytical chemistry 1988, 60, 2299-2301.

[44] Tanaka, K., Waki, H., Ido, Y., Akita, S., et al., Protein and polymer analyses up to

m/z 100 000 by laser ionization time-of-flight mass spectrometry. Rapid

Communications in Mass Spectrometry 1988, 2, 151-153.

[45] O'Farrell, P. H., High resolution two-dimensional electrophoresis of proteins. The

Journal of biological chemistry 1975, 250, 4007-4021.

[46] Penque, D., Two-dimensional gel electrophoresis and mass spectrometry for

biomarker discovery. PROTEOMICS – Clinical Applications 2009, 3, 155-172.

[47] Washburn, M. P., Wolters, D., Yates, J. R., 3rd, Large-scale analysis of the yeast

proteome by multidimensional protein identification technology. Nature biotechnology

2001, 19, 242-247.

[48] Nesvizhskii, A. I., 2006, pp. 87-119.

[49] Aebersold, R., Mann, M., Mass spectrometry-based proteomics. Nature 2003, 422,

198-207.

[50] Nilsson, T., Mann, M., Aebersold, R., Yates, J. R., 3rd, et al., Mass spectrometry in

high-throughput proteomics: ready for the big time. Nature methods 2010, 7, 681-685.

[51] Ong, S. E., Mann, M., Stable isotope labeling by amino acids in cell culture for

quantitative proteomics. Methods in molecular biology (Clifton, N.J.) 2007, 359, 37-52.

[52] Gygi, S. P., Rist, B., Gerber, S. A., Turecek, F., et al., Quantitative analysis of

complex protein mixtures using isotope-coded affinity tags. Nature biotechnology

1999, 17, 994-999.

[53] Shadforth, I. P., Dunkley, T. P., Lilley, K. S., Bessant, C., i-Tracker: for quantitative

proteomics using iTRAQ. BMC genomics 2005, 6, 145.

[54] Ye, X., Luke, B., Andresson, T., Blonder, J., 18O stable isotope labeling in MS-based

proteomics. Briefings in functional genomics & proteomics 2009, 8, 136-144.

[55] Chen, H., Wang, D., Bai, C., Wang, X., Proteomics-based biomarkers in chronic

obstructive pulmonary disease. Journal of proteome research 2010, 9, 2798-2808.

[56] Ohlmeier, S., Vuolanto, M., Toljamo, T., Vuopala, K., et al., Proteomics of Human

Lung Tissue Identifies Surfactant Protein A as a Marker of Chronic Obstructive

Pulmonary Disease. Journal of proteome research 2008.

[57] Lee, E. J., In, K. H., Kim, J. H., Lee, S. Y., et al., Proteomic analysis in lung tissue of

smokers and COPD patients. Chest 2009, 135, 344-352.

[58] Merkel, D., Rist, W., Seither, P., Weith, A., Lenter, M. C., Proteomic study of

human bronchoalveolar lavage fluids from smokers with chronic obstructive

pulmonary disease by combining surface-enhanced laser desorption/ionization-mass

spectrometry profiling with mass spectrometric protein identification. Proteomics

2005, 5, 2972-2980.

[59] Gray, R. D., MacGregor, G., Noble, D., Imrie, M., et al., Sputum proteomics in

inflammatory and suppurative respiratory diseases. American journal of respiratory

and critical care medicine 2008, 178, 444-452.

[60] Iolascon, A., Perrotta, S., Stewart, G. W., Red blood cell membrane defects.

Reviews in clinical and experimental hematology 2003, 7, 22-56.

