UNIVERSIDADE DE LISBOA
FACULDADE DE CIÊNCIAS
DEPARTAMENTO DE BIOLOGIA ANIMAL
CHRONIC OBSTRUCTIVE PULMONARY
DISEASE: A PROTEOMICS APPROACH
BRUNO MIGUEL COELHO ALEXANDRE
PHD IN BIOLOGY
(MOLECULAR BIOLOGY)
Thesis supervised by Dr. Deborah Penque and
Professor Ana Maria Viegas Gonçalves Crespo
2011
Declaration
The research described in this thesis was performed at the Laboratório de Proteómica,
Departamento de Genética, Instituto Nacional de Saúde Dr. Ricardo Jorge (INSA IP),
Lisboa, Portugal; at the Laboratory of Proteomics and Analytical Technologies, National
Cancer Institute at Frederick (NCI-Frederick), Frederick, Maryland, United States of
America; and at the Clinical Proteomics Facility, University of Pittsburgh Cancer Institute
(UPCI), Pittsburgh, Pennsylvania, United States of America.
This work was supported by Fundação para a Ciência e Tecnologia (FCT) PhD grant
SFRH/BD/2006/31415 and research grant POCI/SAU-MMO/56163/2004, and also by
FCT/Polyannual Funding Program and FEDER/Saúde XXI (Portugal).
The following chapters are based on articles published or submitted during the PhD:
Chapter II is based on the article:
Alexandre, B. M., Proteomic mining of the red blood cell: focus on the membrane
proteome. Expert Review of Proteomics 2010, 7, 165-168.
During the PhD the following articles were also published:
Cox, J., R, M. A. H., James, P., Jorrin-Novo, J. V., et al., Facing challenges in Proteomics
today and in the coming decade: Report of Roundtable Discussions at the 4th EuPA
Scientific Meeting, Portugal, Estoril 2010. Journal of Proteomics 2011.
Bruno Miguel Coelho Alexandre
(Degree in Chemistry from the Faculdade de Ciências da Universidade de Lisboa)
Acknowledgments
My first words go to Dr. Deborah Penque. First, for having chosen me from among more
than 100 candidates to be a research fellow on one of her projects and, during the
first year of that fellowship, for having invited me to pursue a PhD in her
laboratory. I also thank her for all the support she gave me throughout these years,
and especially for being present whenever necessary while giving me the freedom to
follow my own path according to my own ideas.
I would equally like to thank Professor Ana Crespo for all the support she gave me
during the PhD. Because we share the same background, Chemistry, and because she had
previously worked on COPD, I felt great empathy and understanding on her part from
the very beginning, and I always felt that she understood what I was experiencing
while working in the field of Biology. This helped me a great deal in dealing with
the new knowledge and the difficulties inherent in this new learning process.
I would like to thank Dr. Thomas Conrads for having me as a PhD student in his labs
(there were 3 different locations!) and for his help in every single step of the projects.
This includes acquiring the samples that made the serum project possible. I’d also like
to thank Dr. Brian Hood for his contribution to my education as a scientist in the field
of proteomics and for his support at all times. I’d like to thank both for all their
support over the years at the professional level, always trying to provide me with
everything I needed during my time in the US and teaching me how to use new tools, new
instruments, etc. Above all, I’d like to thank both for giving me the chance to become
their friend and to share unique moments such as Thanksgiving, which we don’t
celebrate in Portugal. Thank you very much for all the good moments.
I’d like to thank Dr. Josip Blonder for his contribution to the red blood cell project,
along with Drs. Haleem Issaq and Timothy Veenstra.
I’d like to thank all the volunteers for their collaboration, including not only those
recruited in Portugal but also those recruited in the US. Without them, this work
would not have been possible.
I would like to thank the group of Prof. Dr. Bugalho de Almeida, and especially
Dr. Carlos Lopes, for their collaboration in recruiting COPD patients in Portugal and
in sharing clinical information, which is crucial for any work in clinical proteomics.
A very special word goes to the people with whom I shared my day-to-day life in the
laboratory, especially Patrícia Alves, Nuno Charro, João Banha and Isabel Oliveira,
who were later joined by Fátima Vaz, Sofia Neves, Tânia Simões and, recently,
Vukosava Torres.
I also want to thank all my friends, they know who they are, for all the moments that
contributed so much to keeping me sane.
The greatest thanks of this thesis go to my partner, the woman of my life, Filipa.
The high point, not only of these 4 years of the PhD but of my life, was our wedding
day. Thank you for all your unconditional support, even through less happy moments
and times of great sacrifice, such as the periods I spent working in the USA. My life
only makes sense if I can walk beside you, and that is how I want to spend all my
days.
To my parents, I have everything to be thankful for: all the support they have always
given me throughout my life, the love and affection they always showed me, and the
education and sense of responsibility they instilled in me from the start while
giving me the freedom to chart my own path. We do not choose our parents, but I could
not have been luckier.
Finally, I would like to thank my family for their support, especially my
grandparents, my godmother, Zé and my lovely cousins Inês and Joana. I would also
like to thank the “new” members of my family for the support they have always given
me, especially my parents-in-law Vitor and Lai, and my sister-in-law Joana.
Abstract
Chronic obstructive pulmonary disease (COPD) is characterized by chronic airflow
limitation that is not fully reversible even under the effect of bronchodilators and
is caused by a mixture of small airway disease (obstructive bronchiolitis) and
parenchymal destruction (emphysema). At present, COPD is the fourth leading cause of
death, and its prevalence and mortality are expected to continue increasing over the
next decade.
Spirometry is the most reproducible way to measure lung function and is currently
the best tool to diagnose airflow limitation and, consequently, COPD itself.
Biomarkers for diagnosis and/or prognosis, as well as novel targets for the
development of more effective therapies for COPD, are still needed.
Proteomics can provide large-scale information and therefore has the potential to
expand current knowledge of COPD. Surprisingly, given the need for new biomarkers in
COPD and the power of proteomics, the approach has been largely neglected so far. To
date, only 50 reports (14 of which are reviews) match a PubMed search
(http://www.ncbi.nlm.nih.gov/pubmed, accessed June 23, 2011) for COPD proteomics.
Hence, there is a clear need for clinically valuable proteomics studies to meet the
demand for new biomarkers in COPD.
In the last decade, the shotgun proteomics approach has become the method of choice
for identifying and quantifying proteins in most large-scale studies. Compared with
two-dimensional electrophoresis (2DE), shotgun proteomics allows higher data
throughput and better protein detection sensitivity. This strategy is based on
trypsin digestion of proteins into peptides, which produces a complex peptide mixture
that is then separated by one- or multidimensional liquid chromatography (LC) and
subjected to peptide sequencing by tandem mass spectrometry (MS/MS), followed by
automated database searching. In the present work we employed different methodologies
within shotgun proteomics to generate solid and comprehensive data on COPD.
Many biological materials can be used to investigate biomarkers for this disease.
Although COPD is now known to have a systemic inflammation component that affects
other organs, it is in the lung that the events leading to breathlessness take place.
Investigating the lung directly is therefore an optimal strategy for identifying
proteins that may not be detectable elsewhere, either because they are not present or
because they are diluted to undetectable concentrations. However, this requires lung
tissue to be collected by biopsy, which is an extremely invasive technique. Besides
tissue biospecimens, other sources of biological material used to study COPD are
biofluids, which include sputum, bronchoalveolar and nasal lavage fluid, exhaled
breath condensate, and blood.
Microscopy had previously shown that red blood cells (RBCs) from COPD patients are
deformed in shape. RBCs are crucial to the transport of oxygen from the lungs to the
cells, and this transport depends on their ability to change shape rapidly while
navigating through blood vessels. In addition, RBCs play a crucial role in
antioxidant defense against oxidative stress, which has long been recognized as a
feature of COPD. In this work we used an RBC membrane fractionation procedure,
stable isotope labeling and two-dimensional liquid chromatography (strong cation
exchange / reversed phase) before sample acquisition on a high-resolution Fourier
transform ion cyclotron resonance (FT-ICR) mass spectrometer (Chapter III). A total
of 4697 peptides, corresponding to 1083 proteins, were quantified as present in both
COPD and control spectra. Three hundred and fourteen proteins were identified by at
least two peptides, 46% of which were annotated as membrane proteins.
Golgin-245/p230 (GOLGA4) was identified as overexpressed in COPD. This protein is
reported to be essential for intracellular trafficking and cell surface delivery of
tumor necrosis factor-α (TNF), the main proinflammatory cytokine made and secreted by
inflammatory macrophages, which enhances activation and recruitment of T-cells and
ensures robust innate and acquired immune responses. Chorein, or vacuolar protein
sorting-associated protein 13A (VPS13A), is reported to play a role in cytoskeleton
organization and has been associated with thorny deformations of circulating
erythrocytes, possibly due to deformation of red cell membranes. This protein was
found by MS to be underexpressed in COPD patients compared with controls, and this
underexpression was confirmed by Western blot (WB). Consequently, underexpression of
chorein may play an important role in the deformation of COPD RBCs. Many other
interesting proteins were identified in the context of COPD and, additionally, a
considerable number of proteins were described in RBCs for the first time
(Chapter III).
To overcome the difficulty of acquiring fresh biopsies from well-characterized
patients, we established in our laboratory a procedure to collect fresh human nasal
epithelial cells. We have shown previously that these cells have a proteome similar
to that of epithelial cells of the lower airway. Here, two different types of study
are presented using these cells: a study of the effects of cigarette smoke, which is
the main risk factor for developing COPD (Chapter IV), and a comparative proteomic
study between COPD patients and healthy individuals (Chapter V). Both were pioneering
studies, investigating the proteome of fresh nasal epithelial cells from cigarette
smokers (Chapter IV) or COPD patients (Chapter V). In both studies a high-resolution
mass analyzer, the orbitrap, was employed, increasing the number of confident
peptide/protein identifications. In Chapter IV, ninety-six proteins were found to be
differentially expressed between the proteomes of healthy smokers and nonsmokers.
These proteins were related to processes of antigen presentation, cell-to-cell
signaling and interaction, cell morphology, drug metabolism, DNA repair, energy
production or mitochondrial dysfunction. Although requiring further orthogonal
validation, our data were consistent with previous evidence showing differential
modulation of CD44, MUC5AC or SOD2 in smokers due to inflammatory response pathways.
In Chapter V, 89968 peptides and 1475 proteins were identified in total, of which 1173
proteins were identified by at least two peptides. We were able to confirm previous
evidence that the unfolded protein response (UPR) is activated in COPD patients,
since we observed overexpression of a considerable number of proteins belonging to
different protein complexes involved in the UPR. These include VCP, both components
of the Hsp10/Hsp60 chaperone complex (HSPD1 and HSPE1), CALR and two members of a
large ER-localized multiprotein complex of at least 11 proteins, PPIB and ERP29. We
also observed increased expression of proteins related to the Nrf2-mediated oxidative
stress response, such as GSTP1, TXNRD1 and GSR. Additionally, we report an increase
in drug metabolism, as all significantly differentially expressed proteins related to
this biofunction were overexpressed in COPD: GSTP1, GSR, AKR1C3 and ANXA2. Further
validation by orthogonal methods is needed to fully confirm the activation of the UPR
and the Nrf2-mediated oxidative stress response, and the increase in drug metabolism,
in the nasal epithelial cells of COPD patients.
In Chapter VI, serum collected from COPD patients was divided into four groups
covering all combinations of presence/absence of the two main features of COPD,
chronic bronchitis and emphysema, to study their impact on the serum proteome.
Because serum is a complex protein mixture, it was first immunodepleted of its most
abundant proteins, which comprise about 94% of total protein content, before being
analyzed by 1D-PAGE – LC-MS/MS (GeLC-MS/MS) on a linear ion trap mass spectrometer.
This powerful strategy identified as many as 2856 proteins, of which 929 were
identified by two or more peptides. Plasminogen was found to be underexpressed in
COPD patients who suffer simultaneously from emphysema and chronic bronchitis, while
its expression level remained about the same across the three other groups of COPD
patients; this differential expression was successfully validated by ELISA. Other
interesting proteins were also identified. One is TRAF3IP2, which is associated with
innate immunity in response to pathogens, inflammatory signals and stress, and has
also been implicated in airway hyperresponsiveness. Another is Isoform 1 of
phosphatidylinositol-glycan-specific phospholipase D (GPLD1), a GPI-degrading enzyme
described as responsible for the secretion of prostasin, the first of several
membrane serine peptidases found to activate the epithelial sodium channel (ENaC).
Prostasin was also reported to have a critical role in regulating epithelial sodium
transport in normal and pathological conditions in the lung.
The work presented here confirmed a few findings that had already been reported and,
more importantly, revealed new insights into COPD disease mechanisms and provided new
candidate biomarkers for the disease. Further validation and integration of all the
data obtained into a systems biology approach will certainly contribute to increasing
knowledge of COPD and, ultimately, to improving the well-being of patients.
Resumo (Summary)
Chronic Obstructive Pulmonary Disease (COPD) is characterized by an obstructive
limitation of airflow from the outside to the alveoli and from the alveoli to the
outside. This ventilatory limitation is not fully reversible even after the
administration of a bronchodilator and tends to be progressive. The causes leading to
this limitation are a set of respiratory pathologies that includes chronic bronchitis
and emphysema. In chronic bronchitis, airflow obstruction results from inflammation of
the lower airways, scarring of their walls, edema of their lining, mucus and smooth
muscle spasm. Chronic bronchitis manifests as frequent cough and increased sputum
production. However, not all patients with chronic bronchitis have or will develop
chronic airflow limitation. In emphysema, the alveolar walls are destroyed, so the
bronchioles lose their structural support and therefore collapse during exhalation.
Consequently, in emphysema the reduction in airflow is permanent and structural in
origin. Currently, no treatment is able to reduce the progression of COPD or suppress
the inflammation of the small airways and lung parenchyma. COPD is today the fourth
leading cause of death worldwide, and its prevalence and mortality are expected to
continue increasing over the next decade. In Portugal, according to a report released
by the Observatório Nacional das Doenças Respiratórias (ONDR), about 540 000 people
suffered from COPD in 2007, which means that 5.2% of the population has this
condition. Recently, a report released by the Burden of Obstructive Lung Disease
(BOLD) initiative revealed that the prevalence of COPD in Portugal is 14.2%.
Spirometry is fundamental in the diagnosis and assessment of COPD because it is the
most objective, standardized and easily reproducible means of measuring the degree of
airway obstruction. Bronchial obstruction, and therefore COPD, is considered to be
present when, after administration of a bronchodilator, the FEV1/FVC ratio is below
70% (FEV1: Forced Expiratory Volume in 1 second; FVC: Forced Vital Capacity).
Proteomics is able to provide large-scale information and, consequently, to broaden
current knowledge of COPD. Surprisingly, given the need to find new biomarkers in COPD
and the potential of proteomics, it has so far been neglected in research on the
disease. Currently, only 50 scientific publications, 14 of which are review articles,
match a search on the PubMed site (http://www.ncbi.nlm.nih.gov/pubmed, accessed
June 23, 2011) with the term “COPD Proteomics”. However, when the terms “COPD” and
“Proteomics” are searched separately, the total number of articles is approximately
30 000. There is therefore a clear need to produce new proteomics studies to fill
this gap and find new biomarkers for COPD. In the last decade, the approach known as
shotgun proteomics has become the approach of choice for identifying and quantifying
proteins in large-scale studies. Compared with two-dimensional electrophoresis (2DE),
this approach yields a greater number of peptide and protein identifications because
of its higher sensitivity. The shotgun proteomics strategy is based on the tryptic
digestion of proteins into peptides, which leads to a complex mixture of peptides
that are separated in one or multiple chromatographic dimensions. The peptides are
subsequently sequenced in a mass spectrometer and searched automatically against a
database. In the present work, different methodologies within the shotgun proteomics
approach were used with the aim of generating compelling results in COPD.
A wide range of biological materials can be used to search for biomarkers for this
disease. Although COPD has a systemic inflammation component that is responsible for
affecting other organs, it is in the lung that the mechanisms giving rise to
breathlessness are present. Investigating the lung directly is therefore the ideal
strategy for finding the proteins responsible for these mechanisms, which in other
biological materials may be absent or, when present, occur at such low concentrations
that they go undetected. This means having access to lung tissue, which is obtained
by biopsy, an extremely invasive technique. In addition to lung tissue, other sources
of biological material have been used to study COPD, such as sputum, nasal lavage and
bronchoalveolar lavage, exhaled breath condensate, and blood and its components. In
the present work, erythrocytes, nasal epithelial cells and serum were used to study
the disease.
Microscopy studies have reported that erythrocytes from COPD patients exhibit changes
in their morphology. Erythrocytes are fundamental to the transport of oxygen from the
lungs to the cells, and this transport depends on their ability to change shape
rapidly while navigating the blood capillaries. Erythrocytes also play a fundamental
role in combating oxidative stress, which is a feature of COPD. Because of their
hypothesized importance for the disease, erythrocytes were isolated from whole blood
of COPD patients and healthy individuals. In this work (Chapter III), a membrane
fractionation technique was used together with 18O/16O isotopic labeling and
subsequent separation by two-dimensional chromatography (cation exchange and
reversed-phase chromatography). The reversed-phase chromatography was directly
coupled to a high-resolution mass spectrometer (Fourier transform ion cyclotron
resonance mass spectrometer, FT-ICR). A total of 4697 peptides present in both groups
studied were quantified, corresponding to 1083 proteins. Of these, 314 proteins were
identified by two or more peptides, 46% of which are annotated as membrane proteins.
The protein GOLGA4 was identified as overexpressed in COPD. This protein is described
as essential for the intracellular trafficking and cell surface delivery of tumor
necrosis factor-α (TNF), the main proinflammatory cytokine produced and secreted by
macrophages for the activation and recruitment of T cells. It was also possible to
identify the protein VPS13A, which is associated with deformations in circulating
erythrocytes, possibly due to deformations of the membrane. The study concluded that
this protein is underexpressed in COPD patients, a result that was confirmed by
Western blot.
Because of the difficulty of obtaining lung biopsy samples from COPD patients, a
procedure for obtaining nasal epithelial cells was developed in our laboratory.
Previous work showed that these cells behave similarly to the epithelial cells of the
lower airways. Two different types of study were carried out with these cells: a
study of the effects of tobacco smoke, which is the main risk factor for developing
COPD (Chapter IV), and a comparative study between COPD patients and healthy
individuals (Chapter V). Both are pioneering studies, since the proteome of nasal
epithelial cells had never been described either in healthy smokers or in COPD
patients. In the first study, 96 proteins were identified as differentially expressed
between healthy smokers and nonsmokers. These proteins are related to processes of
antigen presentation, cell signaling and interaction, cell morphology, xenobiotic
metabolism, DNA repair, energy production and mitochondrial dysfunction. The results
obtained were consistent with previous evidence showing differential modulation of
CD44, MUC5AC or SOD2 in smokers due to inflammatory response processes.
In the second study, a total of 89968 peptides corresponding to 1475 proteins were
identified, of which 1173 were identified by two or more peptides. It was possible to
confirm results from previous studies reporting that Unfolded Protein Response (UPR)
mechanisms are activated in COPD patients, since overexpression was observed for a
considerable number of proteins belonging to different protein complexes related to
the UPR. Examples of these proteins include VCP, the two components of the
Hsp10/Hsp60 complex, CALR and two members of a large complex of at least 11 proteins
located in the endoplasmic reticulum: PPIB and ERP29. It was also possible to observe
an increase in the expression of proteins related to Nrf2-mediated oxidative stress
response mechanisms, such as GSTP1, TXNRD1 and GSR.
In the study carried out with serum (Chapter VI), four groups were formed to study
the different combinations of presence/absence of the two main components of the
disease: chronic bronchitis and emphysema. Since serum is a complex mixture of
proteins, the serum of COPD patients was first immunodepleted of its most abundant
proteins, which account for about 94% of its total protein content. The
immunodepleted serum samples were analyzed by 1D-PAGE – LC-MS/MS (GeLC-MS/MS). This
powerful strategy allowed the identification of 33049 peptides corresponding to 2856
proteins, of which 929 were identified by two or more peptides. Underexpression of
PLG was observed in COPD patients who suffer simultaneously from emphysema and
chronic bronchitis. This underexpression was validated by ELISA. Other proteins of
interest were found to be differentially expressed in each of the COPD patient
groups, including TRAF3IP2, which is associated with the innate immune response to
pathogens and with airway hyperresponsiveness.
The work described here confirmed previously reported evidence and, at the same time,
formulated new hypotheses for the mechanisms of the disease and revealed potential
new biomarkers. Although some results require further validation by orthogonal
techniques, the present work made possible the discovery of proteins, and even
metabolic pathways, that had not previously been associated with COPD. This work also
emphasizes the use of nasal epithelial cells in investigating the pathogenesis of the
disease, since it can lead to the identification of new biomarkers specific to this
disease.
List of Symbols and Abbreviations
2DE 2-Dimensional Electrophoresis
ACN AcetoNitrile
AMB AMmonium Bicarbonate
CID Collision-Induced Dissociation
COPD Chronic Obstructive Pulmonary Disease
DNA DeoxyriboNucleic Acid
ELISA Enzyme-Linked ImmunoSorbent Assay
ER Endoplasmic Reticulum
ERAD Endoplasmic Reticulum-Associated Degradation
FEV1 Forced Expiratory Volume in 1 second
FVC Forced Vital Capacity
GOLD Global Initiative for Chronic Obstructive Lung Disease
IPA Ingenuity Pathway Analysis
IPI International Protein Index
LC Liquid Chromatography
LIT Linear Ion Trap
MARS Multiple Affinity Removal System
MGG May-Grünwald-Giemsa
MS Mass Spectrometry
PAGE PolyAcrylamide Gel Electrophoresis
RBC Red Blood Cell
RNS Reactive Nitrogen Species
ROS Reactive Oxygen Species
RSD Relative Standard Deviation
RT Room Temperature
SCX Strong Cation eXchange
SDS Sodium Dodecyl Sulfate
TBST Tris-Buffered Saline with Tween 20
TFA TriFluoroacetic Acid
UPR Unfolded Protein Response
WB Western Blot
WHO World Health Organization
Xcorr charge state dependent cross correlation
ΔCn delta correlation
Contents
Declaration
Acknowledgments
Abstract
Resumo (Summary)
List of Symbols and Abbreviations
Contents
List of Figures
List of Tables
Preface
Chapter I - General Introduction
  CHRONIC OBSTRUCTIVE PULMONARY DISEASE
    History
    Definition and prevalence
    Diagnosis and classification
    Risk factors
    Economic burden
  PROTEOMICS
    Background and state-of-the-art
    Proteomics in COPD
Chapter II - Proteomic Mining Of The Red Blood Cell: Focus On The Membrane Proteome
Chapter III - Quantitative Profiling of the Erythrocyte Membrane Proteome Isolated
from Patients Diagnosed with Chronic Obstructive Pulmonary Disease
Chapter IV - A comparative, Global Proteomic Analyses of Human Nasal Epithelial Cells
Obtained by Nasal Brushing in Nonsmoking versus Smoking Healthy Individuals
Chapter V - Proteomic Profiling of Nasal Epithelial Cells in Chronic Obstructive
Pulmonary Disease
Chapter VI - Serum Proteomics of Chronic Obstructive Pulmonary Disease Patients
Chapter VII - Concluding Remarks and Future Perspectives
List of Figures
Figure I.1: Changes in lung parenchyma of COPD patients. Source: Barnes, PJ.
Adapted from www.goldcopd.org.
Figure I.2: Trends in mortality rates for the six leading causes of death in the
US, 1970-2002 [13].
Figure I.3: Inflammatory cells involved in COPD. Adapted from [30].
Figure I.4: Pathogenesis of COPD. Source: Barnes PJ. Adapted from
www.goldcopd.org.
Figure I.5: Pulmonary and systemic inflammatory events associated with
COPD [25].
Figure I.6: Proteomics timeline indicating important scientific contributions
to proteomics development for the past five decades [46].
Figure I.7: General view of the experimental steps and flow of data in
shotgun proteomics analysis [48].
Figure I.8: Workflow illustrating different proteomics-based approaches and
major steps required for proteomic biomarker discovery.
Figure III.1: Basic scheme of methodology showing main steps of sample
preparation.
Figure III.2: SCX chromatogram displaying sample separation into ten
fractions.
Figure III.3: Subcellular location of the 314 proteins identified by at least two
peptides according to both Gene Ontology annotations and the Ingenuity Systems
knowledgebase.
Figure III.4: Biological processes (panels A and C) and molecular functions
(panels B and D) for all proteins identified in both COPD patients and
control subjects (panels A and B) and for differentially (above 1.5-fold)
expressed proteins only (panels C and D). Information gathered from
PANTHER software.
Figure III.5: Proteins identified in both samples within the two main RBC
membrane protein complexes. Adapted from [20].
Figure III.6: Main protein-protein interaction network comprising 43
members generated by Cytoscape 2.6.3 using PINA database.
Figure III.7: Overrepresented biological processes (GO) for the differentially
(above 1.5-fold) expressed proteins in COPD patients.
Figure III.8: Western blot validation showing both representative close-up
views of each Ab reaction and graphic representation of the relative
normalized abundance of (A) Acylamino-acid-releasing enzyme (AARE), (B)
ALDOA, (C) VPS13A and (D) CYB5R3, using the full intensity of the respective
Ponceau-stained lane in the nitrocellulose membrane for normalization (n=3
independent replicates for each Ab reaction). The antigen–antibody complex
was detected by ECL (GE Healthcare) and Progenesis PG200v2006 software
(Nonlinear Dynamics) was used for densitometry analysis.
Figure IV.1: MGG-staining of nasal cells collected by brushing. Magnification:
80x.
Figure IV.2: Basic workflow of the methodology employed for the study of
the nasal epithelial cells proteome of healthy smokers and nonsmokers.
Figure IV.3: Venn diagram showing the overlap in proteins identified by at
least two peptides between the two groups under analysis.
.
xxvii
Figure IV.4: Hierarchical cluster of the significantly differentially expressed
proteins. Protein abundances are displayed as normalized expression. X-axis
labels refer to information displayed in Table IV.1.
Figure IV.5: Top protein network as obtained from Ingenuity Pathway
Analysis.
Figure V.1: Hierarchical clustering of the significantly differentially expressed
proteins between COPD patients and healthy individuals. Protein
abundances are displayed as normalized expression.
Figure V.2: Top protein network as obtained from Ingenuity Pathway
Analysis.
Figure V.3: Merged network comprising all 44 proteins found eligible for
networks analysis by Ingenuity Pathway Analysis.
Figure VI.1: Basic scheme of the methodology employed to study COPD
patients’ proteome.
Figure VI.2: Cellular location of proteins identified by two or more peptides.
Figure VI.3: Hierarchical clustering exhibiting relative abundance of the
eighty-six significantly differentially expressed proteins across the four
groups under analysis.
Figure VI.4: ELISA determination of serum plasminogen in COPD patients.
X-axis labels refer to nomenclature displayed in Table VI.1.
Figure VI.5: Western blot analysis of APOE. Primary antibody dilution:
1:1,000. Secondary antibody dilution: 1:50,000.
List of Tables
Table I.1: Classification of COPD stages based on spirometry [9]. FEV1 = Forced
expiratory volume in 1 sec; FVC = forced vital capacity; PaO2 = arterial partial
pressure of oxygen.
Table III.1: Main characteristics of both control and patient groups.
Table III.2: Profile of the COPD patients.
Table III.3: Predominant pathways associated to COPD patients when
compared to healthy smokers as provided by PANTHER.
Table III.4: Ten most overexpressed proteins in COPD erythrocyte ghost as
provided by Ingenuity systems knowledgebase. a) Swiss-Prot/Uniprot
accession number.
Table III.5: Proteins associated to oxidative stress present in top-10 networks.
a) Swiss-Prot/Uniprot accession number; b) According to ingenuity pathways
analysis.
Table IV.1: Main characteristics of the biological replicates of the samples
under analysis.
Table IV.2: Differentially expressed proteins in smokers (S) when compared to
nonsmokers (NS) exhibiting a >95% confidence interval. Cellular location and
functional type were retrieved by Ingenuity knowledgebase (Ingenuity
Systems).
Table IV.3: Top 5 protein interaction networks generated from proteins found
to be significantly differentially expressed between smokers and
nonsmokers.
Table IV.4: Top 10 significant biofunctions in disease and disorders observed in
differentially expressed proteins of smokers when compared to nonsmokers.
Table V.1: Demographics of biological replicates.
Table V.2: Smoking history of biological replicates.
Table V.3: Differentially expressed proteins in COPD patients when compared
to healthy individuals exhibiting a >95% confidence interval. Fold change along
with cellular location and functional type retrieved by Ingenuity
knowledgebase (Ingenuity Systems) are also provided.
Table V.4: Protein interaction networks generated by IPA from 44 proteins
found to be eligible for network analysis among the 46 significantly
differentially expressed proteins between COPD patients and healthy
individuals.
Table V.5: Top 25 significant biofunctions generated from significantly
differentially expressed proteins on COPD patients when compared to healthy
individuals.
Table V.6: Top 10 significant biofunctions within diseases and disorders
together with proteins involved in each biofunction.
Table V.7: Top 10 significant biofunctions within molecular and cellular
functions together with proteins involved in each biofunction.
Table V.8: Top 10 significant biofunctions within physiological system
development and function together with proteins involved in each biofunction.
Table VI.1: Main characteristics of the biological replicates for each of the
groups under analysis. “No features” (Group A) refers to emphysema and
chronic bronchitis only. (Biol Rep = biological replicate; BMI = body mass index).
Table VI.2: ELISA determination of serum PLG in COPD patients.
Preface
The goal of the work presented here is to identify new biomarkers for chronic
obstructive pulmonary disease and to provide new insights into its pathogenesis and
pathology. A large amount of data was generated in the different studies, and not all
of it could be included due to space constraints. Information referred to as
Supporting Information is therefore provided on the CD attached to this thesis.
Chapter I (“General Introduction”) discusses several aspects of COPD and describes
the state of the art of proteomics, with the aim of explaining the advantages of
applying proteomics to the discovery of new biomarkers in COPD.
Chapter II (“Proteomic Mining Of The Red Blood Cell: Focus On The Membrane
Proteome”) is an introduction to the red blood cell membrane proteome that has been
published in Expert Review of Proteomics (see “List of Publications”). Although the
author of that manuscript and of the present thesis is the same, the chapter is
reproduced here with written authorization from Expert Reviews Ltd., London, UK.
Chapter II acts as an extended introduction to Chapter III (“Quantitative Profiling of the
Erythrocyte Membrane Proteome Isolated from Patients Diagnosed with Chronic
Obstructive Pulmonary Disease”), which describes work performed on the red blood
cells of COPD patients.
Chapters IV and V address work done on the nasal epithelial cell proteome. Chapter
IV (“A comparative, global proteomic analyses of human nasal epithelial cells obtained
by nasal brushing in non-smoking versus smoking healthy individuals”) addresses the
effects of cigarette smoke, the main risk factor for developing COPD, and Chapter V
(“Proteomic profiling of nasal epithelial cells in chronic obstructive pulmonary
disease”) compares the nasal epithelial proteome of COPD patients with that of
healthy individuals.
Chapter VI (“Serum proteomics of chronic obstructive pulmonary disease patients”)
reports the proteome investigation of immunodepleted serum samples and compares
the proteome among four different groups of COPD patients, divided by the
presence or absence of the two main clinical features of COPD: chronic bronchitis and
emphysema.
Finally, Chapter VII (“Concluding Remarks and Future Perspectives”) presents the
main conclusions and achievements of this work and, at the same time, points to
future work.
CHRONIC OBSTRUCTIVE PULMONARY DISEASE
History
Chronic obstructive pulmonary disease (COPD) has certainly long existed, but the first
reports that can be traced to this disease date only from the seventeenth century. The
first presumed report of COPD cases dates from 1679, when Bonet described emphysema
as a condition of “voluminous lungs” [1]. Almost a century later, in 1769, Giovanni
Morgagni described 19 cases of “turbid” lungs, and 20 years after that Matthew Baillie
illustrated an emphysematous lung [1]. Early reports of chronic bronchitis were made in
1814 by Badham, who used the word catarrh to refer to the chronic cough and mucus
hypersecretion that are its key symptoms. He also described bronchiolitis and chronic
bronchitis as disabling disorders [1]. The emphysema component of the disease was
beautifully described by Laënnec (1821) in his Treatise of diseases of the chest. He
recognized that emphysematous lungs were hyperinflated and did not empty well [1].
The spirometer was invented in 1846 by John Hutchinson [1]. This device is today
essential for the correct diagnosis and management of COPD. However, Hutchinson’s
instrument measured only vital capacity. A century went by until Tiffeneau added the
concept of timed vital capacity as a measure of airflow [2]. Building on Tiffeneau’s
work, Gaensler introduced the concept of the air velocity index and later the forced
vital capacity [3], which is the foundation of the FEV1 and the FEV1/FVC percent; with
this, spirometry became complete as a COPD diagnostic instrument. What was once called
chronic obstructive bronchopulmonary disease, chronic airflow obstruction, chronic
obstructive lung disease, nonspecific chronic pulmonary disease, and diffuse
obstructive pulmonary syndrome was named COPD in 1965 by William Briscoe [4].
Definition and prevalence
COPD is characterized by chronic airflow limitation that is not fully reversible, even
under the effect of bronchodilators, and that is caused by a mixture of small airway
disease (obstructive bronchiolitis) and parenchymal destruction (emphysema) (Figure I.1).
Indeed, the main components of COPD are chronic bronchitis and emphysema. Chronic
bronchitis is defined by a chronic, recurrent increase in bronchial secretions
sufficient to cause expectoration. These secretions must be present on most
days for a minimum of three months per year for at least two consecutive years and
cannot be attributed to other disorders [5]. Notably, not every patient with
chronic bronchitis has or will develop chronic airflow limitation [6]. Emphysema is
defined anatomically by permanent, destructive enlargement of the airspaces distal to
the terminal bronchioles without obvious fibrosis [5].
Figure I.1: Changes in lung parenchyma of COPD patients. Source: Barnes, PJ. Adapted
from www.goldcopd.org
The associated chronic inflammation causes changes in and narrowing of the small
airways, leading to airway remodeling. Parenchymal destruction is responsible for the
loss of alveolar attachments and the decrease in lung elastic recoil [7]. These changes
reduce the ability of the airways to remain open during expiration. No currently
available treatment slows the progression of COPD or suppresses the inflammation in
the small airways and lung parenchyma [8]. According to the latest report of the Global
Initiative for Chronic Obstructive Lung Disease (GOLD), COPD is a preventable and
treatable disease with some significant extrapulmonary effects that may contribute to
its severity in individual patients [9].
COPD is a major cause of morbidity and mortality in adults, and its incidence has been
increasing worldwide. In 2000, approximately 2.7 million deaths were caused by COPD
[10]. At present, COPD is the fourth leading cause of death, and its prevalence
and mortality are expected to continue increasing in the next decade [9-11]. Additionally,
COPD is the only major cause of death that is increasing in prevalence worldwide [12],
while other causes have been declining since 1970 [13] (Figure I.2). Both the mortality
and the morbidity associated with COPD are often underestimated by healthcare
providers and patients, as COPD is frequently underdiagnosed and undertreated [14].
Figure I.2: Trends in mortality rates for the six leading causes of death in the US,
1970-2002 [13].
In Portugal, according to a report released by the national observatory for
respiratory diseases (Observatório Nacional das Doenças Respiratórias, ONDR), about
540 000 people were suffering from COPD in 2007, which corresponds to an estimated
5.2% of the population [15]. In contrast, a recent report of the Burden of Obstructive
Lung Disease (BOLD) initiative in Portugal put the prevalence of COPD in the country
at 14.2% [16].
Diagnosis and classification
Spirometry is the most reproducible way to measure lung function and is currently
the best tool to diagnose airflow limitation and, consequently, COPD itself.
Spirometry should be performed after bronchodilator administration to minimize
variability. If the ratio of forced expiratory volume in one second to forced vital
capacity (FEV1/FVC) is lower than 0.7, there is airflow obstruction, and the severity
of the disease is then graded by the FEV1 value according to the GOLD guidelines [9].
The classification of the disease into four stages is given in Table I.1.
Table I.1: Classification of COPD stages based on spirometry [9]. FEV1 = Forced
expiratory volume in 1 sec; FVC = forced vital capacity; PaO2 = arterial partial pressure
of oxygen.
Disease stage          Main characteristics
1: Mild COPD           FEV1/FVC < 70%; FEV1 ≥ 80% predicted; with or without symptoms
2: Moderate COPD       FEV1/FVC < 70%; 50% ≤ FEV1 < 80% predicted; with or without symptoms
3: Severe COPD         FEV1/FVC < 70%; 30% ≤ FEV1 < 50% predicted; with or without symptoms
4: Very severe COPD    FEV1/FVC < 70%; FEV1 < 30% predicted, or FEV1 < 50% predicted plus
                       presence of chronic respiratory failure (PaO2 < 60 mm Hg while
                       breathing room air at sea level)
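As a minimal illustration of how the thresholds in Table I.1 combine, the staging logic can be sketched in Python. The function name, its parameters and the use of None for "no airflow obstruction" are assumptions made for this sketch and are not part of the GOLD guidelines; the inputs are assumed to be post-bronchodilator values.

```python
def gold_stage(fev1_fvc_ratio, fev1_percent_predicted, chronic_respiratory_failure=False):
    """Return the COPD stage (1-4) from Table I.1, or None if there is no obstruction.

    fev1_fvc_ratio: post-bronchodilator FEV1/FVC as a fraction (e.g. 0.65).
    fev1_percent_predicted: FEV1 as a percentage of the predicted value (e.g. 55).
    chronic_respiratory_failure: True if PaO2 < 60 mm Hg on room air at sea level.
    (Hypothetical helper written for illustration only.)
    """
    # Obstruction is defined by FEV1/FVC < 70%.
    if fev1_fvc_ratio >= 0.70:
        return None
    # Stage 4: FEV1 < 30%, or FEV1 < 50% together with chronic respiratory failure.
    if fev1_percent_predicted < 30 or (
        fev1_percent_predicted < 50 and chronic_respiratory_failure
    ):
        return 4
    if fev1_percent_predicted < 50:
        return 3
    if fev1_percent_predicted < 80:
        return 2
    return 1
```

For example, a ratio of 0.65 with an FEV1 of 40% predicted is classified as stage 3, and the same values with chronic respiratory failure as stage 4.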
Risk factors
Cigarette smoke is the most commonly encountered risk factor for COPD. Cigarette
smoking is the leading cause of preventable death worldwide and yet, despite the
anti-smoking campaigns of organizations such as the European Respiratory Society
[12], the American Thoracic Society [17] and the World Health Organization (WHO) [18],
the number of smokers keeps increasing. Thus, the global epidemic of tobacco-associated
diseases has progressively worsened.
Cigarette smokers have a higher prevalence of respiratory symptoms and lung function
abnormalities, a greater rate of decline in forced expiratory volume in the first second
(FEV1), and higher death rates from COPD than nonsmokers [9, 12]. A 25-year follow-up
study of the general population concluded that 92% of COPD deaths occurred in
subjects who were current smokers at the beginning of the follow-up period and that,
after 25 years of smoking, at least 25% of smokers without initial disease will develop
clinically significant COPD and 30-40% will have COPD [19]. The fact that not all smokers
develop clinically significant COPD suggests that genetic factors may modify each
individual's risk [12].
COPD is a polygenic disease and a classical example of gene-environment interaction
[9]. The only proven genetic risk factor for COPD is the hereditary deficiency of
α1-antitrypsin, a major circulating inhibitor of serine proteases; in individuals with
this deficiency, smoking considerably increases the risk of COPD [20]. Gene mutations
and polymorphisms have been studied and several candidate genes associated with COPD
phenotypes have been reported, but so far none has been validated [21-24]. Occupational
dust, outdoor and indoor pollution, socioeconomic status and genetic determinants are
also associated with the development of COPD [12].
Pathology, pathogenesis and pathophysiology
Cigarette smoke and other noxious particles cause amplified lung inflammation in
patients who develop COPD. This may induce parenchymal tissue destruction
(emphysema) and disturb normal repair and defense mechanisms, resulting in small
airway inflammation [25, 26]. Emphysema and small airway inflammation and damage
lead to the enlargement of alveolar air spaces, airway wall fibrosis, loss of elastic recoil,
smooth muscle hypertrophy, goblet cell hyperplasia and mucus plugging. Inflammatory
exudates accumulate in the lumen of the small airways due to reduced mucociliary
escalator function [25]. The physiological consequences are airway collapse during
expiration, leading to airflow obstruction and hyperinflation (air trapping), which
ultimately result in the characteristic symptom of breathlessness and in progressive
airflow limitation that may lead to death. In general, inflammatory and structural
changes in the airways increase with disease severity and persist after smoking
cessation [27-29].
COPD is characterized by a specific pattern of inflammation which involves neutrophils,
macrophages and lymphocytes [7]. These cells release inflammatory mediators and
interact with structural cells in airways and lung parenchyma (Figure I.3).
Figure I.3: Inflammatory cells involved in COPD. Adapted from [30].
The wide variety of inflammatory mediators that have been shown to be increased in
COPD patients [30] attract inflammatory cells from circulation amplifying the
inflammatory process and inducing structural changes that may lead to emphysema
and mucus hypersecretion [31].
Lung inflammation is believed to be further augmented by oxidative stress and an
excess of proteinases in the lung. These two mechanisms are key players in COPD
pathology (Figure I.4).
Figure I.4: Pathogenesis of COPD. Source: Barnes PJ. Adapted from
www.goldcopd.org.
It has long been proposed that several proteases disrupt connective tissue
components, such as elastin, in the lung parenchyma to produce emphysema, and that
in COPD patients there is an imbalance between proteases and the endogenous
antiproteases that should protect the lung against protease-mediated damage. In COPD,
exogenously and endogenously derived oxidants have been found to inactivate
antiproteinases such as α1-antitrypsin [32]. Elastin degradation in COPD has been
demonstrated and, although early attention was directed to neutrophil
elastase, many other proteases have been reported to be able to degrade elastin [7].
Oxidants are generated endogenously and exogenously, with cigarette smoke being
heavily implicated in the latter as it contains many oxygen free radicals [25]. Under
normal circumstances and despite permanent exposure to high oxygen levels, the lung
is able to manage oxidant species by neutralizing them with several antioxidant
mechanisms in the human respiratory tract [7, 25, 33]. Oxidative stress occurs when
reactive oxygen species (ROS) are produced in excess of the antioxidant defense
mechanisms resulting in harmful effects such as damage to lipids, proteins and
deoxyribonucleic acid (DNA) [7, 34]. Inflammatory and structural cells that are
activated in the airways of COPD patients, including neutrophils, eosinophils,
macrophages and epithelial cells, produce ROS [7, 34, 35]. Alveolar macrophages are
activated by free radicals and react by producing high levels of mediators, some of
which are chemotactic for neutrophils and macrophages (Figure I.3), as well as ROS
and also reactive nitrogen species (RNS), with resultant local and systemic
inflammation [25].
It is increasingly recognized that the inflammatory response associated with COPD
extends beyond the lung [36]. Evidence of systemic inflammation includes activated
circulating inflammatory cells and elevated levels of inflammatory markers such as
C-reactive protein, fibrinogen, leukocytes and tumor necrosis factor (TNF-α) in COPD
patients when compared to healthy subjects [37].
The origin of systemic inflammation in COPD is still unclear and requires further
investigation, but it is likely to be a consequence of a number of factors, including
individual susceptibility and the direct effects of hypoxia and of noxious substances,
such as those in cigarette smoke, on the peripheral vasculature and circulating inflammatory
cells [25]. Alternatively, the observed inflammation may be a consequence of
‘overspill’ from the lung to the peripheral circulation [25]. Systemic inflammation is
directly linked to a number of complications commonly encountered in COPD patients
including, but not limited to, cachexia, skeletal muscle dysfunction, depression,
osteoporosis, diabetes/glucose intolerance, autoimmune disorders and cardiovascular
diseases (Figure I.5) [25, 36].
Figure I.5: Pulmonary and systemic inflammatory events associated with COPD [25].
Economic burden
Among respiratory diseases, COPD is the leading cause of lost work days. In the United
States of America, medical costs attributed to COPD were estimated at $32.1 billion [38].
In the European Union, productivity losses are estimated at a total of €28.5
billion annually [12]. Total COPD-related expenses for outpatient care are €4.7
billion, while inpatient care generates costs of €2.9 billion, followed by expenses for
pharmaceuticals at €2.7 billion [12]. Therefore, according to these data, provided by
the European Respiratory Society and the European Lung Foundation in the
European Lung White Book, indirect costs represent the major financial burden of
COPD and, more importantly, the total costs associated with the disease are
substantial. Another study compared the costs of COPD in Spain, USA, Sweden, Holland
and Italy. Global annual costs for each country ranged from €109 to €541 million, whilst
annual costs per patient were €151-3,91 [39].
Economic burden is likely to be underestimated since, for example, the economic value
of the care provided by family members is not generally acknowledged. Long-term
home care provided by relatives for COPD patients has a negative impact on
professional careers for both patients and their family members [5]. Hence, COPD
represents a very important threat to global economies.
PROTEOMICS
The proteome is, by definition, the total set of proteins expressed by a given cell,
tissue, organ or organism at a certain time and under certain conditions. Proteomics is
the large-scale study of the proteome. The human genome was sequenced a decade ago
and about 20,000 genes were identified. The genome sequence provides valuable
information for proteomics, which can take this knowledge to a higher level by
providing new insights into the pathophysiology of many diseases that may be
translated into new prognostic and diagnostic tools, as well as novel therapies.
Background and state-of-the-art
The word proteome was coined at the Siena 2D electrophoresis meeting in 1994 [40].
The advent of proteomics has brought with it the hope of discovering novel
biomarkers that can be used to diagnose diseases, predict susceptibility and monitor
progression, among many other applications. This hope is built on the ability of
proteomic technologies, such as mass spectrometry (MS), to identify hundreds of
proteins in complex biofluids such as plasma and serum. Very few, if any, analytical
instruments surpass the mass spectrometer in the versatility of its application in both
basic and applied research, as is the case with biomarker discovery. To support this
statement, it is sufficient to mention that mass spectrometry can be used for
applications ranging from the characterization of electronic excited states and
vibrational levels of simple molecules to the construction of protein interaction maps
in multicellular organisms. This is also the result of almost a hundred years of mass
spectrometry use since Sir J. J. Thomson created the first mass
spectrometer in 1913 [41]. For over 80 years, ionization methods excluded the
study of large molecules, including peptides and proteins. In the 1980s, this paradigm
changed with the introduction of new ionization methods such as electrospray ionization
(ESI) [42] and matrix-assisted laser desorption ionization (MALDI) [43, 44]. These
simple and sensitive ionization methods have been coupled to different types of
analyzers such as triple quadrupoles, three-dimensional ion traps, and time of flight
(TOF), including its orthogonal version, which allowed the coupling of TOF to both pulsed
(MALDI) and continuous (ESI) ionization types [40].
A further impetus was given to the process of ion analysis through the
commercialization of hybrid configurations that have been intensively used in
proteomics including, but not limited to, TOF-TOF, ion trap-Fourier transform (FT)-ion
cyclotron resonance (ICR), and quadrupole-TOF. These combinations have a direct
impact on sensitivity and resolution of the sequence information that can be obtained
when performing tandem MS analysis [40].
At the same time as MS was evolving, there were many advances in other fields that were
crucial to the development of proteomics, such as sample preparation techniques and
bioinformatics tools. One of the most widely used separation procedures is
two-dimensional gel electrophoresis (2DE), which consists of the separation of a complex
protein mixture according to the physicochemical properties of the proteins. First,
proteins are separated in one dimension according to their isoelectric point through
isoelectric focusing (IEF) in immobilized pH gradient (IPG) strips, and then separated
in a second dimension according to their molecular weight by sodium dodecyl sulfate
polyacrylamide gel electrophoresis (SDS-PAGE). This separation method was described
in the form used today by O’Farrell in 1975 [45]. SDS-PAGE is one of many achievements
that took place in the last 50 years and that established proteomics in the first line of
clinical research at the present time (Figure I.6) [46].
Figure I.6: Proteomics timeline indicating important scientific contributions to
proteomics development for the past five decades [46].
Figure I.7: General view of the experimental steps and flow of data in shotgun
proteomics analysis [48].
In the last decade, the shotgun proteomics approach has become the method of choice for
identifying and quantifying proteins in most large-scale studies [47-50]. Compared with
2DE, shotgun proteomics allows higher data throughput and better protein detection
sensitivity. This strategy is based on digesting proteins (usually with trypsin) into
peptides. This produces a complex peptide mixture that is then separated by one- or
multidimensional liquid chromatography (LC) and subjected to peptide sequencing
using tandem mass spectrometry (MS/MS) before automated database searching
(Figure I.7). This strategy is compatible with the use of labeled samples for quantitative
purposes, using approaches such as stable isotope labeling by amino acids in cell
culture (SILAC) [51], isotope coded affinity tags (ICAT) [52], isobaric tags for relative
and absolute quantitation (iTRAQ™) [53] or 16O/18O exchange [54].
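The digestion step that opens the shotgun workflow can be illustrated with a short in silico sketch in Python. It assumes the commonly used trypsin rule (cleavage on the C-terminal side of lysine, K, or arginine, R, except when the next residue is proline, P). This rule, the function name and the missed_cleavages parameter are assumptions of the sketch, not details taken from the studies described in this thesis.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return the peptides produced by an idealized tryptic digest.

    Assumed rule: cleave after K or R unless the following residue is P.
    missed_cleavages is the maximum number of skipped cleavage sites per peptide.
    (Hypothetical helper written for illustration only.)
    """
    if not sequence:
        return []
    # Positions (0-based, exclusive end) where the chain is cut.
    sites = [
        i + 1
        for i, residue in enumerate(sequence[:-1])
        if residue in "KR" and sequence[i + 1] != "P"
    ]
    bounds = [0] + sites + [len(sequence)]
    peptides = []
    for start in range(len(bounds) - 1):
        # Extend each peptide over up to `missed_cleavages` uncut sites.
        last = min(start + 2 + missed_cleavages, len(bounds))
        for end in range(start + 1, last):
            peptides.append(sequence[bounds[start]:bounds[end]])
    return peptides


# Toy sequence (not a real protein): the R followed by P is not cleaved.
print(tryptic_digest("AAKRPGGRCCK"))
```

In a real workflow, the theoretical peptides generated in this way for every protein in a sequence database are what the measured MS/MS spectra are matched against during automated database searching.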
The possibility of combining different sample preparation techniques with
different separation methods and different types of mass spectrometers generates
multiple complementary approaches whose results can be combined to achieve a
higher level of understanding (Figure I.8).
Figure I.8: Workflow illustrating different proteomics-based approaches and major
steps required for proteomic biomarker discovery.
Proteomics in COPD
There is still some ambiguity concerning the disease-specific molecular
mechanisms of the inflammatory process and of acute exacerbations of COPD. Potential
biomarkers that are specific for COPD have not been fully identified and validated,
even though there is a great need for such biomarkers [55]. Proteomic technologies
allow the identification of protein changes caused by the disease process, and recent
advances, especially in mass spectrometry and bioinformatics, raise the chances
of identifying novel putative biomarkers. In a recent review, Chen and coworkers provide
information on putative biomarkers for COPD generated from several proteomics
studies in lung tissues, bronchoalveolar lavage fluid (BALF) and sputum [55].
Concerning lung tissues, two studies mentioned in this review yielded 12 proteins that
were differentially expressed when compared to healthy controls, including surfactant
protein A (SP-A) and matrix metalloproteinase-13 (MMP-13) [56, 57]. Regarding BALF,
four proteins were reported to be differentially expressed in COPD (neutrophil
defensins 1 and 2 and calgranulins A and B) [58], and in sputum, Clara cell secretory
protein (CCSP) and again SP-A were the two proteins reported as potential biomarkers
[57, 59].
Surprisingly, to date only 50 reports (14 of which are reviews) match a PubMed search
(http://www.ncbi.nlm.nih.gov/pubmed, accessed June 23, 2011) for COPD proteomics,
while proteomics and COPD account for about 30,000 each when searched separately.
Hence, there is a clear need for clinically valuable proteomics studies in order to
meet the need for new biomarkers in COPD.
Proteomic Mining Of The Red Blood Cell: Focus On The Membrane Proteome
Bruno M. Alexandre1*
1 Laboratório de Proteómica, Departamento de Genética, Instituto Nacional de Saúde
Dr. Ricardo Jorge (INSA-IP), Lisboa, Portugal
Keywords: Red blood cell; Membrane proteins; Membrane proteomics; Clinical
proteomics; Malaria.
*Corresponding author: Bruno M. Alexandre, Laboratório de Proteómica,
Departamento de Genética, Edifício INSA II, Instituto Nacional de Saúde Dr. Ricardo
Jorge, INSA, I.P., Avenida Padre Cruz, 1649-016 Lisboa, Portugal, Tel: +351 21750 8138,
Fax: +351 21752 6410, e-mail: [email protected]
Alexandre, B. M., Proteomic mining of the red blood cell: focus on the membrane
proteome. Expert review of proteomics 2010, 7, 165-168.
Manuscript reproduced under written authorization from Expert Reviews Ltd., London,
UK.
The plasma membrane is strategically located at the interface between the inside and
the outside of the cell. Membrane proteins, as part of the plasma membrane, act as
key players mediating diverse cellular functions including, but not limited to,
metabolite and ion transport, intercellular communication, cell adhesion and cell
movement [1].
The main function of the red blood cell (RBC) is to mediate O2/CO2 exchange between
cells/tissues and lungs. This is only achieved thanks to the morphology and mechanical
deformability of the RBC membrane, which is responsible for the cell's capability to
pass through vessels and capillaries along its 120-day journey. The RBC membrane
combines distinctive features such as high elasticity (with little increase in
surface area) and robustness (stronger than steel in terms of structural resistance) [2].
These unique properties result from a composite structure in which cholesterol and
phospholipids, which compose the plasma membrane envelope, are anchored to a
two-dimensional elastic network of skeletal proteins through transmembrane proteins
embedded in the lipid bilayer [2, 3]. Appropriate function of integral membrane
proteins and their interaction with the cytoskeleton are vital for the maintenance of
structural stability and RBC shape. Failure of any of these events or constituents is
the cause of many red cell disorders [4, 5]. Hence, studying the membrane proteome is
important for understanding the biology of disease states in the quest for novel
biomarkers, and consequently it is also important at the pharmacological level, as many
successful drugs known to date target membrane proteins and modulate their activity [6].
Since its very beginning, the study of membrane proteomes turned out to be a major
challenge. In 1974, a review was published concerning the organization of the human
RBC membrane compiling several studies and hypothesizing on a possible arrangement
for the most abundant RBC integral membrane proteins and interactors [7]. Given their
importance, it is not surprising that membrane proteins have been studied by a variety
of biochemical techniques. One of those techniques is two-dimensional gel
electrophoresis (2-DE), which paved the way to the proteomics era [8]. In 1978, two
years after the application of the O’Farrell 2-DE system [9] to the study of membrane
proteins [10], over 200 spots were resolved from human RBC membranes [11]. However, the
application of 2-DE to the study of membrane proteins was far from ideal. This is
due to the unique characteristics of membrane proteins: their amphiphilic nature
(poor solubility in the aqueous buffers used for isoelectric focusing), their high isoelectric
points [pIs can be higher than the upper limit of the immobilized pH gradient (IPG)
strips] and their low abundance, all of which greatly hinder their detection by 2-DE [1, 12-14].
To overcome the solubility issue, Rosenblum et al. used different concentrations of
urea, NP-40 detergent and mercaptoethanol to detect about 600 spots using silver
staining [15]. However, in the large majority of the publications released up to the
early 1990s, the improvement reported lay essentially in the
number of additional spots and the reproducibility of the new or modified
methodologies rather than in the identification and classification of any new
proteins found. One must bear in mind that before the development of an analytical
technique for naïve peptide/protein identification, in this case peptide mass
fingerprinting (PMF) [16-20], proteins could only be identified by means of targeted
approaches such as comigration with known proteins or immunoblotting, a more sensitive
technique [21]. Therefore, the first ‘serious’ proteomic study of RBC membranes, i.e.,
the first study in which modern mass spectrometry (MS) and database searching in the
post-genomic era were employed to study RBC membranes, was the one performed by
Low and co-workers in 2002 [22]. Using one-dimensional gel electrophoresis (1-DE)
and 2-DE, silver staining and in-gel trypsin digestion of selected spots followed by
matrix-assisted laser desorption ionization – time of flight (MALDI-TOF) MS, the
authors were able to identify 84 unique proteins: 59 proteins were identified by 2-DE
and 44 by 1-DE (19 proteins were common to both approaches). In addition, several
isoforms were found in the study. The first in-depth study on the RBC proteome was
conducted in 2004, where cytoplasmic and membrane fractions were further
fractionated and resulting sub-fractions were then analyzed (and classified) resulting in
the identification of 181 unique proteins, 91 of which were identified from the
membrane fraction [23]. In 2005, Tyan et al. were able to identify 272 proteins using a
trypsin-immobilized chip used for protein digestion prior to two-dimensional high
performance liquid chromatography – electrospray ionization (2D-HPLC-ESI)-MS/MS
[24]. In the same year, Bruschi et al. [25] presented an approach to improve the
analysis of high Mw proteins in 2-DE. The authors used diluted Immobiline gels
combined with sample delipidation generating gels with more than 500 spots,
including filamentous proteins such as spectrins and ankyrins and integral membrane
proteins such as bands 3, 4.1 and 4.2 [25]. Still in 2005, Kakhniashvili et al. used
two-dimensional fluorescence difference gel electrophoresis (2D-DIGE) to compare the RBC
membrane profile of one sickle cell disease (SCD) patient to one healthy individual and
came up with 49 differentially expressed spots using a threshold of 2.5-fold. Selected
spots were further analyzed by LC-MS after in-gel trypsin digestion to identify 44
protein forms from 22 unique proteins [26]. The same strategy was employed to
investigate the therapeutic action of hydroxyurea in SCD [27]. In 2006, Pasini and
colleagues published the most complete study on the human RBC membrane
proteome known to date [28]. By combining sample preparation techniques and top
quality MS instruments such as quadrupole time-of-flight (Q-TOF) and Fourier
transform – ion cyclotron resonance (FT-ICR) they were able to identify 314 membrane
proteins (and also 252 soluble proteins). A very promising gel-based approach for
analysis of membrane proteins is two-dimensional blue-native (BN)/SDS PAGE. This
novel approach was applied to the study of the RBC membrane by van Gestel et al.
who were able to detect 146 spots, from which 524 unique proteins were identified by
LC-MS. These data were compared with two other comprehensive datasets, produced by
Pasini et al. [28] and Bosman et al. [29], and it was striking to observe that only 112
of a total of 1431 unique proteins were identified in all three studies. In
addition, the authors were able to use BN/SDS PAGE in combination with CyDye
labeling to quantitatively analyze samples from healthy volunteers and a patient
suffering from congenital anemia [30], an approach that can potentially be used for
biomarker discovery. Also noteworthy, although it targeted only cytoplasmic proteins, is
the study carried out by Roux-Dalvai et al., in which a new technology, the peptide
ligand library, enabled the identification of as many as 1578 proteins from a
highly purified preparation of RBCs.
Datasets from some of the studies presented here were gathered in a minireview
paper [31], but unfortunately the authors did not provide the
accession numbers, which makes it difficult both to establish how many proteins
have been found so far and to compare newly obtained datasets with the data
collected to date. Indeed, no review produced to date compiles the information
concerning protein identifications (and classification) together with the corresponding
accession numbers (e.g., UniProt, IPI) from the different studies, including very interesting
recently published reviews [32-35].
Conclusion & future outlook
The RBC proteome was long believed to be a simple one, given that RBCs are
enucleated cells that lack internal organelles and protein synthesis machinery. But in
the past decade, knowledge of the RBC proteome has increased dramatically and
changed this picture. This was due not only to developments in sample preparation
techniques (e.g., fractionation, depletion/enrichment), but mostly to the use of
sophisticated mass spectrometers, appropriate search algorithms and
comprehensive human protein databases. To fully understand how
the RBC works, and particularly the different roles played by its membrane, it is necessary
to identify every single protein, together with its structure, function, posttranslational
modifications, interactions, location and abundance in the cell. Unraveling the RBC membrane
proteome will have a tremendous impact on medicine, including hot topics
such as transfusion medicine [33, 36, 37] and malaria. Malaria, which is caused by a
eukaryotic protist of the genus Plasmodium, is responsible for the death of about 3
million people worldwide [38]. Malaria parasites first invade hepatocytes of the human
host before traveling into the blood to infect RBCs. Because circulating infected RBCs are
removed in the spleen, P. falciparum (responsible for 80% of malaria cases and 90% of
deaths from malaria) displays adhesion proteins at the RBC surface, causing RBCs to attach
to the blood vessels. Surface adhesion proteins such as Plasmodium falciparum
erythrocyte membrane protein 1 (PfEMP1) are exposed to the immune system and
would therefore be an easy target. Remarkably, however, the parasites stay one step
ahead of the immune system by presenting extreme diversity in PfEMP1 isoforms:
there are over 60 variations of the protein within a single parasite and virtually
limitless versions within parasite populations [39]. Furthermore, there are other RBC
proteins of particular interest, such as glucose-6-phosphate dehydrogenase or Duffy
antigens (used by P. vivax to enter the cell), whose expression deficiency in RBCs
results in increased protection against P. vivax and severe malaria [40-42]. These facts
by themselves present major challenges for clinicians and researchers and make
an important case for the continued detailed study of the RBC membrane proteome. But
clinical proteomics applications are far from limited to diseases that directly
affect the RBC; diseases that do not affect the RBC directly but nonetheless provoke
alterations in it are no less worth exploiting for diagnostic purposes.
For instance, when RBCs are depleted of critical enzymes needed for intermediary
metabolism and antioxidant activity, the result is oxidation of critical membrane
proteins, lipids and hemoglobin, which leads to distortion and rigidity of the RBC
membrane.
It is also important to acknowledge that RBCs are among the most abundant cells in
humans and are involved in numerous processes through their interplay with other
blood cells and endothelial cells over a period that may last as long as 4 months.
They can thus accumulate modifications on their proteins (surface and membrane proteins
will likely be the most affected) that indirectly report the underlying features
of a specific pathology, potentially before any symptoms appear.
REFERENCES
[1] Rabilloud, T., Membrane proteins and proteomics: love is possible, but so difficult.
Electrophoresis 2009, 30 Suppl 1, S174-180.
[2] Mohandas, N., Gallagher, P. G., Red cell membrane: past, present, and future.
Blood 2008, 112, 3939-3948.
[3] Mohandas, N., Evans, E., Mechanical properties of the red cell membrane in
relation to molecular structure and genetic defects. Annual review of biophysics and
biomolecular structure 1994, 23, 787-818.
[4] An, X., Mohandas, N., Disorders of red cell membrane. British journal of
haematology 2008, 141, 367-375.
[5] Delaunay, J., The molecular basis of hereditary red cell membrane disorders. Blood
reviews 2007, 21, 1-20.
[6] Hopkins, A. L., Groom, C. R., The druggable genome. Nat Rev Drug Discov 2002, 1,
727-730.
[7] Steck, T. L., The organization of proteins in the human red blood cell membrane. A
review. The Journal of cell biology 1974, 62, 1-19.
[8] Penque, D., 2009, pp. 155-172.
[9] O'Farrell, P. H., High resolution two-dimensional electrophoresis of proteins. The
Journal of biological chemistry 1975, 250, 4007-4021.
[10] Ames, G. F., Nikaido, K., Two-dimensional gel electrophoresis of membrane
proteins. Biochemistry 1976, 15, 616-623.
[11] Rubin, R. W., Milikowski, C., Over two hundred polypeptides resolved from the
human erythrocyte membrane. Biochimica et biophysica acta 1978, 509, 100-110.
[12] Santoni, V., Molloy, M., Rabilloud, T., Membrane proteins and proteomics: un
amour impossible? Electrophoresis 2000, 21, 1054-1070.
[13] Rabilloud, T., Chevallet, M., Luche, S., Lelong, C., Fully denaturing two-dimensional
electrophoresis of membrane proteins: a critical update. Proteomics 2008, 8, 3965-
3973.
[14] Zhang, H., Lin, Q., Ponnusamy, S., Kothandaraman, N., et al., Differential recovery
of membrane proteins after extraction by aqueous methanol and trifluoroethanol.
Proteomics 2007, 7, 1654-1663.
[15] Rosenblum, B. B., Hanash, S. M., Yew, N., Neel, J. V., Two-dimensional
electrophoretic analysis of erythrocyte membranes. Clinical chemistry 1982, 28, 925-
931.
[16] Pappin, D. J., Hojrup, P., Bleasby, A. J., Rapid identification of proteins by peptide-
mass fingerprinting. Curr Biol 1993, 3, 327-332.
[17] Henzel, W. J., Billeci, T. M., Stults, J. T., Wong, S. C., et al., Identifying proteins
from two-dimensional gels by molecular mass searching of peptide fragments in
protein sequence databases. Proceedings of the National Academy of Sciences of the
United States of America 1993, 90, 5011-5015.
[18] Mann, M., Hojrup, P., Roepstorff, P., Use of mass spectrometric molecular weight
information to identify proteins in sequence databases. Biological mass spectrometry
1993, 22, 338-345.
[19] James, P., Quadroni, M., Carafoli, E., Gonnet, G., Protein identification by mass
profile fingerprinting. Biochemical and biophysical research communications 1993, 195,
58-64.
[20] Yates, J. R., 3rd, Speicher, S., Griffin, P. R., Hunkapiller, T., Peptide mass maps: a
highly informative approach to protein identification. Analytical biochemistry 1993,
214, 397-408.
[21] Towbin, H., Staehelin, T., Gordon, J., Electrophoretic transfer of proteins from
polyacrylamide gels to nitrocellulose sheets: procedure and some applications.
Proceedings of the National Academy of Sciences of the United States of America 1979,
76, 4350-4354.
[22] Low, T. Y., Seow, T. K., Chung, M. C., Separation of human erythrocyte membrane
associated proteins with one-dimensional and two-dimensional gel electrophoresis
followed by identification with matrix-assisted laser desorption/ionization-time of
flight mass spectrometry. Proteomics 2002, 2, 1229-1239.
[23] Kakhniashvili, D. G., Bulla, L. A., Jr., Goodman, S. R., The human erythrocyte
proteome: analysis by ion trap mass spectrometry. Mol Cell Proteomics 2004, 3, 501-
509.
[24] Tyan, Y. C., Jong, S. B., Liao, J. D., Liao, P. C., et al., Proteomic profiling of
erythrocyte proteins by proteolytic digestion chip and identification using two-
dimensional electrospray ionization tandem mass spectrometry. Journal of proteome
research 2005, 4, 748-757.
[25] Bruschi, M., Seppi, C., Arena, S., Musante, L., et al., Proteomic analysis of
erythrocyte membranes by soft Immobiline gels combined with differential protein
extraction. Journal of proteome research 2005, 4, 1304-1309.
[26] Kakhniashvili, D. G., Griko, N. B., Bulla, L. A., Jr., Goodman, S. R., The proteomics of
sickle cell disease: profiling of erythrocyte membrane proteins by 2D-DIGE and tandem
mass spectrometry. Experimental biology and medicine (Maywood, N.J.) 2005, 230,
787-792.
[27] Ghatpande, S. S., Choudhary, P. K., Quinn, C. T., Goodman, S. R., Pharmaco-
proteomic study of hydroxyurea-induced modifications in the sickle red blood cell
membrane proteome. Experimental biology and medicine (Maywood, N.J.) 2008, 233,
1510-1517.
[28] Pasini, E. M., Kirkegaard, M., Mortensen, P., Lutz, H. U., et al., In-depth analysis of
the membrane and cytosolic proteome of red blood cells. Blood 2006, 108, 791-801.
[29] Bosman, G. J., Lasonder, E., Luten, M., Roerdinkholder-Stoelwinder, B., et al., The
proteome of red cell membranes and vesicles during storage in blood bank conditions.
Transfusion 2008, 48, 827-835.
[30] van Gestel, R. A., van Solinge, W. W., van der Toorn, H. W., Rijksen, G., et al.,
Quantitative erythrocyte membrane proteome analysis with Blue-Native/SDS PAGE.
Journal of proteomics 2009.
[31] Goodman, S. R., Kurdia, A., Ammann, L., Kakhniashvili, D., Daescu, O., The human
red blood cell proteome and interactome. Experimental biology and medicine
(Maywood, N.J.) 2007, 232, 1391-1408.
[32] Pasini, E. M., Lutz, H. U., Mann, M., Thomas, A. W., Red blood cell (RBC)
membrane proteomics - Part I: Proteomics and RBC physiology. Journal of proteomics
2009.
[33] Pasini, E. M., Lutz, H. U., Mann, M., Thomas, A. W., Red Blood Cell (RBC)
membrane proteomics - Part II: Comparative proteomics and RBC patho-physiology.
Journal of proteomics 2009.
[34] Liumbruno, G., D'Alessandro, A., Grazzini, G., Zolla, L., Blood-related proteomics.
Journal of proteomics 2009.
[35] D'Alessandro, A., Righetti, P. G., Zolla, L., The red blood cell proteome and
interactome: an update. Journal of proteome research, 9, 144-163.
[36] Liumbruno, G., D'Amici, G. M., Grazzini, G., Zolla, L., Transfusion medicine in the
era of proteomics. Journal of proteomics 2008, 71, 34-45.
[37] Bosman, G. J., Lasonder, E., Groenen-Dopp, Y. A., Willekens, F. L., et al.,
Comparative proteomics of erythrocyte aging in vivo and in vitro. Journal of proteomics
2009.
[38] Snow, R. W., Guerra, C. A., Noor, A. M., Myint, H. Y., Hay, S. I., The global
distribution of clinical episodes of Plasmodium falciparum malaria. Nature 2005, 434,
214-217.
[39] Chen, Q., Schlichtherle, M., Wahlgren, M., Molecular aspects of severe malaria.
Clin Microbiol Rev 2000, 13, 439-450.
[40] Foller, M., Bobbala, D., Koka, S., Huber, S. M., et al., Suicide for survival--death of
infected erythrocytes as a host mechanism to survive malaria. Cell Physiol Biochem
2009, 24, 133-140.
[41] Rowe, J. A., Opi, D. H., Williams, T. N., Blood groups and malaria: fresh insights
into pathogenesis and identification of targets for intervention. Curr Opin Hematol
2009, 16, 480-487.
[42] Chootong, P., Ntumngia, F. B., Vanbuskirk, K. M., Xainli, J., et al., Mapping epitopes
of the Plasmodium vivax Duffy binding protein with naturally acquired inhibitory
antibodies. Infection and immunity 2009.
Chapter III
Quantitative Profiling of the
Erythrocyte Membrane Proteome
Isolated from Patients Diagnosed
with Chronic Obstructive
Pulmonary Disease
Quantitative Profiling of the Erythrocyte Membrane Proteome Isolated from Patients
Diagnosed with Chronic Obstructive Pulmonary Disease
Bruno M. Alexandre1,2, Nuno Charro1,2, Carlos Lopes3, Pilar Azevedo3, António Bugalho
de Almeida3, King C. Chan2, Haleem Issaq2, Timothy D. Veenstra2, Josip Blonder2,
Deborah Penque1*
1 Laboratório de Proteómica, Departamento de Genética, Instituto Nacional de Saúde
Dr. Ricardo Jorge (INSA-IP), Lisboa, Portugal
2 Laboratory of Proteomics and Analytical Technologies, SAIC-Frederick Inc., National
Cancer Institute at Frederick, Frederick MD, USA
3 Clínica Universitária de Pneumologia, HSM, Universidade de Lisboa, Portugal
Keywords: 16O/18O stable isotopic labelling, Chronic obstructive pulmonary disease,
Membrane proteins, Red blood cells, LC-MS/MS
*Corresponding author: Deborah Penque, Ph.D., Laboratório de Proteómica,
Departamento de Genética, Edifício INSA II, Instituto Nacional de Saúde Dr. Ricardo
Jorge, INSA, I.P., Avenida Padre Cruz, 1649-016 Lisboa, Portugal, Tel: +351 21750 8137,
Fax: +351 21752 6410, E-mail: [email protected]
ABSTRACT
Structural alterations in erythrocyte shape/metabolism have been described as playing
an important role in the pathophysiology of COPD. Whether these
structural/metabolic dysfunctions alter the erythrocyte membrane proteome in patients
diagnosed with COPD remained to be determined. The goal of this study was to
compare the proteomic profile of erythrocyte membranes isolated from the
peripheral blood of smokers diagnosed with COPD with that of healthy smokers, using
differential 16O/18O stable isotope labeling followed by strong cation exchange (SCX)
fractionation and high resolution LC-LIT/FTICR-MS. Two hundred and nineteen
proteins were identified as significantly differentially expressed in COPD erythrocyte
membranes. Functional analysis indicates that the main pathway networks associated
with these proteins are related to cell-to-cell signaling and interaction, hematological
system development and function, immune response, oxidative stress and the
cytoskeleton. Chorein, which is reported to play a role in the cytoskeleton and whose
defects have been associated with thorny deformations of circulating
erythrocytes, possibly due to deformation of the red cell membrane, was found to be
underexpressed in COPD patients. The potential relevance of this and other proteins to
COPD erythrocyte dysfunction and/or to COPD itself is discussed.
INTRODUCTION
Chronic obstructive pulmonary disease (COPD) is characterized by chronic airflow
limitation that responds poorly to bronchodilators. COPD is mainly caused by small
airway disease (obstructive bronchiolitis) and parenchymal destruction (emphysema).
A chronic inflammatory response induces narrowing of the small airways that leads to
airway remodeling. Subsequent parenchyma destruction is responsible for the loss of
alveolar attachments and decrease of lung elastic recoil [1]. These pathological
processes reduce the ability of the airways to remain open during expiration. There is
no evidence that currently available treatments significantly reduce the progression of
COPD or suppress the inflammation in small airways and adjacent lung parenchyma
[2]. Spirometry is the tool of choice to assess airflow limitation, diagnose and classify
the severity of COPD [3]. In 2000, approximately 2.7 million deaths were caused by
COPD [4] placing this disease as the fourth leading cause of death in the world [5]. This
substantial morbidity associated with COPD is often underestimated by healthcare
providers and patients because COPD is frequently under-diagnosed and
under-treated [6]. Cigarette smoking is by far the most common risk factor and the most
significant promoting factor for COPD. Cigarette smokers have a higher prevalence of respiratory
symptoms and lung function abnormalities, including a greater annual rate of decline in
FEV1, resulting in higher COPD incidence and mortality when compared to nonsmokers
[4]. According to Løkke et al. (2006) [7], approximately 90% of COPD patients are
smokers or ex-smokers.
COPD is primarily a lung disease that also displays significant systemic effects [8-10].
Structural alterations in erythrocyte shape and metabolism, together with changes in blood
rheology, have been suggested to play an important role in the pathophysiology and
development of COPD. Erythrocytes have also been reported to act
as biosensors for monitoring the oxidative imbalance during the course of COPD
[11]. For instance, when erythrocytes are depleted of critical enzymes and antioxidants (e.g., reduced
glutathione) needed for intermediary metabolism and antioxidant activity, the result is
oxidation of membrane proteins, lipids and hemoglobin, causing distortion and rigidity
of the cell membrane. Using scanning electron microscopy
(SEM), light fluorescence microscopy and electron paramagnetic resonance (EPR),
Santini et al. were able to show that the surface of erythrocytes from COPD patients is
greatly altered with respect to control RBCs. They also observed important alterations
in actin and spectrin distribution and an increase in membrane rigidity in the context
of COPD [12].
The goal of the present study was to examine global changes in protein expression
between the erythrocyte membrane proteomes isolated from the peripheral blood of
smokers diagnosed with COPD and of healthy smokers. For this purpose, 16O/18O stable
isotope labeling coupled with 2D-LC-MS/MS was employed for differential profiling of the
erythrocyte microsomal fraction obtained from peripheral blood [13]. Relative changes
in protein concentration were determined from the trypsin-mediated 16O/18O labeling
after strong cation exchange (SCX) fractionation and high resolution
LC-LIT/FTICR-MS analysis.
MATERIALS AND METHODS
The study was approved by the Ethics Committees of both Hospital de Santa Maria
(HSM)-Lisboa and Instituto Nacional de Saúde Dr. Ricardo Jorge (INSA)-Lisboa. After
informed consent, healthy subjects (n=28) and patients diagnosed with COPD (n=25)
according to the GOLD guidelines [3] were recruited from the Clínica Universitária de
Pneumologia, HSM. All individuals were matched for age (≥ 45 years old), gender and
smoking habits (Table III.1). Healthy subjects were current smokers presenting no signs
or symptoms of any respiratory or other chronic diseases. All COPD patients were
smokers or ex-smokers. Patients experiencing disease exacerbation and/or suffering
from any additional respiratory or chronic disease (e.g., asthma) were excluded. The
most relevant COPD clinical features observed in patients are listed in Table III.2.
Purification of Red Blood Cells (RBCs) from Whole Blood
Fresh peripheral blood was obtained from patients and controls into a 4.9 ml vacuum
blood-collection tube containing 1.6 mg EDTA/ml blood and kept no longer than 4
hours at 4ºC until RBC purification to avoid degradation. To isolate RBCs, whole blood
was centrifuged for 5 minutes at 2,000 x g using a swinging bucket rotor (Heraeus
Multifuge general purpose tabletop centrifuge). Both the plasma and the fraction
containing mostly leukocytes and thrombocytes (buffy coat) were removed by careful
suction. The remaining packed RBCs were centrifuged for another 2 minutes for
complete removal of the buffy coat. The RBC fraction was washed three to four times
with 3 volumes of an isotonic buffer [0.9% NaCl (w/v) solution, pH 8.0] and centrifuged
between washes at 2,000 x g for 4 minutes at 4 °C to completely remove any remaining
trace of the buffy coat.

Table III.1: Main characteristics of both control and patient groups.

                                          Control group    COPD group
Subjects (n)                              28               25
Male (n)                                  13               17
Age (yr), mean ± SD                       45 ± 12          61 ± 11
Current smokers (%)                       100              86
Smoking history (pack-yr), mean ± SD      20 ± 9           32 ± 14
FEV1/FVC (%), mean ± SD                   80 ± 9           66 ± 16
FEV1 (%), mean ± SD                       92 ± 12          81 ± 29

Table III.2: Profile of the COPD patients.

                                                      COPD (n=25)
Age of diagnosis (yr), mean ± SD                      55 ± 11
Average number of annual exacerbations, mean ± SD     2.4 ± 0.9
P(O2) (mm Hg), mean ± SD                              78.8 ± 15.1
P(CO2) (mm Hg), mean ± SD                             41.7 ± 8.7
Chronic bronchitis (%)                                76
Emphysema (%)                                         48
Respiratory insufficiency (%)                         36

RBC Ghost Preparation
RBC ghosts were prepared immediately [14] from the enriched RBC pellets, which were lysed
by incubation in 1 volume of 5 mM phosphate buffer, pH 7.4, containing protease
inhibitors, for 15 minutes at RT. The samples were then diluted to 20 volumes with the
same buffer and centrifuged at 4,500 x g for 10 minutes at 4 ºC. The resulting RBC ghosts
were washed three times in the same buffer, with centrifugation for 10 minutes at 27,300
x g (Sorvall RC5C plus, SS-34 rotor). Finally, to obtain hemoglobin-free ghosts and retain
only the whitish pellet, pellets were centrifuged for 5 minutes at 25,000 x g in a
benchtop centrifuge (Eppendorf Centrifuge 5417 R, FA-45-24-11 rotor) and stored at
-80 ºC until further use.
Microsomal Preparation
Isolated erythrocyte ghosts from the same group (control or patient) were pooled into
2 ml siliconized tubes in the presence of 50 mM ammonium bicarbonate and 1 mM
TCEP (final concentration). The two pooled ghost samples were lysed by sonication
(Bransonic 1510R-DTH, Danbury CT, USA) and then centrifuged at 100,000 x g for 1
hour (Beckman 50 Ti rotor). The membrane pellets were resuspended by sonication
(Branson Digital Sonifier 250, Danbury CT, USA) in 100 mM sodium carbonate,
incubated with agitation for 2 hours at 4 ºC and centrifuged at 100,000 x g for 90
minutes (OPTIMA TLX Ultracentrifuge) to pellet purified membranes. The pellets were
washed three times with double-distilled H2O and resuspended in 50 mM ammonium bicarbonate,
and protein was quantified by BCA assay (Pierce, Rockford IL, USA). Equal amounts of the
control and patient samples were centrifuged for 1 hour at 100,000 x g, lyophilized to dryness
and resuspended in methanol (60% v/v, Omnisolv, EM Science, Gibbstown, NJ, USA)
buffered with 50 mM ammonium bicarbonate using intermittent sonication in a water
bath. Digestion with trypsin (Promega, Madison, WI, USA) was carried out at 37 ºC in the
60% methanol/buffer solution using a 1:20 w/w trypsin-to-protein ratio as previously
described [15].
Differential, post-digestion 16O/18O Labeling.
The 18O labeling was performed by trypsin-catalyzed 18O exchange at a 1:20
trypsin/protein ratio in an organic/aqueous system consisting of 20% (v/v) methanol/
80% (v/v) 25 mM ammonium bicarbonate, pH 7.9, prepared in 18O-enriched water (H2 18O), as
described in detail by Blonder and co-workers [13]. After 4 hours of incubation, a second
aliquot of trypsin (1 µg) was added to each sample to increase 18O
incorporation efficiency. Exchange reactions were quenched by boiling the samples for
10 minutes in a water bath; after the samples were placed on ice, the pH of each was
adjusted to 2.5 with TFA. Samples were then pooled and immediately lyophilized
to dryness.
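As a worked check on the mass difference this labeling introduces, the short sketch below computes the shift expected when both C-terminal carboxyl oxygens of a peptide are exchanged for 18O, which is what the +4.008 Da C-terminal modification used in the database search corresponds to. The monoisotopic masses are standard values rather than figures taken from this chapter, and the function names are purely illustrative.

```python
# Monoisotopic masses in daltons (standard values, not taken from the text).
MASS_16O = 15.994915
MASS_18O = 17.999160


def label_shift(n_atoms: int = 2) -> float:
    """Mass added when n_atoms of 16O are replaced by 18O."""
    return n_atoms * (MASS_18O - MASS_16O)


def mz_spacing(charge: int, n_atoms: int = 2) -> float:
    """m/z distance between the light and heavy peaks of a peptide pair."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return label_shift(n_atoms) / charge


if __name__ == "__main__":
    print(f"full label shift: {label_shift():.3f} Da")
    for z in (1, 2, 3):
        print(f"z={z}: light/heavy spacing {mz_spacing(z):.3f} m/z")
```

Incomplete exchange (a single 18O atom) would give half this shift, which is presumably why the second trypsin addition is used to push the exchange towards completion.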
Peptide Fractionation by Strong Cation Exchange Liquid Chromatography
Lyophilized peptides were dissolved in 200 µL of 45% acetonitrile containing 0.1%
formic acid prior to strong cation exchange (SCX) chromatography. The sample was
resolved into 10 fractions using a microcapillary LC system (Model 1100, Agilent
Technologies Inc., Palo Alto, CA) as previously described [13]. Briefly, peptide fractions
were eluted with an ammonium formate multistep gradient at a flow rate of 200
µl/minute as follows: 0-1% B in 2 minutes, 1-10% B in 60 minutes, 10-62 % B in 20
minutes, 62-100% B in 3 minutes. Mobile phase A was 45% CH3CN and mobile phase B
was 45% CH3CN, 0.5M ammonium formate pH 3. The SCX-LC fractions were lyophilized
to dryness and reconstituted in 0.1% formic acid immediately prior to MS/MS analysis.
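The multistep gradient above can be written down as a small lookup, which makes it easy to check the total run time and the salt concentration at any point. This is only a sketch: it assumes each step is a linear ramp and that mobile phase A contains no ammonium formate, and the names SCX_STEPS, percent_b and formate_mM are hypothetical.

```python
# Each step: (start %B, end %B, duration in minutes), as listed in the text.
SCX_STEPS = [(0, 1, 2), (1, 10, 60), (10, 62, 20), (62, 100, 3)]
FORMATE_IN_B_MM = 500.0  # mobile phase B contains 0.5 M ammonium formate


def total_minutes(steps=SCX_STEPS) -> float:
    """Total length of the gradient program."""
    return float(sum(duration for _, _, duration in steps))


def percent_b(t_min: float, steps=SCX_STEPS) -> float:
    """%B at time t_min, assuming a linear ramp within each step."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for start, end, duration in steps:
        if t_min <= elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return start + fraction * (end - start)
        elapsed += duration
    return float(steps[-1][1])  # hold the final composition after the program


def formate_mM(t_min: float) -> float:
    """Ammonium formate concentration, assuming phase A contributes none."""
    return percent_b(t_min) / 100.0 * FORMATE_IN_B_MM


if __name__ == "__main__":
    for t in (0, 2, 32, 62, 82, 85):
        print(f"{t:>3} min: {percent_b(t):5.1f} %B, {formate_mM(t):6.1f} mM")
    print("total:", total_minutes(), "min")
```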
Nanoflow RPLC–MS/MS Analysis
Nanoflow RPLC of each SCX fraction was carried out on an Agilent 1100 nanoflow LC
system (Palo Alto, CA) using a 75 µm (inner diameter) x 360 µm (outer diameter) x 10
cm long fused silica capillary column (Polymicro Technologies Inc.,
Phoenix, AZ) packed in house with 3 µm, 300 Å pore size C18 media (Vydac, Hesperia, CA).
The column was coupled to a hybrid linear ion trap-Fourier transform ion cyclotron
resonance mass spectrometer (MS)
(LTQ-FT, ThermoElectron, San Jose, CA) using the nano-electrospray ionization source
supplied by the manufacturer. After injecting 5 µl of sample, the column was washed
for 30 minutes (at 0.5 µl/minute) with 2% B, and peptides were eluted (at 0.25 µl/minute)
using a linear gradient as follows: 2-60% B in 100 minutes, 60-98% B in 20 minutes,
98% B for 20 minutes. The column was re-equilibrated with 2% B for 30 minutes prior
to subsequent sample loading using a flow rate of 0.5 µl/minute. Mobile phase A
was 0.1% formic acid in H2O and mobile phase B was 0.1% formic acid in acetonitrile.
The MS was operated in a data-dependent mode where the five most intense ions
detected in each FTICR-MS scan (m/z 200-2000) were selected for MS/MS in the ion
trap (precursor selection from m/z 400-2000). Normalized collision energy of 36% was
employed for collision-induced dissociation (CID) along with dynamic exclusion of 90
seconds to reduce redundant selection of peptides for CID. The ESI voltage and the
heated capillary temperature were set at 1.6 kV and 160 °C, respectively.
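The data-dependent selection logic described above (five most intense ions per survey scan, precursor window m/z 400-2000, 90 seconds of dynamic exclusion) can be illustrated with a minimal sketch. It is not the instrument's actual control software: the m/z matching tolerance MZ_TOL and the way the exclusion list is stored are assumptions made only for the example.

```python
TOP_N = 5                         # precursors selected per survey scan
EXCLUSION_S = 90.0                # dynamic exclusion duration in seconds
MZ_MIN, MZ_MAX = 400.0, 2000.0    # precursor selection window
MZ_TOL = 0.01                     # hypothetical exclusion matching window (m/z)


def select_precursors(scan_time_s, peaks, exclusion):
    """Pick up to TOP_N precursors from one survey scan.

    peaks: list of (mz, intensity) tuples from the survey scan.
    exclusion: dict mapping excluded m/z to its expiry time; updated in place.
    """
    # Drop exclusion entries whose 90 s window has elapsed.
    for mz in [m for m, expiry in exclusion.items() if expiry <= scan_time_s]:
        del exclusion[mz]

    # Keep peaks inside the selection window that are not currently excluded.
    candidates = [
        (mz, intensity)
        for mz, intensity in peaks
        if MZ_MIN <= mz <= MZ_MAX
        and not any(abs(mz - excluded) <= MZ_TOL for excluded in exclusion)
    ]

    # Most intense first, then take the top N and exclude them from now on.
    candidates.sort(key=lambda peak: peak[1], reverse=True)
    chosen = [mz for mz, _ in candidates[:TOP_N]]
    for mz in chosen:
        exclusion[mz] = scan_time_s + EXCLUSION_S
    return chosen


if __name__ == "__main__":
    survey = [(350.0, 1e6), (500.0, 900.0), (600.0, 800.0), (700.0, 700.0),
              (800.0, 600.0), (900.0, 500.0), (1000.0, 400.0)]
    excluded = {}
    for t in (0.0, 10.0, 95.0):
        print(t, select_precursors(t, survey, excluded))
```

In the usage example the same survey scan is presented three times: the second pass can only pick the sixth-ranked ion because the first five are still excluded, and by 95 seconds those five are eligible again.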
Data Processing
CID spectra were analyzed using SEQUEST, on a Beowulf 20-node parallel virtual
machine cluster computer (ThermoElectron) against a non-redundant human
proteome database. A dynamic modification of +4.008 Da was set on the C-terminus for
18O-labeled peptides. The required precursor ion mass tolerance was 0.08 Da in MS mode
and 0.5 Da in MS2 mode. Only peptides possessing tryptic termini (allowing for up to
two internal missed cleavages), delta-correlation scores (ΔCn) ≥0.08 and charge
state-dependent cross correlation (Xcorr) criteria (≥1.9 for [M+H]+1 peptides, ≥2.2 for
[M+H]+2 peptides, ≥3.5 for [M+H]+3 peptides and ≥4.5 for [M+H]+4 peptides) were considered
legitimate identifications. Relative abundances for differentially labeled isotopomeric
peptides were calculated from their mono-isotopic peaks and respective extracted ion
chromatogram areas calculated using XPRESS software (Thermoelectron, San Jose, CA)
and are reported as heavy-to-light 18O/16O ratio (i.e., 18O labeled COPD sample /16O
normal sample) for a particular peptide/protein.
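The acceptance criteria and the ratio calculation described in this section can be summarized in a few lines. The sketch below is illustrative only and is not SEQUEST or XPRESS code: it covers the score thresholds and the heavy-to-light ratio of chromatogram areas, leaves out the tryptic-termini and missed-cleavage checks, and uses hypothetical function names.

```python
# Minimum Xcorr per precursor charge state and minimum delta-Cn, from the text.
XCORR_MIN = {1: 1.9, 2: 2.2, 3: 3.5, 4: 4.5}
DELTA_CN_MIN = 0.08


def passes_filter(charge: int, xcorr: float, delta_cn: float) -> bool:
    """True if a peptide-spectrum match meets the score criteria."""
    threshold = XCORR_MIN.get(charge)
    if threshold is None:
        return False  # charge states outside 1-4 are not covered by the criteria
    return xcorr >= threshold and delta_cn >= DELTA_CN_MIN


def heavy_to_light(area_18o: float, area_16o: float) -> float:
    """18O/16O ratio (COPD over control) from extracted ion chromatogram areas."""
    if area_16o <= 0:
        raise ValueError("light (16O) area must be positive")
    return area_18o / area_16o


if __name__ == "__main__":
    # Made-up values for illustration: (charge, Xcorr, delta-Cn, 18O area, 16O area)
    matches = [(2, 2.5, 0.10, 3.0e6, 1.5e6), (3, 3.4, 0.20, 1.0e6, 2.0e6)]
    for charge, xcorr, dcn, heavy, light in matches:
        if passes_filter(charge, xcorr, dcn):
            print(charge, xcorr, "ratio =", heavy_to_light(heavy, light))
        else:
            print(charge, xcorr, "rejected")
```

A ratio above 1 then indicates a peptide that is more abundant in the 18O-labeled COPD sample, and a ratio below 1 one that is more abundant in the 16O-labeled control.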
Protein Annotation and Classification
Protein annotation properties were acquired using the Protein Information and Knowledge
Extractor (PIKE) and Protein ANalysis THrough Evolutionary Relationships (PANTHER)
software tools (available at http://proteo.cnb.uam.es:8080/pike/ and
http://www.pantherdb.org/, respectively). Ingenuity Pathways Analysis (Ingenuity
Systems®, www.ingenuity.com) software was also used to retrieve protein information
through its knowledgebase and, importantly, to analyze potential protein-protein
interactions and group the identified proteins into signaling pathways. For the latter
purpose, we also used Cytoscape (v2.6.3, available on http://www.cytoscape.org/)
loaded with two different databases: human protein reference database (HPRD,
38
release date September 2007) and protein interaction network analysis (PINA, release
date April 2009). BiNGO, a Java-based tool that is implemented as a plug-in for
Cytoscape was utilized to check the overrepresented gene ontology terms. To check
this we set the Hypergeometric Test as the statistical test, Benjamini & Hochberg's FDR
correction for multiple testing correction and 0.05 as the significance level.
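The statistics behind this enrichment test can be sketched in a few lines. This is a
minimal illustration of a one-sided hypergeometric test followed by the
Benjamini-Hochberg adjustment, not BiNGO's own code; the function names and the
choice of reference set are assumptions made for the example.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): probability of at least k annotated genes in a list of n genes
    drawn from a reference set of N genes, K of which carry the annotation."""
    upper = min(n, K)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A GO term would then be called overrepresented when its adjusted p-value is below
the 0.05 significance level.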
Transmembrane Domain Prediction and Hydropathicity Calculation
Alpha-helical transmembrane domains (TMD) were mapped using TMHMM available
at http://www.cbs.dtu.dk/services/TMHMM [16, 17] while protein grand average of
hydropathicity (GRAVY) scores [18] were calculated using ProtParam tool available at
the ExPASy Proteomics Server (http://www.expasy.org/tools/protparam.html).
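For reference, the GRAVY score is simple to compute directly from a sequence. The
sketch below uses the Kyte-Doolittle hydropathy scale [18]; it is an illustration of
the calculation rather than the ProtParam implementation, and the function name is
hypothetical. Sequences containing non-standard residue codes would raise a KeyError.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: summed hydropathy divided by length."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def is_hydrophobic(sequence):
    """Positive GRAVY is read as hydrophobic, negative as hydrophilic."""
    return gravy(sequence) > 0
```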
Western Blot
Protein extracts obtained from lysed RBCs were quantified (BCA; Thermo Scientific
Pierce BCA protein assay) and 10 µg of each sample were separated in triplicate on
4-12% (w/v) polyacrylamide gels (NuPAGE Novex Bis-Tris, Invitrogen) and transferred
onto nitrocellulose membranes (Protran, Whatman). Membranes were probed with 1:6000
rabbit anti-CYB5R3 (Sigma-Aldrich), 1:250 mouse anti-ALDOA (Abcam), 1:1000 rabbit
anti-AARE (Abcam) or 1:200 rabbit anti-VPS13A (Abcam) for 2 h at RT and developed
using enhanced chemiluminescence (ECL; Thermo Scientific Pierce ECL Western
Blotting Substrate). All antibody dilutions were made in PBS containing 5% (w/v)
fat-free milk. All membranes were washed 5 times for 10 minutes with stripping buffer
[1.5% (w/v) glycine, 0.1% (w/v) SDS, 1% (v/v) Tween 20, pH 2.2], twice with PBS and
once with PBS-T before reprobing with the next primary antibody. The abundance of
the tested proteins in RBCs was calculated from densitometry of immunoblots (n=3
replicates) using Progenesis PG200v2006 software (Nonlinear Dynamics). The total
intensity of the corresponding Ponceau-stained lane (nitrocellulose membrane) was
used for western blot normalization.
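The normalization step can be sketched as follows. This is a minimal illustration
that assumes each replicate yields one band intensity and one total Ponceau lane
intensity; the function names and the way replicates are averaged are assumptions,
not a description of the Progenesis workflow.

```python
from statistics import mean


def normalized_abundance(band_intensity, ponceau_lane_intensity):
    """Band intensity divided by the total Ponceau-stained lane intensity."""
    if ponceau_lane_intensity <= 0:
        raise ValueError("Ponceau lane intensity must be positive")
    return band_intensity / ponceau_lane_intensity


def relative_abundance(patient_replicates, control_replicates):
    """Ratio of mean normalized abundance, patients over controls.

    Each argument is a list of (band_intensity, ponceau_lane_intensity) pairs,
    one pair per replicate.
    """
    patient = mean(normalized_abundance(b, p) for b, p in patient_replicates)
    control = mean(normalized_abundance(b, p) for b, p in control_replicates)
    return patient / control
```

A value below 1 would indicate lower abundance in patients than in controls.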
RESULTS
To search enriched erythrocyte membrane proteins for potential biomarkers of COPD,
erythrocyte microsomal fractions were prepared from pooled RBC ghost samples isolated
from peripheral blood of COPD patients (n=25) and healthy smoker controls (n=28).
Briefly, erythrocyte ghosts (cytological evaluation available in Supplemental data,
Figure 1, SD_F1) were lysed by sonication in 50 mM ammonium bicarbonate and
centrifuged, and the resulting pellets were incubated with sodium carbonate. Samples
were then normalized to the same protein amount and digested with trypsin before
being labeled using a trypsin-catalyzed 18O exchange in a buffered system containing
20% (v/v) methanol [13]. The COPD and control samples were pooled together and
separated into ten fractions by strong cation exchange chromatography. Each fraction
was loaded onto a reverse-phase column coupled online to a linear ion trap-ion
cyclotron resonance mass spectrometer operating in a data-dependent mode, in which
the five most intense ions detected in each FTICR-MS scan (m/z 200-2000) were selected
for MS/MS in the ion trap. The resulting data were searched via Bioworks (SEQUEST)
employing standard filtering criteria (see Materials and Methods for details). Relative
peptide/protein abundances between the COPD patients' and controls' samples were
quantified using XPRESS software (Thermoelectron, San Jose, CA). The general workflow
displaying the main steps involved, ranging from RBC ghost purification to mass
spectrometry (MS) and the corresponding data analysis, is shown in Figure III.1.
Figure III.1: Basic scheme of methodology showing main steps of sample preparation.
We estimated the false positive rate for this dataset to be less than 5%, in accordance
with the probability-based evaluation of peptide and protein identifications from tandem
mass spectra and SEQUEST analysis of the human proteome [19]. A total of 4697 peptides,
corresponding to 1083 proteins, were quantified as present in both COPD and control
spectra (Supplemental Table III.1, Supporting Information). Three hundred and fourteen
proteins possessing at least two identified peptides were selected for relative
quantification by calculating the 18O/16O ratio using XPRESS software (Supplemental
Table III.1, Supporting Information). Figure III.2 shows the SCX chromatogram of
sample separation into ten fractions.
Figure III.2: SCX chromatogram displaying sample separation into ten fractions.
Two hundred and nineteen proteins were identified as significantly over- or
underexpressed in COPD samples when applying a 1.5-fold threshold to the dataset.
These proteins were further analyzed with bioinformatics tools for cellular location and
functional annotation. Because different software tools classified proteins
inconsistently according to their subcellular location, we chose to manually curate
each of the identified proteins. To obtain more information, we used both gene ontology
annotations and the Ingenuity knowledgebase. We were able to classify 310 out of the
314 proteins. Sample preparation toward enrichment of membrane proteins was
successful, as 46% of the identified proteins were membrane proteins, as displayed in
Figure III.3.
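The 1.5-fold threshold mentioned in this section can be illustrated with a short
sketch. It assumes that 18O/16O ratios below 1 are reported as negative reciprocal
fold changes (as the negative fold-change values reported in this chapter suggest) and
that the threshold is applied symmetrically in both directions; the function names
are hypothetical and are not part of XPRESS or Ingenuity.

```python
FOLD_THRESHOLD = 1.5


def signed_fold_change(ratio):
    """Convert an 18O/16O ratio into a signed fold change.

    Ratios >= 1 are returned unchanged; ratios < 1 become the negative
    reciprocal (e.g. 0.5 -> -2.0).
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return ratio if ratio >= 1 else -1.0 / ratio


def classify(ratio, threshold=FOLD_THRESHOLD):
    """Label a protein as 'over', 'under' or 'unchanged' in COPD."""
    fold = signed_fold_change(ratio)
    if fold >= threshold:
        return "over"
    if fold <= -threshold:
        return "under"
    return "unchanged"
```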
Figure III.3: Subcellular location of the 314 proteins identified by at least two
peptides according to both gene ontology annotations and ingenuity systems
knowledgebase.
Moreover, some proteins categorized as cytoplasmic belong to the cytoskeleton
network, such as nebulin, spectrin or cytoskeleton-associated protein 5. We also
evaluated protein classification according to biological processes and molecular
functions, using PANTHER, for the whole dataset (proteins identified by at least two
peptides) and for the proteins differentially expressed in COPD patients, set by a
threshold of 1.5-fold over- or underexpression (Figure III.4). This software was also
used to obtain the predominant pathways (Table III.3).
Transmembrane domain and hydrophobicity index analysis
We extended the analysis of our data using the TMHMM algorithm [16, 17] to map
α-helical integral membrane proteins and the GRAVY index [18] to characterize the
hydropathic character of this dataset. The GRAVY index is a global descriptor of protein
solubility and corresponds to the sum of the hydropathy values of all amino acids in
the protein, normalized to protein length (note: proteins exhibiting positive GRAVY
values were considered hydrophobic, while proteins exhibiting negative GRAVY values
were considered hydrophilic).
Figure III.4: Biological processes (panels A and C) and molecular functions (panels B
and D) for all proteins identified in both COPD patients and control subjects
(panels A and B) and for differentially (above 1.5-fold) expressed proteins only
(panels C and D). Information gathered from PANTHER software.
The TMHMM algorithm classified a total of 89 proteins as α-helical integral membrane
proteins possessing at least one transmembrane domain (Supplemental Table III.3,
Supporting Information). Of these, 40 proteins were characterized as hydrophobic
based on their positive GRAVY value. In addition, 16 proteins had a positive GRAVY
index although no transmembrane domain was predicted (Supplemental Table III.3,
Supporting Information). Therefore, according to GRAVY index or TMHMM predictions,
we were able to classify 34% of the dataset as potential membrane proteins, which is
consistent with the information gathered from protein databases (Figure III.3).
Table III.3: Predominant pathways associated with COPD patients when compared to
healthy smokers, as provided by PANTHER.
Category name (PANTHER accession) | # genes | % of gene hits against total # genes | % of gene hits against total # pathway hits
Inflammation mediated by chemokine and cytokine signaling pathway (P00031) | 20 | 3.6% | 6.5%
Ubiquitin proteasome pathway (P00060) | 14 | 2.5% | 4.5%
Wnt signaling pathway (P00057) | 12 | 2.2% | 3.9%
Integrin signalling pathway (P00034) | 11 | 2.0% | 3.5%
Angiogenesis (P00005) | 10 | 1.8% | 3.2%
Parkinson disease (P00049) | 10 | 1.8% | 3.2%
Huntington disease (P00029) | 10 | 1.8% | 3.2%
PDGF signaling pathway (P00047) | 8 | 1.4% | 2.6%
B cell activation (P00010) | 8 | 1.4% | 2.6%
Cytoskeletal regulation by Rho GTPase (P00016) | 7 | 1.3% | 2.3%
Apoptosis signaling pathway (P00006) | 6 | 1.1% | 1.9%
T cell activation (P00053) | 6 | 1.1% | 1.9%
Nicotinic acetylcholine receptor signaling pathway (P00044) | 6 | 1.1% | 1.9%
Thyrotropin-releasing hormone receptor signaling pathway (P04394) | 6 | 1.1% | 1.9%
Oxytocin receptor mediated signaling pathway (P04391) | 6 | 1.1% | 1.9%
Hallmark Red Blood Cell Membrane Proteins
Figure III.5 highlights proteins identified in this study that are part of the two
macromolecular complexes of membrane proteins of major importance to the structural
integrity of the RBC membrane [20]. As expected, the highest number of peptide counts
(1092), accounting for 23% of total peptide counts, belongs to the most abundant protein
in the RBC membrane, band 3, also termed anion exchanger 1. Glucose transporter
type 1, glycophorin C, 55 kDa erythrocyte membrane protein, Kell blood group
glycoprotein, erythrocyte band 7 integral membrane protein and erythrocyte
phospholipid scramblase are examples of other important erythrocyte membrane
proteins that were also identified (Figure III.5). Additional information on the relative
abundance of these RBC membrane proteins in COPD patients compared to healthy
controls is available in Supplemental Figure III.2 and Supplemental Table III.4,
Supporting Information.
Figure III.5: Proteins identified in both samples within the two main RBC membrane
protein complexes. Adapted from [20].
Differentially expressed proteins in COPD
Molecules showing large changes in their relative abundance between the control and
patient groups were carefully evaluated. The ten most overexpressed proteins identified
in the COPD RBC microsomal fraction are displayed in Table III.4. Their functions are
related to transport (e.g., H+ transport across cellular membranes), proteasomes,
ATP-binding cassettes, kinases and chemokines, among others. Underexpressed proteins
are associated with cytoskeleton networks [e.g., chorein, or vacuolar protein
sorting-associated protein 13A (VPS13A), and kinesin-2 (KIF2A)] and with xenobiotic
metabolism [e.g., cytochrome b5 reductase 3 (CYB5R3) and protein kinase C iota type
(PRKCI)].
Table III.4: Ten most overexpressed proteins in COPD erythrocyte ghosts as provided
by Ingenuity Systems knowledgebase. a) Swiss-Prot/UniProt accession number.
Accession number (a) | Description | Fold change | Type | Location
P27449 | ATPase, H+ transporting, lysosomal 16kDa, V0 subunit c | 6.73 | transporter | Cellular membrane
Q15836 | Vesicle-associated membrane protein | 5.96 | other | Plasma membrane
Q13439 | golgi autoantigen, golgin subfamily a, 4 | 5.31 | other | Cellular membrane, Golgi membrane
P51665 | proteasome (prosome, macropain) 26S subunit, non-ATPase, 7 | 5.07 | other | Nucleus
P49721 | proteasome (prosome, macropain) subunit, beta type, 2 | 5.00 | peptidase | Nucleus
Q96L73 | nuclear receptor binding SET domain protein 1 | 4.44 | transcription regulator | Nucleus
Q9NRK6 | ATP-binding cassette, sub-family B (MDR/TAP), member 10 | 4.42 | transporter | Membrane fraction, mitochondria inner membrane
P02462 | collagen, type IV, alpha 1 | 4.04 | other | Basement membrane, extracellular space
Q9UL99 | hyaluronoglucosaminidase 4 | 3.53 | enzyme | Unknown
Q14146 | KIAA0133 | 3.36 | other | Cellular membrane, cytoplasm, nucleus
‘Interactome’ using Ingenuity Pathway Analysis
Data were also analyzed with Ingenuity Pathway Analysis software (Ingenuity®
Systems, www.ingenuity.com), applying a 1.5-fold change (over- or underexpression)
cutoff. According to the Ingenuity knowledgebase, of the 314 identified proteins
submitted for analysis, 305 were mapped and 159 molecules were eligible for analysis.
Eligible molecules were searched against the Ingenuity knowledgebase, and the top-10
networks according to the statistical score are displayed in Supplemental Figure III.3.
Interestingly, the top-10 networks were interconnected, so it was possible to merge
them (Supplemental Figures III.4 and III.5, Supporting Information, respectively). The
top network (possessing the highest statistical score) was related to cell-to-cell
signaling and interaction, hematological system development and function, and immune
response (Supplemental Figure III.6, Supporting Information).
Four proteins associated with oxidative stress were found in the top-10 networks.
Catalase (CAT) is associated with cellular movement, hematological system development
and function, and immune response (network 2). CAT was found to be overexpressed by
about two-fold in COPD patients, as was peroxiredoxin 2 (PRDX2, network 1). In
contrast, myeloperoxidase (MPO, networks 1 and 4) and nuclear factor of kappa light
polypeptide gene enhancer (NFKB2, network 8) were found to be underexpressed by more
than two-fold in this study, as shown in Table III.5.
Table III.5: Proteins associated with oxidative stress present in the top-10 networks. a)
Swiss-Prot/UniProt accession number; b) According to Ingenuity Pathways Analysis.
Gene symbol | Description | Accession number (a) | Fold change | Type | Networks (b)
CAT | Catalase | P04040 | 2.22 | enzyme | 2
MPO | Myeloperoxidase | P05164 | -2.22 | enzyme | 1, 4
NFKB2 | nuclear factor of kappa light polypeptide gene enhancer in B-cells 2 (p49/p100) | Q00653 | -2.86 | transcription regulator | 8
PRDX2 | peroxiredoxin 2 | P32119 | 2.31 | enzyme | 1
Interestingly, among the thirteen members of the solute carrier (SLC) family identified,
only two (SLC2A4, SLC30A1) were found to be overexpressed in COPD, and this
overexpression was below the 1.5-fold threshold. The other eleven members of this
family were quantified as underexpressed, six of them beyond the 1.5-fold threshold
(highlighted in Supplemental Table III.5, Supporting Information). A similar pattern,
but in the opposite direction, was observed for proteasome proteins: of the eleven
proteasome proteins quantified in this study, only two (PSMB1, PSMC2) were
downregulated in COPD (Supplemental Table III.6).
Protein-Protein Interactions using Cytoscape 2.6.3
Ingenuity Systems uses its own knowledgebase to group proteins into pathways. In
contrast, Cytoscape is an open-source bioinformatics software platform for visualizing
molecular interaction networks and biological pathways using an interaction database
that is selected by the user and matched to the dataset.
Figure III.6: Main protein-protein interaction network comprising 43 members
generated by Cytoscape 2.6.3 using the PINA database. Red: proteins significantly
overexpressed in COPD; green: proteins significantly underexpressed in COPD; purple:
proteins not significantly differentially expressed (1.5-fold threshold).
In this work, two protein-protein interaction databases, the Human Protein Reference
Database (HPRD) and Protein Interaction Network Analysis (PINA), were used to
investigate protein-protein interactions within our dataset alone, i.e., no first
neighbors were allowed.
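Restricting an interaction database to the dataset alone amounts to keeping only the
interactions whose two partners were both identified, and then grouping the proteins
into connected networks. The sketch below illustrates this idea under simplifying
assumptions: interactions are given as plain pairs of gene symbols, and the function
names are hypothetical (this is not how Cytoscape is implemented).

```python
def induced_edges(interactions, dataset):
    """Keep only interactions whose two partners are both in the dataset,
    i.e. no first neighbors from outside the dataset are admitted."""
    members = set(dataset)
    return [(a, b) for a, b in interactions
            if a in members and b in members and a != b]


def connected_components(edges):
    """Group proteins into networks; returns a list of sets, largest first."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:  # depth-first traversal of one network
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(adjacency[node] - seen)
        components.append(component)
    return sorted(components, key=len, reverse=True)
```

Because the result depends entirely on which interactions the chosen database
contains, different databases can give different networks for the same protein list.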
Figure III.7: Overrepresented biological processes (GO) for the differentially (above
1.5-fold) expressed proteins in COPD patients.
Using HPRD, we were able to generate two networks containing more than 3 proteins
(Supplemental Figure 7, Supporting Information). Of particular interest was a group of
interacting proteins comprising a network of 25 proteins (Supplemental Figure 8,
Supporting Information). Using PINA, however, it was possible to merge these two
networks through ANK1 into a larger network (Supplemental Figure 9, Supporting
Information). In addition, because PINA includes the interaction between PSMD7 and
PSMD6, it was possible to add further members, giving a network of about 43
proteins (Figure III.6). This network could reach at least 45 proteins by connecting
PLSCR1 to CRK and VAV1, as their interactions were shown when using HPRD
(Supplemental Figure 8, Supporting Information). PINA was the chosen approach for
evaluating proteins differentially expressed by more than 1.5-fold between the patient
and control groups (Supplemental Figure 10, Supporting Information). A protein-protein
interaction network of 14 members was generated, of which only 3 proteins were
found to be overexpressed in COPD patients' RBCs. Interestingly, the members of each
generated protein-protein interaction network exhibited the same pattern (over- or
underexpression). One of these networks grouped five proteasome proteins, all found
to be overexpressed in COPD patients: PSMD2, PSMD3, PSMD6, PSMD7 and PSMA6.
BiNGO, a freely available Java-based tool implemented as a plug-in for Cytoscape, was
also employed. BiNGO determines which gene ontology (GO) categories are statistically
over- or underrepresented in the submitted gene list. Again, both the whole dataset
(Supplemental Figure 11, Supporting Information) and the subset of differentially
expressed proteins (over 1.5-fold, Figure III.7) were submitted. As expected, the
overrepresented terms were quite similar in both analyses. A few terms were
overrepresented only when the whole dataset was submitted; thus, for the differentially
expressed proteins alone, two branches were maintained: terms related to the
regulation of ubiquitin-protein ligase and catalytic activity, and terms related to the
proteasomal ubiquitin-dependent protein catabolic process.
Western blot validation
Biochemical validation of the results obtained by MS was performed by WB analysis.
Equal amounts of total protein extracts obtained from either controls or patients were
used in three independent experiments to quantify the relative abundance of
NADH-cytochrome b5 reductase 3 (CYB5R3), fructose-bisphosphate aldolase A (ALDOA),
acylamino-acid-releasing enzyme (AARE) and vacuolar protein sorting-associated
protein 13A (VPS13A) by densitometric analysis of immunoblots (Figure III.8). In the
absence of an internal housekeeping control, the data were normalized to the
Ponceau-stained full-lane membrane intensity. The WB data showed the same expression
trend as, and therefore confirmed, the results previously obtained by MS, except for
AARE, for which WB showed decreased abundance in COPD patients while MS showed the
opposite.
Figure III.8: Western blot validation showing representative close-up views of
each Ab reaction and a graphic representation of the relative normalized abundance of
(A) acylamino-acid-releasing enzyme (AARE), (B) ALDOA, (C) VPS13A and (D)
CYB5R3, using the full intensity of the respective Ponceau-stained lane in the
nitrocellulose membrane for normalization (n=3 independent replicates for each Ab
reaction). The antigen-antibody complex was detected by ECL (GE Healthcare) and
Progenesis PG200v2006 software (Nonlinear Dynamics) was used for densitometry
analysis.
DISCUSSION
Human RBCs have a life span of about 120 days, during which they travel across
the body through the bloodstream and interact with different types of
metabolites, cells and tissues. COPD is not known to affect RBCs directly, but it may
be responsible for alterations that could be exploited at the proteome level for
diagnostic purposes, or at least to gain a better understanding of the disease and its
implications for RBCs. The RBC proteome was long believed to be simple, as RBCs lack
internal organelles. However, in recent years knowledge of the RBC proteome has
increased dramatically and changed this view [21-29]. This is due not only to the
development of sample preparation (fractionation/enrichment) techniques, but mostly
to the use of sophisticated mass spectrometers, appropriate search algorithms and
comprehensive human protein databases.
The RBC membrane is a composite structure in which a spectrin-actin based membrane
skeletal network is coupled to a lipid bilayer either by direct interaction with lipids or
by linker proteins that interact simultaneously with the cytoplasmic domain of
transmembrane proteins and with spectrin. Linkage of membrane proteins, e.g., stomatin
(identified by 339 peptides), glycophorin C (132 peptides identified) and band 3
(1092 peptides identified), to the spectrin-actin based skeleton by the linker proteins
4.1R and ankyrin has been well established for over two decades. These two major
complexes, the ankyrin and 4.1R complexes, are shown in Figure III.5, where proteins
identified in the present study are highlighted (see Supplemental Figure III.2 and
Supplemental Table III.4, Supporting Information, for additional information on the
relative abundance of these proteins). The morphology, mechanical deformability and
elasticity of erythrocytes are crucial for preserving their function in oxygen uptake,
especially as they pass through the circulation at lung level, which must occur in an
ordered and sequential manner. In the present study it was possible to identify a
considerable number of differentially expressed proteins directly or indirectly linked
to the erythrocyte plasma membrane, as integral membrane proteins or cytoskeletal
proteins, that may lead to alterations in the COPD patients' erythrocyte membrane.
Among the most overexpressed proteins found is ATPase, H+ transporting, lysosomal
16 kDa, V0 subunit c (ATP6V0C), which is related to oxidative stress. The V1 domain is
cytosolic and contains the ATP catalytic site, whereas V0 is the transmembrane domain.
This protein was reported to be part of a proton-transporting two-sector ATPase
complex that acts across membranes [30] and is involved in the oxidative
phosphorylation pathway. Golgin-245/p230 (GOLGA4), overexpressed in COPD, is reported
to be essential for intracellular trafficking and cell surface delivery of tumor necrosis
factor-α (TNF) [31]. TNF is the main proinflammatory cytokine made and secreted by
inflammatory macrophages; it enhances activation and recruitment of T-cells and
ensures robust innate and acquired immune responses. ATP-binding cassette, subfamily B
(MDR/TAP), member 10 (ABCB10), or ABC-me (ABC-mitochondrial erythroid), also
overexpressed in COPD, is located in the inner mitochondrial membrane and was
suggested to play a role in erythroid differentiation. ABC-me is induced during
erythroid differentiation in cell lines, where heme biosynthesis predominantly occurs,
and its overexpression enhances hemoglobin synthesis in erythroleukemia cells [32, 33].
Yet its physiological role in humans is still an open question.
Hyaluronoglucosaminidase 4 (HYAL4), a protein whose subcellular location is still
unknown, was also found to be overexpressed in COPD patients; its activity is reported
to be regulated by IL1B, an interleukin associated with inflammation in airway
diseases [34, 35].
Chorein, or vacuolar protein sorting-associated protein 13A (VPS13A), is a large protein
with a predicted molecular weight of approximately 360 kDa that is reported to play a
role in the cytoskeleton and in intracellular transport (most likely Golgi-to-endosome
transport). Defects in this protein are associated with acanthocytosis, thorny
deformations of circulating erythrocytes, possibly due to deformation of red cell
membranes [36, 37]. The mechanism by which acanthocytes form is not known, but it is
hypothesized to involve expansion of the outer leaflet of the lipid bilayer of RBC
membranes, in contrast to stomatocytosis, which results from expansion of the inner
leaflet [38]. This protein was found by MS to be underexpressed in COPD patients
compared to controls, and this underexpression was confirmed by WB. Consequently,
changes in VPS13A may play a central role in the deformation of COPD RBCs that has
been reported previously [12]. Another cytoskeleton-associated protein found to be
underexpressed in COPD in this study was kinesin heavy chain member 2A (KIF2A).
Kinesin-2 is one of the most ubiquitously expressed of the molecular motors known as
kinesin superfamily proteins (KIFs), which have been implicated as key players in the
intracellular transport system, which is essential for cellular function and morphology
[39]. Members of this superfamily have been shown to transport membrane-bound
organelles, protein complexes and mRNAs to specific destinations along microtubules
while hydrolyzing ATP for energy [40]. Kinesin-2 interacts with the tumor suppressor
adenomatous polyposis coli (APC), and this interaction is essential for the transport of
APC along microtubules to the tips of membrane protrusions [41, 42]. Kinesin-2 is a
heterotrimeric complex composed of a KIF3A/3B heterodimer and an adaptor protein,
the kinesin superfamily-associated protein 3 (KAP3). APC interacts with KIF3A/3B via an
association with KAP3. APC activates APC-stimulated guanine nucleotide exchange factor
(Asef) and regulates the actin cytoskeletal networks, cell morphology, adhesion and
migration [41]. In addition to VPS13A underexpression, KIF2A underexpression may
also be contributing to the changes in the shape of COPD patients' RBCs, which in turn
may be responsible for defective oxygen uptake and delivery.
Cytochrome b5 reductase 3 (CYB5R3) was also found to be underexpressed in COPD,
and its underexpression in COPD patients' RBCs was confirmed by WB. The enzyme
cytochrome b5 reductase 3 catalyses the transfer of reducing equivalents from the
physiological electron donor, NADH, generated in the Embden-Meyerhof pathway, to
the small haemoprotein cytochrome b5. As indicated by its alternative name,
methaemoglobin reductase, one of the major activities of this protein is the reduction
of methaemoglobin. Hence, deficiency of this protein is associated with
methaemoglobinemia. When oxygen is released to the tissues, the iron atom remains
in the ferrous (Fe2+) state. In contrast, if the atom is oxidized to the ferric (Fe3+)
state as the result of oxidative stress (e.g., from tobacco smoke), it lacks the electron
required to combine with oxygen and so, once the heme moiety is in the ferric state,
hemoglobin is incapable of transporting oxygen [45, 46]. The oxidative stress caused
by tobacco smoke disturbs the oxidant-antioxidant balance [47], and this imbalance is
one of the hallmarks of the disease [48].
The antioxidant proteins catalase and peroxiredoxin 2 were found to be 2.2- and
2.3-fold overexpressed, respectively. Catalase is a homotetrameric antioxidant enzyme
that decomposes hydrogen peroxide into water and oxygen and is especially concentrated
in erythrocytes. Catalase has been shown to be increased in hyperoxia, and it is
important in pulmonary defense, especially at the alveolar level [49].
Peroxiredoxins comprise a large group of proteins whose function is to catalyse the
degradation of lipid hydroperoxides and hydrogen peroxide [49, 50]. Peroxiredoxins are
a recently described family of nonselenoperoxidases that catalyse the reduction of a
broad spectrum of peroxides. Their function can also be associated with cellular
signaling mechanisms during oxidative stress and, in human lungs, peroxiredoxins have
been implicated in protection against exogenous as well as endogenous oxidant
challenge [51]. Peroxiredoxin 2 has been reported to be elevated in lung carcinomas
[52]. Another interesting finding is that, among the thirteen members of the solute
carrier family, only two were found to be overexpressed in COPD, and this
overexpression was below the 1.5-fold threshold. All the other identified members of
this family were underexpressed, six of them beyond the 1.5-fold threshold
(Supplemental Table 5, Supporting Information). A similar pattern, but in the opposite
direction, was observed for proteasome proteins: of the eleven proteasome proteins
quantified in this study, only two were downregulated in COPD (Supplemental Table 6,
Supporting Information). One of these is PSMB2, whose activity is regulated by
nicotine [53].
CONCLUSION
This work intended to provide new insights into which events may be leading to RBC
membrane deformation in COPD patients. Such deformation has been reported before by
means of electron microscopy, but no information was available at the molecular level.
In the present study it was possible to find some differentially expressed proteins that
may be contributing to this process of membrane deformation. Chorein (VPS13A), a large
protein with a predicted molecular weight of approximately 360 kDa, was reported to
play a role in the cytoskeleton and intracellular transport. Importantly, defects in this
protein are associated with acanthocytosis, thorny deformations of circulating
erythrocytes, possibly due to deformation of red cell membranes. This protein was found
by MS to be underexpressed in COPD patients compared to controls, and this
underexpression was confirmed by WB. For a considerable number of proteins in the
dataset, little or no information was available, which made it difficult to reach a better
understanding of which other proteins may be involved in this deformation. The final
step will be to connect what is happening at the RBC membrane level to the
pathophysiology of COPD; to this end, new studies will be necessary to assess whether
RBCs can be a source of biomarkers for the prognosis/diagnosis of this disease.
ACKNOWLEDGEMENTS
The authors would like to thank all patients and healthy individuals who voluntarily
participated in this study; Maria Teresa Seixas and colleagues from Laboratorio de
Hematologia, Instituto Nacional de Saude Dr. Ricardo Jorge (INSA) for complete blood
count (CBC); Pedro Loureiro, Arminda Vilares and Ana Cardoso for blood collection;
colleagues from Laboratorio de Proteomica (INSA) and National Cancer Institute at
Frederick (NCI-SAIC); Dr. Patricia Gomes-Alves for assistance with densitometry
including the respective statistical analysis. This work was partially supported by
Fundação para a Ciência e a Tecnologia (FCT)/FEDER (POCTI/SAU-MMO/56163/2004),
FCT/Poly-Annual Funding Program and FEDER/Saude XXI Program (Portugal). BMA and
NC are recipients of FCT doctoral fellowship (SFRH/BD/31415/2006 and
SFRH/BD/27906/2006).
REFERENCES
[1] Barnes, P. J., Shapiro, S. D., Pauwels, R. A., Chronic obstructive pulmonary disease:
molecular and cellular mechanisms. Eur Respir J 2003, 22, 672-688.
[2] Barnes, P. J., Stockley, R. A., COPD: current therapeutic interventions and future
approaches. Eur Respir J 2005, 25, 1084-1106.
[3] Global initiative for chronic obstructive lung disease 2008.
[4] Lopez, A. D., Shibuya, K., Rao, C., Mathers, C. D., et al., Chronic obstructive
pulmonary disease: current burden and future projections. Eur Respir J 2006, 27, 397-
412.
[5] Geneva 2000.
[6] Pauwels, R. A., Rabe, K. F., Burden and clinical features of chronic obstructive
pulmonary disease (COPD). Lancet 2004, 364, 613-620.
[7] Goodman, S. R., Kurdia, A., Ammann, L., Kakhniashvili, D., Daescu, O., The human
red blood cell proteome and interactome. Experimental biology and medicine
(Maywood, N.J.) 2007, 232, 1391-1408.
[8] Agusti, A., Systemic effects of chronic obstructive pulmonary disease: what we
know and what we don't know (but should). Proceedings of the American Thoracic
Society 2007, 4, 522-525.
[9] Agusti, A., Soriano, J. B., COPD as a systemic disease. Copd 2008, 5, 133-138.
[10] Agusti, A., Systemic effects of COPD: just the tip of the Iceberg. Copd 2008, 5, 205-
206.
[11] Lucantoni, G., Pietraforte, D., Matarrese, P., Gambardella, L., et al., The red blood
cell as a biosensor for monitoring oxidative imbalance in chronic obstructive
pulmonary disease: an ex vivo and in vitro study. Antioxidants & redox signaling 2006,
8, 1171-1182.
[12] Santini, M. T., Straface, E., Cipri, A., Peverini, M., et al., Structural alterations in
erythrocytes from patients with chronic obstructive pulmonary disease. Haemostasis
1997, 27, 201-210.
[13] Blonder, J., Chan, K. C., Issaq, H. J., Veenstra, T. D., Identification of membrane
proteins from mammalian cell/tissue using methanol-facilitated solubilization and
tryptic digestion coupled with 2D-LC-MS/MS. Nature protocols 2006, 1, 2784-2790.
[14] Dodge, J. T., Mitchell, C., Hanahan, D. J., The preparation and chemical
characteristics of hemoglobin-free ghosts of human erythrocytes. Archives of
biochemistry and biophysics 1963, 100, 119-130.
[15] Blonder, J., Yu, L. R., Radeva, G., Chan, K. C., et al., Combined chemical and
enzymatic stable isotope labeling for quantitative profiling of detergent-insoluble
membrane proteins isolated using Triton X-100 and Brij-96. Journal of proteome
research 2006, 5, 349-360.
[16] Sonnhammer, E. L., von Heijne, G., Krogh, A., A hidden Markov model for
predicting transmembrane helices in protein sequences. Proceedings / ... International
Conference on Intelligent Systems for Molecular Biology ; ISMB 1998, 6, 175-182.
[17] Krogh, A., Larsson, B., von Heijne, G., Sonnhammer, E. L., Predicting
transmembrane protein topology with a hidden Markov model: application to
complete genomes. Journal of molecular biology 2001, 305, 567-580.
[18] Kyte, J., Doolittle, R. F., A simple method for displaying the hydropathic character
of a protein. Journal of molecular biology 1982, 157, 105-132.
[19] Qian, W. J., Liu, T., Monroe, M. E., Strittmatter, E. F., et al., Probability-based
evaluation of peptide and protein identifications from tandem mass spectrometry and
SEQUEST analysis: the human proteome. Journal of proteome research 2005, 4, 53-62.
[20] Mohandas, N., Gallagher, P. G., Red cell membrane: past, present, and future.
Blood 2008, 112, 3939-3948.
[21] Low, T. Y., Seow, T. K., Chung, M. C., Separation of human erythrocyte membrane
associated proteins with one-dimensional and two-dimensional gel electrophoresis
followed by identification with matrix-assisted laser desorption/ionization-time of
flight mass spectrometry. Proteomics 2002, 2, 1229-1239.
[22] Tyan, Y. C., Jong, S. B., Liao, J. D., Liao, P. C., et al., Proteomic profiling of
erythrocyte proteins by proteolytic digestion chip and identification using two-
dimensional electrospray ionization tandem mass spectrometry. Journal of proteome
research 2005, 4, 748-757.
[23] Kakhniashvili, D. G., Bulla, L. A., Jr., Goodman, S. R., The human erythrocyte
proteome: analysis by ion trap mass spectrometry. Mol Cell Proteomics 2004, 3, 501-
509.
[24] Bruschi, M., Seppi, C., Arena, S., Musante, L., et al., Proteomic analysis of
erythrocyte membranes by soft Immobiline gels combined with differential protein
extraction. Journal of proteome research 2005, 4, 1304-1309.
[25] Pasini, E. M., Kirkegaard, M., Mortensen, P., Lutz, H. U., et al., In-depth analysis of
the membrane and cytosolic proteome of red blood cells. Blood 2006, 108, 791-801.
[26] Roux-Dalvai, F., Gonzalez de Peredo, A., Simo, C., Guerrier, L., et al., Extensive
analysis of the cytoplasmic proteome of human erythrocytes using the peptide ligand
library technology and advanced mass spectrometry. Mol Cell Proteomics 2008, 7,
2254-2269.
[27] Pasini, E. M., Lutz, H. U., Mann, M., Thomas, A. W., Red Blood Cell (RBC)
membrane proteomics - Part II: Comparative proteomics and RBC patho-physiology.
Journal of proteomics 2009.
[28] Liumbruno, G., D'Alessandro, A., Grazzini, G., Zolla, L., Blood-related proteomics.
Journal of proteomics 2009.
[29] Alexandre, B. M., Proteomic mining of the red blood cell: focus on the membrane
proteome. Expert review of proteomics 2010, 7, 165-168.
[30] Simckes, A. M., Swanson, S. K., White, R. A., Chromosomal localization of three
vacuolar-H+ -ATPase 16 kDa subunit (ATP6V0C) genes in the murine genome.
Cytogenetic and genome research 2002, 97, 111-115.
[31] Lieu, Z. Z., Lock, J. G., Hammond, L. A., La Gruta, N. L., et al., A trans-Golgi network
golgin is required for the regulated secretion of TNF in activated macrophages in vivo.
Proceedings of the National Academy of Sciences of the United States of America 2008,
105, 3351-3356.
[32] Herget, M., Tampe, R., Intracellular peptide transporters in human--
compartmentalization of the "peptidome". Pflugers Arch 2007, 453, 591-600.
[33] Shirihai, O. S., Gregory, T., Yu, C., Orkin, S. H., Weiss, M. J., ABC-me: a novel
mitochondrial transporter induced by GATA-1 during erythroid differentiation. The
EMBO journal 2000, 19, 2492-2502.
[34] Anderson, G. P., COPD, asthma and C-reactive protein. Eur Respir J 2006, 27, 874-
876.
[35] Rahman, I., Adcock, I. M., Oxidative stress and redox regulation of lung
inflammation in COPD. Eur Respir J 2006, 28, 219-242.
[36] Kurano, Y., Nakamura, M., Ichiba, M., Matsuda, M., et al., In vivo distribution and
localization of chorein. Biochemical and biophysical research communications 2007,
353, 431-435.
[37] Walker, R. H., Liu, Q., Ichiba, M., Muroya, S., et al., Self-mutilation in chorea-
acanthocytosis: Manifestation of movement disorder or psychopathology? Mov Disord
2006, 21, 2268-2269.
[38] Iolascon, A., Perrotta, S., Stewart, G. W., Red blood cell membrane defects.
Reviews in clinical and experimental hematology 2003, 7, 22-56.
[39] Scholey, J. M., Kinesin-II, a membrane traffic motor in axons, axonemes, and
spindles. The Journal of cell biology 1996, 133, 1-4.
[40] Miki, H., Okada, Y., Hirokawa, N., Analysis of the kinesin superfamily: insights into
structure and function. Trends in cell biology 2005, 15, 467-476.
[41] Akiyama, T., Kawasaki, Y., Wnt signalling and the actin cytoskeleton. Oncogene
2006, 25, 7538-7544.
[42] Jimbo, T., Kawasaki, Y., Koyama, R., Sato, R., et al., Identification of a link between
the tumour suppressor APC and the kinesin superfamily. Nature cell biology 2002, 4,
323-327.
[43] Liao, R., Sun, J., Zhang, L., Lou, G., et al., MicroRNAs play a role in the development
of human hematopoietic stem cells. Journal of cellular biochemistry 2008, 104, 805-
817.
[44] Sasaki, T., Shiohama, A., Minoshima, S., Shimizu, N., Identification of eight
members of the Argonaute family in the human genome. Genomics
2003, 82, 323-330.
[45] Percy, M. J., Lappin, T. R., Recessive congenital methaemoglobinaemia:
cytochrome b(5) reductase deficiency. British journal of haematology 2008, 141, 298-
308.
[46] Jaffe, E. R., Methemoglobin pathophysiology. Progress in clinical and biological
research 1981, 51, 133-151.
[47] Mak, J. C., Pathogenesis of COPD. Part II. Oxidative-antioxidative imbalance. Int J
Tuberc Lung Dis 2008, 12, 368-374.
[48] Santos, M. C., Oliveira, A. L., Viegas-Crespo, A. M., Vicente, L., et al., Systemic
markers of the redox balance in chronic obstructive pulmonary disease. Biomarkers
2004, 9, 461-469.
[49] Rahman, I., Biswas, S. K., Kode, A., Oxidant and antioxidant balance in the airways
and airway diseases. European journal of pharmacology 2006, 533, 222-239.
[50] Chae, H. Z., Robison, K., Poole, L. B., Church, G., et al., Cloning and sequencing of
thiol-specific antioxidant from mammalian brain: alkyl hydroperoxide reductase and
thiol-specific antioxidant define a large family of antioxidant enzymes. Proceedings of
the National Academy of Sciences of the United States of America 1994, 91, 7017-7021.
[51] Lehtonen, S. T., Markkanen, P. M., Peltoniemi, M., Kang, S. W., Kinnula, V. L.,
Variable overoxidation of peroxiredoxins in human lung cells in severe oxidative stress.
American journal of physiology 2005, 288, L997-1001.
[52] Lehtonen, S. T., Svensk, A. M., Soini, Y., Paakko, P., et al., Peroxiredoxins, a novel
protein family in lung cancer. International journal of cancer 2004, 111, 514-521.
[53] Rezvani, K., Teng, Y., Shim, D., De Biasi, M., Nicotine regulates multiple synaptic
proteins by inhibiting proteasomal activity. J Neurosci 2007, 27, 10508-10519.
Chapter IV
A Comparative, Global Proteomic
Analysis of Human Nasal
Epithelial Cells Obtained by Nasal
Brushing in Nonsmoking versus
Smoking Healthy Individuals
A comparative, global proteomic analysis of human nasal epithelial cells obtained by
nasal brushing in nonsmoking versus smoking healthy individuals
Bruno M. Alexandre1, Nicholas W. Bateman2, Brian L. Hood2, Mai Sun2, Thomas P.
Conrads2* and Deborah Penque1*
1Laboratorio de Proteómica, Departamento de Genética, Instituto Nacional de Saúde
Dr. Ricardo Jorge (INSA-IP), Av. Padre Cruz 1649-016 Lisboa, Portugal and the
2Department of Pharmacology & Chemical Biology, University of Pittsburgh Cancer
Institute, University of Pittsburgh
Keywords: Nasal epithelial cells, nasal brushing, tobacco, lung, cigarette smoke,
proteomics.
*Corresponding authors: Deborah Penque, Ph.D., Laboratório de Proteómica,
Departamento de Genética, Edifício INSA II, Instituto Nacional de Saúde Dr. Ricardo
Jorge, INSA, I.P., Avenida Padre Cruz, 1649-016 Lisboa, Portugal, Tel: +351 21750 8137,
Fax: +351 21752 6410, E-mail: [email protected] and Thomas P.
Conrads, Ph.D., 204 Craft Avenue, Suite B401, Pittsburgh, PA, 15213, Tel: 412-641-
7556, Fax: 412-641-2356, E-mail: [email protected]
ABSTRACT
Cigarette smoking is the leading cause of preventable death worldwide, and yet
premature tobacco-attributable deaths are projected to rise from 5.4 million in 2004
to 8.3 million in 2030, about 10% of deaths worldwide, with more than 80% occurring in
developing countries. The nasal epithelium is the initial point of contact between the
respiratory tract and the external environment, and is continuously exposed to
irritating particles and chemicals such as those present in cigarette smoke.
This is a pioneering study in which, for the first time, the proteome of nasal epithelial
cells obtained from smokers is described and compared with that of nonsmokers.
Samples were analyzed on a high-resolution mass spectrometer, which yielded
over 900 protein identifications by two or more peptides.
Ninety-six proteins were found to be differentially expressed between the proteomes
of healthy smokers and nonsmokers; these were related to antigen
presentation, cell-to-cell signaling and interaction, cell morphology, drug metabolism,
DNA repair, energy production and mitochondrial dysfunction. Previous findings, such as the
overexpression of CD44 and MUC5AC due to cigarette smoking, were confirmed;
importantly, many proteins related to the aforementioned processes, among others, had
never before been associated with cigarette smoking.
INTRODUCTION
Cigarette smoking is the leading cause of preventable death worldwide and yet,
despite anti-smoking campaigns by the European Respiratory Society [1], the
American Thoracic Society [2] and the World Health Organization (WHO) [3], the
number of smokers keeps increasing. As a result, the global epidemic of tobacco-associated
diseases has progressively worsened. According to the 2008 WHO Report, premature
tobacco-attributable deaths from ischemic heart disease, cerebrovascular disease and
chronic obstructive pulmonary disease, among others, are projected to rise from 5.4
million in 2004 to 8.3 million in 2030, about 10% of deaths worldwide, with more
than 80% occurring in developing countries [4]. Equally troubling is the economic
burden of smoking: the recent estimate of 500 billion US dollars spent globally each
year on caring for and treating tobacco-related illnesses is
projected to reach 1 trillion by 2030 [3].
Cigarette smoke is a complex mixture of over 4000 substances, including antigenic,
cytotoxic, mutagenic and carcinogenic agents, that are inhaled directly or as products
of high-temperature combustion at the end of the cigarette [5, 6]. These include high
levels of oxidants and reactive oxygen species (ROS), detected in both mainstream and
sidestream smoke [5-7]. Cigarette smoke-mediated oxidative stress produces DNA
damage and activates survival signalling cascades, resulting in uncontrolled cell
proliferation and transformation [7, 8]. The oxidative stress that ensues when
antioxidant defenses are depleted is accompanied by a further increase in ROS
production in lung epithelial cells [8]. The nasal epithelium is the initial point of contact
between the respiratory tract and the external environment, and is continuously exposed
to irritating particles and chemicals, such as those present in cigarette
smoke, as well as to viruses, bacteria, airborne allergens and other environmental
pollutants. The nasal epithelium has long been regarded primarily as a
physical barrier, but recent evidence strongly suggests that epithelial cells are
metabolically active and capable of modulating a variety of inflammatory processes
and immune responses [9, 10].
Although investigations into the effects of cigarette smoke are plentiful, to the best of
our knowledge no study has yet described the proteomic alterations induced in nasal
epithelial cells by chronic smoking. Moreover, reported studies on
cigarette smoke are often based on animal models or on cultured cells treated with
cigarette smoke extract. The culture conditions of airway epithelial cells, as well as
their proliferation and immortalization, may influence their protein expression levels
and therefore their function. To overcome these issues, the present investigation was
conducted using freshly obtained epithelial cells collected by nasal brushing from
nonsmokers and smokers. Our group has previously demonstrated that nasal brushing
yields numerous well-preserved dissociated cells that are representative of the
human superficial respiratory mucosa [11], and has shown their utility in the proteomic
study of the monogenic disease cystic fibrosis [12, 13]. Nasal epithelial cells have
also been reported to constitute an accessible surrogate for studying lower airway
inflammation [14]. Liquid chromatography-tandem mass spectrometry and spectral
counting were used to derive protein abundance changes in nasal epithelial cells
caused by chronic exposure to cigarette smoke in smokers as compared to
nonsmokers.
MATERIALS AND METHODS
Individuals and Sample Collection
The study was approved by the Ethics Committees of both Hospital de Santa Maria,
Lisbon, and Instituto Nacional de Saúde Dr. Ricardo Jorge (INSA), Lisbon. After informed
consent, nasal epithelial cells were collected by nasal brushing as previously described
[11, 15] from nonsmokers (n=10) and cigarette smokers (n=8). To be
included in the smoker group, individuals had to have smoked for at least 20 years,
at a rate of at least 10 cigarettes per day. No subject presented signs or symptoms of
any respiratory or other chronic disease. Lung function was evaluated by
spirometry, and FEV1/FVC > 0.7 was set as the criterion for normal lung function.
Nonsmokers and smokers were matched for age (54±3.1 and 51±5.4 years
old, respectively) and gender (20% and 12.5% males, respectively). Cell suspensions
from each individual were cytospun onto a microscope slide, stained with
May-Grünwald-Giemsa (MGG) and examined for evidence of epithelial cells
(ciliated, goblet and basal cells) and for red blood cell contamination. Only samples
presenting about 0-1% red blood cell contamination were included in the study.
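The inclusion criteria described above can be summarized as a simple screening rule. The following is only an illustrative sketch, not code used in the study: the `Subject` record and its field names are hypothetical, and the "about 0-1%" red blood cell limit is assumed here to mean a hard cut-off of 1%.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    """Hypothetical record; field names are illustrative only."""
    smoker: bool
    years_smoking: float
    cigarettes_per_day: float
    fev1_fvc: float               # FEV1/FVC expressed as a fraction
    rbc_contamination_pct: float  # red blood cells on the MGG-stained cytospin
    chronic_disease: bool = False


def eligible(s: Subject) -> bool:
    """Apply the criteria stated in the text to one subject."""
    if s.chronic_disease:
        return False
    if s.fev1_fvc <= 0.7:             # normal lung function: FEV1/FVC > 0.7
        return False
    if s.rbc_contamination_pct > 1.0:  # assumed cut-off for "about 0-1%"
        return False
    if s.smoker:
        # smokers: at least 20 years, at least 10 cigarettes per day
        return s.years_smoking >= 20 and s.cigarettes_per_day >= 10
    return True
```

For example, a smoker with a 25-year history of 15 cigarettes per day, FEV1/FVC of 0.79 and 0.5% red blood cell contamination would pass this screen.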
Nasal Epithelial Cells Lysis
Cell suspensions were centrifuged after collection and the pelleted cells were resuspended
in 10 mM Tris-Cl (pH 7.6), 1 mM EDTA, containing protease inhibitors.
Cells were lysed by intermittent sonication (10 cycles of a 10 s pulse followed
by a 30 s pause on dry ice). Lysates were centrifuged twice at 2000 × g for 3 min at 4 °C
to discard any unlysed cells or cell debris. Before storage at -80 °C, a 10 μL aliquot
from each individual was removed for a BCA protein assay (Pierce, Rockford, IL,
USA).
Sample Preparation for LC-MS/MS
Two biological replicates were constituted within each of the groups under analysis
(Table IV.1). Each biological replicate, containing 30 μg of total cell lysate from one of the
groups under analysis (nonsmokers or smokers), was spiked with 3 pmol of chicken
ovalbumin.
Table IV.1: Main characteristics of the biological replicates of the samples under
analysis.
Group | Pool n | Biological replicate | n | Age (y) | FVC (%) | FEF (%) | FEV1 (%) | FEV1/FVC (%)
Nonsmokers | 10 | 1 | 5 | 54 ± 2.9 | 97 ± 20.5 | 81 ± 17.9 | 98 ± 19.7 | 86 ± 4.1
Nonsmokers | 10 | 2 | 5 | 53 ± 3.6 | 98 ± 10.4 | 84 ± 17.6 | 98 ± 13.8 | 88 ± 11.8
Smokers | 8 | 1 | 4 | 52 ± 7.7 | 112 ± 19.5 | 68 ± 6.5 | 108 ± 15.6 | 79 ± 3.1
Smokers | 8 | 2 | 4 | 50 ± 1.7 | 103 ± 18.9 | 76 ± 24.0 | 98 ± 21.9 | 80 ± 1.9
Each sample was loaded in duplicate lanes on a 4-12% Bis-Tris 1D SDS-PAGE
gel (NuPAGE, Invitrogen, Carlsbad, CA) and electrophoresed for approximately 10 min
at a constant voltage of 150 V. Gels were stained with Coomassie blue (SimplyBlue
SafeStain, Invitrogen), and the bands belonging to the same sample were excised, sliced
into small pieces and pooled in the same tube. Gel slices were destained in
50% acetonitrile (ACN) and 50 mM ammonium bicarbonate (AMB) overnight at 4 °C
and for a further hour the next morning. Fully destained gel slices were dehydrated in
100% ACN. Gel slices were then rehydrated in 25 mM AMB containing 20 µg/mL
porcine sequencing-grade modified trypsin (Promega, Madison, WI) on ice for 45 min.
This solution was discarded, and a 25 mM AMB solution was added to the gel slices, which
were incubated overnight at 37 °C. Tryptic peptides were extracted with 70% ACN and 5%
formic acid (FA) and dried by vacuum centrifugation. Each digest was resuspended in
60 µL of 0.1% trifluoroacetic acid (TFA).
Proteomic analysis by liquid chromatography-tandem mass spectrometry
Peptide digests were resolved by nanoflow reverse-phase liquid chromatography
(Ultimate 3000, Dionex Inc.) coupled online via electrospray ionization to a hybrid
linear ion trap - Orbitrap mass spectrometer (LTQ-Orbitrap, ThermoFisher Scientific,
Inc., San Jose, CA). Five injections of 2 µL of peptide extracts corresponding to 1 μg
total protein were resolved on 100 μm i.d. by 360 μm o.d. by 200 mm long fused silica
capillary columns (Polymicro Technologies, Phoenix, AZ) slurry-packed in-house with 5
μm, 300 Å pore size C-18 silica-bonded stationary phase (Jupiter, Phenomenex,
Torrance, CA). After sample injection, peptides were eluted from the column using a
linear gradient of 2% mobile phase B (100% ACN and 0.1% formic acid) to 40% mobile
phase B over 125 min at a constant flow rate of 200 nL/min followed by a column wash
consisting of 95% B for an additional 30 min at a constant flow rate of 400 nL/min. The
LTQ-Orbitrap MS was configured to collect high resolution (R=60,000 at m/z 400)
broadband mass spectra (m/z 375-1800) from which the thirteen most abundant
peptide molecular ions dynamically determined from the MS scan were selected for
tandem MS using a relative CID energy of 30%. Dynamic exclusion was utilized to
minimize redundant selection of peptides for CID.
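The injection load and the elution program described above can be checked with a little arithmetic. The sketch below is illustrative only. It assumes that the full 30 μg of lysate is recovered in the 60 µL digest (so that 2 µL corresponds to 1 μg), and that the 95% B wash begins as a step immediately after the 125 min gradient; neither detail is stated explicitly in the text, and the function names are hypothetical.

```python
def injected_protein_ug(total_ug: float = 30.0,
                        resuspension_ul: float = 60.0,
                        injection_ul: float = 2.0) -> float:
    """Protein equivalent per injection, assuming full recovery of the digest."""
    return total_ug / resuspension_ul * injection_ul


def percent_b(t_min: float) -> float:
    """Mobile phase B (%) at time t_min after sample injection."""
    if t_min < 0 or t_min > 155:
        raise ValueError("time outside the 155 min program")
    if t_min <= 125:
        # linear gradient from 2% B to 40% B over 125 min (200 nL/min)
        return 2.0 + (40.0 - 2.0) * t_min / 125.0
    # column wash at 95% B for a further 30 min (400 nL/min)
    return 95.0
```

Under these assumptions, `injected_protein_ug()` reproduces the 1 μg per injection quoted in the text, and `percent_b(62.5)` gives the mid-gradient composition of 21% B.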
Peptide Identification and Spectral Count Analysis
Peptide identifications were obtained by searching the LC-MS/MS data with
SEQUEST (Thermo Scientific BioWorks 3.2) on a 72-node Beowulf cluster against a
UniProt-derived human proteome database (version 03/10) obtained from the
European Bioinformatics Institute (EBI). Search parameters consisted of enzyme:
trypsin (KR); enzyme limits: full enzymatic cleavage at both ends; missed cleavage
sites: 2; peptide tolerance: 20 ppm; fragment ion tolerance: 0.5 amu; and a variable
modification on methionine of 15.99492 Da. The resulting peptide identifications were
filtered according to specific SEQUEST scoring criteria [delta correlation (ΔCn) ≥ 0.08
and charge state-dependent cross correlation (Xcorr) ≥ 1.9 for [M+H]1+ (mass+proton),
≥ 2.2 for [M+2H]2+, ≥ 3.5 for [M+3H]3+ and ≥ 3.0 for [M+4H]4+]. Differences in
protein abundance between the samples were derived by spectral counting (SC).
Peptides whose sequence mapped to multiple protein isoforms were grouped according to
the principle of parsimony [16]. A value of 0.5 was added to each spectral count value
prior to log2 transformation to enable ratio values to be calculated for proteins
identified in one group but not the other [17]. Proteins which exhibited a >95%
confidence interval from the mean for each comparison performed were considered
statistically significant.
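The score filter and the spectral count ratio described above can be expressed compactly. This is a minimal sketch rather than the software actually used: the function names are hypothetical, and rejecting charge states above 4+ is an assumption, since the text gives no threshold for them.

```python
import math

# Minimum Xcorr per precursor charge state, as listed in the text.
XCORR_MIN = {1: 1.9, 2: 2.2, 3: 3.5, 4: 3.0}
DELTA_CN_MIN = 0.08


def passes_filter(charge: int, xcorr: float, delta_cn: float) -> bool:
    """Return True if a peptide identification meets the scoring criteria."""
    threshold = XCORR_MIN.get(charge)
    if threshold is None:
        return False  # assumption: charge states without a stated threshold are rejected
    return delta_cn >= DELTA_CN_MIN and xcorr >= threshold


def log2_ratio(sc_smokers: float, sc_nonsmokers: float,
               pseudocount: float = 0.5) -> float:
    """Log2 smokers/nonsmokers ratio with 0.5 added to each spectral count."""
    return math.log2((sc_smokers + pseudocount) / (sc_nonsmokers + pseudocount))
```

The pseudocount keeps the ratio finite when a protein is observed in only one group: `log2_ratio(1.5, 0)` evaluates to 2.0 and `log2_ratio(0, 1.5)` to -2.0, whereas the ratio of the raw counts would be undefined.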
Bioinformatic analyses
Uniprot accessions corresponding to proteins identified by at least two peptides were
mapped to HUGO (HGNC) gene symbols utilizing Ingenuity Pathway Analysis (IPA)
(Ingenuity® Systems, www.ingenuity.com). Accessions which failed to map were
converted to IPI identifiers with the mapping utility available at www.uniprot.org and
remapped to IPA to maximize protein identifications available for downstream
bioinformatic analyses.
Protein localization and subtype assignments were derived from IPA-mapped data
sets. Functional analysis of the significant protein lists was performed using the “Core
Analysis” function in IPA with default parameters (p<0.05, Fisher’s exact test).
Network and protein interaction analyses were also performed in IPA for the
significant proteins, allowing a maximum of 35 proteins per network
assignment. ProteinCenter (Thermo Fisher Scientific) was used to retrieve information on
gene ontology terms to annotate proteins according to biological process and
molecular function using default parameters.
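To illustrate the kind of enrichment test referred to above, a one-sided Fisher's exact test can be computed directly from the hypergeometric distribution. This sketch is not IPA's implementation: the choice of the right tail, the reference set, and all names below are assumptions made for illustration.

```python
from math import comb


def fisher_right_tail(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) for X ~ Hypergeometric.

    N: proteins in the reference set
    K: reference proteins annotated to the function of interest
    n: proteins in the significant list
    k: significant proteins annotated to the function
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probabilities of observing k or more annotated proteins.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total
```

A function would then be reported as enriched when this probability falls below the 0.05 threshold mentioned in the text.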
RESULTS AND DISCUSSION
Global proteomics analysis of nasal epithelial cells from smoker and nonsmoker
subjects.
It has been demonstrated that nasal brushing is capable of yielding numerous and
well-preserved dissociated cells, representative of the human superficial respiratory
mucosa [11]. To confirm this statement, nasal epithelial cell suspensions from each
individual were evaluated upon collection. Cells were harvested and the cytospins
were prepared and stained with MGG in order to evaluate the presence of ciliated,
goblet and basal cells and also for red cell contamination (Figure IV.1).
Figure IV.1: MGG-staining of nasal cells collected by brushing. Magnification: 80x.
After microscopic examination, nasal epithelial samples presenting 0-1% red blood
cell contamination were processed as described in Figure IV.2. Two biological
replicates were constituted for each of the groups under analysis, smokers and
nonsmokers. Five injections, each corresponding to 1 µg of total protein, from each of
the biological replicates were analyzed by LC-MS/MS on a high-resolution mass
spectrometer, resulting in the identification of a total of 42666 peptides and 1190
proteins (Supplemental Table IV.1, Supporting Information), of which 910 were
identified by at least two peptides (Supplemental Table IV.2A, Supporting Information).
Figure IV.2: Basic workflow of the methodology employed for the study of the nasal
epithelial cells proteome of healthy smokers and nonsmokers.
Information on the respective log2 smokers/nonsmokers ratio of each of these
proteins, on cellular location and functional type, and on the number of
transmembrane domains can be found in Supplemental Tables IV.2B, IV.3 and IV.4,
Supporting Information, respectively. Information on the gene ontology terms for
biological process and molecular function is provided in Supplemental Figures IV.1
and IV.2, Supporting Information, respectively. The equivalency of the protein digests was
determined by comparing the total number of peptides identified in each of the
analytical samples, which resulted in a relative standard deviation (RSD) of
8.5%. It was also evaluated by comparing the total peptides identified for
chicken ovalbumin, which was added in equal amounts to each of the analyzed
samples according to the workflow displayed in Figure IV.2. The RSD for chicken ovalbumin
was 5.3%. Equivalency in peptide load was further assessed
by comparing the spectral count values for the “housekeeping” protein actin,
cytoplasmic 1 (ACTB, UniProt Accession (Acc): P60709), commonly used to correct for
protein loading in western blot analysis [18], which gave an RSD of 13.5%. During
the nasal brushing procedure some local bleeding may occur, which accounts
for the identification of proteins such as hemoglobin subunits alpha, beta,
delta and gamma-1 (HBA1, HBB, HBD and HBG1, respectively) and erythrocyte band 7
integral membrane protein (STOM). Nonetheless, the epithelial origin of most cells
obtained by nasal brushing was confirmed by the identification of proteins
such as keratin type I cytoskeletal 19 (KRT19, Acc: P08727), keratin type II cytoskeletal
8 (KRT8, Acc: P05787), palate, lung and nasal epithelium carcinoma (PLUNC, Acc:
Q9NP55), long palate, lung and nasal epithelium carcinoma (LPLUNC, Acc: Q8TDL5),
epithelial cell adhesion molecule (EPCAM, Acc: P16422), and several mucins, such
as mucin-1 (MUC1, Acc: P15941), mucin-2 (MUC2, Acc: Q02817), mucin-4 (MUC4, Acc:
Q99102), mucin-5AC (MUC5AC, Acc: P98088) and mucin-5B (MUC5B, Acc: Q9HC84),
which are reported to be expressed at relatively high levels in the human respiratory
tract when compared to other mucin genes [19]. The proteins identified in the two groups
under analysis showed a considerable overlap of 74.7%, as 680 of the 910 proteins
identified by at least two peptides were common to both groups
(Figure IV.3).
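The two summary statistics used in this paragraph, the relative standard deviation and the percentage overlap, are simple to reproduce. The sketch below is illustrative: it assumes the sample (n-1) standard deviation, which the text does not specify, and the function names are hypothetical.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (%), using the sample standard deviation."""
    return 100.0 * stdev(values) / mean(values)


def overlap_percent(shared: int, total: int) -> float:
    """Share of identified proteins common to both groups, in percent."""
    return 100.0 * shared / total
```

With the figures reported here, `overlap_percent(680, 910)` rounds to the 74.7% quoted above. The per-sample peptide totals behind the RSD values are not listed in this chapter, so they are not reproduced.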
Figure IV.3: Venn diagram showing the overlap in proteins identified by at least two
peptides between the two groups under analysis.
Among the proteins found in only one group is hypoxia up-regulated protein 1
(HYOU1), which was confidently identified in the smoker group only. Expression of this
protein is induced by stress, resulting in its accumulation
in the endoplasmic reticulum (ER) under hypoxic conditions. HYOU1 is also
suggested to have an important cytoprotective role in hypoxia-induced cellular
perturbation, since suppression of this protein is associated with accelerated apoptosis
[20]. Hierarchical clustering was used to arrange proteins according to similarity in
protein expression across the samples under analysis [21, 22]. This method not only
retrieves information from the datasets without a priori knowledge but also
delivers the output graphically in a very intuitive way. When this method was applied to
our data, the biological replicates from the nonsmoker samples clustered together and apart
from the smoker samples (Supplemental Figure IV.3, Supporting Information). This is
important for the comparative proteomics analysis: because the biological replicates
present similar protein expression and therefore cluster together, the
comparison between groups is reinforced.
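The idea behind this check can be illustrated with a small agglomerative clustering of sample profiles. The sketch below is not the procedure of references [21, 22]: the Euclidean distance, the average linkage and the expression values are all assumptions chosen for illustration.

```python
from itertools import combinations
from math import dist


def average_linkage(profiles):
    """Cluster samples bottom-up; return the merged clusters in merge order."""
    def linkage(a, b):
        # mean pairwise Euclidean distance between the members of two clusters
        pairs = [(x, y) for x in a for y in b]
        return sum(dist(profiles[x], profiles[y]) for x, y in pairs) / len(pairs)

    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        clusters.remove(a)
        clusters.remove(b)
        merged = tuple(sorted(a + b))
        clusters.append(merged)
        merges.append(merged)
    return merges


# Hypothetical normalized abundances of three proteins in four samples.
example = {
    "NS1": [1.0, 5.0, 2.0],
    "NS2": [1.2, 4.8, 2.1],
    "S1": [4.0, 1.0, 3.0],
    "S2": [4.3, 1.2, 2.8],
}
merges = average_linkage(example)
```

In this toy example the two nonsmoker replicates merge first, then the two smoker replicates, and only the last step joins the two groups, which is the pattern described above for the real data.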
Comparative proteomics analysis of smoker and nonsmoker subjects.
After submitting the spectral counts of proteins identified by at least two peptides in
the smoker and nonsmoker groups to a t-test, 96 proteins exhibited a >95% confidence
interval and were therefore considered to be significantly differentially expressed.
However, one entry (C9JFA0) has since been removed from the UniProt database,
and therefore 95 proteins were considered in the comparative analysis of the
smoker and nonsmoker groups. Figure IV.4 shows the hierarchical clustering of these
proteins and their relative expression. Information on the identity of these
proteins, as well as on their cellular location and functional type, is given in Table IV.2.
To better understand these proteins in their biological context, they
were analyzed with the “Core Analysis” function of Ingenuity Pathway Analysis (IPA). When
this set was submitted to IPA, 87 proteins were found to be eligible for network analysis
and 86 for function and pathway analysis. The significantly differentially
expressed proteins were grouped into networks, and the 5 networks with the
highest scores are shown in Table IV.3. Functions associated with the top 3
networks include antigen presentation, cell-to-cell signaling and interaction, cell
morphology, drug metabolism, DNA replication, recombination and repair, and energy
production, which is consistent with what has been reported to be the main effects of
cigarette smoke [7, 8, 23].
Figure IV.4: Hierarchical cluster of the significantly differentially expressed proteins.
Protein abundances are displayed as normalized expression. X-axis labels refer to
information displayed in Table IV.1.
The top network, which comprises 24 proteins identified in the present work,
corresponding to 28% of the significantly differentially expressed proteins eligible for
network analysis, is displayed in Figure IV.5. Networks 2 and 3 are available as
Supplemental Figures IV.4 and IV.5, Supporting Information.
Table IV.2: Differentially expressed proteins in smokers (S) when compared to
nonsmokers (NS) exhibiting a >95% confidence interval. Cellular location and
functional type were retrieved by Ingenuity knowledgebase (Ingenuity Systems).
Uniprot
Acc.
HGNC
Symbol Entrez Gene Name
S/NS Log2
Ratio
Cellular
Location
Functional
Type
P24666 ACP1 acid phosphatase 1, soluble -3,46 Cytoplasm phosphatase
Q9C0K3 ACTR3C ARP3 actin-related protein 3
homolog C (yeast)
-2,00 unknown other
Q8WXS8 ADAMTS14 ADAM metallopeptidase
with thrombospondin type 1
motif, 14
-2,81 Extracellular
Space
peptidase
P14550 AKR1A1 aldo-keto reductase family
1, member A1 (aldehyde
reductase)
-0,68 Cytoplasm enzyme
Q13740 ALCAM activated leukocyte cell
adhesion molecule
-0,51 Plasma
Membrane
other
P51649 ALDH5A1 aldehyde dehydrogenase 5
family, member A1
0,85 Cytoplasm enzyme
Q07960 ARHGAP1 Rho GTPase activating
protein 1
2,00 Cytoplasm other
P25705 ATP5A1 ATP synthase, H+
transporting, mitochondrial
F1 complex, alpha subunit 1,
cardiac muscle
0,60 Cytoplasm transporter
P36542 ATP5C1 ATP synthase, H+
transporting, mitochondrial
F1 complex, gamma
polypeptide 1
2,59 Cytoplasm transporter
Q13867 BLMH bleomycin hydrolase -3,59 Cytoplasm peptidase
Q86WA6 BPHL biphenyl hydrolase-like
(serine hydrolase)
-2,32 Cytoplasm enzyme
Q9HB07 C12ORF10 chromosome 12 open
reading frame 10
-2,00 unknown other
Q8TDL5 C20ORF11
4
chromosome 20 open
reading frame 114
0,45 Extracellular
Space
other
P16070 CD44 CD44 molecule (Indian
blood group)
0,91 Plasma
Membrane
other
77
Q59FS9 CD81 CD81 molecule 1,59 Plasma
Membrane
other
P60953 CDC42 cell division cycle 42 (GTP
binding protein, 25kDa)
-3,00 Cytoplasm enzyme
Q07065 CKAP4 cytoskeleton-associated
protein 4
3,32 Cytoplasm other
P17540 CKMT2 creatine kinase,
mitochondrial 2
(sarcomeric)
1,00 Cytoplasm kinase
Q9Y696 CLIC4 chloride intracellular
channel 4
2,00 Plasma
Membrane
ion channel
P13073 COX4I1 cytochrome c oxidase
subunit IV isoform 1
1,59 Cytoplasm enzyme
P50416 CPT1A carnitine
palmitoyltransferase 1A
(liver)
2,00 Cytoplasm enzyme
P00387 CYB5R3 cytochrome b5 reductase 3 2,54 Cytoplasm enzyme
P08574 CYC1 cytochrome c-1 3,17 Cytoplasm enzyme
P39656 DDOST dolichyl-
diphosphooligosaccharide--
protein glycosyltransferase
1,59 Cytoplasm enzyme
Q9NY33 DPP3 dipeptidyl-peptidase 3 -0,97 Cytoplasm peptidase
P05198 EIF2S1 eukaryotic translation
initiation factor 2, subunit 1
alpha, 35kDa
-2,00 Cytoplasm translation
regulator
B7Z3Q9 EML2 echinoderm microtubule
associated protein like 2
-3,46 Cytoplasm other
P58107 EPPK1 epiplakin 1 1,87 Cytoplasm other
A2A2Y4 FRMD3 FERM domain containing 3 -1,59 unknown other
P21217 FUT3 fucosyltransferase 3
(galactoside 3(4)-L-
fucosyltransferase, Lewis
blood group)
2,00 Cytoplasm enzyme
P50395 GDI2 GDP dissociation inhibitor 2 -1,46 Cytoplasm other
Q9HC38 GLOD4 glyoxalase domain
containing 4
-2,32 Cytoplasm enzyme
P63244 GNB2L1 guanine nucleotide binding
protein (G protein), beta
0,95 Cytoplasm enzyme
78
polypeptide 2-like 1
P15586 GNS glucosamine (N-acetyl)-6-sulfatase -1,46 Cytoplasm enzyme
P17174 GOT1 glutamic-oxaloacetic transaminase 1, soluble (aspartate aminotransferase 1) -0,78 Cytoplasm enzyme
P48637 GSS glutathione synthetase -0,88 Cytoplasm enzyme
P46976 GYG1 glycogenin 1 -2,00 Cytoplasm enzyme
P19367 HK1 hexokinase 1 -0,79 Cytoplasm kinase
P00738 HP haptoglobin 0,81 Extracellular Space peptidase
HSFY1 HSFY1 heat shock transcription factor, Y-linked 1 -1,59 Nucleus transcription regulator
P34931 HSPA1L heat shock 70kDa protein 1-like -0,56 Cytoplasm other
P38646 HSPA9 heat shock 70kDa protein 9 (mortalin) -0,31 Cytoplasm other
Q92598 HSPH1 heat shock 105kDa/110kDa protein 1 -2,00 Cytoplasm other
P23276 KEL Kell blood group, metallo-endopeptidase -1,59 Plasma Membrane peptidase
Q6UXB3 LYPD2 LY6/PLAUR domain containing 2 -1,42 unknown other
Q9HCC0 MCCC2 methylcrotonoyl-CoA carboxylase 2 (beta) 2,59 Cytoplasm enzyme
P23368 ME2 malic enzyme 2, NAD(+)-dependent, mitochondrial -0,56 Cytoplasm enzyme
Q16798 ME3 malic enzyme 3, NADP(+)-dependent, mitochondrial -2,00 Cytoplasm enzyme
Q9Y2Q9 MRPS28 mitochondrial ribosomal protein S28 -1,59 Cytoplasm other
Q96DH6 MSI2 musashi homolog 2 (Drosophila) -3,00 Cytoplasm other
P26038 MSN moesin -1,42 Plasma Membrane other
Q9Y6C9 MTCH2 mitochondrial carrier homolog 2 (C. elegans) 2,59 Cytoplasm other
P98088 MUC5AC mucin 5AC, oligomeric mucus/gel-forming 0,45 Extracellular Space other
P35580 MYH10 myosin, heavy chain 10, non-muscle 1,59 Cytoplasm other
Q16795 NDUFA9 (includes EG:4704) NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, 9, 39kDa 2,81 Cytoplasm enzyme
Q9NX14 NDUFB11 NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 11, 17.3kDa -2,00 Cytoplasm enzyme
O75306 NDUFS2 NADH dehydrogenase (ubiquinone) Fe-S protein 2, 49kDa (NADH-coenzyme Q reductase) 3,17 Cytoplasm enzyme
O75489 NDUFS3 NADH dehydrogenase (ubiquinone) Fe-S protein 3, 30kDa (NADH-coenzyme Q reductase) 3,81 Cytoplasm enzyme
Q969S2 NEIL2 nei endonuclease VIII-like 2 (E. coli) -1,59 Nucleus enzyme
Q5SPY9 NPDC1 neural proliferation, differentiation and control, 1 -1,59 Extracellular Space other
B0ZBF1 NPR1 natriuretic peptide receptor A/guanylate cyclase A (atrionatriuretic peptide receptor A) -2,00 Plasma Membrane enzyme
Q96PE5 OPALIN oligodendrocytic myelin paranodal and inner loop protein -1,59 Cytoplasm other
O00764 PDXK pyridoxal (pyridoxine, vitamin B6) kinase -0,50 Cytoplasm kinase
P30086 PEBP1 phosphatidylethanolamine binding protein 1 -0,45 Cytoplasm other
P07737 PFN1 profilin 1 -0,56 Cytoplasm other
O95336 PGLS 6-phosphogluconolactonase -2,32 Cytoplasm enzyme
Q8IV08 PLD3 phospholipase D family, member 3 -2,00 Cytoplasm enzyme
P28072 PSMB6 proteasome (prosome, macropain) subunit, beta type, 6 1,59 Cytoplasm peptidase
P62191 PSMC1 proteasome (prosome, macropain) 26S subunit, ATPase, 1 2,00 Nucleus peptidase
P18754 RCC1 (includes EG:1104) regulator of chromosome condensation 1 -0,71 Cytoplasm other
IPI00815843 RPL14 ribosomal protein L14 3,32 Cytoplasm other
Q02878 RPL6 ribosomal protein L6 1,59 Cytoplasm other
Q96T51 RUFY1 RUN and FYVE domain containing 1 -2,00 Cytoplasm transporter
Q9NP81 SARS2 seryl-tRNA synthetase 2, mitochondrial -1,42 Cytoplasm enzyme
Q13228 SELENBP1 selenium binding protein 1 -0,51 Cytoplasm other
P35237 SERPINB6 serpin peptidase inhibitor, clade B (ovalbumin), member 6 -2,26 Cytoplasm other
Q9UJS0 SLC25A13 solute carrier family 25, member 13 (citrin) 2,59 Cytoplasm transporter
P12236 SLC25A6 solute carrier family 25 (mitochondrial carrier; adenine nucleotide translocator), member 6 1,59 Cytoplasm transporter
P11166 SLC2A1 solute carrier family 2 (facilitated glucose transporter), member 1 -0,29 Plasma Membrane transporter
SNRPEL1 SNRPEL1 small nuclear ribonucleoprotein polypeptide E-like 1 -2,81 Nucleus other
P04179 SOD2 superoxide dismutase 2, mitochondrial -0,39 Cytoplasm enzyme
P05455 SSB Sjogren syndrome antigen B (autoantigen La) -1,54 Nucleus enzyme
Q9UNL2 SSR3 signal sequence receptor, gamma (translocon-associated protein gamma) 3,32 Cytoplasm other
Q12846 STX4 syntaxin 4 -2,81 Plasma Membrane transporter
P51687 SUOX sulfite oxidase -2,32 Cytoplasm enzyme
O60506 SYNCRIP synaptotagmin binding, cytoplasmic RNA interacting protein -0,95 Nucleus other
P09758 TACSTD2 tumor-associated calcium signal transducer 2 -1,06 Plasma Membrane other
Q99805 TM9SF2 transmembrane 9 superfamily member 2 1,59 Plasma Membrane transporter
Q9HC07 TMEM165 transmembrane protein 165 1,59 Plasma Membrane other
Q9NS69 TOMM22 translocase of outer mitochondrial membrane 22 homolog (yeast) 2,81 Cytoplasm transporter
P07437 TUBB tubulin, beta 0,74 Cytoplasm other
Q16881 TXNRD1 thioredoxin reductase 1 -0,85 Cytoplasm enzyme
Q9NVA1 UQCC ubiquinol-cytochrome c reductase complex chaperone -1,59 Cytoplasm other
P21796 VDAC1 voltage-dependent anion channel 1 0,53 Cytoplasm ion channel
Q86UZ6 ZBTB46 zinc finger and BTB domain containing 46 -1,59 Nucleus other
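As a reading aid for the ratio column of the table above, the following minimal sketch converts the printed values into linear fold changes. It assumes that the column holds base-2 logarithms of the smoker/nonsmoker abundance ratio; this is an assumption, suggested by values such as 1,59 and 2,32 (close to log2 of 3 and 5) but not restated in this part of the table. The function names are illustrative only and are not part of the original analysis.

```python
def parse_decimal_comma(text: str) -> float:
    """Parse a number printed with a decimal comma, e.g. '-3,59'."""
    return float(text.replace(",", "."))


def fold_change(log2_ratio: float) -> float:
    """Convert a log2 ratio into a linear ratio.

    ASSUMPTION: the tabulated value is log2(smokers / nonsmokers).
    """
    return 2.0 ** log2_ratio


# A few rows copied from the table above (symbol -> ratio as printed).
rows = {"ATP5C1": "2,59", "BLMH": "-3,59", "CDC42": "-3,00", "NDUFS3": "3,81"}

for symbol, printed in rows.items():
    ratio = parse_decimal_comma(printed)
    direction = "higher" if ratio > 0 else "lower"
    print(f"{symbol}: {fold_change(ratio):.2f}-fold ({direction} in smokers)")
```

Under this assumption a value of 2,00 corresponds to a fourfold higher abundance in smokers and -3,00 to an eightfold lower abundance.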
The top 10 significant (p<0.05, Fisher's exact test) biological functions (biofunctions)
related to diseases and disorders, along with the proteins involved in each biofunction,
were also derived from IPA and are displayed in Table IV.4. Infection mechanism,
inflammatory response and cancer were the biofunctions with the lowest p-values,
and two proteins, CD44 antigen (CD44) and CD81 antigen (CD81), were part of all
three. CD44, which was found to be overexpressed in smokers when compared to
nonsmokers, is a cell surface glycoprotein involved in cell-cell and cell-matrix
interactions through its affinity for hyaluronic acid and possibly for other ligands
such as osteopontin, collagens and matrix metalloproteinases. In the presence of
cigarette smoke in vivo or reactive oxygen species (ROS) in vitro, CD44 was reported
to mediate oxidative stress-induced mucus hypersecretion in airway epithelium from
smokers and in primary cultures of human bronchial epithelial cells [24].
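The significance of a biofunction is assessed with Fisher's exact test. A minimal sketch of the right-tailed form of that test, written as a hypergeometric tail sum, is given below. How IPA defines its reference set is not described here, so the counts in the usage example are purely hypothetical and serve only to show the calculation.

```python
from math import comb


def fisher_right_tail(k: int, n: int, K: int, N: int) -> float:
    """Right-tailed Fisher's exact test, P(X >= k), X ~ Hypergeometric(N, K, n).

    N: proteins in the reference set
    K: reference proteins annotated to the biofunction
    n: differentially expressed proteins
    k: differentially expressed proteins annotated to the biofunction
    """
    total = comb(N, n)
    upper = min(n, K)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible configurations contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# HYPOTHETICAL counts, for illustration only (not taken from the study).
p = fisher_right_tail(k=6, n=96, K=30, N=900)
print(f"p = {p:.3g}")
```

A small p-value indicates that the biofunction contains more differentially expressed proteins than would be expected by chance given the size of the reference set.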
Table IV.3: Top 5 protein interaction networks generated from proteins found to be
significantly differentially expressed between smokers and nonsmokers.

Score: 52; Focus Molecules: 24
Top Functions: Antigen Presentation, Cell-To-Cell Signaling and Interaction, Hematological System Development and Function
Molecules in Network: Actin, ARHGAP1, Beta Tubulin, Caspase, CD44, CD81, CDC42, Ck2, CLIC4, CYB5R3, Cytochrome c, DDOST, Erm, F Actin, FSH, GDI2, GOT1, HSPA9, HSPA1L, KEL, MSN, MYH10, NFkB (complex), PDXK, PEBP1, PFN1, Ras homolog, Rho gdi, RUFY1, SLC2A1, SSB, STX4, TUBB, TXNRD1, VDAC1

Score: 29; Focus Molecules: 15
Top Functions: Cell Morphology, Drug Metabolism, Molecular Transport
Molecules in Network: beta-estradiol, CD81, CKB, CKMT2, CLIC4, COX1, CYBA, CYC1, CYTB, EPHA2, FUT3, FUT7, GNS, GYG1, HOXA9, hydrogen peroxide, lipoxin A4, magnesium, MSI2, MUC5AC, PGLS, PHLDA1, PLD3, PLSCR1, RAB9A, RALA, RALBP1, RCC1 (includes EG:1104), SELENBP1, SLK, TACSTD2, TGFB1, TNF, UQCR10, UQCRH

Score: 28; Focus Molecules: 15
Top Functions: DNA Replication, Recombination, and Repair, Energy Production, Nucleic Acid Metabolism
Molecules in Network: 19S proteasome, ADAMTS14, adenosine-tetraphosphatase, AKR1A1, ATP synthase, ATP5A1, ATP5C1, ATP5D, ATP5E, ATP5O, ATP6V1B2, BLMH, COX4I1, DPP3, ECHS1, ETFA, GLOD4, IKBKE, KLK2, MCCC2, NAPA, NCSTN, peptidase, Proteasome PA700/20s, PSMB6, PSMC, PSMC1, PSMD1, retinoic acid, RPL6, RPL11, RPL14, SERPINB6, SLC2A4, SYNCRIP

Score: 24; Focus Molecules: 13
Top Functions: Cell-To-Cell Signaling and Interaction, Cellular Assembly and Organization, Cellular Movement
Molecules in Network: ACP1, Akt, ALCAM, Ap1, CPT1A, DEFA1 (includes EG:1667), EIF2S1, ERK, ERK1/2, ERRFI1, FUT7, GEFT, GNB2L1, GSS, HK1, HP, Hsp70, IL1, Insulin, Integrin, LOC290704, Mapk, MTCH2, NPR1, P38 MAPK, Pdgf, PDGF BB, Pdgfr, Pdgfra-Pdgfrb, PI3K, Pkc(s), PP2A, PVR, SLC25A6, SOD2

Score: 24; Focus Molecules: 13
Top Functions: Carbohydrate Metabolism, Hepatic System Development and Function, Small Molecule Biochemistry
Molecules in Network: ALDH5A1, BAT2, BPHL, BRF2, EIF2C1, EIF2C4, EPPK1, F7, GSTK1, HNF4A, KCNN2, malate dehydrogenase (oxaloacetate-decarboxylating) (NADP), ME2, ME3, MIR18A (includes EG:406953), MIR293 (includes EG:100049714), MIR34A (includes EG:407040), MOD2, MRPS28, NEDD8, NEIL2, PHB2, PINX1, PTP4A3, RAB11A, SARS2, SLC25A13, SSR3, TM9SF2, TMEM165, UQCC, USP15, WDR8 (includes EG:49856), YBX1
Table IV.4: Top 10 significant biofunctions in disease and disorders observed in
differentially expressed proteins of smokers when compared to nonsmokers.
Category p-value Molecules
Infection Mechanism 2,52E-03-2,89E-02 CD81,CD44,SSB
Inflammatory Response 2,52E-03-3,58E-02 CD81,HP,SOD2,CDC42,CD44,STX4
Cancer 5,84E-03-4,58E-02 CD81,PEBP1,SSR3,SOD2,SLC2A1,DDOST,CD44,TM9SF2,KEL,FUT7,SERPINB6,SSB
Cardiovascular Disease 5,84E-03-4,58E-02 BLMH,PFN1,NPR1,CD44
Dermatological Diseases and Conditions 5,84E-03-5,84E-03 BLMH
Developmental Disorder 5,84E-03-5,84E-03 CD44
Genetic Disorder 5,84E-03-4,8E-02 CD81,MYH10,GNB2L1,UQCC,TM9SF2,FUT7,TUBB,GSS,SSR3,SLC25A6,SOD2,DPP3,C20ORF114,null,GOT1,PDXK,CPT1A,SLC2A1,ATP5A1,DDOST,ATP5C1,NPR1,ZBTB46,ACP1,ME2,ALCAM,TMEM165,CYC1,VDAC1,GNS,SUOX,MCCC2,PEBP1,CDC42,ADAMTS14,CKAP4,EIF2S1,CLIC4,PSMB6,SELENBP1,NDUFS2,MUC5AC,ALDH5A1,COX4I1,TACSTD2,SLC25A13,BLMH,HSPH1,GDI2,MSI2,NDUFS3,PSMC1,HP,CD44
Immunological Disease 5,84E-03-4,52E-02 CD81,FUT7,GSS
Inflammatory Disease 5,84E-03-3,46E-02 SOD2,BLMH,ADAMTS14,CD44,MUC5AC,TUBB,SELENBP1
Metabolic Disease 5,84E-03-4,58E-02 SLC25A13,SOD2,CPT1A,NDUFS2,GNS,SUOX,ALDH5A1,MCCC2
Figure IV.5: Top protein network as obtained from Ingenuity Pathway Analysis.
Mucin-5AC (MUC5AC) is a gel-forming glycoprotein of the respiratory tract epithelia
that protects the mucosa from infection and chemical damage by binding to inhaled
microorganisms and particles. Together with MUC5B, it was one of the proteins
responsible for the reported mucus hypersecretion, which suggests that CD44 is
involved in ROS-induced MUC5AC expression [24]. In fact, in the present work,
MUC5AC was also found to be overexpressed in the smoker group. CD44 was also
reported to be involved in the activation of the NFkB complex, which is important in
regulating the cellular response to harmful stimuli such as ROS [25]. This interaction
is shown in the network displayed in Figure IV.5. CD44 also participates in other
aspects of the inflammatory response. The inflammatory response can be viewed as
part of an innate tissue repair process following an insult, such as invading pathogens
or cells damaged by cigarette smoking. The ultimate fate of many recruited
inflammatory cells is death, whether apoptotic or necrotic. Phagocytic removal of
apoptotic cells by macrophages is an immunologically silent process that does not
provoke the release of pro-inflammatory mediators [26]. Conversely, failure to remove
apoptotic inflammatory cells can result in secondary necrosis and the consequent
release of their toxic granule contents, causing further tissue damage and exacerbating
the inflammatory response [27-29]. Hence, removal of apoptotic cells by phagocytosis
determines whether the inflammatory response resolves successfully [27-29].
Phagocytosis is controlled by a variety of factors, including cell surface molecules such
as CD44 [28, 30].
Cigarette smoke is reported to contain a high concentration of oxidants that lead to
oxidative stress, which in turn leads to mitochondrial dysfunction and DNA damage
[7, 8, 31]. Among the significantly differentially expressed proteins, four are related
to oxidative stress or to the oxidative stress response mediated by Nrf2: superoxide
dismutase 2, mitochondrial (SOD2), aldo-keto reductase family 1 member A1
(AKR1A1), thioredoxin reductase 1, cytoplasmic (TXNRD1) and glutathione
synthetase (GSS). SOD2 is a mitochondrial antioxidant enzyme that detoxifies
superoxide anion radicals, a byproduct of mitochondrial respiration. SOD2 was found
to be underexpressed in smokers, which is consistent with what has recently been
reported [32]. The same trend was observed for GSS and for AKR1A1 and TXNRD1,
which, like SOD2, are part of the oxidative stress response mediated by Nrf2. GSS
catalyzes the second step of glutathione (GSH) biosynthesis; GSH in turn is important
for a variety of biological functions, including protection of cells from oxidative
damage by free radicals and detoxification of xenobiotics [33-35]. Bleomycin also
triggers excess production of ROS and DNA damage in the lung [36], and bleomycin
hydrolase (BLMH), an enzyme whose only known function is the metabolic
inactivation of bleomycin, was found to be underexpressed in smokers. This may
contribute to the action of bleomycin, which has long been associated with pulmonary
fibrosis and lung emphysema in the presence of cigarette smoke [36-39]. Since the
oxidative balance is impaired by the elevated levels of ROS present in smokers, the
antioxidant response was expected to increase in compensation. The lower levels of
SOD2, AKR1A1, TXNRD1 and GSS found in smokers may therefore have a serious
impact on the cellular environment, as reactions between cellular components and
ROS lead to DNA damage, mitochondrial dysfunction, cell membrane damage and
cell death [7, 8, 31]. Damaged DNA-binding
protein 1 (DDB1), which plays a key role in the normal cell cycle and in the response
to DNA damage, was also found to be underexpressed in the smoker group [40]. Cell
division cycle 42 (CDC42) is a small GTPase of the Rho subfamily that is required for
the establishment of the apical-basal axis in epithelial cells and in differentiating
neurons [41-43]. It controls epithelial tissue morphogenesis by regulating spindle
orientation during cell division [44] and modulates cell adhesion and polarity during
embryonic morphogenesis by regulating the trafficking of key cell junction proteins
[45, 46]. CDC42 was also found to be underexpressed in smokers. According to IPA,
eleven proteins were related to the mitochondrial dysfunction canonical pathway,
including cytochrome c1 (CYC1), cytochrome c oxidase subunit IV isoform 1
(COX4I1) and cytochrome b5 reductase 3 (CYB5R3), all overexpressed in the smoker
group. The full list of proteins, including differential expression values, can be found
in Supplemental Table 6, Supplemental Information. Cigarette smoke, as a mixture of
a large number of substances, is expected to influence drug metabolism. In fact, 11 of
the differentially expressed proteins were found to be related to drug metabolism.
Information on the identity of these proteins and their differential expression, and on
the type of action of each protein within drug metabolism, can be found in
Supplemental Tables IV.7A and IV.7B, Supporting Information, respectively.
CONCLUSION
Cigarette smoking is the leading preventable cause of sickness and death worldwide.
Chronic cigarette smoking contributes to cardiovascular diseases, oral cancers and
even ocular diseases, but the respiratory tract is the most affected, and thus cigarette
smoke is an important risk factor for lung diseases such as lung cancer, tuberculosis or
chronic obstructive pulmonary disease. This is a pioneering work, as the proteome of
nasal epithelial cells obtained from smokers is revealed for the first time and
compared to that of nonsmokers. Moreover, samples were analyzed on a high-
resolution mass spectrometer, which generated over 900 protein identifications based
on two or more peptides. Ninety-six proteins were found to be differentially expressed
between healthy smokers and nonsmokers; these were related to processes of antigen
presentation, cell-to-cell signaling and interaction, cell morphology, drug metabolism,
DNA repair, energy production or mitochondrial dysfunction. Although further
orthogonal validation is required, our data were consistent with previous evidence
showing differential modulation of CD44, MUC5AC or SOD2 in smokers due to
inflammatory response pathways. In addition, the data presented here may provide
new insights into processes such as drug metabolism, energy production or
mitochondrial dysfunction, shedding light on proteins that had never been associated
with cigarette smoke.
ACKNOWLEDGEMENTS
We thank all volunteers for their cooperation in this work. Work partially supported
by FCT-FEDER POCI/SAU-MMO/56163/2004, FCT/Poly-Annual Funding Program and
FEDER-Saúde XXI Program (Portugal). BMA is the recipient of an FCT doctoral
fellowship (SFRH/BD/31415/2006).
REFERENCES
[1] European Respiratory Society and European Lung Foundation 2003.
[2] in: Schraufnagel, D. E. (Ed.), American Thoracic Society 2010.
[3] Eriksen, D. J. M. D. M. (Ed.), The Tobacco Atlas, World Health Organization 2002.
[4] The world health report 2008: primary health care now more than ever, World
Health Organization 2008.
[5] Tobacco: a major international health hazard. Proceedings of an international
meeting. Moscow, 4-6 June 1985. IARC Sci Publ 1986, 1-319.
[6] Church, D. F., Pryor, W. A., Free-radical chemistry of cigarette smoke and its
toxicological implications. Environ Health Perspect 1985, 64, 111-126.
[7] Rahman, I., Biswas, S. K., Kode, A., Oxidant and antioxidant balance in the airways
and airway diseases. European journal of pharmacology 2006, 533, 222-239.
[8] Faux, S. P., Tai, T., Thorne, D., Xu, Y., et al., The role of oxidative stress in the
biological responses of lung epithelial cells to cigarette smoke. Biomarkers 2009, 14
Suppl 1, 90-96.
[9] Shaykhiev, R., Bals, R., Interactions between epithelial cells and leukocytes in
immunity and tissue homeostasis. Journal of leukocyte biology 2007, 82, 1-15.
[10] Pahl, A., Preclinical modelling using nasal epithelial cells for the evaluation of
herbal extracts for the treatment of upper airway diseases. Planta Med 2008, 74, 693-
696.
[11] Beck, S., Penque, D., Garcia, S., Gomes, A., et al., Cystic fibrosis patients with the
3272-26A-->G mutation have mild disease, leaky alternative mRNA splicing, and CFTR
protein at the cell membrane. Hum Mutat 1999, 14, 133-144.
[12] Roxo-Rosa, M., da Costa, G., Luider, T. M., Scholte, B. J., et al., Proteomic analysis
of nasal cells from cystic fibrosis patients and non-cystic fibrosis control individuals:
search for novel biomarkers of cystic fibrosis lung disease. Proteomics 2006, 6, 2314-
2325.
[13] Gomes-Alves, P., Imrie, M., Gray, R. D., Nogueira, P., et al., SELDI-TOF biomarker
signatures for cystic fibrosis, asthma and chronic obstructive pulmonary disease. Clin
Biochem, 43, 168-177.
[14] McDougall, C. M., Blaylock, M. G., Douglas, J. G., Brooker, R. J., et al., Nasal
epithelial cells as surrogates for bronchial epithelial cells in airway inflammation
studies. American journal of respiratory cell and molecular biology 2008, 39, 560-568.
[15] Penque, D., Mendes, F., Beck, S., Farinha, C., et al., Cystic fibrosis F508del patients
have apically localized CFTR in a reduced number of airway cells. Laboratory
investigation; a journal of technical methods and pathology 2000, 80, 857-868.
[16] Marengo, E., Robotti, E., Bobba, M., Gosetti, F., The principle of exhaustiveness
versus the principle of parsimony: a new approach for the identification of biomarkers
from proteomic spot volume datasets based on principal component analysis.
Analytical and bioanalytical chemistry, 397, 25-41.
[17] McDonald, J. H., Handbook of Biological Statistics, Sparky House Publishing,
Baltimore, MD 2009.
[18] Ferguson, R. E., Carroll, H. P., Harris, A., Maher, E. R., et al., Housekeeping
proteins: a preliminary study illustrating some limitations as useful references in
protein expression studies. Proteomics 2005, 5, 566-571.
[19] Di, Y. P., Harper, R., Zhao, Y., Pahlavan, N., et al., Molecular cloning and
characterization of spurt, a human novel gene that is retinoic acid-inducible and
encodes a secretory protein specific in upper respiratory tracts. The Journal of
biological chemistry 2003, 278, 1165-1173.
[20] Ozawa, K., Kuwabara, K., Tamatani, M., Takatsuji, K., et al., 150-kDa oxygen-
regulated protein (ORP150) suppresses hypoxia-induced apoptotic cell death. The
Journal of biological chemistry 1999, 274, 6397-6404.
[21] Eisen, M. B., Spellman, P. T., Brown, P. O., Botstein, D., Cluster analysis and display
of genome-wide expression patterns. Proceedings of the National Academy of Sciences
of the United States of America 1998, 95, 14863-14868.
[22] Ross, D. T., Scherf, U., Eisen, M. B., Perou, C. M., et al., Systematic variation in
gene expression patterns in human cancer cell lines. Nat Genet 2000, 24, 227-235.
[23] Lan, M. Y., Ho, C. Y., Lee, T. C., Yang, A. H., Cigarette smoke extract induces
cytotoxicity on human nasal epithelial cells. Am J Rhinol 2007, 21, 218-223.
[24] Casalino-Matsuda, S. M., Monzon, M. E., Day, A. J., Forteza, R. M., Hyaluronan
fragments/CD44 mediate oxidative stress-induced MUC5B up-regulation in airway
epithelium. American journal of respiratory cell and molecular biology 2009, 40, 277-
285.
[25] Lieberman, L. A., Hunter, C. A., Regulatory pathways involved in the infection-
induced production of IFN-gamma by NK cells. Microbes Infect 2002, 4, 1531-1538.
[26] Meagher, L. C., Savill, J. S., Baker, A., Fuller, R. W., Haslett, C., Phagocytosis of
apoptotic neutrophils does not induce macrophage release of thromboxane B2.
Journal of leukocyte biology 1992, 52, 269-273.
[27] Ward, C., Dransfield, I., Chilvers, E. R., Haslett, C., Rossi, A. G., Pharmacological
manipulation of granulocyte apoptosis: potential therapeutic targets. Trends in
pharmacological sciences 1999, 20, 503-509.
[28] Kirkham, P. A., Spooner, G., Rahman, I., Rossi, A. G., Macrophage phagocytosis of
apoptotic neutrophils is compromised by matrix proteins modified by cigarette smoke
and lipid peroxidation products. Biochemical and biophysical research communications
2004, 318, 32-37.
[29] Haslett, C., Granulocyte apoptosis and its role in the resolution and control of lung
inflammation. American journal of respiratory and critical care medicine 1999, 160, S5-
11.
[30] Hart, S. P., Dougherty, G. J., Haslett, C., Dransfield, I., CD44 regulates phagocytosis
of apoptotic neutrophil granulocytes, but not apoptotic lymphocytes, by human
macrophages. J Immunol 1997, 159, 919-925.
[31] Sies, H., Oxidative stress: oxidants and antioxidants. Exp Physiol 1997, 82, 291-295.
[32] Russo, M., Cocco, S., Secondo, A., Adornetto, A., et al., Cigarette smoke
condensate causes a decrease of the gene expression of Cu-Zn superoxide dismutase,
Mn superoxide dismutase, glutathione peroxidase, catalase, and free radical-induced
cell injury in SH-SY5Y human neuroblastoma cells. Neurotox Res 2011, 19, 49-54.
[33] Meister, A., Anderson, M. E., Glutathione. Annual review of biochemistry 1983, 52,
711-760.
[34] Brown, L. A., Glutathione protects signal transduction in type II cells under oxidant
stress. The American journal of physiology 1994, 266, L172-177.
[35] Njalsson, R., Norgren, S., Physiological and pathological aspects of GSH
metabolism. Acta Paediatr 2005, 94, 132-137.
[36] Cho, H. Y., Kleeberger, S. R., Nrf2 protects against airway disorders. Toxicology and
applied pharmacology, 244, 43-56.
[37] Takada, K., Takahashi, K., Sato, S., Yasui, S., Cigarette smoke modifies bleomycin-
induced lung injury to produce lung emphysema. Tohoku J Exp Med 1987, 153, 137-
144.
[38] Decologne, N., Wettstein, G., Kolb, M., Margetts, P., et al., Bleomycin induces
pleural and subpleural fibrosis in the presence of carbon particles. Eur Respir J, 35,
176-185.
[39] Cisneros-Lira, J., Gaxiola, M., Ramos, C., Selman, M., Pardo, A., Cigarette smoke
exposure potentiates bleomycin-induced lung fibrosis in guinea pigs. American journal
of physiology 2003, 285, L949-956.
[40] Lv, X. B., Xie, F., Hu, K., Wu, Y., et al., Damaged DNA-binding protein 1 (DDB1)
interacts with Cdh1 and modulates the function of APC/CCdh1. The Journal of
biological chemistry 2010, 285, 18234-18240.
[41] Cappello, S., Attardo, A., Wu, X., Iwasato, T., et al., The Rho-GTPase cdc42
regulates neural progenitor fate at the apical surface. Nat Neurosci 2006, 9, 1099-
1107.
[42] Etienne-Manneville, S., Hall, A., Rho GTPases in cell biology. Nature 2002, 420,
629-635.
[43] Florian, M. C., Geiger, H., Concise review: polarity in stem cells, disease, and aging.
Stem cells (Dayton, Ohio), 28, 1623-1629.
[44] Jaffe, A. B., Kaji, N., Durgan, J., Hall, A., Cdc42 controls spindle orientation to
position the apical surface during epithelial morphogenesis. The Journal of cell biology
2008, 183, 625-633.
[45] Georgiou, M., Marinari, E., Burden, J., Baum, B., Cdc42, Par6, and aPKC regulate
Arp2/3-mediated endocytosis to control local adherens junction stability. Curr Biol
2008, 18, 1631-1638.
[46] Duncan, M. C., Peifer, M., Regulating polarity by directing traffic: Cdc42 prevents
adherens junctions from crumblin' aPart. The Journal of cell biology 2008, 183, 971-
974.
Proteomic profiling of nasal epithelial cells in chronic obstructive pulmonary disease
Bruno M. Alexandre1, Brian L. Hood2, Mai Sun2,
Thomas P. Conrads2* and Deborah Penque1*
1Laboratório de Proteómica, Departamento de Genética, Instituto Nacional de Saúde
Dr. Ricardo Jorge (INSA-IP), Av. Padre Cruz 1649-016 Lisboa, Portugal and the
2Department of Pharmacology & Chemical Biology, University of Pittsburgh Cancer
Institute, University of Pittsburgh
Keywords: Nasal epithelial cells, nasal brushing, chronic obstructive pulmonary
disease, lung, tobacco, cigarette smoke, proteomics.
*Corresponding authors: Deborah Penque, Ph.D., Laboratório de Proteómica,
Departamento de Genética, Edifício INSA II, Instituto Nacional de Saúde Dr. Ricardo
Jorge, INSA, I.P., Avenida Padre Cruz, 1649-016 Lisboa, Portugal, Tel: +351 21750 8137,
Fax: +351 21752 6410, E-mail: [email protected] and Thomas P.
Conrads, Ph.D., 204 Craft Avenue, Suite B401, Pittsburgh, PA, 15213, Tel: 412-641-
7556, Fax: 412-641-2356, E-mail: [email protected] and
ABSTRACT
Chronic obstructive pulmonary disease (COPD), a chronic lung disease, is the fourth
leading cause of death in the world. COPD is primarily characterized by the presence
of airflow limitation resulting from inflammation and remodeling of small airways and
is often associated with lung parenchymal destruction or emphysema.
Fresh nasal epithelial cells collected by a noninvasive brushing technique have been
used as surrogates of lower airway cells in the investigation of respiratory diseases.
Here, for the first time, a broad proteomic profiling of nasal epithelial cells from COPD
patients in comparison with healthy smokers and nonsmokers was performed using a
combination of 1D-PAGE and high-resolution liquid chromatography-tandem mass
spectrometry. About 1173 proteins were confidently identified by at least two
peptides and compared across conditions. Functional characterization by Ingenuity
Pathway Analysis (IPA) revealed that about 40% of the significantly differentially
expressed proteins in COPD nasal epithelial cells are related to cancer. The data also
revealed that the unfolded protein response (UPR) is activated in COPD nasal
epithelial cells, confirming previous evidence of the UPR in COPD. The upregulation of
proteins related to drug metabolism and to the oxidative stress response, in particular
the Nrf2-mediated oxidative stress response, was also observed. Further validation of
these data by orthogonal methods will emphasize the value of using native nasal
epithelial cells by revealing the primary molecular networks/pathways associated with
COPD pathogenesis.
INTRODUCTION
Chronic obstructive pulmonary disease (COPD) is primarily characterized by the
presence of airflow limitation resulting from inflammation and remodeling of small
airways and is often associated with lung parenchymal destruction or emphysema [1].
Additionally, it has been recognized that COPD extends beyond the lung and that many
patients have several systemic manifestations that can further impair functional
capacity and health-related quality of life [2, 3]. COPD is a major cause of chronic
morbidity and mortality throughout the world. Many people suffer from this disease
for years and die prematurely from it or its complications. COPD is the fourth leading
cause of death worldwide, and its morbidity and mortality are expected to rise as
populations age and mortality from cardiovascular and infectious diseases falls [1].
Among respiratory diseases, COPD is the leading cause of lost work days. In the United
States of America, medical costs attributed to COPD were estimated at $32.1 billion [4].
In the European Union, productivity losses are estimated to total €28.5 billion
annually [5]. These figures show that the total costs associated with the disease,
including indirect ones, are substantial.
Cigarette smoke is the most commonly encountered risk factor for COPD. Cigarette
smokers have a higher prevalence of respiratory symptoms and lung function
abnormalities, a greater rate of decline in forced expiratory volume in the first second
(FEV1), and higher death rates from COPD than nonsmokers [1, 5]. A 25-year follow-up
study of the general population concluded that 92% of COPD deaths occurred in
subjects who were current smokers at the beginning of the follow-up period and that,
after 25 years of smoking, at least 25% of smokers without initial disease will develop
clinically significant COPD and 30-40% will have COPD [6].
The main functions of the nose are the sense of smell, the regulation of the humidity
and temperature of inhaled air, and the removal of large particulates from it. Once it
enters the nasal vestibule, inhaled air is forced to pass through the nasal valve and
then expands as it travels further into the nasal cavity, which offers little airflow
resistance. This sudden change in speed and pressure produces turbulence and eddies
[7]. These currents allow adequate contact of inhaled air with the respiratory
epithelium due to the presence of three shelf-like projections, the conchae or
turbinates. The inferior turbinate has a central osseous core surrounded by lamina
propria covered with pseudostratified ciliated columnar (respiratory) epithelium
resting on a thick basement membrane [7]. The respiratory epithelium is composed of
four types of cells, namely nonciliated and ciliated columnar cells, basal cells and
goblet cells. The nasal epithelium has been regarded primarily as a physical barrier,
but recent evidence strongly supports that epithelial cells are metabolically quite
active and capable of modulating a variety of inflammatory processes and immune
responses [8, 9].
Culture conditions of airway epithelial cells, as well as their proliferation and
immortalization, may influence their protein expression levels and therefore modify
cellular processes. In the present work, fresh epithelial cells were collected by nasal
brushing. This technique has been shown to yield numerous well-preserved
dissociated cells that are representative of the human superficial respiratory mucosa
[10]. Our group has already applied proteomic approaches to nasal epithelial cells to
describe the proteomic profile of other chronic lung diseases, such as the monogenic
disease cystic fibrosis, by means of two-dimensional gel electrophoresis and surface-
enhanced laser desorption/ionization time-of-flight (SELDI-TOF) MS [11, 12], but no
studies on COPD have been reported so far. Nasal epithelial cells were also reported
to constitute an accessible surrogate for studying lower airway inflammation [13].
Identification of disease-specific, severity-related biomarkers is an essential step for
diagnosis and monitoring of therapeutics in COPD patients. In the past decade, sample
preparation techniques have evolved greatly, but it was mass spectrometry (MS), with
new capabilities such as high resolution and high mass accuracy, that provided
proteomics with effective means for high-throughput, comprehensive, comparative
examination of protein expression in healthy and disease states. Proteomics is now
widely used and contributes to a better understanding of biological processes, which
ultimately leads to the discovery of new biomarkers. In the present work, four groups
were constituted: healthy nonsmokers, healthy smokers, and two groups of COPD
patients, mild and severe. Protein extracts derived from nasal epithelial cells collected
by nasal brushing were separated by 1D-PAGE and then by reversed-phase liquid
chromatography prior to MS analysis on a high-resolution mass spectrometer.
MATERIALS AND METHODS
Individuals and Sample Collection
The study was approved by the Ethics Committee of Hospital de Santa Maria, Lisbon
and Instituto Nacional de Saude Dr. Ricardo Jorge (INSA-IP), Lisbon. After informed
consent, nasal epithelial cells were collected by nasal brushing as previously described
[10, 14], from healthy nonsmokers (n=10), healthy cigarette smokers (n=8), and COPD
mild (n=9) and COPD severe patients (n=6). Lung function was evaluated by means of
spirometry and, according to the Global Initiative for Chronic Obstructive Lung Disease
(GOLD) guidelines, FEV1/FVC < 0.7 was taken as the criterion for airflow
obstruction [1]. Each group was split into two to obtain two biological replicates. The main
characteristics of each of the biological replicates are displayed in Table V.1.
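The fixed-ratio criterion quoted above can be written as a one-line check. The following is only a minimal sketch: the function name and the example volumes are hypothetical and are not taken from the study data.

```python
def is_obstructed(fev1_l: float, fvc_l: float, threshold: float = 0.7) -> bool:
    """Return True when the FEV1/FVC ratio falls below the fixed threshold.

    Both volumes are expected in the same unit (e.g. litres).
    """
    if fvc_l <= 0:
        raise ValueError("FVC must be positive")
    return fev1_l / fvc_l < threshold


# Hypothetical spirometry readings, for illustration only.
print(is_obstructed(1.8, 3.0))  # ratio 0.60
print(is_obstructed(3.2, 4.0))  # ratio 0.80
```

Note that the tables report FEV1/FVC as a percentage, so a value of 65 there corresponds to a ratio of 0.65 in this sketch.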
Table V.1: Demographics of biological replicates.
Group | n (group) | Biological replicate | n (replicate) | Age (y) | FVC (%) | FEF (%) | FEV1 (%) | FEV1/FVC (%)
Nonsmokers | 10 | 1 | 5 | 54 ± 2.9 | 97 ± 20.5 | 81 ± 17.9 | 98 ± 19.7 | 86 ± 4.1
Nonsmokers | 10 | 2 | 5 | 53 ± 3.6 | 98 ± 10.4 | 84 ± 17.6 | 98 ± 13.8 | 88 ± 11.8
Smokers | 8 | 1 | 4 | 52 ± 7.7 | 112 ± 19.5 | 68 ± 6.5 | 108 ± 15.6 | 79 ± 3.1
Smokers | 8 | 2 | 4 | 50 ± 1.7 | 103 ± 18.9 | 76 ± 24.0 | 98 ± 21.9 | 80 ± 1.9
COPD Mild | 9 | 1 | 4 | 67 ± 11.8 | 97 ± 11.4 | 45 ± 24.4 | 81 ± 20.9 | 65 ± 12.5
COPD Mild | 9 | 2 | 5 | 66 ± 9.8 | 122 ± 33.3 | 47 ± 13.2 | 102 ± 23.6 | 67 ± 7.6
COPD Severe | 6 | 1 | 3 | 64 ± 6.6 | 63 ± 21.9 | 7 ± 2.3 | 24 ± 4.9 | 35 ± 12.1
COPD Severe | 6 | 2 | 3 | 64 ± 3.8 | 106 ± 50.1 | 33 ± 21.8 | 60 ± 49.9 | 40 ± 15.9
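Table V.2: Smoking history of biological replicates.
Group | n (group) | Biological replicate | n (replicate) | Smoking history (y) | Cigarettes per day
Nonsmokers | 10 | 1 | 5 | - | -
Nonsmokers | 10 | 2 | 5 | - | -
Smokers | 8 | 1 | 4 | 31 ± 12.2 | 21 ± 6.3
Smokers | 8 | 2 | 4 | 29 ± 4.2 | 15 ± 4.1
COPD Mild | 9 | 1 | 4 | 34 ± 9.7 | 35 ± 12.9
COPD Mild | 9 | 2 | 5 | 27 ± 8.2 | 34 ± 16.7
COPD Severe | 6 | 1 | 3 | 39 ± 3.2 | 37 ± 11.5
COPD Severe | 6 | 2 | 3 | 41 ± 9.0 | 33 ± 25.7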
Except for the individuals included in the nonsmoker group, all subjects were required to
have a smoking history of at least 20 years, smoking a minimum of 10 cigarettes per
day. Information on smoking history can be found in Table V.2. Cell suspensions from
each individual were cytospun onto a microscopy slide, stained with
May-Grünwald-Giemsa (MGG) and examined for evidence of epithelial cells (ciliated, goblet
and basal cells) and for red blood cell contamination (Figure IV.1). Individuals whose
cell preparation was contaminated were removed from the study.
Nasal Epithelial Cells Lysis
Cell suspensions were centrifuged after collection and pelleted cells were resuspended
in 10 mM Tris-Cl pH 7.6, 1 mM EDTA, containing protease inhibitors.
Cells were lysed by intermittent sonication (10 cycles of a 10 s pulse followed
by a 30 s pause on dry ice). Lysates were centrifuged twice at 2000 × g for 3 min at 4 °C
to discard any unlysed cells or cell debris. Before storage at -80 °C, a 10 μL aliquot
from each individual was removed to perform a BCA protein assay (Pierce, Rockford, IL,
USA).
Sample Preparation for LC-MS/MS
Two biological replicates were constituted within each of the groups under analysis
(Table V.1). Each biological replicate containing 30 μg of total cell lysate was spiked
with 3 pmol of chicken ovalbumin. Each sample was loaded in duplicate lanes
on a 4-12% Bis-Tris 1D SDS-PAGE gel (NuPAGE, Invitrogen, Carlsbad, CA) and
electrophoresed for approximately 10 min at a constant voltage of 150 V. Gels were
stained with Coomassie blue (SimplyBlue SafeStain, Invitrogen), and each pair of bands
belonging to the same sample was excised, sliced into small pieces and pooled
in the same tube. Gel slices were destained in 50% acetonitrile (AcN) and
50 mM ammonium bicarbonate (AMB) overnight at 4 °C and for a further hour the
next morning. Fully destained gel slices were dehydrated in 100% AcN. Gel slices were
then rehydrated in 25 mM AMB containing 20 µg/mL porcine sequencing-grade
modified trypsin (Promega, Madison, WI) on ice for 45 min. This solution was
discarded, 25 mM AMB was added to the gel slices, and the slices were incubated
overnight at 37 °C. Tryptic peptides were extracted with 70% AcN and 5% formic acid
(FA) and dried by vacuum centrifugation. Each digest was resuspended in 60 µL of 0.1%
trifluoroacetic acid (TFA).
Proteomic analysis by liquid chromatography-tandem mass spectrometry
Peptide digests were resolved by nanoflow reverse-phase liquid chromatography
(Ultimate 3000, Dionex Inc.) coupled online via electrospray ionization to a hybrid
linear ion trap-Orbitrap mass spectrometer (LTQ-Orbitrap, ThermoFisher Scientific,
Inc., San Jose, CA). Five injections of 2 µL of peptide extracts corresponding to 1 μg
total protein were resolved on 100 μm i.d. by 360 μm o.d. by 200 mm long fused silica
capillary columns (Polymicro Technologies, Phoenix, AZ) slurry-packed in-house with 5
μm, 300 Å pore size C-18 silica-bonded stationary phase (Jupiter, Phenomenex,
Torrance, CA). After sample injection, peptides were eluted from the column using a
linear gradient of 2% mobile phase B (100% AcN and 0.1% formic acid) to 40% mobile
phase B over 125 min at a constant flow rate of 200 nL/min followed by a column wash
consisting of 95% B for an additional 30 min at a constant flow rate of 400 nL/min. The
LTQ-Orbitrap MS was configured to collect high resolution (R=60,000 at m/z 400)
broadband mass spectra (m/z 375-1800) from which the thirteen most abundant
peptide molecular ions dynamically determined from the MS scan were selected for
tandem MS using a relative CID energy of 30%. Dynamic exclusion was utilized to
minimize redundant selection of peptides for CID.
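As a quick consistency check on the loading and gradient figures given above, the arithmetic can be sketched as follows. This is an illustration only: the function names are hypothetical, complete peptide recovery from the gel is assumed, and the 95% B column wash is not modeled.

```python
def injected_mass_ug(total_ug: float, resuspension_ul: float, injection_ul: float) -> float:
    """Protein-equivalent mass per injection, assuming complete recovery."""
    return total_ug / resuspension_ul * injection_ul


def percent_b(t_min: float, start: float = 2.0, end: float = 40.0,
              duration_min: float = 125.0) -> float:
    """Mobile phase B (%) at t_min minutes into the linear gradient.

    Times before or after the ramp are clamped to its start or end value.
    """
    if t_min <= 0:
        return start
    if t_min >= duration_min:
        return end
    return start + (end - start) * t_min / duration_min


# 30 ug of lysate resuspended in 60 uL, 2 uL injected per run.
print(injected_mass_ug(30, 60, 2))   # 1.0 ug per injection
# Halfway through the 125 min gradient.
print(percent_b(62.5))               # 21.0 %B
```

The first result agrees with the 1 μg per injection stated in the text; the gradient slope works out to 38/125, or about 0.3% B per minute.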
Peptide Identification and Spectral Count Analysis
Peptide identifications were obtained by searching the LC-MS/MS data utilizing
SEQUEST (Thermo Scientific BioWorks 3.2) on a 72 node Beowulf cluster against a
UniProt-derived human proteome database (version 03/10) obtained from the
European Bioinformatics Institute (EBI). Search parameters consisted of enzyme:
trypsin (KR); enzyme limits: full enzymatic cleavage at both ends; missed cleavage
sites: 2; peptide tolerance: 20 ppm; fragment ion tolerance: 0.5 amu; and a variable
modification on methionine of +15.99492 Da (oxidation). Resulting peptide identifications were
filtered according to specific SEQUEST scoring criteria: delta correlation (ΔCn) ≥ 0.08
and charge state-dependent cross correlation (Xcorr) ≥ 1.9 for [M+H]1+ (mass+proton),
≥ 2.2 for [M+2H]2+, ≥ 3.5 for [M+3H]3+ and ≥ 3.0 for [M+4H]4+. Differences in
protein abundance between the samples were derived by spectral counting (SC).
Peptides whose sequence mapped to multiple protein isoforms were grouped as per
the principle of parsimony [15]. A value of 0.5 was added to each spectral count value
prior to log2 transformation to enable ratio values to be calculated for proteins
identified in one group, but not another [16]. Proteins which exhibited a >95%
confidence interval from the mean for each comparison performed were considered
statistically significant.
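The filtering thresholds and the pseudocount-based ratio described above can be sketched in a few lines. This is a minimal illustration, not the pipeline actually used: the function names are hypothetical, and rejecting charge states other than 1+ to 4+ is an assumption, since the text does not say how they were handled.

```python
import math

# Minimum Xcorr per precursor charge state, as listed in the text.
XCORR_MIN = {1: 1.9, 2: 2.2, 3: 3.5, 4: 3.0}
DELTA_CN_MIN = 0.08


def passes_filter(charge: int, xcorr: float, delta_cn: float) -> bool:
    """Apply the charge-dependent Xcorr and the delta-Cn cutoffs."""
    threshold = XCORR_MIN.get(charge)
    if threshold is None:
        return False  # assumption: unlisted charge states are rejected
    return xcorr >= threshold and delta_cn >= DELTA_CN_MIN


def log2_ratio(sc_a: float, sc_b: float, pseudocount: float = 0.5) -> float:
    """log2 ratio of two spectral counts after adding the pseudocount.

    The pseudocount keeps the ratio finite when a protein was
    identified in one group but not in the other.
    """
    return math.log2(sc_a + pseudocount) - math.log2(sc_b + pseudocount)


print(passes_filter(2, 2.5, 0.10))   # doubly charged peptide above both cutoffs
print(log2_ratio(7, 0))              # protein seen in group A only
```

With a pseudocount of 0.5, a protein with 7 spectra in one group and none in the other yields log2(7.5/0.5) = log2(15) rather than an undefined ratio.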
Bioinformatic analyses
Uniprot accessions corresponding to proteins identified by at least two peptides were
mapped to HUGO (HGNC) gene symbols utilizing Ingenuity Pathway Analysis (IPA)
(Ingenuity® Systems, www.ingenuity.com). Accessions which failed to map were
converted to IPI identifiers with the mapping utility available at www.uniprot.org and
remapped to IPA to maximize protein identifications available for downstream
bioinformatic analyses. Protein localization and subtype assignments were derived
from IPA-mapped data sets. Functional analysis of significant protein lists was
performed utilizing the “Core Analysis” function in IPA using default parameters
(p<0.05, Fisher’s exact test). Network and protein interaction analyses were also
performed utilizing IPA for significant proteins, allowing a maximum of 35 proteins per
network assignment. ProteinCenter (Thermo Fisher Scientific) was used
to retrieve information on gene ontology terms to annotate proteins according to
cellular component, biological process and molecular function using default
parameters.
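For readers unfamiliar with the enrichment statistic, a right-tailed Fisher's exact test can be sketched with the standard library alone. This is a generic illustration under stated assumptions: the reference set and exact procedure used internally by IPA are not described in the text, and the function name and the numbers in the usage line are hypothetical.

```python
from math import comb


def fisher_right_tail(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) for a hypergeometric X.

    N: proteins in the reference set
    K: reference proteins annotated with the function
    n: proteins in the submitted list
    k: submitted proteins annotated with the function
    """
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("K and n must lie between 0 and N")
    total = comb(N, n)
    upper = min(n, K)
    # math.comb returns 0 for impossible draws, so no extra bounds are needed.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Hypothetical numbers: 9 of 43 submitted proteins carry an annotation
# that 50 of 1000 reference proteins carry.
print(fisher_right_tail(k=9, n=43, K=50, N=1000))
```

A small p-value indicates that the annotation occurs among the submitted proteins more often than expected from random sampling of the reference set.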
RESULTS AND DISCUSSION
Global proteomics analysis of nasal epithelial cells
Two biological replicates were constituted for each of the groups under analysis, and
five injections, each corresponding to 1 µg total protein, from each of the biological replicates
were analyzed by LC-MS/MS using a high-resolution mass spectrometer, resulting in the
identification of 89968 peptides and 1475 proteins in total (Supplemental Table V.1,
Supporting Information), of which 1173 were identified by at least two peptides across
the samples under analysis (Supplemental Table V.2, Supporting Information). Equivalency
of the protein digests was assessed by comparing the total number of peptides
identified (total spectral counts) in each of the analytical samples, which resulted in a
calculated relative standard deviation (RSD) of 9.3%. This was also evaluated through
comparison of the total peptides identified for chicken ovalbumin, which was added in
equal amounts to each of the analyzed samples according to the workflow displayed in
Figure IV.2. The calculated RSD for ovalbumin was 7.7%, which is consistent with the RSD
obtained for total spectral counts. Equivalency in peptide load was also assessed by
comparison of the spectral count values for the “housekeeping” protein actin,
cytoplasmic 1 (ACTB, Uniprot Accession (Acc): P60709), commonly used to correct for
protein loading in western blot analysis [17], which revealed an RSD of 12.7%. As
reported in Chapter IV, some local bleeding may occur during the nasal brushing procedure,
resulting in the identification of proteins such as hemoglobin
subunits alpha, beta, delta and gamma-1 (HBA1, HBB, HBD and HBG1, respectively),
band 3 anion transport protein (SLC4A1) or erythrocyte band 7 integral membrane
protein (STOM). Epithelial origin of the samples was confirmed by the identification of
proteins such as keratin type I cytoskeletal 19 (KRT19, Acc: P08727), keratin type II
cytoskeletal 8 (KRT8, Acc: P05787), palate, lung and nasal epithelium carcinoma
(PLUNC, Acc: Q9NP55), long palate, lung and nasal epithelium carcinoma (LPLUNC, Acc:
Q8TDL5), epithelial cell adhesion molecule (EPCAM, Acc: P16422), and a handful of
mucins such as mucin-1 (MUC1, Acc: P15941), mucin-2 (MUC2, Acc: Q02817), mucin-4
(MUC4, Acc: Q99102), mucin-5AC (MUC5AC, Acc: P98088) or mucin-5B (MUC5B, Acc:
Q9HC84), which are reported to be expressed at relatively high levels in the human
respiratory tract when compared to other mucin genes [18]. In an attempt to assess
proteins related to epithelium, a list of 111 proteins that matched human
epithelium when cross-referenced to the UniProt database was generated and is
available in Supplemental Table V.3, Supporting Information. However, visual
inspection reveals that this list is not as accurate as expected; for
instance, it lacks proteins such as mucins 1, 2 and 4, which are expressed in the epithelium.
The cellular location and functional type of each of the proteins identified by two or more
peptides are listed in Supplemental Table V.4A, Supporting Information. These
proteins were also grouped according to their cellular components (Gene Ontology)
and to the number of transmembrane domains by ProteinCenter (Thermo Fisher
Scientific). This information is available as Supplemental Table V.4B and Supplemental
Figure V.1, Supporting Information, respectively. A recent and comprehensive
study performed by our group on the nasal epithelium using 2D-LC-MS/MS generated
1482 protein identifications [19]. Comparing this set to all protein identifications in the
present work resulted in an overlap of about one third (702 proteins), while about one third
of the proteins were uniquely identified in each of the two studies. These
differences are mainly due to the different strategies used in the two studies, especially with
regard to cell fractionation and separation of the samples before analysis by MS. Venn
diagrams displaying this comparison can be found in Supplemental Figures V.2A and
V.2B, Supporting Information, in percentage and in number of protein identifications,
respectively.
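The "about one third" proportions quoted above follow directly from the three counts given in the text (1482, 1475 and 702). The short sketch below reproduces the arithmetic; the function names are hypothetical, and it assumes that the percentages are expressed relative to the union of both protein sets.

```python
def venn_counts(n_a: int, n_b: int, n_shared: int) -> dict:
    """Split two overlapping sets into unique and shared counts."""
    if n_shared > min(n_a, n_b):
        raise ValueError("overlap cannot exceed either set size")
    return {"only_a": n_a - n_shared, "shared": n_shared, "only_b": n_b - n_shared}


def as_percent(counts: dict) -> dict:
    """Express each region as a percentage of the union, to one decimal."""
    total = sum(counts.values())
    return {key: round(100 * value / total, 1) for key, value in counts.items()}


# a: previous 2D-LC-MS/MS study (1482), b: present work (1475), overlap 702.
counts = venn_counts(1482, 1475, 702)
print(counts)              # {'only_a': 780, 'shared': 702, 'only_b': 773}
print(as_percent(counts))  # {'only_a': 34.6, 'shared': 31.1, 'only_b': 34.3}
```

The union therefore contains 2255 proteins, of which roughly 31% are shared and roughly 34 to 35% are unique to each study.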
Comparative proteomics analysis of COPD patients vs. healthy individuals
In order to compare the proteome of COPD patients with that of healthy
individuals, all biological replicates from the COPD mild and severe groups were combined,
and the same was done for the biological replicates from the smoker and nonsmoker groups.
After submitting the spectral counts of proteins identified within these two groups (COPD
patients and healthy individuals) to a t-test, 47 proteins exhibited a >95% confidence
interval and were therefore considered to be significantly differentially expressed.
However, one entry, Ig kappa chain V-I region Wes (UniProt Acc: P01611), has
no HGNC symbol and was not recognized by Ingenuity Pathway Analysis
(IPA) upon submission; therefore, 46 proteins were considered in this comparative
analysis. Figure V.1 shows the hierarchical clustering of these proteins according to their
relative expression. Information on the identity of these proteins, on their relative
differential expression, and on their cellular location and functional type is
given in Table V.3. This list of 46 proteins was submitted to IPA’s Core Analysis,
where 44 proteins were found to be eligible for network analysis and 43 to be eligible
for function and pathway analysis.
Figure V.1: Hierarchical clustering of the significantly differentially expressed
proteins between COPD patients and healthy individuals. Protein abundances are
displayed as normalized expression.
Grouping the significantly differentially expressed proteins into networks generated three
different networks. The statistical scores, the number of significantly differentially expressed
proteins and the biological functions associated with these networks are shown
in Table V.4. The network with the highest statistical score incorporates 18 proteins
and is related to cellular assembly and organization, lipid metabolism and small
molecule biochemistry (Figure V.2). Networks 2 and 3 are available as Supplemental
Figures V.3 and V.4, Supporting Information. Interestingly, the three networks
generated by IPA were interconnected (Supplemental Figure V.5, Supporting
Information), making it possible to merge them into a single network comprising all
44 proteins found to be eligible for network analysis (Figure V.3), that is, the
significantly differentially expressed proteins in COPD patients
when compared to healthy individuals. Information on all proteins displayed in this
network (Figure V.3) is given in Supplemental Table V.5, Supporting Information.
Table V.3: Differentially expressed proteins in COPD patients when compared to
healthy individuals exhibiting a >95% confidence interval. Fold change along with
cellular location and functional type retrieved by Ingenuity knowledgebase
(Ingenuity Systems) are also provided.
Uniprot Acc. | HGNC Symbol | Entrez Gene Name | COPD/Healthy Log2 Ratio | Cellular Location | Functional Type
P24752 | ACAT1 | acetyl-CoA acetyltransferase 1 | 0.85 | Cytoplasm | enzyme
Q99798 | ACO2 | aconitase 2, mitochondrial | 0.35 | Cytoplasm | enzyme
Q86TX2 | ACOT1 | acyl-CoA thioesterase 1 | 3.17 | Cytoplasm | enzyme
O43707 | ACTN4 | actinin, alpha 4 | -0.63 | Cytoplasm | other
P42330 | AKR1C3 | aldo-keto reductase family 1, member C3 (3-alpha hydroxysteroid dehydrogenase, type II) | 1.93 | Cytoplasm | enzyme
P07355-1 | ANXA2 | annexin A2 | 0.27 | Plasma Membrane | other
P08758 | ANXA5 | annexin A5 | 0.49 | Plasma Membrane | other
P30042-1 | C21orf33 | chromosome 21 open reading frame 33 | 1.56 | Cytoplasm | other
P27824 | CANX | calnexin | -1.00 | Cytoplasm | other
Q13938 | CAPS | calcyphosine | 0.58 | Cytoplasm | other
P13688-1 | CEACAM1 (includes others) | carcinoembryonic antigen-related cell adhesion molecule 1 (biliary glycoprotein) | -3.00 | Plasma Membrane | transmembrane receptor
O00748-1 | CES2 | carboxylesterase 2 | 2.03 | Cytoplasm | enzyme
P12277 | CKB | creatine kinase, brain | 1.94 | Cytoplasm | kinase
P30084 | ECHS1 | enoyl CoA hydratase, short chain, 1, mitochondrial | 0.93 | Cytoplasm | enzyme
P07099 | EPHX1 | epoxide hydrolase 1, microsomal (xenobiotic) | -1.00 | Cytoplasm | peptidase
O75477 | ERLIN1 | ER lipid raft associated 1 | -2.81 | Plasma Membrane | other
Q96BQ1 | FAM3D | family with sequence similarity 3, member D | -2.59 | Extracellular Space | cytokine
P22570-1 | FDXR | ferredoxin reductase | 3.20 | Cytoplasm | enzyme
P21333-1 | FLNA | filamin A, alpha | -2.00 | Cytoplasm | other
Q9HC38-1 | GLOD4 | glyoxalase domain containing 4 | 1.66 | Cytoplasm | enzyme
P43304-1 | GPD2 | glycerol-3-phosphate dehydrogenase 2 (mitochondrial) | -3.86 | Cytoplasm | enzyme
Q9UBQ7 | GRHPR | glyoxylate reductase/hydroxypyruvate reductase | 1.90 | Cytoplasm | enzyme
P00390-1 | GSR | glutathione reductase | 0.41 | Cytoplasm | enzyme
P09211 | GSTP1 | glutathione S-transferase pi 1 | 0.33 | Cytoplasm | enzyme
P51858 | HDGF | hepatoma-derived growth factor | 1.70 | Extracellular Space | growth factor
Q13151 | HNRNPA0 | heterogeneous nuclear ribonucleoprotein A0 | 2.42 | Nucleus | other
P61604 | HSPE1 | heat shock 10kDa protein 1 (chaperonin 10) | 1.84 | Cytoplasm | enzyme
P19013 | KRT4 | keratin 4 | -2.43 | Cytoplasm | other
P25325 | MPST | mercaptopyruvate sulfurtransferase | 1.09 | Cytoplasm | enzyme
P35579-1 | MYH9 | myosin, heavy chain 9, non-muscle | -2.18 | Cytoplasm | enzyme
Q9NR45 | NANS | N-acetylneuraminic acid synthase | 0.94 | Cytoplasm | enzyme
Q56VL3-1 | OCIAD2 | OCIA domain containing 2 | -2.32 | Cytoplasm | other
P05166 | PCCB | propionyl CoA carboxylase, beta polypeptide | 1.30 | Cytoplasm | enzyme
Q10713 | PMPCA | peptidase (mitochondrial processing) alpha | 2.64 | Cytoplasm | peptidase
P30044-1 | PRDX5 | peroxiredoxin 5 | 0.68 | Cytoplasm | enzyme
Q99873-1 | PRMT1 | protein arginine methyltransferase 1 | 2.00 | Nucleus | enzyme
P12724 | RNASE3 | ribonuclease, RNase A family, 3 | -3.17 | Cytoplasm | enzyme
Q15393-1 | SF3B3 | splicing factor 3b, subunit 3, 130kDa | 1.78 | Nucleus | other
P12235 | SLC25A4 | solute carrier family 25 (mitochondrial carrier; adenine nucleotide translocator), member 4 | -0.89 | Cytoplasm | transporter
P05141 | SLC25A5 | solute carrier family 25 (mitochondrial carrier; adenine nucleotide translocator), member 5 | -3.00 | Cytoplasm | transporter
P11166 | SLC2A1 | solute carrier family 2 (facilitated glucose transporter), member 1 | 0.59 | Plasma Membrane | transporter
Q13630 | TSTA3 | tissue specific transplantation antigen P35B | 0.78 | Plasma Membrane | enzyme
Q16881-1 | TXNRD1 | thioredoxin reductase 1 | 0.79 | Cytoplasm | enzyme
P31930 | UQCRC1 | ubiquinol-cytochrome c reductase core protein I | -2.24 | Cytoplasm | enzyme
P21796 | VDAC1 | voltage-dependent anion channel 1 | -2.04 | Cytoplasm | ion channel
P45880-3 | VDAC2 | voltage-dependent anion channel 2 | -2.08 | Cytoplasm | ion channel
Table V.4: Protein interaction networks generated by IPA from 44 proteins found to
be eligible for network analysis among the 46 significantly differentially expressed
proteins between COPD patients and healthy individuals.
Score | Focus Molecules | Top Functions | Molecules in Network
43 | 18 | Cellular Assembly and Organization, Lipid Metabolism, Small Molecule Biochemistry | ACAT1, ACAT2, Actin, ACTN4, ANG, ANXA2, ANXA5, CANX, CEACAM1, DHRS2 (includes EG:10202), ERK1/2, F Actin, FER (includes EG:2241), FLNA, GAS8, GSTP1, HDGF, HSPE1, Insulin, MYH9, NFkB (complex), PCCB, PDGF BB, PDGF-AA, S100P, SLC25A4, SLC25A5, SLC2A1, TCR, TIE1, TIMM17A (includes EG:10440), Tropomyosin, TXNRD1, VDAC1, VDAC2
34 | 15 | Cellular Development, Hematological System Development and Function, Hematopoiesis | ALDH2, ASB9, BCKDHA, BCKDHB, beta-estradiol, C21ORF33, CBR1, CD209, CKB, CLEC4E, dehydroisoandrosterone, ECHS1, EPHX1, ERLIN1, FDXR, GLOD4, GPD2, GSR, Histone h3, HSP90AB1, HSPD1, HSPE1, IKBKE, IL4, KRT4, MPST, MT-CYB, NANS, NFATC2IP, NR6A1, SF3B3, TNF, TPM3, TRAF6, UQCRC1
34 | 15 | Carbohydrate Metabolism, Small Molecule Biochemistry, Post-Translational Modification | ACIN1, ACO2, ACOT1 (includes EG:641371), AKR1C3, ALDH2, BCKDHA, BTG1, C21ORF33, CAPRIN1, CCT3, CES2 (includes EG:8824), CSDA, F7, FAM3D, GRHPR, HNF4A, HNRNPA0, HNRNPR, IDH3B, palmitoyl-CoA hydrolase, PMPCA, PRDX5, PRMT1, PXR ligand-PXR-Retinoic acid-RXRα, RNASE3, SLC2A4, SRPRB, SSSCA1, SUPT5H, SYNCRIP, TSTA3, VDAC1, VDAC2, YBX2
Table V.5: Top 25 significant biofunctions generated from significantly differentially
expressed proteins in COPD patients when compared to healthy individuals.
Category | p-value range | Molecules
Cellular Assembly and Organization | 4.87E-05 to 3.97E-02 | CKB, PRMT1, SLC25A4, FLNA, ANXA5, MYH9, ANXA2, ACTN4, VDAC1
Molecular Transport | 8.11E-05 to 4.8E-02 | SLC25A4, SLC2A1, PRDX5, GPD2, ACAT1, MYH9, VDAC1, EPHX1, VDAC2
Nucleic Acid Metabolism | 8.11E-05 to 4.53E-02 | TSTA3, SLC25A4, GPD2, HSPE1, VDAC1
Small Molecule Biochemistry | 8.11E-05 to 4.8E-02 | AKR1C3, PRDX5, ACO2, PCCB, CES2, CKB, PRMT1, GPD2, ANXA5, HSPE1, ACOT1, TSTA3, SLC25A4, ECHS1, SLC2A1, MPST, ANXA2, FDXR, TXNRD1, ACAT1, MYH9, NANS, VDAC1, EPHX1
Cancer | 7.26E-04 to 3.69E-02 | RNASE3, ECHS1, SLC2A1, AKR1C3, CANX, ANXA2, SLC25A5, FDXR, VDAC2, Ceacam1, PRMT1, FLNA, ANXA5, HSPE1, ACAT1, EPHX1, GSTP1
Lipid Metabolism | 7.26E-04 to 4.8E-02 | ECHS1, SLC2A1, AKR1C3, ANXA5, ACAT1, ACOT1, MYH9, ANXA2, PCCB, FDXR, EPHX1
Reproductive System Disease | 7.54E-04 to 2.61E-02 | SLC2A1, PRDX5, FLNA, ANXA5, ANXA2, VDAC1
Energy Production | 7.9E-04 to 4.53E-02 | SLC25A4, GPD2, HSPE1, ACO2, VDAC1, FDXR
Cellular Development | 8.36E-04 to 4.25E-02 | Ceacam1, PRMT1, RNASE3, ANXA5, MYH9, CES2, VDAC1
Cellular Growth and Proliferation | 8.36E-04 to 4.25E-02 | RNASE3, AKR1C3, SLC2A1, CAPS, ANXA2, CES2, SF3B3, FDXR, TXNRD1, HDGF, Ceacam1, PRMT1, HNRNPA0, ACAT1, MYH9, ACTN4, VDAC1
Respiratory System Development and Function | 8.36E-04 to 2.85E-02 | PRMT1, RNASE3
Renal and Urological System Development and Function | 8.66E-04 to 4.25E-02 | Ceacam1, CES2, ACTN4, VDAC1
Drug Metabolism | 2.16E-03 to 3.41E-02 | GSR, AKR1C3, ANXA2, CES2
Endocrine System Development and Function | 2.16E-03 to 3.69E-02 | AKR1C3, FDXR
Amino Acid Metabolism | 2.89E-03 to 3.13E-02 | PRMT1, ANXA2, TXNRD1
Antimicrobial Response | 2.89E-03 to 2.89E-03 | HSPE1
Carbohydrate Metabolism | 2.89E-03 to 3.13E-02 | TSTA3, SLC2A1, GPD2, ANXA5, ACO2, MYH9, ANXA2, NANS
Cardiovascular System Development and Function | 2.89E-03 to 4.25E-02 | Ceacam1, SLC2A1, MYH9, ACTN4
Cell Death | 2.89E-03 to 4.51E-02 | SLC25A4, RNASE3, SLC2A1, PRDX5, FDXR, TXNRD1, HDGF, VDAC2, GSR, Ceacam1, FLNA, HSPE1, MYH9, ACTN4, VDAC1, EPHX1
Cell Morphology | 2.89E-03 to 4.25E-02 | SLC25A4, FLNA, ANXA5, MYH9, ANXA2, ACTN4
Cell Signaling | 2.89E-03 to 3.69E-02 | FLNA, VDAC1
Cell-To-Cell Signaling and Interaction | 2.89E-03 to 2.85E-02 | TSTA3, Ceacam1, RNASE3, ANXA5, MYH9, ANXA2, ACTN4, VDAC1
Cellular Compromise | 2.89E-03 to 1.72E-02 | TSTA3, ANXA5
Cellular Function and Maintenance | 2.89E-03 to 4.7E-02 | CKB, Ceacam1, PRMT1, SLC2A1, FLNA, MYH9, ACTN4, VDAC1, UQCRC1
Cellular Movement | 2.89E-03 to 4.8E-02 | Ceacam1, SLC2A1, FLNA, ACAT1, MYH9
Figure V.2: Top protein network as obtained from Ingenuity pathway analysis.
The top 25 significant (p<0.05, Fisher’s exact test) biological functions (biofunctions), along
with the proteins involved in each biofunction, were also derived from IPA and are
displayed in Table V.5. Cellular assembly and organization was ranked first as it has
the lowest p-value. Complete information on the full list of biofunctions, including
details on function annotation, can be found in Supplemental Table V.6, Supporting
Information. When submitted to IPA, biofunctions are divided into three branches
according to the Ingenuity Knowledgebase: diseases and disorders, molecular and cellular
functions, and physiological system development and function. The top 10 significant
biofunctions for each of these three branches, along with the respective p-values
and the proteins involved in each biofunction, are displayed in Tables V.6, V.7 and
V.8, respectively. Within diseases and disorders, cancer was the top biofunction,
accounting for 17 proteins in total. This means that about 40% (17 of 43) of the significantly
differentially expressed proteins eligible for biofunction analysis by IPA are related to
cancer. This is not surprising, since cancer has been one of the most intensively studied
subjects of the past decades.
Figure V.3: Merged network comprising all 44 proteins found eligible for network
analysis by Ingenuity Pathway Analysis.
Table V.6: Top 10 significant biofunctions within diseases and disorders together with
proteins involved in each biofunction.
Category | p-value range | Molecules
Cancer | 7.26E-04 to 3.69E-02 | RNASE3, ECHS1, SLC2A1, AKR1C3, CANX, ANXA2, SLC25A5, FDXR, VDAC2, Ceacam1, PRMT1, FLNA, ANXA5, HSPE1, ACAT1, EPHX1, GSTP1
Reproductive System Disease | 7.54E-04 to 2.61E-02 | SLC2A1, PRDX5, FLNA, ANXA5, ANXA2, VDAC1
Antimicrobial Response | 2.89E-03 to 2.89E-03 | HSPE1
Connective Tissue Disorders | 2.89E-03 to 2.89E-03 | FLNA
Developmental Disorder | 2.89E-03 to 4.8E-02 | FLNA
Genetic Disorder | 2.89E-03 to 4.8E-02 | SLC25A4, SLC2A1, ANXA2, PCCB, FDXR, GSR, GRHPR, FLNA, HSPE1, ACAT1, MYH9, ACTN4, EPHX1, KRT4
Hematological Disease | 2.89E-03 to 3.69E-02 | RNASE3, GPD2, ANXA5, ACO2, MYH9, ANXA2, EPHX1
Metabolic Disease | 2.89E-03 to 3.69E-02 | SLC2A1, ACAT1, PCCB
Renal and Urological Disease | 2.89E-03 to 4.8E-02 | GRHPR, MYH9, ACTN4
Skeletal and Muscular Disorders | 2.89E-03 to 2.89E-03 | FLNA
Table V.7: Top 10 significant biofunctions within molecular and cellular functions
together with proteins involved in each biofunction.
Category | p-value range | Molecules
Cellular Assembly and Organization | 4.87E-05 to 3.97E-02 | CKB, PRMT1, SLC25A4, FLNA, ANXA5, MYH9, ANXA2, ACTN4, VDAC1
Molecular Transport | 8.11E-05 to 4.8E-02 | SLC25A4, SLC2A1, PRDX5, GPD2, ACAT1, MYH9, VDAC1, EPHX1, VDAC2
Nucleic Acid Metabolism | 8.11E-05 to 4.53E-02 | TSTA3, SLC25A4, GPD2, HSPE1, VDAC1
Small Molecule Biochemistry | 8.11E-05 to 4.8E-02 | AKR1C3, PRDX5, ACO2, PCCB, CES2, CKB, PRMT1, GPD2, ANXA5, HSPE1, ACOT1, TSTA3, SLC25A4, ECHS1, SLC2A1, MPST, ANXA2, FDXR, TXNRD1, ACAT1, MYH9, NANS, VDAC1, EPHX1
Lipid Metabolism | 7.26E-04 to 4.8E-02 | ECHS1, SLC2A1, AKR1C3, ANXA5, ACAT1, ACOT1, MYH9, ANXA2, PCCB, FDXR, EPHX1
Energy Production | 7.9E-04 to 4.53E-02 | SLC25A4, GPD2, HSPE1, ACO2, VDAC1, FDXR
Cellular Development | 8.36E-04 to 4.25E-02 | Ceacam1, PRMT1, RNASE3, ANXA5, MYH9, CES2, VDAC1
Cellular Growth and Proliferation | 8.36E-04 to 4.25E-02 | RNASE3, AKR1C3, SLC2A1, CAPS, ANXA2, CES2, SF3B3, FDXR, TXNRD1, HDGF, Ceacam1, PRMT1, HNRNPA0, ACAT1, MYH9, ACTN4, VDAC1
Drug Metabolism | 2.16E-03 to 3.41E-02 | GSR, AKR1C3, ANXA2, CES2
Amino Acid Metabolism | 2.89E-03 to 3.13E-02 | PRMT1, ANXA2, TXNRD1
Table V.8: Top 10 significant biofunctions within physiological system development
and function together with proteins involved in each biofunction.
Category | p-value range | Molecules
Respiratory System Development and Function | 8.36E-04 to 2.85E-02 | PRMT1, RNASE3
Renal and Urological System Development and Function | 8.66E-04 to 4.25E-02 | Ceacam1, CES2, ACTN4, VDAC1
Endocrine System Development and Function | 2.16E-03 to 3.69E-02 | AKR1C3, FDXR
Cardiovascular System Development and Function | 2.89E-03 to 4.25E-02 | Ceacam1, SLC2A1, MYH9, ACTN4
Embryonic Development | 2.89E-03 to 1.72E-02 | Ceacam1, HSPE1
Nervous System Development and Function | 2.89E-03 to 1.44E-02 | MYH9, VDAC1
Reproductive System Development and Function | 2.89E-03 to 3.97E-02 | Ceacam1, AKR1C3
Skeletal and Muscular System Development and Function | 2.89E-03 to 4.25E-02 | MYH9, VDAC1, HDGF
Tissue Development | 2.89E-03 to 2.85E-02 | TSTA3, Ceacam1, MYH9, ACTN4
Tumor Morphology | 2.89E-03 to 3.13E-02 | TSTA3, Ceacam1
Cigarette smoking exposes the lung to high concentrations of reactive oxygen species
(ROS) and it is the major risk factor for chronic obstructive pulmonary disease (COPD).
It is estimated that only 15-35% of chronic, continuous cigarette smokers develop
COPD [20-22]. Thus, the majority of long-term smokers do not develop COPD, which
suggests that failure of the compensatory mechanisms that protect the lung from ROS or
xenobiotic materials contributes to the development of COPD. Indeed, the expression
of antioxidant proteins believed to be important in protecting the lung from
cigarette smoke-induced injury, such as peroxiredoxin, glutathione S-transferase or
glutathione reductase, varies widely in airway epithelial cells harvested from chronic
cigarette smokers [23, 24]. Notably, results obtained from studies performed on
bronchial epithelial cells were found to be consistent with those obtained from
nasal epithelial cells [25]. Recent reports indicate that ROS interfere with protein
folding in the endoplasmic reticulum (ER), triggering a complex molecular cascade termed the
unfolded protein response (UPR) [26]. The UPR is one of the signaling pathways comprising
the proteostasis network, which regulates and maintains protein folding and function
in the face of many cellular challenges during cell lifetime [27, 28]. Activation of the
UPR compensates for abnormalities in protein folding by increasing the expression of
genes involved in protein chaperoning and folding, protein translation, and protein
degradation [26]. In the present work, a considerable number of proteins related to
UPR were confidently identified, although most of them did not pass the t-test and
were therefore not significantly differentially expressed. Transitional endoplasmic
reticulum ATPase, also known as valosin-containing protein (VCP), was found to be
overexpressed in COPD. VCP is associated with cellular functions comprising nuclear
envelope reconstruction, cell cycle, post-mitotic Golgi reassembly, suppression of
apoptosis, DNA damage response and endoplasmic reticulum-associated degradation
(ERAD) [29, 30]. VCP overexpression has been implicated in the chronic inflammation of
other lung diseases such as cystic fibrosis and lung cancer [31, 32]. It has been reported that
overexpression of VCP in COPD may induce protein aggregation, leading to chronic
oxidative stress, inflammation and apoptosis, which in turn may lead to severe
emphysema [30]. In the present study, both proteins from the calnexin/calreticulin
chaperone system were identified. Calnexin (CANX) is a 90 kDa type I ER membrane
protein and calreticulin (CALR) is a 60 kDa soluble ER lumen protein [33]. Both are
thought to play a central role in quality control of protein folding since they have the
capability of stabilizing nascent proteins until they are properly folded and assembled
or retaining incorrectly folded protein subunits within the ER for degradation by ERAD
mechanisms [22, 33-35]. CANX was found to be significantly underexpressed, while
CALR was found to be overexpressed in COPD patients. Endoplasmin, also known as
GRP94 (HSP90B1), and 78 kDa glucose-regulated protein, better known as GRP78 or BiP
(HSPA5), both members of the GRP78/GRP94 chaperone system, were also identified in this
study, but no differential expression was observed between COPD patients and
healthy individuals [33]. Furthermore, we were able to confidently identify most of the
proteins that belong to a large ER-localized multiprotein complex which comprises
DNAJB11, HSP90B1, HSPA5, HYOU1, PDIA2, PDIA4, PDIA6, PPIB, SDF2L1, UGT1A1 and
ERP29 [36]. In fact, HSP90B1, HSPA5, HYOU1, PDIA4, PDIA6, PPIB and ERP29 were
identified. Although none of these proteins was found to be significantly differentially
expressed when submitted to a t-test, PPIB and ERP29 exhibit overexpression in COPD
patients based on their spectral counts, similarly to what was stated for CALR.
Moreover, both components of the Hsp10/Hsp60 chaperone complex were also
identified in the present study. The 10 kDa heat shock protein, mitochondrial (HSPE1, also
known as Hsp10), a protein which functions as a chaperonin, was found to be
overexpressed in COPD. Its structure consists of a heptameric ring which binds to
another HSP, the 60 kDa heat shock protein, mitochondrial (HSPD1, also known as Hsp60),
in order to form a symmetric functional heterodimer which enhances protein folding in
an ATP-dependent manner [37, 38]. Its partner, HSPD1, was also
observed to be overexpressed in COPD patients, although it was not found to be
significantly differentially expressed according to the employed statistical test. Since
the processes involved in protein transport and folding consume ATP and generate
ROS, the UPR induces expression of a variety of genes involved in processes such as
antioxidant defense, inflammation, xenobiotic metabolism, energy metabolism,
protein synthesis and apoptosis [39-42].
Glutathione S-transferase P (GSTP1), which was found to be overexpressed in the
proteome of nasal epithelial cells of COPD patients when compared to healthy
subjects, is known to play a role in detoxification by catalyzing the conjugation of many
hydrophobic and electrophilic compounds with reduced glutathione and has been
linked to cancer and other diseases including COPD. In fact, upregulation of GSTP1 was
found to be associated with lung carcinoma [43], while recently GSTP1 was found to be
significantly associated with emphysema severity in COPD patients [44]. Lung cancer
and COPD are leading causes of death (second and fourth, respectively), and both are
associated with cigarette smoke exposure. It has been shown that 50-70% of patients
diagnosed with lung cancer suffer from COPD, and reduced lung function is an
important event in lung cancer, suggesting an association between COPD and lung
cancer [45]. Simultaneous overexpression of GSTP1 and underexpression of epoxide
hydrolase 1 (EPHX1) has been associated with increased lung cancer risk among
smokers and, as a matter of fact, we also report underexpression of EPHX1 in COPD
patients, which is consistent with previous observations that lung cancer and COPD
share some molecular events. Both GSTP1 and EPHX1 are part of the nuclear factor
erythroid 2-related factor 2 (Nrf2)-mediated oxidative stress response, an important
regulator of lung antioxidant defenses, which has been implicated in ER stress-induced
apoptosis in COPD [46]. Two other members of this pathway were found to be
among the significantly differentially expressed proteins: thioredoxin reductase 1,
cytoplasmic (TXNRD1) and glutathione-disulfide reductase (GSR), both found to be
overexpressed in COPD patients. Another antioxidant enzyme, peroxiredoxin-5,
mitochondrial (PRDX5), which does not belong to the Nrf2-mediated oxidative stress
response, was also observed to be overexpressed in COPD. As stated before, the UPR may
also induce xenobiotic metabolism, which has been associated with COPD as a result
of the number of substances present in cigarette smoke that are released into the
lungs. Besides GSTP1 and GSR, which were already mentioned, aldo-keto reductase
family 1 member C3 (AKR1C3), annexin A2 (ANXA2) and cocaine esterase (CES2) are also
associated with drug metabolism and they were all found to be overexpressed in
COPD.
CONCLUSION
COPD is one of the leading causes of death in the world and has been intensively
studied over the past decades. Biological samples currently used for the
assessment of COPD mechanisms include blood, sputum, bronchoalveolar fluid,
exhaled breath and bronchial biopsies. Nasal epithelial cells collected by nasal brushing
were shown to mimic the lower airway epithelium, with the advantage that brushing is a
noninvasive and painless procedure. Here, for the first time, freshly obtained nasal
epithelial cells were used to perform a proteomic study in COPD. In the present study
we were able to identify 1475 proteins in total, expanding the knowledge
of the proteome of nasal epithelial cells, since we reported 769 proteins that had not
been described yet. Of the total 1475, 1173 proteins were identified by at least two
peptides, and only those were taken into account in the
comparative analysis between COPD patients and healthy individuals. Our data
confirmed previous evidence that the UPR is activated in COPD patients, since we
observed overexpression of a considerable number of proteins belonging to
different protein complexes involved in the UPR. This includes overexpression of VCP, both
components of the Hsp10/Hsp60 chaperone complex (HSPD1 and HSPE1), CALR and
two members of a large ER-localized multiprotein complex of at least 11 proteins, PPIB
and ERP29. We also observed an increase in expression of proteins related to the
Nrf2-mediated oxidative stress response, such as GSTP1, TXNRD1 and GSR. Finally, we also
report an increase in drug metabolism, as all significantly differentially expressed
proteins related to this biofunction were overexpressed in COPD: GSTP1, GSR, AKR1C3
and ANXA2. These data need further validation by orthogonal methods so that the
activation of the UPR and the Nrf2-mediated oxidative stress response and the increase in
drug metabolism in the nasal epithelial cells of COPD patients can be fully confirmed. This
work also emphasizes the value of using nasal epithelial cells in the investigation of COPD
pathogenesis, which can lead to the identification of new candidate biomarkers for this
disease.
ACKNOWLEDGEMENTS
We thank all volunteers for their cooperation in this work. Work partially supported by
FCT-FEDER POCI/SAU-MMO/56163/2004, FCT/Poly-Annual Funding Program and
FEDER-Saúde XXI Program (Portugal). BMA is the recipient of an FCT doctoral fellowship
(SFRH/BD/31415/2006).
REFERENCES
[1] Global initiative for chronic obstructive lung disease 2010.
[2] Barnes, P. J., Celli, B. R., Systemic manifestations and comorbidities of COPD. Eur
Respir J 2009, 33, 1165-1185.
[3] Barnes, P. J., Chronic obstructive pulmonary disease: effects beyond the lungs. PLoS
medicine 2010, 7, e1000220.
[4] Mannino, D. M., Buist, A. S., Global burden of COPD: risk factors, prevalence, and
future trends. Lancet 2007, 370, 765-773.
[5] European Respiratory Society and European Lung Foundation 2003.
[6] Lokke, A., Lange, P., Scharling, H., Fabricius, P., Vestbo, J., Developing COPD: a 25
year follow up study of the general population. Thorax 2006, 61, 935-939.
[7] Standring, S., Gray's Anatomy: The Anatomical Basis of Clinical Practice, 40th
Edition. Chapter 32. Nose, nasal cavity and paranasal sinuses, Churchill Livingstone
2008.
[8] Shaykhiev, R., Bals, R., Interactions between epithelial cells and leukocytes in
immunity and tissue homeostasis. Journal of leukocyte biology 2007, 82, 1-15.
[9] Pahl, A., Preclinical modelling using nasal epithelial cells for the evaluation of herbal
extracts for the treatment of upper airway diseases. Planta Med 2008, 74, 693-696.
[10] Beck, S., Penque, D., Garcia, S., Gomes, A., et al., Cystic fibrosis patients with the
3272-26A-->G mutation have mild disease, leaky alternative mRNA splicing, and CFTR
protein at the cell membrane. Hum Mutat 1999, 14, 133-144.
[11] Roxo-Rosa, M., da Costa, G., Luider, T. M., Scholte, B. J., et al., Proteomic analysis
of nasal cells from cystic fibrosis patients and non-cystic fibrosis control individuals:
search for novel biomarkers of cystic fibrosis lung disease. Proteomics 2006, 6, 2314-
2325.
[12] Gomes-Alves, P., Imrie, M., Gray, R. D., Nogueira, P., et al., SELDI-TOF biomarker
signatures for cystic fibrosis, asthma and chronic obstructive pulmonary disease. Clin
Biochem 2010, 43.
[13] McDougall, C. M., Blaylock, M. G., Douglas, J. G., Brooker, R. J., et al., Nasal
epithelial cells as surrogates for bronchial epithelial cells in airway inflammation
studies. American journal of respiratory cell and molecular biology 2008, 39, 560-568.
[14] Penque, D., Mendes, F., Beck, S., Farinha, C., et al., Cystic fibrosis F508del patients
have apically localized CFTR in a reduced number of airway cells. Laboratory
investigation; a journal of technical methods and pathology 2000, 80, 857-868.
[15] Marengo, E., Robotti, E., Bobba, M., Gosetti, F., The principle of exhaustiveness
versus the principle of parsimony: a new approach for the identification of biomarkers
from proteomic spot volume datasets based on principal component analysis.
Analytical and bioanalytical chemistry, 397, 25-41.
[16] McDonald, J. H., Handbook of Biological Statistics, Sparky House Publishing,
Baltimore, MD 2009.
[17] Ferguson, R. E., Carroll, H. P., Harris, A., Maher, E. R., et al., Housekeeping
proteins: a preliminary study illustrating some limitations as useful references in
protein expression studies. Proteomics 2005, 5, 566-571.
[18] Di, Y. P., Harper, R., Zhao, Y., Pahlavan, N., et al., Molecular cloning and
characterization of spurt, a human novel gene that is retinoic acid-inducible and
encodes a secretory protein specific in upper respiratory tracts. The Journal of
biological chemistry 2003, 278, 1165-1173.
[19] Simoes, T., Charro, N., Blonder, J., Faria, D., et al., Molecular profiling of the
human nasal epithelium: A proteomics approach. Journal of proteomics 2011.
[20] Rennard, S. I., Vestbo, J., COPD: the dangerous underestimate of 15%. Lancet
2006, 367, 1216-1219.
[21] Fletcher, C., Peto, R., Tinker, C., Speizer, F. E., The Natural History of Chronic
Bronchitis and Emphysema, Oxford University Press, Oxford, UK 1976.
[22] Kelsen, S. G., Duan, X., Ji, R., Perez, O., et al., Cigarette smoke induces an unfolded
protein response in the human lung: a proteomic approach. American journal of
respiratory cell and molecular biology 2008, 38, 541-550.
[23] Hackett, N. R., Heguy, A., Harvey, B. G., O'Connor, T. P., et al., Variability of
antioxidant-related gene expression in the airway epithelium of cigarette smokers.
American journal of respiratory cell and molecular biology 2003, 29, 331-343.
[24] Spira, A., Beane, J., Shah, V., Liu, G., et al., Effects of cigarette smoke on the
human airway epithelial cell transcriptome. Proceedings of the National Academy of
Sciences of the United States of America 2004, 101, 10143-10148.
[25] Sridhar, S., Schembri, F., Zeskind, J., Shah, V., et al., Smoking-induced gene
expression changes in the bronchial airway are reflected in nasal and buccal
epithelium. BMC genomics 2008, 9, 259.
[26] Gomes-Alves, P., Neves, S., Penque, D., Signaling pathways of proteostasis
network unrevealed by proteomic approaches on the understanding of misfolded
protein rescue. Methods in enzymology 2011, 491, 217-233.
[27] Powers, E. T., Morimoto, R. I., Dillin, A., Kelly, J. W., Balch, W. E., Biological and
chemical approaches to diseases of proteostasis deficiency. Annual review of
biochemistry 2009, 78, 959-991.
[28] Roth, D. M., Balch, W. E., Modeling general proteostasis: proteome balance in
health and disease. Curr Opin Cell Biol 2011, 23, 126-134.
[29] Vij, N., AAA ATPase p97/VCP: cellular functions, disease and therapeutic potential.
J Cell Mol Med 2008, 12, 2511-2518.
[30] Min, T., Bodas, M., Mazur, S., Vij, N., Critical role of proteostasis-imbalance in
pathogenesis of COPD and severe emphysema. J Mol Med 2011, 89, 577-593.
[31] Yamamoto, S., Tomita, Y., Hoshida, Y., Iizuka, N., et al., Expression level of valosin-
containing protein (p97) is correlated with progression and prognosis of non-small-cell
lung carcinoma. Annals of surgical oncology 2004, 11, 697-704.
[32] Vij, N., Fang, S., Zeitlin, P. L., Selective inhibition of endoplasmic reticulum-
associated degradation rescues DeltaF508-cystic fibrosis transmembrane regulator and
suppresses interleukin-8 levels: therapeutic implications. The Journal of biological
chemistry 2006, 281, 17369-17378.
[33] Ni, M., Lee, A. S., ER chaperones in mammalian development and human diseases.
FEBS letters 2007, 581, 3641-3651.
[34] Caramelo, J. J., Parodi, A. J., Getting in and out from calnexin/calreticulin cycles.
The Journal of biological chemistry 2008, 283, 10221-10225.
[35] Schroder, M., Kaufman, R. J., ER stress and the unfolded protein response.
Mutation research 2005, 569, 29-63.
[36] Meunier, L., Usherwood, Y. K., Chung, K. T., Hendershot, L. M., A subset of
chaperones and folding enzymes form multiprotein complexes in endoplasmic
reticulum to bind nascent proteins. Molecular biology of the cell 2002, 13, 4456-4469.
[37] Bross, P., Li, Z., Hansen, J., Hansen, J. J., et al., Single-nucleotide variations in the
genes encoding the mitochondrial Hsp60/Hsp10 chaperone system and their disease-
causing potential. J Hum Genet 2007, 52, 56-65.
[38] Hansen, J. J., Bross, P., Westergaard, M., Nielsen, M. N., et al., Genomic structure
of the human mitochondrial chaperonin genes: HSP60 and HSP10 are localised head to
head on chromosome 2 separated by a bidirectional promoter. Hum Genet 2003, 112,
71-77.
[39] Marciniak, S. J., Ron, D., Endoplasmic reticulum stress signaling in disease. Physiol
Rev 2006, 86, 1133-1149.
[40] Gorlach, A., Klappa, P., Kietzmann, T., The endoplasmic reticulum: folding, calcium
homeostasis, signaling, and redox control. Antioxidants & redox signaling 2006, 8,
1391-1418.
[41] Gregersen, N., Bross, P., Protein misfolding and cellular stress: an overview.
Methods in molecular biology (Clifton, N.J.) 2010, 648, 3-23.
[42] Harding, H. P., Zhang, Y., Zeng, H., Novoa, I., et al., An integrated stress response
regulates amino acid metabolism and resistance to oxidative stress. Molecular cell
2003, 11, 619-633.
[43] Hayes, J. D., Pulford, D. J., The glutathione S-transferase supergene family:
regulation of GST and the contribution of the isoenzymes to cancer chemoprotection
and drug resistance. Crit Rev Biochem Mol Biol 1995, 30, 445-600.
[44] Kim, W. J., Hoffman, E., Reilly, J., Hersh, C., et al., Association of COPD candidate
genes with computed tomography emphysema and airway phenotypes in severe
COPD. Eur Respir J 2011, 37, 39-43.
[45] Yao, H., Rahman, I., Current concepts on the role of inflammation in COPD and
lung cancer. Current opinion in pharmacology 2009, 9, 375-383.
[46] Malhotra, D., Thimmulappa, R., Vij, N., Navas-Acien, A., et al., Heightened
endoplasmic reticulum stress in the lungs of patients with chronic obstructive
pulmonary disease: the role of Nrf2-regulated proteasomal activity. American journal
of respiratory and critical care medicine 2009, 180, 1196-1207.
Serum proteomics of chronic obstructive pulmonary disease patients
Bruno M. Alexandre1, Pang-ning Teng2, Brian L. Hood2, Mai Sun2,
Deborah Penque1* and Thomas P. Conrads2*
1Laboratório de Proteómica, Departamento de Genética, Instituto Nacional de Saúde
Dr. Ricardo Jorge (INSA-IP), Av. Padre Cruz 1649-016 Lisboa, Portugal and the
2Department of Pharmacology & Chemical Biology, University of Pittsburgh Cancer
Institute, University of Pittsburgh
Keywords: Serum, blood, chronic obstructive pulmonary disease, emphysema, chronic
bronchitis, mass spectrometry, proteomics.
*Corresponding authors: Thomas P. Conrads, Ph.D., 204 Craft Avenue, Suite B401,
Pittsburgh, PA, 15213, Tel: 412-641-7556, Fax: 412-641-2356, E-mail:
[email protected] and Deborah Penque, Ph.D., Laboratório de Proteómica,
Departamento de Genética, Edifício INSA II, Instituto Nacional de Saúde Dr. Ricardo
Jorge, INSA, I.P., Avenida Padre Cruz, 1649-016 Lisboa, Portugal, Tel: +351 21750 8137,
Fax: +351 21752 6410, E-mail: [email protected]
ABSTRACT
Chronic obstructive pulmonary disease (COPD) is a major cause of morbidity and
mortality in adults, and its incidence is increasing worldwide. Patients may have
chronic bronchitis, emphysema, small airway disease or a combination of these that
modulate the course of the disease. There is still some ambiguity concerning the
disease-specific molecular mechanisms of the inflammatory process and acute
exacerbation of COPD. As a result, potential biomarkers which are specific for COPD
have not been fully identified and validated, even though there is a great need for such
biomarkers. To date, no work has employed LC-MS methods to generate
comprehensive data on the serum proteome of COPD patients. A total of 33049 peptides
corresponding to 2856 proteins were identified by the powerful shotgun approach
GeLC-MS/MS using a linear ion trap mass spectrometer. We were able to find proteins
potentially related to biological functions that have an impact on the pathophysiology of
COPD. These include TRAF3IP2, which is associated with innate immunity in response to
pathogens, inflammatory signals and airway hyperresponsiveness; PLG, reported to be
involved in mechanisms of wound healing and development of pulmonary diseases
such as asthma, cystic fibrosis and COPD itself; GPLD1; and APOE. Further
validation and study of some of these proteins will provide a better understanding of
the molecular pathways underlying COPD pathogenesis.
INTRODUCTION
Chronic obstructive pulmonary disease (COPD) is a major cause of morbidity and
mortality in adults, and its incidence is increasing worldwide. The pathogenesis of
COPD is still poorly understood and is likely to be a complex interplay between genetic
and environmental factors. Persistently decreased forced expiratory volume in one
second (FEV1) and increased forced expiratory time are major diagnostic features of
COPD. COPD is characterized by long-term progressive bronchial airflow obstruction
(FEV1/FVC < 0.7 as measured by spirometry) that is poorly responsive to
bronchodilators, as this obstruction is
not fully reversible [1]. Patients may have chronic bronchitis, emphysema, small airway
disease or a combination of these that modulate the course of the disease. There is no
treatment available that can stop or reverse the disease process and cure the
patient. At present, therapy aims to provide as much quality of life as
possible to the patient. In 2000, approximately 2.7 million deaths were caused by
COPD, placing this disease as the fourth leading cause of death in the world [2].
Moreover, the Global Burden of Disease Study projected COPD, which was ranked
sixth in 1990, to be placed third among the leading causes of death in the world by the
year 2020 [2]. There is still some ambiguity concerning the disease-specific
molecular mechanisms of the inflammatory process and acute exacerbation of COPD.
As a result, potential biomarkers which are specific for COPD have not been fully
identified and validated, even though there is a great need for such biomarkers [3].
Proteomic technologies allow for identification of protein changes caused by the
disease process, and recent advances, especially in mass spectrometry and
bioinformatics, raise the chances of identifying novel putative biomarkers. Serum
perfuses tissues and is therefore a source of biochemical products that
can be indicative of the physiological status of the individual and even of the disease
status of the patient. Surprisingly, few serum proteomic studies have been performed in
COPD. The first proteomic study was published in 2007 and consisted of measuring 143
pre-selected serum biomarkers on a protein microarray platform [4]. Our group
conducted an extensive study on serum biomarker signatures of cystic fibrosis, asthma
and COPD patients by SELDI-TOF-MS, in which we were also able to find peaks
differentiating COPD from controls using Dunn’s comparison test (p<0.05) [5].
However, to date no work has employed LC-MS methods to generate
comprehensive data on the serum proteome of COPD patients. In the present study,
190 COPD patients were divided into four different groups according to the two main
clinical features of this disease – emphysema and chronic bronchitis – and analyzed by
the powerful shotgun approach which combines protein gel and liquid
chromatography separation methods before spectra acquisition by a linear ion trap
mass spectrometer (GeLC-MS/MS).
MATERIALS AND METHODS
Individuals, Sample Collection and Pooling Strategy
Serum samples (n=190 well-characterized COPD patients) were collected from
peripheral blood and were provided by the Center for Clinical Pharmacology, Department
of Medicine, University of Pittsburgh, Pittsburgh, PA. Detailed clinical parameters
for each patient were also provided in order to
minimize bias when pooling the samples into each of the groups under analysis. In
order to assess the effects on the proteome of the two main clinical features of COPD,
emphysema and chronic bronchitis, four different groups were constituted and each
group was further split into two to achieve two biological replicates per group. The main
characteristics of each of the biological replicates are displayed in Table VI.1.
Table VI.1: Main characteristics of the biological replicates for each of the groups
under analysis. “No features” (Group A) refers only to emphysema and chronic
bronchitis. (Biol Rep: biological replicate; BMI: body mass index).
Group                                 n    Emphysema  Chronic Bronchitis  Biol Rep  n   Age (y)     BMI         FEV1/FVC (%)
No Features (A)                       123  No         No                  A1        62  68.1 ± 6.0  28.6 ± 3.5  64.0 ± 10.6
                                                                          A2        61  69.5 ± 6.3  28.0 ± 4.1  66.7 ± 8.1
Emphysema (B)                         32   Yes        No                  B1        16  66.2 ± 6.4  26.8 ± 3.4  58.7 ± 8.1
                                                                          B2        16  64.6 ± 6.0  25.0 ± 2.6  53.7 ± 10.7
Chronic Bronchitis (C)                20   No         Yes                 C1        10  65.0 ± 7.8  26.6 ± 5.7  55.8 ± 17.4
                                                                          C2        10  66.9 ± 6.6  27.7 ± 2.9  55.1 ± 14.5
Emphysema and Chronic Bronchitis (D)  15   Yes        Yes                 D1        8   60.1 ± 5.9  23.5 ± 5.0  35.3 ± 5.9
                                                                          D2        7   67.7 ± 5.4  25.5 ± 4.8  36.7 ± 6.8
Sample Preparation for LC-MS/MS
Once pools were constituted, a first spiking step was introduced in which
beta-galactosidase (E. coli) was added to a final concentration of 200 fmol/µL. Due to
sample complexity and high dynamic range, samples were immunodepleted of their
14 most abundant proteins using a Hu-14 multiple affinity removal system
(MARS, Agilent Technologies, Palo Alto, CA, USA) column, and the resulting F1 and F2
fractions from each sample were pooled together. A second spiking step was
performed by adding chicken ovalbumin to a final concentration of 200 fmol/µL before
depleted samples were buffer-exchanged into 25 mM ammonium bicarbonate (AMB) using
centrifugal ultrafiltration (3000 molecular weight cut-off) to a final volume of 500 µL.
Protein concentrations were determined by BCA assay (Pierce). Samples were deglycosylated
through incubation with PNGase F enzyme for 2 h at 37 °C. One-dimensional
polyacrylamide gel electrophoresis (1D-PAGE) was performed in the high-resolution
pre-cast gel system XCell SureLock™ Mini-Cell with NuPAGE® Novex® Bis-Tris pre-cast
4-12% gels (Invitrogen, Carlsbad, CA, USA) at a constant voltage of 150 V for 1 h. After
staining with SimplyBlue™ SafeStain, gel images were acquired and evaluated in order
to divide each lane (sample) into 10 bands. Each of these bands was excised
and destained in a solution of 50% acetonitrile (ACN) in 50 mM AMB. Reactive cysteine
residues were reduced via rehydration of gel bands in 10 mM DTT and 25 mM AMB
followed by incubation at 56 °C for 45 min, and alkylated via incubation in 55 mM
iodoacetamide and 25 mM AMB for 30 min at ambient temperature in the dark. Bands
were then dehydrated with acetonitrile, rehydrated with sequencing-grade porcine
trypsin (Promega, Madison, WI, USA) in 25 mM AMB and digested
at 37 °C for 16 h. Peptide digests were extracted with 70% acetonitrile, 5% formic acid,
dried by vacuum centrifugation and stored at -80 °C until further analysis.
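To illustrate what the in-gel tryptic digestion step produces, the following minimal Python sketch cleaves a protein sequence in silico. It assumes the commonly used simplified rule that trypsin cleaves C-terminal to lysine (K) or arginine (R) unless the next residue is proline (P); the function name and the example sequence are hypothetical and are not part of the workflow described above.

```python
def tryptic_digest(sequence):
    """Split a protein sequence after K or R, unless the next residue is P.

    Simplified cleavage rule (assumption); missed cleavages are not modelled.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep the C-terminal peptide when the protein does not end in K or R.
        peptides.append(sequence[start:])
    return peptides


# Hypothetical sequence, used only to illustrate the rule.
example_sequence = "MKWVTFRPLLKAGR"
example_peptides = tryptic_digest(example_sequence)
print(example_peptides)
```

Note that the arginine followed by proline is not cleaved, so the second peptide spans that site.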
Proteomic analysis by liquid chromatography-tandem mass spectrometry.
Peptide extracts (6 µg per injection) from each gel fraction were resuspended in 0.1%
TFA and separately analysed by liquid chromatography (LC) using a Dionex Ultimate
3000 nanoflow LC system (Dionex Corporation, Sunnyvale, CA, USA) coupled on-line
to a linear ion trap (LIT) mass spectrometer (MS) (LTQ; ThermoFisher Scientific Inc.,
San Jose, CA, USA). Separation of the sample was performed using a 75-µm inner
diameter x 360 µm outer diameter x 10 cm-long fused silica capillary column (Polymicro
Technologies, Phoenix, AZ, USA) packed in-house with 5 µm, 300 Å pore size Jupiter
C-18 stationary phase (Phenomenex, Torrance, CA, USA). Following sample injection
onto a C-18 precolumn (Dionex), the column was washed for 3 min with mobile phase
A (2% acetonitrile, 0.1% formic acid) at a flow rate of 30 µL/min. Peptides were eluted
using a linear gradient of 0.33% mobile phase B (0.1% formic acid in
acetonitrile)/minute for 130 min, then to 95% B in an additional 15 min, all at a
constant flow rate of 200 nL/min. Column washing was performed at 95% B for 15 min
for all runs, after which the column was re-equilibrated in mobile phase A prior to
subsequent injections. The LIT-MS was operated in a data-dependent MS/MS mode in
which each full MS scan (precursor ion selection scan range of m/z 350–1,800) was
followed by seven MS/MS scans where the seven most abundant peptide molecular
ions dynamically determined from the MS scan were selected for tandem MS using a
relative collision-induced dissociation (CID) energy of 35%. Dynamic exclusion was
utilized to minimize redundant selection of peptides for CID.
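The gradient described above can be checked with a short calculation. The sketch below is illustrative only: it assumes that the gradient starts at 0% mobile phase B (the starting composition is not stated explicitly in the text), and the function name is hypothetical. Under that assumption, a slope of 0.33% B/min over 130 min reaches about 42.9% B before the 15 min ramp to 95% B and the 15 min wash.

```python
def percent_b(t_min):
    """Return % mobile phase B at t_min minutes after the gradient starts."""
    slope = 0.33                  # % B per minute during the analytical ramp
    ramp_end = 130.0              # min, end of the analytical ramp
    flush_end = ramp_end + 15.0   # min, end of the ramp to 95% B
    wash_end = flush_end + 15.0   # min, end of the column wash at 95% B
    wash_b = 95.0                 # % B during the wash
    start_b = 0.0                 # ASSUMPTION: gradient starts at 0% B

    if t_min < 0 or t_min > wash_end:
        raise ValueError("time outside the modelled gradient (0-160 min)")
    ramp_top = start_b + slope * ramp_end
    if t_min <= ramp_end:
        return start_b + slope * t_min
    if t_min <= flush_end:
        fraction = (t_min - ramp_end) / (flush_end - ramp_end)
        return ramp_top + fraction * (wash_b - ramp_top)
    return wash_b


for t in (0, 65, 130, 145, 160):
    print(t, round(percent_b(t), 2))
```

Re-equilibration in mobile phase A after the wash is not modelled here.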
Peptide Identification and Spectral Count Analysis.
Tandem mass spectra were searched (Thermo Scientific BioWorks 3.3.1 software suite)
on a 72-node Beowulf cluster against the UniProt Homo sapiens proteome database
(October 2008 release, http://www.expasy.org) using SEQUEST (ThermoFisher
Scientific, Inc.). Additionally, peptides were searched for dynamic methionine
oxidation (+15.9949 Da) and cysteine carboxyamidomethylation (+57.0215 Da) modifications.
Peptides were considered legitimately identified if they achieved specific charge state
and proteolytic cleavage-dependent cross-correlation (Xcorr) scores of 1.9 for [M+H]1+,
2.2 for [M+2H]2+, and 3.5 for [M+3H]3+, and a minimum delta correlation score (ΔCn)
of 0.08. The false positive rate for this dataset is estimated to be less than 5%, in
accordance with probability-based evaluation of peptide and protein identifications from
tandem mass spectra and SEQUEST analysis of the human proteome [6]. Results were
further filtered using software developed in-house, and differences in protein
abundance between samples were derived by summing the total CID events that
resulted in a positively identified peptide for a given protein accession across all
samples (spectral counting) [7].
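The identification criteria and the spectral counting step can be sketched as follows. This is a minimal illustration, not the in-house software mentioned above: the record layout (dictionaries with `accession`, `charge`, `xcorr` and `delta_cn` keys), the function names and the example values are all hypothetical, and the cleavage-dependent part of the thresholds is not modelled. Only the Xcorr and ΔCn cut-offs are taken from the text.

```python
from collections import Counter

# Minimum Xcorr per precursor charge state and minimum delta Cn (from the text).
XCORR_MIN = {1: 1.9, 2: 2.2, 3: 3.5}
DELTA_CN_MIN = 0.08


def passes_filter(psm):
    """Return True if a peptide-spectrum match meets the stated score cut-offs."""
    threshold = XCORR_MIN.get(psm["charge"])
    if threshold is None:
        # ASSUMPTION: charge states without a stated threshold are rejected.
        return False
    return psm["xcorr"] >= threshold and psm["delta_cn"] >= DELTA_CN_MIN


def spectral_counts(psms):
    """Count accepted CID events (spectra) per protein accession."""
    return Counter(psm["accession"] for psm in psms if passes_filter(psm))


# Hypothetical peptide-spectrum matches, for illustration only.
example_psms = [
    {"accession": "PROT_A", "charge": 2, "xcorr": 2.8, "delta_cn": 0.21},
    {"accession": "PROT_A", "charge": 3, "xcorr": 3.1, "delta_cn": 0.30},  # Xcorr too low
    {"accession": "PROT_A", "charge": 1, "xcorr": 2.0, "delta_cn": 0.10},
    {"accession": "PROT_B", "charge": 2, "xcorr": 2.5, "delta_cn": 0.05},  # delta Cn too low
    {"accession": "PROT_B", "charge": 2, "xcorr": 2.2, "delta_cn": 0.08},
]
example_counts = spectral_counts(example_psms)
print(dict(example_counts))
```

Comparing these per-protein counts between samples is what gives the relative abundance estimate used in the remainder of the chapter.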
Western Blot Analysis
Primary antibodies used were mouse monoclonal anti-human apolipoprotein E, APOE
(Abcam) and phosphatidylinositol-glycan-specific phospholipase D, GPLD1 (Abcam).
Secondary antibody was horseradish peroxidase-conjugated goat anti-mouse IgG (L+H)
that was pre-absorbed with serum (Pierce). The serum samples (20 µg) were resolved
by 1D-PAGE (Invitrogen) and transferred to Immobilon-PSQ PVDF membranes
(Millipore) using the Invitrogen Xcell II Blot Module according to the manufacturer’s
protocol. Membranes were blocked with 0.5% low-fat milk (w/v) and incubated in
primary antibody (1:1000 dilution for α-APOE, 1:500 for α-GPLD1) for 16 h at 4 °C.
Membranes were washed with TBS with 0.1% Tween (TBST) and incubated with
secondary antibody (1:50,000 dilution) for 1 h at ambient temperature. After washing
with TBST, the membranes were incubated with either Dura or Femto Super Signal ECL
(ThermoFisher) for 2 or 5 min prior to chemiluminescent exposure.
Enzyme-linked immunosorbent assay (ELISA)
Serum plasminogen (PLG) level was quantified using the Plasminogen (Human) ELISA
kit (ALPCO Immunoassays, Salem, NH) according to the manufacturer’s protocol. The
assay was performed in duplicate.
Bioinformatic analysis
UniProt accessions corresponding to proteins identified by at least two peptides were
mapped to HUGO (HGNC) gene symbols using Ingenuity Pathway Analysis (IPA)
(Ingenuity® Systems, www.ingenuity.com). Protein localization and subtype
assignments were derived from IPA-mapped data sets. IPA and ProteinCenter (Thermo
Fisher Scientific) were used to retrieve information on protein annotation according
to cellular component, biological process and molecular function using default
parameters.
RESULTS AND DISCUSSION
Serum Proteomics of COPD patients
Serum is a biological material of high interest for proteomics studies,
especially clinical proteomics studies. Blood serum is readily accessible, in contrast to
tissue or organ biospecimens. Serum constantly perfuses all tissues of the body in a
dynamic exchange of molecules, making it a rich environment and therefore a
very attractive biological material for proteomics-based biomarker research [8-14].
Another advantage is that serum has a high protein content, i.e. 60–80 mg of
protein/mL [15]. The major protein constituents of serum include albumin, which
alone accounts for about 55% of total protein mass, immunoglobulins, transferrin,
haptoglobin, and lipoproteins [12, 15]. Thousands of different proteins are present in
serum, but their concentrations vary dramatically. In fact, the dynamic range of protein
concentration spans ten orders of magnitude, from albumin to interleukin-6
[12, 14]. Hence, the ability to detect proteins that are present at low concentrations is
critical to the discovery of new biomarkers of disease. While mass spectrometry is an
ideal tool to identify and quantify proteins in blood, abundant peptides may be
detected with greater intensity and therefore mask less abundant proteins [16]. To
overcome this issue and be able to investigate the low-abundance proteins, where
tissue leakage-derived proteins reside, serum samples were immunodepleted of the
14 most abundant proteins, which together account for about 94% of serum protein mass.
An additional N-deglycosylation step was also introduced to reduce sample
complexity and allow a higher number of confident identifications by mass
spectrometry. Samples were separated first by 1D-PAGE on a pre-cast 4-12%
gradient gel and later by LC coupled online to a LIT mass spectrometer. The main steps of
the whole workflow are shown in Figure VI.1.
Two biological replicates were constituted for each of the groups under analysis (Table
VI.1) and two injections corresponding to 6 µg total protein from each of the biological
replicates were analyzed by LC-LIT-MS/MS, resulting in the identification of a total of
33049 peptides and 2856 proteins (Supplemental Table VI.1, Supporting Information),
of which 929 were identified by at least two peptides across the samples under
analysis (Supplemental Table VI.2, Supporting Information).
Figure VI.1: Basic scheme of the methodology employed to study COPD patients’
proteome.
Protein digest equivalency was determined by comparing the total number of
peptides identified (total spectral counts) in each of the analytical samples, which
resulted in a calculated relative standard deviation (RSD) of 16.1%. This was also
evaluated through comparison of total peptides identified for chicken ovalbumin,
which was added in equal amounts to each of the analyzed samples according to the
workflow displayed in Figure VI.1. Calculated RSD for ovalbumin was 19.1%, which is
consistent with the RSD obtained for total spectral counts.
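For reference, the RSD used above is the standard deviation expressed as a percentage of the mean. The sketch below shows the calculation on hypothetical total spectral counts (the per-sample totals are not listed in this chapter); it assumes the sample standard deviation (n - 1 denominator), which may differ from the convention actually used.

```python
from statistics import mean, stdev


def relative_standard_deviation(values):
    """Return the RSD in percent: 100 * sample standard deviation / mean."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical total spectral counts for three analytical samples.
example_totals = [900, 1000, 1100]
example_rsd = relative_standard_deviation(example_totals)
print(round(example_rsd, 1))
```

The same function applies to the spectral counts of a spiked standard such as ovalbumin across samples.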
Cellular location and functional type of each of the proteins identified by two or more
peptides were obtained from Ingenuity pathway analysis (IPA) and full information is
available as Supplemental Table VI.3, Supporting Information. Cellular location of
proteins identified by at least two peptides is also displayed in Figure VI.2.
Figure VI.2: Cellular location of proteins identified by two or more peptides.
Interestingly, the largest slice of the chart represents proteins whose cellular location is
unknown (30%). This is a result of the lack of information acquired so far for a vast
number of proteins deposited in protein databases and reinforces the need for further
biochemical and proteomics studies to extend our current knowledge.
Among the annotated proteins, most were assigned to the
cytoplasm (26%).
Information on the profile of the number of transmembrane domains obtained for
proteins identified by at least two peptides is displayed in Supplemental Figure VI.1,
Supporting Information.
Comparative proteomics analysis of COPD patients
COPD patients’ samples were pooled into four different groups (Table VI.1) according
to each patient’s clinical data concerning chronic bronchitis and emphysema: a group of
patients that exhibited none of these two features prior to sample collection (Group
A); a group containing patients that have emphysema, but not chronic bronchitis
(Group B); a group of patients with chronic bronchitis but no emphysema (Group C);
and finally, a group of patients that possessed both chronic bronchitis and emphysema
preceding sample collection (Group D).
Using bioinformatics tools such as IPA (Ingenuity Systems) or ProteinCenter (ThermoFisher),
it was possible to identify proteins associated with diverse biological processes such as
cell communication, defense response, response to stimulus or response to wounding.
For instance, von Willebrand factor (VWF, UniProt acc: P04275), a glycoprotein that is
involved in the blood coagulation system, has been reported to be downregulated in
individuals with pulmonary adenocarcinoma [17]. In the present study, COPD patients with
both emphysema and chronic bronchitis showed underexpression of this protein
when compared to the other three groups under analysis.
N-methyl-D-aspartate receptor 2C subunit (NMDAR2C, also known as GRIN2C, UniProt
acc: O15398), which belongs to a class of ionotropic glutamate receptors, has been
associated with tobacco disorders and with dry cough. In fact, dextromethorphan, an
antagonist of the human GRIN2C protein, has been approved for the treatment of dry
cough (http://www.drugbank.ca/drugs/DB00514). We found GRIN2C underexpressed
in COPD patients with chronic bronchitis (group C) and in those with both chronic
bronchitis and emphysema (group D).
Analysis of variance (ANOVA) identified eighty-six proteins with statistically
significant (p<0.05) differences in spectral counts across the four groups
under analysis (Supplemental Table VI.4, Supporting Information). A supervised
hierarchical cluster analysis was performed on these 86 significantly differentially
expressed proteins (Figure VI.3).
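To make the ANOVA step concrete, the sketch below computes the one-way ANOVA F statistic for a single protein across the four groups, using only the Python standard library. As input it uses the PLG spectral counts listed in Table VI.2 (two pooled samples per group). The function name is hypothetical, the software actually used for the ANOVA in this study is not stated in this section, and converting F into a p-value requires the F distribution (for example from a statistics package), which is deliberately left out here.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: spread of the group means.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of samples around their group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# PLG spectral counts from Table VI.2 (samples A1/A2, B1/B2, C1/C2, D1/D2).
plg_counts = {"A": [87, 108], "B": [87, 79], "C": [78, 76], "D": [49, 40]}
f_stat, df_b, df_w = one_way_anova_f(list(plg_counts.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F indicates that the variation between group means is large relative to the variation among samples within a group. With only two samples per group, this calculation is illustrative rather than a reproduction of the reported analysis.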
Adapter protein CIKS isoform 3 (TRAF3IP2, UniProt acc: Q7Z6Q1) was found to be
overexpressed in COPD patients with emphysema (group B) and in patients with both
emphysema and chronic bronchitis (group D). TRAF3IP2 is involved in regulating
responses to cytokines by members of the Rel/NF-κB transcription factor family, which
play a central role in innate immunity in response to pathogens, inflammatory signals
and stress; it has also been implicated in airway hyperresponsiveness [18].
Figure VI.3: Hierarchical clustering exhibiting relative abundance of the eighty-six
significantly differentially expressed proteins across the four groups under analysis.
Plasminogen (PLG, UniProt acc: P00747) was found to be underexpressed in COPD
patients who suffer simultaneously from emphysema and chronic bronchitis (group D),
while it maintained about the same expression level across the other three groups of
COPD patients. Activation of the fibrinolytic system depends on the conversion of
the plasma zymogen, PLG, to the serine protease plasmin by the physiological
activators urokinase-type plasminogen activator (uPA) or tissue-type plasminogen
activator (tPA) [19]. Besides regulating vascular patency by degrading fibrin-containing
thrombi, this system has been reported to be involved in other functions with an impact
on important events such as embryogenesis, angiogenesis, tumor growth and dissemination,
and wound healing [19]. Moreover, impaired fibrinolytic activity is an underlying
feature in the development of pulmonary diseases [19], including COPD [20, 21].
Isoform 1 of phosphatidylinositol-glycan-specific phospholipase D (GPLD1, UniProt acc:
P80108) showed the same behavior described for PLG, i.e. it was found to be
underexpressed in COPD patients diagnosed with both emphysema and chronic bronchitis
and it showed the same expression level across the three remaining groups. GPLD1 is a
GPI-degrading enzyme that hydrolyzes the inositol phosphate linkage in proteins
anchored by phosphatidylinositol glycans, thereby releasing the attached protein from
the plasma membrane. One such protein is prostasin, a trypsin-like serine peptidase
highly expressed in prostate, bronchus, and kidney [22, 23]. It has been shown that
prostasin secretion depends on GPI anchor cleavage by endogenous GPLD1 [22]. Prostasin
is also known as channel-activating protease (CAP)-1 and was the first of several
membrane serine peptidases found to activate the epithelial sodium channel (ENaC) [24].
ENaC function is tightly regulated and is critical for maintaining salt and fluid balance
in the lung [25]. Prostasin has been reported to have a critical role in regulating
epithelial sodium transport in normal and pathological conditions in the lung, since it
is highly expressed in cystic fibrosis airways and is a strong
basal activator of ENaC in cystic fibrosis airway epithelial cells [26, 27].
Apolipoprotein E (APOE, UniProt acc: P02649) was found to be overexpressed in COPD
patients diagnosed with emphysema (group B), chronic bronchitis (group C) or
both (group D), when compared to COPD patients who were not
diagnosed with emphysema or chronic bronchitis (group A). APOE plays a role in
cholesterol metabolism and is linked to cardiovascular diseases [28], which
are highly prevalent comorbidities of COPD [29]. Recently, a 2D gel-based
proteomics study showed overexpression of plasma APOE in COPD
patients when compared with healthy controls [30]. Our data are consistent with this
finding and further suggest that overexpression of APOE may be associated with more
severe forms of COPD, i.e., those diagnosed with emphysema or chronic bronchitis.
Validation of protein abundances
Four proteins were selected for validation by western blot (WB), enzyme-linked
immunosorbent assay (ELISA) or selected reaction monitoring (SRM), a recent MS-based
strategy for robust quantification [31]. The selected proteins were PLG, APOE, GPLD1 and
FCGBP.
PLG underexpression in COPD patients suffering simultaneously from emphysema and
chronic bronchitis (group D) was successfully validated with a commercially available
ELISA kit (Table VI.2, Figure VI.4).
Table VI.2: ELISA determination of serum PLG in COPD patients.
Sample   Peptide spectral counts (discovery)   PLG protein concentration (ELISA), µg/mL
A1       87                                    145.5
A2       108                                   121.4
B1       87                                    137.3
B2       79                                    121.5
C1       78                                    137.7
C2       76                                    149.0
D1       49                                    94.9
D2       40                                    110.2
Figure VI.4: ELISA determination of serum plasminogen in COPD patients. X-axis
labels refer to the nomenclature displayed in Table VI.1.
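One simple way to quantify the agreement between the discovery data and the ELISA results in Table VI.2 is a Pearson correlation between the two columns, together with the mean ELISA concentration per group. The sketch below is an illustrative check, not an analysis reported in this study; the variable and function names are assumptions introduced for the example.

```python
from math import sqrt
from statistics import mean

# Values transcribed from Table VI.2.
samples = ["A1", "A2", "B1", "B2", "C1", "C2", "D1", "D2"]
spectral_counts = [87, 108, 87, 79, 78, 76, 49, 40]
elisa_ug_per_ml = [145.5, 121.4, 137.3, 121.5, 137.7, 149.0, 94.9, 110.2]

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def group_means(labels, values):
    """Average values per group, taking the first letter of each sample label."""
    groups = {}
    for label, value in zip(labels, values):
        groups.setdefault(label[0], []).append(value)
    return {g: mean(v) for g, v in groups.items()}

r = pearson_r(spectral_counts, elisa_ug_per_ml)
elisa_by_group = group_means(samples, elisa_ug_per_ml)
print(f"Pearson r = {r:.2f}")
for group, value in elisa_by_group.items():
    print(f"Group {group}: mean ELISA PLG = {value:.2f} ug/mL")
```

With the Table VI.2 values, group D has the lowest mean ELISA concentration, in line with its lower spectral counts in the discovery experiment.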
APOE and GPLD1 were evaluated by WB analysis, but neither could be validated. Densitometry
analysis of the APOE immunoreactive signal at the expected molecular mass (38 kDa)
showed no concordance with the MS spectral counts (Figure VI.5). One possible reason
is the presence of another variant of APOE below 28 kDa, which could have
contributed to the mismatch between the two techniques. The antibody against
GPLD1 gave no signal by western blot (data not shown).
Validation of FCGBP, as well as of the three aforementioned proteins, is in
progress by SRM analysis. The peptides selected for this validation are displayed in
Supplemental Table 5, Supporting Information.
Figure VI.5: Western blot analysis of APOE. Primary antibody dilution: 1:1,000.
Secondary antibody dilution: 1:50,000.
CONCLUSION
Serum is widely used for the diagnosis and follow-up of many
diseases. It is readily available and contains information derived from virtually every
part of the human body.
COPD is one of the leading causes of death in the world and has been the subject of
many studies. However, comprehensive studies of the serum proteome of COPD patients
have been lacking. In the present work, a powerful LC-MS approach was
employed for the first time to expand previous knowledge of the COPD serum proteome.
We identified a total of 33,049 peptides corresponding to 2,856 proteins, of
which 929 were identified by two or more peptides, across the four groups of COPD
patients under study, which were separated according to previous diagnosis of
emphysema and chronic bronchitis. However, many of the identified proteins still lack
crucial information on cellular location or biological function in protein
databases. This is a major limitation in proteomics research, especially in an era when
sophisticated mass spectrometers and powerful bioinformatics tools are at hand.
Nevertheless, we were able to find proteins related to biological functions that may
have an impact on the pathophysiology of COPD. These include TRAF3IP2, which is associated
with innate immunity in response to pathogens, inflammatory signals and stress and
has also been implicated in airway hyperresponsiveness; PLG, reported to be involved
in mechanisms of wound healing and in the development of pulmonary diseases such as
asthma, cystic fibrosis and COPD itself; GPLD1, a GPI-degrading enzyme associated with
cell homeostasis; and APOE, a protein of cholesterol metabolism related to
cardiovascular diseases. PLG differential abundance was successfully validated by
ELISA. Additional validations by SRM, a robust quantitative MS-based approach, are
in progress. Further, more focused studies on the relevant proteins
highlighted here are expected to provide new insights into COPD pathological
mechanisms and may yield therapeutic and/or diagnostic tools for COPD.
ACKNOWLEDGEMENTS
BMA would like to acknowledge the Center for Clinical Pharmacology, Department of
Medicine, University of Pittsburgh, for providing serum samples and patient clinical
data. BMA is the recipient of an FCT doctoral fellowship (SFRH/BD/31415/2006).
REFERENCES
[1] Global initiative for chronic obstructive lung disease 2010.
[2] Lopez, A. D., Shibuya, K., Rao, C., Mathers, C. D., et al., Chronic obstructive
pulmonary disease: current burden and future projections. Eur Respir J 2006, 27,
397-412.
[3] Chen, H., Wang, D., Bai, C., Wang, X., Proteomics-based biomarkers in chronic
obstructive pulmonary disease. Journal of proteome research 2010, 9, 2798-2808.
[4] Pinto-Plata, V., Toso, J., Lee, K., Park, D., et al., Profiling serum biomarkers in
patients with COPD: associations with clinical parameters. Thorax 2007, 62, 595-601.
[5] Gomes-Alves, P., Imrie, M., Gray, R. D., Nogueira, P., et al., SELDI-TOF biomarker
signatures for cystic fibrosis, asthma and chronic obstructive pulmonary disease. Clin
Biochem 2010, 43.
[6] Qian, W. J., Liu, T., Monroe, M. E., Strittmatter, E. F., et al., Probability-based
evaluation of peptide and protein identifications from tandem mass spectrometry and
SEQUEST analysis: the human proteome. Journal of proteome research 2005, 4, 53-62.
[7] Liu, H., Sadygov, R. G., Yates, J. R., 3rd, A model for random sampling and
estimation of relative protein abundance in shotgun proteomics. Anal Chem 2004, 76,
4193-4201.
[8] Adkins, J. N., Varnum, S. M., Auberry, K. J., Moore, R. J., et al., Toward a human
blood serum proteome: analysis by multidimensional separation coupled with mass
spectrometry. Mol Cell Proteomics 2002, 1, 947-955.
[9] Kennedy, S., Proteomic profiling from human samples: the body fluid alternative.
Toxicol Lett 2001, 120, 379-384.
[10] Schrader, M., Schulz-Knappe, P., Peptidomics technologies for human body fluids.
Trends Biotechnol 2001, 19, S55-60.
[11] Zhang, H., Liu, A. Y., Loriaux, P., Wollscheid, B., et al., Mass spectrometric
detection of tissue proteins in plasma. Mol Cell Proteomics 2007, 6, 64-71.
[12] Anderson, N. L., Anderson, N. G., The human plasma proteome: history, character,
and diagnostic prospects. Mol Cell Proteomics 2002, 1, 845-867.
[13] Anderson, N. L., The clinical plasma proteome: a survey of clinical assays for
proteins in plasma and serum. Clinical chemistry 2010, 56, 177-185.
[14] Surinova, S., Schiess, R., Huttenhain, R., Cerciello, F., et al., On the development of
plasma protein biomarkers. Journal of proteome research 2011, 10, 5-16.
[15] Burtis, C. A., Ashwood, E. R., Tietz Fundamentals of Clinical Chemistry, 5th Ed.,
W.B. Saunders Company, Philadelphia, PA 2001.
[16] Tucholska, M., Bowden, P., Jacks, K., Zhu, P., et al., Human serum proteins
fractionated by preparative partition chromatography prior to LC-ESI-MS/MS. Journal
of proteome research 2009, 8, 1143-1155.
[17] Stearman, R. S., Dwyer-Nield, L., Zerbe, L., Blaine, S. A., et al., Analysis of
orthologous gene expression between human pulmonary adenocarcinoma and a
carcinogen-induced murine model. The American journal of pathology 2005, 167,
1763-1775.
[18] Zhao, Z., Qian, Y., Wald, D., Xia, Y. F., et al., IFN regulatory factor-1 is required for
the up-regulation of the CD40-NF-kappa B activator 1 axis during airway inflammation.
J Immunol 2003, 170, 5674-5680.
[19] Castellino, F. J., Ploplis, V. A., Structure and function of the plasminogen/plasmin
system. Thromb Haemost 2005, 93, 647-654.
[20] Stewart, C. E., Sayers, I., Characterisation of urokinase plasminogen activator
receptor variants in human airway and peripheral cells. BMC Mol Biol 2009, 10, 75.
[21] Jiang, Y., Xiao, W., Zhang, Y., Xing, Y., Urokinase-type plasminogen activator
system and human cationic antimicrobial protein 18 in serum and induced sputum of
patients with chronic obstructive pulmonary disease. Respirology (Carlton, Vic.) 2010,
15, 939-946.
[22] Verghese, G. M., Gutknecht, M. F., Caughey, G. H., Prostasin regulates epithelial
monolayer function: cell-specific Gpld1-mediated secretion and functional role for GPI
anchor. Am J Physiol Cell Physiol 2006, 291, C1258-1270.
[23] Verghese, G. M., Tong, Z. Y., Bhagwandin, V., Caughey, G. H., Mouse prostasin
gene structure, promoter analysis, and restricted expression in lung and kidney.
American journal of respiratory cell and molecular biology 2004, 30, 519-529.
[24] Vallet, V., Chraibi, A., Gaeggeler, H. P., Horisberger, J. D., Rossier, B. C., An
epithelial serine protease activates the amiloride-sensitive sodium channel. Nature
1997, 389, 607-610.
[25] Schild, L., Kellenberger, S., Structure function relationships of ENaC and its role in
sodium handling. Advances in experimental medicine and biology 2001, 502, 305-314.
[26] Donaldson, S. H., Hirsh, A., Li, D. C., Holloway, G., et al., Regulation of the
epithelial sodium channel by serine proteases in human airways. The Journal of
biological chemistry 2002, 277, 8338-8345.
[27] Tong, Z., Illek, B., Bhagwandin, V. J., Verghese, G. M., Caughey, G. H., Prostasin, a
membrane-anchored serine peptidase, regulates sodium currents in JME/CF15 cells, a
cystic fibrosis airway epithelial cell line. American journal of physiology 2004, 287,
L928-935.
[28] Martins, I. J., Hone, E., Foster, J. K., Sunram-Lea, S. I., et al., Apolipoprotein E,
cholesterol metabolism, diabetes, and the convergence of risk factors for Alzheimer's
disease and cardiovascular disease. Mol Psychiatry 2006, 11, 721-736.
[29] Dalal, A. A., Shah, M., Lunacsek, O., Hanania, N. A., Clinical and economic burden
of patients diagnosed with COPD with comorbid cardiovascular disease. Respiratory
medicine 2011.
[30] Bandow, J. E., Baker, J. D., Berth, M., Painter, C., et al., Improved image analysis
workflow for 2-D gels enables large-scale 2-D gel-based proteomics studies--COPD
biomarker discovery study. Proteomics 2008, 8, 3030-3041.
[31] Picotti, P., Rinner, O., Stallmach, R., Dautel, F., et al., High-throughput generation
of selected reaction-monitoring assays for proteins and proteomes. Nature methods
2010, 7, 43-46.
COPD is currently the fourth leading cause of death in the world, and it is the
only major cause of death that has been increasing over the past decades while the
others have been decreasing. COPD also has a large economic impact, due not only to
hospitalization and healthcare-related costs but also to the cost of work absence,
both of patients and of their relatives. It has therefore been the subject
of many studies, and much has been learned in recent years. However, diagnosis
is still performed by measuring lung function through spirometry. Although many
lives could be saved by worldwide spirometry screening, patients are still
diagnosed solely by evaluation of their lung function, without
any screening of the biomolecules responsible for their disease status.
Many biological materials can be used to investigate biomarkers for
this disease. Although COPD is now known to have a systemic inflammation
component that affects other organs, it is in the lung that the
events leading to breathlessness take place. Investigating the lung directly is therefore
an optimal strategy for identifying proteins that may not be detectable
elsewhere, either because they are absent or because they are diluted to undetectable
concentrations. However, this requires lung tissue to be collected by biopsy, which is
an extremely invasive technique. Besides tissue biospecimens, other sources of
biological material used to study COPD are biofluids, which include sputum,
bronchoalveolar and nasal lavage fluid, exhaled breath condensate, and blood.
Proteomics can provide large-scale information and consequently
has the potential to expand previous knowledge of COPD. Surprisingly, given the need for
new biomarkers in COPD and the power of proteomics, this approach has been largely
neglected to date. Hence, in this work we combined different biospecimens and
proteomics methodologies to provide new insight into the disease.
It had previously been observed by microscopy that red blood cells (RBCs) from
COPD patients show deformations in their shape. RBCs are crucial for the transport of
oxygen from the lungs to the cells, and this transport depends on their ability to
change shape rapidly while navigating through blood vessels. In addition, RBCs play a
crucial role in antioxidant defense against oxidative stress, which has
long been recognized as a feature of COPD. In this work we made use of a membrane
fractionation procedure, stable isotope labeling and a high-resolution Fourier
transform ion cyclotron resonance (FT-ICR) mass spectrometer. Chorein, or
vacuolar protein sorting-associated protein 13A (VPS13A), is reported to play a role in
cytoskeleton organization and has been associated with thorny deformations of
circulating erythrocytes, possibly due to deformation of the red cell membrane. This
protein was found by MS to be underexpressed in COPD patients when compared to controls,
and this underexpression was confirmed by WB. Consequently, underexpression of
chorein may play an important role in the deformation of COPD RBCs. Many other
interesting proteins were identified in the context of COPD and, additionally,
a considerable number of proteins were described in RBCs for the first time.
To overcome the difficulty of acquiring fresh biopsies from well-characterized patients,
our laboratory has established a procedure to collect nasal epithelial cells.
Previous work has shown that these cells behave similarly to
the epithelial cells of the lower airway. Two types of studies were performed
on these cells: a study of the effects of cigarette smoke, which is the
main risk factor for COPD, and a comparison between COPD patients and healthy
individuals. Both were pioneering studies, since the nasal epithelial cell proteome of
cigarette smokers or COPD patients had not previously been assessed. Moreover, both
studies employed a high-resolution mass analyzer, the Orbitrap, which increases
the number of confident peptide/protein identifications. In the study on the effect of
cigarette smoking, ninety-six proteins were found to be differentially expressed
between the proteomes of healthy smokers and nonsmokers. These proteins were
related to antigen presentation, cell-to-cell signaling and interaction, cell
morphology, drug metabolism, DNA repair, energy production and mitochondrial
dysfunction. Although further orthogonal validation is required, our data were consistent
with previous evidence of differential modulation of CD44, MUC5AC and SOD2 in
smokers due to inflammatory response pathways. When the nasal epithelial
cell proteome of COPD patients was compared with that of healthy individuals, previous
evidence that the unfolded protein response (UPR) is activated in COPD patients was
supported, since we observed
overexpression of a considerable number of proteins belonging to different protein
complexes involved in UPR. These include VCP, both components of
the Hsp10/Hsp60 chaperone complex (HSPD1 and HSPE1), CALR and two members of a
large ER-localized multiprotein complex of at least 11 proteins, PPIB and ERP29. We
also observed increased expression of proteins related to the Nrf2-mediated oxidative
stress response, such as GSTP1, TXNRD1 and GSR. Finally, we also report an increase in
drug metabolism, as all significantly differentially expressed proteins related to this
biofunction were overexpressed in COPD: GSTP1, GSR, AKR1C3 and ANXA2. These data
need further validation by orthogonal methods before the activation of UPR and the
Nrf2-mediated oxidative stress response and the increase in drug metabolism in the
nasal epithelial cells of COPD patients can be fully confirmed.
Serum collected from COPD patients was divided into four groups covering all
combinations of presence/absence of the two main features of COPD, chronic
bronchitis and emphysema, to study their impact on the serum proteome. Because serum is
a complex protein mixture, it was first immunodepleted of its most abundant
proteins, which comprise about 94% of total protein content, before being analyzed by
GeLC-MS/MS. This powerful strategy identified as many as 2,856 proteins, of
which 929 were identified by two or more peptides. Plasminogen was found to be
underexpressed in COPD patients who suffer simultaneously from emphysema and
chronic bronchitis, while it maintained about the same expression level across the other
three groups of COPD patients; this differential expression was successfully
validated by ELISA. It was possible to identify other interesting proteins, such as
TRAF3IP2, which is associated with innate immunity in response to pathogens, inflammatory
signals and stress and has also been implicated in airway hyperresponsiveness, and
isoform 1 of phosphatidylinositol-glycan-specific phospholipase D (GPLD1), a
GPI-degrading enzyme described as responsible for the secretion of prostasin,
which was the first of several membrane serine peptidases found to activate the
epithelial sodium channel (ENaC). Prostasin was also reported to have a critical role
in regulating epithelial sodium transport in normal and pathological conditions in the
lung.
The work presented here confirmed a few findings that had already been reported
and at the same time revealed many new possibilities for disease mechanisms and
for new biomarkers. Some data require further validation by orthogonal techniques,
but this work has shed light on proteins and even
processes that had not previously been associated with COPD. This work also emphasizes
the value of nasal epithelial cells in the investigation of COPD pathogenesis, since they
can lead to the identification of new candidate biomarkers for this disease.
REFERENCES
[1] Petty, T. L., The history of COPD. International journal of chronic obstructive
pulmonary disease 2006, 1, 3-14.
[2] Tiffeneau, R., Pinelli, [Not Available]. Paris Med 1947, 37, 624-628.
[3] Gaensler, E. A., Air velocity index; a numerical expression of the functionally
effective portion of ventilation. Am Rev Tuberc 1950, 62, 17-28.
[4] Briscoe, W. A., Nash, E. S., The Slow Space in Chronic Obstructive Pulmonary
Diseases. Annals of the New York Academy of Sciences 1965, 121, 706-722.
[5] Viegi, G., Pistelli, F., Sherrill, D. L., Maio, S., et al., Definition, epidemiology and
natural history of COPD. Eur Respir J 2007, 30, 993-1013.
[6] Vestbo, J., Lange, P., Can GOLD Stage 0 provide information of prognostic value in
chronic obstructive pulmonary disease? American journal of respiratory and critical
care medicine 2002, 166, 329-332.
[7] Barnes, P. J., Shapiro, S. D., Pauwels, R. A., Chronic obstructive pulmonary disease:
molecular and cellular mechanisms. Eur Respir J 2003, 22, 672-688.
[8] Barnes, P. J., Stockley, R. A., COPD: current therapeutic interventions and future
approaches. Eur Respir J 2005, 25, 1084-1106.
[9] Global initiative for chronic obstructive lung disease 2010.
[10] Lopez, A. D., Shibuya, K., Rao, C., Mathers, C. D., et al., Chronic obstructive
pulmonary disease: current burden and future projections. Eur Respir J 2006, 27,
397-412.
[11] World health organization, Geneva 2000.
[12] European Respiratory Society and European Lung Foundation 2003.
[13] Jemal, A., Ward, E., Hao, Y., Thun, M., Trends in the leading causes of death in the
United States, 1970-2002. JAMA 2005, 294, 1255-1259.
[14] Pauwels, R. A., Rabe, K. F., Burden and clinical features of chronic obstructive
pulmonary disease (COPD). Lancet 2004, 364, 613-620.
[15] 2007.
[16] Novos Dados da DPOC em Portugal (BOLD initiative), Lisbon 2010.
[17] in: Schraufnagel, D. E. (Ed.), American Thoracic Society 2010.
[18] Eriksen, D. J. M. D. M. (Ed.), The Tobacco Atlas, World Health Organization 2002.
[19] Lokke, A., Lange, P., Scharling, H., Fabricius, P., Vestbo, J., Developing COPD: a 25
year follow up study of the general population. Thorax 2006, 61, 935-939.
[20] Stoller, J. K., Aboussouan, L. S., Alpha1-antitrypsin deficiency. Lancet 2005, 365,
2225-2236.
[21] Molfino, N. A., Current thinking on genetics of chronic obstructive pulmonary
disease. Curr Opin Pulm Med 2007, 13, 107-113.
[22] Molfino, N. A., Genetics of COPD. Chest 2004, 125, 1929-1940.
[23] Molfino, N. A., Coyle, A. J., Gene-environment interactions in chronic obstructive
pulmonary disease. International journal of chronic obstructive pulmonary disease
2008, 3, 491-497.
[24] Wood, A. M., Stockley, R. A., The genetics of chronic obstructive pulmonary
disease. Respiratory research 2006, 7, 130.
[25] Heaney, L. G., Lindsay, J. T., McGarvey, L. P., Inflammation in chronic obstructive
pulmonary disease: implications for new treatment strategies. Curr Med Chem 2007,
14, 787-796.
[26] Hogg, J. C., Chu, F., Utokaparch, S., Woods, R., et al., The nature of small-airway
obstruction in chronic obstructive pulmonary disease. The New England journal of
medicine 2004, 350, 2645-2653.
[27] Lapperre, T. S., Postma, D. S., Gosman, M. M., Snoeck-Stroband, J. B., et al.,
Relation between duration of smoking cessation and bronchial inflammation in COPD.
Thorax 2006, 61, 115-121.
[28] Lapperre, T. S., Sont, J. K., van Schadewijk, A., Gosman, M. M., et al., Smoking
cessation and bronchial epithelial remodelling in COPD: a cross-sectional study.
Respiratory research 2007, 8, 85.
[29] Roth, M., Pathogenesis of COPD. Part III. Inflammation in COPD. Int J Tuberc Lung
Dis 2008, 12, 375-380.
[30] Barnes, P. J., Mediators of chronic obstructive pulmonary disease.
Pharmacological reviews 2004, 56, 515-548.
[31] Barnes, P. J., The cytokine network in chronic obstructive pulmonary disease.
American journal of respiratory cell and molecular biology 2009, 41, 631-638.
[32] Mak, J. C., Pathogenesis of COPD. Part II. Oxidative-antioxidative imbalance. Int J
Tuberc Lung Dis 2008, 12, 368-374.
[33] MacNee, W., Pathogenesis of chronic obstructive pulmonary disease. Proceedings
of the American Thoracic Society 2005, 2, 258-266; discussion 290-251.
[34] Rahman, I., Adcock, I. M., Oxidative stress and redox regulation of lung
inflammation in COPD. Eur Respir J 2006, 28, 219-242.
[35] MacNee, W., Oxidative stress and lung inflammation in airways disease. European
journal of pharmacology 2001, 429, 195-207.
[36] Agusti, A., Soriano, J. B., COPD as a systemic disease. Copd 2008, 5, 133-138.
[37] Gan, W. Q., Man, S. F., Senthilselvan, A., Sin, D. D., Association between chronic
obstructive pulmonary disease and systemic inflammation: a systematic review and a
meta-analysis. Thorax 2004, 59, 574-580.
[38] Mannino, D. M., Buist, A. S., Global burden of COPD: risk factors, prevalence, and
future trends. Lancet 2007, 370, 765-773.
[39] Chapman, K. R., Mannino, D. M., Soriano, J. B., Vermeire, P. A., et al.,
Epidemiology and costs of chronic obstructive pulmonary disease. Eur Respir J 2006,
27, 188-207.
[40] Hamdan, M., Righetti, P. G., Proteomics today: protein assessment and biomarkers
using mass spectrometry, 2D electrophoresis, and microarray technology, John Wiley &
Sons, Inc., Hoboken, NJ 2005.
[41] Thomson, J. J., Rays Of Positive Electricity and Their Application to Chemical
Analysis, Longman's Green and Company, London 1913.
[42] Yamashita, M., Fenn, J. B., Electrospray Ion Source. Another Variation on the Free-
Jet Theme. J. Phys. Chem. 1984, 88, 4451-4459.
[43] Karas, M., Hillenkamp, F., Laser desorption ionization of proteins with molecular
masses exceeding 10,000 daltons. Analytical chemistry 1988, 60, 2299-2301.
[44] Tanaka, K., Waki, H., Ido, Y., Akita, S., et al., Protein and polymer analyses up to
m/z 100 000 by laser ionization time-of-flight mass spectrometry. Rapid
Communications in Mass Spectrometry 1988, 2, 151-153.
[45] O'Farrell, P. H., High resolution two-dimensional electrophoresis of proteins. The
Journal of biological chemistry 1975, 250, 4007-4021.
[46] Penque, D., Two-dimensional gel electrophoresis and mass spectrometry for
biomarker discovery. PROTEOMICS – Clinical Applications 2009, 3, 155-172.
[47] Washburn, M. P., Wolters, D., Yates, J. R., 3rd, Large-scale analysis of the yeast
proteome by multidimensional protein identification technology. Nature biotechnology
2001, 19, 242-247.
[48] Nesvizhskii, A. I., 2006, pp. 87-119.
[49] Aebersold, R., Mann, M., Mass spectrometry-based proteomics. Nature 2003, 422,
198-207.
[50] Nilsson, T., Mann, M., Aebersold, R., Yates, J. R., 3rd, et al., Mass spectrometry in
high-throughput proteomics: ready for the big time. Nature methods 2010, 7, 681-685.
[51] Ong, S. E., Mann, M., Stable isotope labeling by amino acids in cell culture for
quantitative proteomics. Methods in molecular biology (Clifton, N.J.) 2007, 359, 37-52.
[52] Gygi, S. P., Rist, B., Gerber, S. A., Turecek, F., et al., Quantitative analysis of
complex protein mixtures using isotope-coded affinity tags. Nature biotechnology
1999, 17, 994-999.
[53] Shadforth, I. P., Dunkley, T. P., Lilley, K. S., Bessant, C., i-Tracker: for quantitative
proteomics using iTRAQ. BMC genomics 2005, 6, 145.
[54] Ye, X., Luke, B., Andresson, T., Blonder, J., 18O stable isotope labeling in MS-based
proteomics. Briefings in functional genomics & proteomics 2009, 8, 136-144.
[55] Chen, H., Wang, D., Bai, C., Wang, X., Proteomics-based biomarkers in chronic
obstructive pulmonary disease. Journal of proteome research 2010, 9, 2798-2808.
[56] Ohlmeier, S., Vuolanto, M., Toljamo, T., Vuopala, K., et al., Proteomics of Human
Lung Tissue Identifies Surfactant Protein A as a Marker of Chronic Obstructive
Pulmonary Disease. Journal of proteome research 2008.
[57] Lee, E. J., In, K. H., Kim, J. H., Lee, S. Y., et al., Proteomic analysis in lung tissue of
smokers and COPD patients. Chest 2009, 135, 344-352.
[58] Merkel, D., Rist, W., Seither, P., Weith, A., Lenter, M. C., Proteomic study of
human bronchoalveolar lavage fluids from smokers with chronic obstructive
pulmonary disease by combining surface-enhanced laser desorption/ionization-mass
spectrometry profiling with mass spectrometric protein identification. Proteomics
2005, 5, 2972-2980.
[59] Gray, R. D., MacGregor, G., Noble, D., Imrie, M., et al., Sputum proteomics in
inflammatory and suppurative respiratory diseases. American journal of respiratory
and critical care medicine 2008, 178, 444-452.