
CHAPTER 1

INTRODUCTION

1.1 GENERAL INTRODUCTION

Medicinal chemistry is the science that deals with the discovery and design of new therapeutic chemicals and their development into useful medicines. It had its beginning when chemists, pharmacists and physicians isolated and purified the active principles of plant and animal tissues and, later, of microorganisms and their fermentation products. During the latter decades of the 20th century, the traditional dividing lines between the biological, chemical and physical sciences were erased, and new borderline fields such as molecular biology, molecular pharmacology and biomedicine began to capture the interest of medicinal scientists. Medicinal chemistry, which had its roots in organic chemistry, biology and some areas of physics, extended new roots into these emerging topics [1].

Drug design is a multidisciplinary activity involving chemists, biologists, biochemists, pharmacologists and many others. The chemist's role is central in inventing new compounds that exert a beneficial effect. However, once a lead for a new active drug has been established, toxicological studies are undertaken to demonstrate its safety and efficacy before clinical trials commence.

With the accidental discovery of penicillin came the screening of microorganisms and the discovery of a large number of antibiotics from bacterial and fungal sources. Many of these antibiotics provide the prototype structure that the medicinal chemist modifies to obtain antibacterial drugs with a better therapeutic profile. Thousands of chemicals are prepared annually throughout the world, and many of them are entered into pharmacological screening to determine whether they have useful biological activity. This process of random screening has been considered inefficient, but it has resulted in the identification of new prototype compounds, whose structures have been optimized with the help of clinical observation of the pharmacological behavior of existing drugs.

The term "drug design" mainly represents efforts to develop new drugs on a rational basis. The various approaches used in drug design include:

(i) Random screening of synthetic compounds or chemicals and natural products by bioassay procedures.

(ii) Preparation of novel compounds based on the known structures of biologically active natural substances of plant and animal origin, i.e., the lead skeleton.

(iii) Preparation of structural analogues of the lead with increased biological activity.

(iv) Application of the bio-isosteric principle.

In the course of drug design, the two major types of chemical modification are achieved through the formation of analogues and prodrugs. An analogue is normally accepted as being a modification that brings about a carbon-skeletal transformation or substituent synthesis, e.g., oxytetracycline and demeclocycline with regard to tetracycline. The term prodrug is applied to an appropriate derivative of a drug that undergoes in vivo hydrolysis to the parent drug, e.g., testosterone propionate and chloramphenicol palmitate.

More recently, automated high-throughput screening (HTS) systems utilizing cell cultures with linked enzyme assays and receptor molecules derived from gene cloning have greatly increased the efficiency of random screening. It is now practical to screen enormous libraries of peptides and nucleic acids obtained from combinatorial chemistry procedures.

Rational design is another approach that is also flourishing. Significant advances in X-ray crystallography and NMR have made it possible to obtain detailed representations of enzymes and other drug receptors. The techniques of molecular graphics and computational chemistry have provided novel chemical structures that have led to new drugs with potent medicinal activities. The development of HIV protease inhibitors and ACE inhibitors came from an understanding of the geometric and chemical character of the respective enzyme's active site. Even if the receptor structure is not known in detail, rational approaches based on the physicochemical properties of lead compounds can provide new drugs.

1.2 NUCLEUS INTRODUCTION

1.2.1 Gallic Acid

Chemical Name: 3,4,5-Trihydroxybenzoic acid

[Structure of gallic acid]

Molecular Formula : C6H2(OH)3COOH

Molecular weight : 188.14 (monohydrate; the anhydrous acid is 170.12)

Description : White or pale powder

Solubility : Soluble in acetone and ethyl acetate but insoluble in benzene.

Category : Antioxidant
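As a quick arithmetic check on the molecular weight quoted above, the following minimal Python sketch sums rounded standard atomic weights for the formula C6H2(OH)3COOH (C7H6O5). The helper name `molecular_weight` and the atomic-weight table are illustrative assumptions, not part of the original work. The result suggests that 188.14 corresponds to the monohydrate, whereas the anhydrous acid is about 170.12.

```python
# Rounded standard atomic weights (g/mol); common reference values.
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "O": 15.999}


def molecular_weight(composition):
    """Sum atomic weights for a composition such as {"C": 7, "H": 6, "O": 5}."""
    return sum(ATOMIC_WEIGHT[element] * count
               for element, count in composition.items())


gallic_acid = {"C": 7, "H": 6, "O": 5}  # C6H2(OH)3COOH
water = {"H": 2, "O": 1}

mw_anhydrous = molecular_weight(gallic_acid)
mw_monohydrate = mw_anhydrous + molecular_weight(water)

print(f"anhydrous:   {mw_anhydrous:.2f} g/mol")
print(f"monohydrate: {mw_monohydrate:.2f} g/mol")
```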

Gallic acid is a naturally occurring polyphenol. It is known chemically as 3,4,5-trihydroxybenzoic acid and is commonly obtained as the monohydrate. It is prepared by the hydrolysis of tannic acid with sulphuric acid. Gallic acid is found in almost all plants. Plants and plant products known for their high gallic acid content include:

Areca nut (Areca catechu),

Barberry (Berberis vulgaris),

Blackberry (Rubus argutus),

Hot chocolate,

Common walnut (Juglans regia),

Mango peels and leaves (Mangifera indica),

Indian gooseberry (Phyllanthus emblica),

Clove (Syzygium aromaticum),

Golden root (Rhodiola rosea),

Witch hazel (Hamamelis virginiana).

It is widespread in plant foods and beverages such as tea and wine and has been reported to be one of the anti-carcinogenic polyphenols present in green tea. In France, the consumption of a diet high in saturated fat together with gallic acid is apparently associated with a low incidence of coronary heart disease [2]. It is a strong natural antioxidant that is able to scavenge hypochlorous acid and also decreases the peroxidation of brain phospholipids. The antioxidant capacity of galloyl esters against hydroxyl, azide and superoxide radicals has been reported [3]. Gallic acid acts as an antioxidant and helps to protect cells against oxidative damage. It was found to show cytotoxicity against cancer cells without harming healthy cells [4]. It is also present in red wine and has been found to have a protective role against the oxidation of low-density lipoproteins (LDL) [5].

Synthesis of gallic acid

1. With the elaboration of high-yielding, high-titer synthesis of

3-Dehydroshikimic acid from glucose using recombinant

Escherichia coli, oxidation of this hydroaromatic becomes a

potential route for synthesis of gallic acid. Conversion of

3-Dehydroshikimic acid into gallic acid likely proceeds via initial

enolization of an α-hydroxycarbonyl and oxidation of the resulting

enediol. 3-Dehydroshikimate enolization in water was catalyzed by

inorganic phosphate while Zn2+ was used to catalyze enolization in

acetic acid. Enediol oxidation employed Cu2+ as either the

stoichiometric oxidant or as a catalyst in the presence of a

co-oxidant. Gallic acid was produced in a yield of 36% when

3-Dehydroshikimic acid in phosphate-buffered water reacted for 35 h

with H2O2 and catalytic amounts of CuSO4 [6].

[Scheme: 3-Dehydroshikimic acid → Gallic acid (PO4 / ZnO)]

2. Gallic acid may result from the dehydration of 3-Dehydroshikimic

acid followed by hydroxylation of the intermediate protocatechuic

acid. 3-Dehydroshikimic acid is obtained from erythrose-4-phosphate

[7].

[Scheme: Erythrose-4-phosphate → Gallic acid; (i) Dehydration, (ii) Hydroxylation]

3. In tea seedlings, flavan-3-ols are produced by a naringenin-chalcone → naringenin → dihydrokaempferol pathway. Dihydrokaempferol is a branch point in the synthesis of (−)-epigallocatechin-3-O-gallate and other flavan-3-ols, which can be formed by routes beginning with either a flavonoid 3′-hydroxylase-mediated conversion of the dihydroflavonol to dihydroquercetin or a flavonoid 3′,5′-hydroxylase-catalysed conversion to dihydromyricetin, with subsequent steps involving sequential reactions catalysed by dihydroflavonol 4-reductase, anthocyanidin synthase, anthocyanidin reductase and flavan-3-ol gallate synthase [8].

[Scheme: Naringenin → Gallic acid via the dihydrokaempferol pathway]

4. A series of reactions was elaborated for the transformation of 1,2,3-trimethoxybenzene into gallic acid. The intermediate 1-bromo-3,4,5-trimethoxybenzene was prepared by nitration, reduction, diazotization and decomposition of the diazonium salt in the presence of cuprous bromide. Halogen-metal exchange of the aryl bromide with butyllithium, decomposition of the intermediate lithio derivative with CO2 and demethylation led to gallic acid [9].

[Scheme: 1,2,3-Trimethoxybenzene → Gallic acid (multistep reaction)]

5. In plants, gallic acid is obtained by hydrolysis of tannins [10, 11].

[Scheme: Tannin → Gallic acid (hydrolysis, acid / alkali)]

Some natural products possessing the gallic acid nucleus and its derivatives are

[Structures: Mescaline, Myricetin, Sinapyl alcohol, Gallocatechin gallate, Ellagic acid, Epicatechin gallate]

1.2.2 Thiazolidinones

Chemical Name: 4 – Oxothiazolidine.

Thiazolidinones are derivatives of thiazolidine, which belongs to an important group of heterocyclic compounds containing sulfur and nitrogen in a five-membered ring. A great deal of research on thiazolidinones has been done in the past. The nucleus is also known as a wonder nucleus because it gives rise to derivatives with many different types of biological activity. The 3-unsubstituted-4-thiazolidinones are usually solids that often melt with decomposition, but the attachment of an alkyl group to the nitrogen lowers the melting point. The 4-thiazolidinones that do not contain aryl or higher alkyl substituents are somewhat soluble in water.

Thiazolidinones are reported to possess a variety of pharmacological activities such as anti-HIV [12-14], anticancer [15, 16], anticonvulsant [17], anti-inflammatory [18], antimicrobial [19, 20] and follicle stimulating hormone (FSH) receptor agonist activity [21].

[Structure of 4-thiazolidinone]

Some of the drugs containing a thiazolidine-derived nucleus are ampicillin (a thiazolidine ring fused to a β-lactam) and pioglitazone (a thiazolidine-2,4-dione).

[Structures: Ampicillin, Pioglitazone]

1.2.3 Azetidinones

[Structure of azetidin-2-one]

Chemical Name: Azetidin-2-one

The name lactam is given to cyclic amides. In older nomenclature, the second carbon in an aliphatic carboxylic acid was designated α, the third β, and so on. Thus a β-lactam is a cyclic amide with four atoms in its ring. The contemporary name for this ring system is azetidinone. β-Lactam came to be a generic descriptor for the penicillin family. The ring ultimately proved to be the main component of the pharmacophore, so the term possesses medicinal as well as chemical significance.

The chemistry of β-lactams has taken an important place in organic chemistry since the discovery of penicillin by Sir Alexander Fleming in 1928 and, subsequently, of the cephalosporins, both of which were used as successful antibiotics. The 2-azetidinone (β-lactam) ring is a common structural feature of a number of broad-spectrum β-lactam antibiotics including penicillins, cephalosporins, carbapenems, nocardicins and monobactams, which have been widely used as chemotherapeutic agents to treat bacterial infections and microbial diseases. These molecules operate by forming a covalent adduct with membrane-bound bacterial transpeptidases, also known as penicillin-binding proteins (PBPs), which are involved in the biosynthesis of the cell wall [22]. Apart from antibiotic activity, β-lactams also possess cholesterol absorption inhibitory [23], antithrombotic [24], antiviral [25] and antifungal activities [26].

Some of the drugs with the azetidinone nucleus are

[Structures: Ezetimibe, Methicillin]

1.3 BIOLOGICAL ACTIVITIES

Biological screening is an important part of any drug research, and modification of the pharmacophore is an important part of drug design. When a drug is a complex chemical mixture, its biological activity is exerted by the substance's active ingredient or pharmacophore but can be modified by other constituents. Activity is generally dose-dependent, and it is not uncommon for one substance to have effects ranging from beneficial to adverse when going from low to high doses.

1.3.1 Antimicrobial activity

Microbial infections cause many diseases such as pneumonia, meningitis, bacteraemia, otitis media, sinusitis, tuberculosis, plague, pertussis, cholera, diphtheria, tetanus, leprosy and leptospirosis. The upsurge of widespread multi-drug-resistant microorganisms such as Bacillus subtilis, Staphylococcus aureus, Streptococcus mutans, Escherichia coli, Klebsiella pneumoniae and Pseudomonas aeruginosa has been reported as a major threat to human health. In view of this resistance to drugs currently in use and the emergence of new diseases, there is a continuous need for the synthesis of new organic compounds as potential antimicrobial agents using a fast and efficient approach.

Fungal infections also cause a number of diseases such as athlete's foot, candidiasis, mycosis, tinea, white nose syndrome and zeaspora. Primary and opportunistic fungal infections continue to increase rapidly because of the increased number of immunocompromised patients. Not only does the biochemical similarity of human and fungal cells form a handicap for selective activity, but easily acquired resistance is also a main problem encountered in developing safe and efficient antifungals. The ideal antifungal agent should be fungicidal with a broad spectrum of activity, be suitable for oral or intravenous administration and possess good pharmacodynamic properties without development of resistance during therapy. At present, none of the clinically used drugs satisfies all these criteria, so there is a need to develop new antifungal drugs [27].

1.3.2 Antioxidant activity

Antioxidant compounds in food play an important role as a health

protecting factor. Scientific evidence suggests that antioxidants reduce the risk

for chronic diseases including cancer and heart disease. Primary sources of

naturally occurring antioxidants are whole grains, fruits and vegetables. Plant

sourced food antioxidants like vitamin C, vitamin E, carotenes, phenolic acids,

phytate and phytoestrogens have been recognized as having the potential to

reduce disease risk. Most of the antioxidant compounds in a typical diet are

derived from plant sources and belong to various classes of compounds with a

wide variety of physical and chemical properties. Some compounds, such as

gallates, have strong antioxidant activity, while others, such as mono-phenols

are weak antioxidants. The main characteristic of an antioxidant is its ability to

trap free radicals. Highly reactive free radicals and oxygen species are present

in biological systems from a wide variety of sources. These free radicals may

oxidize nucleic acids (including DNA), proteins or lipids and can initiate degenerative

disease. Antioxidant compounds like phenolic acids, polyphenols and

flavonoids scavenge free radicals such as peroxide, hydroperoxide or lipid

peroxide and thus inhibit the oxidative mechanisms that lead to degenerative

diseases [28].

There are a number of clinical studies suggesting that the antioxidants

in fruits, vegetables, tea and red wine are the main factors for the observed

efficacy of these foods in reducing the incidence of chronic diseases including

heart disease and some cancers. The free radical scavenging activity of

antioxidants in food materials has been substantially investigated and reported

in the literature [29, 30]. Various antioxidant activity methods have been used

to monitor and compare the antioxidant activity of food. In recent years,

oxygen radical absorbance capacity assays and enhanced chemiluminescence

assays have been used to evaluate antioxidant activity of foods, serum and

other biological fluids. These methods require special equipment and technical

skills for the analysis. The different types of methods published in the literature

for the determinations of antioxidant activity of foods involve electron spin

resonance (ESR) and chemiluminescence methods. These analytical methods

measure the radical-scavenging activity of antioxidants against free radicals

like the 1,1-diphenyl-2-picrylhydrazyl (DPPH) radical, the superoxide anion

radical (O2•−), the hydroxyl radical (•OH) or the peroxyl radical (ROO•).

The various methods used to measure antioxidant activity of food

products can give varying results depending on the specific free radical being

used as a reactant. There are other methods which determine the resistance of

lipid or lipid emulsions to oxidation in the presence of the antioxidant being

tested. The malondialdehyde (MDA) or thiobarbituric acid-reactive-substance

(TBARS) assays have been used extensively since the 1950s to estimate the

peroxidation of lipids in membrane and biological systems. These methods are

time consuming, because they depend on the oxidation of a substrate which is

influenced by temperature, pressure, matrix etc. and may not be practical when

large numbers of samples are involved.

Antioxidant activity methods using free radical traps are relatively

straightforward to perform. The ABTS [2,2′-azinobis(3-ethylbenzothiazoline-6-sulfonic acid)] radical cation [30] has been used to screen the relative radical-scavenging abilities of flavonoids and phenolics. The Oxygen Radical Absorbance Capacity (ORAC) procedure for determining the antioxidant capacity of fruits and vegetables has also been reported [31]. Phenolic and polyphenolic

compounds constitute the main class of natural antioxidants present in plants,

food and beverages and are usually quantified employing Folin’s reagent. A

rapid, simple and inexpensive method to measure antioxidant capacity of food

involves the use of the free radical, DPPH [32].
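The DPPH assay is commonly evaluated as the percentage decrease in absorbance of the DPPH solution (typically read near 517 nm) on adding the test sample. The source does not state this formula, so the sketch below only illustrates that commonly used expression; the function name and the absorbance values are hypothetical.

```python
def dpph_scavenging_percent(abs_control, abs_sample):
    """Percent DPPH radical scavenging from two absorbance readings.

    abs_control: absorbance of the DPPH solution without antioxidant
    abs_sample:  absorbance of the DPPH solution with the test sample
    """
    if abs_control <= 0:
        raise ValueError("control absorbance must be positive")
    return (abs_control - abs_sample) / abs_control * 100.0


# Hypothetical readings, for illustration only.
scavenging = dpph_scavenging_percent(abs_control=0.80, abs_sample=0.20)
print(f"DPPH scavenging: {scavenging:.1f} %")
```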

1.3.3 Antitubercular activity

Tuberculosis (TB) is a common and often deadly infectious disease

caused by various strains of mycobacteria, usually Mycobacterium tuberculosis


in humans [33]. TB usually attacks the lungs but can also affect other parts of

the body. It is spread through the air when people who have the disease

cough, sneeze or spit [34]. Most infections in humans result in an

asymptomatic, latent infection and about one in ten latent infections eventually

progresses to active disease, which, if left untreated, kills more than 50% of its

victims.

The classic symptoms are chronic cough with blood-tinged sputum,

fever, night sweats and weight loss (the last giving rise to the formerly

prevalent colloquial term "consumption"). Infection of other organs causes a

wide range of symptoms. Diagnosis relies on radiology (commonly chest X-

rays), a tuberculin skin test, blood tests, as well as microscopic examination

and microbiological culture of bodily fluids. Treatment is difficult and requires

long courses of multiple antibiotics. Contacts are also screened and treated if

necessary. Antibiotic resistance is a growing problem in (extensively) multi-

drug-resistant tuberculosis. Prevention relies on screening programs and

vaccination, usually with Bacillus Calmette-Guérin vaccine.

One third of the world's population is thought to be infected with M.

tuberculosis, [35, 36] and new infections occur at a rate of about one per

second [37]. The proportion of people who become sick with tuberculosis each

year is stable or falling worldwide but, because of population growth, the

absolute number of new cases is still increasing. In 2007 there were an

estimated 13.7 million chronic active cases, 9.3 million new cases, and 1.8

million deaths, mostly in developing countries [38]. In addition, more people in

the developed world are infected with tuberculosis, because their immune

systems are compromised by immunosuppressive drugs, substance abuse or

AIDS. The distribution of tuberculosis is not uniform across the globe; about

80% of the population in many Asian and African countries test positive in

tuberculin tests, while only 5-10% of the US population test positive.


Treatment for TB uses antibiotics to kill the bacteria. Effective TB treatment is

difficult, due to the unusual structure and chemical composition of the

mycobacterial cell wall, which makes many antibiotics ineffective and hinders

the entry of drugs [39]. The two most commonly used drugs are Rifampicin

and Isoniazid. However, instead of the short course of antibiotics typically used

to cure other bacterial infections, TB requires much longer periods of treatment

(around 6 to 24 months) to entirely eliminate mycobacteria from the body.

Latent TB treatment usually uses a single antibiotic, while active TB disease is

best treated with combinations of several antibiotics, to reduce the risk of the

bacteria developing antibiotic resistance. People with latent infections are

treated to prevent them from progressing to active TB disease later in life.

Drug-resistant tuberculosis is transmitted in the same way as regular

TB. Primary resistance occurs in persons infected with a resistant strain of TB.

A patient with fully susceptible TB develops secondary resistance (acquired

resistance) during TB therapy because of inadequate treatment, not taking the

prescribed regimen appropriately or using low-quality medication [40]. Drug-

resistant TB is a public health issue in many developing countries, as treatment

is longer and requires more expensive drugs. Multi-drug-resistant tuberculosis

(MDR-TB) is defined as resistance to the two most effective first-line TB

drugs: Rifampicin and Isoniazid. Extensively drug-resistant TB (XDR-TB) is

also resistant to three or more of the six classes of second-line drugs [41]. So

there is an urgent need to develop drugs for treating tuberculosis.

1.4 QUANTITATIVE STRUCTURE ACTIVITY RELATIONSHIP

(QSAR)

QSAR represents an attempt to correlate structural descriptors of compounds with their activities. These structural descriptors, which include parameters to account for hydrophobicity, topology, electronic properties and steric effects, are determined empirically or, more recently, by computational methods. Activities used in QSAR include chemical measurements and biological assays. QSAR is currently applied in many disciplines, with many applications pertaining to drug design and environmental risk assessment. In the 1890s, Hans Horst Meyer of the University of Marburg and Charles Ernest Overton of the University of Zurich, working independently, noted that the toxicity of organic compounds depended on their lipophilicity [42, 43].

QSAR based on Hammett's relationship utilizes electronic properties

as the descriptors of structures. Difficulties were encountered when

investigators attempted to apply Hammett-type relationships to biological

systems, indicating that other structural descriptors were necessary.

Robert Muir, a botanist at Pomona College, California, studied

the biological activity of compounds that resembled indole acetic acid and

phenoxy acetic acid, which function as plant growth regulators. In his attempt

to correlate the structures of the compounds with their activities, he consulted

his colleague in chemistry, Corwin Hansch. Using Hammett sigma parameters

to account for the electronic effect of substituents did not lead to meaningful

QSAR [44]. However, Hansch recognized the importance of the lipophilicity,

expressed as the octanol-water partition coefficient, on biological activity [45].

We now recognize this parameter to provide a measure of the bioavailability of

compounds, which will determine, in part, the amount of the compound that

gets to the target site. Relationships were developed to correlate a structural

parameter (i.e., lipophilicity) with activity. In some cases, a univariate

relationship correlating structure and activity was adequate. The form of the

equation is:

\[ \log (1/C) = a \log P + b \tag{1.1} \]

where C is the molar concentration of compound that produces a standard response (e.g., LD50, ED50), log P is the logarithm of the partition coefficient, a is the slope and b is a constant.

Carbonic anhydrase catalyzes the reaction

\[ \mathrm{CO_2 + H_2O \rightleftharpoons HCO_3^- + H^+} \tag{1.2} \]

as well as the hydration of some aldehydes and ketones and the hydrolysis of alkyl and aryl esters. It is a zinc-containing enzyme of about 30,000 daltons, and its three-dimensional structure has been characterized by X-ray diffraction. Physiologically, carbonic anhydrase is involved in gastric, urinary, pancreatic, lacrimal and cerebrospinal secretions. Inhibitors of carbonic anhydrase include aromatic and heterocyclic sulfonamides, and some of these compounds have found application as diuretics.
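Returning to the univariate relationship of Eq. (1.1), the coefficients a and b are normally obtained by linear regression of log (1/C) against log P. A minimal least-squares sketch is given below; the data points are invented purely for illustration (generated from a = 0.9 and b = 1.2) and do not come from any study cited here.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Synthetic, noise-free data: log(1/C) = 0.9 * log P + 1.2 (illustrative only).
log_p_values = [0.5, 1.0, 1.5, 2.0, 2.5]
log_inv_c = [0.9 * lp + 1.2 for lp in log_p_values]

a, b = fit_line(log_p_values, log_inv_c)
print(f"log(1/C) = {a:.2f} log P + {b:.2f}")
```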

Both traditional QSAR and computer graphical methods have been

applied to the development of sulfonamides and other compounds as inhibitors

of carbonic anhydrase. For example, Hansch et al. [46] developed a QSAR

based on the binding constants of 29 phenylsulfonamides to the enzyme. The

equation that was derived was the following

\[ \log K = 1.55\,\sigma + 0.64 \log P - 2.07\, I_1 - 3.28\, I_2 + 6.94 \tag{1.3} \]

where K is the binding constant, σ is the Hammett substituent constant, I1 = 1 if X is meta and I1 = 0 if X is ortho or para, and I2 = 1 if X is ortho and I2 = 0 if X is meta or para.
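Eq. (1.3) can be evaluated directly once the substituent descriptors are known. The sketch below assumes that the 1.55 coefficient multiplies the Hammett σ constant (the usual form of this relationship) and derives the indicator variables I1 and I2 from the substituent position; the function name and the input values are hypothetical.

```python
def log_k(sigma, log_p, position):
    """Predicted log K from Eq. (1.3) for a substituent X on the phenyl ring.

    sigma:    Hammett constant of X (assumed to be the 1.55 term's variable)
    log_p:    octanol/water log P
    position: "ortho", "meta" or "para"
    """
    if position not in ("ortho", "meta", "para"):
        raise ValueError("position must be 'ortho', 'meta' or 'para'")
    i1 = 1 if position == "meta" else 0   # meta indicator
    i2 = 1 if position == "ortho" else 0  # ortho indicator
    return 1.55 * sigma + 0.64 * log_p - 2.07 * i1 - 3.28 * i2 + 6.94


# Hypothetical descriptor values, for illustration only.
for pos in ("para", "meta", "ortho"):
    print(pos, round(log_k(sigma=0.2, log_p=1.0, position=pos), 2))
```

With identical σ and log P, the meta and ortho isomers are predicted to bind more weakly than the para isomer by 2.07 and 3.28 log units respectively.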

The negative coefficients of I1 and I2 suggest that they account for unfavorable steric effects when substituents are in the meta or ortho positions. Binding is favored by electron-withdrawing substituents, which is consistent with the hypothesis that the ionized form of -SO2NH2 binds to the zinc in the active site of carbonic anhydrase [47]. Interactive computer graphics have also been applied to better understand the interaction of carbonic anhydrase inhibitors with the enzyme, as illustrated in Figure 1.1.

Fig. 1.1 Interactive computer graphics of carbonic anhydrase inhibitors

The active site is a cavity approximately 12 Angstroms deep with a

zinc atom (magenta) near the bottom of the cavity. The active site is divided

into a hydrophilic half (blue) and a hydrophobic half (red). In the complex, the

inhibitor appears to be bound such that the sulfonamide moiety occupies the

fourth coordination site of the zinc atom, with the other three sites being

occupied by histidine residues.

The QSAR approach uses parameters which have been assigned to

the various chemical groups that can be used to modify the structure of the

drug. The parameter is a measure of the potential contribution of its group to a

particular property of the parent drug. The selection of parameters is an

important step in QSAR study. The various parameters used in QSAR study are

as follows.


1.4.1 Thermodynamic Parameters

(i) Heat of Formation: The enthalpy for forming a molecule from its

constituent atoms is a measure of the relative thermal stability of a molecule. It

is calculated by quantum-chemical techniques and has a wide range of

applicability in conformational analysis, intermolecular modeling and chemical

reaction modeling. The atom limit is 300 atoms or 300 atomic orbitals

(whichever is less) per molecule.

(ii) Partition Coefficient Log P: Log P (the octanol/water partition coefficient) and molar refractivity are molecular descriptors that can be used to relate chemical structure to observed chemical behavior. Log P is related to the hydrophobic character of the molecule. The molar refractivity of a substituent is a combined measure of its size and polarizability.

\[ \log P_{\mathrm{oct/wat}} = \log \left( \frac{[\mathrm{Solute}]_{\mathrm{octanol}}}{[\mathrm{Solute}]^{\text{un-ionized}}_{\mathrm{water}}} \right) \tag{1.4} \]

The partition coefficient is a ratio of concentrations of un-ionized

compound between the two solutions. To measure the partition coefficient of

ionizable solutes, the pH of the aqueous phase is adjusted such that, the

predominant form of the compound is un-ionized. The logarithm of the ratio of

the concentrations of the un-ionized solute in the solvents is called log P.
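Eq. (1.4) amounts to taking the base-10 logarithm of a concentration ratio. A minimal sketch follows, assuming both concentrations are measured in the same units and that the aqueous value refers to the un-ionized solute only; the function name and the numbers are illustrative.

```python
import math


def log_p(conc_octanol, conc_water_unionized):
    """log P from equilibrium concentrations in octanol and in water."""
    if conc_octanol <= 0 or conc_water_unionized <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_octanol / conc_water_unionized)


# Hypothetical case: solute 100 times more concentrated in octanol than in water.
value = log_p(conc_octanol=5.0, conc_water_unionized=0.05)
print(f"log P = {value:.2f}")
```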

(iii) Melting Point: The melting point of a solid is the temperature at which the vapor pressures of the solid and the liquid are equal. At the melting point, the solid and liquid phases exist in equilibrium. When considered as the temperature of the reverse change from liquid to solid, it is referred to as the freezing point. When the "characteristic freezing point" of a substance is determined, the actual methodology is almost always "the principle of observing the disappearance rather than the formation of ice", that is, the melting point.

(iv) Molar Refractivity (MR): The molar refractivity is a measure of both the volume of a compound and how easily it is polarized. It is expressed as:

\[ MR = \frac{(n^2 - 1)}{(n^2 + 2)} \cdot \frac{M}{d} \tag{1.5} \]

where n is the refractive index

M is the molecular weight

d is the density.

The term M/d defines a volume, while the term (n² − 1)/(n² + 2) provides a correction factor by defining how easily the substituent can be polarized. This is particularly significant if the substituent has π electrons or lone pairs of electrons. A positive sign of MR in a QSAR equation indicates that the substituent binds to a polar surface, while a negative sign or non-linear relationship indicates steric hindrance at the binding site.
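As a worked illustration of Eq. (1.5), the sketch below computes MR assuming the standard Lorentz-Lorenz form, with n² + 2 in the denominator. The function name is hypothetical, and the input values for water (n ≈ 1.333, M ≈ 18.015 g/mol, d ≈ 0.997 g/cm³) are approximate textbook figures used only as an example.

```python
def molar_refractivity(n, mol_weight, density):
    """Molar refractivity (cm^3/mol) from the Lorentz-Lorenz relation."""
    if density <= 0:
        raise ValueError("density must be positive")
    polarizability_term = (n ** 2 - 1) / (n ** 2 + 2)  # dimensionless correction
    molar_volume = mol_weight / density                # cm^3/mol
    return polarizability_term * molar_volume


# Approximate values for liquid water near room temperature.
mr_water = molar_refractivity(n=1.333, mol_weight=18.015, density=0.997)
print(f"MR(water) = {mr_water:.2f} cm^3/mol")
```

For these inputs the result is about 3.7 cm³/mol.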

(v) Energy Stretching: E stretching is the bond stretching energy. Its value for a pair of atoms joined by a single bond can be estimated by considering the bond to be a mechanical spring that obeys Hooke's law. If r is the stretched length of the bond and r0 is the ideal bond length, then

\[ E_{\text{stretching}} = \tfrac{1}{2} K (r - r_0)^2 \tag{1.6} \]

where r0 is the ideal bond length

r is the stretched bond length

K is the force constant.

If a molecule consists of three atoms (a-b-c), then

\[ E_{\text{stretching}} = E_{a-b} + E_{b-c} = \tfrac{1}{2} K_{(a-b)} \left[ r_{(a-b)} - r_{0(a-b)} \right]^2 + \tfrac{1}{2} K_{(b-c)} \left[ r_{(b-c)} - r_{0(b-c)} \right]^2 \tag{1.7} \]
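Eqs. (1.6) and (1.7) can be sketched as follows, assuming each bond is an independent harmonic spring and that every bond term carries the same factor of ½ and a squared displacement. The function names, force constants and bond lengths are illustrative only.

```python
def stretch_energy(k, r, r0):
    """Harmonic bond-stretching energy, Eq. (1.6): 1/2 * K * (r - r0)^2."""
    return 0.5 * k * (r - r0) ** 2


def total_stretch_energy(bonds):
    """Sum of stretching energies over bonds given as (K, r, r0) tuples, Eq. (1.7)."""
    return sum(stretch_energy(k, r, r0) for k, r, r0 in bonds)


# Hypothetical three-atom molecule a-b-c (arbitrary consistent units).
bonds_abc = [
    (2.0, 1.5, 1.0),  # bond a-b: K, stretched length, ideal length
    (4.0, 1.0, 1.0),  # bond b-c: at its ideal length, contributes nothing
]
e_total = total_stretch_energy(bonds_abc)
print(f"E_stretching = {e_total:.3f}")
```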

(vi) Torsion Energy: E torsion is the energy due to changes in the conformation about a bond and is given by

\[ E_{\text{torsion}} = \tfrac{1}{2} k \left[ 1 + \cos \big( m (\phi + \phi_{\text{offset}}) \big) \right] \tag{1.8} \]

where k is the energy barrier to rotation about the torsion angle φ,

m is the periodicity of the rotation,

φ_offset is the offset of the ideal torsion angle relative to a staggered arrangement of the two atoms.
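A minimal sketch of Eq. (1.8) follows. It assumes the common cosine form ½k[1 + cos(m(φ + φ_offset))] with angles given in degrees; the function name and parameter values are hypothetical.

```python
import math


def torsion_energy(k, m, phi_deg, offset_deg=0.0):
    """Torsion energy: 1/2 * k * (1 + cos(m * (phi + offset))), angles in degrees."""
    angle = math.radians(m * (phi_deg + offset_deg))
    return 0.5 * k * (1.0 + math.cos(angle))


# Threefold barrier (m = 3), as for rotation about a C-C single bond.
for phi in (0.0, 60.0, 120.0):
    print(phi, round(torsion_energy(k=2.0, m=3, phi_deg=phi), 3))
```

With m = 3 and no offset, the energy is at its maximum (equal to k) at 0° and 120° and at its minimum (zero) at 60°.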

(vii) Energy VDW: E vdW is the van der Waals interaction energy of the molecule with the receptor. It is the total energy contribution due to van der Waals forces and is calculated from the Lennard-Jones potential equation

\[ E_{\text{vdW}} = \varepsilon \left[ \left( \frac{r_{\min}}{r} \right)^{12} - 2 \left( \frac{r_{\min}}{r} \right)^{6} \right] \tag{1.9} \]

The (r_min/r)^6 term in this equation represents the attractive force, while the (r_min/r)^12 term represents the short-range repulsive forces between the atoms. Here r_min is the distance between two atoms when the energy is at a minimum, r is the actual distance between the atoms and ε is the depth of that energy minimum.
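The Lennard-Jones expression of Eq. (1.9) can be explored numerically as below. The sketch assumes the r_min form with a well-depth parameter ε (here `epsilon`, an assumed name), so that the energy equals −ε at r = r_min, is repulsive at much shorter distances and weakly attractive at longer ones.

```python
def vdw_energy(r, r_min, epsilon=1.0):
    """Lennard-Jones energy: epsilon * ((r_min/r)^12 - 2 * (r_min/r)^6)."""
    if r <= 0:
        raise ValueError("interatomic distance must be positive")
    ratio = r_min / r
    return epsilon * (ratio ** 12 - 2.0 * ratio ** 6)


# Hypothetical atom pair with r_min = 2.0 (arbitrary length units).
for distance in (1.6, 2.0, 3.0):
    print(distance, round(vdw_energy(distance, r_min=2.0), 3))
```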

1.4.2 Electronic Parameters

(i) Energy Bend: E bend is the energy due to changes in the bond angle and is estimated as:

\[ E_{\text{bend}} = \tfrac{1}{2} k (\theta - \theta_0)^2 \tag{1.10} \]

where θ is the actual bond angle

θ0 is the ideal bond angle, that is, the minimum-energy position of the three atoms.
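The bending term has a harmonic form in the bond angle. A short sketch follows, assuming E bend = ½k(θ − θ0)² with the force constant k expressed per radian squared and angles supplied in degrees; the names and values are illustrative.

```python
import math


def bend_energy(k, theta_deg, theta0_deg):
    """Harmonic angle-bending energy: 1/2 * k * (theta - theta0)^2, in radians."""
    delta = math.radians(theta_deg - theta0_deg)
    return 0.5 * k * delta ** 2


# Hypothetical tetrahedral centre opened by 10 degrees from its ideal angle.
e_bend = bend_energy(k=100.0, theta_deg=119.5, theta0_deg=109.5)
print(f"E_bend = {e_bend:.3f}")
```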

(ii) Highest Occupied Molecular Orbital (HOMO) Energy: HOMO is

the highest energy level in the molecule that contains electrons. It is crucially

important in governing molecular reactivity and properties. When a molecule

acts as a Lewis base (an electron-pair donor) in bond formation, the electrons

are supplied from the molecule's HOMO. How readily this occurs is reflected

in the energy of the HOMO. Molecules with high-lying HOMOs are more able to

donate their electrons and are hence relatively reactive compared to molecules

with low-lying HOMOs; thus the HOMO descriptor measures the

nucleophilicity of a molecule.

(iii) Lowest Unoccupied Molecular Orbital (LUMO) Energy: LUMO

is the lowest energy level in the molecule that contains no electrons. It is

important in governing molecular reactivity and properties. When a molecule acts as a Lewis acid (an electron-pair acceptor) in bond formation, incoming electron pairs are received in its LUMO. Molecules with low-lying LUMOs are more able to accept electrons than those with high-lying LUMOs; thus the LUMO descriptor measures the electrophilicity of a molecule.
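To illustrate how the two frontier-orbital descriptors are read off a list of orbital energies, here is a minimal sketch. It assumes a closed-shell molecule whose orbitals are doubly occupied from the lowest energy upwards; the function name `frontier_orbitals` and the orbital energies are hypothetical.

```python
def frontier_orbitals(orbital_energies, n_electrons):
    """Return (HOMO, LUMO, gap) for a closed-shell molecule.

    Assumes doubly occupied orbitals filled from the lowest energy upwards.
    """
    if n_electrons <= 0 or n_electrons % 2 != 0:
        raise ValueError("closed-shell sketch: need a positive, even electron count")
    energies = sorted(orbital_energies)
    n_occupied = n_electrons // 2
    if n_occupied >= len(energies):
        raise ValueError("no unoccupied orbital available")
    homo = energies[n_occupied - 1]  # highest level that contains electrons
    lumo = energies[n_occupied]      # lowest level that contains no electrons
    return homo, lumo, lumo - homo


# Hypothetical orbital energies in eV, for illustration only
energies_ev = [-15.2, -12.7, -10.1, -9.3, 0.8, 2.4]
homo, lumo, gap = frontier_orbitals(energies_ev, n_electrons=8)
print(f"HOMO = {homo} eV, LUMO = {lumo} eV, gap = {gap:.1f} eV")
```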

1.4.3 Steric Parameters

(i) Ovality: Ovality or non-circularity is the degree of deviation from perfect circularity of the cross section of the core or cladding of the fibre. Quantitatively, the ovality of either the core or cladding is expressed as

\[ \text{Ovality} = \frac{2(a - b)}{a + b} \qquad (1.11) \]

where a is the length of the major axis and b is the length of the minor axis.

(ii) Dipole Moment: The dipole moment descriptor is a 3D electronic

descriptor that indicates the strength and orientation behavior of a molecule in

an electrostatic field. Both the magnitude and the components (X, Y and Z) of

the dipole moment are calculated. It is estimated by utilizing partial atomic

charges and atomic co-ordinates. The descriptor uses Debye units. Dipole

properties have been correlated to long-range ligand-receptor recognition and

subsequent binding.
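The estimation from partial atomic charges and atomic co-ordinates can be sketched as follows. This is a minimal point-charge illustration, assuming charges in units of the elementary charge and co-ordinates in angstrom; the function name `dipole_moment` and the two-atom example are hypothetical, and the conversion factor is the approximate value of 1 e·Å in Debye.

```python
import math

E_ANGSTROM_TO_DEBYE = 4.80320  # 1 e*angstrom expressed in Debye (approximate)


def dipole_moment(charges, coords):
    """Dipole components (X, Y, Z) and magnitude in Debye from point charges.

    charges: partial atomic charges in units of e
    coords:  matching (x, y, z) positions in angstrom
    """
    if len(charges) != len(coords):
        raise ValueError("one coordinate triple is needed per charge")
    components = [
        E_ANGSTROM_TO_DEBYE * sum(q * xyz[axis] for q, xyz in zip(charges, coords))
        for axis in range(3)
    ]
    magnitude = math.sqrt(sum(c * c for c in components))
    return components, magnitude


# Hypothetical diatomic: charges of +0.2 e and -0.2 e separated by 1.0 angstrom
components, magnitude = dipole_moment([0.2, -0.2], [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)])
print(components, magnitude)
```

For a neutral molecule the result does not depend on where the origin of the co-ordinate system is placed.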

(iii) Balaban Index: The Balaban index J is a graph index defined for a graph on n nodes and m edges. This is a highly discriminating descriptor, whose values do not substantially increase with molecule size and the number of rings present. Its evaluation begins with the D-matrix modified as follows:

    Each edge contributes length 1/b to overall path lengths, where b is the edge (bond) order.

    For aromatic bonds, the number b is set to 1.5 by definition (thus contributing 2/3 to overall path lengths).

\[ J = \frac{m}{\mu + 1} \sum_{(i,j)} (D_i D_j)^{-1/2} \qquad (1.12) \]

where μ = m − n + 1 is the circuit rank of the graph, D_i is the sum of all entries in the ith row (or column) of the graph distance matrix, D_j is the sum of all entries in the jth row (or column) of the graph distance matrix, and the sum runs over all pairs of nodes i and j joined by an edge.

The Balaban index helps to differentiate molecules according to their shape.
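The evaluation of J can be sketched in a few lines. The example below is an illustrative implementation, assuming a connected hydrogen-suppressed graph given as (atom, atom, bond order) triples with each bond listed once; the function names and the two test molecules are not taken from any specific descriptor package.

```python
import math


def distance_matrix(n, bonds):
    """All-pairs shortest path lengths (Floyd-Warshall); each bond has length 1/b."""
    dist = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i, j, order in bonds:
        dist[i][j] = dist[j][i] = 1.0 / order
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def balaban_index(n, bonds):
    """Balaban J for a connected graph given as (atom_i, atom_j, bond_order) triples."""
    dist = distance_matrix(n, bonds)
    row_sums = [sum(row) for row in dist]  # D_i for every node
    m = len(bonds)
    mu = m - n + 1  # circuit rank
    edge_sum = sum((row_sums[i] * row_sums[j]) ** -0.5 for i, j, _ in bonds)
    return m / (mu + 1) * edge_sum


# Carbon skeletons of n-butane (chain) and cyclohexane (ring), single bonds only
butane = [(0, 1, 1), (1, 2, 1), (2, 3, 1)]
cyclohexane = [(i, (i + 1) % 6, 1) for i in range(6)]
print(round(balaban_index(4, butane), 4))
print(round(balaban_index(6, cyclohexane), 4))
```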

(iv) Connolly Solvent Accessible Area (Å²): The locus of the center of a spherical probe as it is rolled over the molecular model. Connolly's solvent accessible area, a steric descriptor, represents the surface area that is in contact with the solvent. The descriptor bears a negative coefficient in the model, suggesting that an increase in the bulkiness of the substituents and in the molecular solvent accessible surface area is not conducive to the activity.

(v) Connolly Molecular Surface Area (Å²): The contact surface created when a spherical probe is rolled over the molecular model. The molecular surface (MS) is a continuous sheet consisting of two parts: the contact surface and the reentrant surface. The contact surface is the part of the van der Waals surface that is accessible to a probe sphere. The reentrant surface is the inward-facing surface of the probe when it touches two or more atoms. The molecular surface is also called the Connolly surface.


(vi) Connolly Solvent Excluded Volume (Å³): The volume contained within the contact molecular surface. The molecular surface is also called the solvent-excluded surface (SES), which is the boundary of the union of all possible probes which do not overlap with the molecule.

(vii) Principal Moment of Inertia (X, Y, Z): The moment of inertia of the whole body with respect to one of the principal axes is known as a principal moment of inertia. The moments of inertia are computed for a series of straight lines through the center of mass.
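The moment of inertia about a single straight line through the center of mass can be sketched as below. This is a minimal point-mass illustration; the function names and the diatomic example are hypothetical. The principal moments are the values obtained when the line coincides with one of the principal axes.

```python
import math


def center_of_mass(masses, coords):
    """Mass-weighted mean position of the atoms."""
    total = sum(masses)
    return [sum(m * xyz[a] for m, xyz in zip(masses, coords)) / total for a in range(3)]


def moment_about_axis(masses, coords, axis):
    """Moment of inertia about the line through the center of mass along `axis`."""
    norm = math.sqrt(sum(c * c for c in axis))
    if norm == 0:
        raise ValueError("axis must be a non-zero vector")
    u = [c / norm for c in axis]
    com = center_of_mass(masses, coords)
    moment = 0.0
    for m, xyz in zip(masses, coords):
        d = [xyz[a] - com[a] for a in range(3)]
        along = sum(d[a] * u[a] for a in range(3))
        perp_sq = sum(c * c for c in d) - along * along  # squared distance to the axis
        moment += m * perp_sq
    return moment


# Hypothetical diatomic: two unit masses 1.0 apart along the X axis
masses = [1.0, 1.0]
coords = [(-0.5, 0.0, 0.0), (0.5, 0.0, 0.0)]
for label, axis in (("X", (1, 0, 0)), ("Y", (0, 1, 0)), ("Z", (0, 0, 1))):
    print(label, moment_about_axis(masses, coords, axis))
```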

(viii) Wiener Index (W): The Wiener index is the sum of the numbers of chemical bonds along the shortest paths between all pairs of heavy atoms in the molecule. In graph-theoretical terms, it is the sum of the lengths of the minimal paths between all pairs of vertices representing heavy atoms. This is equal to half the sum of all D-matrix entries:

\[ W = \frac{1}{2} \sum_{i} \sum_{j} a_{ij} \qquad (1.13) \]

where a_ij is the ij element of the distance matrix of the molecule. The summation is made over all the atoms i and j in the molecule.
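The half-sum of the distance matrix can be evaluated directly with a breadth-first search from every heavy atom. The sketch below assumes a connected hydrogen-suppressed graph given as a list of bonded atom pairs; the function name `wiener_index` and the example skeletons are illustrative.

```python
from collections import deque


def wiener_index(n, bonds):
    """Sum of shortest-path bond counts over all pairs of heavy atoms."""
    neighbours = [[] for _ in range(n)]
    for i, j in bonds:
        neighbours[i].append(j)
        neighbours[j].append(i)
    total = 0
    for start in range(n):
        dist = {start: 0}
        queue = deque([start])
        while queue:  # breadth-first search from `start`
            atom = queue.popleft()
            for nxt in neighbours[atom]:
                if nxt not in dist:
                    dist[nxt] = dist[atom] + 1
                    queue.append(nxt)
        total += sum(dist.values())  # one row of the distance matrix
    return total // 2  # every pair was counted twice


# Carbon skeletons: n-butane (chain) and isobutane (branched)
print(wiener_index(4, [(0, 1), (1, 2), (2, 3)]))
print(wiener_index(4, [(0, 1), (0, 2), (0, 3)]))
```

The branched isomer gives the smaller value, which reflects its more compact skeleton.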

1.4.4 QSAR equations

QSAR equations determine the functional relationship between activity and the selected descriptors; that is, they search for a mathematical function f such that activity = f(descriptors) to a suitably high level of accuracy. After the dependent and independent variables have been identified, a suitable statistical method is used to generate a QSAR equation [48]. The statistical methods can be broadly divided into two groups: linear and non-linear methods. In statistics, a correlation is established between the dependent variable(s) (biological activity) and the independent variable(s) (molecular descriptors).

The linear method fits a line between the selected descriptors and activity, whereas the non-linear method fits a curve between the selected descriptors and activity. The statistical method used to build a QSAR model is decided based on the type of biological activity data.

The following are a few commonly used statistical methods:

    Categorical Dependent Variable - Discriminant analysis, Logistic regression, k-Nearest neighbour classification, Decision trees.

    Continuous Dependent Variable - Multiple regression, Principal component regression, Continuum regression, Partial least squares regression, Canonical correlation analysis, k-Nearest neighbour method, Neural networks.

Multiple regression is a widely used method for building QSAR models. A regression model is simple to interpret, since the contribution of each descriptor can be seen from the magnitude and sign of its regression coefficient. Multiple linear regression attempts to maximize the fit of the data to a regression equation for the biological activity by adjusting each of the parameters up or down. Successive regression equations are derived in which parameters are either added or removed until the r² and S values are optimized. The magnitude of the coefficients derived in this manner indicates the relative contribution of the associated parameter to bioactivity.
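A multiple linear regression fit can be sketched with nothing more than the normal equations. The example below is a minimal illustration rather than the procedure of any particular QSAR package; the function names, the two descriptors and the activity values are hypothetical, and the data were constructed so that activity = 1 + 2·x1 − 3·x2 exactly.

```python
def fit_mlr(X, y):
    """Least-squares coefficients [intercept, b1, ..., bk] via the normal equations."""
    rows = [[1.0] + [float(v) for v in x] for x in X]  # prepend intercept column
    p = len(rows[0])
    # Augmented system (A^T A | A^T y)
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("descriptors are collinear; the fit is not unique")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][p] / aug[i][i] for i in range(p)]


def predict(coeffs, x):
    """Activity predicted by the fitted equation for one descriptor vector."""
    return coeffs[0] + sum(b * xi for b, xi in zip(coeffs[1:], x))


# Hypothetical descriptor table (two descriptors per compound) and activities
descriptors = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
activity = [1.0, 3.0, -2.0, 0.0, 2.0]
coeffs = fit_mlr(descriptors, activity)
print([round(c, 6) for c in coeffs])
```

The sign of each fitted coefficient shows whether the descriptor raises or lowers the predicted activity, and its magnitude shows by how much per unit of the descriptor.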

There are various statistical measures available for evaluating the significance of the model; the following are the most commonly used [49].

    n - number of molecules
    k - number of descriptors in a model
    df - degrees of freedom (n-k-1) (higher is better)
    r² - coefficient of determination (> 0.7)
    Q² - cross-validated r² (> 0.5)
    pred_r² - r² for the external test set (> 0.5)
    SEE - standard error of estimate (smaller is better)
    F-test - F-test for statistical significance of the model (higher is better, for the same set of descriptors and compounds)
    Z score - Z score calculated by the randomization test (higher is better)
    SDEP - standard deviation of error of prediction.

Correlation Coefficient (r) and Coefficient of Determination (r²): The quantity r, called the linear correlation coefficient, measures the strength and the direction of a linear relationship between two variables. The coefficient of determination, r², is useful because it gives the proportion of the variance (fluctuation) of one variable that is predictable from the other variable. It is a measure that allows us to determine how certain one can be in making predictions from a given model/graph. It can be calculated as:

\[ r^2 = 1 - \frac{\text{Sum of squares of the deviations from the regression line}}{\text{Sum of squares of the deviations from the mean}} = \frac{\text{Regression variance}}{\text{Original variance}} \qquad (1.14) \]

Regression variance is defined as the original variance minus the variance around the regression line. The original variance is the sum of squared distances of the original data from the mean. If

    0 < r² < 1, it indicates a partial linear correlation
    r² = 0, it shows that there is no linear correlation
    r² = 1, it means perfect correlation.

The higher the r² value, the less likely the relationship is due to chance.

F or Variance Ratio: The F-statistic is the ratio between explained and unexplained variance for a given number of degrees of freedom. The larger the value of F, the greater the probability that the QSAR model is significant.
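The coefficient of determination, the standard error of estimate and the F value can all be obtained from the observed and fitted activities. The sketch below assumes the fitted values come from a least-squares model with an intercept, so that the explained sum of squares equals the total minus the residual sum of squares; the function name `regression_statistics` and the data are hypothetical.

```python
import math


def regression_statistics(observed, predicted, k):
    """Return (r2, SEE, F) for a model with k descriptors fitted to n compounds."""
    n = len(observed)
    df = n - k - 1  # degrees of freedom
    if df <= 0:
        raise ValueError("need more compounds than fitted parameters")
    mean = sum(observed) / n
    ss_total = sum((y - mean) ** 2 for y in observed)
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    r2 = 1.0 - ss_residual / ss_total
    see = math.sqrt(ss_residual / df)
    # Explained variance per descriptor over unexplained variance per degree of freedom
    f_value = ((ss_total - ss_residual) / k) / (ss_residual / df)
    return r2, see, f_value


# Hypothetical observed and fitted activities for a one-descriptor model
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
fitted = [1.1, 1.9, 3.2, 3.8, 5.0]
r2, see, f_value = regression_statistics(observed, fitted, k=1)
print(f"r2 = {r2:.3f}, SEE = {see:.3f}, F = {f_value:.1f}")
```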

Z-Score: The Z score can be defined as the absolute difference between the value predicted by the model and the activity field, divided by the square root of the mean square error of the data set. Any compound that shows a Z-score higher than 2.5 in the QSAR model is considered an outlier.
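The outlier test can be sketched as follows. This is an illustration under stated assumptions: the mean square error is taken as the plain average of the squared residuals over all compounds, and the function names and activity values are hypothetical.

```python
import math


def z_scores(observed, predicted):
    """Absolute residual of each compound divided by the root mean square error."""
    residuals = [p - y for y, p in zip(observed, predicted)]
    mse = sum(e * e for e in residuals) / len(residuals)
    if mse == 0:
        return [0.0] * len(residuals)
    rmse = math.sqrt(mse)
    return [abs(e) / rmse for e in residuals]


def find_outliers(observed, predicted, threshold=2.5):
    """Indices of compounds whose Z-score exceeds the threshold."""
    return [i for i, z in enumerate(z_scores(observed, predicted)) if z > threshold]


# Hypothetical activities: nine well-predicted compounds and one poorly predicted one
observed = [5.1, 5.4, 6.0, 6.3, 6.8, 7.1, 7.5, 7.9, 8.2, 8.6]
predicted = [y + 0.1 for y in observed[:-1]] + [observed[-1] - 3.0]
print(find_outliers(observed, predicted))
```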

1.4.5 Validation of equation

Validation techniques are used to identify outliers (data that are not modeled well by the equation). Graphic analysis and cross validation are used to characterize the robustness of the QSAR. There is no single method that works best for predictiveness, interpretability and computational efficiency.

Cross Validation Technique: As opposed to traditional regression methods, cross validation [45] evaluates the validity of a model by how well it predicts data rather than how well it fits data. The analysis uses the Leave-One-Out (LOO) scheme. Each compound is left out of the model derivation and predicted in turn. An indication of the performance of the model is obtained from the cross-validated r², which is defined as

\[ r^2_{cv} = \frac{SD - PRESS}{SD} \qquad (1.15) \]

where SD is the sum of squares of the deviations of each activity from the mean and PRESS is the predictive sum of squares, which is the sum of the squared differences between the actual and predicted values.
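The Leave-One-Out scheme of Eq. (1.15) can be sketched for the simplest case of a single descriptor. The example is an assumption-laden illustration: it uses a straight-line model refitted with each compound left out in turn, and the function names and data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a single descriptor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("descriptor values must not all be equal")
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope


def loo_q2(x, y):
    """Cross-validated r2 = (SD - PRESS) / SD with leave-one-out predictions."""
    n = len(y)
    mean_y = sum(y) / n
    sd = sum((yi - mean_y) ** 2 for yi in y)
    press = 0.0
    for i in range(n):
        # Refit the model without compound i, then predict compound i
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        intercept, slope = fit_line(x_train, y_train)
        press += (y[i] - (intercept + slope * x[i])) ** 2
    return (sd - press) / sd


# Hypothetical descriptor values and slightly noisy activities
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [3.1, 4.8, 7.2, 8.9, 11.1]
print(round(loo_q2(descriptor, activity), 3))
```

Because every prediction is made for a compound the model has not seen, the cross-validated value is never higher than it would be for a perfect fit and falls as the model becomes less predictive.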

Once the model with the highest cross-validated r² has been developed, it is used to derive the conventional QSAR equation and the conventional r² and S values. The final model results are then visualized as contour maps of the coefficients.

1.4.6 Prediction of Activity

From the QSAR equations obtained, the biological activity of new compounds may be predicted.

QSAR methods are useful in elucidating the mechanism of chemical-biological interaction in various biomolecules, particularly enzymes, membranes, organelles and cells. They have also been utilized for the evaluation of absorption, distribution, metabolism and excretion phenomena in organisms and whole-animal studies. The potential use of QSAR models for screening chemical databases or virtual libraries before their synthesis appears equally attractive to chemical manufacturers and pharmaceutical companies.
