The End-to-End Distance of RNA as a Randomly Self-Paired Polymer
Li Tai Fang
Department of Chemistry & Biochemistry, UCLA
RNA
a biopolymer consisting of 4 different species of monomers (bases): G, C, A, U
[Figure: secondary structure; bases GAG paired with CUU, with the 5' and 3' ends labeled]
generic vs. sequence-specific properties
● Regardless of sequence or length, we can predict:
● Pairing fraction: 60%
● Average loop size: 8
● Average duplex length: 4
● 5' – 3' distance
Association of 5' – 3' required for:
● Efficient replication of viral RNA
● Efficient translation of mRNA
e.g., HIV-1, Influenza, Sindbis, etc.
complementary sequence
RNA-binding protein
Question: How do the 5' and 3' ends of long RNAs find each other?
Answer: The ends of RNA are always in close proximity, regardless of sequence or length!
Yoffe, A. et al., 2010
Circle diagram
● 60% of bases are paired
● duplex length ≈ 4
● Inspired the “randomly self-paired polymer” model
randomly self-paired polymer
e.g., $N_T = 1000$, $N_p = 600$
$N_{T,\mathrm{eff}} = 550$, $N_{p,\mathrm{eff}} = 150$
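The effective sizes quoted above are consistent with collapsing each duplex strand of about 4 paired bases into a single effective paired unit. The sketch below reproduces the numbers under that assumption; the function name `effective_sizes` and the `duplex_len` parameter are illustrative, not taken from the talk.

```python
def effective_sizes(n_total, n_paired, duplex_len=4):
    """Map a real RNA (n_total bases, n_paired of them paired) onto the model.

    Assumption: every run of `duplex_len` consecutive paired bases acts as
    one effective paired unit; unpaired bases are left unchanged.
    """
    n_unpaired = n_total - n_paired
    n_paired_eff = n_paired // duplex_len      # 600 / 4 = 150
    n_total_eff = n_unpaired + n_paired_eff    # 400 + 150 = 550
    return n_total_eff, n_paired_eff


print(effective_sizes(1000, 600))  # → (550, 150)
```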
general approach
1) $p_i$ = probability that the $i$th set of “base-pair(s)” will bring the ends to a distance less than or equal to $X$
2) $P(X)$ = probability that at least one of those sets will occur
$P(X) = 1 - (1 - p_i)(1 - p_j)(1 - p_k) \cdots (1 - p_z)$
$\rho(X) = P(X) - P(X-1)$ = probability that $R_{ee}$ is $X$
$\langle X \rangle = \sum_X \rho(X) \cdot X$
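The three steps of the general approach can be sketched on toy numbers. The probabilities below are made up purely for illustration, and the function names are hypothetical; the sketch treats the sets as independent, as the product formula does.

```python
from math import prod


def prob_at_least_one(ps):
    """P = 1 - (1 - p_i)(1 - p_j)...: chance that at least one set occurs."""
    return 1.0 - prod(1.0 - p for p in ps)


def distribution(P):
    """Turn cumulative values P[0..Xmax] into rho[X] = P[X] - P[X-1].

    P[-1] is taken to be 0, so rho[0] = P[0].
    """
    return [P[0]] + [P[x] - P[x - 1] for x in range(1, len(P))]


def mean_distance(rho):
    """<X> = sum over X of rho(X) * X."""
    return sum(x * r for x, r in enumerate(rho))


print(prob_at_least_one([0.5, 0.5]))       # → 0.75
toy_P = [0.1, 0.4, 1.0]                    # made-up cumulative probabilities
print(round(mean_distance(distribution(toy_P)), 6))  # → 1.5
```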
preview of the results:
Fang, L. T., J. Theor. Biol., 2011
Let's start the grunt work
Reminder:
RNA: $N_T = 1000$, $N_p = 600$
Model: $N_{T,\mathrm{eff}} = 550$, $N_{p,\mathrm{eff}} = 150$
Now, the 1st challenge:
probability of a particular set of pairs
[Figure: three pairs i–j, k–l, m–n along the chain]
p(i) = 150/550, p(i – j) = 1/549
p(k) = 148/548, p(k – l) = 1/547
p(m) = 146/546, p(m – n) = 1/545
p(this partial set) = p(i) p(i – j) p(k) p(k – l) p(m) p(m – n)
depends on $N_{T,\mathrm{eff}}$, $N_{p,\mathrm{eff}}$, and $B$
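The pattern in the fractions above (each pair removes two sites from both the paired count and the total) can be written as a short loop. This is a sketch of one reading of the slide: the first factor is taken as the chance that a site is a paired unit, the second as the chance that its partner is one specific remaining site. The name `p_partial_set` is hypothetical.

```python
from fractions import Fraction


def p_partial_set(n_total_eff, n_paired_eff, n_pairs):
    """Probability of one particular set of `n_pairs` pairs, as an exact fraction."""
    p = Fraction(1)
    for b in range(n_pairs):
        used = 2 * b  # sites already consumed by the previous b pairs
        # e.g. 150/550, then 148/548, then 146/546
        p *= Fraction(n_paired_eff - used, n_total_eff - used)
        # e.g. 1/549, then 1/547, then 1/545
        p *= Fraction(1, n_total_eff - used - 1)
    return p


p3 = p_partial_set(550, 150, 3)
print(float(p3))  # on the order of 1e-10
```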
Next challenge:
● We have $p_i = p(N_{T,\mathrm{eff}}, N_{p,\mathrm{eff}}, B)$
● We want $P(X) = 1 - (1 - p_i)(1 - p_j)(1 - p_k) \cdots (1 - p_z)$
Let $\Omega(B)$ = number of ways to make a set of $B$ pairs
Then, $P(X) = 1 - (1 - p_{B=1})^{\Omega_{B=1}} \cdot (1 - p_{B=2})^{\Omega_{B=2}} \cdots (1 - p_{B_{\max}})^{\Omega_{B_{\max}}}$
[Figure: B = 3 pairs (i–j, k–l, m–n) with the connecting segments $x_1$, $x_2$, $x_3$, $x_4$]
Task: find $\Omega(B)$
● 1st, find the number of sets $\{x_1, x_2, \ldots, x_{B+1}\}$ such that $X = x_1 + x_2 + \cdots + x_{B+1}$
● for B = 3, X = 10: # of ways to arrange these:
$\binom{X+B}{B} = \frac{(X+B)!}{X!\,B!}$
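As a sanity check on the counting formula, the sets $\{x_1, \ldots, x_{B+1}\}$ can be enumerated by brute force and compared with the binomial coefficient. The sketch assumes each $x_i$ may be zero, which is what the formula implies; `count_compositions` is an illustrative name.

```python
from itertools import product
from math import comb


def count_compositions(X, B):
    """Count tuples (x_1, ..., x_{B+1}) of non-negative integers summing to X."""
    return sum(1 for xs in product(range(X + 1), repeat=B + 1) if sum(xs) == X)


X, B = 10, 3
print(count_compositions(X, B), comb(X + B, B))  # → 286 286
```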
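For each $\{x_i\}$, how many ways to move the middle regions?
[Figure: two placements of the pairs i–j and k–l, with the middle region shifted]
$\binom{N_{\mathrm{available}}}{B-1} = \binom{N_{T,\mathrm{eff}} - X - B - 1}{B-1}$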
Consider all X's (every distance $X_i$ from 0 up to $X$):
$\sum_{X_i=0}^{X} \binom{X_i + B}{B} \binom{N_{T,\mathrm{eff}} - X_i - B - 1}{B-1}$
Missing something: base-pairing “crossovers”
[Figure: pairs among i, j, k, l without and with a crossover; the segments are labeled (a), (b), (c) in both cases]
Crossovers are also known as pseudoknots
● $X = x_a + x_b + x_c$, as long as $x_b \le j - i$ and $x_b \le l - k$
● 2 ways to connect each middle region
● undercount by a factor of $2^{B-1}$
Now, let's put it all together
$\Omega(N_{T,\mathrm{eff}}, X, B) = 2^{B-1} \sum_{X_i=0}^{X} \binom{X_i + B}{B} \binom{N_{T,\mathrm{eff}} - X_i - B - 1}{B-1}$
Once again, the general approach
$P(X)$ = probability that at least one of these sets of pairs will occur, where end-to-end distance $\le X$
$P(X) = 1 - (1 - p_i)(1 - p_j)(1 - p_k) \cdots (1 - p_z)$
$P(X) = 1 - (1 - p_{B=1})^{\Omega_{B=1}} \cdot (1 - p_{B=2})^{\Omega_{B=2}} \cdots (1 - p_{B_{\max}})^{\Omega_{B_{\max}}}$
● $\rho(X) = P(X) - P(X-1)$
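Putting the pieces together, the whole calculation fits in a few functions. This is a minimal sketch, not the published implementation: the cutoffs `b_max` and `x_max`, the use of floating point with `log1p` to evaluate the product stably, and all function names are assumptions made here.

```python
from math import comb, exp, log1p


def p_set(nt_eff, np_eff, B):
    """Probability of one particular set of B pairs (floating point)."""
    p = 1.0
    for b in range(B):
        p *= (np_eff - 2 * b) / (nt_eff - 2 * b)
        p *= 1.0 / (nt_eff - 2 * b - 1)
    return p


def omega(nt_eff, X, B):
    """Number of sets of B pairs giving an end-to-end distance <= X."""
    total = 0
    for xi in range(X + 1):
        n_available = nt_eff - xi - B - 1
        if n_available >= B - 1:
            total += comb(xi + B, B) * comb(n_available, B - 1)
    return 2 ** (B - 1) * total


def cumulative(nt_eff, np_eff, X, b_max):
    """P(X) = 1 - prod over B of (1 - p_B)^Omega_B, evaluated in log space."""
    log_none = sum(omega(nt_eff, X, B) * log1p(-p_set(nt_eff, np_eff, B))
                   for B in range(1, b_max + 1))
    return 1.0 - exp(log_none)


def distance_distribution(nt_eff, np_eff, x_max=60, b_max=20):
    """Return (rho, mean): rho[X] = P(X) - P(X-1) and <X> = sum of X * rho[X]."""
    P = [cumulative(nt_eff, np_eff, X, b_max) for X in range(x_max + 1)]
    rho = [P[0]] + [P[X] - P[X - 1] for X in range(1, x_max + 1)]
    mean = sum(X * r for X, r in enumerate(rho))
    return rho, mean


rho, mean = distance_distribution(550, 150)
print(f"<X> is roughly {mean:.1f}")
```

Because the cutoffs and numerical details are choices of this sketch, its output should be read as a consistency check on the formulas rather than as a reproduction of the published figure.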
Probability distribution of end-to-end distances
Fang, L. T., J. Theor. Biol., 2011
<X> = 14.4
end-to-end distance vs. sequence length
$\langle X \rangle = \sum_X \rho(X) \cdot X$
Fang, L. T., J. Theor. Biol., 2011
Scaling law: $\langle X \rangle \sim N^{1/4}$
Fang, L. T., J. Theor. Biol., 2011
Once again:
● The ends of a self-paired polymer, such as RNA, are always in close proximity.
● This is a generic feature.
● Comparison of end-to-end distances:
● random or worm-like polymers: $X \sim N^{1/2}$
● randomly branching polymers: $X \sim N^{1/4}$
● randomly self-paired polymers: $X \sim N^{1/8}$
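To get a feel for how different these exponents are, the sketch below asks how much the end-to-end distance grows when the chain becomes 16 times longer. It takes the exponents listed above at face value and ignores prefactors, which the scaling laws do not specify; `growth_factor` is an illustrative name.

```python
def growth_factor(exponent, length_ratio):
    """Factor by which X grows when N is multiplied by `length_ratio`, for X ~ N**exponent."""
    return length_ratio ** exponent


models = [
    ("random or worm-like", 1 / 2),
    ("randomly branching", 1 / 4),
    ("randomly self-paired", 1 / 8),
]
for label, nu in models:
    print(f"{label}: x{growth_factor(nu, 16):.2f}")
# random or worm-like: x4.00
# randomly branching: x2.00
# randomly self-paired: x1.41
```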
Acknowledgment
● Thesis advisors
● Professors Bill Gelbart and Chuck Knobler
● Thanks to
● Professor Avinoam Ben-Shaul @ Hebrew University of Jerusalem