IELM 511: Information System design
Introduction
Part 1. ISD for well structured data – relational and other DBMS
Part 2. ISD for systems with non-uniformly structured data
Part 3. (subset of)
Basics of web-based IS (www, web2.0, …); Markups, HTML, XML; Design tools for Info Sys: UML
API's for mobile apps; Security, Cryptography; IS product lifecycles; Algorithm analysis, P, NP, NPC
Info storage (modeling, normalization); Info retrieval (Relational algebra, Calculus, SQL); DB integrated API's
Agenda
The mathematical basis for RSA encryption
Modulo mathematics: +; *; ^
Proof of correctness of RSA
Concluding remarks
How RSA is implemented
Need for RSA
Shared key cryptography does not solve all communication problems.
Example: secure e-commerce (how did you exchange a password with Amazon? with Yahoo shopping?)
We also saw the need for public-key/private-key encryption systems (digital signatures, secure transmission)
In the last lecture, we saw the use of (shared) private key cryptography.
Example: e-banking (you may need to physically get the password)
In this lecture, we look at the theoretical basis for the RSA algorithm, which is used (in some form or other) in public-private key cryptography
The theoretical basis for the RSA algorithm: Number theory, Algorithms
Modulo mathematics
Given an integer m and a positive integer n, m mod n is the smallest nonnegative integer r such that, for some integer q, m = nq + r
Examples:
27 mod 3 = 0 [since 27 = 3*9 + 0]
27 mod 4 = 3 [since 27 = 4*6 + 3]
-27 mod 4 = 1 [since -27 = 4*(-7) + 1]
Note: this definition works for positive and negative m
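A quick way to check the definition: for a positive n, Python's % operator happens to follow the same convention (the result is always in 0 .. n-1, even for negative m). The sketch below is only an illustration; the helper name mod is our own.

```python
def mod(m: int, n: int) -> int:
    """Return the smallest nonnegative r such that m = n*q + r (n must be positive)."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return m % n  # for positive n, Python's % already yields 0 <= r < n


# Reproduce the three examples from the slide, showing q and r explicitly.
for m, n in [(27, 3), (27, 4), (-27, 4)]:
    q, r = divmod(m, n)  # divmod guarantees m == n*q + r
    print(f"{m} mod {n} = {r}  [since {m} = {n}*({q}) + {r}]")
```

Note that some other languages (C and Java, for example) return a negative remainder for negative m, so this shortcut does not carry over unchanged.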
Modulo ring
Zn is the set of integers {0, 1, . . . , n − 1} with two operators:
addition modulo n, denoted +n: i +n j = (i + j) mod n
multiplication modulo n, denoted *n: i *n j = (i * j) mod n
Exercises:
Prove that +n and *n satisfy the commutative property;
Prove that *n distributes over +n
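Before attempting the proofs, it can help to check the properties by brute force for a small n. This is a sanity check under our own naming (add_mod, mul_mod, check_ring_properties are illustrative helpers), not a proof: it only tests one value of n at a time.

```python
from itertools import product


def add_mod(i: int, j: int, n: int) -> int:
    """i +n j = (i + j) mod n"""
    return (i + j) % n


def mul_mod(i: int, j: int, n: int) -> int:
    """i *n j = (i * j) mod n"""
    return (i * j) % n


def check_ring_properties(n: int) -> bool:
    """Exhaustively test commutativity and distributivity over Zn = {0, ..., n-1}."""
    zn = range(n)
    commutative = all(
        add_mod(i, j, n) == add_mod(j, i, n) and mul_mod(i, j, n) == mul_mod(j, i, n)
        for i, j in product(zn, repeat=2)
    )
    distributive = all(
        mul_mod(i, add_mod(j, k, n), n) == add_mod(mul_mod(i, j, n), mul_mod(i, k, n), n)
        for i, j, k in product(zn, repeat=3)
    )
    return commutative and distributive


print(check_ring_properties(12))
```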
An insecure private key scheme: +n
In all discussion, we will assume that a message is a lower-case English text message (with 26 characters)
In most encoding/decoding, we will use the notation a = 0; b = 1; … z =25
Scheme:
Secret key: integer k
Encode: Replace each letter x by x' = (x +26 k) = (x + k) mod 26.
Decode: Replace each letter x' by (x' -26 k) = (x' - k) mod 26.
Notes:
1. (x' - k) can be negative [hence the usefulness of our mod definition!]
2. Exercise: show that indeed ((x +26 k) -26 k) = x
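The scheme above can be sketched in a few lines. The sketch assumes, as the slide does, that the message contains only the lower-case letters a..z (no spaces or punctuation); the function names shift_encode and shift_decode are our own.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # a = 0, b = 1, ..., z = 25


def shift_encode(message: str, k: int) -> str:
    """Replace each letter x by (x + k) mod 26."""
    return "".join(ALPHABET[(ALPHABET.index(ch) + k) % 26] for ch in message)


def shift_decode(cipher: str, k: int) -> str:
    """Replace each letter x' by (x' - k) mod 26; x' - k may be negative."""
    return "".join(ALPHABET[(ALPHABET.index(ch) - k) % 26] for ch in cipher)


secret = shift_encode("hello", 3)
print(secret)                   # khoor
print(shift_decode(secret, 3))  # hello
```

Because the mod definition always returns a value in 0..25, decoding needs no special case when x' - k is negative.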
An insecure private key scheme: +n
Q: Why is this scheme insecure ?
Answer:
A scheme is insecure if an efficient algorithm exists that can decrypt an encrypted message without knowledge of the key, k.
In our scheme, k can have any value (infinite possibilities), BUT to find k, how many values do we need to try?
Only 26 (k = 0, 1, …, 25). Why? i mod n = (i + tn) mod n for all integers t, so two keys that differ by a multiple of 26 encode every letter identically.
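To make the weakness concrete, here is a minimal brute-force sketch: it simply decodes the ciphertext with each of the 26 distinct keys and leaves it to the reader (or a dictionary check) to spot the readable one. The names shift_decode and brute_force are illustrative, and the message is again assumed to be lower-case letters only.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def shift_decode(cipher: str, k: int) -> str:
    """Decode a shift cipher: each letter x' becomes (x' - k) mod 26."""
    return "".join(ALPHABET[(ALPHABET.index(ch) - k) % 26] for ch in cipher)


def brute_force(cipher: str) -> list[tuple[int, str]]:
    """Try every key 0..25; any other integer key is equivalent to one of these."""
    return [(k, shift_decode(cipher, k)) for k in range(26)]


for k, candidate in brute_force("khoor"):
    print(k, candidate)  # the line for k = 3 reads "hello"
```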
So +n does not work; how about *n?
Scheme:
1. Code the message into (a series of) number(s): Message = M
2. Private key: integers a, n
3. Encode: f_{a,n}(M) = (a *n M) = (a * M) mod n.
4. Decode: ??
For this scheme, we need an inverse for multiplication mod n, namely some function g_{a,n}(X) = a^(-1) *n X such that g_{a,n}(f_{a,n}(M)) = M.
Question: Is there some such function g( ) ?
In other words, we are looking for a definition of a multiplicative inverse.
Crypto scheme using *n …
Suppose: (a, n, M) = (4, 12, 3)
4 * 3 mod 12 = 0
Recipient gets message = 0. From the Z12 table, row a = 4, there are four possible values of M.
Impossible to decrypt!
[Table: multiplication table for Z12; rows a, columns M, entries f_{a,n}(M) = (a *n M)]
Crypto scheme using *n …
Second try: (a, n, M) = (5, 12, 7)
5 * 7 mod 12 = 11
Only one entry = 11 in the Z12 table, row a = 5
Recipient decrypts M = 7!
[Table: multiplication table for Z12; rows a, columns M, entries f_{a,n}(M) = (a *n M)]
Conclusion: This scheme works iff all entries in row a of the Zn table are unique (and indeed, are a permutation of the set {0, 1, …, n-1})
Question: which combination of values n, a have this property ?
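One way to explore the question is to build the rows of the table and test each one. The sketch below does this for n = 12 (the value used in the two examples) and compares the result with the values of a that are relatively prime to n; the helper names row and is_permutation_row are our own, and a check for one n is evidence, not a proof.

```python
from math import gcd


def row(a: int, n: int) -> list[int]:
    """Row a of the *n table: the values (a * M) mod n for M = 0..n-1."""
    return [(a * m) % n for m in range(n)]


def is_permutation_row(a: int, n: int) -> bool:
    """True if row a contains every element of Zn exactly once."""
    return sorted(row(a, n)) == list(range(n))


n = 12
usable = [a for a in range(n) if is_permutation_row(a, n)]
coprime = [a for a in range(n) if gcd(a, n) == 1]
print(usable)             # [1, 5, 7, 11]
print(usable == coprime)  # True
```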
Primes, Relative primes, and GCD's in *n
A number > 1 is called a prime if it can only be divided by itself or 1 with no remainder.
Given two numbers, a and b, we define gcd(a, b) as the largest integer that divides both a and b without remainder.
Two numbers, a and b, are called relatively prime if gcd( a, b) = 1.
Examples:
2, 3, 5, 7, … are prime numbers
How many prime numbers are there?
gcd(12, 3) = 3
gcd(12, 5) = 1
Given a prime number p, what is gcd(p, n)?
Primes, Relative primes, and GCD's in *n
A useful theorem and corollary
Theorem 1. Given two positive integers j, k, gcd(j, k) = 1 iff there are integers x and y such that jx + ky = 1.
Corollary 2. For any positive integer n, an element a ∈ Zn has a multiplicative inverse if and only if gcd(a, n) = 1.
How to compute gcd( a, b): Euclid's method
Lemma 3. Let j, k, q, and r be nonnegative integers such that k = jq + r. Then gcd(j, k) = gcd(r, j).
Proof:
case 1. r = 0
gcd(r, j) = gcd(0, j) = j (since everything divides 0), and k = jq, therefore gcd(k, j) = j.
case 2. r > 0
(i) Let d be a common factor of j and k ⇒ there exist integers x, y > 0 such that j = xd and k = yd;
yd = xdq + r ⇒ r = d(y - xq) ⇒ d is a factor of r.
(ii) Let d be a common factor of r and j ⇒ there exist integers x, y > 0 such that r = dx and j = dy;
k = dyq + dx = d(yq + x) ⇒ d is a common factor of k, j.
From (i) and (ii), d is a common factor of r, j iff it is a common factor of j, k, which implies that gcd(j, k) = gcd(r, j).
How to compute gcd( a, b): Euclid's method
Algorithm gcd( k, j)
1. gcd(k, j) where 0 ≤ j < k
2. If (j = 0) return(k)
3. Else
4.     r = k mod j; // therefore k = jq + r
5.     return gcd(j, r)
Example: gcd(235, 141)
iteration 1: gcd(235, 141): k = 235; j = 141; r = k mod j = 235 - 1 * 141 = 94
iteration 2: gcd(141, 94): k = 141; j = 94; r = 141 - 1 * 94 = 47
iteration 3: gcd(94, 47): k = 94; j = 47; r = 94 - 2 * 47 = 0
iteration 4: gcd(47, 0): returns 47.
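Euclid's method translates almost line for line into code. The sketch below gives the recursive form from the slide plus an iterative variant (gcd_trace, our own helper) that records each (k, j, r) triple so the iterations of the example can be reproduced.

```python
def gcd(k: int, j: int) -> int:
    """Euclid's method for nonnegative integers, using gcd(k, j) = gcd(j, k mod j)."""
    if j == 0:
        return k
    return gcd(j, k % j)


def gcd_trace(k: int, j: int) -> tuple[int, list[tuple[int, int, int]]]:
    """Iterative version that also returns the (k, j, r) values of each iteration."""
    steps = []
    while j != 0:
        r = k % j
        steps.append((k, j, r))
        k, j = j, r
    return k, steps


result, steps = gcd_trace(235, 141)
for k, j, r in steps:
    print(f"gcd({k}, {j}): r = {r}")
print("gcd =", result)  # gcd = 47
```

Python's standard library also provides math.gcd, which can be used instead of a hand-written version.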
Can we use *n and its inverse to design asymmetric keys?
Not quite: such a mechanism is not secure.
First, let's look at the scheme that works: RSA
RSA (named after Profs. Rivest, Shamir & Adleman) was proposed in the 1970s at MIT
It is the basis of almost all eCommerce security today
Main idea:
- The public key, Kp, provides a mechanism to encode the message
- Given Kp and the encrypted message M* = rsa(Kp, M), we cannot efficiently compute Kp^(-1)
- The secret key, Ks, provides an efficient means to compute Kp^(-1)
Before studying the theory behind RSA, let's first see how RSA functions.
1. Select two large prime numbers, p and q
2. Let n = pq; let T = ( p - 1)( q - 1)
3. Select a large prime, e (e != 1), such that gcd( e, T) = 1
4. Calculate d = e^(-1) mod T
5. The public key, Kp is (n, e)
6. The secret key, Ks is d
The RSA scheme
Notes:
Large prime: a prime number with 150 digits or more (later we shall see why)
Is T prime?
In step 3, e is selected so that e, T are relatively prime.
Suppose Alice wants to send Bob a message, x ( 0 < x < n)
1. Alice gets Bob's public key, (n, e)
2. Alice computes x* = x^e mod n
3. Alice sends x* to Bob.
Bob wants to decrypt the message received from Alice:
1. Bob looks up his secret key, d
2. Bob computes x** = (x*)^d mod n
Claim: x** = x = original message that Alice wants to send.
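The key setup and the Alice/Bob exchange can be tried out with deliberately tiny primes. The values p = 61, q = 53, e = 17 and the message x = 65 are our own toy choices, far too small to be secure, and the function names are illustrative. The sketch relies on Python's built-in pow: pow(x, e, n) is modular exponentiation, and pow(e, -1, T) computes the inverse of e mod T (Python 3.8 or later).

```python
from math import gcd


def make_keys(p: int, q: int, e: int) -> tuple[tuple[int, int], int]:
    """Steps 1-6 of the scheme: return (public key (n, e), secret key d)."""
    n = p * q
    t = (p - 1) * (q - 1)
    if gcd(e, t) != 1:
        raise ValueError("e must be relatively prime to T")
    d = pow(e, -1, t)  # d = e^(-1) mod T (needs Python 3.8+)
    return (n, e), d


def encrypt(x: int, public_key: tuple[int, int]) -> int:
    """Alice: x* = x^e mod n."""
    n, e = public_key
    return pow(x, e, n)


def decrypt(x_star: int, d: int, n: int) -> int:
    """Bob: x** = (x*)^d mod n."""
    return pow(x_star, d, n)


public_key, d = make_keys(61, 53, 17)  # toy primes, assumed for illustration only
n, e = public_key
x = 65                                  # message, 0 < x < n
x_star = encrypt(x, public_key)
print(public_key, d)                    # (3233, 17) 2753
print(x_star, decrypt(x_star, d, n))    # 2790 65
```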
RSA: usage and security
To prove that RSA works, we need to prove the following:
1. Correctness: (x^e mod n)^d mod n = x
2. Security:
2.1. A party who knows n, e, and M^e mod n, but not p, q, or d cannot compute M
2.2. A party who knows n (public key) cannot find its factors p, q (otherwise they could easily calculate d!)
RSA involves the following step:
…
4. Calculate d = e^(-1) mod T
Multiplicative inverse modulo n
What is e^(-1)?
In Zn, we say that a^(-1) is the multiplicative inverse of a (!= 0) iff a *n a^(-1) = a^(-1) *n a = 1
Does such an inverse always exist ? If so, how can we compute it ?
a     a^(-1)
_______________
1     1
2     -
3     -
4     -
5     5
6     -
7     7
8     -
9     -
10    -
11    11
Computing the multiplicative inverse
We need a solution to: a *n x = 1, which is the same as ax mod n = 1
⇒ ax = qn + r (for some integer q, and r = 1)
⇒ ax + (-q)n = 1
Claim: If a ∈ Zn, and x, y are integers such that ax + ny = 1, then a^(-1) = x mod n
Proof (sketch):
a *n x = (a *n x) + (n *n y)      [since n *n y = 0]
       = (a *n x) +n (n *n y)
       = (ax + ny) mod n          [since (s + t) mod n = (s mod n + t mod n) mod n; Exercise: prove this]
       = 1
Recall Theorem 1. Given two positive integers j, k, gcd(j, k) = 1 iff there are integers x and y such that jx + ky = 1.
Computing the multiplicative inverse..
To solve: a *n x = 1, we need to find two integers x, y such that (ax + ny) mod n =1
The following algorithm, with inputs a, n, solves for x (if it exists):
Algorithm gcd_xy(k, j) // 0 < j < k
// returns: [x, y, gcd(j, k)] such that jx + ky = gcd(j, k)
1. If k = jq for some integer q, return [x = 1, y = 0, gcd(k, j) = j];
2. Else
3.     r = k mod j; // therefore k = jq + r, with 0 < r < j
4.     q = (k - r)/j
5.     [x', y', gcd(r, j)] = gcd_xy(j, r) // so rx' + jy' = gcd(r, j)
6.     return [x = y' - qx', y = x', gcd(r, j)]
Exercise: prove that step 6 returns the correct values of x, y
Correctness of RSA
We need to prove that: (x^e mod n)^d mod n = x
Recall the RSA scheme:
1. Select two large prime numbers, p and q
2. Let n = pq; let T = (p - 1)(q - 1)
3. Select a large prime, e (e != 1), such that gcd(e, T) = 1
4. Calculate d = e^(-1) mod T
5. The public key, Kp is (n, e)
6. The secret key, Ks is d
We will use the following: For any a ∈ Zn and non-negative integers i, j
(a) (a^i mod n) *n (a^j mod n) = a^(i+j) mod n
(b) (a^i mod n)^j mod n = a^(ij) mod n
and
Fermat's little theorem:
Let p be a prime number. Then, for every nonzero a ∈ Zp, a^(p-1) mod p = 1.
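Both the exponent rules and Fermat's little theorem are easy to spot-check numerically. The sketch below is a check on sample values, not a proof; the function names are our own, and pow(a, i, n) is Python's built-in modular exponentiation.

```python
def check_exponent_rules(a: int, n: int, i: int, j: int) -> bool:
    """Check rules (a) and (b) for one choice of a, n, i, j."""
    rule_a = (pow(a, i, n) * pow(a, j, n)) % n == pow(a, i + j, n)
    rule_b = pow(pow(a, i, n), j, n) == pow(a, i * j, n)
    return rule_a and rule_b


def check_fermat(p: int) -> bool:
    """Check a^(p-1) mod p == 1 for every nonzero a in Zp."""
    return all(pow(a, p - 1, p) == 1 for a in range(1, p))


print(check_exponent_rules(5, 12, 3, 4))  # True
print(check_fermat(13))                   # True (13 is prime)
print(check_fermat(12))                   # False (12 is not prime)
```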
Correctness of RSA…
We first prove that for prime p (or q), x mod p = x^(ed) mod p
ed mod T = 1 ⇒ there is some integer k such that ed = 1 + kT
⇒ x^(ed) mod p = x^(1 + k(p-1)(q-1)) mod p = x * (x^(k(q-1)))^(p-1) mod p
case 1. x^(k(q-1)) is a multiple of p
⇒ x is a multiple of p (since p is prime)
⇒ x^(ed) mod p = 0 = x mod p
case 2. x^(k(q-1)) is not a multiple of p
⇒ (x^(k(q-1)))^(p-1) mod p = 1 (Fermat's little theorem)
⇒ x^(ed) mod p = x * 1 mod p = x mod p
Recall: primes: p, q; n = pq; T = (p - 1)(q - 1); e chosen such that gcd(e, T) = 1; d = e^(-1) mod T
x^(ed) mod p = x mod p and x^(ed) mod q = x mod q (for the primes p, q)
⇒ x^(ed) - x is divisible by p (and by q)
⇒ x^(ed) - x = ip = jq for some integers i, j
⇒ x^(ed) - x is also divisible by pq [why?]
⇒ x^(ed) - x = k(pq) = kn for some integer k
⇒ x^(ed) = kn + x.
Therefore, for 0 ≤ x < n, x^(ed) mod n = x
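The result can be confirmed exhaustively for a small modulus, including the messages that are multiples of p or q (case 1 of the proof). The primes p = 61, q = 53 and exponent e = 17 are toy values chosen by us for illustration; the check says nothing about security, only about correctness.

```python
p, q, e = 61, 53, 17           # toy values, assumed for illustration only
n = p * q                      # 3233
t = (p - 1) * (q - 1)          # T = 3120
d = pow(e, -1, t)              # d = e^(-1) mod T (Python 3.8+)

# ed mod T = 1 is the starting point of the proof.
ed_is_one_mod_t = (e * d) % t == 1

# x^(ed) mod p = x mod p for every x, including multiples of p (case 1).
holds_mod_p = all(pow(x, e * d, p) == x % p for x in range(n))
holds_mod_q = all(pow(x, e * d, q) == x % q for x in range(n))

# Hence encrypt-then-decrypt returns x for every 0 <= x < n.
round_trip_ok = all(pow(pow(x, e, n), d, n) == x for x in range(n))

print(ed_is_one_mod_t, holds_mod_p, holds_mod_q, round_trip_ok)
```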
Security of RSA
To show that RSA is secure, we need some guarantee that
2.1. A party who knows n, e, and M^e mod n, but not p, q, or d cannot compute M
2.2. A party who knows n (public key) cannot find its factors p, q (otherwise they could easily calculate d!)
Recall: primes: p, q; n = pq; T = (p - 1)(q - 1); e chosen such that gcd(e, T) = 1; d = e^(-1) mod T
Given n, e, and M^e mod n, can we work backwards and compute M?
There is no known efficient algorithm to compute the e-th root of a number mod n.
[note: if n was always fixed, we could use a computer to build up a look-up decrypting sheet!]
Given n (public key) can we find its factors p, q, and use them to compute T, and then use e to compute d ?
So far, there is no known efficient algorithm to factorize a number.
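For a small n the attack described above is easy to carry out, which is why the primes must be large. The sketch below factors n by trial division (our own choice of method, the simplest possible) and then recomputes d exactly as the key owner would. Trial division needs on the order of sqrt(n) steps, so it is hopeless for the 150-digit primes mentioned earlier; the toy key (n, e) = (3233, 17) is an assumption for illustration.

```python
def factor_semiprime(n: int) -> tuple[int, int]:
    """Find p, q with n = p*q by trial division; only feasible for small n."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f, n // f
        f += 1
    raise ValueError("n has no nontrivial factor")


def recover_secret_key(n: int, e: int) -> int:
    """Given only the public key (n, e), factor n and recompute d = e^(-1) mod T."""
    p, q = factor_semiprime(n)
    t = (p - 1) * (q - 1)
    return pow(e, -1, t)  # Python 3.8+


print(factor_semiprime(3233))        # (53, 61)
print(recover_secret_key(3233, 17))  # 2753
```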
Discussion
RSA is currently the basis for almost all secure eCommerce
Examples:
banks (e.g. try hsbc.com, standardchartered.com.hk, …)
signed emails (e.g. HKUST's ITSC)
Once RSA has established a secure communication channel, two-way symmetric encryption is used, usually some variant of DES, which is a block cipher algorithm.
Three important mathematicians whose works were used in this lecture:
Euclid (300 BC)
Fermat (17th century)
Euler (18th century)
Next: final exams