Understanding the RSA public-key cryptosystem

Posted on 06/07/20

If you are an advanced computer user you have probably used ssh-keygen to generate an RSA key pair to access an online service. In this blog post we explain the principles behind the RSA encryption scheme, explaining mathematics behind these keys and the assumptions that underpin it.

Before we get technical, note that there are more applications for RSA than just SSH authentication. For example, many websites use the RSA cryptosystem to prove to visitors that their content can be trusted (i.e. genuinely served by the company’s web servers as opposed to a dodgy man or woman in a van down the road or an meddling government).¹ Moreover, modern browsers use the RSA cryptosystem to keep your credit card information secure when you shop online.

Background & terminology

To explain some of the mathematical properties of RSA keys we need to understand a number of mathematical concepts. If you are reasonably comfortable with mathematics you’re probably okay to skip this section.

Integer – With integers we mean whole numbers (e.g. $0$ , $1$ , $2$ , …). The integers in this blog post are all non-negative, unless stated otherwise.

Divides – We say an integer, $a$ , divides another integer, $b$ , if there is another integer, $i$ , such that ${a}\cdot{i} = b$ . The important part here is that there is no remainder and $a$ , $b$ and $i$ are all whole numbers.

Greatest common divisor – For two integers, $a$ and $b$ , we say the greatest common divisor of $a$ and $b$ , denoted by $\text{gcd}(a,b)$ , is the largest integer that divides both $a$ and $b$ .

Least common multiple – For two integers, $a$ and $b$ , we say the least common multiple of $a$ and $b$ , denoted by $\text{lcm}(a,b)$ , is the smallest integer that is divisible by both $a$ and $b$ . By definition, both $a$ and $b$ divide $\text{lcm}(a,b)$ .

Prime – We say an integer, $a$ , is a prime number when the only integers that divide $a$ are $1$ and $a$ itself.

Co-prime – We say two integers, $a$ and $b$ , are co-prime (also known as relatively prime) when $\text{gcd}(a,b) = 1$ . In other words, there is no integer greater than $1$ that divides both $a$ and $b$ .

Congruent – Finally, we say that two integers, $a$ and $b$ , are congruent modulo an integer $n$ , denoted by $a \equiv b \text{ mod } n$ , if and only if there is a (possibly negative) integer, $k$ , such that $a - b = {k}\cdot{n}$ . This implies the remainder of dividing $a$ by $n$ is equal to the remainder of dividing $b$ by $n$ .

Fermat’s little theorem – For any prime number, $p$ , and any integer $a$ not divisible by $p$ , the following congruence holds: $a^{(p-1)} \equiv 1 \text{ mod } p \text{ }$ . For this blog post we intend to just use this theorem as is, but you can find more information about it here.

Chinese remainder theorem – Let $a$ and $b$ be any integers and let $p$ and $q$ be two co-prime integers. If the congruences $a \equiv b\text{ mod }p$ and $a \equiv b\text{ mod }{q}$ both hold then the congruence $a \equiv b\text{ mod }(p\cdot{q})$ holds also. Note that this theorem, as stated, is not really an accurate representation of the Chinese remainder theorem – it is more of a consequence. We point any unsatisfied readers to this article to find out more.

The mathematical properties of RSA integers

We start by looking at the integer fields contained within RSA key files and the mathematical properties that make them special. We list both the field names by which these numbers appear in RFC 3447 as well as commonly used mathematical symbols that are often used to represent them.²

Field Name (RFC 3447)	Math Symbol	Description
`modulus`	$n$	A base for modular arithmetic.
`publicExponent`	$e$	See below.
`privateExponent`	$d$	See below.
`prime1`	$p$	A large prime number.
`prime2`	$q$	A large prime number.

In this section, we are interested in the abstract mathematical properties of these integers.³ The integer values are different for every key pair but they always satisfy the following mathematical properties:

The modulus, $n$ , is equal to $p\cdot{q}$ .
The public exponent, $e$ , is co-prime to $\text{lcm}(p-1,q-1)$ .
The private exponent, $d$ , is such that ${e}\cdot{d} \equiv 1 \text{ mod } (\text{lcm}(p-1,q-1))$ .⁴

These properties are met by construction. That is, ssh-keygen makes sure these properties hold for every RSA key pair that it generates. In this blog post we are not interested in how ssh-keygen is able to achieve this –that is a whole other discussion– we are only interested in what it means for us. To this end, we will derive further properties from the ones we have already.

To start with, using the definition of congruent (see here) we can expand property 3. The definition suggests that if ${e}\cdot{d} \equiv 1 \text{ mod } (\text{lcm}(p-1,q-1))$ then there must be an integer, $k$ , such that $e\cdot{d} - 1$ is equal to $k\cdot{\text{lcm}(p-1, q-1)}$ . This equality allows us to state the following property is guaranteed to hold for some integer $k$ :

$\begin{aligned} e\cdot{d} &= k\cdot{\text{lcm}(p-1, q-1)} +1 &\text{(expansion of property 3)} \\ \end{aligned}$

We’ll need this equality for deriving our next property. The remaining properties are all quantified and hold for all integers, $m$ . The first one is this:

$\begin{aligned} m^{e{d}} &\equiv m \text{ mod } p &\text{for all integers } m{} \\ \end{aligned}$

This property is non-trivial to demonstrate. First consider an $m$ such that $p$ divides $m$ . In this case, both the left-hand side and the right-hand side of the congruence are equivalent to $0 \text{ mod } p$ meaning the congruence is trivially true. When $p$ does not divide $m$ things get more difficult. We will derive the congruence step-by-step:

$\begin{aligned} m^{e{d}} &\equiv m^{(k \cdot \text{lcm}(p-1,q-1) + 1)} \text{ mod } p &\text{(using property 4 for some }k\text{)} \\ &\equiv m\cdot{m^{k \cdot \text{lcm}(p-1,q-1)}} \text{ mod } p &\text{(definition of exponentiation)} \\ &\equiv m\cdot{m^{k \cdot({(p-1)}\cdot{i})}} \text{ mod } p &\text{(for some } i \text{ as } (p-1)\text{ divides }\text{lcm}(p-1,q-1)\text{)} \\ &\equiv m\cdot{(m^{(p-1)})^{(k \cdot{i})}} \text{ mod } p &\text{(reshuffling of brackets)} \\ &\equiv m\cdot{1^{(k \cdot{i})}} \text{ mod } p &\text{(Fermat's little theorem))} \\ &\equiv m\cdot{1} \text{ mod } p &\text{(exponentiation of }1\text{)} \\ &\equiv m \text{ mod } p &\text{(multiplicative identity)} \\ \end{aligned}$

If you are struggling to follow some of the steps – don’t worry. The main thing to remember is that $m^{e{d}} \equiv m \text{ mod } p$ holds. Property 5 can be restated to hold modulo $q$ to give us the next property:

$\begin{aligned} m^{e{d}} &\equiv m \text{ mod } q &\text{for all integers } m{} \\ \end{aligned}$

The argument for property 6 is entirely symmetrical to that of property 5 as with $q$ substituted for $p$ – we won’t repeat it here.

Next comes the magical property we were really interested in. I’m going to call it the big cheese property, if you will, as it underpins the security of much of our current digital world:

$\begin{aligned} m^{e{d}} &\equiv m \text{ mod } n &\text{for all integers } m{} \\ \end{aligned}$

This property uses the fact that both $p$ and $q$ are primes and hence are co-prime to each other. This, combined with the fact that both $m^{e{d}} \equiv m \text{ mod } p \text{ }$ (property 5) and $m^{e{d}} \equiv m \text{ mod } q \text{ }$ (property 6) we can apply the Chinese remainder theorem to get the ‘big cheese’ property:

$\begin{aligned} m^{e{d}} &\equiv m \text{ mod } (p\cdot{q}) &\text{(Chinese remainder theorem and properties 5 & 6)}\\ &\equiv m \text{ mod } n &\text{(property 1)} \end{aligned}$

That pretty much concludes the difficult part of this blog post. We will see why this property is so useful in the following section.

NOTE: As a small aside, you may have noticed we haven’t used property 2 in our derivations. As it turns out, property 2 is not useful in establishing the ‘big cheese’ property above but it is extremely relevant if you are in the business of generating RSA keys. As it turns out property 2 is a necessary and sufficient condition on public exponent $e$ that guarantees exactly one private exponent $d$ exists for which property 3 holds (see here).

How RSA is used in SSH authentication

It’s time to talk about the information contained the private key and public key.

Public key – The public key, $\langle n, e\rangle$ , comprises the modulus, $n$ , and the public exponent, $e$ .

Private key – The private key, $\langle n, e, d, p, q\rangle$ , comprises all the integer fields detailed above, i.e. the modulus, $n$ , the public exponent, $e$ , the private exponent, $d$ , and both prime numbers, $p$ and $q$ .

With SSH you install the public key, $\langle n, e\rangle$ , on the SSH server. This tells the server to only allow access to the server to SSH clients with access to the matching private key, $\langle n, e, d, p, q\rangle$ .

NOTE: As you can see, a private key is much like a physical key to your house. Really, anybody with access to your house key can get into your house. Similarly, an SSH server will let in anybody that can demonstrate they have access to the private key that matches the installed public key. In contast, a public key can be shared with others without compromising security.

To determine whether the client has access to the private key the SSH server will pose a the client a cryptographical challenge. This cryptographical challenge involves the encryption and decryption of an integer which we define next.

Encrypt and decrypt functions, $\textbf{Encrypt} : \mathbb{Z}_n \rightarrow \mathbb{Z}_n$ and $\textbf{Decrypt} : \mathbb{Z}_n \rightarrow \mathbb{Z}_n$ , are defined as follows:

$\begin{aligned} \textbf{Encrypt}(m) &= m^e \text{ mod } n &\text{(for any } m < n\text{)}\\ \textbf{Decrypt}(m) &= m^d \text{ mod } n &\text{(for any } m < n\text{)} \end{aligned}$

Note that to implement the $\mathbf{Decrypt}$ function you need access to the private exponent, $d$ , which is part of the private key, but not the public key. In contrast, anyone with access to either the private or public key can implement the $\mathbf{Encrypt}$ function.

Cryptographical challenge – To determine whether an SSH client is allowed to access to the private key the SSH server poses the client with a challenge. If the client answers this challenge correctly, he is granted access and, conversely, is denied access if it’s wrong. The challenge is as follows:

“Using public key $\langle n, e \rangle$ I have calculated the value of $\textbf{Encrypt}(m)$ for some integer $m < n$ that I am not willing to share. The value of $\textbf{Encrypt}(m)$ was $\phi$ . To demonstrate you have access to the private key that matches $\langle n, e \rangle$ please tell me: what was $m$ ?”

For someone with access to the private key it is easy to answer this challenge. Simply compute $\textbf{Decrypt}(\phi)$ and use the result as the answer. To see why this works please follow the derivation below:

$\begin{aligned} \textbf{Decrypt}(\phi) &= \textbf{Decrypt}(\textbf{Encrypt}(m)) &\text{(expanding } \phi \text{)} \\ &= \textbf{Decrypt}(m^e \text{ mod } n) &\text{(expanding }\textbf{Encrypt}\text{)} \\ &= (m^e \text{ mod } n)^d \text{ mod } n &\text{(expanding }\textbf{Decrypt}\text{)} \\ &= (m^e)^d \text{ mod } n &\text{(property of modular arithmetic)} \\ &= m^{ed} \text{ mod } n &\text{(reshuffling of brackets)} \\ &= m \text{ mod } n &\text{(property 7, see above)} \\ &= m &\text{(because } m < n\text{)} \\ \end{aligned}$

It is clear from the result above that someone with access to the private key can easily answer the challenge and hence successfully authenticate with the SSH server.

However, what I think is really fascinating is this: even though the RSA cryptosystem has been around since 1976 and underpins much of the security of our digital infrastructure the jury is still out on whether someone without access to the private key and with malicious intent is able to do the same.

To the best of our knowledge nobody to date has demonstrated that it is either practical or intractible to answer the crypographical challenge without access to the private key.⁵

Why intractibility of integer factorisation is important

As explained in painstaking detail above, the RSA cryptosystem relies on the intractibility of figuring out the private exponent, $d$ , from the public key $\langle n, e \rangle$ . As suggested, whether or not this is tractible or intractible is not known at this time. However, what we can show you is that if the integer factorisation of $n$ into it’s prime factors, $p$ and $q$ , was known to be tractible, then so is finding $d$ .

The easiest way of demonstrating by assuming we have got such a prime factorisation algorithm and all that remains is to calculate te private exponent, $d$ , given the public exponent, $e$ , the modulus, $n$ , and the modulus’ prime factors, $p$ and $q$ . We will make some code available below that does just that.

As it turns out, it really isn’t that hard. Computing a multiplicative inverse –which is what $d$ is to $e$ – can be done with the extended Euclid’s algorithm for calculating greatest common multiples. The extension also returns Bézout’s identity which, as it happens, is exactly what we need:

def compute_gcd(a, b):
    """ Compute greatest common divider (and first
        part of Bezout identity). """
    s, old_s = 0, 1
    r, old_r = b, a
    while r != 0:
        quotient = old_r // r
        old_r, r = (r, old_r - quotient * r)
        old_s, s = (s, old_s - quotient * s)
    return old_r, old_s

def compute_lcm(a, b):
    """ Compute least common multiple. """
    gcd, _ = compute_gcd(a, b)
    return (a * b) // gcd

def compute_private_exponent(n, e, p, q):
    """ Compute private exponent for RSA key. """
    lcm = compute_lcm(q-1, p-1)
    _, d = compute_gcd(e, lcm)
    return d

To confirm this works we’ve taken $n$ , $e$ , $p$ and $q$ from a previous blog post to try and see if the code computes the private exponent, $d$ , in a reasonable amount of time:

e = compute_private_exponent(n=27028282097021007843530736295809028972287130804943291526394511671226637062421093678344947754892846146059686005906984121188582642418038460041083600507140145943897000826762732168113824111207488605430021018578532026899183882944211773425219978820952105786648393593320324798767475547131763638447797531944528917873480907614201192767300205075705600683404678612905224986079088992401745551569587852046445566476066671216315954347018322178278064801182356599994103339896616982382386482818232381955997033858646828579104353888626693549547004894914282809654656967807891475124321438211744767717141904033565635881130868926714032028313, p=169120792137309513187088856430138902645863513771076998494915397454071219964124181055031031165311658272227307267767371083617556160033536507720058349381562591635026476767457778619502442710933957759620012577158733592776880327171553745375315805972329811597033879177316362633155792077305751221624112410311659706367, q=159816435078406509257707045364609414478299157015763194002855248882211736305449832157704322103217393768557341636407284696675346058014066187166106557637304470530755860929184546547739382766642456301610668794816746806065634935204886954691281821915894398132764445150442192848203128267403133882872125564960906440039, e=65537)
print(f'e={e}')

On my PC this code returns in about 13ms with the following value:

e=2884825873455634982765422591653328008013007613723214737127570824728478969919824683461600463028143778196858318374648731673926722061036956650249168951087863662931771682915075629277449374519681749164334605260491501413854635720200212934821761018242519187292758490398945206026801523600205786821831083753482456940700642886567334632806866154561643229048204968416590653083011456592885100845684068581744708201271429057369752682800424896682884664678389361219311676426749175151672276654798395520657684706739514313327224914994205349484686588943683805431564995082748257037012514787348880091998443183294215049247894220140272729053

As you can confirm yourself, this is indeed the matching private exponent from the previous blog post. As a result we can conclude that the intractibility of calculating $n$ ’s prime factors indeed underpins the RSA cryptosystem’s security.

NOTE: Note that the implication only goes one way. It may well be possible to obtain the private component, $d$ , even if the integer factorisation of the modulus, $n$ , is found to be intractible. That is, there may be other ways of breaking RSA.

You can check this yourself. For example, if you head to, e.g., google.com and click on the little padlock symbol next to the address bar you can inspect the details of Google’s SSL certificate and confirm it was signed using the very same maths described in this blog post. ↩
It’s worth noting that the private key file has additional fields we are not discussing here, notably exponent1, exponent2 and coefficient. These fields exist to help make decryption computationally more efficient but have little bearing on what we are discussing in this post. ↩
If you would like to get a flavour of what actual values may look like please have a look at this blog post. Note of caution: these integers are huge. ↩
Note that in properties 2 and 3 we could have used $(p-1)(q-1)$ instead of $\text{lcm}(p-1, q-1)$ , although the former has slightly nicer properties, see this page for details. ↩
For the types of keys commonly used in practiced and with reasonable resources. ↩