If you run ssh-keygen with its default settings, it will likely generate a 2048-bit RSA key pair.1 When ssh-keygen finishes you end up with a private key file, id_rsa, and a matching public key file, id_rsa.pub.
The private key file, id_rsa, may look like this:2
The public key file, id_rsa.pub, may look like this:
These files may look somewhat imposing but it is actually relatively easy to decode them. In this blog post we will detail two approaches to extracting the contents of both id_rsa and id_rsa.pub using Python:
Our first approach (“using the mxklabs.rsa package”) is aimed at developers who want to get things done quickly and do not much care how things work; all the real work is done in a library.
The second approach (“decode the key files manually”) is more manual and explains the individual steps required to decode the key files, providing more insight as to how information is encoded in the key files and essentially revealing how the mxklabs.rsa package is implemented.
Using the mxklabs.rsa package
We have developed a package in the mxklabs PyPI library, named mxklabs.rsa (docs), that is capable of decoding the id_rsa and id_rsa.pub files shown above.
To use mxklabs.rsa, you first need to install the mxklabs library:
With mxklabs installed, you are now able to import the mxklabs.rsa package. With this package you can decode the key files as follows:
After executing the code above, the id_rsa and id_rsa_pub variables are dictionaries containing all the relevant fields.3 To see what these dictionaries look like, please see the JSON listings at the bottom of this blog post.
Decode the key files manually
Let’s start with the public key file, id_rsa.pub (see listing above).
The key information is all stored in the character sequence between the ssh-rsa and the mxk@krypton markers, separated by a space character.4 The character string
is a binary blob of data that is encoded using the base64 binary-to-text encoding scheme.
Using Python, we can obtain the underlying binary data as follows:
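A minimal sketch of this step, assuming the line read from id_rsa.pub has the usual space-separated layout (key type, base64 data, optional comment). The helper name extract_key_blob is our own and is used here purely for illustration.

```python
import base64


def extract_key_blob(public_key_line):
    """Return the decoded binary blob from one line of an id_rsa.pub file."""
    # The line has the form "<key type> <base64 data> <comment>".
    fields = public_key_line.strip().split(" ")
    if len(fields) < 2 or fields[0] != "ssh-rsa":
        raise ValueError("not an ssh-rsa public key line")
    # validate=True rejects characters outside the base64 alphabet
    # instead of silently skipping them.
    return base64.b64decode(fields[1], validate=True)
```

Reading the file and passing its contents to extract_key_blob then gives the binary_data bytes discussed next.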
The binary_data bytes encode three fields: 1) a string that should read ‘ssh-rsa’, 2) an arbitrary-precision integer public exponent, and 3) an arbitrary-precision integer modulus. The encoding of these fields is explained in Section 6.6 of RFC 4253 and Section 5 of RFC 4251, and essentially boils down to each of the three fields consisting of a 4-byte big-endian length followed by that many bytes of data.
We can write a bit of Python code to get these fields:
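The sketch below shows one way to walk the length-prefixed fields, assuming binary_data holds the decoded blob from the previous step. The function names and the dictionary keys (key_type, public_exponent, modulus) are our own choices for illustration and need not match what mxklabs.rsa returns.

```python
import struct


def read_field(data, offset):
    """Read one length-prefixed field; return (payload, next_offset)."""
    # Each field starts with a 4-byte big-endian unsigned length.
    (length,) = struct.unpack_from(">I", data, offset)
    start = offset + 4
    end = start + length
    if end > len(data):
        raise ValueError("field runs past the end of the data")
    return data[start:end], end


def parse_ssh_rsa_blob(binary_data):
    """Decode the key type, public exponent and modulus from the blob."""
    key_type, offset = read_field(binary_data, 0)
    exponent, offset = read_field(binary_data, offset)
    modulus, offset = read_field(binary_data, offset)
    return {
        "key_type": key_type.decode("ascii"),
        # Integers are stored big-endian in two's complement form, so a
        # positive value may carry a leading zero byte.
        "public_exponent": int.from_bytes(exponent, "big", signed=True),
        "modulus": int.from_bytes(modulus, "big", signed=True),
    }
```

Python's int.from_bytes handles integers of any size, so the same code works unchanged for a 2048-bit modulus.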
This is our public key decoded! Now let’s turn our attention to the private key, id_rsa (see listing above).
The way in which information is encoded in the private key file, id_rsa, is a little convoluted to explain.
Firstly, the file has a PEM wrapper which wraps the main file content in a header (-----BEGIN RSA PRIVATE KEY-----) and a footer (-----END RSA PRIVATE KEY-----). The key’s information is hiding in the sequence of characters between the header and footer. This is a blob of binary data that is again text-encoded using the base64 binary-to-text scheme but, this time, the character stream is split over multiple lines. Let’s write a bit of code to get the binary data:
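A minimal sketch of this step, assuming the file uses exactly the header and footer shown above and has no passphrase. Keys written in the newer "BEGIN OPENSSH PRIVATE KEY" format use a different layout and are not handled here. The helper name pem_to_binary is our own.

```python
import base64

PEM_HEADER = "-----BEGIN RSA PRIVATE KEY-----"
PEM_FOOTER = "-----END RSA PRIVATE KEY-----"


def pem_to_binary(pem_text):
    """Strip the PEM header and footer and decode the base64 body."""
    lines = [line.strip() for line in pem_text.strip().splitlines()]
    if len(lines) < 3 or lines[0] != PEM_HEADER or lines[-1] != PEM_FOOTER:
        raise ValueError("missing RSA PRIVATE KEY header or footer")
    # The base64 text is wrapped over several lines; join them before
    # decoding. A passphrase-protected key has extra "Name: value" lines
    # here, which validate=True rejects as invalid base64.
    body = "".join(lines[1:-1])
    return base64.b64decode(body, validate=True)
```

Passing the text of id_rsa to pem_to_binary gives the binary_data bytes for the private key.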
The method by which the key information is encoded within binary_data in the private key file differs from the public key file.
The binary blob is the result of encoding a data structure, described by an ASN.1 schema, using DER (a restricted form of BER). The relevant schema is listed below (taken from RFC 3447):
For the purpose of this blog post, let’s assume the ASN.1 schema is saved in a file, rfc3447.asn, in the same directory as the python code. With this schema in place we can decode the binary blob as follows:
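As a dependency-free alternative to a schema-driven ASN.1 library, the sketch below hand-decodes the blob using only the standard library. It is a swapped-in technique for illustration, not the library-based approach described above. It assumes a two-prime key, so the blob is a single SEQUENCE of nine INTEGERs, and it takes its field names from the RSAPrivateKey type in RFC 3447. The function names are our own.

```python
RSA_PRIVATE_KEY_FIELDS = (
    "version", "modulus", "publicExponent", "privateExponent",
    "prime1", "prime2", "exponent1", "exponent2", "coefficient",
)


def read_tlv(data, offset):
    """Read one tag-length-value item; return (tag, value, next_offset)."""
    tag = data[offset]
    length = data[offset + 1]
    offset += 2
    if length & 0x80:
        # Long form: the low 7 bits give the number of length bytes.
        num_bytes = length & 0x7F
        length = int.from_bytes(data[offset:offset + num_bytes], "big")
        offset += num_bytes
    if offset + length > len(data):
        raise ValueError("item runs past the end of the data")
    return tag, data[offset:offset + length], offset + length


def decode_rsa_private_key(binary_data):
    """Decode an RSAPrivateKey blob into a dict of Python integers."""
    tag, body, _ = read_tlv(binary_data, 0)
    if tag != 0x30:  # 0x30 is the tag for SEQUENCE
        raise ValueError("expected a SEQUENCE")
    values = []
    offset = 0
    while offset < len(body) and len(values) < len(RSA_PRIVATE_KEY_FIELDS):
        tag, value, offset = read_tlv(body, offset)
        if tag != 0x02:  # 0x02 is the tag for INTEGER
            raise ValueError("expected an INTEGER")
        values.append(int.from_bytes(value, "big", signed=True))
    if len(values) != len(RSA_PRIVATE_KEY_FIELDS):
        raise ValueError("too few fields for an RSAPrivateKey")
    return dict(zip(RSA_PRIVATE_KEY_FIELDS, values))
```

A general-purpose ASN.1 library is the better choice for anything beyond this one fixed structure; the point here is only to show how little is hiding inside the blob.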
We now bring together all the code we have so far into a single class:
Again, after executing the code above, the id_rsa and id_rsa_pub variables are dictionaries containing all the relevant fields.3 To see what these dictionaries look like, please continue reading the next section.
Results
Finally, time to look at the results! As it turns out it is just as well that Python supports arbitrary precision integers by default because the numbers involved are absolutely enormous.
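Because Python integers have arbitrary precision, it is easy to sanity-check a decoded private key with ordinary arithmetic. The sketch below assumes the key is a dictionary keyed by the RFC 3447 field names (modulus, privateExponent, prime1 and so on); the function name check_rsa_private_key is our own.

```python
def check_rsa_private_key(key):
    """Return True if the decoded fields are mutually consistent."""
    p, q = key["prime1"], key["prime2"]
    d = key["privateExponent"]
    return (
        # The modulus is the product of the two primes.
        key["modulus"] == p * q
        # exponent1 and exponent2 are d reduced modulo p-1 and q-1.
        and key["exponent1"] == d % (p - 1)
        and key["exponent2"] == d % (q - 1)
        # coefficient is the inverse of q modulo p.
        and (key["coefficient"] * q) % p == 1
    )
```

The same checks run just as happily on 2048-bit values as on toy numbers.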
The result of decoding our private key file, id_rsa:
The result of decoding our public key file id_rsa.pub:
For the purpose of this post we only look at the case where the private key file does not have a passphrase applied. ↩
Note that the code above assumes your key files are in your current directory. Typically ssh-keygen generates the files in ~/.ssh. You may want to move the key files to your current directory or pass in an absolute path to the files instead. ↩
In my case this is mxk@krypton but generally it will be username@hostname. ↩