Decoding OpenSSH RSA certificates with Python

If you run ssh-keygen with default settings, it will most likely generate a 2048-bit RSA key pair.1 Once ssh-keygen has finished you will have a private key file, id_rsa, and a matching public key file, id_rsa.pub.

The private key file, id_rsa, may look like this:2

-----BEGIN RSA PRIVATE KEY-----
MIIEpAIBAAKCAQEA1hrwwcOZoc6EUjlHDxzY0rV8UU6NdDogeeFUfD/3KqXkaeDX
cq/fCkQa57tjWdqfFAifk9GjfZD+wKmM2NByPk+hAsdrx3I8D9eWMbkhsYPGj9Dg
vT+wEiN+Bl8GUt4ix3Tk69evvVf9yeVgKglc4DSfni7SO58Idm6qyfgH7ICsWG1v
eArBUjgdryWzsG1fbWioGmrcwPVL6YBlCRMuQShrrwCzwiJxPrebPA2fiSfcrjCF
oPBNipZT8VFHB3iThQ9TVFR/tB9C7f6n3DvrWfaIfkDqIm3LTLLKFS+Y5u00GLZi
EaGDWsOgixJqpeE2pV3rR9jUfs9h6FQY5iH6mQIDAQABAoIBABbaK0ZTLUuy8jag
fHAlgRMEYe9/teNo7Nx1a4IThbscl8OhRv2rvd+no0OGobUOe5o0zWuGna+iUT6Z
GjpuDTOPZj0Yse1IyRZbyWEnRGxhB0mEXuh0KsPU2/esHs2rfgTR+jkd/Vj1UlZB
UEFMXIhltX+5uaC5ebrCVyJVgesBIuTMTqm7OiO+W9gIwkC3w0tXqwptYhqdTuIZ
+bud5vb/b2yLdlM3RttKCo4vFZubujsiBLeopzT6soZPVkEAy4O9n5u8DGBfBTIB
eyR/TKeZ5JQsL2VorUELzxVLEWM0CFdnlb22hDBSyJ1X6XErz3dJCEGNw9CDQ6uI
11kQu90CgYEA8NYEPlC/jQkpZ+G0w4cFUjnq191giRMWYJ3LbW8ikaT3iNLlONd6
M/NQxKTm5+zztk40S/kfzWK0VLzBUbEdjhov40R3sSh3M3zulodmyCElCAZkqj6L
6XoqpVyUKFLrHKVpyxV69sEfnr2irbi3LCj+mMwJ6RWWt9f7j2JxV/8CgYEA45YO
cyMy6H4ty+39ro+7kRhNA80JSF7gzdHXmDTVi3xyO+cEgGsE+Zrb2KsS0NZO8fjq
2MsndhrbiYs6/qhCVeA8dLu8vkfmXYDZIRsQv9z5my6ZxH9+F8cAUzue9v25Ifl3
89vTVHchUcN2cXWN2xamQ9wYgPmh6YgJrHfwbWcCgYEAo5HHugcfwgtJ6vsZyX7H
t2wMu2XordCf7yjcxEup3996G5yZAH0gy23jGluhVD3T3KrKzBq8ZcM3FSJJ7lDr
8NqKUcHrxQ/lvbuJVAVMYnpYa1XkQthOMFm/4yW4npaKhp819i91n2fVMPw9I94D
0mNZX6+cv4jhH6X6fgzvTEMCgYB7/Y/HyMB+i+f1d6bDCMm+pgenb3iENjSxzYZx
BS/me5lc62K3eBbQyj7GT4XDw05lZCDGlf/cx4sd533vqcniMXWef324CUIHZSBm
efFpJkHS+tOJi5At8hxKPGxB0j+fs+NXN0dueCzt99i6vbnYSAGzbODou5grvBLR
JNMXNwKBgQCmRPLPwcpFXi0T4dCC8FAnYgYZwCt3J9khmZQ/c3FokJYO5z3gQtcZ
CYY80q7kXa+l/CqJ2wgjV8qKqEW2zWZ3/krqwU4Tf+RjOcDTHJ79Ufa80hjAljzQ
0ccKbQNLilbW2G0LOOYIGTgOfddTvoQ7KOaFeGBKh7Nwmbs6FWDRCQ==
-----END RSA PRIVATE KEY-----

The public key file, id_rsa.pub, may look like this:

ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDWGvDBw5mhzoRSOUcPHNjStXxRTo10OiB54VR8P/cqpeRp4Ndyr98KRBrnu2NZ2p8UCJ+T0aN9kP7AqYzY0HI+T6ECx2vHcjwP15YxuSGxg8aP0OC9P7ASI34GXwZS3iLHdOTr16+9V/3J5WAqCVzgNJ+eLtI7nwh2bqrJ+AfsgKxYbW94CsFSOB2vJbOwbV9taKgaatzA9UvpgGUJEy5BKGuvALPCInE+t5s8DZ+JJ9yuMIWg8E2KllPxUUcHeJOFD1NUVH+0H0Lt/qfcO+tZ9oh+QOoibctMssoVL5jm7TQYtmIRoYNaw6CLEmql4TalXetH2NR+z2HoVBjmIfqZ mxk@krypton

These files may look somewhat imposing, but they are actually relatively easy to decode. In this blog post we will detail two approaches to extracting the contents of both id_rsa and id_rsa.pub using Python:

  • Our first approach (“using the mxklabs.rsa package”) is aimed at developers who want to get things done quickly and do not need to know how things work, because all the real work is done in a library.
  • The second approach (“decode the key files manually”) explains the individual steps required to decode the key files, providing more insight into how information is encoded in them and essentially revealing how the mxklabs.rsa package is implemented.

Using the mxklabs.rsa package

We have developed a package in the mxklabs PyPI library, named mxklabs.rsa (docs), that is capable of decoding the id_rsa and id_rsa.pub files shown above.

To use mxklabs.rsa, you first need to install the mxklabs library:

pip install mxklabs

With mxklabs installed you are now able to import the mxklabs.rsa package. With this package you can decode the key files as follows:

import mxklabs.rsa as mxkrsa

# Decode the key files.
id_rsa = mxkrsa.RsaUtils.private_key_from_file('id_rsa')
id_rsa_pub = mxkrsa.RsaUtils.public_key_from_file('id_rsa.pub')

After executing the code above, the id_rsa and id_rsa_pub variables are dictionaries containing all the relevant fields.3 To see what these dictionaries look like, please see the listings in the Results section at the bottom of this blog post.

Decode the key files manually

Let’s start with the public key file, id_rsa.pub (see listing above).

The key information is all stored in the character sequence between the ssh-rsa and the mxk@krypton markers, which are each separated from it by a space character.4 This character sequence is a binary blob of data that is encoded using the base64 binary-to-text encoding scheme.

Using Python, we can obtain the underlying binary data as follows:

import base64

# Read the file; the base64 blob is the second space-separated token.
with open('id_rsa.pub', 'r') as f:
    file_parts = f.read().split(' ')
binary_data = base64.b64decode(file_parts[1])

The binary_data bytes actually encode three fields: 1) a string that should read ‘ssh-rsa’, 2) an arbitrary precision integer public exponent and 3) an arbitrary precision integer modulus. The encoding of these fields is explained in Section 6.6 of RFC 4253 and Section 5 of RFC 4251 and essentially boils down to each of the three fields having a 4-byte big-endian length field followed by that many bytes of data.
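To make the layout concrete, the short sketch below decodes only the first 24 base64 characters of the public key listing above. Those characters already cover the first two fields in full; the variable names are purely illustrative.

```python
import base64

# First 24 base64 characters of the id_rsa.pub listing above (18 bytes).
prefix = 'AAAAB3NzaC1yc2EAAAADAQAB'
raw = base64.b64decode(prefix)

# Field 1: 4-byte big-endian length, then the key type string.
key_type_len = int.from_bytes(raw[0:4], byteorder='big')
key_type = raw[4:4 + key_type_len]

# Field 2: 4-byte big-endian length, then the public exponent.
offset = 4 + key_type_len
exponent_len = int.from_bytes(raw[offset:offset + 4], byteorder='big')
exponent = int.from_bytes(raw[offset + 4:offset + 4 + exponent_len],
                          byteorder='big')

print(key_type_len, key_type, exponent_len, exponent)
```

This prints the lengths 7 and 3 together with the values b'ssh-rsa' and 65537, matching the publicExponent shown in the results at the end of this post.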

We can write a bit of Python code to get these fields:

import struct

def _get_chunk(binary_data):
    """ Convenience function for extracting a field. """
    # Get the first four bytes to see how big the field is.
    chunk_len = struct.unpack('>I', binary_data[:4])[0]
    # Get the field data and return the field data and remaining bytes.
    chunk = binary_data[4:4+chunk_len]
    remaining_binary_data = binary_data[4+chunk_len:]
    return chunk, remaining_binary_data

# Discard the key type.
_, binary_data = _get_chunk(binary_data)
# Get the public exponent.
public_exponent_as_bytes, binary_data = _get_chunk(binary_data)
public_exponent = int.from_bytes(public_exponent_as_bytes, byteorder='big')
# Get the modulus.
modulus_as_bytes, binary_data = _get_chunk(binary_data)
modulus = int.from_bytes(modulus_as_bytes, byteorder='big')

id_rsa_pub = { 'publicExponent': public_exponent, 'modulus' : modulus }

This is our public key decoded! Now let’s turn our attention to the private key, id_rsa (see listing above).

Explaining how information is encoded in the private key file, id_rsa, is a little more involved. Firstly, the file has a PEM wrapper, which encloses the main file content between a header (-----BEGIN RSA PRIVATE KEY-----) and a footer (-----END RSA PRIVATE KEY-----). The key’s information is in the sequence of characters between the header and footer. As before, this is a blob of binary data that is text-encoded using the base64 binary-to-text scheme but, this time, the character stream is split over multiple lines. Let’s write a bit of code to get the binary data:

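import base64

# Read the file and split it into lines.
with open('id_rsa', 'r') as f:
    lines = f.read().splitlines()
# Drop the PEM header and footer lines and join the base64 lines in between.
body = "".join(line for line in lines if line and not line.startswith('-----'))
binary_data = base64.b64decode(body)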

The way the key information is encoded within binary_data differs between the private key file and the public key file. For the private key, the binary blob is a value of the ASN.1 type RSAPrivateKey, encoded using BER (in practice DER, a stricter subset of BER). The relevant ASN.1 schema is listed below (taken from RFC 3447):

Foo DEFINITIONS ::= BEGIN

    RSAPrivateKey ::= SEQUENCE {
        version           Version,
        modulus           INTEGER,  -- n
        publicExponent    INTEGER,  -- e
        privateExponent   INTEGER,  -- d
        prime1            INTEGER,  -- p
        prime2            INTEGER,  -- q
        exponent1         INTEGER,  -- d mod (p-1)
        exponent2         INTEGER,  -- d mod (q-1)
        coefficient       INTEGER,  -- (inverse of q) mod p
        otherPrimeInfos   OtherPrimeInfos OPTIONAL
    }

    Version ::= INTEGER { two-prime(0), multi(1) }
        (CONSTRAINED BY {-- version must be multi if otherPrimeInfos present --})

    OtherPrimeInfos ::= SEQUENCE SIZE(1..MAX) OF OtherPrimeInfo

    OtherPrimeInfo ::= SEQUENCE {
        prime             INTEGER,  -- ri
        exponent          INTEGER,  -- di
        coefficient       INTEGER   -- ti
    }

END

For the purpose of this blog post, let’s assume the ASN.1 schema is saved in a file, rfc3447.asn, in the same directory as the Python code. With this schema in place we can decode the binary blob as follows:

import asn1tools
import os

# Get the ASN.1 schema filename.
asn1_filename = os.path.join(os.path.dirname(__file__), 'rfc3447.asn')
# Create an ASN.1 parser and decode the binary_data.
asn1_parser = asn1tools.compile_files(asn1_filename)
id_rsa = asn1_parser.decode('RSAPrivateKey', binary_data)
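
If you are curious what the ASN.1 parser has to deal with, the sketch below decodes the same kind of structure by hand. It is not how asn1tools is implemented. It is a minimal reader that assumes DER input consisting of a single SEQUENCE of INTEGERs and the two-prime form of the key (no otherPrimeInfos). The names read_tlv, decode_integer_sequence and decode_rsa_private_key are hypothetical.

```python
FIELD_NAMES = ['version', 'modulus', 'publicExponent', 'privateExponent',
               'prime1', 'prime2', 'exponent1', 'exponent2', 'coefficient']

def read_tlv(data, offset=0):
    """ Return (tag, value_bytes, next_offset) for the element at offset. """
    tag = data[offset]
    length = data[offset + 1]
    offset += 2
    if length & 0x80:
        # Long form: the low 7 bits say how many length bytes follow.
        num_length_bytes = length & 0x7F
        length = int.from_bytes(data[offset:offset + num_length_bytes],
                                byteorder='big')
        offset += num_length_bytes
    return tag, data[offset:offset + length], offset + length

def decode_integer_sequence(der_bytes):
    """ Decode a DER SEQUENCE that contains only INTEGERs. """
    tag, body, _ = read_tlv(der_bytes)
    if tag != 0x30:
        raise ValueError('expected a SEQUENCE (tag 0x30)')
    values = []
    offset = 0
    while offset < len(body):
        tag, value, offset = read_tlv(body, offset)
        if tag != 0x02:
            raise ValueError('expected an INTEGER (tag 0x02)')
        # INTEGER content is big-endian two's complement.
        values.append(int.from_bytes(value, byteorder='big', signed=True))
    return values

def decode_rsa_private_key(der_bytes):
    """ Map the nine INTEGERs of a two-prime RSAPrivateKey to field names. """
    values = decode_integer_sequence(der_bytes)
    if len(values) != len(FIELD_NAMES):
        raise ValueError('expected exactly %d integers' % len(FIELD_NAMES))
    return dict(zip(FIELD_NAMES, values))

# SEQUENCE { INTEGER 0, INTEGER 3233 } encoded by hand.
print(decode_integer_sequence(bytes.fromhex('300702010002020ca1')))
```

The example prints [0, 3233]. In the private key listing above, the first four base64 characters (MIIE) decode to the bytes 30 82 04, which are the SEQUENCE tag followed by the start of a long-form length.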

We now bring together all the code we have so far into a single class:

import asn1tools
import base64 
import os
import struct

class RsaUtils:

  # Get the ASN.1 schema filename.
  _asn1_filename = os.path.join(os.path.dirname(__file__), 'rfc3447.asn')
  # Create an ASN.1 parser. The class name is not yet bound while the class
  # body executes, so refer to _asn1_filename directly.
  _asn1_parser = asn1tools.compile_files(_asn1_filename)

  @staticmethod
  def _get_chunk(binary_data):
    """ Convenience function for extracting a field. """

    # Get the first four bytes to see how big the field is.
    chunk_len = struct.unpack('>I', binary_data[:4])[0]
    # Get the field data and return the field data and remaining bytes.
    chunk = binary_data[4:4+chunk_len]
    remaining_binary_data = binary_data[4+chunk_len:]
    return chunk, remaining_binary_data

  @staticmethod
  def public_key_from_file(filename):
    """ Extract public key file contents as a dictionary. """

    # Read file and extract base64.
    file_as_str = open(filename, 'r').read()
    file_parts = file_as_str.split(' ')
    binary_data = base64.b64decode(file_parts[1])

    # Discard the key type.
    _, binary_data = RsaUtils._get_chunk(binary_data)
    # Get the public exponent.
    public_exponent_as_bytes, binary_data = RsaUtils._get_chunk(binary_data)
    public_exponent = int.from_bytes(public_exponent_as_bytes, byteorder='big')
    # Get the modulus.
    modulus_as_bytes, binary_data = RsaUtils._get_chunk(binary_data)
    modulus = int.from_bytes(modulus_as_bytes, byteorder='big')

    return { 'publicExponent': public_exponent, 'modulus' : modulus }

  @staticmethod
  def private_key_from_file(filename):
    """ Extract private key file contents as a dictionary. """

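    # Read the file and split it into lines.
    with open(filename, 'r') as f:
      lines = f.read().splitlines()
    # Drop the PEM header and footer lines and join the base64 in between.
    body = "".join(l for l in lines if l and not l.startswith('-----'))
    binary_data = base64.b64decode(body)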

    # Decode the ASN.1.
    return RsaUtils._asn1_parser.decode('RSAPrivateKey', binary_data)

id_rsa_pub = RsaUtils.public_key_from_file('id_rsa.pub')
id_rsa = RsaUtils.private_key_from_file('id_rsa')

Again, after executing the code above, the id_rsa and id_rsa_pub variables are dictionaries containing all the relevant fields.3 To see what these dictionaries look like, please continue reading the next section.

Results

Finally, time to look at the results! As it turns out it is just as well that Python supports arbitrary precision integers by default because the numbers involved are absolutely enormous.

The result of decoding our private key file, id_rsa:

{'coefficient': 116758294178732838885105744341633644921763211062663132745544055895619456777096222431423722773941703797717936012123913292225114521067723059265149148444079134017533481951565287978590993333728671240319821133827525821450859742654286893385432995774549014485190657668683181165825048880021553787513381923543389884681,
 'exponent1': 114862376654771865380937670149105279394388373918006748554361341777593787811818231273028765860493877067230872236989692147991226974674653150054587747216423280227454163409925892001963367677897697389847664370079686145964748466404214852074411124092258910905222622275379230315156285169491375751494741405547740744771,
 'exponent2': 87069072653226489052694356695353451393070653237832443381173179445799625322887624657686998500013382432920943636845172957181336207049334470799179618237636085269399545515915196522377201606466101625784044575109204643339998708538542940892201618192883554104251865573592604111343099238409278656147660761049928505143,
 'modulus': 27028282097021007843530736295809028972287130804943291526394511671226637062421093678344947754892846146059686005906984121188582642418038460041083600507140145943897000826762732168113824111207488605430021018578532026899183882944211773425219978820952105786648393593320324798767475547131763638447797531944528917873480907614201192767300205075705600683404678612905224986079088992401745551569587852046445566476066671216315954347018322178278064801182356599994103339896616982382386482818232381955997033858646828579104353888626693549547004894914282809654656967807891475124321438211744767717141904033565635881130868926714032028313,
 'prime1': 169120792137309513187088856430138902645863513771076998494915397454071219964124181055031031165311658272227307267767371083617556160033536507720058349381562591635026476767457778619502442710933957759620012577158733592776880327171553745375315805972329811597033879177316362633155792077305751221624112410311659706367,
 'prime2': 159816435078406509257707045364609414478299157015763194002855248882211736305449832157704322103217393768557341636407284696675346058014066187166106557637304470530755860929184546547739382766642456301610668794816746806065634935204886954691281821915894398132764445150442192848203128267403133882872125564960906440039,
 'privateExponent': 2884825873455634982765422591653328008013007613723214737127570824728478969919824683461600463028143778196858318374648731673926722061036956650249168951087863662931771682915075629277449374519681749164334605260491501413854635720200212934821761018242519187292758490398945206026801523600205786821831083753482456940700642886567334632806866154561643229048204968416590653083011456592885100845684068581744708201271429057369752682800424896682884664678389361219311676426749175151672276654798395520657684706739514313327224914994205349484686588943683805431564995082748257037012514787348880091998443183294215049247894220140272729053,
 'publicExponent': 65537,
 'version': 0}

The result of decoding our public key file id_rsa.pub:

{'modulus': 27028282097021007843530736295809028972287130804943291526394511671226637062421093678344947754892846146059686005906984121188582642418038460041083600507140145943897000826762732168113824111207488605430021018578532026899183882944211773425219978820952105786648393593320324798767475547131763638447797531944528917873480907614201192767300205075705600683404678612905224986079088992401745551569587852046445566476066671216315954347018322178278064801182356599994103339896616982382386482818232381955997033858646828579104353888626693549547004894914282809654656967807891475124321438211744767717141904033565635881130868926714032028313,
 'publicExponent': 65537}
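
As a sanity check, the decoded fields should satisfy the relationships noted in the comments of the ASN.1 schema, and the two files should agree on the modulus and public exponent. The sketch below tests this. The function name check_rsa_key_pair is hypothetical, and the demonstration uses a toy key (p = 61, q = 53) rather than the key above to keep the numbers readable.

```python
def check_rsa_key_pair(private_key, public_key):
    """ Return a dictionary mapping each consistency check to True/False. """
    n = private_key['modulus']
    e = private_key['publicExponent']
    d = private_key['privateExponent']
    p = private_key['prime1']
    q = private_key['prime2']
    message = 42  # Arbitrary test value; must be smaller than the modulus.
    return {
        'modulus is prime1 * prime2': n == p * q,
        'exponent1 is d mod (p-1)': private_key['exponent1'] == d % (p - 1),
        'exponent2 is d mod (q-1)': private_key['exponent2'] == d % (q - 1),
        'coefficient is inverse of q mod p':
            (private_key['coefficient'] * q) % p == 1,
        'public key matches private key':
            public_key['modulus'] == n and public_key['publicExponent'] == e,
        # Encrypting with e and then decrypting with d returns the message.
        'encrypt/decrypt round trip': pow(pow(message, e, n), d, n) == message,
    }

# Toy key for illustration only; real keys use primes of about 1024 bits.
toy_private = {'version': 0, 'modulus': 3233, 'publicExponent': 17,
               'privateExponent': 2753, 'prime1': 61, 'prime2': 53,
               'exponent1': 53, 'exponent2': 49, 'coefficient': 38}
toy_public = {'modulus': 3233, 'publicExponent': 17}

results = check_rsa_key_pair(toy_private, toy_public)
print(all(results.values()))
```

Calling check_rsa_key_pair(id_rsa, id_rsa_pub) on the dictionaries decoded earlier should likewise report True for every check if both files belong to the same key pair.
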

  1. This is the default on Ubuntu 18.04.

  2. For the purpose of this post we only look at the case where the private key file does not have a passphrase applied. 

  3. Note that the code above assumes your key files are in your current directory. Typically ssh-keygen generates the files in ~/.ssh. You may want to move the key files to your current directory or pass in an absolute path to the files instead.

  4. In my case this is mxk@krypton but generally it will be username@hostname.