July 2024 - Michael's Programming Bytes

Hello everybody,

Michael here, and in today’s post, we’re going to cover file encryption with Python. The previous two posts covered only text encryption, but today we’re going to explore something a little different: encrypting and decrypting files!

This will be the file we’ll work with for this tutorial:

2020 NBA playoffs Download

Yes, this is an old file, and if you want to read the post where I originally used this dataset, here’s the link: R Analysis 10: Linear Regression, K-Means Clustering, & the 2020 NBA Playoffs (written in November 2020).

And now, to start the encryption!

To start with our encryption, let’s import the Fernet class from the cryptography.fernet module like so:

from cryptography.fernet import Fernet

Next, let’s create our Fernet key and a file that will store the key:

key = Fernet.generate_key()

with open('C:/Users/mof39/OneDrive/Documents/filekey.key', 'wb') as fileKey:
    fileKey.write(key)

Using the with statement and the built-in open() function, we store our Fernet key in a file in (in this case) the Documents directory. Here, open() takes two parameters: the file path (where we will store the key) and the mode you wish to open the file in. The mode is a short string, typically made up of one character from each of the following groups:

First character (denotes how to open the file)

r-reads the file, errors out if the file doesn’t exist or the path provided is incorrect
a-appends content to the end of an existing file, or creates the file if it doesn’t exist
w-writes content to the file, overwriting any existing content, or creates the file if it doesn’t exist
x-creates the file in the specified file path, errors out if the file already exists

Second character (denotes how to handle the file)

t-handles the file in text mode (the default if you leave this character out)
b-handles the file in binary mode (needed for raw bytes such as images, Excel files, and our Fernet key)
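To make the mode characters concrete, here is a small sketch that tries each one against a throwaway file. The temporary directory and the demo.txt file name are made up for illustration and are not part of this tutorial's files.

```python
import os
import tempfile

# Work in a throwaway directory (hypothetical path, not the tutorial's Documents folder)
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo.txt")

# 'x' creates the file and fails if it already exists
with open(demo_path, "x") as f:
    f.write("first\n")

try:
    open(demo_path, "x")
    x_failed = False
except FileExistsError:
    x_failed = True

# 'a' adds to the end of the existing content
with open(demo_path, "a") as f:
    f.write("second\n")

with open(demo_path, "r") as f:
    after_append = f.read()

# 'w' discards the existing content before writing
with open(demo_path, "w") as f:
    f.write("third\n")

with open(demo_path, "r") as f:
    after_write = f.read()

# Adding 'b' returns bytes instead of a string
with open(demo_path, "rb") as f:
    as_bytes = f.read()

print(x_failed, repr(after_append), repr(after_write), type(as_bytes).__name__)
```

The difference between a and w is the one most likely to bite: opening an existing file with w wipes it before you write a single character.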

Now, what does the key look like?

In this example, we wrote our encryption key into a file called filekey.key and stored it in the Documents folder.

Something to note: Saving the encryption key with a .key extension is a common convention rather than a requirement. The key itself is plain text, so if you want to view the key file, opening it with a text editor like Notepad will work.
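For a sense of what to expect when you open the key file, here is a sketch that uses only the standard library. A Fernet key is 32 random bytes encoded as URL-safe base64, so the stand-in below has the same format as a real key. The variable name example_key is hypothetical, and the value is built by hand purely for illustration; in your own scripts, keep using Fernet.generate_key().

```python
import base64
import os

# Stand-in with the same format as a Fernet key:
# 32 random bytes, URL-safe base64 encoded (built by hand for illustration only)
example_key = base64.urlsafe_b64encode(os.urandom(32))

print(example_key)       # letters, digits, '-' and '_', ending in '='
print(len(example_key))  # 44
```

The value changes on every run, which is exactly why the key has to be saved to a file: lose the key and the encrypted file can't be recovered.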

The actual file encryption

Now that we have the encryption key file, let’s encrypt the file! Here’s how to do so:

# Load the key we saved earlier
with open('C:/Users/mof39/OneDrive/Documents/filekey.key', 'rb') as fileKey:
    key = fileKey.read()

# Build a Fernet object from the key
fernetKey = Fernet(key)

# Read the original file as raw bytes
with open('C:/Users/mof39/OneDrive/Documents/2020-nba-playoffs.xlsx', 'rb') as testFile:
    originalFile = testFile.read()

# Encrypt the bytes
encryption = fernetKey.encrypt(originalFile)

# Write the encrypted bytes to a new file
with open('C:/Users/mof39/OneDrive/Documents/2020-nba-playoffs-encrypted.xlsx', 'wb') as encryptedFile:
    encryptedFile.write(encryption)

# Read the encrypted file back to confirm it was written
with open('C:/Users/mof39/OneDrive/Documents/2020-nba-playoffs-encrypted.xlsx', 'rb') as encryptedFile:
    encryptedFile.read()

So, what exactly am I doing here? Let me explain:

I first read the file key that we generated in the previous section into Python.
I then created a Fernet object from that key.
I then read the dataset we’re using into Python as raw bytes (note the originalFile variable).
I encrypted the contents of originalFile using the Fernet object we just created (note the encryption variable).
Finally, I wrote the encrypted bytes in the encryption variable to a new file in my Documents folder and read that file back.

Now, what does the encrypted file look like?

In this example, our test Excel file looks like a bunch of gibberish after being encrypted, and that’s the point of the encryption, as it’s supposed to make the file unreadable during transmission from point A to point B.

Excel files such as this one won’t open in Excel after they are encrypted. The encryption process doesn’t damage the data; the contents are simply no longer in the format Excel expects, even though the file name still ends in .xlsx. In this case, if you want to see the encrypted contents of the file, opening it with Notepad (as I did here) should do the trick.
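One way to see why Excel refuses the encrypted file: an .xlsx file is a ZIP archive under the hood, while Fernet output is base64 text. The sketch below uses a tiny in-memory ZIP archive as a stand-in for the workbook (the archive and its sheet.txt entry are made up for illustration) and assumes the cryptography package is installed.

```python
import io
import zipfile

from cryptography.fernet import Fernet

# Build a small ZIP archive in memory as a stand-in for an .xlsx workbook
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("sheet.txt", "points,rebounds,assists")
original_bytes = buffer.getvalue()

# Encrypt the bytes with a freshly generated key
token = Fernet(Fernet.generate_key()).encrypt(original_bytes)

# The original parses as a ZIP archive; the encrypted bytes do not
original_is_zip = zipfile.is_zipfile(io.BytesIO(original_bytes))
encrypted_is_zip = zipfile.is_zipfile(io.BytesIO(token))

print(original_is_zip, encrypted_is_zip)
print(token[:6])  # every Fernet token starts with the same few base64 characters
```

Nothing is lost in the process. The original bytes are all still there, just wrapped in a form that only the key can unwrap.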

It’s decryption time!

Now that we have successfully encrypted our file, assume it has reached its intended recipient. In this case, it’s time to decrypt the file! Here’s how to do so:

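# Decrypt the encrypted bytes with the same Fernet object
decryption = fernetKey.decrypt(encryption)

# Write the decrypted bytes to a new file
with open('C:/Users/mof39/OneDrive/Documents/2020-nba-playoffs-decrypted.xlsx', 'wb') as decryptedFile:
    decryptedFile.write(decryption)

# Read the decrypted file back to confirm it was written
with open('C:/Users/mof39/OneDrive/Documents/2020-nba-playoffs-decrypted.xlsx', 'rb') as decryptedFile:
    decryptedFile.read()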

How did I decrypt the file? Let me explain:

I used the same Fernet object we created earlier for file encryption to decrypt the encrypted bytes.
I then wrote the decrypted bytes to a new file (whose contents are the same as our original file) and read that file back into Python.
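The script above decrypts the encryption variable that is still sitting in memory. A recipient would instead start from the encrypted file on disk, so here is a hedged sketch of that full round trip. The temporary directory, the sample.bin file names, and the sample contents are all made up for illustration, and the cryptography package is assumed to be installed.

```python
import os
import tempfile

from cryptography.fernet import Fernet, InvalidToken

# Hypothetical throwaway files standing in for the Excel workbook
work_dir = tempfile.mkdtemp()
original_path = os.path.join(work_dir, "sample.bin")
encrypted_path = os.path.join(work_dir, "sample-encrypted.bin")
decrypted_path = os.path.join(work_dir, "sample-decrypted.bin")

with open(original_path, "wb") as f:
    f.write(b"2020 NBA playoffs sample data")

fernet = Fernet(Fernet.generate_key())

# Encrypt: read the original bytes, write the encrypted token
with open(original_path, "rb") as f:
    token = fernet.encrypt(f.read())
with open(encrypted_path, "wb") as f:
    f.write(token)

# Decrypt: read the token back from disk, write the recovered bytes
with open(encrypted_path, "rb") as f:
    recovered = fernet.decrypt(f.read())
with open(decrypted_path, "wb") as f:
    f.write(recovered)

with open(original_path, "rb") as a, open(decrypted_path, "rb") as b:
    files_match = a.read() == b.read()

# A different key cannot decrypt the token
try:
    Fernet(Fernet.generate_key()).decrypt(token)
    wrong_key_rejected = False
except InvalidToken:
    wrong_key_rejected = True

print(files_match, wrong_key_rejected)
```

Comparing the bytes of the original and decrypted files is a quick way to confirm the round trip worked without opening either file.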

What does our decrypted file look like? Let’s take a look:

Ta-da! Our decrypted file is the same as our original file, just with -decrypted at the end of the file name.

My advice: Although you don’t absolutely need to use different file names for the encrypted and decrypted versions of the file, I like to do so to be able to tell the difference between the encrypted and decrypted files.

Notice a familiar concept?

If you read my 6th anniversary post, you may recall that I discussed the concepts of symmetric and asymmetric-key encryption.

What does this type of encryption/decryption look like to you? If you guessed symmetric-key encryption, you’d be correct! Fernet key encryption, the method we used to encrypt/decrypt this file, is symmetric-key encryption because it uses the same key to encrypt and decrypt the file. Granted, I also mentioned that symmetric-key encryption is less secure than asymmetric-key encryption, mainly because the one shared key has to reach the recipient without being intercepted. There are likely many ways to encrypt/decrypt the file using asymmetric-key encryption, but I thought Fernet key encryption would be an easy enough method to demonstrate basic file encryption/decryption with Python.

Just one more thing…

Six years into this blogging journey, I still strive to find ways to improve how I get my content to you, the readers. With that said, I will now upload scripts I use in my posts to my GitHub so that you can download and play along with the scripts too!

Here’s the link to the repo with the scripts: mfletcher2021/blogcode: Various scripts from Michael’s Programming Bytes (github.com). The script for this lesson is fileencryption.py.

Thanks for reading,

Michael

Hello readers,

Michael here, and in today’s post, we’re exploring a special topic: the mathematics behind encryption algorithms. More specifically, we’ll explore the mathematics behind the RSA asymmetric-key encryption algorithm that we discussed in an earlier post (I honestly wanted to write this post because I love math and think the mathematics behind encryption algorithms is interesting).

The mathematical public exponent

Before we dive into the mathematics of encryption, let’s review what the public exponent does! As I mentioned in that earlier post, the public exponent is a crucial part of the RSA algorithm’s public key. Together with the modulus, it is used to encrypt data and to verify signatures created with the matching private key.

I also mentioned the use of the number 65537 as a public exponent. Why is that such a common value for the public exponent? Well, 65537 is what’s known as a Fermat number.

The fun Fermat numbers

What exactly is a Fermat number? A simple explanation would be that a Fermat number is an integer that can be derived from the following expression:

F_x = 2^(2^x) + 1

The x in this case represents any non-negative integer (0, 1, 2, and so on). In simple terms, a Fermat number is 2 raised to the power of 2^x, plus 1. The first five Fermat numbers are 3, 5, 17, 257 and 65537, a pretty impressive range if I do say so myself.
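If you’d like to check those values yourself, the definition fits in a couple of lines of Python. The function name fermat_number is just a label picked for this sketch.

```python
def fermat_number(x):
    """Return the Fermat number for a non-negative integer x: 2^(2^x) + 1."""
    return 2 ** (2 ** x) + 1


first_five = [fermat_number(x) for x in range(5)]
print(first_five)  # [3, 5, 17, 257, 65537]
```

Each exponent doubles from one Fermat number to the next, which is why the values grow so quickly.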

Now, a fun historical nerdy fact for this post: Fermat numbers are named after the 17th-century French mathematician Pierre de Fermat, who first studied these numbers. He’s also known for his early contributions to calculus, number theory, and probability, among other fields.

One of his more notable mathematical contributions is Fermat’s last theorem, which can be best described with this equation:

a^n + b^n = c^n

Fermat stated that no three positive integers a, b, and c can satisfy this equation if n is an integer greater than 2 (quite unlike the Pythagorean theorem’s a^2+b^2=c^2, which has plenty of whole-number solutions such as 3, 4, and 5).
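To get a feel for the statement, here is a small brute-force search over a limited range. A finite search like this proves nothing about the theorem itself; it only shows the contrast between n = 2 and n = 3 for small numbers. The function name find_solutions and the limit of 20 are arbitrary choices for this sketch.

```python
def find_solutions(n, limit):
    """Return all (a, b, c) with a <= b, a^n + b^n = c^n, and every value at most limit."""
    # Map each n-th power back to its base for quick lookups
    powers = {c ** n: c for c in range(1, limit + 1)}
    solutions = []
    for a in range(1, limit + 1):
        for b in range(a, limit + 1):
            c = powers.get(a ** n + b ** n)
            if c is not None:
                solutions.append((a, b, c))
    return solutions


squares = find_solutions(2, 20)
cubes = find_solutions(3, 20)
print(squares)  # includes (3, 4, 5)
print(cubes)    # []
```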

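Fermat numbers aren’t required as public exponents for RSA keys, but they are quite practical. Written in binary, a Fermat number has only two 1 bits, which makes the public-key operations (encryption and signature verification) quick to compute. A Fermat number isn’t more secure than other valid exponents simply by being a Fermat number; the benefit is mainly speed.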
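Where does the efficiency come from? A rough way to see it is to count the steps that the textbook square-and-multiply method needs to raise a number to a given exponent. This is a simplified model with a made-up helper name (square_and_multiply_cost); real cryptography libraries may use other exponentiation strategies.

```python
def square_and_multiply_cost(exponent):
    """Count the steps left-to-right square-and-multiply needs for this exponent."""
    bits = bin(exponent)[2:]
    squarings = len(bits) - 1               # one squaring per bit after the leading 1
    multiplications = bits.count("1") - 1   # one extra multiply per additional 1 bit
    return squarings, multiplications


print(bin(65537))                       # a 1, fifteen 0s, then a 1
print(square_and_multiply_cost(65537))  # (16, 1)
print(square_and_multiply_cost(65535))  # (15, 15)
```

The exponent 65535 is slightly smaller than 65537, yet it needs far more multiplications because every one of its bits is a 1.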

Now, how does this relate to RSA?

Good question. The magic number 65,537 (the default public exponent for RSA keys) is a popular choice of public exponent because, for one, it’s a prime number (which makes it easy to meet RSA’s requirement that the public exponent share no common factor with φ(n), the totient of the RSA modulus), and for two, it’s neither too small nor too large.

Smaller public exponents like 3, 5 and 17 can make the data more vulnerable to attacks: if a short message is encrypted without proper padding, or the same message is sent to several recipients, an attacker may be able to recover the message without knowing the private key. On the other hand, the next Fermat number, 4,294,967,297, needs roughly twice as many multiplications per operation while providing no significant security advantage (it also isn’t prime, since it equals 641 × 6,700,417). The public exponent 65,537 strikes a good balance between secure encryption and computational overhead.
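Here is a toy demonstration of the small-exponent problem, using textbook RSA with no padding. The two primes are well-known values picked only to keep the numbers manageable (a real RSA modulus is far larger), and the helper integer_cube_root is written for this sketch. Properly padded RSA, which real libraries use, is not vulnerable to this exact trick.

```python
def integer_cube_root(value):
    """Return the largest integer r with r**3 <= value, using binary search."""
    low, high = 0, 1
    while high ** 3 <= value:
        high *= 2
    while low < high - 1:
        mid = (low + high) // 2
        if mid ** 3 <= value:
            low = mid
        else:
            high = mid
    return low


# Toy modulus (assumption): two well-known primes, far too small for real use
p = 1000000007
q = 998244353
n = p * q
e = 3

message = int.from_bytes(b"hi", "big")   # a very short message
ciphertext = pow(message, e, n)          # textbook RSA, no padding

# message**3 is smaller than n, so the modulus never wrapped around and an
# ordinary cube root undoes the encryption without any private key
recovered = integer_cube_root(ciphertext)
print(recovered.to_bytes(2, "big"))      # b'hi'

# The Fermat number after 65537 is not prime
print(641 * 6700417 == 2 ** 32 + 1)      # True
```

With e = 65537, the same short message raised to the exponent would be vastly larger than n, so the shortcut above no longer applies.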

Another thing worth noting about public exponents is that if you’re trying to figure out a good public exponent to use for your RSA encryption algorithm, prime numbers will do the trick (more so if they are Fermat primes). The actual mathematical requirement is that the public exponent shares no common factor with φ(n); since a prime number is only divisible by 1 and itself, it fails this test only when it happens to divide φ(n). Keep in mind that the public exponent is not a secret, as it is published as part of the public key. The security of RSA comes from how hard it is to factor the modulus and recover the private key, not from hiding the exponent.
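To see how the choice of exponent interacts with the rest of the key, here is a toy RSA key built from two tiny primes. The rule being checked is that the public exponent must share no common factor with φ(n) = (p - 1)(q - 1). The primes 61 and 53 are classic classroom values and far too small for real use; pow(e, -1, phi) requires Python 3.8 or later.

```python
from math import gcd

# Classroom-sized primes (far too small for real use)
p, q = 61, 53
n = p * q                  # 3233
phi = (p - 1) * (q - 1)    # 3120

# A candidate exponent is usable only if it shares no factor with phi
candidates = [3, 5, 17, 257, 65537]
usable = [e for e in candidates if gcd(e, phi) == 1]
print(usable)  # [17, 257, 65537]

e = 17
d = pow(e, -1, phi)        # private exponent: the inverse of e modulo phi

message = 65
ciphertext = pow(message, e, n)
decrypted = pow(ciphertext, d, n)
print(d, decrypted)        # 2753 65
```

Notice that 3 and 5 are both prime, yet neither works for this particular key because each divides 3120. Being prime helps, but the exponent still has to be checked against φ(n).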