How do you trust a file?

I was asked this question by someone wanting to ensure their files have not been hacked. In the press there are articles where bad guys have replaced some code with code that steals credentials, or it allows an outsider access to your machine. One common solution to trusting a file uses cryptography.

There are several solutions that do not work

Checking the date of a file.

This does not work because there are programs that allow you to change the date and time of files.

Checking the number of bytes

You can list a file’s properties. One property is the size of the file. You could keep a list of file, and file size.

There are two problems

  1. You can change the contents of the file without changing the size of the file. I’ve done this. Programs used to have a patch area where skilled people could write some code to fix problems in the program.
  2. Someone changes the size of the file – but also changes your list to reflect the new size, and then changes the date on file and your list so they look as if they have not changed.

Hashing the file contents

Do a calculation on the contents of the file. A trivial function to implement and easy to exploit, is to treat each character as an unsigned integer, and add up all of the characters.

A better hashing function is to do a calculation cs = mod(c **N,M). For example when the current character is 3, n is 7 and m is 13; find the remainder of 3*3*3*3*3*3*3 when divided by 13, the answer is 3. N and M should be very large. Instead of using one character you take 8 or more. You then apply the algorithm on the file.

cs = 0
do 8 bytes of the file at a time
cs = mod((cs + the 8 bytes)** N,M)
end
display cs

Some numbers N and M are better that others. Knowing the value cs, you cannot go back and recreate the file.

If you just store the checksum value in a file, then the bad guys can change this file, and replace the old checksum with the new checksum of the file with their change. It appears that doing a checkum on the file does not help.

Cryptography to the rescue

To make things secure, there are several bits of technology that are required

  • Public and private keys
  • How do I trust what you’ve sent me

Public and private keys

Cryptography has been around for thousands of years. This typically had a key which was use to encrypt data, and the same key could be used to decrypt the data.

The big leap in cryptography was the discovery of asymmetric keys where you need two keys. One can be used for encryption, and you need another for decryption. You keep the one key very secure (and call it the private key) and make the other key publicly available (the public key). Either key can be used to encrypt, and you need the other key to decrypt.

The keys can be used as follows

  • You encrypt some data with my public key. It can only be decrypted by someone with my private key.
  • I can encrypt some data with my private key and sent it to you. Anyone with my public key can decrypt it. In addition, because they had to use my public key, then they know it came from me (or, to be more accurate, someone with my private key).

How do I trust what you’ve sent me

I would be very suspicious if I received an email saying

This is your freindly bank heree. Please send us your bank account details with this public key. Pubic keys are very safe and we are the only peoples who can decrypt what you send me.

Digital certificates and getting a new passport

A public certificate has

  • Your name
  • You address such as Country=GB, Org=Megabank.com,
  • Your public key
  • Expiry date
  • What the certificate can be used for

I hope the following analogy explains the concepts of digital certificates.

Below are the steps required to get a new passport

  • You turn up at the Passport Office with your birth certificate, a photograph of you, a gas bill, and your public certificate.
  • The person in the office checks
    • that the photo is of you.
    • your name is the same as the birth certificate
    • the name on the gas bill matches your birth certificate
    • the address of the gas bill is the same as you provided for your place of residence.
  • The office creates the passport, with information such as where you live (as evidenced by the gas bill)
  • The checksum of your passport is calculated.
  • The checksum is encrypted with the Passport Office’s PRIVATE key.
  • The encrypted checksum and the Passport Office’s PUBLIC key are printed, and stapled to the back of the passport
  • The passport is returned to you. It has been digitally signed by the Passport Office.

How do I check your identity?

At the back of MY passport is the printout of the Passport Offices’ public key. I compare this with the one attached yo your passport – they match!

I take the encrypted checksum from your passport, and decrypt it using the Passport Office’s public key (yours or mine – they are the same). I write this on a piece of paper.

I do the same checksum calculation on your passport. If the value matches what is on the piece of paper, then you can be confident that the passport has not been changed, since it was issued by the Passport Office. Because I trust the Passport Office, I trust they have seen your birth certificate, and checked where you live, and so I trust you are who you say you are.

But..

Your passport was issued by the London Passport Office, and my passport was issued by the Scottish Passport Office, and the two public certificates do not match.

This problem is solved by use of a Certificate Authority(CA)

Consider a UK wide Certificate Authority office. The Scottish Passport Office sent their certificate (containing, name address and public key) to the UKCA. The UKCA did a checksum of it, encrypts the checksum with the UKCA PRIVATE key, attached the encrypted checksum, and the UKCA public certificate to the certificate sent in – the same process as getting a passport.

Now when the Scottish Passport office process my passport, they do the checksum as before, and affix the Scottish Passport Offices’ public certificate as before… but this certificate has a copy of the UKCA’s certificate, and the encrypted checksum stuck to it. The passport now has two bits of paper stapled to it, the Scottish Passport Office’s public certificate, and the UKCA’s public certificate.

When I validate your passport I see that the London Passport office’s certificate does not match the Scottish Passport Offices certificate, but they have both been signed by the UKCA.

  • I compare the UKCA’s public certificates – they match!
  • I decrypt the checksum from the London office using the UKCA’s certificate and write it down
  • I do the same checksum calculation on the London offices’s certificate and compare with what is written down. They match – I am confident that the UKCA has checked the credentials of the London office
  • I can now trust the London certificate, and use it to check your passport as before.

What happens if I do not have the UKCA certificate

Many “root” certificates from corporations, are shipped on Windows, Linux, z/OS, Macs etc. The UKCA goes to one of these corporations, gets their certificate signed, and includes the corporations certificate attached to the UKCA certificate. Of course it costs money to get your certificate signed by one of these companies

You could email the UKCA certificate with the public key to every one you know. This has the risk that the bad guys who are intercepting your email, change the official UKCA certificate with their certificate. Plan b) would be to ship a memory stick with the certificate on it – but the same bad guys could be monitoring your mail, and replace the memory stick with one of theirs.

How does this help me trust a file?

The process is similar to that of getting a passport.

My “package” has two files abx.txt and xyz.txt

At build time

  • Create the files abc.txt and xyz.txt
  • Calculate the checksum of each file, and encrypt the value – this produces a binary file for each abc.txt.signature
  • Create a directory with
    • Your public certificate/public key
    • A directory containing all of the signature files
    • A list of all of the files in the signature directory
    • A checksum of the directory listing. directory.list.signature

You ship this file as part of your product.

When you install the package

  • Validate the certificate in the package against the CA stored in your system.
  • Decrypt the list of files in the directory (directory.list.signature). Check the list of files is valid
  • For each line in the directory list, go through the same validations process with the file and it’s signature.

For the paranoid

Every week calculate the checksum of each file in the package and sent it to a remote site.

At the remote site compare the filename, checksum combination against last week’s values.

If they do not match, the files have been changed.

Of course if your system has been hacked, the bad guys may be intercepting this traffic and changing it.

How do I do it?

I have a certificate mycert.pem, and my private key mycert.private.pem. It was signed by ca256.

Build

Run the command against the first file

openssl dgst -sign mycert.key.pem abc.txt   > abc.txt.signature

Move the abc.txt.signature to the package’s directory,

Create the trust package

/
.. /mycert.pem
.. /directory.list.txt
.. /directory.list.txt.signature
.. /signatures/
.. .. /abc.txt.signature
.. .. /xyz.txt.signature

Validate the package

Validate the certificate in the package.

openssl verify -CAfile ca256.pem mycert.pem 

extract the public key from the certificate.

openssl x509 -pubkey -noout -in mycert.pem > mycert.pubkey

validate the checksum of the abc file using the public key.

openssl dgst -verify ./mycert.pubkey  -signature abc.txt.signature  abc.txt

Does it work with data sets ?

On z/OS I created a signature file with

openssl dgst -sign tempcert.key.pem  "//'COLIN.JCL(ALIAS)'"  > jcl.alias.signature

and validated it with

openssl dgst -verify tempcert.pubkey -signature jcl.alias.signature  "//'COLIN.JCL(ALIAS)'"