secureDATA

Encrypted File Archiving Format

This is a file archive format that automatically encrypts every file added to it using 128-bit AES encryption. The format does not include compression options, nor does it natively support organizing files into directories. Its focus is encryption, with the idea that users will add, remove, and read files individually rather than all at once.

The EFA format was developed expressly for use in secureDATA; however, others may find it useful. You may download EFA here.

How the EFA format works

The EFA format has three constituent parts. The first is the Control Length Mark (CLM). This mark is 12 bytes long and appears at the tail end of the archive. It tells the program how long the Control Index is. The second is the Control Index, which keeps track of where each file starts as well as the length of that file. It is located immediately before the CLM in the archive. Neither the CLM nor the Control Index is encrypted. However, the user can obfuscate the location of files by using a secure hash of their names instead of their actual names. Finally, stored in front of the Control Index are the actual files themselves.
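
The tail-first layout described above can be sketched as follows. This is a minimal Python 3 illustration and not the EFA implementation: the text does not say how the 12-byte CLM encodes the index length, so the sketch assumes a zero-padded ASCII decimal number, and the helper names build_archive and split_archive are hypothetical.

```python
CLM_SIZE = 12  # the CLM is 12 bytes long and sits at the very end of the archive


def build_archive(file_data, index):
    """Lay out an archive as: file data, Control Index, CLM.

    ASSUMPTION: the CLM stores the index length as zero-padded ASCII digits.
    """
    clm = str(len(index)).zfill(CLM_SIZE).encode("ascii")
    return file_data + index + clm


def split_archive(archive):
    """Return (file_data, index) by reading the archive from its tail."""
    clm = archive[-CLM_SIZE:]
    index_length = int(clm.decode("ascii"))
    index_start = len(archive) - CLM_SIZE - index_length
    return archive[:index_start], archive[index_start:-CLM_SIZE]


# Placeholder byte strings stand in for real encrypted files and index entries.
archive = build_archive(b"<encrypted file bytes>", b"<index entries>")
file_data, index = split_archive(archive)
```

Reading the fixed-size mark first is what lets a program find the variable-length index without scanning the encrypted file data.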

When a file is added to the EFA, the entire file is not read in at once. Instead, the EFA reads the file in 1,048,576-byte blocks*. Each block of the file is encrypted using a rolling key. The user has to supply the rolling key generator. The generator should be a Python iterable object; failing that, it should at least have a method named "next" that returns the next key. Each key needs to be 128 bits long. Here is an example of such a generator:

>>> from Crypto.Hash import SHA256
>>> def SHA256gen(s, prounds=0):
    sha = SHA256.new()
    sha.update(s)
    # mix the digest back into the hash prounds+1 times before the first key
    for x in range(prounds+1): sha.update(sha.digest())
    while True:
        yield sha.hexdigest()
        # you will want to use yield sha.digest() instead;
        # I am using hexdigest so it is more readable.
        sha.update(sha.digest())

>>> key_gen = SHA256gen('my favorite password', prounds=32)
>>> key_gen.next()
'cbec70dae0e8063deb01e39b107148455d8935b16fe531be98c19dcf8c7e890e'
>>> key_gen.next()
'630c05fafd662226d2cfd5df7b1f5e7fc57c876ad6c84ebb62d7e4a8d8828061'
>>> #now the next five in list form
>>> import itertools
>>> list(itertools.islice(key_gen, 5))
[
'8a98ca9276d01adb629fcf5db27d62663ad431e234d9ee9abaecc567e24ae92b',
'28f4657e252a560e23968e8c32d5b346d0285f336877ec6d795a12accd037e8e',
'a8710a9d496a2f3589113135ac861c99675c29e2c3aecdc45330967f2d30d1b6',
'dac8f6ee835a33c55508f9ffab04713e0c28ec4030410d3d0f32ace28a7c9d4a',
'b28ea20f8635682fbe3bd44561f13ed414968477862a52c23e6aaa0b6b001a40'
]
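
To show how such a generator lines up with the block-wise reading described above, here is a minimal Python 3 sketch. It only pairs each 1,048,576-byte block with the next key; the AES step itself is left out. The names rolling_keys and blocks_with_keys are hypothetical, hashlib stands in for Crypto.Hash, and truncating each SHA-256 digest to 16 bytes to obtain a 128-bit key is an assumption rather than something the EFA source confirms. The derivation also differs from the generator above, so it does not reproduce the keys printed there.

```python
import hashlib
import io

BLOCK_SIZE = 1048576  # block size quoted in the text; subject to change
KEY_SIZE = 16         # 128 bits


def rolling_keys(password, prounds=0):
    """Yield an endless stream of 16-byte keys derived from a password.

    ASSUMPTION: each SHA-256 digest is truncated to its first 16 bytes.
    """
    state = hashlib.sha256(password).digest()
    for _ in range(prounds + 1):
        state = hashlib.sha256(state).digest()
    while True:
        yield state[:KEY_SIZE]
        state = hashlib.sha256(state).digest()


def blocks_with_keys(file_obj, key_gen, block_size=BLOCK_SIZE):
    """Read file_obj one block at a time, pairing each block with a fresh key."""
    while True:
        block = file_obj.read(block_size)
        if not block:
            break
        yield block, next(key_gen)


# An in-memory file slightly longer than two blocks, so the last block is short.
source = io.BytesIO(b"x" * (2 * BLOCK_SIZE + 100))
pairs = list(blocks_with_keys(source, rolling_keys(b"my favorite password", prounds=32)))
```

Because the keys come from a generator, the reader must be given a generator started from the same password and round count so that each block meets the same key it was written with.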

There are three public methods available to the user: readFile(outputFileObject, filename, keyGen, added), deleteFile(filename), and addFile(inputFileObject, filename, fileLength, keyGen). The addFile method returns the number of bytes added to pad the file to the appropriate size. Padding is necessary when using AES encryption because AES is a block cipher, which requires the length of the data to be a multiple of 16 bytes. Perhaps in the next version the padding will be handled seamlessly.
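
The padding arithmetic behind that return value can be sketched as below. This is a minimal Python 3 illustration under stated assumptions: the pad bytes are taken to be zero bytes (the text does not say what EFA uses), the helpers pad_to_block and strip_padding are hypothetical, and the idea that the pad count is what gets passed back when reading is a guess based on the "added" parameter of readFile.

```python
AES_BLOCK = 16  # AES works on 16-byte blocks


def pad_to_block(data, block=AES_BLOCK):
    """Return (padded_data, added), where added is the number of pad bytes.

    ASSUMPTION: padding is zero bytes; the source does not specify this.
    """
    added = -len(data) % block  # 0 when the length is already a multiple
    return data + b"\x00" * added, added


def strip_padding(padded, added):
    """Drop the pad bytes again, given the count reported at add time."""
    return padded[:len(padded) - added]


padded, added = pad_to_block(b"twenty-one byte input")
restored = strip_padding(padded, added)
```

Since zero bytes could also be legitimate file content, the pad count has to be remembered by the caller; it cannot be recovered from the padded data alone.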

Download

The current version available is EncryptedFA version 0.5 alpha. This is a very early release, so the code is still undocumented and a little messy. However, if you want to take a look at it, you may download the source code.

*Note: the block size is not set in stone and is subject to change; however, this will be the block size for version 1.6 (final) of secureDATA.