IC211 (Spring 2020)

"Full" shift+ hashing

You may have noticed that shift+ hashing with the vigenere cipher, which should be our best hashing option, actually ignores everything in the message/password that comes after the first 16 characters. This is highly undesirable, although many systems in the past have actually done exactly this (with different hashing algorithms). For our password application, it means that there's no increased security in passwords of more than 16 characters. More generally, it means that we cannot meaningfully hash messages of more than 16 character. For this extra credit we will correct that problem in much the same way that many "real" hashing algorithms do. Namely, we will take the first 16 characters of the message/password and hash as we've done previously. The result of this, however, becomes the "initialization vector" iv that is used to hash the second 16 characters. This result becomes, in turn, the iv value for hashing the 16 characters after that. And so we continue until we run out of characters int he message. The last computed result is "full shift+ hash" of the input message/password.


initialization vector           message/password
GO_NAVY_2018^mid        TheRainInSpainFallsMainlyOnThePlain
    |                   \______________/\______________/\_/
    |                       block 0        block 1      block 2
    |                           |              |         |
    |                           |              |         |
    |                     .----msg-------.     |         |
    `-------------------iv|shift+vigenere|     |         |
                          |  of block 0  |     |         |
                          `--------------'     |         |
                                |              |         |
                     .----------'              |         |
                     |          .--------------'         |
        TgrTfU<O;oaNSQy.        |                        |
                     |    .----msg-------.               |
                     `--iv|shift+vigenere|               |
                          |  of block 1  |               |
                          `--------------'               |
                                |                        |
                     .----------'                        |
                     |          .------------------------'
        biq>n8JC-J8[;A0+        |
                     |    .----msg-------
                     `--iv|shift+vigenere|
                          |  of block 2  |
                          `--------------'
                                |
                                |
                                V
                         v-K2JJX?B@,MIJ0b
                               hash

What you need to do

To get this extra credit, you will replace your implementation of shift+ hashing with "full" shift+ hashing, as described above, so that for longer messages the entire message gets hashed, rather than just the first 16 characters. Since all examples seen so far involve passwords of 16 characters or fewer, there is no output difference for any of them -- i.e. the replacement will be invisible as far as all earlier tests are concerned. However, if properly implemented your Vault program will perform correctly on, for example, the following

vaultEC2

~/$ java Vault vaultEC2
username: flipper
password: ThisIsExactly_16
Access granted!
$ quit

~/$ java Vault vaultEC2
username: flipper
password: ThisIsExactly_16_plus_some
Access denied!

user flipper shift+vigenere 4rgZDs.8u1bjr5A/
user shamu shift+vigenere K>0F_<6vvHfYP:c^

~/$ java Vault vaultEC2
username: shamu
password: ThisIsExactly_16
Access denied!

~/$ java Vault vaultEC2
username: shamu
password: ThisIsExactly_16_plus_some
Access granted!
$ quit

An in-depth example of full shift+ hashing

To help you with debugging, here's an example of "full" shift+ hashing that shows all the intermediate values computed while hashing a longer message.

message/password = i_stay_out_too_late/got_nothing_in_my_brain/thats_what_people_say_mmm-mmm
                   \______________/\______________/\______________/\______________/\_______/
                       block 0         block 1         block 2         block 3      block 4

block 0: iv = GO_NAVY_2018^mid msg-block = i_stay_out_too_l output = pOoM*KD<8R4zgntE

block 1: iv = pOoM*KD<8R4zgntE msg-block = ate/got_nothing_ output = O\n\ek;^Lz.`Sg-.

block 2: iv = O\n\ek;^Lz.`Sg-. msg-block = in_my_brain/that output = N@w\@y/y-;XfaB39

block 3: iv = N@w\@y/y-;XfaB39 msg-block = s_what_people_sa output = S*1qUS;Zl<3<aGg[

block 4: iv = S*1qUS;Zl<3<aGg[ msg-block = y_mmm-mmm        output = 3BA7]rwjY\*8?cdZ

hash = 3BA7]rwjY\*8?cdZ

A little aside, not relevant to the project, but hopefully interesting

The way full shift+ hashing with the vigenere cipher works is not all that dissimilar from how many modern digital hashing algorithms work — we start with some fixed iv, we operate on a block of the message/password at a time, and the output of hashing one block becomes the initialization vector for hashing the next block.

One weakness of full shift+ hashing as we've presented it, is that it is vulnerable to an "extension attack". The idea is this: suppose Alice has a message M for which she computes hash H. She keeps message M secret, and sends H to Bob. Suppose an attacker, Eve, intercepts H. If the length of M is a multiple of 16, Eve can use H as the iv in computing the hash of some message E, and she will have computed the correct hash for ME, i.e. message M concatenated with message E, even though she doesn't know the message M. In many contexts, this weakness is not important, but in some contexts it is. So it would be preferable to have a hash algorithm without it.

I invite you to think about how you might modify full shift+ hashing (assuming the vigenere cipher) to repair this vulnerability.