Final Report for the course 'Block Cipher'

Weiwen Chen

10152510217

School of Computer Science and Software Engineering

East China Normal University

## Symmetric Cryptographic Algorithm

In network communications security, there are two cryptographic algorithms. One is Symmetric Cryptographic Algorithm(SCA), and anoter is asymmetric cryptographic algorithm. The encryption key in SCA can be computed by its decryption key. The other way around. In most SCA, the encryption key and decryption key is the same. it needs sender and receiver agreed a key before their secure communication. The security of SCA relied on its key, the leak of key means any one can encrypt and decrypt the message. The key must keep secret if the communication need to keep secret.

There are two SCA. One is Stream Algorithm(cipher) which computes one single bit(sometimes byte) at one time, while another is Block Cipher which computes a group of bits of the message. We call these groups 'block', corresponding to Block Cipher/Block Algorithm. A classic grouping length of Modern cryptographic algorithm is 64 bit, considering the difficulty of cryptanalysis and ease of use. Afterwards, as the development of cryananlysis ability, the block length grows to 128 bit or more. There 5 general parts in SCA(as figure1).

- Plain Text(明文): The original message.
- Encrypt(加密算法): Use key as parameter, do some replace transition roles and steps. The result of this step is cipher text.
- Secret Key(密钥): The parameter of Decryption and Decryption Algorithm. This part directly affects the result of transition for plaintext.
- Ciphertext(密文): The result of conversion for plaintext.
- Decryption(解密算法): The inverse trasformation of Encryption, whose input is ciphertext and parameter is secret key. The result of this step is plaintext.

## Common Math Operation in SCA

There are some mathenmatic operations in Symmetric Crytographic Algorithm. The only purpose of these operations is to confuse the plaintext as chaos as possible, thus increase the difficulty of cryptanalysis.

### Shift & Cyclic Shift

`Shift`

is move left or right a length of digits in a integer number. When we do `Cyclic Right Shift`

operation, we need to move the last digits to the front of this number, `Cyclic Left Shift`

is opposite. For example, when we cyclic right shift the decimalism integer `12345678`

for one digit, the result is `81234567`

, and the result of `cyclic left shift`

one digit is `23456781`

.

### Replace

Replace some digits with the another digit, as the role of `Replace Table`

. It does not look as neat as `Shift`

operation, but looks chaos. This is what encryption needs and frequently used.

### Extend

Extend a seg of digits to a longer digits. There are a number of Extend methods such as replacement: Extend each digit by extending the `Replace Table`

.

### Compress

Compress a seg of digits to a shorter digits. There are a number of Compressing methods such as Replacement: Use Table to role each value of replacement after compressing.

### Xor

This is a bool operation. Role is following:

`1 xor 1 = 0`

`0 xor 0 = 0`

`1 xor 0 = 1`

`0 xor 1 = 1`

This can be easily considered as if the two digits in operation are the same, it returns `0`

, else return `1`

.

### Iteration

Iteration is repeats the same operation for a number of times. This is largely used in cryptography algorithms because this make it harder to cryptanalysis the ciphertext.

## DES Algorithm

Now we introduce a popular SCA: Data Encryption Standard(DES).

DES was a SCA developed by IBM. American National Standards Institute (ANSI) published it for a data encryption standard for Non-essential departments in 1977. For the past 30 years, it played a significant role in international secret communication.

DES is a SCA, classic DES encrypt data for 64 bits a block. Encryption and Decryption is one the same algorithm. The length of secret key is 56 bit(each 8th bit is used for odd-even check), the secret key can be any 56-bit-long number, and can be modified at any time. There are few keys which are considered as `weak key`

. They are considered can be easily cryptanalysis, but we can easily avoid them. As a result, the security is based on the secret key.

### DES Framework

Firstly, we need to generate a secret key to encrypt the data. We get a 64-bit-long password from the user. Then we generate a group of 16 keys by dividing, shifting, selectingm and iteration, which is used in each rounds operation.

DES do operation on plaintext group `M`

whose length is 64 bit. With a initial transformation, `M`

is transformed to `m0`

. Then we divide `m0`

to the left half part and right half part `m0 = (L0, R0)`

, with each length is 32 bit. Then we do 16 rounds totally the same operation(Iteration). We call the operation function `F, and in each round, the data are combined with the secret key in this round.

In each round, shift the secret key, and then select 48 bit in 56 bit of the secret key. Expand the right part of data to 48 bit length and generate a new 48 bit data by a `xor`

operation. After that, compress it into 32-bit. Thess 4 steps form the function `F`

. Next, we use another `xor`

operation to combine the output of f-function and the left half part. The result of it becomes new right half part. The original right half part becomes new left half part. We repeat this operationg for 16 times.

After 16-round iteration, the left and right part combine together by a end-transform(data processing). In this way we finish the encrytion process. The steps is as fig2 following.

### DES Decryption

As we know the replace, transform, xor and iteration operations in Encryption Process. Maybe you think the Decryption Algorithm is the inverse operation of Encryption, which is totally different with Encryption. On the contrary, we have some cryptology methods so that DES have a novel performance: The algorithm for Encryption and Decryption is the same!

The only difference between Encrytion and Decrytion is the order of their keys. If the keys for Encrytion is `k1, k2, k3, ..., k16`

, then the keys for decryption is `k16, k15, k14, ..., k1`

. Maybe this is why we call DES a Symmetric Cryptographic Algorithm.

### Security and Development of DES

The security of DES first depends on the length of its secret key. The longer the key is, harder coderbreakers do the cipheranalysis will be. For now, on the basis of processing and computing speed of mordern computer, 56-bit-long keys can already can be broken. 128-bit-long keys, however, are considered scure. But as time goes on, this number will be broken in some time.

In addition, some modification and improvement is two other ways to increase the scurity of DES.

For example, `3-DES`

which was developed later, uses 3 independent secret keys to encrypt data for 3 times, and this largely improves the security of pure `DES`

. if 56-bit DES need \(2^{56}\) operations, then `3-DES`

need \(2^{112}\) operations.

For another example, as independent child secret keys are different in different rounds, this means the length of the key turns from 56-bit to 768-bit. There are a number of other forms of `DES`

such as `DESX`

, `CRYPT`

, `GDES`

, `RDES`

, etc. The transform and improvement of these all are increase the difficulty of cipheranalysis and increase the effciency of computing.

### F-Function and S-box

- Each row of a S-box is a transfomation (integer 0~15)
- The output of each S-box is not a linear function or affine function of input
- Change one bit input of S-box, the output of it must change two bit at least
- For any S-box, input any x, \(S(x)\) and \(S(x \oplus 001100)\) must have 2 bits which is different.
- For any S-box, any input \(x\), and any \(e\), (\(f \in \{0, 1\}\)), \(S(x) \ne S(x \oplus 11ef00)\)
- For any S-box, When its any one input digit keeps, other 5 input digits changes, the total number of
`0`

and`1`

almost equal.

What S-box is

- Each S-box is a 4-row, 16-col table. Each item in the box is a 4-bit integer. The 6-bit input of S-box determine which row and which colomn it belongs to.
- Suppose the 6-bit input of S-box is \(b_1, b_2, b_3, b_4, b_5, b_6\), then \(b_1\) and \(b_6\) correspond \(0\) to \(3\), we can select one row in the table. \(b_2\) and \(b_5\) correspond \(0\) to \(15\), from this we can select one colomn in the table.

Why S-box

- The other parts of
`DES`

are all linear, but`S-box`

is not.`S-box`

is hard analysised, and provides better security, so S-box is where the shoe pinches. - Provides confusing role which a cryptology algorithm need.
- Modify one bit of
`S-box`

will affect at least two bits of output.

## Differential Cryptanalysis

- Difference Uniformity: One of require of designing DES is to ensure the possibility distributing uniformly. As a matter of fact, for a settled secret key, the distribution of its difference is not pseudorandom(for all
`S-box`

). - Basic idea: By analysising the effort of differecn of plaintext to the difference of ciphertext, we can recover some bits of the secret key.

**Differential Cryptanalysis(DC)** is a select-plaintext attack, the basic idea of it is: Try to get the secret key as long as possible by analysising the effect of some special difference of plaintext to the difference of ciphertext. It can be used to attack some block cipher(including `DES`

) and any cipher which iterate a settled round function. It was proposed by **Biham** and **Shamir** in 1991.

DC refers to ciphertext with some fetures and comparing it with plaintext. Analyzer try to find some ciphertext pair which have some difference, in which some of them have a high probability to reappear. DC uses these features to calculate the probability of possible key which may be the most-possible key at last. It is said that this kind of attack largely relied on the structure of `S-box`

, but some optimization of `DES`

can resist this kind of DC.

In addition, the number of rounds in block cipher has large influence on DC. If we just repeat 8 rounds, it can be broken in few minutes in a personal computer. In 16 rounds, however, Differential Crytanalysis is just a litte more effective than exhaustive search. What's more, if we increse to 17 or 18 rounds, the two methods almost spend the same time. Moreover, if the rounds turn to 19, it seems that exhaustive search becomes easier.

Althrouth Differential Cryptanalysis can break the cipher, but due to it needs a great many time and data, it is not largely used.

**Difference**: \(X^*\), \(X'\): At any intermediate point during the encrytion of pairs of messages, \(X\) and \(X^*\) are the corresponding intermediate values of the two executions of the algorithm, and \(X'\) is defined to be \(X' = X \oplus X^*\)

As the definition in slices of this course, we calculate the difference by `xor`

operation, in `unsigned int`

, `xor`

has the same impact as `minus`

, which is what difference means.

**r-round difference**:

\[ \triangle{X}\stackrel{r-rounds}{\longrightarrow}\triangle{Y}\] \[ DES: (\alpha, 0)\stackrel{2-rounds}{\longrightarrow}\triangle{(\alpha, *)}\]

By analysising the effect of some special plaintext difference to plaintext difference, we may get the most possible key, as we metioned before.

### General steps in 1-R Attack

- Find a \(r-1\) round high-probability difference \((X', Y')\), suppose the probability is \(P\);
- Ensure some bits in the r-th round, donote as \(l-bit\). For each candidate key, set corresponding counter.
- Select plaintext pairs which satisfy the difference, and get the corresponding ciphertext pair. Donote the number of plaintext pairs is \(m \approx c \cdot \frac 1 p\), \(c\) is constant.
- Filtration the plaintext pairs, and try to decrypt using each guessed round key. Compute the difference after encoding. If it catches, the
`counter++`

. - After processing all processing all plaintext pairs, the ans is key with max counter
- Sampling, Denosing, Fetching information.

### An example of S-box

### An Example of construct a Difference Distribution Table

Consider the input xor is `34`

, the possible output xor is:

output: | 1 | 2 | 4 | 7 | 8 | D | F |
---|---|---|---|---|---|---|---|

counter: | 8 | 16 | 6 | 2 | 12 | 6 | 8 |

Among them, \(34 \oplus 4\) has two possibilities, this kind of input is in-pair(\((\alpha, \beta)\) and \((\beta, \alpha)\)). Construct Difference Table of `S1`

Distribution, when the input is `13`

and `27`

, the difference value is computed following:

List the possible input which xor value is `34`

in `S1`

:

### Key Determine Theory

Suppose `01`

and `35`

are two inputs of the known `S1`

, their xor result is `34`

. After `S1`

, xor value is `D`

. Seeing the Distribution Table of `S1`

Difference, the input xor is `34`

when output xor is `D`

.

As a matter of fact, input xor is `34`

, is none of key's business, because:

Then we have:

So the possible keys are: `{07, 11, 17, 1D, 23, 25, 29, 33}`

Therefore, suppose the two input of `S1`

is `21`

and `15`

, their xor result is `34`

, output xor is `3`

. By scanning the table, when input xor is `34`

, output xor is `3`

:

Then we have:

The possible keys are: `{00, 14, 17, 20, 23, 34}`

, `{07, 11, 17, 1D, 23, 25, 29, 33}`

### Difference Cipheranalysis of 2-R DES

Question: How to find a 'good' difference feature ?

In frist round, the difference of input of F-funtion `a' = 60 00 00 00`

. After E(Extend) in F, we put `a'`

into every S-box middle 4 bits. Their order is: `S1: 6 = 0110, S2: 0 = 0000, S3, ..., S8`

. Because all edge bit is `0`

, so `S1`

si the only S-box whose difference is non-zero we get. The input of `S1`

difference is `0 0110 0 = 0C`

, while all other S-boxes' input are all `0`

.

查`S1`

的差分分布表，发现当输入异或`x' = 0C`

时，最可能的差分输出`y' = E = 1110`

，出现的概率是`14/64`

；所有其他S盒的输入一定是`x' = 0`

且 `y' = 0`

。 S盒的输出通过置换`Ｐ`

成为F函数的输出。如前所述， F函数的差分输出是`A' = P(E0 00 00 00) = 00 80 82 00`

. 而 A'与L'异或后得到 `00 00 00 00`

，因为 `A' = 00 80 82 00`

, `L' = 00 80 82 00`

.所以, 在第2轮后所有S盒都得到差分输入0，产生的差分输出也是0； F函数的输出在2轮后是0，差分输出则是 `(00 00 00 00 , 60 00 00 00)`

Attack Process of 2-R DES Algorithm:

- 产生明文对(P, P∗)，使得$P' = P P^* = $
`00 80 82 00 60 00 00 00`

. - 对于产生的明文对(P, P∗)，计算密文对(T, T∗)(选择明文攻击）
- 计算T' = T⊕T∗，检查结果是否等于
`00 00 00 00 60 00 00 00`

,如果不相等，就说明特征不符，这个明文对就不能用。重返第1步，产生新的明文对；如果相等，则特征相符，进入第4步 - 因为S2, . . . , S8的差分输入都为0，所以没有信息可以从S2K, . . . , S8K得到。在S1的差分分布表中，使用前边讲到的方法得到S1K的可能的密钥集合
- 计算这些集合的交集，因为正确的密钥必定同时出现在每张表中如果有不止一个S1K值，就说明还需要更多的明文和密文差分对才能唯一确定密钥S1K，转到第1步，计算更多的数据(需要的明文密文差分对的数量，大致等于使用的特征概率的倒数，本例中需要64/14 ≈ 5)对如果只得到一个S1K ，就是正确的，转到第6步
- 恢复S1K的6个比特。同样，恢复S2K~ S8K的密钥
- 其余的8比特密钥，采用穷举法搜寻