*Created* on 07 Nov 2020
;
*Modified* on 12 Nov 2020
;
**Translation**:
italian

I think every programmer has a dream of implementing from scratch the basics of matrix calculus :-) And we are no exception. The study of encryption techniques, in particular Hill encryption, gave us the opportunity to pursue this childhood dream ...

Those involved in cryptography know that one possibility to encrypt a text, consists of:

- convert it to numbers [1],
- segment it into pieces of equal length,
- and multiply it, piece by piece, by a square matrix [2]
- of the same size as the segments.

To decipher it, just reverse the process, with the foresight to use
the inverse (always modulo `m`) of the predicted matrix.

*Warning*. Do not use this algorithm for real cases, where
assailants can try to analyze the process. In fact there is a weakness:
if the analyst gets a fragment of ciphertext and its unencrypted counterpart, going back to encryption key to decrypt any text is trivial:
a matrix division is enough.

Given the premise, it is natural to observe that the focal point of the Hill's cipher is given by matrix operations, in particular the product between two matrices [3] and the ability to invert a matrix.

Well. Matrix calculus is an extremely popular tool in the technical and scientific fields. There are plethora of libraries that give us the tools to implement it. For example in Python you can use Numpy or Sympy.

But the indicated libraries lack a fundamental tool in this context.
The inversion **modulo m** of the matrix that interests us. And this gives us
the starting point to invent hot water, developing our tools from scratch
for the matrix calculation.

Using Python, and by keeping us to an educational level, we do not face a difficult effort.

For those in a hurry, here is <https://github.com/l-dfa/py_naive_cryptology/blob/main/source/nmatrix.py> _ a minimum of code, from which to start experimenting and, if you want, improve.

In practice we have the `NMatrix` class (the *N* serves to remind us that we are moving
on a Naive context) which instantiates a matrix object starting from a list of rows,
each of which in turn represented by a list. For example, let's create two arrays
3 x 3 and multiply them:

>>> import nmatrix as nm >>> A = nm.NMatrix([[1,2,3], [10,11,12], [21,22,23]]) >>> print(A) [[1, 2, 3], [10, 11, 12], [21, 22, 23]] >>> B = nm.NMatrix([[10,20,30], [10,11,12], [21,22,23]]) >>> print(B) [[10, 20, 30], [10, 11, 12], [21, 22, 23]] >>> C = A * B >>> print(C) [[93, 108, 123], [462, 585, 708], [913, 1168, 1423]]

Most of the code is simple. The internal representation of the matrix is the same list of lists we used to create it:

def __init__(self, lol): # ... <cut>... self.__m__ = lol self.shape = (lr, lc,)

And all the operations will develop on this list of lists. In particular those for index access:

def __getitem__(self, key): '''access by indices params key int or tuple return requested element remark use: - M[i] to get a row, i starts from 0 - M[i, j] to pick element at row i, col j - you can use M[i][j] too, but this isn't homogeneous with writing by indices ''' if isinstance(key, tuple): x, y = key return self.__m__[x][y] else: return self.__m__.__getitem__(key) def __setitem__(self, key, value): '''write by indices params - key int or tuple - value list or number return None remark use: - M[i] = [row] to write a row, i starts from 0 - M[i, j] = number to write a number at row i, col j ''' # ...<cut>... if isinstance(key, tuple): x, y = key self.__m__[x][y] = value else: self.__m__[key] = value

Method `__getitem__` is used by Python when we attempt to access an object
by index. For example using a list:

>>> l = [1, 2, 3] >>> l[0] 1

Expression `l[0]` requests Python to access element with index `0`
of list `l`. Python executes, calling in turn `l.__getitem__(0)`.

Similarly, `__setitem__` is the method called to *write* the content
of the indicated index element. For example:

>>> l = [1, 2, 3] >>> l[0] 1 >>> l[0] = 10 >>> l[0] 10

In this case when we ask `l[0] = 10`, we are requesting to do so
that the element of index `0` of `l` is `10`. Python executes `l.__setitem__(0, 10)`.
Here there is a complication due to the fact that we access list of
lists, so we may have to use two indices for access: row and column.
For this reason we have to check if our index is a composite element
(a tuple). If so, we need to unpack it and use the expression
`self.__m__[x][y] = value`. In case of `__getitem__` this would not be necessary,
but we would have two different notations for reading versus writing. We would have:

>>> A = nm.NMatrix([[1,2,3],[10,11,12]]) >>> A[1][0] 10 >>> A[1, 0] = 100 >>> A[1][0] 100

To have a homogeneous notation, we implement the tuple checking as well
in `__getitem__`.

Good. Operations can be implemented using method `__add__`
for addition, `__sub__` for subtraction, and so on. In this way
we have the ability to use Python operators directly
on the instances of our class: very elegant.

In matrix calculus, things get complex when we need to calculate
the inverse of a matrix or its determinant. In these cases there are
disturbing elements, which can complicate our life. In particular if
we have to use arithmetic *modulo m*. Indeed, while for real numbers
other than 0, the inverse is an operation that always makes sense,
in the case of modular arithmetic this is not said.

For example, the inverse modulo 26 of the number 12 does not exist. Indeed it must
there exists a number *n* such that 12 * n = 1 (mod 26) or (12 * n) % 26
must give remainder 1. If you try integers from 1 to 25, no remainder of
this expression gives 1.

For this reason some calculation algorithms *jam*, even if
the reverse exists. Which is why in `nmatrix.py` there are two different ways of calculating the determinant and,
above all, of the inverse matrix modulus `m`: `inv_mod0` and `inv_mod`. The first
it is more elegant, but fails when it finds elements on the diagonal for which it cannot
calculate the inverse modulus `m`. We tried to move these elements
with elementary line operations, but it still doesn't work when
you come across an entire column like this (and it can happen).

The second algorithm (`inv_mod`) is more robust, albeit personally
we consider it less elegant.

Enjoy. ldfa

[1] | All subsequent steps are to be carried out using arithmetic
module m. Where m is dimension of the alphabet of the
text to encrypt. |

[2] | This matrix is the encryption key. |

[3] | In this case between a matrix and a vector. But a vector is can be considered as a particular case of matrix, formed by a single column (or row: it depends on how you see it). |