加密学之贰 - Group & Number Theory

Posted on 2023-06-02 繁/简： set

COMP3357 digest 的 P2 部分，主要介绍群论 (group theory)，数论 (number theory) 的相关知识以及它们在加密学中的应用。

该部分的内容主要是为了 P3 部分做铺垫：P3 中的困难问题 (hard problems) 与单向函数 (one-way function) 的构建与这一节里提到的数论/群论知识存在紧密的联系。

同样，与 CnBlogs 中的笔记不同，这里不会详细的记载每个相关的知识点；只会尝试简略地总结一些能让未来的自己也会感到有趣，巧妙的内容 (Eureka!)。

This article is a self-administered course note.

It will NOT cover any exam or assignment related content.

Multiplicative Inverse

Definition. If for a given integer \(b\), there exists an integer \(c\) such that \(bc=1\mod{N}\), we say that \(b\) is invertible modulo \(N\), and call \(c\) a multiplicative inverse of \(b\) modulo \(N\).

The multiplicative inverse is NOT unique. For an invertible \(b\), if \(c\) is an inverse of \(b\), every \(x\) congruent with \(c\) modulo \(N\) is also an inverse of \(b\). (\(x\equiv c\mod{N}\)). We use \(b^{-1}\) to denote the inverse of \(b\) that \(b^{-1}\in\{1,2,...N-1\}\).

Theorem. Let \(b,N\) be integers, with \(b\geq 1\), and \(N>1\). Then \(b\) is invertible modulo \(N\) if and only if \(b\) is relatively prime to \(N\). Note that integers \(b,N\) are relatively prime (or coprime) means \(\gcd(b,N)=1\).

Therefore, for a prime \(p\), every \(b\in \{1,2,...p-1\}\) is invertible modulo \(p\).

The above theorem could be proved by applying Bezout's identity: it states that for every positive integer \(a,b\), there exists a pair of integers \(\langle X,Y \rangle\) that \(Xa+Yb=\gcd(a,b)\).

Eureka moment!

逆元这个概念算是老熟人了；它将模意义下的除法转化为乘法。

求逆元的常见方法：扩展欧几里得算法，欧拉定理，费马小定理。这些方法在接下来都会进行介绍。

Euler's Totient Function

Definition. Euler's Totient Function \(\varphi(N)\) is defined as the number of positive integers \(\leq N\) that are relatively prime to \(N\), where \(1\) is counted as being relatively prime to all integers.

The general case of Euler's Totient is \(\varphi(n)=n\Pi_{p|n}(1-\frac{1}{p})=n(1-\frac{1}{p_1})(1-\frac{1}{p_2})...(1-\frac{1}{p_r})\). This is essentially the Inclusion-Exclusion principle.

Code Implementation: Linear Sieve

To calculate a specific totient of \(n\), we could design a trivial \(O(n\log n)\) algorithm based on the above formula. However, there exists a sieve algorithm that could calculate totients of number \(1\) to \(n\) in linear time \(O(n)\). Below is the code in C++:

// the Euler sieve
// bool array: is_prime, int array: prime, phi

void sieve() {
    for (int i = 1; i < N; ++i)    is_prime[i] = 1;
    cnt = 0;
    is_prime[1] = 0;
    phi[1] = 1;
    for (int i = 2; i <= N; ++i) {
        if (is_prime[i]) {
            prime[++cnt] = i;
            phi[i] = i - 1;     // for a prime p, phi[p] = p - 1
        }
        for (int j = 1; j <= cnt && i * prime[j] <= N; ++j) {
            is_prime[i * prime[j]] = 0;
            if (i % prime[j])
                phi[i * prime[j]] = phi[i] * phi[prime[j]];
               else {
                phi[i * prime[j]] = phi[i] * prime[j];
                break;
            }
        }
    }
}

Reference & Explanation: (OI Wiki) Linear Sieve for Euler's totient function

Euler's Theorem

Theorem. Let \(n\) be a positive integer, and let \(a\) be an integer that is relatively prime to \(n\). Then \(a^{\varphi(n)}\equiv 1\pmod{n}\), where \(\varphi(n)\) is Euler's totient function of \(n\).

Since \(\gcd(a,n)=1\), \(a\) is invertible modulo \(n\). According to Euler's Theorem, \(a^{\varphi(n)}\equiv 1\pmod{n}\). Therefore \(a\times a^{\varphi(n)-1}\equiv 1\pmod{n}\). \(a^{\varphi(n)-1}\) is a multiplicative inverse of \(a\) modulo \(n\).

An informal proof. (need the knowledge of \(\mathbb{Z}_n^{*}\), multiplicative group of integers modulo \(n\)).

For \(a\in\mathbb{Z}_n^{*}\), \(\{r_1,r_2,...r_{\varphi(n)}\}=(\mathbb{Z}/n)^{*}=\{ar_1,ar_2,...ar_{\varphi(n)}\}\).

Therefore \(r_1r_2...r_{\varphi(n)}\equiv (ar_1)(ar_2)...(ar_{\varphi(n)})\), which reduces to \(1\equiv a^{\varphi(n)}\). This cancellation is allowed since all \(r_i\) have multiplicative inverses modulo \(n\).

Reference & Explanation: (Brilliant Math Wiki) Proof of Euler's Theorem

Eureka moment!

欧拉函数也接触了很长一段时间了，刻在 DNA 里的 \(\varphi * \mathbf{1}=\mathbf{Id}\)。

这里主要想讲下关于欧拉函数的一般公式：\(\varphi(n)=n\Pi_{p|n}(1-\frac{1}{p})\)。本质上这是一个容斥原理的应用。

举个例子就容易理解了：假设 \(n\) 只有两个质因子 \(p_1,p_2\)，那么展开上式可得 \(\varphi(n)=n-\frac{n}{p_1}-\frac{n}{p_2}+\frac{n}{p_1p_2}\). 用自然语言描述，即在 \(1-n\) 中与 \(n\) 互质的数的个数为 \(n\) 减去是 \(p_1\) 倍数的数的个数，再减去是 \(p_2\) 倍数的数的个数。此时，是 \(p_1p_2\) 倍数的数将会被重复减去两次；根据容斥原理，我们将多减的个数重新加回来，因此最后再加上是 \(p_1p_2\) 倍数的数的个数。

关于欧拉函数线性筛算法的设计也十分巧妙，它是建立在质数线性筛的基础上的。具体可见 OI Wiki 中的相关解释，讲的很清楚。

Extended Euclidean Algorithm (eGCD)

Euclidean Algorithm (辗转相除法) is one of the oldest and most widely known algorithms. It is a \(O(\log n)\) method of computing the greatest common divisor (GCD) of two integers \(a,b\).

According to Bezout's Identity, there exists integers \(X\) and \(Y\) that \(Xa+Yb=\gcd(a,b)\) given \(a,b\). The extended Euclidean algorithm is the method of computing such \(X\) and \(Y\).

Assuming two adjacent levels in the GCD recursive process:

current \(X\)	current \(Y\)	current \(a\)	current \(b\)
unknown \(X\)	unknown \(Y\)	\(a\)	\(b\)
known \(X'\)	known \(Y'\)	\(b\)	\(a\% b\)

\[ \begin{aligned} Xa+Yb&=X'b+Y'(a\% b) \\ Xa+Yb&=X'b+Y'(a-\lfloor \frac{a}{b}\rfloor b) \\ Xa+Yb&=Y'a+(X'-Y'\lfloor \frac{a}{b} \rfloor)b \\ \end{aligned} \]

Therefore \(X=Y', Y=X'-Y'\lfloor \frac{a}{b} \rfloor\).

Code Implementation

int exgcd(int a, int b, int& x, int& y) {
    if (b == 0) {
        x = 1, y = 0;
        return a;
    }
    int gcd = exgcd(b, a % b, x, y); 
    int _x = x;
    x = y, y = _x - y * a / b;       // from the above equities
    return gcd;
}

Using eGCD to get Inverse

Given \(b,N\), if \(\gcd(b,N)=1\), \(b\) is invertible modulo \(N\). We could apply eGCD algorithm to find multiplicative inverse of \(b\) modulo \(N\).

Run the above function exgcd(b,N,x,y), we could obtain \(x\) and \(y\) that \(xb+yN=\gcd(b,N)=1\). Therefore \(xb\equiv 1 \pmod{N}\). \(x\) is a multiplicative inverse of \(b\) modulo \(N\).

Eureka moment!

不得不感叹聪明的人与时代无关。这一这么古老的算法也花了我一段时间去弄懂。

无论是 GCD 还是 exGCD，其核心在于欧几里得公式 \(\gcd(a,b)=\gcd(b,a\%b)\). exGCD 只是在这一基础上结合了裴蜀定理递推计算对应的 \(x\) 和 \(y\)。

这里提供一个非正式的证明：首先是一个基本事实：对于互质的 \(\langle k_1,k_2 \rangle\)，\(\langle k_1\% k_2, k_2\rangle\) 也一定互质。这一为了方便用群论的知识做个解释：\(\gcd(k_1,k_2)=1\), 所以 \(k_1 \in (\mathbb{Z}/k_2)^{*}\)。自然 \(k_1\% k_2 \in (\mathbb{Z}/k_2)^{*}\)，所以 \(\gcd(k_1\% k_2,k_2)=1\)。

那么对于任意正整数 \(a,b\)，设 \(d=\gcd(a,b)\)，则有 \(a=k_1d, b=k_2d\)，且 \(\gcd(k_1,k_2)=1\). 那么 \(a\%b=a-\lfloor \frac{a}{b} \rfloor b=a-\lfloor \frac{k_1}{k_2} \rfloor b=k_1d- \lfloor \frac{k_1}{k_2} \rfloor k_2d=(k_1\% k_2)d\)。根据上面的事实，我们有 \(\gcd(k_2,k_1\% k_2)=1\)，所以 \(\gcd(b, a\%b)=d\gcd(k_2, k_1\% k_2)=d\)。

Chinese Remainder Theorem (CRT)

今有物不知其数，三三数之剩二，五五数之剩三，七七数之剩二，问物几何？ —— 孙子算经

The Chinese remainder theorem (CRT) is a theorem which gives a unique solution to a simultaneous linear congruences with coprime moduli. In its basic form, the CRT will determine a number \(p\) that, when divided by some given divisors, leaves given remainders.

If \(m_1,...,m_k\) are pairwise coprime, i.e., \(\gcd(m_i,m_j)=1\) for \(i\neq j\), then the following system \[ \begin{cases} \ x\equiv a_1 \pmod{m_1} \\ \ x\equiv a_2 \pmod{m_2} \\ \ ..., \\ \ x\equiv a_k \pmod{m_k} \end{cases} \] has exactly one solution in the sense of module \(M=\Pi_{i=1}^k m_i\) for any \(a_1,a_2,...,a_k\in \mathbb{Z}\).

We define \(M_i=M/m_i\) and \(t_i\equiv M_i^{-1}\pmod{m_i}\), then the solution to this congruences is \(x\equiv \sum_{i=1}^k a_iM_it_i \pmod{M}\). Note that \(x\) is unique modulo \(M\).

The proof of correctness and uniqueness of CRT: My CnBlogs or OI Wiki-CRT

Code Implementation

long long CRT(int k, long long* a, long long* m) {
    long long M = 1, ans = 0;
    for (int i = 1; i <= k; ++i)
        M = M * m[i];
       for (int i = 1; i <= k; ++i) {
        long long Mi = M / m[i], ti, _;
        exgcd(Mi, m[i], ti, _); // get inverse: ti * Mi mod m[i] = 1
        ans = (ans + a[i] * Mi % M * ti % M) % M;
    }
    return (ans % M + M) % M;
}

Eureka moment!

数论的定理多如牛毛，在众多熟悉或不熟悉的外国名字中看到一个「中国」剩余定理还是非常之亲切的。

CRT 提供了互质模数的线性同余方程组的解。在密码学中 CRT 经常被运用于简化模意义下求幂的运算。具体来说，对于合数模数 \(M\) 且 \(\gcd(M,a)\neq 1\) 的情况，可以应用 CRT 简化 \(a^e\mod{M}\) 的计算。此外，CRT 还能用于加速 RSA 的解密过程 (RSA decryption)。 \[ m=c^d\mod{N} \Rightarrow \begin{cases} m\equiv c^d\mod{p} \\m\equiv c^d \mod{q} \end{cases} \Rightarrow \begin{cases} m\equiv (c\mod p)^{d\mod{(p-1)}}\mod{p}\\ m\equiv (c\mod q)^{d\mod{(q - 1)}}\mod{q} \end{cases} \] Bob (Receiver) 是知道 \(N\) 的两个 factor \(p\) 与 \(q\) 的，因此其先将朴素的解密过程转化为这一线性同余方程组，并应用 CRT 得到方程的解，也即消息 \(m\)。这一解密方式比直接计算 \(c^d\mod{N}\) 快了 \(4\) 倍左右。

P.S. RSA 问题是一个著名的困难问题 (hard problems)，我将会在 P3 中进行介绍。

Quadratic Residue

If there is an integer \(0<x<p\) such that \(x^2\equiv a\pmod{p}\). We call \(a\) a quadratic residue modulo \(p\). Otherwise (such \(x\) doesn't exist), we call \(a\) a quadratic nonresidue modulo \(p\).

Note that \(p \nmid a\) to exclude the trivial cases \(x=0\).

The quadratic residue problem requires us to solve the congrence \(x^2\equiv a\pmod{p}\). Essentially it is to find the square root of \(a\) in the sense of modulo \(p\). Usually, the moduli \(p\) is an odd prime.

Legendre Symbol

The legendre symbol is a number theoretic function \((\frac{a}{p})\) which is defined to be equal to \(\pm 1\) depending on whether \(a\) is a quadratic residue modulo \(p\). \[ (\frac{a}{p})=(a|p)=\begin{cases} \ 0, \ if \ \ p|a \\ \ 1, \ if \ a \ is \ a \ quadratic \ residue \ modulo \ p \\ -1 , \ if \ a \ is \ a \ quadratic \ nonresidue \ modulo \ p \end{cases} \]

### Euler's Criterion

For an odd prime \(p\) and \(p \nmid a\), the Legendre symbol \((\frac{a}{p})\equiv a^{(p-1)/2} \pmod{p}\).

For an odd prime \(p\) and \(p \nmid a\), according to Euler's critetion, we could determine whether \(a\) is a quadratic residue modulo \(p\) by checking whether \(a^{(p-1)/2}\mod{p}\) equals to \(1\) or not.

See the prove here.

Eureka moment!

虽然在本课程中关于二次剩余问题的内容只是浅尝辄止，但我觉得还是非常有必要将其单独拿出来进行介绍，因为这个问题的研究在数论上占有举足轻重的地位。

二次剩余问题本质上是模意义下的开根问题，实际上是 RSA 问题的铺垫：RSA 问题本质上是模意义下的开 \(e\) 次根问题。因此，当 \(e=2\) 时，RSA 问题与二次剩余问题等价。

Group Theory: Introduction

Group theory is the study of groups. Groups are sets equipped with an operation that satisfies certain basic properties. They are closure, identity, inverse, and associativity.

Set \(\mathbb{G}\) and a binary operation \(\circ\) defined on \(\mathbb{G}\) that:

Closure: For all \(g_1,g_2\in \mathbb{G}\), it holds that \(g_1 \circ g_2\in \mathbb{G}\)
Identity: There exists an identity element \(e\in \mathbb{G}\) s.t. \(e\circ g=g=g \circ e\) for all \(g\in \mathbb{G}\)
Inverse: For every element \(g\in \mathbb{G}\) there exists an inverse element \(h\in \mathbb{G}\) s.t. \(g\circ h=e=h\circ g\)
Associativity: For all \(g_1,g_2,g_3\in\mathbb{G}\), \(g_1\circ (g_2\circ g_3)=(g_1\circ g_2)\circ g_3\)

Order of a finite group \(\mathbb{G}\), denoted \(|\mathbb{G}|\), is the number of elements in \(\mathbb{G}\).

If group \(\mathbb{G}\) is an Abelian group, it will satisfiy the additionally commutativity: For all \(g_1,g_2\in\mathbb{G}\), \(g_1\circ g_2=g_2\circ g_1\).

Subgroups

Definition. If \(\mathbb{G}\) is a group, a set \(\mathbb{H}\subseteq \mathbb{G}\) is a subgroup of \(\mathbb{G}\) if \(\mathbb{H}\) itself forms a group under the same operation. \(\mathbb{H}\) is a strict subgroup of \(\mathbb{G}\) if \(\mathbb{H} \neq \mathbb{G}\).

Note that every group \(\mathbb{G}\) always has the trivial subgroups \(\mathbb{G}\) and \(\{e\}\).

Modulo Multiplication Group

A modulo multiplication group \((\mathbb{Z}/n)^{*}\) is a finite group of residue classes prime to \(n\) under multiplication mod \(n\). \((\mathbb{Z}/n)^{*}\) is Abelian of group order \(\varphi(n)\), where \(\varphi(n)\) is the totient function.

The multiplicative group of integers modulo \(n\), \((\mathbb{Z}/n)^{*}\), is very worth researching.

\((\mathbb{Z}/n)^{*}:=\) invertible elements in \(\{1,2,...n-1\}\) \(:=\) \(\{b\in \{1,...,n-1\}|\gcd(b,n)=1 \}\).

Therefore \(|(\mathbb{Z}/n)^{*}|=\varphi(n)\). If \(n\) is prime (which is usually denoted \(p\)), \(|(\mathbb{Z}/p)^{*}|=p-1\).

\((\mathbb{Z}/n)^{*}\) is also denoted \(\mathbb{Z}^{*}_n\).

Eureka moment!

群论虽然是第一次接触学习，但它的定义与相关的一些定理意外的容易理解 (遵循直觉)。可能是由于 "群" 这一概念在日常生活中已经渗透得很深，以致于抽象成为数学模型后也有较强的既视感。

这里想要重点提一下整数模 \(n\) 乘法群 \((\mathbb{Z}/n)^{*}\)，这是一个很重要的群。对于它的研究起源于对模 \(n\) 剩余类 (residue classes) 的研究。我们知道，对于模 \(n\) 剩余类 \(\{0,1,2,...,n-1\}\)，若群运算 (group operation) 是加法 \(+\)，剩余类能够很自然的形成一个群 \((\mathbb{Z}/n)^{+}\)。然而，若我们将群运算设为乘法 \(\times\)，剩余类并不能形成一个群。这是因为群性质要求每个群元素都具有逆元 (inverse)；而在模 \(n\) 剩余类中，只有与 \(n\) 互质的群元素才有逆元。因此，我们将与 \(n\) 不互质的元素剔除，剩下的就是整数模 \(n\) 乘法群 \((\mathbb{Z}/n)^{*}\) 了。

Fermat's Little Theorem

Theorem. Let \(\mathbb{G}\) be a finite group of order \(m=|\mathbb{G}|\). Then for any \(g\in \mathbb{G}\), it holds that \(g^m=1\).

Proof. \(\mathbb{G}=\{g_1,g_2,...,g_m\}\), \(g\in \mathbb{G}\). \(g_i\neq g_j\) for all \(i,j\), therefore \(gg_i\neq gg_j\) for all \(i,j\). With the closure property of the group \(\mathbb{G}\), \(\{gg_1,gg_2,...gg_m\}\) is a permutation of \(\{g_1,g_2,...,g_m\}\). Then we have \((gg_1)(gg_2)...(gg_m)=g_1g_2...g_m\), which could be reduced to \(g^m=1\).

We could see the similarity between the proof of Fermat's little theorem and Euler's theorem. In fact, Fermat's little theorem is the general form of Euler's theorem. Let \(\mathbb{G}\) be the modulo multiplication group \((\mathbb{Z}/n)^{*}\), one could derive the Euler's theorem.

Corollary 1

Corollary (Fermat's Little Theorem). Let \(\mathbb{G}\) be a finite group of order \(m\). Then for any \(g\in \mathbb{G}\), and any integer \(x\), we have \(g^x=g^{[x\mod m]}\).

Many exponents calculation could be simplified using this corollary.

For example, if computing \(5^{1000}\mod 11\), we could see \(5\in (\mathbb{Z}/11)^{*}\) since \(\gcd(5,11)=1\). Therefore \(5^{1000}\mod{11}=5^{[1000\mod \varphi(11)]}\mod{11}=5^{[1000\mod{10}]}\mod{11}=1\).

Corollary 2

Corollary (Fermat's Little Theorem). Let \(\mathbb{G}\) be a finite group of order \(m\). Let \(e>0\) be an integer, and define the function, \(f_e:\mathbb{G}\to \mathbb{G}\) by \(f_e(g)=g^e\).

If \(\gcd(e,m)=1\), then \(f_e\) is a permutation, i.e., a bijection.
Moreover, if \(d=e^{-1}\mod{m}\), then \(f_d\) is the inverse of \(f_e\).

Eureka moment!

注意，我们这里介绍的是费马小定理的群论表述；事实上，其拥有一个更为广为人知的数学表述。对于任意整数 \(a\), 质数 \(p\)，有 \(a^{p}\equiv a\pmod{p}\)。

当 \(\gcd(a,p)=1\) 时，我们能够由此推导出欧拉定理。

当 \(\gcd(a,p)\neq 1\) 时，由于 \(p\) 是质数，\(a\) 一定是 \(p\) 的倍数 \(p|a\). 那么显然此时 \(a^p \equiv a \equiv 0\pmod{p}\).

Cyclic Groups and Generators

Before stepping into the definition of cyclic groups, we introduce the definition of order of a group element.

Definition. The order of a group element \(g\in \mathbb{G}\) is the smallest positive integer \(r\) such that \(g^r=1\), and is denoted by \(ord(g)\).

Note the distinction between the two concepts the order of a group and the order of a group element.

Definition. If an element \(g\in \mathbb{G}\) has order \(|\mathbb{G}|\), then \(\mathbb{G}\) is a cyclic group and can be denoted by \(\mathbb{G}=\langle g \rangle\). We call \(g\) the generator of \(\mathbb{G}\).

Therefore, if \(\mathbb{G}\) is a cyclic group and \(|\mathbb{G}|=m\), there must exists a generator \(g\in\mathbb{G}\) that could generate the whole group by constantly multiplying itself, i.e., \(\{g, g^2, ..., g^m\}=\mathbb{G}\).

Eureka moment!

之前搞 OI 的时候在数论里学习到了原根 (primitive root) 的概念，然而并不是很明白。到大学学到生成元 (generator) 的概念突然有了醍醐灌顶之感。实际上，原根和生成元的概念是一体两面。

若一个正整数 \(m\) 有原根 \(g\)，那么就意味着群 \((\mathbb{Z}/m)^{*}\) 是一个循环群，且它的生成元为 \(g\)。

正整数 \(m\) 有原根 (群 \((\mathbb{Z}/m)^{*}\) 为循环群) 的充要条件为：它能表达为下列形式之一 \(2,4,p^n,2p^n\)，其中 \(p\) 为奇素数。且若 \(m\) 有原根，其原根的个数为 \(\varphi(\varphi(m))\)。部分证明 (\(m\) 为质数) 见我的 CnBlogs: 循环群；这里还提到，任意质数阶群 (group of prime order) 一定是循环群。

Reference

This article is a self-administered course note.

References in the article are from corresponding course materials if not specified.

Course info:

Code: COMP3357, Lecturer: Dr. Ravi. Ramanathan, TA: Zhang Chengru

Course textbook:

Introduction to Modern Cryptography (2nd edition), Katz, J. & Lindell, Y., (2015), CRC Press, view

Math content:

OI Wiki, Brilliant Math Wiki, Wolframe MathWorld.

-----------------------------------そして、次の曲が始まるのです。-----------------------------------