Collatz Signatures

A digital signature system based on the Collatz map

Apr 15, 2023

Disclaimer: Don’t take this too seriously. It’s based on some strong and unproven assumptions and is not intended to be practical, but it’s an interesting thought. This won’t be a very rigorous treatment of the idea, it’s more of a sketch of an idea.

In this article, I try to build a digital signature system using the Collatz function. First we go over what the Collatz conjecture is, then present several observations, and finally we construct a digital signature system using the Collatz function. A Collatz signature is a type of group blind signature. In this mechanism, a group of participants agree on a secret key, which is some odd number. Subsequently, anybody with knowledge of this key can use it to build a signature that is only verifiable by others with knowledge of the secret x.

Fig. 1 Overview of two-party Collatz signature and verification

The Collatz Conjecture

Anybody who has dabbled in number theory is bound to have come across the Collatz Conjecture. With a deceptively simple setup, the conjecture’s proof has consumed and eluded mathematicians since it was proposed by Lothar Collatz in 1937. The conjecture starts with the function:

$C(x) = \begin{cases} \frac{x}{2} & : x \in 2 \mathbb{N} \\ 3x + 1 & : x \notin 2 \mathbb{N} \\ \end{cases} $

The conjecture is as follows: for any natural number x, there is some k > 0 such that the map C evaluated k times, starting at x, will always result in the number 1. In more consice terms, it posits the following statement:

$\forall x \in \mathbb{N} \: \exists k \in \mathbb{N} \: s.t. \: C^k(x) = 1$

Visualization of Collatz Conjecture by Lovasoa available through Wikimedia Commons

It is not yet known if the conjecture is true for all numbers, but it has been proved true for all numbers up to 2^100000 - 1. Many have attempted to prove and disprove it, including Terrence Tao, yet the conjecture remains unanswered. In this article, I’m not going to make an attempt to solve the conjecture. In fact, what follows is really an experiment of a scenario where the conjecture is assumed to be true. What follows will not be incredibly rigorous. One note: by the result that the conjecture holds for all x < 2^100000 - 1, we can make a(n) (probably unfounded) assumption that the signature system we will build only has input and output that is less than that proven bound.

Overview and Observations

Ultimately, the basis for the collatz function as a hash function relies on the fact that for any given k and x, it is easy to calculate the Collatz map evaluated k times, starting with x. However, given y, it is very difficult to determine which x produced that output. This is somewhat reminiscent of the discrete log problem, using base 2. Observe that given that x maps to y under C evaluated k times, we have:

$\begin{align} C^k(x) = y &\Rightarrow 2^{-(k-1)} C(x) = y \\ & \Rightarrow k = log_2(C(x) / y) + 1 \end{align}$

Note that since x is unknown, so is C(x) (we only know it is a power of 2 multiplied by y).

We’ll need to make a few observation about the Collatz function before we can describe the signature system.

First, observe that for any odd number x, C(x) is an even number (C(x) = 3x + 1 = 3(2j + 1)+1 = 2(3j + 2) q.e.d.), and that each even number will eventually become an odd number after successive iterations of C. With this, we can say that for each odd number x that there exists some k > 1 such that C evaluated k times starting at x results in an odd number, and every number between 0 and k results in an even number. That is:

$ \forall x \notin 2\mathbb{N} \: \exists k > 1 \: s.t. \: C^k(x) \notin 2\mathbb{N} \: \And \: C^j(x) \in 2\mathbb{N} \: \forall 1 \leq j < k \:\:\:\:\:\:\:\: (1)$

We also observe that:

$\forall x, y \notin 2\mathbb{N}, C(x) = C(y) \iff x = y \:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\: (2)$

Now, we’re going to take a dangerous path and assume that for some very large number that the conjecture holds with some high probability for all numbers less than the large number. The next set of observations we make will address the inverse of the Collatz map. That is, given some y, can we find all numbers that map to it under C?

Collatz Signatures

And finally, we get to the good stuff. Here I propose a signature scheme using the Collatz map. In this scheme, Bob and Alice are able to verify each other’s signatures, but no other party can.

Suppose some Alice and Bob agree on some odd number x. Then, x can be used as follows:

Alice chooses some k > 1 as per equation (1) and calculates $C^k(x) =: y \notin 2\mathbb{N}$, $k^x =: z$ and publishes (y, z).
Bob calculates the natural log of C(x) / y and adds one to recover the value of k.
Bob verifies that $C^k(x) = y$ using the recovered k value. If not, then the value published by Alice is invalid and Bob rejects it.
If the values do match, then Bob calculates k raised to the secret x. If it equals the published z value, then we know that y was produced by x under C evaluated k times.

Since Substack doesn’t render inline latex, the above would be something like:

$\begin{align} Alice:& \: k > 1 \to (C^{k}(x) =: y, k^x =: z) \to (y, z) \\ Bob:& \: k' = log_2(\frac{C(x)}{y}) + 1 \\ Bob:& \: C^{k^{'}}(x) = y? \\ Bob:& \: k'^{x} = z ? \\ \end{align}$

The first check verifies that the value x was indeed in the proper inverse set.

The second check verifies that the value used to produce the output was the same secret known by Bob.

Further Work

There are some issues with this. One big one that jumps out is that the value k^x is unbounded and could be enormous. This most likely isn’t very practical, but I hope you had some fun reading it.
This gets a little stranger when we add more parties who have knowledge of the secret. Suppose that N participants agree on ‘x’. Then, any participant can calculate some (y, z) and publish it. Each of the participants can verify that it was produced by somebody with knowledge of ‘x’, but are unable to determine exactly which party it was. To address this, each participant in the ‘group’ can be assigned some unique index k_i. This k_i acts as their secret key, and we modify the protocol slightly so that instead of using k to find the y value, we use some g^k for a group generator g.

I think it would be interesting to continue to explore if there is a family of functions that have behavior like the 3x+1 map. In particular, the family of piecewise functions {(x → x/a), (x → bx + c) | bx + c | a for all x that do not divide a}.
It would also be interesting to see if there is a generalization of this that allows for usage of bases other than 2, e.g. base a, using some element from the family of piecewise functions above.

Tony's Newsletter

Discussion about this post