Decomposing a Factorial into Large Factors by surprisetalk

Share This Article

Sed ut perspiciatis unde.

I’ve just uploaded to the arXiv the paper “Decomposing a factorial into large factors“. This paper studies the quantity ${t(N)}$ , defined as the largest quantity such that it is possible to factorize ${N!}$ into ${N}$ factors ${a_1, dots, a_N}$ , each of which is at least ${t(N)}$ . The first few values of this sequence are

$displaystyle 1,1,1,2,2,2,2,2,3,3,3,3,3,4, dots$

(OEIS A034258). For instance, we have ${t(9)=3}$ , because on the one hand we can factor

$displaystyle 9! = 3 times 3 times 3 times 3 times 4 times 4 times 5 times 7 times 8$

but on the other hand it is not possible to factorize ${9!}$ into nine factors, each of which is ${4}$ or higher.

This quantity ${t(N)}$ was introduced by Erdös, who asked for upper and lower bounds on ${t(N)}$ ; informally, this asks how equitably one can split up ${N!}$ into ${N}$ factors. When factoring an arbitrary number, this is essentially a variant of the notorious knapsack problem (after taking logarithms), but one can hope that the specific structure of the factorial ${N!}$ can make this particular knapsack-type problem more tractable. Since

$displaystyle N! = a_1 dots a_N geq t(N)^N$

for any putative factorization, we obtain an upper bound

$displaystyle t(N) leq (N!)^{1/N} = frac{N}{e} + O(log N) (1)$

thanks to the Stirling approximation. At one point, Erdös, Selfridge, and Straus claimed that this upper bound was asymptotically sharp, in the sense that

$displaystyle t(N) = frac{N}{e} + o(N) (2)$

as ${N rightarrow infty}$ ; informally, this means we can split ${N!}$ into ${N}$ factors that are (mostly) approximately the same size, when ${N}$ is large. However, as reported in this later paper, Erdös “believed that Straus had written up our proof… Unfortunately Straus suddenly died and no trace was ever found of his notes. Furthermore, we never could reconstruct our proof, so our assertion now can be called only a conjecture”.

Some further exploration of ${t(N)}$ was conducted by Guy and Selfridge. There is a simple construction that gives the lower bound

$displaystyle t(N) geq frac{3}{16} N - o(N)$

that comes from starting with the standard factorization ${N! = 1 times 2 times dots times N}$ and transferring some powers of ${2}$ from the later part of the sequence to the earlier part to rebalance the terms somewhat. More precisely, if one removes one power of two from the even numbers between ${frac{3}{8}N}$ and ${N}$ , and one additional power of two from the multiples of four between ${frac{3}{4}N}$ to ${N}$ , this frees up ${frac{3}{8}N + o(N)}$ powers of two that one can then distribute amongst the numbers up to ${frac{3}{16} N}$ to bring them all up to at least ${frac{3}{16} N - o(N)}$ in size. A more complicated procedure involving transferring both powers of ${2}$ and ${3}$ then gives the improvement ${t(N) geq frac{1}{4} N - o(N)}$ . At this point, however, things got more complicated, and the following conjectures were made by Guy and Selfridge:

In this note we establish the bounds

$displaystyle frac{1}{e} - frac{O(1)}{log N} leq frac{t(N)}{N} leq frac{1}{e} - frac{c_0+o(1)}{log N} (3)$

as ${N rightarrow infty}$ , where ${c_0}$ is the explicit constant

$displaystyle c_0 := frac{1}{e} int_0^1 left lfloor frac{1}{x} rightrfloor log left( ex left lceil frac{1}{ex} rightrceil right) dx approx 0.3044.$

In particular this recovers the lost result (2). An upper bound of the shape

$displaystyle frac{t(N)}{N} leq frac{1}{e} - frac{c+o(1)}{log N} (4)$

for some by Erdös and Graham (Erdös problem #391). We conjecture that the upper bound in (3) is sharp, thus

$displaystyle frac{t(N)}{N} = frac{1}{e} - frac{c_0+o(1)}{log N}, (5)$

which is consistent with the above conjectures (i), (ii), (iii) of Guy and Selfridge, although numerically the convergence is somewhat slow.

The upper bound argument for (3) is simple enough that it could also be modified to establish the first conjecture (i) of Guy and Selfridge; in principle, (ii) and (iii) are now also reducible to a finite computation, but unfortunately the implied constants in the lower bound of (3) are too weak to make this directly feasible. However, it may be possible to now crowdsource the verification of (ii) and (iii) by supplying a suitable set of factorizations to cover medium sized ${N}$ , combined with some effective version of the lower bound argument that can establish ${frac{t(N)}{N} geq frac{1}{3}}$ for all ${N}$ past a certain threshold. The value ${N=300000}$ singled out by Guy and Selfridge appears to be quite a suitable test case: the constructions I tried fell just a little short of the conjectured threshold of ${100000}$ , but it seems barely within reach that a sufficiently efficient rearrangement of factors can work here.

We now describe the proof of the upper and lower bound in (3). To improve upon the trivial upper bound (1), one can use the large prime factors of ${N!}$ . Indeed, every prime ${p}$ between ${N/e}$ and ${N}$ divides ${N!}$ at least once (and the ones between ${N/e}$ and ${N/2}$ divide it twice), and any factor ${a_i}$ that contains such a factor therefore has to be significantly larger than the benchmark value of ${N/e}$ . This observation already readily leads to some upper bound of the shape (4) for some ${c/>0}” src=”https://s0.wp.com/latex.php?latex=%7Bc%3E0%7D&bg=ffffff&fg=000000&s=0&c=20201002″ >; if one also uses the primes <img decoding=$ that are slightly less than ${N/e}$ (noting that any multiple of ${p}$ that exceeds ${N/e}$ , must in fact exceed ${lceil N/ep rceil p}$ ) is what leads to the precise constant ${c_0}$ .

For previous lower bound constructions, one started with the initial factorization ${N! = 1 times dots times N}$ and then tried to “improve” this factorization by moving around some of the prime factors. For the lower bound in (3), we start instead with an approximate factorization roughly of the shape

$displaystyle N! approx (prod_{t leq n < t + 2N/A, hbox{ odd}} n)^A$

where ${t}$ is the target lower bound (so, slightly smaller than ${N/e}$ ), and ${A}$ is a moderately sized natural number parameter (we will take ${A asymp log^3 N}$ , although there is significant flexibility here). If we denote the right-hand side here by ${B}$ , then ${B}$ is basically a product of ${N}$ numbers of size at least ${t}$ . It is not literally equal to ${N!}$ ; however, an easy application of Legendre’s formula shows that for odd small primes ${p}$ , ${N!}$ and ${B}$ have almost exactly the same number of factors of ${p}$ . On the other hand, as ${B}$ is odd, ${B}$ contains no factors of ${2}$ , while ${N!}$ contains about ${N}$ such factors. The prime factorizations of ${B}$ and ${N!}$ differ somewhat at large primes, but ${B}$ has slightly more such prime factors as ${N!}$ (about ${frac{N}{log N} log 2}$ such factors, in fact). By some careful applications of the prime number theorem, one can tweak some of the large primes appearing in ${B}$ to make the prime factorization of ${B}$ and ${N!}$ agree almost exactly, except that ${B}$ is missing most of the powers of ${2}$ in ${N!}$ , while having some additional large prime factors beyond those contained in ${N!}$ to compensate. With a suitable choice of threshold ${t}$ , one can then replace these excess large prime factors with powers of two to obtain a factorization of ${N!}$ into ${N}$ terms that are all at least ${t}$ , giving the lower bound.

The general approach of first locating some approximate factorization of ${N!}$ (where the approximation is in the “adelic” sense of having not just approximately the right magnitude, but also approximately the right number of factors of ${p}$ for various primes ${p}$ ), and then moving factors around to get an exact factorization of ${N!}$ , looks promising for also resolving the conjectures (ii), (iii) mentioned above. For instance, I was numerically able to verify that ${t(300000) geq 90000}$ by the following procedure:

Start with the approximate factorization of ${N!}$ , ${N = 300000}$ by ${B = (prod_{90000 leq n < 102000, hbox{ odd}} n)^{50}}$ . Thus ${B}$ is the product of ${N}$ odd numbers, each of which is at least ${90000}$ .
Call an odd prime ${B}$ -heavy if it divides ${B}$ more often than ${N!}$ , and ${N!}$ -heavy if it divides ${N!}$ more often than ${B}$ . It turns out that there are ${14891}$ more ${B}$ -heavy primes than ${N!}$ -heavy primes (counting multiplicity). On the other hand, ${N!}$ contains ${2999992}$ powers of ${2}$ , while ${B}$ has none. This represents the (multi-)set of primes one has to redistribute in order to convert a factorization of ${B}$ to a factorization of ${N!}$ .
Using a greedy algorithm, one can match a ${B}$ -heavy prime ${p'}$ to each ${N!}$ -heavy prime ${p}$ (counting multiplicity) in such a way that ${p' leq 2^{m_p} p}$ for a small ${m_p}$ (in most cases one can make ${m_p=0}$ , and often one also has ${p'=p}$ ). If we then replace ${p'}$ in the factorization of ${B}$ by ${2^{m_p} p}$ for each ${N!}$ -heavy prime ${p}$ , this increases ${B}$ (and does not decrease any of the ${N}$ factors of ${B}$ ), while eliminating all the ${N!}$ -heavy primes. With a somewhat crude matching algorithm, I was able to do this using ${sum_p m_p = 39992}$ of the ${299992}$ powers of ${2}$ di

Post Author

Sharlin

Posted March 28, 2025 at 3:53 pm

[flagged]

0Likes Log in to Reply
Post Author

btilly

Posted March 28, 2025 at 4:12 pm

Out of curiosity, I wondered how tight these bounds are. Consider the case of 300,000 which Terry has put a lower bound of 90,000 on, and would like a bound of 100,000 on. If a perfect division of factors into equal buckets could be achieved, the answer would be 110366.49020484093 per bucket. That's e^(log(n!)/n), to within the precision that Python calculates it. (My first try was to use Stirling's formula. That estimated it at 110366.49020476094, which is pretty darned close!)

A straightforward greedy approach will see those final buckets differing by factors of around 2. Which is a lot worse than Terry's current approach.

This really is a knapsack problem.

0Likes Log in to Reply
Post Author

zahlman

Posted March 28, 2025 at 4:39 pm

From comments:

>No; in fact, in Guy’s article on this problem, he notes that there is a jump of 2 from t(124)=35 to t(125)=37

Huh. Can we actually prove t(N) is monotonic? Jumps like that seem like they could be one-offs in some cases.

0Likes Log in to Reply
Post Author

curtisszmania

Posted March 28, 2025 at 5:03 pm

[dead]

0Likes Log in to Reply
Post Author

tempodox

Posted March 28, 2025 at 5:08 pm

I'm still confused as to how to compute t(N). Maybe it's buried in the legalese of this text and I'm just dumb, but I can't find a clue.

0Likes Log in to Reply
Post Author

SJC_Hacker

Posted March 28, 2025 at 5:56 pm
Edit: as pointed out, you can't simply divide by each number, because then its not equal to original number factorial. However, I've fixed the algorithm somewhat maintaining the basic idea

The "naive" way

100!= 2*3*4*...*99*100

Obviously t(100) > 1. If we can get rid of the single 2, we know that t(100) > 2. If you multiply the whole thing by 2/2 (=1),

100! = (2/2)*2*3*4*...*50*...99*100 = (2*2/2)*3*4*...*50*...99*100 = (4/2)*3*4*...*50*...99*100 = 4*3*4*...*50*...99*(100/2) = 3*4*4*...*50*50*...99

We can continue with t(100) > 3 by multiplying by 3/3 and pairing it with 99, i.e. 99*3*(3/3) = (99/3)*3*3 = 33*9 yielding

100! = 4*4*5*...*9*9*10*..*33*33*...*50*50...*97*98

However, once we get to t(100) > 4, trying to get rid of 4, we have to skip 98 since its not divisible by 4. The other problem is we have two 4s… If we had instead using 98 for getting rid of 2, we can then use 100, and 96 for the other 4. This is our first "gotcha" for the naive algorithm of always picking the largest number, which seems intuitive at first glance.

Now if we test all possibilities starting with 2, we get 48 choices for the dividing 2 (even numbers > 2, not including 2 which will not increase t(100) beyond 2. Then ~33 choices for dividing 3 (depending if our div of 2 resulted in a factor of 3), ~25 for 4, But notice since we now have two 4s, we have to do it twice, so its 25*24 choices for getting rid of 4.*
0Likes Log in to Reply
Post Author

dvh

Posted March 28, 2025 at 6:20 pm

As a kid I was bugged by the fact that Stirling formula doesn't give exact integer result, so I set to find my own formula. I failed, but discovered that sum from 1 to N is (n*n+n)/2. Surely if perfect formula for sum exists, for multiplication should exist too.

0Likes Log in to Reply
Post Author

phkahler

Posted March 28, 2025 at 7:39 pm

Is there a standard notation for the product of all even numbers up to N and for the product of all odd numbers up to N? I know if N is even then the product of evens is = (N/2)! * 2^(N/2) so I guess notation for that is a little redundant, but there is no simple formula for the product of the odd numbers.

0Likes Log in to Reply

Decomposing a Factorial into Large Factors by surprisetalk

Decomposing a Factorial into Large Factors by surprisetalk

Share This Article

Newsletter

HackTech

8 Comments

Sharlin

btilly

zahlman

curtisszmania

tempodox

SJC_Hacker

dvh

phkahler

Leave a comment Cancel reply

Editor's Choice

Decomposing a Factorial into Large Factors by surprisetalk

Decomposing a Factorial into Large Factors by surprisetalk

Share This Article

Newsletter

8 Comments

Leave a comment Cancel reply

Editor's Choice

Sign Up to Our Newsletter