2023.10.03: The inability to count correctly: Debunking NIST’s calculation of the Kyber-512 security level. #nist #addition #multiplication #ntru #kyber #fiasco
[Sidney Harris cartoon used with permission.
Copyright holder: ScienceCartoonsPlus.com.]
Quick, what’s 2^40 plus 2^40?
It’s 2^80, right?
No, obviously not.
40 plus 40 is 80,
and
2^40 times 2^40 is 2^80,
but
2^40 plus 2^40 is only 2^41.
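The arithmetic is easy to check exactly with Python's arbitrary-precision integers:

```python
# Adding two equal powers of two bumps the exponent by one;
# multiplying them is what adds the exponents.
assert 2**40 + 2**40 == 2**41   # addition: exponent goes up by 1
assert 2**40 * 2**40 == 2**80   # multiplication: exponents add
assert 2**40 + 2**40 != 2**80   # the mistake this post is about
```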
Take a deep breath and relax.
When cryptographers
are analyzing the security of
cryptographic systems,
of course
they don’t make stupid mistakes such as
multiplying numbers that should have been added.
If such an error somehow managed to appear,
of course it would immediately be caught
by the robust procedures that cryptographers follow
to thoroughly review security analyses.
Furthermore,
in the context of standardization processes
such as the NIST Post-Quantum Cryptography Standardization Project (NISTPQC),
of course the review procedures are even more stringent.
The only way
for the security claims for modern cryptographic standards
to turn out to fail
would be because of some unpredictable new discovery
revolutionizing the field.
Oops, wait, maybe not.
In 2022,
NIST announced plans to standardize
a particular cryptosystem, Kyber-512.
As justification, NIST issued claims
regarding the security level of Kyber-512.
In 2023,
NIST issued a draft standard for Kyber-512.
NIST’s underlying calculation of the security level
was a severe and indefensible miscalculation.
NIST’s primary error is exposed in this blog post,
and boils down to nonsensically multiplying two costs that should have been added.
How did such a serious error
slip past NIST’s review process?
Do we dismiss this as an isolated incident?
Or do we conclude that something is fundamentally broken
in the procedures that NIST is following?
Discovering the secret workings of NISTPQC.
I filed a FOIA request
“NSA, NIST, and post-quantum cryptography”
in March 2022.
NIST stonewalled, in violation of the law.
Civil-rights firm
Loevy & Loevy
filed a lawsuit on my behalf.
That lawsuit has been gradually
revealing
secret NIST documents,
shedding some light on what was actually going on behind the scenes,
including much heavier NSA involvement than indicated by NIST’s public narrative.
Compare, for example, the following documents:
- A public 2014 document
says that its author is
“Post Quantum Cryptography Team, National Institute of Standards and Technology
(NIST), pqc@nist.gov”.
- A secret 2016 document
listed the actual pqc@nist.gov team members,
with more NSA people
(Nick Gajcowski; David Hubbard; Daniel Kirkwood; Brad Lackey; Laurie Law; John McVey;
Mark Motley; Scott Simon; Jerry Solinas; David Tuller)
than NIST people.
(Another Department of Defense representative on the list
was Jacob Farinholt, Naval Surface Warfare Center, US Navy.
I’m not sure about Evan Bullock.)
- Another secret 2016 document
shows that NSA’s Scott Simon was scheduled to visit NIST on 12 January 2016.
- Another secret 2016 document
shows that NIST’s “next meeting with the NSA PQC folks” was scheduled for 26 January 2016.
- Another secret 2016 document
shows that Michael Groves from NSA’s UK partner
was scheduled to visit NIST on 2 February 2016.
- Another secret 2016 document
lists Colin Whorlow from NSA’s UK partner
as someone that NIST visited in 2016,
in particular discussing
“confidence and developments for each of the primary PQC families”.
- A public 2020 document
says “Engagement with community and stakeholders.
This includes feedback we received from many, including the NSA.
We keep everyone out of our internal standardization meetings and the decision process.
The feedback received (from the NSA) did not change any of our decisions …
NIST encouraged the NSA to provide comments publicly.
NIST alone makes the PQC standardization decisions, based on publicly available information, and stands by those decisions”.
I filed a new FOIA request in January 2023,
after NIST issued its claims regarding the security level of Kyber-512.
NIST again stonewalled.
Loevy & Loevy has now filed a new lawsuit regarding that FOIA request.
Public material regarding Kyber-512 already shows
how NIST multiplied costs that should have been added,
how NIST sabotaged public review of this calculation,
and how important this calculation was for NIST’s narrative of Kyber outperforming NTRU,
filling a critical gap left by other steps
that NIST took to promote the same narrative.
This blog post goes carefully through the details.
Alice and Bob paint a fence.
At this point you might be thinking
something like this:
“Sorry, no, it’s not plausible
that anyone could have mixed up
a formula saying 2x+2y with a formula saying 2x+y,
whatever the motivations might have been.”
As a starting point for understanding what happened,
think about schoolchildren in math class facing a word problem:
There is a fence to paint.
Alice would take 120 minutes to paint the fence.
Bob would take 240 minutes to paint the fence.
How long would it take Alice and Bob
to paint the fence together?
The approved answer in school
says that Alice paints 1/120 of the fence per minute,
and Bob paints 1/240 of the fence per minute,
so together they paint 1/120 + 1/240 = 1/80 of the fence per minute,
so it takes them 80 minutes to paint the fence.
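The approved calculation can be replayed exactly with rational arithmetic:

```python
from fractions import Fraction

alice_rate = Fraction(1, 120)        # fence per minute
bob_rate = Fraction(1, 240)          # fence per minute
together = alice_rate + bob_rate     # combined rate
minutes = 1 / together               # minutes for one fence

assert together == Fraction(1, 80)
assert minutes == 80
```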
The real answer could be more complicated
because of second-order effects.
Probably Alice and Bob working together
are getting less tired
than Alice or Bob working alone for longer would have.
In the opposite direction,
maybe there’s a slowdown because Alice and Bob
enjoy each other’s company
and pause for a coffee.
Schoolchildren often give answers such as
240 − 120 = 120,
or 120 + 240 = 360,
or (120 + 240)/2 = 180.
These children are just manipulating numbers,
not thinking through what the numbers mean.
Two disciplines for catching errors.
In later years of education,
physics classes
teach students a type-checking discipline
of tracking units with each number.
Here are examples of calculations following this discipline:
- Dividing “1 fence” by “120 min”
gives “0.00833 fence/min”.
- Adding “0.00833 fence/min”
to “0.00417 fence/min”
gives “0.01250 fence/min”.
- Taking the reciprocal gives “80.0 min/fence”.
The same discipline wouldn’t let you add,
for example, “1 fence” to “120 min”:
the units don’t match.
This discipline avoids many basic errors.
On the other hand,
it still allows, e.g., the mistake of adding
“120 min” to “240 min” to obtain “360 min”.
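As an illustration, the type-checking discipline can be sketched as a tiny unit-tracking class (a minimal illustrative sketch, not anyone's real library) whose addition refuses mismatched units:

```python
from fractions import Fraction

class Quantity:
    """Minimal sketch of physics-style unit tracking (illustrative only)."""
    def __init__(self, value, units):
        self.value = Fraction(value)
        self.units = {u: e for u, e in units.items() if e}  # e.g. {'fence': 1, 'min': -1}
    def __add__(self, other):
        if self.units != other.units:        # type-check: units must match
            raise TypeError('units mismatch: cannot add')
        return Quantity(self.value + other.value, self.units)
    def reciprocal(self):
        return Quantity(1 / self.value, {u: -e for u, e in self.units.items()})

# Alice paints 1/120 fence/min, Bob 1/240 fence/min; together: 80 min/fence.
alice = Quantity(Fraction(1, 120), {'fence': 1, 'min': -1})
bob = Quantity(Fraction(1, 240), {'fence': 1, 'min': -1})
total = (alice + bob).reciprocal()
assert total.value == 80 and total.units == {'fence': -1, 'min': 1}

# Adding "1 fence" to "120 min" is rejected by the type-check:
try:
    Quantity(1, {'fence': 1}) + Quantity(120, {'min': 1})
    raise AssertionError('should have been rejected')
except TypeError:
    pass

# But "120 min" + "240 min" = "360 min" still type-checks,
# even when adding those times is the wrong semantics:
assert (Quantity(120, {'min': 1}) + Quantity(240, {'min': 1})).value == 360
```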
What catches this mistake
is a discipline stronger than tracking units:
namely, tracking semantics.
The numbers have meanings.
They’re quantitative features of real objects.
For example,
80 minutes
is the total time for Alice and Bob
to paint the fence
when Alice is painting part of the fence
and Bob is painting part of the fence.
That’s what
the question asked us to calculate.
A different question would be
the total time for Alice to paint the fence
and then for Bob to repaint the same fence.
This would be 120 minutes plus 240 minutes.
Yet another question would be
the total time
for Alice to paint the fence,
and then for Bob to wait for the coat
of paint to dry,
and then for Bob to apply a second coat.
Answering this would require more information,
namely the waiting time.
All of these questions make sense.
They pass type-checking.
But their semantics are different.
Alice and Bob tally the costs of an attack.
Alice and Bob have finished painting
and are now discussing the merits
of different encryption systems.
They’d like to make sure that
breaking whichever system they pick
is at least as hard as searching for an AES-128 key.
They’ve agreed that searching for an AES-128 key
costs slightly above 2^140 bit operations.
Alice and Bob are broadcasting their discussion
for anyone who’s interested.
Let’s listen to what they’re saying:
- Alice:
“Hmmm,
there are a bunch of sources saying
that the XYZZY attack algorithm uses 2^80 iterations
to break this particular cryptosystem.
It’s worrisome that this number is so low.
What else do we know
about the cost of the attack?”
- Bob:
“I found a source saying
that there are actually extra factors
in the iteration count,
and estimating that
the XYZZY attack uses 2^95 iterations.”
- Alice:
“Here’s another source
looking at the details
of the computations inside each iteration,
and estimating that those computations
cost 2^25 bit operations.”
- Bob:
“There’s also a gigantic array being accessed.
Here’s a source
estimating that the memory access
inside each iteration
is as expensive as 2^35 bit operations.”
- Alice:
“Okay, let’s review.
The best estimate available
for the total cost of each iteration in the XYZZY attack
is around 2^35 bit operations.
A tiny part of that is
2^25 bit operations for computation.
The main cost is the equivalent of
2^35 bit operations for the memory access.”
- Bob:
“Agreed.
Multiplying
2^95 iterations
by
2^35 bit operations per iteration
gives us a total of 2^130 bit operations.
Doesn’t meet the security target.”
- Alice:
“Right,
that’s a thousand times easier than AES-128 key search.
Let’s move on to the next cryptosystem.”
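Alice and Bob's tally can be replayed directly with exact integers:

```python
import math

iterations = 2**95
per_iteration = 2**25 + 2**35     # computation plus memory access
total = iterations * per_iteration

# 2^25 is a tiny part of the per-iteration cost, so the total is ~2^130:
assert 130 < math.log2(total) < 130.01
# About a thousand times easier than the ~2^140 AES-128 target:
assert 2**140 // total > 900
```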
How to botch the tally of costs.
Imagine a government agency
that has also
been looking at this particular cryptosystem,
but with one critical difference:
the agency is desperate to say
that this cryptosystem is okay.
How does the agency deal with the XYZZY attack?
One answer is to aim
for a lower security goal,
hyping the cost of carrying out 2^130 bit operations.
For comparison,
Bitcoin mining
did only about 2^111 bit operations in 2022.
(“Only”!)
But let’s assume that
the agency has promised the world
that it will reach at least
the AES-128 security level.
What does the agency do?
Here’s an idea.
For the costs per iteration,
instead of adding 2^25 for computation to 2^35 for memory access,
how about multiplying 2^25 for computation by 2^35 for memory access?
The product is 2^60.
Multiplying this by 2^95 iterations
gives 2^155, solidly above 2^143.
Problem solved!
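The botched tally is just as easy to replay, and the contrast with the correct tally is stark:

```python
# Correct: add the per-iteration costs, then multiply by iterations.
correct = 2**95 * (2**25 + 2**35)    # about 2^130: below the target
# Botched: multiply the two per-iteration costs, then by iterations.
botched = 2**95 * (2**25 * 2**35)    # exactly 2^155

assert botched == 2**155
assert botched // correct > 2**24    # inflated by a factor in the millions
assert correct < 2**143 < botched    # only the botched number "passes"
```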
How discipline catches the error.
Alice and Bob are correctly tracking
the semantics of each number.
The agency isn’t.
The total attack cost is the number of iterations
times the cost per iteration.
Each iteration incurs
- cost for computation, estimated as 2^25 bit operations, and
- cost for memory access, estimated to be as expensive as 2^35 bit operations.
The agency’s multiplication of these two costs
makes no sense,
and produces a claimed per-iteration cost that’s millions of times larger
than the properly estimated per-iteration cost.
This multiplication is so glaringly wrong
that it doesn’t even pass
physics-style type-checking.
Specifically,
multiplying
“2^25 bitops/iter”
by
“2^35 bitops/iter”
doesn’t give
“2^60 bitops/iter”.
It gives
“2^60 bitops^2/iter^2”.
Multiplying further by “2^95 iter”
doesn’t give
“2^155 bitops”;
it gives
“2^155 bitops^2/iter”.
Agency desperation strikes back.
How can the agency
phrase this nonsensical calculation of a severely inflated security estimate
in a way that will pass superficial review?
The goal here is for the 155 to sound
as if it’s simply putting together
numbers from existing sources.
For example:
- Here’s a source estimating
an iteration count of 2^95.
- Here’s a source estimating
2^25 bit operations per iteration.
- Here’s a source estimating
that accounting for memory
multiplies costs by 2^35.
- 95 plus 25 plus 35 is 155,
solidly above 143.
The deception here is in the third step,
the step that leaps from
cost 2^25 per iteration
to cost 2^60 per iteration.
How many readers are going to check
the third source
and see that it was actually estimating
cost 2^35 per iteration?
Streamlining the marketing.
The wrong calculation sounds even simpler
if there’s a previous source
that has already put the 2^95
and the 2^25 together:
- Here’s a source estimating
2^120 bit operations.
- Here’s a source estimating
that accounting for memory
multiplies costs by 2^35.
- 120 plus 35 is 155,
solidly above 143.
At this point the agency
has completely suppressed
any mention of iterations,
despite the central role of iterations
in the attack and in any competent analysis of the attack.
How many readers are going
to check both sources,
see that
the second source
estimates cost 2^35 per iteration,
and see that
the iteration count in the first source
is far below 2^120?
Kyber’s limited selection of security levels.
You might be thinking something like this:
“Okay, sure, I see how it would be possible for a desperate agency
to replace cost addition with a nonsensical multiplication,
replacing 2^130 with a fake 2^155,
while at the same time making this hard for people to see.
But why would anyone have wanted to play this risky game?
If Kyber-512 was around 2^130,
and the target was a little above 2^140,
why didn’t they just bump up the parameters to 10% higher security,
something like Kyber-576?”
This is an obvious question given that
RSA and ECC and (to take some post-quantum examples) McEliece and NTRU
naturally support whatever size you want.
A long, long time ago,
I wrote
fast software
for the NSA/NIST P-224 elliptic curve,
and then found a
better curve
at that security level,
namely
y^2 = x^3 + 7530x^2 + x mod 2^226−5.
But then I decided that bumping the size up
to 2^255−19
would be much more comfortable, so I did.
Kyber is different.
You can’t just bump up Kyber’s parameters to 10% higher security:
- Kyber-576 doesn’t exist.
If you want something stronger than Kyber-512
then you have to increase the “dimension” by 50%,
jumping all the way up to Kyber-768.
- If you want something stronger than Kyber-768
then you have to jump all the way up to Kyber-1024.
- If you want something stronger than Kyber-1024
then, sorry, tough luck.
One of the “unique advantages of Kyber”
specifically advertised in the
official Kyber documentation
is that implementing a “dimension-256 NTT”
handles “all parameter sets” for Kyber
(emphasis in original).
This “NTT” isn’t something optional for Kyber implementors;
it’s baked into the structure of Kyber’s public keys and ciphertexts.
Using dimensions that aren’t multiples of 256
would require changing the core Kyber design.
The same Kyber “advantage” also means that going beyond 1024
would lead to performance issues and,
more importantly,
security issues surrounding occasional
“decryption failures” forced by the prime baked into the NTT.
Avoiding this would again
require changing the core Kyber design.
For comparison,
NTRU options targeting higher security levels—including
simple proofs that there are no decryption failures—are readily available.
For example, one of the
NTRU Prime
options is sntrup1277.
But let’s assume that NIST doesn’t care about Kyber’s limitations at the high end.
Let’s instead focus on the low end,
specifically on applications that
have limited sizes for public keys and/or ciphertexts
and thus can’t use the highest available security levels.
An application limited to 1KB
can’t use Kyber-768 (1184-byte public keys, 1088-byte ciphertexts).
The highest-security Kyber option for that application
is Kyber-512 (800-byte keys, 768-byte ciphertexts).
The same application obtains
higher security with NTRU,
according to a security-estimation mechanism called “Core-SVP”.
For example, the application can use
- sntrup653
(994-byte keys, 897-byte ciphertexts),
where the Core-SVP security estimate is 2^129,
or
- NTRU-677 (ntruhps2048677, 931-byte keys, 931-byte ciphertexts),
where Core-SVP is 2^145,
while the current version of Kyber-512,
starting with the round-3 version from 2020,
has Core-SVP just 2^118.
Is this “Core-SVP” something I made up to make Kyber look bad?
Absolutely not:
- Core-SVP is the security-estimation mechanism
that was chosen by the Kyber team
to estimate security levels
in its round-1 and round-2 submissions.
The mechanism was introduced by Kyber’s predecessor, NewHope.
- In 2020,
after I expressed
skepticism
about whether Core-SVP
“gets the right ordering of security levels”,
NIST
stated that
“we feel that the CoreSVP metric does
indicate which lattice schemes are being more and less
aggressive in setting their parameters”.
NIST’s official
round-2 report
in 2020
used Core-SVP for comparisons.
- The original definition of Core-SVP assigns
2^112 to the round-3 version of Kyber-512.
Round-3 Kyber
switched
to a new definition of Core-SVP
that increases Kyber’s Core-SVP
(without changing anything for NTRU).
This blog post has bigger fish to fry,
so let’s blindly accept Kyber’s
claim that the new definition is better,
meaning that Kyber-512 has Core-SVP 2^118.
That’s still clearly worse than the
2^129 for sntrup653
and the 2^145 for NTRU-677.
It’s not that Kyber’s competitors
always beat Kyber in size-security tradeoffs.
For example,
if an application instead has a limit of 1184 bytes,
then it can use Kyber-768, which has Core-SVP 2^181,
while ntruhps
needs 1230 bytes to reach Core-SVP 2^179.
But Kyber’s competitors
often beat Kyber in size-security tradeoffs.
Throwing away Kyber-512,
leaving just Kyber-768 and Kyber-1024,
means that Kyber has nothing as small as the 931 bytes for NTRU-677.
The normal way for scientists to present quantitative tradeoffs is with scatterplots,
such as Figure 3.5 in my 2019 paper
“Visualizing size-security tradeoffs for lattice-based encryption”.
The particular scatterplot shown here is
Figure 7.3 in the 2021 paper
“Risks of lattice KEMs”
from the NTRU Prime Risk-Management Team.
The vertical axis is the Core-SVP security estimate,
and the horizontal axis is ciphertext bytes.
The scatterplot shows that
Kyber has a higher Core-SVP than NTRU
for applications with a size limit of,
e.g., 768 bytes or 1088 bytes.
But NTRU has a higher Core-SVP than Kyber
for applications with a size limit of,
e.g., 700 bytes or 1024 bytes or 2048 bytes.
Kyber has nothing as small as the 699-byte option for NTRU.
Kyber also has nothing as strong as the 1842-byte option for NTRU.
NTRU is also trivially capable of adding further options
between and beyond what’s shown in the graph,
whereas for Kyber this is more problematic.
Official evaluation criteria for the competition.
NIST had issued an
official call
for post-quantum proposals in 2016.
One of the evaluation criteria in the call was as follows:
Assuming good overall security and performance, schemes with greater
flexibility will meet the needs of more users than less flexible schemes,
and therefore, are preferable.
One of the official examples given for “flexibility”
was that it is
“straightforward to customize the scheme’s parameters to meet a
range of security targets and performance goals”.
The call proposed five broad security “categories”,
and said that submitters could specify even more than
five parameter sets to demonstrate flexibility:
Submitters may also provide more than one parameter set in the same
category, in order to demonstrate how parameters can be tuned to offer
better performance or higher security margins.
In 2020,
NIST eliminated NewHope.
One of the reasons stated
in the aforementioned
round-2 report
was that
“KYBER naturally supports a category 3 security strength parameter set,
whereas NewHope does not”.
NewHope offered only NewHope-512 and NewHope-1024.
Imagine Kyber similarly offering only Kyber-768 and Kyber-1024,
acknowledging that Kyber-512 doesn’t meet the minimum security level specified by NIST.
It’s then very easy to see how limited Kyber’s flexibility is
compared to NTRU’s broader, denser spectrum of security levels.
How, then, would NIST argue that Kyber is the best option?
One answer is that the evaluation criteria say more flexibility is preferable
only assuming “good overall security and performance”.
But how would NIST argue that NTRU doesn’t have “good overall security and performance”?
Regarding the security of Kyber and NTRU,
NIST’s official 2022
selection report
says that NIST is
“confident in the security that each provides”.
The report describes MLWE, the problem inside Kyber,
as “marginally more convincing” than the problem inside NTRU.
There’s much more that could and should have been said about
the security comparison between Kyber and NTRU:
- Kyber’s use of modules,
despite being portrayed as purely having a (marginal) security benefit,
also introduces
extra subfields
into the cryptosystem structure,
creating security risks
analogous to the risks of taking extra subfields in pre-quantum DH.
Fewer extra subfields appear in NTRU (depending on parameters) than in Kyber.
NTRU Prime completely avoids extra subfields.
- Kyber’s QROM IND-CCA2 proof assuming MLWE hardness is much looser
than NTRU’s QROM IND-CCA2 proof assuming hardness of the problem inside NTRU.
In other words,
even under the assumption that MLWE is as strong as the problem inside NTRU,
Kyber could be much weaker than NTRU.
- NIST could have told people to use NTRU
shortly after its deadline for NISTPQC input in 2021.
Instead it delayed for three quarters of a year to carry out patent negotiations,
and ended up telling people
to wait for its Kyber patent license to activate in 2024,
giving away three years of user data to attackers.
Picking Kyber was doing obvious damage to security
given the patent situation.
The situation isn’t that NTRU avoids every security risk of Kyber.
A careful comparison
finds mathematical security risks in both directions.
Maybe there’s a way to argue that the mathematical security risks for NTRU
should be given higher weight than the mathematical security risks for Kyber.
But the immediate choice that NIST was facing in 2021 between NTRU and Kyber,
assuming that the attackers currently recording user data
will have quantum computers in the future, was between
- the security risks of NTRU and
- the guaranteed security failure of not yet deploying anything.
The call for submissions said
“NIST believes it is critical that this process leads to cryptographic
standards that can be freely implemented in security technologies and products”.
Nothing else in the call was labeled as “critical”.
How could NIST ignore the damage that it was doing in not going ahead with NTRU?
NIST knew it didn’t have a patent license signed for Kyber yet,
let alone an activated patent license.
Anyway,
let’s get back to the question of how NIST might be able to argue
that NTRU doesn’t have “good overall security and performance”.
A report saying that NIST is
“confident in the security that each provides”
is obviously not claiming that NTRU doesn’t have “good overall security”.
What about performance?
The same selection report admits that
“the overall performance of any of these KEMs
would be acceptable for general-use applications”.
If the objective is to use performance differences as a deciding factor
between two acceptable options,
let’s see how Kyber would stack up without Kyber-512:
- Kyber-768 and Kyber-1024 provide size-security tradeoffs that NTRU doesn’t match.
- NTRU-677 and NTRU-1229
provide size-security tradeoffs that Kyber doesn’t match.
Even more options are already implemented for NTRU Prime.
- The smallest options are from NTRU, not Kyber.
- The highest-security options are from NTRU, not Kyber.
This is a solid case for eliminating Kyber in favor of NTRU,
given NIST’s declaration that there can be only one.
(If NIST thought that performance differences at this scale matter,
and if the best performance comes from Kyber at some security levels
and NTRU at other security levels,
then why wasn’t NIST allowing both?
Answer:
The movie says there can be only one!
STOP ASKING QUESTIONS!)
Tilting the competition, part 1: ignoring NTRU’s extra flexibility.
Keeping Kyber-512 changes the competition.
Having three options, Kyber-512 and Kyber-768 and Kyber-1024,
looks a lot better than having just two.
There are four NTRU circles in the first scatterplot above,
namely NTRU-509 and NTRU-677 and NTRU-821 and NTRU-1229.
But NTRU-821 isn’t a winner,
and earlier in NISTPQC there wasn’t an NTRU-1229.
Wait a minute.
The NTRU literature has always made clear that NTRU supports many more options.
For example,
here’s a scatterplot
from John Schanck’s 2018 paper
“A comparison of NTRU variants”.
There are a huge number of dots;
each dot is showing another NTRU option.
One of the bizarre twists in NISTPQC
was the following announcement
from NIST in 2020:
“NIST believes that too many parameter sets make
evaluation and analysis more difficult.”
I asked
various questions
about this, starting as follows:
How many is “too many”? How did flexibility, which was portrayed as
purely positive in the call for proposals, turn into a bad thing for
NIST? The call for proposals explicitly allowed multiple parameter sets
per category, never suggesting that this would be penalized!

NIST’s latest report complains about NewHope’s lack of flexibility to
use dimensions strictly between 512 and 1024. If a submission team is
thinking “Aha, Kyber similarly suffers from its lack of flexibility to
target security levels strictly between maybe-2^128 and maybe-2^192, and
we can clearly show this to NIST by selecting parameter sets at several
intermediate security levels”, then isn’t this something NIST should be
interested in, rather than discouraging by making submitters worry that
this is “too many parameter sets”?
NIST never replied.
Think about what this is like for
submitters trying to figure out what to do:
- The official evaluation criteria say flexibility is good.
- A high-profile submission has just been eliminated,
in part for having only two parameter sets.
- So, okay, implement more parameter sets
to demonstrate flexibility.
- But, yikes, NIST is suddenly going out of its way
to criticize “too many” parameter sets.
They won’t say what “too many” means
and where this criticism came from.
NTRU Prime
moved up to selecting six sntrup parameter sets
(plus six ntrulpr parameter sets, which, compared to sntrup,
have larger ciphertexts but smaller public keys),
enough that the flexibility advantage over Kyber should have been impossible to ignore.
NIST ignored it.
Tilting the competition, part 2: exaggerating and hyping key-generation costs.
For Intel’s recent Golden Cove microarchitecture
(the “performance” cores in Alder Lake CPUs),
https://bench.cr.yp.to
reports that
- Kyber-512 takes 25829 cycles for encapsulation
and 20847 cycles for decapsulation,
while
- NTRU-509 takes just 15759 cycles for encapsulation
and 25134 cycles for decapsulation.
The total cycle count for handling a ciphertext,
the total of encapsulation and decapsulation,
is 13% smaller for NTRU-509 than for Kyber-512.
NTRU-509 also beats Kyber-512 in ciphertext size.
NTRU-509 is the leftmost dot in the first scatterplot above,
meaning smallest ciphertexts.
On the other hand,
NTRU-509 takes 112866 cycles for key generation
while Kyber-512 takes only 17777 cycles.
The total of key generation plus encapsulation plus decapsulation
is more than twice as large for NTRU-509 as for Kyber-512.
When some factors favor one option and some factors favor another option,
someone objectively searching for the best option
will think about what weight to put on each factor.
Here are three reasons that a careful performance analysis
will put very low weight on Kyber’s key-generation speedup:
- There’s overwhelming evidence that these cycle counts
are far less important than byte counts.
A useful rule of thumb is that sending or receiving a byte
has similar cost to 1000 cycles;
see Section 6.6 of the aforementioned paper
“Risks of lattice KEMs”.
Sending a key, receiving a key, sending a ciphertext, and receiving a ciphertext
involves thousands of bytes, similar cost to millions of cycles.
- All of these KEMs are designed to allow a key to be reused for many ciphertexts.
If an application actually cares about the cost of key generation
then this reuse is an obvious step to take.
NIST’s
official evaluation criteria
already acknowledged the possibility
that “applications can cache public keys,
or otherwise avoid transmitting them frequently”.
Many applications are naturally reusing keys in any case.
- Even in the extreme case of an application that structurally has to use
a new key for each ciphertext,
there’s a trick due to Montgomery that makes NTRU key generation much faster.
Billy Bob Brumley, Ming-Shing Chen, Nicola Tuveri, and I
have a paper
“OpenSSLNTRU: Faster post-quantum TLS key exchange”
at USENIX Security 2022
giving a web-browsing demo on top of TLS 1.3 using sntrup761
with Montgomery’s trick for key generation.
We already had the paper and code online in 2021,
before NIST’s deadline for input regarding NISTPQC decisions.
In other words:
If an average key is used for just 100 ciphertexts
then Kyber-512 saving 95089 Golden Cove cycles in key generation is
- of similar importance to changing ciphertext size by a fraction of a byte;
- 6x less important
than NTRU-509 saving 5783 cycles per ciphertext;
and
- not what will happen in applications trying to optimize key-generation time,
since in NTRU’s case those applications will use Montgomery’s trick.
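The arithmetic behind these bullet points, using the Golden Cove cycle counts above and the ~1000-cycles-per-byte rule of thumb (a quick replay of the comparison, not a benchmark):

```python
keygen_saving = 112866 - 17777            # Kyber-512's keygen advantage: 95089 cycles
per_ciphertext = keygen_saving / 100      # amortized over 100 ciphertexts per key
byte_equivalent = per_ciphertext / 1000   # rule of thumb: ~1000 cycles per byte

# NTRU-509's advantage in encapsulation+decapsulation per ciphertext:
ntru_saving = (25829 + 20847) - (15759 + 25134)   # 5783 cycles

assert keygen_saving == 95089
assert byte_equivalent < 1                        # a fraction of a byte
assert round(ntru_saving / per_ciphertext) == 6   # ~6x less important
```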
With this in mind,
let’s look at the “Kyber vs NTRU vs Saber” slide
from NIST’s March 2022 talk
“The beginning of the end: the first NIST PQC standards”.
The eye is immediately drawn to the larger red bars on the right.
NTRU appears in two of the groups of bars,
in both cases with clearly larger bars,
meaning worse performance.
The main message NIST is communicating here is that
NTRU costs strikingly more than Kyber and Saber.
Only a small part of the audience
will go to the effort of checking the numbers
and seeing how NIST manipulated
the choices in its presentation to favor Kyber over NTRU:
- The graph gives 100% weight to key generation,
utterly failing to account for key reuse.
- The graph also
utterly fails to account for Montgomery’s trick.
- The graph does include some recognition of communication costs,
but even here NIST couldn’t resist tweaking the numbers:
“1000*(PK+CT)” counts Alice’s cost while omitting Bob’s cost.
Regarding the last point:
1000 is just a rule of thumb.
NIST could have posted a rationale for a proposal to use 500
and asked for public comments.
But it didn’t.
NIST’s
secret October 2021 Kyber-SABER-NTRU comparison
claimed, without citation, that I had said 1000*(PK+CT) was reasonable.
Compare this to what I had
actually written
in 2019