Skip to content

MLKEM-768 with AWS-LC#14598

Open
DarkaMaul wants to merge 32 commits intopyca:mainfrom
trail-of-forks:dm/mlkem-768
Open

MLKEM-768 with AWS-LC#14598
DarkaMaul wants to merge 32 commits intopyca:mainfrom
trail-of-forks:dm/mlkem-768

Conversation

@DarkaMaul
Copy link
Copy Markdown
Contributor

Add the support for ML-KEM using AWS-LC as the backend.

The PR is massive (sorry), about 1000 lines, but I could not reduce it more.

It follows closely the pattern of MLDSA, with the latest changes (class cases, variant enum)

A following PR with the documentation will be opened once this one get merged.

(Note: until aws/aws-lc#3140 is fixed, we have to store the ML-KEM seed on our side)

return rust_openssl.mlkem.from_mlkem768_public_bytes(data)

@abc.abstractmethod
def encapsulate(self) -> tuple[bytes, bytes]:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@reaperhulk WDYT?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This matches the go API

Comment on lines +143 to +145
// ML-KEM returns the seed rather than a PKey because AWS-LC's
// kem_priv_encode writes the expanded key (2,400 bytes) into PKCS#8,
// not the seed.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this comment.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated it to clarify it.
I wanted to say that we can't use OpenSSL PKey construct here, so we need a second element in our ParsedPrivateKey enum.

}

pub fn serialize_private_key(key: &ParsedPrivateKey) -> crate::KeySerializationResult<Vec<u8>> {
let ParsedPrivateKey::Pkey(pkey) = key;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you refactor this out into its own PR for readability

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open as #14607

}

extern "C" {
fn EVP_PKEY_kem_new_raw_public_key(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need to declare these, they should just be available in ffi::

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, only EVP_PKEY_keygen_deterministic is needed because it is still in experimental/ in aws-lc.

IIUC, this is Algorithm 16 in FIPS 203.

Using this function allows us to use the seed form of the private key.

format: crate::serialization::PrivateFormat,
encryption_algorithm: &pyo3::Bound<'p, pyo3::PyAny>,
) -> CryptographyResult<pyo3::Bound<'p, pyo3::types::PyBytes>> {
if !encryption_algorithm.is_instance(&types::KEY_SERIALIZATION_ENCRYPTION.get(py)?)? {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did we end up needing to re-implement a ton of the utility function whereas that wasn't required for MLDSA?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem here is that we can't represent the private key as an openssl PKey (AWS-LC's kem_priv_encode writes the expanded key into PKCS#8, not the seed), so we can't use pkey_private_bytes here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have another solution by refactoring pkey_private_bytes to:

  • accept the new enum ParsedPrivateKey as argument
  // Replace this line with a match {}
        let raw_bytes = pkey.raw_private_key()?;
        return Ok(pyo3::types::PyBytes::new(py, &raw_bytes));
  • Revert
        let parsed = cryptography_key_parsing::ParsedPrivateKey::Pkey(pkey.to_owned());
  • Gate before if format == PrivateFormat::TraditionalOpenSSL { to ensure we're not reaching this for MLKem keys

The main downside is that the refactor modifies every caller of the function from &slf.borrow().pkey, to ParsedPrivateKey::Pkey(slf.borrow().pkey.to_owned())

That would avoid the deduplication here (and the need for extensive tests for each variant of the encryption part)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the aws-lc change to store the seed on the pkey lands, we'll need neither of these changes, right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct! Once we can represent MLKEM's key as PKey, all this complexity goes away

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a comment clearly indicating this (with a link to the issue/PR)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Landed as aws/aws-lc#3149

In this case, I think it's better to wait for a new version of AWS-LC, so we remove this complexity?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we think that'll be in a release soon (<10 days), then waiting seems ok. If not, landing and then following up with the cleanup seems preferable.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

v1.72.0 has been released !
I'll updated my PR!

This was referenced Apr 8, 2026
# Conflicts:
#	src/cryptography/hazmat/primitives/asymmetric/types.py
}

extern "C" {
fn EVP_PKEY_keygen_deterministic(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment that this is needed because its in the expiremental header

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(The review comment got displaced by GitHub UI, but a comment was added Line 42 )

#[cfg(CRYPTOGRAPHY_IS_AWSLC)]
cryptography_key_parsing::ParsedPrivateKey::MlKem(variant, seed) => {
let pkey = cryptography_openssl::mlkem::new_from_seed(variant, &seed)?;
Ok(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should probably match on variant here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated!

Comment on lines +369 to +370
let pub_len = pkey.raw_public_key()?.len();
assert_eq!(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do the match pattern here (like mldsa) so it's ready for other variants

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated!

format: crate::serialization::PrivateFormat,
encryption_algorithm: &pyo3::Bound<'p, pyo3::PyAny>,
) -> CryptographyResult<pyo3::Bound<'p, pyo3::types::PyBytes>> {
if !encryption_algorithm.is_instance(&types::KEY_SERIALIZATION_ENCRYPTION.get(py)?)? {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a comment clearly indicating this (with a link to the issue/PR)

ss_n = key.decapsulate(binascii.unhexlify(vector["ct_n"]))
assert ss_n == binascii.unhexlify(vector["ss_n"])

@pytest.mark.parametrize(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like a test or two that load a specific key and asserts about the parsed results.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a test for the serialization, but I don't think we can test reliably the encryption part.

(Added as part of this PR, but it contains two new test vectors. If you agree with the approach, I'll move them in a separate PR)

skip_message="Requires a backend with ML-KEM-768 support",
)
@wycheproof_tests("mlkem_768_encaps_test.json")
def test_mlkem768_encaps(backend, wycheproof):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have private key to decap and verify it round trips?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is covered by the test_kat_vectors in tests/hazmat/primitives/test_mlkem.py.
We decapsulate there the ciphertext ct, and verify that the shared secret is correct .

But I think I'm not understanding your comment here.

}

pub fn encapsulate(
variant: MlKemVariant,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to pass variant, or can we extract it from pkey? (I want to avoid them not being consistent with each other)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to not use the variant here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants