Hacking a CTF: Do not use ECB mode for encryption (znano.eu.org)
42 points by znano on Jan 5, 2024 | hide | past | favorite | 58 comments


There is an AES secure mode of operation that is almost on the level of ECB in terms of performance: OCB[1]. OpenSSL has support for it and all patents have expired.

    The 'numbers' are in 1000s of bytes per second processed.
    type                16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
    AES-256-GCM         655193.56k  1747223.44k  3399072.36k  4490100.61k  5033129.30k  5108596.96k
    AES-256-OCB         631357.15k  2268916.74k  4794610.30k  6492985.36k  7174274.14k  7301540.60k
    AES-256-ECB         997960.96k  3972424.35k  8096120.70k  8105542.89k  8179659.94k  8188882.64k
[1] <https://en.wikipedia.org/wiki/OCB_mode>, <https://competitions.cr.yp.to/round3/ocbv11.pdf>


In the main, people don't use ECB for performance; they use it because it doesn't require any library support and it doesn't require you to understand what a "cipher mode" is --- it's just the cipher applied directly to the plaintext, which, if you don't have any cryptography domain knowledge, seems like a totally sane thing to do. I'm an advocate for calling ECB "the default mode".
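To make the leakage concrete, here is a stdlib-only Python sketch. It is not a real cipher (a hash is not invertible), just a keyed deterministic block map standing in for AES, which is enough to show ECB's defining problem: identical plaintext blocks produce identical ciphertext blocks.

```python
import hashlib

def toy_ecb_encrypt(key: bytes, plaintext: bytes, block_size: int = 16) -> bytes:
    """Illustrative stand-in for ECB: each block is transformed
    independently and deterministically under the key. (A hash is not
    invertible, so this is not a usable cipher; it only demonstrates
    the structural leakage.)"""
    assert len(plaintext) % block_size == 0
    out = b""
    for i in range(0, len(plaintext), block_size):
        block = plaintext[i:i + block_size]
        out += hashlib.sha256(key + block).digest()[:block_size]
    return out

key = b"sixteen byte key"
# Two identical 16-byte blocks followed by a different one.
pt = b"ATTACK AT DAWN!!" * 2 + b"RETREAT AT DUSK!"
ct = toy_ecb_encrypt(key, pt)
blocks = [ct[i:i + 16] for i in range(0, len(ct), 16)]
print(blocks[0] == blocks[1])  # True: repeated plaintext is visible in ciphertext
print(blocks[0] == blocks[2])  # False
```

This block-repetition pattern is exactly what makes the famous ECB penguin image recognizable.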


"Insecure mode" sounds a lot better than "default mode". If I didn't know what any of the options meant, I'd feel safe using BlockCipherMode.Default, but I wouldn't feel safe using BlockCipherMode.Insecure.


Calling it the default mode sounds a bit risky, if one assumes that defaults are secure.


The implication, of course, is that the defaults in cryptography are rarely secure.


That seems like something we should try to fix, rather than lean into.


Good luck!

The reality is that this was a lot more relevant 10 years ago; today, new code is much more likely to use GCM or Chapoly than to generically compose a cipher mode with (hopefully) an authenticator.


Cryptography libraries etc. are moving towards exposing AEAD modes by default, so doesn't that make it even more weird to call ECB a default mode?


It does. I'm not saying it's not a weird me thing. It was a better rhetorical dunk back when people were rolling their own message encryption; then it served as a kind of warning that if you just took the library defaults and the simplest path to getting your code working, you'd end up with something totally insecure.


Is there a legitimate reason to use it? Or is this a backward compatibility thing that nobody has made a fuss about?

Sounds more like a foot gun to me.


Perhaps call it the dumb mode, as in it's not doing anything more clever than straight application, and it's dumb to use it like that...


It's also more resilient to partial transmission errors (that's just a different way of describing the same property that is its security weakness).


That's Applied Cryptography logic. Applied Cryptography allocates a lot of space to discussing error recovery in ciphers, a concept that has essentially no place whatsoever in modern cryptography --- if you're worried about transmission errors, solve that problem at a higher level with an error correcting code, or send smaller messages so your system can tolerate the loss of messages.


naive mode


Exciting to hear that about OCB -- I hadn't been keeping track of the expiration. It's great that there's now a widely-usable, single-pass, block-cipher-only authenticated encryption mode.


I really don't understand why people don't just use a counter-based cipher. Like whichever one is the one where you start with a random IV, generate an infinite pseudorandom string from it, xor it with your data & its hash. Bonus points if you do this with multiple ciphers and xor them. Why is this not popular?


> Why is this not popular?

If you meant using a counter-based cipher: it is. GCM is the most popular mode on the modern internet, for example.

If you meant using multiple ciphers: because it adds overhead without adding security.


Of note: cascading ciphers is, in general, at most as secure as the first cipher. The rest add nothing. For additive stream ciphers (like CTR mode) it's as secure as the most secure cipher instead.


I think you made a typo, "at most as secure" should be "at least as secure" :)

But you're right, the important takeaway here is that encrypting with a weak and a strong cipher sequentially may be less secure than just using the strong cipher. For those to which this seems counterintuitive, here is the paper that shows this result: https://link.springer.com/article/10.1007/BF02620231
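The intuition for the additive-stream-cipher case can be shown in a few lines of Python (random bytes stand in for the two keystreams): XOR layers commute, so even if an attacker fully breaks one cipher and recovers its keystream, peeling it off leaves exactly the strong cipher's ciphertext.

```python
import os

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

pt = b"top secret payload"
weak_ks = os.urandom(len(pt))    # keystream of a "broken" cipher: assume
strong_ks = os.urandom(len(pt))  # the attacker can recover weak_ks entirely

ct = xor(xor(pt, weak_ks), strong_ks)

# Stripping the broken layer just leaves the strong layer's ciphertext,
# so the cascade is exactly as secure as the strongest additive cipher.
assert xor(ct, weak_ks) == xor(pt, strong_ks)
```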


Would be a fun plot twist in a Bond-like movie. Baddies have used multiple encryption algorithms like an onion, which due to this issue can be broken rather easily by the good guys. "Good thing they didn't know about this issue and put the weak cipher first..."


Correct, should have been "at least".


How does it not add security? It forces attackers to break every cipher instead of just one. That doesn't add security?


In theory - yes, multiple ciphers are harder to break. In practice, we have zero examples of anyone breaking (good) modern ciphers, and lots of examples of people breaking clever constructions involving those ciphers.


The scheme you describe (as-is) is not authenticated, compared to something like OCB (mentioned in a sibling comment). Xoring multiple ciphers together doesn't achieve anything especially useful.


Are you saying the hash I mentioned doesn't prevent tampering?

And I thought xoring them forces attackers to have to break all of the ciphers instead of just one. That seems pretty useful.


If you hashed the plaintext, you leak information about the plaintext, and depending on the hash it's vulnerable to length extension attacks.

If you hashed the ciphertext then anyone can re-hash a modified version.

What you're looking for is an HMAC, which works, but it's usually slower than a block mode that integrates authentication functionality (like OCB, or GCM).

It's true that you'd have to break all the ciphers, but we're generally very confident about the security of our symmetric ciphers, so unless you're very paranoid it's just a waste of clock cycles.


I think you misunderstood what I wrote. What I said was you finish off by hashing the plaintext and xoring it with the final chunk of your random stream. So you're not sending over the hash itself, but the encryption of it.

I don't buy the confidence thing but sure.


A hash of the plaintext isn't an authenticator; it doesn't have a key, so anybody can compute the hash of an intended plaintext. The CTR cipher by itself is very malleable; if you know a run of plaintext characters, you can rewrite it to anything with XOR:

https://cryptopals.com/sets/4/challenges/26

You could CTR-encrypt a message, and then use a hash function in an HMAC construction. HMAC takes a secret key, so an attacker can't compute the HMAC of some other message and swap it in. CTR+HMAC is pretty common, though it's a design smell in a modern system, virtually all of which use AEADs rather than hand-rolled ("generic composition") systems.
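The malleability is easy to demonstrate with stdlib Python (random bytes stand in for an AES-CTR keystream; any additive stream cipher reduces to XOR with a keystream the attacker never sees). Knowing a run of plaintext is enough to rewrite it to anything of the same length:

```python
import os

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

keystream = os.urandom(32)  # stand-in for AES-CTR output; unknown to the attacker

plaintext = b"transfer $0000100 to account A!!"
ciphertext = xor(plaintext, keystream)

# Attacker knows (or guesses) the plaintext at some offset, but not the
# keystream. To replace `old` with `new`, XOR their difference into the
# ciphertext at that position.
old, new = b"$0000100", b"$9999999"
pos = plaintext.index(old)
delta = xor(old, new)
forged = (ciphertext[:pos]
          + xor(ciphertext[pos:pos + len(delta)], delta)
          + ciphertext[pos + len(delta):])

print(xor(forged, keystream))  # b"transfer $9999999 to account A!!"
```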


I'm not following unfortunately. To be clear, what I'm saying is ciphertext = Encrypt(plaintext || Hash(plaintext)) where Encrypt(data_stream)[i] = data_stream[i] XOR random_stream[i]. I'm not following how an attacker can tamper or decipher anything in the ciphertext here. The moment you tamper with any bit, the hash would become invalid. You can XOR it with whatever you want but you can't turn it into anything valid... right? What am I missing?


If you know the plaintext, you know the hash of the plaintext, because the hash doesn't have a key.

Encrypt() is CTR, so you can flip bits in ciphertext to alter the same bits in the plaintext.

You know the plaintext and the hash and you know where the hash is, so you can mechanically rewrite the hash to be whatever you need it to be.

You can prove this to yourself in Python or Ruby or whatever: just CTR-encrypt "We all live in a yellow submarine", along with its hash. Then bitflip "yellow" to "orange", compute the hash for "We all live in a orange submarine" (ignore the grammar), and bitflip that into the original hash.

There are formalisms for why hash-then-encrypt is unsafe that I'm not giving you here, because I'm not smart enough to pull them out of the top of my head, but the scenario I just described is already a game-over vulnerability in many settings; a similar fact pattern has broken lots of encrypted cookie schemes, in which you do in fact know the plaintext, and some bit of the plaintext (like "admin=n") happens to be inconvenient to attackers.
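Here is that exercise worked through in stdlib Python (random bytes again stand in for a CTR keystream). The attacker knows the plaintext, so they know its unkeyed hash too, and can bit-flip both the message and the hash into whatever they like. The receiver's hash check still passes:

```python
import hashlib
import os

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

msg = b"We all live in a yellow submarine"
blob = msg + hashlib.sha256(msg).digest()  # Encrypt(plaintext || Hash(plaintext))
keystream = os.urandom(len(blob))          # stand-in for a CTR keystream
ct = xor(blob, keystream)

# Attacker knows the plaintext (hence the unkeyed hash too), but not the key.
new_msg = b"We all live in a orange submarine"
new_blob = new_msg + hashlib.sha256(new_msg).digest()
forged = xor(ct, xor(blob, new_blob))      # flip exactly the differing bits

decrypted = xor(forged, keystream)
assert decrypted == new_blob               # the receiver's hash check passes
print(decrypted[:-32])                     # b"We all live in a orange submarine"
```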


> If you know the plaintext, you know the hash of the plaintext, because the hash doesn't have a key.

I actually was imagining the key would be used in the hash function (just like it's also used by the encryption function). It didn't occur to me to explicitly mention it, but I see why it can be interpreted differently otherwise. (Thanks for explaining/clarifying.)

If I did that though, wouldn't that take care of this?


You're gradually deriving MAC-then-Encrypt. Switch the two around: encrypt, and then MAC (use HMAC, not simply a keyed hash). Authenticate the ciphertext, which denies attackers any opportunity to alter the ciphertext bits. Now you have a construction that has a reasonably strong chance of being solid, assuming you manage the CTR nonce and counter safely. Because CTR nonces are such a reliable source of vulnerabilities in real software, go a step further and just use CBC mode, where a repeated/predictable IV is less devastating. Now you have a construction that you might credibly even prefer to GCM.
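A minimal sketch of the encrypt-then-MAC discipline, using the stdlib `hmac` module (the function names are illustrative, and a random keystream again stands in for AES-CTR; a real system would derive the encryption and MAC keys properly):

```python
import hashlib
import hmac
import os

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def seal(enc_keystream: bytes, mac_key: bytes, nonce: bytes, pt: bytes) -> bytes:
    # Encrypt first (keystream stands in for AES-CTR output)...
    ct = xor(pt, enc_keystream)
    # ...then MAC the nonce plus ciphertext under an independent key.
    tag = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag

def open_(enc_keystream: bytes, mac_key: bytes, blob: bytes) -> bytes:
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    expect = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    # Verify the tag BEFORE touching the ciphertext, in constant time.
    if not hmac.compare_digest(tag, expect):
        raise ValueError("bad tag")
    return xor(ct, enc_keystream)

stream, mac_key, nonce = os.urandom(32), os.urandom(32), os.urandom(16)
blob = seal(stream, mac_key, nonce, b"attack at dawn")
assert open_(stream, mac_key, blob) == b"attack at dawn"

# Any ciphertext bit-flip is now rejected before decryption.
tampered = blob[:20] + bytes([blob[20] ^ 1]) + blob[21:]
try:
    open_(stream, mac_key, tampered)
except ValueError:
    print("forgery rejected")
```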


Using a key in the hash function is basically what an HMAC is.


My recollection was that by definition an HMAC crucially performs 2 rounds of hashing layered over the message... something like h(key || h(key || msg))? Whereas in my mind I was imagining something with one layer over the message (like h(f(key) || msg)) could work; I didn't actually see the need to do two layers of hashing over the message in what constituted (or at least I understood to constitute) a proper HMAC. Not sure if I had a misunderstanding of either the concepts or the terminology here, but if I did, please do let me know.


https://en.wikipedia.org/wiki/HMAC#Design_principles addresses the rationale for performing 2 rounds of hashing.
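The nested structure is short enough to write out by hand and check against the stdlib `hmac` module; the outer hashing pass is what blocks length extension on the inner hash:

```python
import hashlib
import hmac

def hmac_sha256(key: bytes, msg: bytes) -> bytes:
    """HMAC's nested construction: H((K ^ opad) || H((K ^ ipad) || msg))."""
    block_size = 64                       # SHA-256 block size in bytes
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()
    key = key.ljust(block_size, b"\x00")
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(ipad + msg).digest()
    return hashlib.sha256(opad + inner).digest()

key, msg = b"secret", b"message"
assert hmac_sha256(key, msg) == hmac.new(key, msg, hashlib.sha256).digest()
print("matches stdlib hmac")
```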


Right, I get that it's guarding against a bad hash. I was (obviously) assuming the hash and RNG aren't flawed. In which case a proper HMAC seems unnecessary in this scheme. But thanks for clarifying.


I think maybe the gap here is about what a "bad hash" is. The attack described in the design principles section is applicable to hashes like SHA1 and SHA2, which makes it a pretty big deal. You could just always use a truncated variant of SHA2, or use SHA3, and you'd be covered for this specific risk... but at some point it's way easier to just start with an existing AEAD scheme instead of trying to piece something together from parts.


The thing is, if I'm just referring to an unspecified hash in a cipher I'm proposing, it's coming with the implicit assumption that (a) the hash is sufficiently good to make the scheme work, and (b) such a hash is actually available. To me, in a discussion like this, "your scheme sucks because SHA2 would break it" is fundamentally the same as "your scheme sucks because CRC would break it": from a mathematical standpoint it's just nonsensical (you can always find a bad hash function to break a cipher), and from a practical standpoint, while I agree it's a "big deal" (e.g., if I were implementing such a cipher, I would surely put up guardrails to prevent people using such a hash), it's also the least interesting way to shoot down the cipher I'm trying to discuss.


What do you mean, "flawed"? Merkle-Damgard-flavored hashes are extendable by design. You could use a more recent hash, like SHA3 or Blake2, simply with a key; that's basically KMAC. The point is: you need a MAC, not a "hash".

Most people in your situation would just use HMAC. You're not really optimizing anything by avoiding HMAC, but if you try to roll KMAC-SHA2, you've created a cryptographic vulnerability.


> What do you mean, "flawed"? Merkle-Damgard-flavored hashes are extendable by design.

Flawed as in "cryptographically insecure". Which in this case is "it lets you predict something about the hash of one input from the hash of a different input".

As I see it, a cryptographically secure (read: flawless) hash function is just another name for an RNG whose output is computationally impossible to predict without knowledge of its seed. Hash functions that are vulnerable to length extension don't satisfy that: their outputs on longer inputs can be predicted from their outputs on shorter inputs, without any knowledge of their keys. That's a pretty big flaw. To me they're similar to hashes that lack the avalanche effect. They're not cryptographically secure as far as I'm concerned.

> you need a MAC, not a "hash"

I hate jargon honestly. I just refer to these (cryptographically secure) hashes and move on with my day. To me a hash that's vulnerable to length extension isn't cryptographically secure. I kind of assume people understand I'm assuming sufficiently strong building blocks when proposing a way to compose them, otherwise it makes for an uninteresting discussion.

> Most people in your situation would just use HMAC. You're not really optimizing anything by avoiding HMAC

Totally agree, but I wasn't suggesting you can't or shouldn't. I just didn't see HMAC as a requirement, hence why I didn't specifically write HMAC.


It's weird to me to pick on "jargon" like "MAC", but then make up your own definition for what counts as a hash.

A MAC is not a "cryptographically secure hash". Nor is SHA-2 somehow not a secure hash because it can't be directly used as a MAC.


> It's weird to me to pick on "jargon" like "MAC", but then make up your own definition for what counts as a hash.

This is probably because I completely forgot "cryptographically secure hash" is itself jargon with a wacky definition. To me it was an intuitive concept, but looking it up now, the technical definition isn't what I expected.

In my mind I have both a formal definition of what a "cryptographically secure hash" is (the one I wrote above), and an informal definition: "a hash function that's safe to use anywhere a hash function is needed in cryptography". Regardless of what you think about the formal definition... I would've hoped everyone would agree about the informal one. It only seems reasonable to deduce that from its name. But I guess I'm too naive to think things mean what they say?

So now that you point it out, I hate that definition too. Would be lovely if they changed it to mean what it says on the tin. I doubt that I'll even manage to remember to change my vernacular as a workaround for the wacky technical definition.


Just a note that if jargon upsets you to the point where you self-consciously and deliberately avoid it, cryptography might not be the specialty for you.

We're talking here about one of the very simplest tasks in cryptography, of simply encrypting a discrete message. To steal a rhetorical device from Watson Ladd, after a long discussion that started with a gravely broken system, we arrived at something that might work, and that would almost certainly get you a D in a first undergraduate course.

The experience you'll have trying to design anything more complicated is that, even with a lot of effort and careful thought _and experience_, the first cut of any system you come up with will have vulnerabilities. Of course it will. Look at the track record of (checks) everybody else that has ever tried to ship cryptography.

But the problems you're going to run into aren't going to be described intuitively, in "plain language". Just like being a medical doctor, you need to be conversant with the terminology of the field to understand what the pitfalls are. Very serious vulnerabilities in schemes you come up with are going to be described in the literature in terms of acronyms related to proofs that are themselves stitched together with jargon and acronyms.

Cryptography is hard! I solve this problem for myself by simply refusing to design cryptosystems for other people. It's much easier being a vulnerability researcher that dabbles in cryptography than it is to be a practicing cryptography engineer.


I'm indeed neither a cryptographer nor planning to become one, don't worry ;)


Let me again put a word in for being a cryptographic vulnerability researcher, which will enable you to thumb your nose at academic cryptography jargon all you want!


>ciphertext = Encrypt(plaintext || Hash(plaintext))

As already mentioned, this allows the attacker to generate a valid hash in some circumstances. You don't have to allow this and it has been argued that such an approach is secure[1].

As a practical example, which is guaranteed to annoy someone just by the suggestion, the OpenPGP authenticated encryption mode that has been around forever has never been subverted in this way but is at its heart "hash then encrypt". It depends on a random value that is kept from attackers[2].

[1] https://cseweb.ucsd.edu/~mihir/papers/enc-red.pdf

[2] https://articles.59.ca/doku.php?id=pgpfan:mdc (my article)


This violates the Cryptographic Doom Principle https://moxie.org/2011/12/13/the-cryptographic-doom-principl...


That's about padding (in CBC), IIUC. There was no padding in what I wrote. If you believe what I wrote is vulnerable to that could you explain how?


No, it's not. CBC padding oracles are just the most popular incarnation of the underlying problem.

https://crypto.stanford.edu/RealWorldCrypto/slides/phil.pdf


Guess OP never cracked a book on crypto. The AES penguin is like two decades old now.

https://raw.githubusercontent.com/pakesson/diy-ecb-penguin/m...


For over a decade, the Internet believed the problem with ECB was that you could see penguins through it. Implementors looked at that, said "I don't care if you can see penguins through this", and plowed ahead. In fact the real problems with ECB are that you can rewrite the plaintext and, in many common configurations, decrypt it outright --- the same insight got us the BEAST TLS attack.


A block cipher in ECB mode is a substitution cipher, with all the weaknesses that entails. A substitution cipher with a lot of bits per symbol.


Do any specific books come to mind for you as recommended reading for those looking to learn more?


JP Aumasson's "Serious Cryptography" and David Wong's "Real World Cryptography" are good starts.


I recommend going through the CryptoPals[0] challenges. I'm halfway and they really beat into you the problems with ECB.

[0]https://cryptopals.com/


Cryptography Engineering is the classic. Serious Cryptography is also good and more recent.


I liked Cryptography Engineering ("Practical Cryptography") but it definitely doesn't hold up; it was obsolete by 2010. You'd get yourself in trouble implementing from it today.


Serious Cryptography by Jean-Philippe Aumasson.



