This repo is a sample of some basic encryption methods, and some references for further reading. It is the public half of an interactive unit on encryption being developed for an undergraduate elective course. The full unit includes PPT slides, other VSCode examples*, and several group assignments.
Within the ./src directory are groups of encryption methods sorted into categories such as substitution, transposition and grids. The public version of this repo does not go into modern encryption methods, such as AES.
Each encryption method has its own folder with the following:
- encrypt.py - the encryption class, standardized input.
- decrypt.py - the brute force decryption class, standardized input.
- encrypt_test.py - an LLM generated test case for encryption.
- decrypt_test.py - an LLM generated test case for decryption.
These four files are the core demonstration of each algorithm. In some directories there may be additional files such as decrypt_improved.py or decrypt_improved_test.py. The contents of additional files are discussed in the Discussion Section under their specific algorithm.
Claude AI was used to generate the test cases for the public version of this repo, in order to not give away any hints for the group assignments or demos. Claude AI also inserts extra commentary and examples, which is pretty fun for creating samples. NOTE: these test examples are very different in format from the PPT slides and demos, and shouldn't be used as a starting point for the group work. The encrypt and decrypt classes are currently 1:1 with the demos. For details on AI Usage and how it performed, see the Notes on AI as an Instructional Tool
*VSCode is the 'default' IDE to match the Python virtual environment tutorial that is shared between the tutorial units. Other IDEs and setups can be used if you have a preferred setup.
- What is Encryption?
- Requirements
- Implementation
- Included in this Project
- Some Algorithm Discussion
- Notes on AI as an Instructional Tool
- References
A message encrypted using a public key and decrypted using a private key
Encryption is a method of making transmitted information unreadable to anyone who is not the intended sender or recipient(s). These transmissions can be text, images, raw data, or any kind of information that can be digitized and sent between at least two points.
Algorithms for obfuscating text have been around for thousands of years. The 'Caesar Cipher', where a message is encrypted with a shifted alphabet, is one of the most well-known early examples of a substitution cipher. Physical methods, such as using a scytale wrapped with pieces of parchment or leather to encode and decode messages in a transposition cipher, have existed for at least as long as substitution ciphers. There also exist entire secret languages that have been used in both spoken and written forms to communicate. (It turns out people have always been clever about sending secret messages!)
Why all the effort? When using complex algorithms, encryption ensures that even if the data is intercepted, it remains secure. This makes it essential for maintaining privacy and security in communications, including across the internet in the modern day.
In the figure above, Hello World! is encrypted with a public key, and then sent as EBIIL TLOIA!. The intended receiver, who has the private key, is able to decrypt the message into the original Hello World!.
This image, while simple, is a good introduction for several complex topics:
- Public key cryptography, also called asymmetric cryptography, uses a pair of keys to encode and decode messages. This is what is depicted in the image above. When a key pair is created, one of the keys is the public key while the other is the private key. The public key is known by anyone and everyone; the private key is known only by its owner. To send a private message, the sender encrypts it with the recipient's public key, and only the recipient's private key can decode it. The same key pair can also be used in the other direction for digital signatures: the owner signs with their private key, and anyone can use the matching public key to verify that the message really came from that owner. (Encryption keeps a message secret; signing proves who sent it. They are related but distinct uses.) There are several methods of combining public and private key usage in order to securely encrypt and decrypt messages. See the References for further reading.
- With the existence of asymmetric cryptography, it makes sense that there is also symmetric cryptography. Symmetric key algorithms use the same key for encryption and decryption. This is the oldest and most straightforward type of encryption. The keys can be identical, or one may be a simple reversal of the other in order to 'undo' the encryption. This key should be thought of as a private key, and should only be shared between the sender and intended recipients. Possession of the shared private key is what this method treats as validation. Because the key has to be shared, this method is harder to keep secure, but it is noticeably faster for encryption and transmission, which makes it appealing.
- There also exist various hybrid methods (hybrid cryptosystems), which combine the benefits of both public key cryptography and symmetric cryptography. These are beyond this introduction, but further reading is listed in the References section.
- Hashing, or hash functions, is another topic that may come up. It is often discussed alongside encryption, but it is intended to go in one direction, unlike symmetric and asymmetric methods, which are bidirectional. Hashing is intended for verification, not decryption. For example, passwords submitted on a banking app may be stored in a database as a generated hash value rather than storing a password directly. When a user logs in to the app, the entered password is converted to a hash and then compared to the hash value in the database. The raw text of the password is never stored. This method is outside the scope of this introduction, but may appear in some demos.
The Hello World! text in the figure in this section was created using the included Caesar Cipher. Strictly speaking, that makes the text an example of symmetric encryption: the 'private key' is simply the 'public key' shift reversed.
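As a rough illustration of how the figure text can be reproduced, here is a minimal sketch of a Caesar shift. The function name `caesar_shift` is hypothetical and is not the interface of the repo's encrypt.py; it assumes an A-Z alphabet, upper-cases the input, and leaves all other characters untouched.

```python
import string

ALPHABET = string.ascii_uppercase


def caesar_shift(text, shift):
    """Shift A-Z letters by `shift` positions; leave everything else untouched."""
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            result.append(ch)  # spaces, digits, and punctuation pass through
    return "".join(result)


ciphertext = caesar_shift("Hello World!", -3)  # A -> X, B -> Y, C -> Z, ...
plaintext = caesar_shift(ciphertext, 3)        # the reversed shift undoes it
print(ciphertext, "|", plaintext)
```

The second call is the 'simple reversal' mentioned above: the same routine decrypts when given the opposite shift.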
This repository was built and tested on Python 3.12. While effort was put into keeping the tools used in the example as generic as possible, at least Python 3.6 is needed in order to run some of the random and pseudo-random number generation functions properly.
pandas
numpy
Optionally, requirements can be installed manually with:
pip install pandas numpy
This is an alternative if you've had a difficult time with the requirements.txt file. Sometimes libraries are packaged together.
Each encryption method has its own folder with the following:
- encrypt.py - the encryption class, standardized input.
- decrypt.py - the brute force decryption class, standardized input.
- encrypt_test.py - an LLM generated test case for encryption.
- decrypt_test.py - an LLM generated test case for decryption.
encrypt.py and decrypt.py are standalone and do not need each other to work. The test files use either the encrypt or decrypt classes to demonstrate how encryption and brute force decryption are used. The brute force methods shown in this repo are not the only way to decrypt these ciphers, but they make good examples of the process.
In some directories there may be a decrypt_improved.py and a matching decrypt_improved_test.py. These are more detailed in their brute force attempts, but may only be more accurate or improved under specific conditions. They are considered experimental and for educational purposes outside of the main demonstration.
Regarding non-English dictionaries, there are some hooks for that integration, but it is not fully implemented into these samples to keep them simple. A directory of default_dictionaries has been included for generating different ASCII and Unicode dictionaries, but this is exploratory. They're also AI generated, which makes them fun to play with, but not particularly useful at this time.
Discussion of ciphers and their history is included in the non-public half of this material; this repository is focused on the demo code.
In a substitution cipher, each letter (or symbol) in the plaintext is replaced by a corresponding letter (or symbol) in the ciphertext. The simplest of these are vulnerable to decryption even without having a private key due to frequency analysis. Some methods exist to make frequency analysis less effective.
- Caesar Cipher
  Era: classical (~50 BCE) · Type: monoalphabetic substitution · Broken by: brute force / frequency analysis · Status: insecure, teaching only
  - This cipher is attributed to Julius Caesar, and involves an alphabetical shift. The cipher is created by taking the plaintext and shifting it by a certain number of positions down or up the alphabet. It is one of the most common 'starting' ciphers used to introduce the topic of cryptography.
- Bacon Cipher
  Era: 1605 · Type: steganographic binary encoding · Broken by: recognizing the 5-symbol grouping · Status: insecure, teaching only
  - Created by Francis Bacon in 1605, this is a type of steganography rather than just a cipher. Using this method, each letter is encoded individually rather than the message as a whole. It works like a binary representation, where two characters (typically 'A' and 'B') are used in groups of 5 to represent the numerical value of a letter.
- Monoalphabetic Cipher
  Era: classical/medieval · Type: substitution (arbitrary 1:1 map) · Broken by: frequency analysis · Status: insecure, teaching only
  - This cipher creates a new dictionary with the same 1:1 relation as a Caesar Cipher, but the letters in the cipher dictionary can be randomly matched to their plaintext equivalent rather than determined by shifting.
  - This is the most difficult of the substitution ciphers in this section to brute force, and requires some tuning. HOWEVER, even when it does not perform perfectly, it often results in a starting point for further analysis. For example, a high-scoring but incorrect result is still partially readable.
- Atbash Cipher
  Era: ~500 BCE (Hebrew origin) · Type: keyless monoalphabetic substitution · Broken by: applying it again (self-inverse) · Status: insecure, teaching only
  - Atbash reverses the alphabet (A↔Z, B↔Y, C↔X, ...). It originated as a Hebrew scribal cipher and appears in the Hebrew Bible. It has no key at all (the transformation is fixed), which makes it the simplest substitution in the collection and a clean demonstration of an involution: encrypting twice returns the original. It is also exactly the Affine cipher with a=25, b=25.
- Affine Cipher
  Era: classical generalization · Type: monoalphabetic substitution · Broken by: exhaustive search (only 312 keys) / frequency analysis · Status: insecure, teaching only
  - The Affine cipher generalizes Caesar with a multiply and an add: E(x) = (a*x + b) mod 26, where Caesar is the special case a=1. For the map to be invertible, a must be coprime with 26, giving 12 valid multipliers × 26 shifts = 312 keys. It is a nice mathematical step up from Caesar, since it introduces modular inverses and the coprimality requirement while still falling to the same attacks.
- Vigenère Cipher
  Era: 1553 (Bellaso; later misattributed to Vigenère) · Type: polyalphabetic substitution · Broken by: Kasiski examination / index of coincidence · Status: insecure, teaching only
  - A keyword applies a different Caesar shift to each successive letter, cycling through the keyword. This is the historical answer to frequency analysis: spreading each plaintext letter across several cipher alphabets flattens the single-letter frequency signature that breaks the monoalphabetic cipher. Known for centuries as le chiffre indéchiffrable until Babbage (c. 1854, unpublished) and Kasiski (1863) broke it. The decrypt.py implements exactly that break.
- Autokey Cipher
  Era: 1586 (Vigenère's actual invention) · Type: polyalphabetic substitution (non-repeating key) · Broken by: primer search exploiting that the key stream is itself English · Status: insecure, teaching only
  - The Autokey cipher fixes the fatal weakness of the repeating-key Vigenère: it does not repeat the key. A short primer keyword starts the key stream, and the plaintext itself continues it. Because the key never repeats, there is no period for the index of coincidence or Kasiski examination to find; the primary attack on Vigenère simply does not apply. (It is still breakable; see the discussion section.)
- Beaufort Cipher
  Era: 19th century · Type: polyalphabetic substitution (reciprocal) · Broken by: the same period-finding attack as Vigenère · Status: insecure, teaching only
  - A polyalphabetic relative of Vigenère attributed to Sir Francis Beaufort (of wind-scale fame). Where Vigenère adds the key (C = (P + K) mod 26), Beaufort subtracts the plaintext from the key (C = (K - P) mod 26). The elegant consequence is that Beaufort is reciprocal: the same operation both encrypts and decrypts, which is why the Hagelin M-209 cipher machine needed no separate decrypt setting.
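The Affine formula from the list above, along with the claims that Caesar and Atbash are special cases of it and that there are 312 valid keys, can be checked with a short sketch. The function names here are hypothetical and are not taken from the repo's classes; the sketch assumes a 26-letter A-Z alphabet and passes other characters through unchanged.

```python
from math import gcd

M = 26  # alphabet size


def affine_encrypt(text, a, b):
    """E(x) = (a*x + b) mod 26 on A-Z; other characters pass through."""
    if gcd(a, M) != 1:
        raise ValueError("a must be coprime with 26 for the map to be invertible")
    out = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            x = ord(ch) - ord("A")
            out.append(chr((a * x + b) % M + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


def affine_decrypt(text, a, b):
    """D(y) = a_inv * (y - b) mod 26, where a_inv is the modular inverse of a."""
    if gcd(a, M) != 1:
        raise ValueError("a must be coprime with 26 for the map to be invertible")
    a_inv = next(i for i in range(1, M) if (a * i) % M == 1)
    out = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            y = ord(ch) - ord("A")
            out.append(chr((a_inv * (y - b)) % M + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


# Every usable key: 12 multipliers coprime with 26, times 26 shifts.
valid_keys = [(a, b) for a in range(1, M) if gcd(a, M) == 1 for b in range(M)]

print(len(valid_keys))                    # size of the exhaustive search space
print(affine_encrypt("HELLO", 1, 23))     # a=1 reduces to a Caesar shift
print(affine_encrypt("HELLO", 25, 25))    # a=25, b=25 reduces to Atbash
```

Because the key space is this small, an exhaustive attack can simply loop over `valid_keys` and score each candidate plaintext.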
A transposition cipher uses the message itself as a form of obfuscation. Rather than changing the letters (substituting), the letters in the message are rearranged to make the message unreadable.
- Rail Fence
  Era: classical · Type: geometric transposition · Broken by: reconstructing the rail pattern (exhaustive over rail count) · Status: insecure, teaching only
  - This cipher is named for its distinctive 'zig zag' pattern, where the plaintext is written diagonally up and down across a set of rows ('rails') and then read off one rail at a time to create the ciphertext. This can be brute forced by reconstructing the zigzag pattern for each possible rail count.
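The zigzag described above can be sketched in a few lines. This is a minimal illustration with hypothetical function names, not the repo's implementation; it treats every character (including spaces) as part of the message.

```python
def rail_pattern(length, rails):
    """Rail index for each character position, following the zigzag."""
    if rails < 2:
        return [0] * length
    cycle = list(range(rails)) + list(range(rails - 2, 0, -1))
    return [cycle[i % len(cycle)] for i in range(length)]


def rail_encrypt(text, rails):
    """Write along the zigzag, then read off rail by rail."""
    pattern = rail_pattern(len(text), rails)
    return "".join(
        ch for rail in range(rails) for ch, p in zip(text, pattern) if p == rail
    )


def rail_decrypt(cipher, rails):
    """Rebuild the zigzag to learn which position each ciphertext char came from."""
    pattern = rail_pattern(len(cipher), rails)
    # Stable sort by rail gives the order in which positions were read out.
    order = sorted(range(len(cipher)), key=lambda i: pattern[i])
    plain = [""] * len(cipher)
    for ch, pos in zip(cipher, order):
        plain[pos] = ch
    return "".join(plain)


secret = rail_encrypt("WEAREDISCOVEREDFLEEATONCE", 3)
# Brute force: the only key is the rail count, so try each one.
candidates = {r: rail_decrypt(secret, r) for r in range(2, 6)}
print(secret)
print(candidates[3])
```

Since the rail count is the entire key, the brute force loop is tiny; the hard part in practice is recognizing which candidate is readable.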
A block cipher is a symmetric algorithm that encrypts fixed-size blocks of data at once, rather than one symbol at a time (as in a stream cipher). It is neither pure substitution nor pure transposition: it repeatedly combines both over multiple rounds, which is why it sits in its own category. This is the structural hinge between the classical hand ciphers above and modern computer-era cryptography below — the direct ancestor of DES and, through it, the lineage that leads to AES.
- Feistel (Block)
  Era: 1970s (IBM) · Type: product cipher (Feistel network) · Broken by: not by brute force at real key sizes; cryptanalysis only · Status: foundational structure; DES itself now retired
  - The 'block' cipher in this example is a Feistel cipher (also known as the Luby–Rackoff block cipher). It is named after Horst Feistel, a physicist and cryptographer at IBM, and it is the foundational model used in the DES algorithm. The elegant property of a Feistel network is that encryption and decryption use the same structure, except the round-key order reverses.
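The 'same structure, reversed round keys' property can be seen in a toy Feistel network. This is a sketch under stated assumptions: the round function (a truncated SHA-256 of key plus half-block) and the round keys are hypothetical stand-ins chosen for brevity, not the DES round function and not necessarily what the repo uses. It is not secure.

```python
import hashlib


def round_function(half, round_key):
    """Hypothetical round function: truncated SHA-256 of round key + half-block."""
    return hashlib.sha256(round_key + half).digest()[: len(half)]


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def feistel(block, round_keys):
    """Run a Feistel network over one even-length block.

    Each round: (L, R) -> (R, L xor F(R, k)). The halves are swapped at the
    end so that running the same routine with reversed keys inverts it.
    """
    if len(block) % 2:
        raise ValueError("block length must be even")
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for key in round_keys:
        left, right = right, xor_bytes(left, round_function(right, key))
    return right + left


keys = [b"k1", b"k2", b"k3", b"k4"]      # illustrative round keys only
block = b"8bytes!!"
encrypted = feistel(block, keys)
decrypted = feistel(encrypted, keys[::-1])  # same routine, key order reversed
print(encrypted.hex(), decrypted)
```

Note that the round function never has to be invertible; the XOR structure does the undoing, which is part of why the design was attractive for hardware.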
The grid ciphers in this section have some relation to the substitution and transposition ciphers in the previous sections, but are notable for their distinctive grid pattern used in the encoding.
- Polybius
  Era: ~150 BCE · Type: substitution (coordinate encoding) · Broken by: exhaustive search / frequency analysis · Status: insecure, teaching only
  - A Polybius cipher uses a 5x5 grid in order to encode letters into numerical coordinates of a grid. This is a type of substitution cipher.
  - Attributed to the Greek historian Polybius.
- Playfair
  Era: 1854 (Wheatstone; promoted by Lord Playfair) · Type: digraph substitution on a 5x5 key square · Broken by: keyword dictionary search / digraph frequency analysis · Status: insecure, teaching only
  - The Playfair cipher encrypts letters two at a time using a 5x5 key square. Because it substitutes pairs rather than single letters, simple single-letter frequency analysis does not break it the way it breaks the monoalphabetic cipher. It lives in this section because, like Polybius, it is built on a 5x5 grid.
- ADFGVX
  Era: 1918 (WW1) · Type: fractionating substitution + transposition · Broken by: exhaustive search / frequency analysis (famously by Painvin) · Status: insecure, teaching only
  - This is a WW1 cipher used to send encrypted messages over wireless telegraphy. It is a revised version of the earlier ADFGX cipher and uses a modified Polybius square (6x6 compared to the 5x5).
  - It combines a substitution step with a transposition step, but is in this section due to the grid used in the algorithm. This grid form made it possible to implement manually.
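The coordinate encoding that the grid ciphers above build on can be sketched with a plain Polybius square. The layout is an assumption: an unkeyed A-Z square with J folded into I (a common convention) and coordinates numbered 1-5 as row then column. The repo's square and function names may differ.

```python
# 25-letter square read row by row; J is folded into I (assumed convention).
SQUARE = "ABCDEFGHIKLMNOPQRSTUVWXYZ"


def polybius_encode(text):
    """Replace each letter with its row/column coordinates (1-5 each)."""
    out = []
    for ch in text.upper().replace("J", "I"):
        if ch in SQUARE:
            row, col = divmod(SQUARE.index(ch), 5)
            out.append(f"{row + 1}{col + 1}")
    return " ".join(out)


def polybius_decode(codes):
    """Turn space-separated coordinate pairs back into letters."""
    return "".join(
        SQUARE[(int(pair[0]) - 1) * 5 + (int(pair[1]) - 1)] for pair in codes.split()
    )


encoded = polybius_encode("HELLO")
print(encoded, "->", polybius_decode(encoded))
```

ADFGVX follows the same idea with a 6x6 square and the letters A, D, F, G, V, X as coordinate labels, followed by a transposition of the resulting symbols.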
- One-Time Pad (OTP / Vernam)
  Era: 1919 (Vernam patent; Mauborgne's 'random, used once' insight) · Type: stream cipher (XOR pad) · Broken by: nothing when used correctly (provably); catastrophically broken by key reuse · Status: perfect secrecy in theory, brutal key management in practice
  - The One-Time Pad is the only cipher with proven perfect secrecy (Shannon, 1949). Each plaintext symbol is XOR'ed with a key symbol that is (1) truly random, (2) at least as long as the message, and (3) never reused. Under those conditions the ciphertext reveals literally nothing: every plaintext of that length is an equally likely explanation. The catch is that the three conditions are brutal in practice, and violating any of them collapses the security completely (see the key-reuse demonstration in the discussion section).
- RC4
  Era: 1987 · Type: stream cipher · Broken by: keystream biases (Fluhrer–Mantin–Shamir, Klein) · Status: broken, deprecated
  - RC4 is a once-popular stream cipher designed in 1987. It has since fallen into disuse due to discovered vulnerabilities. Attacks against this cipher include the 2001 Fluhrer-Mantin-Shamir attack and attacks against biases in the keystream, and the 2005 attack by Andreas Klein, which builds upon previous attacks. Later attacks are not examined here.
- ChaCha20
  Era: 2008 · Type: stream cipher (ARX) · Broken by: no practical break known · Status: secure, in wide modern use
  - ChaCha20 generates a pseudo-random keystream which is then XOR'ed with the plaintext to encrypt it. This method requires a key and a nonce. ChaCha20 is the 20-round member of the ChaCha family, introduced by Daniel J. Bernstein in 2008 as a refinement of his earlier Salsa20 (2005).
  - Key: a secret value, typically 32 bytes/256 bits, used to initialize a cipher's state.
  - Nonce: an 8-byte, 12-byte, or 24-byte number used only once with any given key. ChaCha20 typically uses 12 bytes/96 bits.
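The XOR step that the One-Time Pad and ChaCha20 entries above both rely on, and the reason a pad (or a key/nonce pair) must never be reused, can be shown with a short sketch. The helper name and messages are made up for illustration; the pad comes from Python's `secrets` module, which is a stand-in for a truly random source.

```python
import secrets


def xor_bytes(data, keystream):
    """XOR each data byte with the matching keystream byte."""
    if len(keystream) < len(data):
        raise ValueError("keystream must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, keystream))


msg1 = b"ATTACK AT DAWN"
msg2 = b"RETREAT AT TEN"            # same length, sent later

pad = secrets.token_bytes(len(msg1))
c1 = xor_bytes(msg1, pad)
recovered = xor_bytes(c1, pad)       # XOR with the same pad decrypts

# Mistake: the pad is reused for a second message.
c2 = xor_bytes(msg2, pad)

# An eavesdropper XORs the two ciphertexts and the pad cancels out entirely.
leak = xor_bytes(c1, c2)
print(leak == xor_bytes(msg1, msg2))
```

Once the pad has cancelled, knowing or guessing either plaintext immediately reveals the other, which is why reuse is described as catastrophic.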
NOTE about the included stream ciphers:
- Decryption on a stream cipher is non-trivial, so the included decrypt.py for each example uses a small selection of common attacks (e.g., dictionary attacks, frequency analysis, true 'try all combinations' brute force) that are discussed in terms of analysis. In some cases where the cipher algorithm has been completely broken, an example of that attack may be implemented. There are no 'test' attacks on algorithms included in this repository.
Every cipher above is symmetric: the same secret is used to encrypt and decrypt, and the two parties must somehow share that secret in advance. Asymmetric (public-key) cryptography removes that requirement entirely. Each party has a pair of keys: a public one anyone may hold and a private one kept secret. The security then rests not on a hidden algorithm but on a mathematical problem that is easy in one direction and infeasible in reverse. These two examples are the classical pillars of that idea.
- RSA
  Era: 1977 · Type: public-key encryption · Hard problem: integer factorization · Status: secure at large key sizes (>= 2048-bit n)
  - Encryption is c = m^e mod n with the public key; decryption is m = c^d mod n with the private key. An eavesdropper who has the public key (e, n) and the ciphertext can only recover the private key by factoring n back into its two prime factors. The decrypt.py attack does exactly that by trial division, and decrypt_improved.py contrasts naive factoring with smarter methods (Fermat, Pollard's rho); watch the effort scale with key size.
  - Teaching note: the implementation here is textbook RSA (no padding, small primes, deterministic). It is correct for showing how RSA works and why factoring matters, but it is intentionally insecure, a deliberate lesson that "the math being right" is not the same as "the system being secure."
- ECDH (Elliptic-Curve Diffie–Hellman)
  Era: DH 1976 / EC variant mid-1980s · Type: public-key *key exchange* · Hard problem: elliptic-curve discrete logarithm · Status: secure at real curve sizes (~256-bit)
  - ECDH is not a message cipher; it is a key exchange. Two parties each pick a private scalar, derive a public point on a shared elliptic curve, swap public points, and independently arrive at the same shared secret, which never crosses the wire. Breaking it means solving the elliptic-curve discrete log (recovering the private scalar from a public point). The decrypt.py attack does this naively; decrypt_improved.py uses Baby-Step Giant-Step for the ~sqrt(order) speedup.
  - RSA and ECDH make a clean pair: RSA's security is "factoring is hard," ECDH's is "discrete log is hard." They are the two foundations classical public-key cryptography is built on.
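The RSA formulas in the list above, and the trial-division attack on them, can be followed end to end with toy numbers. This is a minimal textbook sketch with deliberately tiny primes chosen for illustration; the values and the helper name are assumptions, not the repo's code, and `pow(e, -1, phi)` assumes Python 3.8 or newer.

```python
# Toy textbook RSA: tiny primes, no padding. For illustration only.
p, q = 61, 53
n = p * q                      # public modulus, 3233
phi = (p - 1) * (q - 1)
e = 17                         # public exponent, coprime with phi
d = pow(e, -1, phi)            # private exponent: modular inverse of e

message = 65
cipher = pow(message, e, n)    # c = m^e mod n
decoded = pow(cipher, d, n)    # m = c^d mod n


def factor_by_trial_division(modulus):
    """Naive attack: find the smallest factor by trying every candidate."""
    candidate = 2
    while candidate * candidate <= modulus:
        if modulus % candidate == 0:
            return candidate, modulus // candidate
        candidate += 1
    raise ValueError("modulus is prime")


# The attacker only knows (e, n); factoring n is enough to rebuild d.
found_p, found_q = factor_by_trial_division(n)
recovered_d = pow(e, -1, (found_p - 1) * (found_q - 1))
print(decoded, recovered_d == d)
```

Trial division needs roughly sqrt(n) steps, so each extra bit in the primes multiplies the attacker's work while barely changing the cost of encryption.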
A cryptographic hash is the odd one out: it is not encryption at all. There is no key and no decryption — a hash is a one-way function that maps any input to a fixed-size fingerprint (digest). The README's introduction described this as the "one-directional" method used for verification (e.g. storing a password's hash rather than the password); this is the worked example of that idea.
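The 'fixed-size fingerprint' and 'verify without decrypting' ideas can be sketched with Python's standard `hashlib`. The function names and the sample password are hypothetical. Note the loud assumption: a bare SHA-256 is used only to keep the sketch short, whereas real password storage typically adds a per-user salt and a deliberately slow key-derivation function.

```python
import hashlib
import hmac


def digest(text):
    """SHA-256 fingerprint of a string, as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def verify(attempt, stored_digest):
    """Hash the attempt and compare fingerprints; nothing is ever decrypted."""
    return hmac.compare_digest(digest(attempt), stored_digest)


# Only the digest is kept; the original text is not recoverable from it.
stored = digest("correct horse battery staple")

print(len(digest("a")), len(digest("a" * 10_000)))  # same size for any input
print(verify("correct horse battery staple", stored))
print(verify("Correct horse battery staple", stored))
```

`hmac.compare_digest` is used for the comparison because it takes the same time whether or not the strings match, which avoids leaking information through timing.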
- Merkle–Damgård (teaching construction)
  Era: 1989 (construction proven by Merkle and Damgård) · Type: hash construction skeleton · Property: collision-resistant iff its compression function is; deliberately inherits the length-extension flaw · Status: teaching only, never for real use
  - Not a standard hash, but a deliberately transparent Merkle–Damgård construction whose only job is to show the skeleton shared by MD5, SHA-1, and SHA-256 without the distraction of an optimized compression function. Once the skeleton is visible, the three real hashes read as 'the same shape with a better compression function and more rounds.' Its analysis.py demonstrates the length-extension attack the construction inherits (and which SHA-3's sponge design avoids).
- MD5
  Era: 1992 (Rivest, RFC 1321) · Type: 128-bit Merkle–Damgård hash · Broken by: practical collisions (Wang et al., 2004) · Status: broken, retired
  - Included as a broken hash. Practical collisions were announced in 2004 and can now be generated in seconds; MD5 collisions were used to forge a rogue CA certificate in 2008 and a Microsoft code-signing certificate by the Flame malware in 2012. Its value here is entirely pedagogical: it shows the Merkle–Damgård skeleton clearly, and its downfall is one of the great teaching stories in cryptography. Verified against hashlib in hash_test.py.
- SHA-1
  Era: 1995 (NSA/NIST, FIPS 180-1) · Type: 160-bit Merkle–Damgård hash · Broken by: SHATTERED collision (2017), chosen-prefix collisions (2020) · Status: broken, retired
  - Also included as a broken hash, this time as a cautionary tale about the gap between 'no attack yet' and 'secure.' Theoretical collision attacks appeared in 2005; the first practical collision (two distinct PDFs with the same digest) arrived in 2017, and cheap chosen-prefix collisions in 2020. SHA-1 secured much of the internet for two decades and limped on in deployed systems long after it was mortally wounded; that migration story is the lesson. Verified against hashlib in hash_test.py.
- SHA-256
  Era: 2001 (NSA / NIST, SHA-2 family) · Type: cryptographic hash · Resists: preimage (~2^256) and collision (~2^128, birthday bound) attacks · Status: secure, in wide modern use
  - SHA-256 pads a message, splits it into 512-bit blocks, and runs each through a 64-round compression function to produce a 256-bit digest. The implementation here is built from scratch and verified bit-for-bit against the published NIST test vectors in hash_test.py.
  - Because there is nothing to decrypt, the companion file is analysis.py rather than a decrypt.py. It is a resist-don't-break demonstration: rather than recovering an input, it shows the properties that make the hash useful, namely the avalanche effect (a one-bit input change flips ~half the output bits) and the brute-force cost that makes reversal infeasible. (This mirrors the ChaCha20 alg_analysis.py approach: the lesson is in watching the function resist, not in breaking it.)
- CRC32
  Era: 1961 (CRCs generally) · Type: checksum, NOT a cryptographic hash · Broken by: design: linear over GF(2), trivially forgeable · Status: excellent for accidental-error detection; zero security against an adversary
  - Deliberately included as the non-cryptographic counterexample. CRC32 is a checksum designed to detect accidental errors (flipped bits from noise), and it is excellent at that. It is used in Ethernet, ZIP, PNG, and gzip. But it is linear over GF(2), has no preimage or collision resistance, and an attacker can modify a message and surgically patch the CRC to match. Holding it next to SHA-256 makes the 'checksum vs. cryptographic hash' distinction unmistakable. Verified against Python's zlib.crc32 in hash_test.py.
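The 'linear over GF(2)' property of CRC32 mentioned above can be observed directly: for three messages of equal length, the CRC of their XOR equals the XOR of their CRCs. This is a small sketch using the standard `zlib` and `hashlib` modules; the messages are made up for illustration.

```python
import hashlib
import zlib

# Three messages of the same length (12 bytes each).
a = b"transfer 100"
b = b"transfer 900"
c = b"hello, world"

combined = bytes(x ^ y ^ z for x, y, z in zip(a, b, c))

# CRC32 is affine over GF(2): the checksum of the XOR is predictable
# from the individual checksums, without touching the message bytes.
predicted_crc = zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)
actual_crc = zlib.crc32(combined)
print(predicted_crc == actual_crc)


def sha_int(data):
    """SHA-256 digest as an integer, so digests can be XOR'ed the same way."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


# The same trick on SHA-256 gives no useful relationship.
print(sha_int(combined) == sha_int(a) ^ sha_int(b) ^ sha_int(c))
```

That predictability is what lets an attacker compute exactly how a CRC changes for a chosen modification, and it is the property a cryptographic hash is designed not to have.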
This section discusses the general operation of the algorithms in terms of the input, output, and some tuning. In-depth discussion is reserved for the PPTs and group assignments.
Accuracy could be improved for all of these algorithms by implementing a dictionary or spellcheck program to identify actual English. This would slow automated detection drastically, however. It is also beyond the scope of this introduction.
- Caesar Cipher
This is one of the easier algorithms to brute force given the limited solution space. The alphabet can only shift left or right, not scramble. A brute force decryption for a Caesar Cipher involves trying every possible shift and then scoring each result against English (in this case) letter frequency.
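The try-every-shift-and-score approach described above can be sketched as follows. This is not the repo's decrypt.py: the function names are hypothetical, the frequency table holds approximate English percentages, the score is a simple log-likelihood, and only 26 upper-case shifts are tried (the repo's output tries offsets 0 to 51).

```python
import math
import string

ALPHABET = string.ascii_uppercase

# Approximate English letter frequencies in percent (illustrative values).
ENGLISH_FREQ = dict(zip(
    "ETAOINSHRDLCUMWFGYPBVKJXQZ",
    [12.7, 9.1, 8.2, 7.5, 7.0, 6.7, 6.3, 6.1, 6.0, 4.3, 4.0, 2.8, 2.8,
     2.4, 2.4, 2.2, 2.0, 2.0, 1.9, 1.5, 1.0, 0.8, 0.15, 0.15, 0.10, 0.07],
))


def shift_text(text, shift):
    """Shift A-Z letters by `shift`; other characters pass through."""
    k = shift % 26
    shifted = ALPHABET[k:] + ALPHABET[:k]
    return text.upper().translate(str.maketrans(ALPHABET, shifted))


def english_score(text):
    """Log-likelihood of the letters under the frequency table (higher is better)."""
    return sum(math.log(ENGLISH_FREQ[ch] / 100) for ch in text if ch in ENGLISH_FREQ)


def brute_force(ciphertext, top=3):
    """Undo every possible shift and rank the candidates by score."""
    candidates = []
    for shift in range(26):
        guess = shift_text(ciphertext, -shift)
        candidates.append((english_score(guess), shift, guess))
    return sorted(candidates, reverse=True)[:top]


ranked = brute_force("WKH TXLFN EURZQ IRA MXPSV RYHU WKH ODCB GRJ")
best_score, best_shift, best_guess = ranked[0]
print(best_shift, best_guess)
```

Single-letter scoring becomes unreliable on very short inputs, because a handful of letters does not carry enough frequency signal.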
- encrypt.py - the encryption class, standardized input.
- decrypt.py - the brute force decryption class, standardized input.
- encrypt_test.py - an LLM generated test case for encryption.
- decrypt_test.py - an LLM generated test case for decryption.
Encryption:
=== Encryption Dictionary ===
Original: ABCDEFGHIJKLMNOPQRSTUVWXYZ
Encrypted: XYZABCDEFGHIJKLMNOPQRSTUVW
=== Encryption Examples ===
'ABC' -> 'XYZ'
'HELLO' -> 'EBIIL'
'WORLD' -> 'TLOIA'
'HELLO WORLD!' -> 'EBIIL TLOIA!'
'ABC123' -> 'XYZ123'
'ZEBRA' -> 'WBYOX'
Decryption:
============================================================
TEST CASE 1: 'KHOOR ZRUOG'
============================================================
Trying offsets 0 to 51...
============================================================
Top 3 most likely decryptions:
============================================================
1. Offset 16 (Score: -800.8): aXeeh phkeW
2. Offset 42 (Score: -800.8): AxEEH PHKEw
3. Offset 20 (Score: -878.3): ebiil tloia
Most likely decryption: 'aXeeh phkeW'
============================================================
TEST CASE 2: 'WKH TXLFN EURZQ IRA MXPSV RYHU WKH ODCB GRJ'
============================================================
Trying offsets 0 to 51...
============================================================
Top 3 most likely decryptions:
============================================================
1. Offset 23 (Score: -171.8): the quick brown foX jumps over the laZY dog
2. Offset 49 (Score: -171.8): THE QUICK BROWN FOx JUMPS OVER THE LAzy DOG
3. Offset 13 (Score: -291.8): jXU gkYSa Rhemd VeN Zkcfi elUh jXU bQPO TeW
Most likely decryption: 'the quick brown foX jumps over the laZY dog'
============================================================
TEST CASE 3: 'JBHF GUVF VF N GRKF ZRFFNTR'
============================================================
Trying offsets 0 to 51...
============================================================
Top 3 most likely decryptions:
============================================================
1. Offset 13 (Score: -495.1): WOUS ThiS iS a TeXS meSSage
2. Offset 39 (Score: -495.1): wous tHIs Is A tExs MEssAGE
3. Offset 9 (Score: -596.8): SKQO PdeO eO W PaTO iaOOWca
Most likely decryption: 'WOUS ThiS iS a TeXS meSSage'
- Bacon Cipher
The Bacon Cipher is another relatively straightforward cipher to brute force. It has a limited dictionary and can still be broken with frequency analysis. Each single character is replaced with a binary representation (with ‘1’ and ‘0’ replaced with any arbitrary 2 symbols), and the text is not scrambled.
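The 5-symbol binary replacement described above can be sketched for the 26-letter variant. The function names are hypothetical rather than the repo's interface; the sketch assumes each letter's 0-25 index is written as five bits, with the two symbols passed in as parameters.

```python
def bacon_encode(text, zero="A", one="B"):
    """Replace each A-Z letter with its 5-bit index, written with two symbols."""
    out = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            bits = format(ord(ch) - ord("A"), "05b")
            out.append("".join(one if bit == "1" else zero for bit in bits))
        else:
            out.append(ch)  # non-letters pass through unchanged
    return "".join(out)


def bacon_decode(code, zero="A", one="B"):
    """Read the two symbols in groups of five and map each group to a letter."""
    symbols = [ch for ch in code if ch in (zero, one)]
    letters = []
    for i in range(0, len(symbols) - len(symbols) % 5, 5):
        bits = "".join("1" if s == one else "0" for s in symbols[i:i + 5])
        value = int(bits, 2)
        letters.append(chr(value + ord("A")) if value < 26 else "?")
    return "".join(letters)


encoded = bacon_encode("HELLO")
print(encoded)
print(bacon_encode("HELLO", zero="0", one="1"))
print(bacon_decode(encoded))
```

A brute force decoder only has to guess which symbol plays the role of 0 and which variant (24 or 26 letters) is in use, which is why the search space is so small.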
- encrypt.py - the encryption class, standardized input.
- decrypt.py - the brute force decryption class, standardized input.
- encrypt_test.py - an LLM generated test case for encryption.
- decrypt_test.py - an LLM generated test case for decryption.
Encryption:
=== Encryption Dictionary ===
Baconian cipher mapping (using 'A' and 'B'):
Variant: 26-letter (all separate)
A -> AAAAA
B -> AAAAB
C -> AAABA
D -> AAABB
E -> AABAA
F -> AABAB
G -> AABBA
H -> AABBB
I -> ABAAA
J -> ABAAB
=== Encryption Examples ===
'ABC' -> 'AAAAAAAAABAAABA'
'HELLO' -> 'AABBBAABAAABABBABABBABBBA'
'WORLD' -> 'BABBAABBBABAAABABABBAAABB'
'HELLO WORLD' -> 'AABBBAABAAABABBABABBABBBA BABBAABBBABAAABABABBAAABB'
'ABC123' -> 'AAAAAAAAABAAABA123'
'ZEBRA' -> 'BBAABAABAAAAAABBAAABAAAAA'
=== Different Symbol Sets ===
Symbols '0'/'1': 'HELLO' -> '0011100100010110101101110'
Symbols '.'/'-': 'HELLO' -> '..---..-...-.--.-.--.---.'
Symbols '*'/'#': 'HELLO' -> '**###**#***#*##*#*##*###*'
Symbols 'X'/'O': 'HELLO' -> 'XXOOOXXOXXXOXOOXOXOOXOOOX'
Decryption:
================================================================================
TEST CASE 1: 'AABBBAABAAABABBABABBABBBA'
(Expected: HELLO)
================================================================================
Top 3 most likely decryptions:
================================================================================
1. Symbols: 'A'/'B' (26-letter) - Score: -1600.8
Result: HELLO
2. Symbols: 'B'/'A' (24-letter) - Score: -1728.7
Result: ASZKKBBBA
3. Symbols: 'B'/'A' (26-letter) - Score: -1815.1
Result: YAXJJBBBA
Most likely decryption: 'HELLO'
================================================================================
TEST CASE 2: '100100010000010100010010010011'
(Expected: SECRET)
================================================================================
Top 3 most likely decryptions:
================================================================================
1. Symbols: '0'/'1' (26-letter) - Score: -813.2
Result: SECRET
2. Symbols: '0'/'1' (24-letter) - Score: -891.7
Result: TECSEU
3. Symbols: '1'/'0' (26-letter) - Score: -8343.9
Result: 10010001A10100010010010011
Most likely decryption: 'SECRET'
================================================================================
TEST CASE 3: '....-........-..---..--.-'
(Expected: BACON)
================================================================================
Top 3 most likely decryptions:
================================================================================
1. Symbols: '.'/'-' (26-letter) - Score: -1035.9
Result: BACON
2. Symbols: '.'/'-' (24-letter) - Score: -1186.6
Result: BACPO
3. Symbols: '-'/'.' (26-letter) - Score: -2957.8
Result: ...X....WG-.-
Most likely decryption: 'BACON'
================================================================================
TEST CASE 4: 'BAABBAABAABAABABAABB'
(Expected: TEST)
================================================================================
Top 3 most likely decryptions:
================================================================================
1. Symbols: 'B'/'A' (24-letter) - Score: -1621.6
Result: NAYAXABB
2. Symbols: 'B'/'A' (26-letter) - Score: -1661.4
Result: MAWAVABB
3. Symbols: 'A'/'B' (26-letter) - Score: -2078.8
Result: TEST
- Monoalphabetic Cipher
The Monoalphabetic cipher is the most difficult substitution example in this collection and requires some tuning. However, even when it does not perform perfectly, it often results in a starting point for further analysis. For example, a high-scoring but incorrect result is still partially readable. This happens because the scorer rewards text whose letter-frequency distribution looks statistically like English, even when the text is very distinctly not English. To improve the decryption, context-dependent words can be added to the dictionary.
encrypt.py - the encryption class, standardized input.
decrypt.py - the brute force decryption class, standardized input.
encrypt_test.py - an LLM generated test case for encryption
decrypt_test.py - an LLM generated test case for decryption
decrypt_improved_test.py - an LLM generated test case for decryption
decrypt_improved.py - The original decrypt.py class for this cipher uses a 'hill climbing' approach, which only accepts new solutions that are BETTER than the previous answer. This can cause the algorithm to get stuck in a local optimum since it cannot 'backtrack'. This algorithm allows for backtracking. There is a noticeable improvement in general decryption accuracy, but it is less accurate than adding context-specific words to the dictionary (see the decryption results with ['CRYPTOGRAPHY','SUBSTITUTION','CIPHER','METHOD','ENCRYPT'] added to the dictionary). These results are not surprising. A more intelligent algorithm may be able to more accurately predict the text without dictionary additions.
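The difference between the two search strategies comes down to the acceptance rule. The sketch below is only an illustration: it assumes a simulated-annealing style rule as one common way to allow backtracking, and the function names and the `temperature` parameter are hypothetical rather than taken from decrypt_improved.py.

```python
import math
import random


def accept_hill_climb(old_score, new_score):
    """Plain hill climbing: keep a candidate key only if it scores strictly better."""
    return new_score > old_score


def accept_with_backtracking(old_score, new_score, temperature, rng=random):
    """Annealing-style rule (an assumption, not necessarily the repo's method).

    Better candidates are always kept. Worse candidates are sometimes kept,
    with a probability that shrinks as the score drop grows or as the
    temperature falls, which lets the search step back out of a local optimum.
    """
    if new_score > old_score:
        return True
    return rng.random() < math.exp((new_score - old_score) / temperature)


def propose_swap(key, rng=random):
    """Neighbouring key: swap two letters of a 26-letter substitution alphabet."""
    i, j = rng.sample(range(len(key)), 2)
    letters = list(key)
    letters[i], letters[j] = letters[j], letters[i]
    return "".join(letters)
```

With a high temperature almost every proposal is accepted; lowering it gradually makes the rule behave more and more like plain hill climbing.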
Encryption:
=== Encryption Examples ===
Using random key (seed=42):
'ABC' -> 'IQA'
'HELLO' -> 'BLMME'
'WORLD' -> 'KEWMY'
'HELLO WORLD' -> 'BLMME KEWMY'
'ABC123' -> 'IQA123'
'ATTACKATDAWN' -> 'IZZIACIZYIKD'
Using custom key (alphabet backwards):
'ABC' -> 'ZYX'
'HELLO' -> 'SVOOL'
'WORLD' -> 'DLIOW'
'HELLO WORLD' -> 'SVOOL DLIOW'
'ABC123' -> 'ZYX123'
'ATTACKATDAWN' -> 'ZGGZXPZGWZDM'
Decryption Test Case 5, Encrypted Text:
================================================================================
TEST CASE 5: '''RM XIBKGLTIZKSB Z HFYHGRGFGRLM XRKSVI RH Z NVGSLW LU VMXIBKGRMT
RM DSRXS FMRGH LU KOZRMGVCG ZIV IVKOZXVW DRGS GSV XRKSVIGVCG RM Z WVURMVW NZMMVI
DRGS GSV SVOK LU Z PVB GSV FMRGH NZB YV HRMTOV OVGGVIH GSV NLHG XLNNLM KZRIH
LU OVGGVIH GIRKOVGH LU OVGGVIH NRCGFIVH LU GSV ZYLEV ZMW HL ULIGS GSV IVXVREVI
WVXRKSVIH GSV GVCG YB KVIULINRMT GSV RMEVIHV HFYHGRGFGRLM KILXVHH GL VCGIZXG
GSV LIRTRMZO NVHHZTV HFYHGRGFGRLM XRKSVIH XZM YV XLNKZIVW DRGS GIZMHKLHRGRLM
XRKSVIH RM Z GIZMHKLHRGRLM XRKSVI GSV FMRGH LU GSV KOZRMGVCG ZIV IVZIIZMTVW
RM Z WRUUVIVMG ZMW FHFZOOBJFRGV XLNKOVC LIWVI YFG GSV FMRGH GSVNHVOEVH ZIV
OVUG FMXSZMTVW YB XLMGIZHG RM Z HFYHGRGFGRLM XRKSVI GSV FMRGH LU GSV KOZRMGVCG
ZIV IVGZRMVW RM GSV HZNV HVJFVMXV RM GSV XRKSVIGVCG YFG GSV FMRGH GSVNHVOEVH ZIV
ZOGVIVW'''
(Expected: IN CRYPTOGRAPHY A SUBSTITUTION CIPHER... [long text])
Iteration with 'default' dictionary. Score: 2269.6
'''ON CRBUTPKRAUHB A IMWITOTMTOPN COUHER OI A GETHPD PL ENCRBUTONK ON XHOCH MNOTI
PL USAONTEFT ARE REUSACED XOTH THE COUHERTEFT ON A DELONED GANNER XOTH THE HESU PL
A QEB THE MNOTI GAB WE IONKSE SETTERI THE GPIT CPGGPN UAORI PL SETTERI TROUSETI PL
SETTERI GOFTMREI PL THE AWPVE AND IP LPRTH THE RECEOVER DECOUHERI THE TEFT WB
UERLPRGONK THE ONVERIE IMWITOTMTOPN URPCEII TP EFTRACT THE PROKONAS GEIIAKE
IMWITOTMTOPN COUHERI CAN WE CPGUARED XOTH TRANIUPIOTOPN COUHERI ON A TRANIUPIOTOPN
COUHER THE MNOTI PL THE USAONTEFT ARE REARRANKED ON A DOLLERENT AND MIMASSBJMOTE
CPGUSEF PRDER WMT THE MNOTI THEGIESVEI ARE SELT MNCHANKED WB CPNTRAIT ON A
IMWITOTMTOPN COUHER THE MNOTI PL THE USAONTEFT ARE RETAONED ON THE IAGE IEJMENCE
ON THE COUHERTEFT WMT THE MNOTI THEGIESVEI ARE ASTERED'''
Key sample: P->Q, J->J, B->B, K->U, Z->A, W->D, N->G, S->H, E->V, H->I...
Iteration with ['SUBSTITUTION','METHOD'] added to the dictionary. Score: 2490.9
'''IN CRBSTOGRASHB A DJKDTITJTION CISHER ID A METHOP OU ENCRBSTING IN WHICH JNITD OU
SLAINTEFT ARE RESLACEP WITH THE CISHERTEFT IN A PEUINEP MANNER WITH THE HELS OU A QEB
THE JNITD MAB KE DINGLE LETTERD THE MODT COMMON SAIRD OU LETTERD TRISLETD OU LETTERD
MIFTJRED OU THE AKOVE ANP DO UORTH THE RECEIVER PECISHERD THE TEFT KB SERUORMING THE
INVERDE DJKDTITJTION SROCEDD TO EFTRACT THE ORIGINAL MEDDAGE DJKDTITJTION CISHERD CAN
KE COMSAREP WITH RANDSODITION CISHERD IN A TRANDSODITION CISHER THE JNITD OU THE
SLAINTEFT ARE REARRANGEP IN A PIUUERENT ANP JDJALLBXJITE COMSLEF ORPER KJT THE JNITD
THEMDELVED ARE LEUT JNCHANGEP KB CONTRADT IN A DJKDTITJTION CISHER THE JNITD OU THE
SLAINTEFT ARE RETAINEP IN THE DAME DEXJENCE IN THE CISHERTEFT KJT THE JNITD THEMDELVED
ARE ALTEREP'''
Key sample: R->I, B->B, F->J, O->L, D->W, G->T, S->H, I->R, Z->A, H->D...
Iteration with ['CRYPTOGRAPHY','SUBSTITUTION','CIPHER','METHOD','ENCRYPT'] added to the dictionary. Score: 2706.2
'''IN CRYPTOGRAPHY A UMKUTITMTION CIPHER IU A SETHOD OF ENCRYPTING IN WHICH MNITU OF
PLAINTEXT ARE REPLACED WITH THE CIPHERTEXT IN A DEFINED SANNER WITH THE HELP OF A BEY
THE MNITU SAY KE UINGLE LETTERU THE SOUT COSSON PAIRU OF LETTERU TRIPLETU OF LETTERU
SIXTMREU OF THE AKOVE AND UO FORTH THE RECEIVER DECIPHERU THE TEXT KY PERFORSING THE
INVERUE UMKUTITMTION PROCEUU TO EXTRACT THE ORIGINAL SEUUAGE UMKUTITMTION CIPHERU CAN
KE COSPARED WITH RANUPOUITION CIPHERU IN A TRANUPOUITION CIPHER THE MNITU OF THE
PLAINTEXT ARE REARRANGED IN A DIFFERENT AND MUMALLYJMITE COSPLEX ORDER KMT THE MNITU
THESUELVEU ARE LEFT MNCHANGED KY CONTRAUT IN A UMKUTITMTION CIPHER THE MNITU OF THE
PLAINTEXT ARE RETAINED IN THE UASE UEJMENCE IN THE CIPHERTEXT KMT THE MNITU
THESUELVEU ARE ALTERED'''
Key sample: Z->A, R->I, M->N, G->T, S->H, V->E, I->R, H->U, L->O, K->P...
- Atbash Cipher
Atbash is the degenerate case that makes a useful classroom contrast: there is no key, so there is nothing to brute force. "Decryption" is the cipher applied a second time (E(x) = 25 - x, and 25 - (25 - x) = x), so the decrypt.py class exists for repo symmetry and to make the self-inverse property explicit. The discussion point is why a keyless cipher offers no security at all — every attacker knows the entire transformation — and how Atbash slots into the Affine family (a=25, b=25).
encrypt.py - the encryption class, standardized input.
decrypt.py - the brute force decryption class, standardized input.
encrypt_test.py - an LLM generated test case for encryption
decrypt_test.py - an LLM generated test case for decryption
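The self-inverse property and the Affine equivalence described for Atbash can be checked in a few lines. This is a minimal sketch with hypothetical function names, not the repo's encrypt.py; it upper-cases its input and passes non-letters through unchanged.

```python
def atbash(text):
    """Map A<->Z, B<->Y, ... using E(x) = 25 - x; other characters pass through."""
    out = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            out.append(chr(ord("A") + 25 - (ord(ch) - ord("A"))))
        else:
            out.append(ch)
    return "".join(out)


def affine(text, a, b):
    """Affine cipher E(x) = (a*x + b) mod 26 on the letters A-Z."""
    out = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            out.append(chr((a * (ord(ch) - ord("A")) + b) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


message = "HELLO WORLD"
print(atbash(message))                                 # SVOOL DLIOW
print(atbash(atbash(message)) == message)              # True: applying it twice undoes it
print(affine(message, 25, 25) == atbash(message))      # True: Atbash is Affine with a=25, b=25
```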
- Affine Cipher
The Affine cipher's key space is its lesson: only 12 valid multipliers (those coprime with 26) × 26 shifts = 312 keys, so the decrypt.py attack simply tries every (a, b) pair and scores by English letter frequency, reusing the same scoring approach as the Caesar decryptor. The legitimate decryption introduces the modular inverse: x = a^-1 (y - b) mod 26. This makes Affine a good bridge between "shift and check" (Caesar) and the number theory that returns later in RSA.
encrypt.py - the encryption class, standardized input.
decrypt.py - the brute force decryption class, standardized input.
encrypt_test.py - an LLM generated test case for encryption
decrypt_test.py - an LLM generated test case for decryption
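The key count and the modular inverse used in the Affine discussion can be sketched as follows. The function names are hypothetical and the candidate scoring step is left out; `pow(a, -1, 26)` needs Python 3.8 or newer.

```python
from math import gcd

# The 12 multipliers that are coprime with 26 and therefore invertible mod 26.
VALID_A = [a for a in range(1, 26) if gcd(a, 26) == 1]


def affine_encrypt(text, a, b):
    """y = (a*x + b) mod 26 for each letter A-Z; other characters pass through."""
    return "".join(
        chr((a * (ord(c) - 65) + b) % 26 + 65) if "A" <= c <= "Z" else c
        for c in text.upper()
    )


def affine_decrypt(text, a, b):
    """x = a^-1 * (y - b) mod 26, with a^-1 the modular inverse of a."""
    a_inv = pow(a, -1, 26)
    return "".join(
        chr((a_inv * (ord(c) - 65 - b)) % 26 + 65) if "A" <= c <= "Z" else c
        for c in text.upper()
    )


def all_candidates(ciphertext):
    """Every (a, b, plaintext) triple; a scorer would then rank these 312 results."""
    return [(a, b, affine_decrypt(ciphertext, a, b)) for a in VALID_A for b in range(26)]


ciphertext = affine_encrypt("AFFINE CIPHER", 5, 8)
print(ciphertext)                              # IHHWVC SWFRCP
print(len(all_candidates(ciphertext)))         # 312
```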
- Vigenère Cipher
This is the cryptanalytic centerpiece of the polyalphabetic additions. The decrypt.py implements the historical break in two stages: (a) find the key length via the index of coincidence and/or Kasiski examination (repeated-substring spacing), then (b) split the ciphertext into that many columns — each of which is now a simple Caesar cipher — and solve each column by frequency analysis. The structure shows why a longer key is stronger: more columns means less data per column, so the per-column statistics get noisier.
encrypt.py - the encryption class, standardized input.
decrypt.py - the brute force decryption class, standardized input.
encrypt_test.py - an LLM generated test case for encryption
decrypt_test.py - an LLM generated test case for decryption
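The two stages of the Vigenère break can be sketched like this. It is a simplified illustration rather than the repo's decrypt.py: it uses only the index of coincidence (no Kasiski examination), expects upper-case A-Z input with spaces removed, and the `threshold` of 0.06 and the frequency table are assumed, approximate values.

```python
from collections import Counter

# Approximate English letter frequencies in percent, A..Z (assumed values).
ENGLISH = [8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97, 0.15, 0.77,
           4.03, 2.41, 6.75, 7.51, 1.93, 0.10, 5.99, 6.33, 9.06, 2.76, 0.98,
           2.36, 0.15, 1.97, 0.07]


def index_of_coincidence(text):
    """Probability that two letters drawn from the text are the same."""
    n = len(text)
    if n < 2:
        return 0.0
    return sum(c * (c - 1) for c in Counter(text).values()) / (n * (n - 1))


def split_columns(text, period):
    """Column i holds every letter that was encrypted with key letter i."""
    return [text[i::period] for i in range(period)]


def guess_key_length(ciphertext, max_len=12, threshold=0.06):
    """Stage (a): first period whose columns look like English (IC near 0.066)."""
    averages = {}
    for period in range(1, max_len + 1):
        cols = split_columns(ciphertext, period)
        averages[period] = sum(index_of_coincidence(c) for c in cols) / period
        if averages[period] >= threshold:
            return period
    return max(averages, key=averages.get)


def solve_caesar_column(column):
    """Stage (b): shift whose decryption best fits English by chi-squared."""
    if not column:
        return 0
    total = sum(ENGLISH)
    counts = Counter(ord(ch) - 65 for ch in column)
    best_shift, best_chi = 0, float("inf")
    for shift in range(26):
        chi = 0.0
        for letter in range(26):
            expected = len(column) * ENGLISH[letter] / total
            observed = counts.get((letter + shift) % 26, 0)
            chi += (observed - expected) ** 2 / expected
        if chi < best_chi:
            best_shift, best_chi = shift, chi
    return best_shift


def recover_key(ciphertext, period):
    """Solve every column independently and join the shifts into a key."""
    return "".join(chr(65 + solve_caesar_column(c)) for c in split_columns(ciphertext, period))
```

Because every column is scored separately, short ciphertexts or long keys leave each column with too few letters for the chi-squared fit to be reliable, which is the noise effect described above.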
- Autokey Cipher
The pedagogical point of including Autokey next to Vigenère is the contrast in attacks. The Kasiski/index-of-coincidence attack fails here — the key stream is PRIMER + PLAINTEXT, so it never repeats and there is no period to find. Instead, the cipher is broken by exploiting that the key stream is itself English text: the decrypt.py brute forces short primers (the practical case) and scores the recovered plaintext for English-likeness. The legitimate decryption is also fun to step through: the first letters are decrypted with the primer, and each recovered plaintext letter then becomes the next key letter — the key stream "unzips" itself as you go.
encrypt.py - the encryption class, standardized input.
decrypt.py - the brute force decryption class, standardized input.
encrypt_test.py - an LLM generated test case for encryption
decrypt_test.py - an LLM generated test case for decryption
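The way the key stream "unzips" during legitimate Autokey decryption is easiest to see in code. The sketch below uses hypothetical function names and assumes upper-case A-Z input with no spaces and a non-empty primer; it is not the repo's class.

```python
A = ord("A")


def autokey_encrypt(plaintext, primer):
    """Key stream is PRIMER + PLAINTEXT, added letter by letter mod 26."""
    key_stream = primer + plaintext
    return "".join(
        chr((ord(p) - A + ord(k) - A) % 26 + A)
        for p, k in zip(plaintext, key_stream)
    )


def autokey_decrypt(ciphertext, primer):
    """Start with the primer; each recovered letter extends the key stream."""
    key_stream = list(primer)
    plaintext = []
    for i, c in enumerate(ciphertext):
        p = chr((ord(c) - ord(key_stream[i])) % 26 + A)
        plaintext.append(p)
        key_stream.append(p)  # the plaintext letter becomes a later key letter
    return "".join(plaintext)


ciphertext = autokey_encrypt("ATTACKATDAWN", "QUEENLY")
print(ciphertext)                                # QNXEPVYTWTWP
print(autokey_decrypt(ciphertext, "QUEENLY"))    # ATTACKATDAWN
```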
- Beaufort Cipher
Because Beaufort is reciprocal (P = (K - C) mod 26 has the same form as encryption), legitimate decryption is identical to encryption with the same key — a nice property to demonstrate alongside Atbash's self-inverse and RC4/ChaCha20's XOR symmetry later in the collection. The attack is the same period-finding approach as Vigenère, with each column solved subtractively rather than additively, reusing the same statistical machinery.
encrypt.py - the encryption class, standardized input.
decrypt.py - the brute force decryption class, standardized input.
encrypt_test.py - an LLM generated test case for encryption
decrypt_test.py - an LLM generated test case for decryption
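The reciprocal property of Beaufort means one function serves as both encryptor and decryptor. A minimal sketch, assuming upper-case A-Z input without spaces (the function name is hypothetical):

```python
def beaufort(text, key):
    """C = (K - P) mod 26. Running it again with the same key gives P back."""
    return "".join(
        chr((ord(key[i % len(key)]) - ord(ch)) % 26 + ord("A"))
        for i, ch in enumerate(text)
    )


ciphertext = beaufort("HELLO", "KEY")
print(ciphertext)                     # DANZQ
print(beaufort(ciphertext, "KEY"))    # HELLO
```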
- Rail Fence
The Rail Fence cipher can be brute forced by trying each possible number of rails, from 2 up to the length of the message. One difficulty is scoring the candidates: a transposition only rearranges the letters, so the plaintext letter frequencies are preserved and frequency analysis cannot tell the candidates apart. An exhaustive search is still feasible, since a text of length N can use at most N rails. However, pattern analysis will likely yield results more quickly.
encrypt.py - the encryption class, standardized input.
decrypt.py - the brute force decryption class, standardized input.
encrypt_test.py - an LLM generated test case for encryption
decrypt_test.py - an LLM generated test case for decryption
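The rail pattern and the exhaustive search over rail counts can be sketched as follows. This is an illustrative version with hypothetical function names; it covers only the "down" direction and assumes spaces have already been removed.

```python
def rail_pattern(length, rails):
    """Rail index for each position: 0, 1, ..., rails-1, ..., 1, 0, 1, ..."""
    if rails < 2:
        return [0] * length
    period = 2 * (rails - 1)
    return [min(i % period, period - i % period) for i in range(length)]


def read_order(length, rails):
    """Plaintext positions in the order they are read off, rail by rail."""
    pattern = rail_pattern(length, rails)
    return sorted(range(length), key=lambda i: pattern[i])  # stable sort


def rail_encrypt(text, rails):
    return "".join(text[i] for i in read_order(len(text), rails))


def rail_decrypt(ciphertext, rails):
    plain = [""] * len(ciphertext)
    for ch, i in zip(ciphertext, read_order(len(ciphertext), rails)):
        plain[i] = ch
    return "".join(plain)


def brute_force(ciphertext):
    """One candidate per rail count; a scorer would then pick the best."""
    return {r: rail_decrypt(ciphertext, r) for r in range(2, len(ciphertext) + 1)}


print(rail_encrypt("HELLOWORLD", 3))      # HOLELWRDLO
print(brute_force("HOLELWRDLO")[3])       # HELLOWORLD
```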
Encryption:
=== 3 Rails, Direction: down, Remove Spaces: True ===
Rail Fence cipher with 3 rails:
Direction: down
Text length: 10
Rail visualization:
Rail 0: H O L
Rail 1: E L W R D
Rail 2: L O
Rail pattern: [0, 1, 2, 1, 0, 1, 2, 1, 0, 1]
Original: 'HELLO WORLD'
Encrypted: 'HOLELWRDLO'
Decrypted: 'HELLOWORLD'
Match: True
=== 4 Rails, Direction: down, Remove Spaces: True ===
Rail Fence cipher with 4 rails:
Direction: down
Text length: 10
Rail visualization:
Rail 0: H O
Rail 1: E W R
Rail 2: L O L
Rail 3: L D
Rail pattern: [0, 1, 2, 3, 2, 1, 0, 1, 2, 3]
Original: 'HELLO WORLD'
Encrypted: 'HOEWRLOLLD'
Decrypted: 'HELLOWORLD'
Match: True
Decryption:
Detailed analysis of: 'HOLELWRDLO'
(Correct encryption of 'HELLOWORLD' with 3 rails)
=== Detailed Analysis: 2 Rails ===
Encrypted: 'HOLELWRDLO'
Decrypted: 'HWORLDELLO'
Score: -900.0
Text length: 10
Pattern period: 2
Rail distribution: {0: 5, 1: 5}
Pattern sample: [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
Rail visualization:
Rail 0: H O L E L
Rail 1: W R D L O
=== Detailed Analysis: 3 Rails ===
Encrypted: 'HOLELWRDLO'
Decrypted: 'HELLOWORLD'
Score: -870.0
Text length: 10
Pattern period: 4
Rail distribution: {0: 3, 1: 5, 2: 2}
Pattern sample: [0, 1, 2, 1, 0, 1, 2, 1, 0, 1]
Rail visualization:
Rail 0: H O L
Rail 1: E L W R D
Rail 2: L O
=== Detailed Analysis: 4 Rails ===
Encrypted: 'HOLELWRDLO'
Decrypted: 'HLWLREOLDO'
Score: -918.0
Text length: 10
Pattern period: 6
Rail distribution: {0: 2, 1: 3, 2: 3, 3: 2}
Pattern sample: [0, 1, 2, 3, 2, 1, 0, 1, 2, 3]
Rail visualization:
Rail 0: H O
Rail 1: L E L
Rail 2: W R D
Rail 3: L O
=== Detailed Analysis: 5 Rails ===
Encrypted: 'HOLELWRDLO'
Decrypted: 'HLWDOLREOL'
Score: -913.0
Text length: 10
Pattern period: 8
Rail distribution: {0: 2, 1: 3, 2: 2, 3: 2, 4: 1}
Pattern sample: [0, 1, 2, 3, 4, 3, 2, 1, 0, 1]
Rail visualization:
Rail 0: H O
Rail 1: L E L
Rail 2: W R
Rail 3: D L
Rail 4: O
A Feistel block cipher is neither pure substitution nor pure transposition: it repeatedly combines both over several rounds. It is treated as its own category here (rather than under Transposition) because that combined, multi-round product cipher structure is precisely what defines it, and it is the bridge from classical hand ciphers to modern computer-era cryptography.
A classical monoalphabetic substitution has a search space of 26! ≈ 4×10²⁶ keys, but a modern block cipher has a search space on the scale of 2¹²⁸ ≈ 3×10³⁸ (AES-128), which makes an exhaustive search infeasible on an average computer. Instead, more intelligent attacks must be used to recover the key. The decryption files included for this example only analyze a sample of the search space, based on user settings. This algorithm is still a good comparison because this structure is the backbone of modern-day encryption.
encrypt.py - the encryption class, standardized input.
decrypt.py - the brute force decryption class, standardized input.
encrypt_test.py - an LLM generated test case for encryption
decrypt_test.py - an LLM generated test case for decryption
decrypt_improved_test.py - an LLM generated test case for decryption
decrypt_improved.py - The 'improved' version of this algorithm was created by letting Claude AI make changes in order to make the decryption algorithm more accurate. It added a very specific dictionary so that the test cases passed more often; otherwise, the decryption typically fails. There are several group questions comparing and contrasting results and methods.
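The multi-round structure can be illustrated with a toy Feistel network. This is a sketch under stated assumptions, not the repo's encrypt.py: the round function here is a truncated SHA-256 hash standing in for the S-box and P-box shown in the configuration output, and the round keys are arbitrary byte strings chosen for the example.

```python
import hashlib


def round_function(half, round_key):
    """Stand-in round function (assumption): hash of round key + half block."""
    return hashlib.sha256(round_key + half).digest()[:len(half)]


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def feistel(block, round_keys):
    """Run the network once. Passing the round keys reversed undoes it."""
    if len(block) % 2:
        raise ValueError("block length must be even")
    mid = len(block) // 2
    left, right = block[:mid], block[mid:]
    for key in round_keys:
        # Each round: the right half moves left, the left half is masked.
        left, right = right, xor_bytes(left, round_function(right, key))
    return right + left  # final swap makes encryption and decryption symmetric


round_keys = [b"round-1", b"round-2", b"round-3"]
ciphertext = feistel(b"HELLO!!!", round_keys)
print(ciphertext.hex())
print(feistel(ciphertext, round_keys[::-1]))  # b'HELLO!!!'
```

Note that the round function never has to be inverted, which is why a Feistel design can use any function, even a one-way hash, inside its rounds.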
Encryption:
================================================================================
CONFIGURATION 1
================================================================================
Block Cipher Configuration:
Block size: 8 bytes
Number of rounds: 3
Key: 0123456789ABCDEF
Padding mode: PKCS7
Output format: hex
S-box (first 16 values): ['0xe4', '0x6', '0x4f', '0xce', '0x75', '0xb9', '0xf2', '0xa7', '0x9', '0x1e', '0xb4', '0xde', '0xe6', '0xd9', '0x88', '0x44']
P-box (bit positions): [0, 1, 3, 7, 4, 2, 5, 6]
Round keys:
Round 1: 2344658AAFC8E906
Round 2: 44678AA9C8EB0625
Round 3: 658AABCCE9062740
=== Testing Messages ===
'HELLO' -> '84448CA0F2AD4F7D' -> 'HELLO' (Match: True)
'HELLO WORLD' -> '84448CA0F254C2B0F4B508E07FB449A3' -> 'HELLO WORLD' (Match: True)
'THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG' -> '1D246C09677927883482F6C1F24362D44449BE097B7910811482C6A6DBE0EBF68444D0A0ED6B95D4C749B8E07FB449A3' -> 'THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG' (Match: True)
Decryption (attempt; there are no results here):
=== Analysis of Unknown Ciphertext ===
Ciphertext: A1B2C3D4E5F67890A1B2C3D4E5F67890A1B2C3D4E5F67890A1B2C3D4E5F67890A1B2C3D4E5F67890A1B2C3D4E5F67890A1B2C3D4E5F67890A1B2C3D4E5F67890
=== Comprehensive Block Cipher Cryptanalysis ===
Analyzing 8 blocks of 8 bytes each
============================================================
=== Statistical Randomness Tests ===
Chi-square test (byte frequency):
Chi-square statistic: 1922.00
Expected for random: ~255
Assessment: FAIL
Runs test (consecutive bytes):
Observed runs: 63
Expected runs: 31.5
Deviation: 1.000
Assessment: FAIL
Autocorrelation test (lag=1):
Correlation coefficient: 0.2946
Assessment: FAIL
============================================================
=== Frequency Analysis Attack ===
Analyzed 8 ciphertext blocks
Block frequency analysis:
a1b2c3d4e5f67890: 8 times (100.000%)
Repeated blocks (ECB vulnerability): 1
Byte frequency by position:
Position 0: 0xA1 appears 8 times (100.000%)
Position 1: 0xB2 appears 8 times (100.000%)
Position 2: 0xC3 appears 8 times (100.000%)
Position 3: 0xD4 appears 8 times (100.000%)
Position 4: 0xE5 appears 8 times (100.000%)
Position 5: 0xF6 appears 8 times (100.000%)
Position 6: 0x78 appears 8 times (100.000%)
Position 7: 0x90 appears 8 times (100.000%)
============================================================
=== Exhaustive Key Search (max 1000 keys) ===
Key space: 2^16 = 65,536 total keys
Testing only 1,000 keys for demonstration...
Found 0 candidate keys
- Polybius
The Polybius cipher uses a small grid (5x5 or 6x6), so candidate grids built from the standard alphabetical layout or from dictionary keywords can be searched exhaustively. The Polybius Square can be randomized, but there are still a limited number of cells, and each coordinate pair always stands for the same letter. Frequency analysis can therefore decode this cipher without reconstructing the original grid, which makes the decryption methods similar to the Monoalphabetic cipher shown earlier.
encrypt.py - the encryption class, standardized input.
decrypt.py - the brute force decryption class, standardized input.
encrypt_test.py - an LLM generated test case for encryption
decrypt_test.py - an LLM generated test case for decryption
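The link to the Monoalphabetic attack can be made concrete by treating each coordinate pair as a single symbol. The sketch below uses hypothetical helper names and is not the repo's decrypt.py; it ignores separators and relabels pairs with placeholder letters so that an ordinary substitution solver could take over.

```python
from collections import Counter
import string


def coordinate_pairs(ciphertext):
    """Pull out the digits and group them two at a time, ignoring separators."""
    digits = [ch for ch in ciphertext if ch.isdigit()]
    return ["".join(digits[i:i + 2]) for i in range(0, len(digits) - 1, 2)]


def relabel(pairs):
    """Give each distinct pair a placeholder letter in order of first appearance.

    The result is an ordinary monoalphabetic ciphertext: the grid layout no
    longer matters, only which positions share a symbol.
    """
    labels = {}
    for pair in pairs:
        if pair not in labels:
            labels[pair] = string.ascii_uppercase[len(labels)]
    return "".join(labels[pair] for pair in pairs)


pairs = coordinate_pairs("23 15 25 25 33 51 33 41 25 14")
print(Counter(pairs).most_common(2))   # [('25', 3), ('33', 2)]
print(relabel(pairs))                  # ABCCDEDFCG
```

The relabelled string has the same repetition pattern as HELLOWORLD, which is all a frequency or pattern attack needs. A 5x5 or 6x6 grid yields at most 25 or 36 distinct pairs; with 26 placeholder letters this sketch covers the 5x5 case.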
Encryption:
================================================================================
CONFIGURATION 1
================================================================================
Polybius Square (5x5):
Combined letters: IJ
Number base: 1 (1-based)
1 2 3 4 5
1 F U Y R H
2 Q Z X G V
3 P S K A L
4 N D T C B
5 W O E M
Coordinate examples:
A → 34
E → 53
M → 54
Z → 22
Grid Statistics:
grid_size: 5x5
total_positions: 25
filled_positions: 24
keyword_used: False
combine_letters: IJ
number_base: 1
separator: ' '
=== Testing Messages ===
'HELLO' -> '15 53 35 35 52' -> 'HELLO' (Match: True)
'HELLO WORLD' -> '15 53 35 35 52 51 52 14 35 42' -> 'HELLOWORLD' (Match: False)
Note: J→I substitution expected with IJ combination
'ATTACK AT DAWN' -> '34 43 43 34 44 33 34 43 42 34 51 41' -> 'ATTACKATDAWN' (Match: False)
Note: J→I substitution expected with IJ combination
================================================================================
CONFIGURATION 3
================================================================================
Polybius Square (5x5):
Combined letters: IJ
Number base: 0 (0-based)
0 1 2 3 4
0 B F C R A
1 S E T P O
2 K U X W L
3 N Q H Z M
4 D V Y G
Coordinate examples:
A → 04
E → 11
M → 34
Z → 33
Grid Statistics:
grid_size: 5x5
total_positions: 25
filled_positions: 24
keyword_used: False
combine_letters: IJ
number_base: 0
separator: '-'
=== Testing Messages ===
'HELLO' -> '32-11-24-24-14' -> 'HELLO' (Match: True)
'HELLO WORLD' -> '32-11-24-24-14- -23-14-03-24-40' -> 'HELLO WORLD' (Match: True)
'ATTACK AT DAWN' -> '04-12-12-04-02-20- -04-12- -40-04-23-30' -> 'ATTACK AT DAWN' (Match: True)
Decryption:
=== CREATING PROPER TEST CASES ===
Standard grid coordinate mapping:
H → 23
E → 15
L → 25
O → 33
W → 51
R → 41
D → 14
'HELLO WORLD' encrypts to: 23 15 25 25 33 51 33 41 25 14
SECRET grid coordinate mapping:
S → 11
E → 12
C → 13
R → 14
T → 15
'SECRET' encrypts to: 11 12 13 14 12 15
============================================================
TEST: Standard Grid - HELLO WORLD
============================================================
=== Ciphertext Analysis ===
Text: 23 15 25 25 33 51 33 41 25 14
Length: 29
Possible separators found: [' ']
Coordinate pairs found: 10
Sample pairs: ['23', '15', '25', '25', '33', '51', '33', '41', '25', '14']
Row coordinates: 1 to 5
Col coordinates: 1 to 5
Likely 1-based numbering
Suggested grid size: 5x5
Suggested number base: 1
Correct Grid Configuration:
Polybius Square (5x5):
Standard alphabetical arrangement
Combined letters: IJ
Number base: 1 (1-based)
1 2 3 4 5
1 A B C D E
2 F G H K L
3 M N O P Q
4 R S T U V
5 W X Y Z
Coordinate examples:
11 → A
15 → E
31 → M
54 → Z
Known decryption: 'HELLOWORLD'
Expected: 'HELLOWORLD'
Match: True
- ADFGVX
The two-step process of substitution followed by transposition (reversed when decrypting) makes this trickier to decode than the Polybius cipher, even though the two grids can be the same physical size. However, this algorithm can still be broken through an exhaustive search or frequency analysis.
encrypt.py - the encryption class, standardized input.
decrypt.py - the brute force decryption class, standardized input.
encrypt_test.py - an LLM generated test case for encryption
decrypt_test.py - an LLM generated test case for decryption
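The two steps can be sketched in a few lines. The grid below is copied from the seed-42 printouts in this section; the function names are hypothetical, and two behaviours are assumptions inferred from those printouts rather than read from encrypt.py: repeated keyword letters are dropped ('SECRET' is listed with a transposition length of 5), and short rows are padded with 'X'.

```python
LABELS = "ADFGVX"
GRID = [          # rows and columns are both labelled A D F G V X
    "JMF16T",
    "4ZW05K",
    "L2IUQ9",
    "GYA8NS",
    "C73VDX",
    "EOPRBH",
]


def substitute(message):
    """Step 1: replace each character by its row label + column label."""
    lookup = {
        ch: LABELS[r] + LABELS[c]
        for r, row in enumerate(GRID)
        for c, ch in enumerate(row)
    }
    return "".join(lookup[ch] for ch in message.upper() if ch in lookup)


def transpose(text, keyword, pad="X"):
    """Step 2: write row by row under the keyword, read columns alphabetically."""
    key = "".join(dict.fromkeys(keyword.upper()))  # drop repeats (assumption)
    width = len(key)
    if len(text) % width:
        text += pad * (width - len(text) % width)   # pad letter (assumption)
    columns = {k: text[i::width] for i, k in enumerate(key)}
    return [columns[k] for k in sorted(key)]


def adfgvx_encrypt(message, keyword):
    return "".join(transpose(substitute(message), keyword))


print(substitute("HELLO"))                 # XXXAFAFAXD
print(transpose("XXXAFAFAXD", "KEY"))      # ['XFAX', 'XAFD', 'XAXX']
print(adfgvx_encrypt("HELLO", "SECRET"))   # XAXFAXXAFD
```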
Encryption:
================================================================================
CONFIGURATION 1
================================================================================
ADFGVX Cipher Configuration:
Random Grid (seed: 42)
Transposition Keyword: 'SECRET'
Transposition Order: [4, 2, 1, 3, 5]
Substitution Grid (6x6):
A D F G V X
A J M F 1 6 T
D 4 Z W 0 5 K
F L 2 I U Q 9
G G Y A 8 N S
V C 7 3 V D X
X E O P R B H
Substitution Examples:
A → GF
E → XA
M → AD
Z → DD
5 → DV
9 → FX
Cipher Statistics:
cipher_type: ADFGVX
grid_size: 6x6
total_characters: 36
grid_keyword_used: False
grid_keyword: None
transposition_keyword: SECRET
transposition_length: 5
transposition_order: [4, 2, 1, 3, 5]
random_seed: 42
separator: ''
=== Testing Messages ===
'HELLO' -> 'XAXFAXXAFD'
'HELLO WORLD' -> 'XAXAXFFFAXDVXADGFDXV'
'ATTACK AT DAWN' -> 'AFGVGFGXVFXVFGVGXDXDAAAFX'
Decryption:
=== CREATING PROPER TEST CASES ===
Test cipher configuration:
ADFGVX Cipher Configuration:
Random Grid (seed: 42)
Transposition Keyword: 'KEY'
Transposition Order: [2, 1, 3]
Substitution Grid (6x6):
A D F G V X
A J M F 1 6 T
D 4 Z W 0 5 K
F L 2 I U Q 9
G G Y A 8 N S
V C 7 3 V D X
X E O P R B H
Coordinate examples:
GF → A
XA → E
AD → M
DD → Z
DV → 5
FX → 9
=== MANUAL ENCRYPTION TRACE (for testing) ===
Let's trace how 'HELLO' would be encrypted with this configuration:
Grid lookup (manual):
Character → Coordinate mapping:
H → XX
E → XA
L → FA
L → FA
O → XD
After substitution: 'XXXAFAFAXD'
Transposition keyword: 'KEY'
Alphabetical order: [2, 0, 1]
Padded substituted text: 'XXXAFAFAXDXX'
Row 1: X X X
Row 2: A F A
Row 3: F A X
Row 4: D X X
Column E: 'XFAX'
Column K: 'XAFD'
Column Y: 'XAXX'
Simulated encrypted result: 'XFAX XAFD XAXX'
=== TESTING DECRYPTION ===
Decrypted result: 'HELLO'
Original message: 'HELLO'
Match: True
=== STEP-BY-STEP DEMONSTRATION ===
=== ADFGVX CIPHER DECRYPTION DEMONSTRATION ===
Encrypted text: 'XFAX XAFD XAXX'
ADFGVX Cipher Configuration:
Random Grid (seed: 42)
Transposition Keyword: 'KEY'
Transposition Order: [2, 1, 3]
Substitution Grid (6x6):
A D F G V X
A J M F 1 6 T
D 4 Z W 0 5 K
F L 2 I U Q 9
G G Y A 8 N S
V C 7 3 V D X
X E O P R B H
Coordinate examples:
GF → A
XA → E
AD → M
DD → Z
DV → 5
FX → 9
=== STEP 1: REVERSE TRANSPOSITION ===
After reversing transposition: 'XXXAFAFAXD'
=== STEP 2: REVERSE SUBSTITUTION ===
Coordinate pairs to decode:
XX → H
XA → E
FA → L
FA → L
XD → O
Final decrypted text: 'HELLO'
Demo result: 'HELLO'
- Playfair
Playfair's key square has a 25! key space, so exhaustive search is hopeless even though the grid physically resembles Polybius. The realistic classical attack — implemented in decrypt.py — is a keyword dictionary search, scoring each candidate square by letter frequency. (A full break uses simulated annealing on digraph frequencies; that is noted in the code as a natural decrypt_improved-style extension.) One quirk worth showing a class: Playfair inserts pad letters during encryption (an X between doubled letters, a trailing X on odd-length messages), so a correctly recovered message may still contain stray X's. This is expected and is part of the lesson.
encrypt.py - the encryption class, standardized input.
decrypt.py - the brute force decryption class, standardized input.
encrypt_test.py - an LLM generated test case for encryption
decrypt_test.py - an LLM generated test case for decryption
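The pad-letter quirk comes from how the message is cut into digraphs before the key square is ever consulted. The sketch below shows only that preparation step, with a hypothetical function name; it assumes the common convention of merging J into I and using X as the filler, which may differ in detail from encrypt.py.

```python
def prepare_digraphs(message, filler="X"):
    """Split a message into Playfair letter pairs.

    A filler goes between doubled letters, and a trailing filler completes
    an odd-length message. (A doubled filler such as 'XX' is an edge case
    this sketch does not handle.)
    """
    letters = ["I" if c == "J" else c for c in message.upper() if "A" <= c <= "Z"]
    pairs = []
    i = 0
    while i < len(letters):
        first = letters[i]
        second = letters[i + 1] if i + 1 < len(letters) else None
        if second is None or second == first:
            pairs.append(first + filler)   # pad, and reuse the second letter next
            i += 1
        else:
            pairs.append(first + second)
            i += 2
    return pairs


print(prepare_digraphs("HELLO"))     # ['HE', 'LX', 'LO']
print(prepare_digraphs("BALLOON"))   # ['BA', 'LX', 'LO', 'ON']
print(prepare_digraphs("CAT"))       # ['CA', 'TX']
```

Decryption reverses the square lookup but cannot know which X's were fillers, so HELLO comes back as HELXLO.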
- One-Time Pad (OTP)
The OTP files are a proof demonstration rather than an attack demonstration, and they pair naturally with the resist-don't-break framing used for ChaCha20 and SHA-256:
encrypt.py - the encryption class (byte-wise XOR pad, plus a mod-26 letter variant for the historical form), standardized input.
decrypt.py - legitimate decryption with the pad, plus the two demonstrations below.
encrypt_test.py - an LLM generated test case for encryption
decrypt_test.py - an LLM generated test case for decryption
The two demonstrations in decrypt.py are the whole lesson:
- Perfect secrecy: brute-forcing the key produces every possible plaintext of the right length, each equally valid — the demo shows that the same ciphertext can be "decrypted" to any chosen message by some key, so the attacker gains zero information. This is Shannon's 1949 result made tangible.
- Key-reuse catastrophe: if one pad encrypts two messages, XORing the two ciphertexts cancels the key (c1 ^ c2 = p1 ^ p2), leaking the XOR of the two plaintexts, from which readable text can be recovered by guessing likely words. This is the classic "two-time pad" break that has sunk real systems (VENONA).
Together they bracket the OTP's strange position in the collection: the only provably unbreakable cipher, and one of the easiest to break in practice when its conditions are violated.
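Both demonstrations reduce to a few XOR operations. The following is a minimal sketch of the byte-wise variant with hypothetical names and made-up messages; it is not the repo's decrypt.py.

```python
import secrets


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


message = b"ATTACK AT DAWN"
pad = secrets.token_bytes(len(message))   # truly random, used once
ciphertext = xor_bytes(message, pad)

# Perfect secrecy: for ANY message of the same length there is a pad that
# "decrypts" the ciphertext to it, so the ciphertext alone rules nothing out.
decoy = b"RETREAT AT TEN"
decoy_pad = xor_bytes(ciphertext, decoy)
print(xor_bytes(ciphertext, decoy_pad))   # b'RETREAT AT TEN'

# Key reuse: encrypting a second message with the same pad cancels the key.
second = b"HOLD THE LINES"
ciphertext2 = xor_bytes(second, pad)
leak = xor_bytes(ciphertext, ciphertext2)
print(leak == xor_bytes(message, second))  # True: c1 ^ c2 = p1 ^ p2

# Anyone who knows or guesses one plaintext now reads the other.
print(xor_bytes(leak, message))            # b'HOLD THE LINES'
```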
- RC4
encrypt.py - the encryption class, standardized input.
decrypt.py - the brute force decryption class, standardized input.
encrypt_test.py - an LLM generated test case for encryption
decrypt_test.py - an LLM generated test case for decryption
key_comparison_demo.py - a demonstration of how dictionaries with partial keys in a brute force attack might perform
klein_demo.py - a demo of the Klein attack for real decryption of RC4
The decryption attempts below, run with key_comparison_demo.py, show that a brute force attempt over the 18 dictionary keys can decrypt a message from this algorithm simply by running down the list. Notice that a key made of a repeated dictionary word is also recovered (test 10, 'KEYKEYKEYKEYKEY', is found as 'KEY'): RC4's key schedule cycles through the key bytes, so whole repetitions of a word produce the same keystream as the word itself. Any key that deviates from the dictionary will most likely not be recovered.
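Both the symmetry of RC4 and the repeated-key effect can be reproduced with the textbook algorithm. The sketch below is a standard KSA plus PRGA written for illustration, not the repo's class; `looks_like_text` and `dictionary_attack` are hypothetical helpers, and the "all printable ASCII" success check is an assumption about how a hit might be recognised.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4. The same call encrypts and decrypts."""
    # Key Scheduling Algorithm: the key index wraps around with i % len(key).
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    # Pseudo-Random Generation Algorithm: XOR data with the keystream.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


def looks_like_text(data: bytes) -> bool:
    """Crude success test (assumption): every byte is printable ASCII."""
    return all(32 <= b < 127 for b in data)


def dictionary_attack(ciphertext: bytes, candidates):
    """Try each candidate key in order; return (key, plaintext, attempt) or None."""
    for attempt, key in enumerate(candidates, start=1):
        plaintext = rc4(key, ciphertext)
        if looks_like_text(plaintext):
            return key, plaintext, attempt
    return None


message = b"THIS is a LONGER PHRASE with mixed letters"
ciphertext = rc4(b"KEYKEYKEYKEYKEY", message)
print(rc4(b"KEY", ciphertext) == message)   # True: 'KEY' repeated behaves like 'KEY'
print(dictionary_attack(ciphertext, [b"ABC", b"123", b"KEY"]))
```

A key such as 'FOXFOXFOXFOX' would be recovered the same way if 'FOX' were in the candidate list, which is why the contents of the dictionary, not the length of the key, decide the outcome.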
Encryption:
RC4 State Information:
Key: 'SECRET'
=== Testing Messages ===
'HELLO' → D42B203340
'RC4' → CE2D58
'HELLO WORLD' → D42B203340007487B02AF8
RC4 State Information:
Key: 'ThisIsALongerSecretKey123'
=== Testing Messages ===
'HELLO' → 3099B29409
'RC4' → 2A9FCA
'HELLO WORLD' → 3099B2940970426389CECB
Cipher symmetry demo (not decryption):
============================================================
RC4 SYMMETRY DEMONSTRATION
============================================================
=== RC4 SYMMETRY DEMONSTRATION ===
Plaintext: 'SECRET'
Key: 'KEY'
--- Step 1: Encryption ---
=== RC4 Key Scheduling Algorithm (KSA) for Decryption ===
Key: 'KEY' = 4B4559
Key length: 3 bytes
Initial S-box: [0, 1, 2, ..., 255]
i= 0: j=(0+0+75)%256= 75, swap S[0]↔S[75]
i= 1: j=(76+1+69)%256=145, swap S[1]↔S[145]
i= 2: j=(147+2+89)%256=236, swap S[2]↔S[236]
i= 3: j=(-17+3+75)%256= 58, swap S[3]↔S[58]
i= 4: j=(62+4+69)%256=131, swap S[4]↔S[131]
i= 5: j=(136+5+89)%256=225, swap S[5]↔S[225]
i= 6: j=(-25+6+75)%256= 50, swap S[6]↔S[50]
i= 7: j=(57+7+69)%256=126, swap S[7]↔S[126]
Final S-box first 16 values: [28, 170, 236, 58, 131, 225, 50, 126, 223, 214, 96, 52, 61, 143, 246, 32]
Final S-box last 16 values: [173, 55, 99, 75, 196, 124, 102, 82, 91, 135, 87, 149, 23, 212, 114, 108]
=== RC4 Pseudo-Random Generation Algorithm (PRGA) for Decryption ===
Generating 6 keystream bytes...
Step | i | j | S[i] | S[j] | S[i]+S[j] | S[sum] | Keystream
-----------------------------------------------------------------
0 | 1 | 170 | 81 | 170 | 251 | 149 | 0x95
1 | 2 | 150 | 30 | 236 | 10 | 96 | 0x60
2 | 3 | 208 | 60 | 58 | 118 | 252 | 0xFC
3 | 4 | 83 | 31 | 131 | 162 | 84 | 0x54
4 | 5 | 52 | 230 | 225 | 199 | 182 | 0xB6
5 | 6 | 102 | 219 | 50 | 13 | 143 | 0x8F
Ciphertext: C625BF06F3DB
--- Step 2: Decryption ---
=== RC4 Key Scheduling Algorithm (KSA) for Decryption ===
Key: 'KEY' = 4B4559
Key length: 3 bytes
Initial S-box: [0, 1, 2, ..., 255]
i= 0: j=(0+0+75)%256= 75, swap S[0]↔S[75]
i= 1: j=(76+1+69)%256=145, swap S[1]↔S[145]
i= 2: j=(147+2+89)%256=236, swap S[2]↔S[236]
i= 3: j=(-17+3+75)%256= 58, swap S[3]↔S[58]
i= 4: j=(62+4+69)%256=131, swap S[4]↔S[131]
i= 5: j=(136+5+89)%256=225, swap S[5]↔S[225]
i= 6: j=(-25+6+75)%256= 50, swap S[6]↔S[50]
i= 7: j=(57+7+69)%256=126, swap S[7]↔S[126]
Final S-box first 16 values: [28, 170, 236, 58, 131, 225, 50, 126, 223, 214, 96, 52, 61, 143, 246, 32]
Final S-box last 16 values: [173, 55, 99, 75, 196, 124, 102, 82, 91, 135, 87, 149, 23, 212, 114, 108]
=== RC4 Pseudo-Random Generation Algorithm (PRGA) for Decryption ===
Generating 6 keystream bytes...
Step | i | j | S[i] | S[j] | S[i]+S[j] | S[sum] | Keystream
-----------------------------------------------------------------
0 | 1 | 170 | 81 | 170 | 251 | 149 | 0x95
1 | 2 | 150 | 30 | 236 | 10 | 96 | 0x60
2 | 3 | 208 | 60 | 58 | 118 | 252 | 0xFC
3 | 4 | 83 | 31 | 131 | 162 | 84 | 0x54
4 | 5 | 52 | 230 | 225 | 199 | 182 | 0xB6
5 | 6 | 102 | 219 | 50 | 13 | 143 | 0x8F
Decrypted: 'SECRET'
--- Verification ---
Keystream 1: 9560FC54B68F
Keystream 2: 9560FC54B68F
Keystreams identical: True
Decryption successful: True # NOT REAL DECRYPTION!! This re-used a function call to compare the two, but they SHOULD match because this algorithm is symmetric
Decryption with key_comparison_demo.py:
STEP 1: ENCRYPTING MESSAGES
==================================================
1. 'HELLO' + key 'KEY' → DD25B018F9
2. 'SECRET' + key 'ABC' → 9D611DC10AFA
3. 'TEST' + key '123' → 07B5ECD6
4. 'RC4' + key 'X' → B7C3C5
5. 'BRUTE' + key 'PASS' → C2D2A9DEB6
6. 'TESTTESTTESTTESTTESTTEST' + key 'PEANUTBUTTER' → B901B1D905B94619715026232C779904EE2A80F5B99DFDAA
7. 'THE QUICK BROWN FOX JUMPED OVER THE LAZY DOG' + key 'QUICKBROWNFOX' → 4D2F0712D8C84303DF0E8AEEFA73ACC5CEFBE9CA9CE3B9EACF309A80674C65E984DE5730BC6BA6F7636FA654
8. 'THE QUICK BROWN FOX JUMPED OVER THE LAZY DOG' + key 'FOXFOXFOXFOX' → 7447591A0418A1EA1FABAAAC2FE8B73423BCA23F6FD72419ABB46A9FBF396B2CC7C72983CDFE31D35A5BF81B
9. 'THE QUICK BROWN FOX JUMPED OVER THE LAZY DOG' + key 'F' → C11C78918F720AA0A34130F8A8894E81EA0C2A3EAD9A7F2F8504E094617CB50D3D7A1698606D38BDB2C8EB65
10. 'THIS is a LONGER PHRASE with mixed letters' + key 'KEYKEYKEYKEYKEY' → C128B50796E637D857A4B7C9BE18967F566530651DE0D0008151C5E0C2AAB12207559CB8F9F93AF3275B
11. 'THIS is a LONGER PHRASE with mixed letters' + key 'AMOREDIFFICULTKEY' → F01C043F1FE9CC68EAEA4C50DECBE16AA85C79D44C0DC1A93EF1CA16AFAE863D820D6036690A532CD666
STEP 2: BRUTE FORCE DECRYPTION ATTEMPTS
==================================================
1. Brute forcing: DD25B018F9
(Original: 'HELLO' with key 'KEY')
SUCCESS! Key 'KEY' → 'HELLO' (attempt 7)
2. Brute forcing: 9D611DC10AFA
(Original: 'SECRET' with key 'ABC')
SUCCESS! Key 'ABC' → 'SECRET' (attempt 8)
3. Brute forcing: 07B5ECD6
(Original: 'TEST' with key '123')
SUCCESS! Key '123' → 'TEST' (attempt 9)
4. Brute forcing: B7C3C5
(Original: 'RC4' with key 'X')
SUCCESS! Key 'X' → 'RC4' (attempt 4)
5. Brute forcing: C2D2A9DEB6
(Original: 'BRUTE' with key 'PASS')
SUCCESS! Key 'PASS' → 'BRUTE' (attempt 10)
6. Brute forcing: B901B1D905B94619715026232C779904EE2A80F5B99DFDAA
(Original: 'TESTTESTTESTTESTTESTTEST' with key 'PEANUTBUTTER')
NO KEY: Key not found in 18 attempts
7. Brute forcing: 4D2F0712D8C84303DF0E8AEEFA73ACC5CEFBE9CA9CE3B9EACF309A80674C65E984DE5730BC6BA6F7636FA654
(Original: 'THE QUICK BROWN FOX JUMPED OVER THE LAZY DOG' with key 'QUICKBROWNFOX')
NO KEY: Key not found in 18 attempts
8. Brute forcing: 7447591A0418A1EA1FABAAAC2FE8B73423BCA23F6FD72419ABB46A9FBF396B2CC7C72983CDFE31D35A5BF81B
(Original: 'THE QUICK BROWN FOX JUMPED OVER THE LAZY DOG' with key 'FOXFOXFOXFOX')
NO KEY: Key not found in 18 attempts
9. Brute forcing: C11C78918F720AA0A34130F8A8894E81EA0C2A3EAD9A7F2F8504E094617CB50D3D7A1698606D38BDB2C8EB65
(Original: 'THE QUICK BROWN FOX JUMPED OVER THE LAZY DOG' with key 'F')
NO KEY: Key not found in 18 attempts
10. Brute forcing: C128B50796E637D857A4B7C9BE18967F566530651DE0D0008151C5E0C2AAB12207559CB8F9F93AF3275B
(Original: 'THIS is a LONGER PHRASE with mixed letters' with key 'KEYKEYKEYKEYKEY')
SUCCESS! Key 'KEY' → 'THIS is a LONGER PHRASE with mixed letters' (attempt 7)
11. Brute forcing: F01C043F1FE9CC68EAEA4C50DECBE16AA85C79D44C0DC1A93EF1CA16AFAE863D820D6036690A532CD666
(Original: 'THIS is a LONGER PHRASE with mixed letters' with key 'AMOREDIFFICULTKEY')
NO KEY: Key not found in 18 attempts
Decryption (short printout), klein_demo.py:
NOTE: to keep this readable, everything is compared against the ground-truth value of each key byte. Pay close attention to the "appears X/Y times" counts, which are relative to the total number of samples.
This algorithm works on VERY SUBTLE differences. The output below shows how often the TARGET value occurred at each keystream position. In the first example, K appeared in position 0 only 391 times out of 100k, or 0.391 percent, which is what a uniformly random byte would give (1/256 ≈ 0.39 percent, a bias of 1.00x). For very small keys, there really wasn't a statistically noticeable difference.
Analyzing keystream position 0:
Key[0] = 0x4B ('K') appears 391/100000 times (prob: 0.0039, bias: 1.00x)
Key[1] = 0x45 ('E') appears 196/100000 times (prob: 0.0020, bias: 0.50x)
Key[2] = 0x59 ('Y') appears 488/100000 times (prob: 0.0049, bias: 1.25x)
Analyzing keystream position 1:
Key[0] = 0x4B ('K') appears 195/100000 times (prob: 0.0019, bias: 0.50x)
Analyzing keystream position 2:
Key[0] = 0x4B ('K') appears 195/100000 times (prob: 0.0019, bias: 0.50x)
Key[1] = 0x45 ('E') appears 194/100000 times (prob: 0.0019, bias: 0.50x)
Key[2] = 0x59 ('Y') appears 196/100000 times (prob: 0.0020, bias: 0.50x)
However, with longer keys some values began to stand out. That means the keystream is NOT perfectly random, and that there is something that can be pulled out of this algorithm. 450k samples were run, and some threshold values may need to be tuned, but several values are statistically significant.
The following is run over the first 16 bytes. Klein found that the strongest correlations appear in the first 16-32 bytes.
Analyzing keystream position 0:
Key[0] = 0x4C ('L') appears 1760/450000 times (prob: 0.0039, bias: 1.00x)
Key[1] = 0x4F ('O') appears 1758/450000 times (prob: 0.0039, bias: 1.00x)
Key[2] = 0x4E ('N') appears 1758/450000 times (prob: 0.0039, bias: 1.00x)
Key[3] = 0x47 ('G') appears 880/450000 times (prob: 0.0020, bias: 0.50x)
Key[4] = 0x50 ('P') appears 440/450000 times (prob: 0.0010, bias: 0.25x)
Key[5] = 0x41 ('A') appears 440/450000 times (prob: 0.0010, bias: 0.25x)
Key[6] = 0x53 ('S') appears 879/450000 times (prob: 0.0020, bias: 0.50x)
Key[7] = 0x53 ('S') appears 879/450000 times (prob: 0.0020, bias: 0.50x)
Key[8] = 0x57 ('W') appears 440/450000 times (prob: 0.0010, bias: 0.25x)
Key[9] = 0x4F ('O') appears 1758/450000 times (prob: 0.0039, bias: 1.00x)
Key[10] = 0x52 ('R') appears 440/450000 times (prob: 0.0010, bias: 0.25x)
Key[11] = 0x44 ('D') appears 39257/450000 times (prob: 0.0872, bias: 22.33x) # NOTE THIS VALUE!!
Analyzing keystream position 3:
Key[0] = 0x4C ('L') appears 1759/450000 times (prob: 0.0039, bias: 1.00x)
Key[1] = 0x4F ('O') appears 38819/450000 times (prob: 0.0863, bias: 22.08x)
Key[3] = 0x47 ('G') appears 439/450000 times (prob: 0.0010, bias: 0.25x)
Key[4] = 0x50 ('P') appears 439/450000 times (prob: 0.0010, bias: 0.25x)
Key[6] = 0x53 ('S') appears 2199/450000 times (prob: 0.0049, bias: 1.25x)
Key[7] = 0x53 ('S') appears 2199/450000 times (prob: 0.0049, bias: 1.25x)
Key[8] = 0x57 ('W') appears 440/450000 times (prob: 0.0010, bias: 0.25x)
Key[9] = 0x4F ('O') appears 38819/450000 times (prob: 0.0863, bias: 22.08x) # AND THIS VALUE!
Key[10] = 0x52 ('R') appears 440/450000 times (prob: 0.0010, bias: 0.25x)
Key[11] = 0x44 ('D') appears 1318/450000 times (prob: 0.0029, bias: 0.75x)
Analyzing keystream position 8:
Key[1] = 0x4F ('O') appears 1318/450000 times (prob: 0.0029, bias: 0.75x)
Key[2] = 0x4E ('N') appears 1317/450000 times (prob: 0.0029, bias: 0.75x)
Key[3] = 0x47 ('G') appears 879/450000 times (prob: 0.0020, bias: 0.50x)
Key[4] = 0x50 ('P') appears 1317/450000 times (prob: 0.0029, bias: 0.75x)
Key[5] = 0x41 ('A') appears 879/450000 times (prob: 0.0020, bias: 0.50x)
Key[6] = 0x53 ('S') appears 1756/450000 times (prob: 0.0039, bias: 1.00x)
Key[7] = 0x53 ('S') appears 1756/450000 times (prob: 0.0039, bias: 1.00x)
Key[8] = 0x57 ('W') appears 57570/450000 times (prob: 0.1279, bias: 32.75x) # LETTER NUMBER 3!
Key[9] = 0x4F ('O') appears 1318/450000 times (prob: 0.0029, bias: 0.75x)
Key[10] = 0x52 ('R') appears 439/450000 times (prob: 0.0010, bias: 0.25x)
Key[11] = 0x44 ('D') appears 440/450000 times (prob: 0.0010, bias: 0.25x)
Analyzing keystream position 15:
Key[0] = 0x4C ('L') appears 1317/450000 times (prob: 0.0029, bias: 0.75x)
Key[1] = 0x4F ('O') appears 56250/450000 times (prob: 0.1250, bias: 32.00x) #HERE!
Key[2] = 0x4E ('N') appears 440/450000 times (prob: 0.0010, bias: 0.25x)
Key[3] = 0x47 ('G') appears 440/450000 times (prob: 0.0010, bias: 0.25x)
Key[4] = 0x50 ('P') appears 880/450000 times (prob: 0.0020, bias: 0.50x)
Key[5] = 0x41 ('A') appears 1756/450000 times (prob: 0.0039, bias: 1.00x)
Key[6] = 0x53 ('S') appears 440/450000 times (prob: 0.0010, bias: 0.25x)
Key[7] = 0x53 ('S') appears 440/450000 times (prob: 0.0010, bias: 0.25x)
Key[8] = 0x57 ('W') appears 1756/450000 times (prob: 0.0039, bias: 1.00x)
Key[9] = 0x4F ('O') appears 56250/450000 times (prob: 0.1250, bias: 32.00x) # AND HERE!
Key[10] = 0x52 ('R') appears 879/450000 times (prob: 0.0020, bias: 0.50x)
Key[11] = 0x44 ('D') appears 1317/450000 times (prob: 0.0029, bias: 0.75x)
Doing a second analysis for the first 32 bytes and increasing the number of samples from 450k to 500k showed:
Analyzing keystream position 0:
Key[0] = 0x4C ('L') appears 1955/500000 times (prob: 0.0039, bias: 1.00x)
Key[1] = 0x4F ('O') appears 1952/500000 times (prob: 0.0039, bias: 1.00x)
Key[2] = 0x4E ('N') appears 1953/500000 times (prob: 0.0039, bias: 1.00x)
Key[3] = 0x47 ('G') appears 977/500000 times (prob: 0.0020, bias: 0.50x)
Key[4] = 0x50 ('P') appears 488/500000 times (prob: 0.0010, bias: 0.25x)
Key[5] = 0x41 ('A') appears 488/500000 times (prob: 0.0010, bias: 0.25x)
Key[6] = 0x53 ('S') appears 977/500000 times (prob: 0.0020, bias: 0.50x)
Key[7] = 0x53 ('S') appears 977/500000 times (prob: 0.0020, bias: 0.50x)
Key[8] = 0x57 ('W') appears 489/500000 times (prob: 0.0010, bias: 0.25x)
Key[9] = 0x4F ('O') appears 1952/500000 times (prob: 0.0039, bias: 1.00x)
Key[10] = 0x52 ('R') appears 488/500000 times (prob: 0.0010, bias: 0.25x)
Key[11] = 0x44 ('D') appears 43618/500000 times (prob: 0.0872, bias: 22.33x) # 22x HIGHER!
Analyzing keystream position 3:
Key[0] = 0x4C ('L') appears 1955/500000 times (prob: 0.0039, bias: 1.00x)
Key[1] = 0x4F ('O') appears 43130/500000 times (prob: 0.0863, bias: 22.08x)
Key[3] = 0x47 ('G') appears 488/500000 times (prob: 0.0010, bias: 0.25x)
Key[4] = 0x50 ('P') appears 488/500000 times (prob: 0.0010, bias: 0.25x)
Key[6] = 0x53 ('S') appears 2443/500000 times (prob: 0.0049, bias: 1.25x)
Key[7] = 0x53 ('S') appears 2443/500000 times (prob: 0.0049, bias: 1.25x)
Key[8] = 0x57 ('W') appears 489/500000 times (prob: 0.0010, bias: 0.25x)
Key[9] = 0x4F ('O') appears 43130/500000 times (prob: 0.0863, bias: 22.08x) # 22x HIGHER!
Key[10] = 0x52 ('R') appears 489/500000 times (prob: 0.0010, bias: 0.25x)
Key[11] = 0x44 ('D') appears 1465/500000 times (prob: 0.0029, bias: 0.75x)
Analyzing keystream position 8:
Key[1] = 0x4F ('O') appears 1465/500000 times (prob: 0.0029, bias: 0.75x)
Key[2] = 0x4E ('N') appears 1464/500000 times (prob: 0.0029, bias: 0.75x)
Key[3] = 0x47 ('G') appears 977/500000 times (prob: 0.0020, bias: 0.50x)
Key[4] = 0x50 ('P') appears 1464/500000 times (prob: 0.0029, bias: 0.75x)
Key[5] = 0x41 ('A') appears 977/500000 times (prob: 0.0020, bias: 0.50x)
Key[6] = 0x53 ('S') appears 1952/500000 times (prob: 0.0039, bias: 1.00x)
Key[7] = 0x53 ('S') appears 1952/500000 times (prob: 0.0039, bias: 1.00x)
Key[8] = 0x57 ('W') appears 63966/500000 times (prob: 0.1279, bias: 32.75x) # 32x HIGHER!
Key[9] = 0x4F ('O') appears 1465/500000 times (prob: 0.0029, bias: 0.75x)
Key[10] = 0x52 ('R') appears 488/500000 times (prob: 0.0010, bias: 0.25x)
Key[11] = 0x44 ('D') appears 489/500000 times (prob: 0.0010, bias: 0.25x)
Key[0] = 0x4C ('L') appears 976/500000 times (prob: 0.0020, bias: 0.50x)
Key[1] = 0x4F ('O') appears 976/500000 times (prob: 0.0020, bias: 0.50x)
Key[2] = 0x4E ('N') appears 488/500000 times (prob: 0.0010, bias: 0.25x)
Key[3] = 0x47 ('G') appears 3905/500000 times (prob: 0.0078, bias: 2.00x) #<- that would be worth investigating, but is likely too low
Key[4] = 0x50 ('P') appears 1953/500000 times (prob: 0.0039, bias: 1.00x)
Key[5] = 0x41 ('A') appears 1465/500000 times (prob: 0.0029, bias: 0.75x)
Key[6] = 0x53 ('S') appears 1465/500000 times (prob: 0.0029, bias: 0.75x)
Key[7] = 0x53 ('S') appears 1465/500000 times (prob: 0.0029, bias: 0.75x)
Key[9] = 0x4F ('O') appears 976/500000 times (prob: 0.0020, bias: 0.50x)
Key[11] = 0x44 ('D') appears 976/500000 times (prob: 0.0020, bias: 0.50x)
Analyzing keystream position 23:
Key[0] = 0x4C ('L') appears 976/500000 times (prob: 0.0020, bias: 0.50x)
Key[2] = 0x4E ('N') appears 977/500000 times (prob: 0.0020, bias: 0.50x)
Key[3] = 0x47 ('G') appears 977/500000 times (prob: 0.0020, bias: 0.50x)
Key[5] = 0x41 ('A') appears 977/500000 times (prob: 0.0020, bias: 0.50x)
Key[6] = 0x53 ('S') appears 1953/500000 times (prob: 0.0039, bias: 1.00x)
Key[7] = 0x53 ('S') appears 1953/500000 times (prob: 0.0039, bias: 1.00x)
Key[8] = 0x57 ('W') appears 42644/500000 times (prob: 0.0853, bias: 21.83x) # 21x HIGHER!
Key[11] = 0x44 ('D') appears 1465/500000 times (prob: 0.0029, bias: 0.75x)
Analyzing keystream position 24:
Key[0] = 0x4C ('L') appears 488/500000 times (prob: 0.0010, bias: 0.25x)
Key[1] = 0x4F ('O') appears 976/500000 times (prob: 0.0020, bias: 0.50x)
Key[2] = 0x4E ('N') appears 1466/500000 times (prob: 0.0029, bias: 0.75x)
Key[3] = 0x47 ('G') appears 976/500000 times (prob: 0.0020, bias: 0.50x)
Key[4] = 0x50 ('P') appears 62500/500000 times (prob: 0.1250, bias: 32.00x) #32x HIGHER!
Key[5] = 0x41 ('A') appears 1464/500000 times (prob: 0.0029, bias: 0.75x)
Key[8] = 0x57 ('W') appears 42643/500000 times (prob: 0.0853, bias: 21.83x) #21x HIGHER!
Key[9] = 0x4F ('O') appears 976/500000 times (prob: 0.0020, bias: 0.50x)
Key[10] = 0x52 ('R') appears 1465/500000 times (prob: 0.0029, bias: 0.75x)
Key[11] = 0x44 ('D') appears 977/500000 times (prob: 0.0020, bias: 0.50x)
Analyzing keystream position 28:
Key[0] = 0x4C ('L') appears 1464/500000 times (prob: 0.0029, bias: 0.75x)
Key[1] = 0x4F ('O') appears 1464/500000 times (prob: 0.0029, bias: 0.75x)
Key[2] = 0x4E ('N') appears 976/500000 times (prob: 0.0020, bias: 0.50x)
Key[3] = 0x47 ('G') appears 1467/500000 times (prob: 0.0029, bias: 0.75x)
Key[5] = 0x41 ('A') appears 976/500000 times (prob: 0.0020, bias: 0.50x)
Key[6] = 0x53 ('S') appears 977/500000 times (prob: 0.0020, bias: 0.50x)
Key[7] = 0x53 ('S') appears 977/500000 times (prob: 0.0020, bias: 0.50x)
Key[8] = 0x57 ('W') appears 42643/500000 times (prob: 0.0853, bias: 21.83x) #21x HIGHER!
Key[9] = 0x4F ('O') appears 1464/500000 times (prob: 0.0029, bias: 0.75x)
Key[10] = 0x52 ('R') appears 1954/500000 times (prob: 0.0039, bias: 1.00x)
Key[11] = 0x44 ('D') appears 976/500000 times (prob: 0.0020, bias: 0.50x)
Analyzing keystream position 30:
Key[0] = 0x4C ('L') appears 1465/500000 times (prob: 0.0029, bias: 0.75x)
Key[1] = 0x4F ('O') appears 1465/500000 times (prob: 0.0029, bias: 0.75x)
Key[3] = 0x47 ('G') appears 1465/500000 times (prob: 0.0029, bias: 0.75x)
Key[4] = 0x50 ('P') appears 1954/500000 times (prob: 0.0039, bias: 1.00x)
Key[5] = 0x41 ('A') appears 977/500000 times (prob: 0.0020, bias: 0.50x)
Key[6] = 0x53 ('S') appears 42156/500000 times (prob: 0.0843, bias: 21.58x) #21x HIGHER!
Key[7] = 0x53 ('S') appears 42156/500000 times (prob: 0.0843, bias: 21.58x) #21x HIGHER!
Key[8] = 0x57 ('W') appears 976/500000 times (prob: 0.0020, bias: 0.50x)
Key[9] = 0x4F ('O') appears 1465/500000 times (prob: 0.0029, bias: 0.75x)
Key[10] = 0x52 ('R') appears 2930/500000 times (prob: 0.0059, bias: 1.50x)
Key[11] = 0x44 ('D') appears 2928/500000 times (prob: 0.0059, bias: 1.50x)
Analyzing keystream position 31:
Key[0] = 0x4C ('L') appears 1465/500000 times (prob: 0.0029, bias: 0.75x)
Key[1] = 0x4F ('O') appears 978/500000 times (prob: 0.0020, bias: 0.50x)
Key[2] = 0x4E ('N') appears 4393/500000 times (prob: 0.0088, bias: 2.25x) <- that is worth looking at a little closer, but too low individually
Key[3] = 0x47 ('G') appears 1464/500000 times (prob: 0.0029, bias: 0.75x)
Key[4] = 0x50 ('P') appears 488/500000 times (prob: 0.0010, bias: 0.25x)
Key[8] = 0x57 ('W') appears 977/500000 times (prob: 0.0020, bias: 0.50x)
Key[9] = 0x4F ('O') appears 978/500000 times (prob: 0.0020, bias: 0.50x)
Key[10] = 0x52 ('R') appears 2441/500000 times (prob: 0.0049, bias: 1.25x)
Key[11] = 0x44 ('D') appears 977/500000 times (prob: 0.0020, bias: 0.50x)
So we start feeling pretty good about the following parts of the key:
- Key[1] = 0x4F ('O') appears 62500/500000 times (prob: 0.1250, bias: 32.00x) # 32x HIGHER!
- Key[4] = 0x50 ('P') appears 62500/500000 times (prob: 0.1250, bias: 32.00x) #32x HIGHER!
- Key[6] = 0x53 ('S') appears 42156/500000 times (prob: 0.0843, bias: 21.58x) #21x HIGHER!
- Key[7] = 0x53 ('S') appears 42156/500000 times (prob: 0.0843, bias: 21.58x) #21x HIGHER!
- Key[8] = 0x57 ('W') appears 63966/500000 times (prob: 0.1279, bias: 32.75x) # 32x HIGHER!, 21x HIGHER!, 21x HIGHER!, 21x HIGHER! (this many? This is W.)
- Key[9] = 0x4F ('O') appears 43130/500000 times (prob: 0.0863, bias: 22.08x) # 22x HIGHER!, 32x HIGHER!
- Key[11] = 0x44 ('D') appears 43618/500000 times (prob: 0.0872, bias: 22.33x) # 22x HIGHER!
KEY PROGRESS: ?O??P?SSWO?D
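As a rough illustration of how the key-progress string can be assembled from the printouts above, here is a minimal sketch. The counts are copied from the 500k-sample run; the function names and the 10x cut-off are assumptions made for this example and are not taken from klein_demo.py.

```python
UNIFORM = 1 / 256     # chance of any single byte value in a perfectly random keystream
THRESHOLD = 10.0      # assumed cut-off for "stands out"; would need tuning on real data

# (key index, character, count, samples) copied from the 500k printouts above
observations = [
    (1, "O", 62500, 500000),
    (2, "N", 4393, 500000),
    (3, "G", 3905, 500000),
    (4, "P", 62500, 500000),
    (6, "S", 42156, 500000),
    (7, "S", 42156, 500000),
    (8, "W", 63966, 500000),
    (9, "O", 43130, 500000),
    (11, "D", 43618, 500000),
]


def bias(count, samples):
    """Observed frequency divided by the uniform 1/256 baseline."""
    return (count / samples) / UNIFORM


def key_progress(observations, key_length, threshold=THRESHOLD):
    """Fill in only the key positions whose bias clears the threshold."""
    guess = ["?"] * key_length
    for index, char, count, samples in observations:
        if bias(count, samples) >= threshold:
            guess[index] = char
    return "".join(guess)


print(round(bias(43618, 500000), 2))     # 22.33
print(key_progress(observations, 12))    # ?O??P?SSWO?D
```

The two weaker observations ('N' at 2.25x and 'G' at 2.00x) fall below the assumed threshold, so their positions stay as '?'.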
And we've got some suspicions of two positions:
- Key[2] = 0x4E ('N') appears 4393/500000 times (prob: 0.0088, bias: 2.25x) <- that is worth looking at a little closer, but too low individually
- Key[3] = 0x47 ('G') appears 3905/500000 times (prob: 0.0078, bias: 2.00x) #<- that would be worth investigating, but is likely too low
But that doesn't mean we've found our key! We're working from an inherently biased point of view because klein_demo.py compares the results to the ground truth (our secret key). In a real situation, we would look at the statistical significance over those first 16-32 bytes (or longer), and use that to choose the characters with the highest probability. Those instances where 'O' is in position 9 43130 times out of 500000 (8.63%)? That is highly statistically significant, even if it doesn't happen every time. Those letters are likely to be part of the key.
The situations where 'N' is in position 2 and 'G' is in position 3? Those may not be statistically significant at all. Mathematically, you would analyze which characters are more significant and collect a few of those to try, because if you have as much data as it takes to run a Klein attack, then you have enough data to test out a few keys and see if you've decrypted it.
However, also consider that if a key is generated by a person rather than a computer it's going to have some patterns. So for something that looks like "?O??P?SSWO?D", where the last part looks a lot like "PASSWORD"? You may be able to look at the key and guess.
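The "look at the key and guess" step can also be sketched in code: treat the partial key as a pattern and filter a list of guesses against it. The candidate list below is hypothetical and exists only for illustration; a real attempt would draw on a proper wordlist.

```python
import re


def matching_candidates(pattern, candidates):
    """Return the candidates that fit a partial key where '?' means unknown."""
    regex = re.compile("".join("." if c == "?" else re.escape(c) for c in pattern))
    return [word for word in candidates if regex.fullmatch(word)]


# Hypothetical guesses a person might try after seeing the partial key
candidates = [
    "LONGPASSWORD",
    "GOODPASSWORD",
    "HOLDPASSWORDS",
    "PASSWORD1234",
    "NOTAPASSWORD",
]

print(matching_candidates("?O??P?SSWO?D", candidates))
# ['LONGPASSWORD', 'GOODPASSWORD', 'NOTAPASSWORD']
```

More than one guess can fit the pattern, so each surviving candidate still has to be tried against the ciphertext.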
- ChaCha20
- encrypt.py - the encryption class, standardized input.
- decrypt.py - the brute force decryption class, standardized input.
- encrypt_test.py - an LLM generated test case for encryption
- decrypt_test.py - an LLM generated test case for decryption, shows highlights of the UNBROKEN algorithm
- alg_analysis.py - LLM generated techniques for analysis of methods that could be used to detect anomalies in this algorithm (DOES NOT BRUTE FORCE DECRYPT)
The ChaCha20 cipher has not been broken (unlike RC4) and is resilient against related-key attacks. It is a popular algorithm and is included in this set as a contrast with the other stream ciphers. Interestingly, the ChaCha family of ciphers uses both a Key and a Nonce. The Key can be reused, but each Nonce (remember this as "number used once") must be used only once with a given Key! This is part of what makes it strong against related-key attacks.
The decryption highlight below shows several steps for working the encryption process backwards (decrypt_test.py), but it is not a brute force attack. ChaCha20 is a symmetric stream cipher: encryption and decryption use the same key and are the same operation, because both XOR the data with the same keystream. Thanks in part to the combination of a KEY and a single-use NONCE, ChaCha20 remains unbroken.
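To make the XOR symmetry and the single-use nonce rule concrete, here is a minimal sketch. The keystream generator is a stand-in built from SHA-256 and is NOT ChaCha20; the function names are assumptions for this example and do not come from the repo's encrypt.py. What matters is the behavior every XOR stream cipher shares.

```python
import hashlib


def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in keystream (NOT ChaCha20): SHA-256 over key, nonce and a block counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "little")).digest()
        counter += 1
    return out[:length]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def stream_xor(data: bytes, key: bytes, nonce: bytes) -> bytes:
    """Encrypts or decrypts: the same call does both."""
    return xor_bytes(data, toy_keystream(key, nonce, len(data)))


key, nonce = b"SECRET", b"nonce123"

# 1) Encryption and decryption are the same operation.
ciphertext = stream_xor(b"HELLO WORLD", key, nonce)
roundtrip = stream_xor(ciphertext, key, nonce)
wrong_nonce = stream_xor(ciphertext, key, b"nonce124")
print(roundtrip, wrong_nonce != b"HELLO WORLD")

# 2) Reusing a nonce with the same key leaks the XOR of the two plaintexts.
p1, p2 = b"ATTACK AT DAWN", b"RETREAT AT TEN"
c1 = stream_xor(p1, key, nonce)
c2 = stream_xor(p2, key, nonce)
leak_matches = xor_bytes(c1, c2) == xor_bytes(p1, p2)
print(leak_matches)
```

In the second half the keystream cancels out of c1 XOR c2, which is why a (key, nonce) pair should never encrypt two different messages.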
Encryption:
ChaCha20 State Information:
Key: 'SECRET' (6 chars)
Nonce: 'nonce123' (8 chars)
Counter: 0
Output format: HEX
Initialized: False
=== Testing Messages ===
'HELLO' → 050124EBD0
'HELLO WORLD!' → 050124EBD05543194917D4B7
'THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG' → 190C2D87CE205D15507BD2C4195C4FDFBFC39A61DA088D44F72D7CD10593999D27322AA2766BDC25ED0BEF
Decryption Highlights:
--- Basic Short Message ---
Original message: 'HELLO WORLD'
Message length: 11 characters
Encrypted: 72BE1AF8C5FB7AC40C7401
Ciphertext length: 22 hex characters (11 bytes)
Using base test case: Basic Short Message 'HELLO WORLD'
Correct parameters: key='TESTKEY', nonce='testnonce', counter=0
--- Testing Wrong Key ---
With wrong key: '8A31BDEF5664FB892A11AC'
❌ Should produce garbage (not original message)
--- Testing Wrong Nonce ---
With wrong nonce: '14B6B381BCFD98DDCCE7AA'
❌ Should produce garbage (not original message)
--- Testing Wrong Counter ---
With wrong counter: '3C120D609C7FE1DFE20251'
❌ Should produce garbage (not original message)
The asymmetric examples differ from everything above in a fundamental way: there is no shared secret. Security comes from a computational asymmetry: a problem that is cheap to compute in one direction and infeasible to reverse. The teaching value is watching that asymmetry directly: the forward operation runs instantly, while the "attack" file shows the reverse operation's cost climbing as the key size grows. Both examples deliberately use tiny parameters so the attack actually succeeds in front of a class; the lesson is in the scaling, not the break.
- RSA
The encryption (c = m^e mod n) and legitimate decryption (m = c^d mod n) are both fast modular exponentiations. The attacker, holding only the public key (e, n), must factor n. The files demonstrate this directly:
- encrypt.py - generates the keypair and encrypts with the public key; includes a determinism demo showing how repeated plaintext characters produce identical ciphertext (the textbook-RSA weakness).
- decrypt.py - both the legitimate private-key decryption and the naive attack (trial-division factoring of n), with a cost estimator that reports the ~sqrt(n) effort.
- decrypt_improved.py - smarter factoring (Fermat's method, Pollard's rho) for the naive-vs-intelligent comparison, exactly paralleling the monoalphabetic and block decrypt_improved files.
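The three steps described above (fast encryption, fast legitimate decryption, slow attack by factoring) can be sketched with a tiny keypair. The primes 61 and 53 and exponent 17 are a common textbook choice assumed here for illustration; the function names are hypothetical and are not the repo's API. The modular inverse via pow(e, -1, phi) needs Python 3.8 or later.

```python
from math import isqrt

# Tiny textbook parameters (assumed for illustration only)
p, q = 61, 53
n = p * q                              # 3233, the public modulus
e = 17                                 # public exponent
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent: 2753


def rsa_encrypt(m: int) -> int:
    return pow(m, e, n)                # c = m^e mod n


def rsa_decrypt(c: int, private_exponent: int) -> int:
    return pow(c, private_exponent, n) # m = c^d mod n


def trial_division(modulus: int):
    """Naive attack: try every candidate factor up to sqrt(modulus)."""
    steps = 0
    for candidate in range(2, isqrt(modulus) + 1):
        steps += 1
        if modulus % candidate == 0:
            return candidate, modulus // candidate, steps
    raise ValueError("no factor found; modulus is prime")


# Determinism: the repeated 'L' in HELLO encrypts to the same value twice.
ciphertext = [rsa_encrypt(ord(ch)) for ch in "HELLO"]
print(ciphertext[2] == ciphertext[3])  # True

# The attacker holds only (e, n): factor n, rebuild d, decrypt.
fp, fq, steps = trial_division(n)
d_recovered = pow(e, -1, (fp - 1) * (fq - 1))
recovered = "".join(chr(rsa_decrypt(c, d_recovered)) for c in ciphertext)
print(fp, fq, steps, recovered)        # 53 61 52 HELLO
```

The step count grows with sqrt(n), so every extra pair of digits in n multiplies the attacker's work by about ten while leaving encryption essentially as fast as before.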
A nice classroom result: with primes chosen close together, Fermat's method factors n in a single iteration regardless of key size — a concrete lesson that careless prime selection defeats RSA no matter how large the modulus.
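A minimal sketch of Fermat's method shows why close primes are dangerous. The example primes are assumptions chosen for illustration: 10007 and 10009 sit next to each other, while 101 and 10007 are far apart. The function name is hypothetical.

```python
from math import isqrt


def fermat_factor(n: int):
    """Search for a, b with n = a^2 - b^2 = (a - b)(a + b); n must be odd."""
    a = isqrt(n)
    if a * a < n:
        a += 1                      # start at ceil(sqrt(n))
    iterations = 0
    while True:
        iterations += 1
        b_squared = a * a - n
        b = isqrt(b_squared)
        if b * b == b_squared:      # perfect square found, so n factors
            return a - b, a + b, iterations
        a += 1


close = fermat_factor(10007 * 10009)   # primes two apart
far = fermat_factor(101 * 10007)       # primes far apart
print(close)                           # (10007, 10009, 1)
print(far[:2], far[2] > 1000)          # (101, 10007) True
```

The method starts at sqrt(n) and walks upward toward the average of the two primes, so the closer they are, the sooner it lands.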
- ECDH
ECDH agrees on a shared secret over a public channel. The forward direction (scalar * point via double-and-add) is cheap; recovering the scalar from the resulting point (the elliptic-curve discrete log) is the hard direction.
- encrypt.py - builds the curve, runs the full Alice/Bob exchange, and can enumerate every point in the finite group so students see the whole structure at once.
- decrypt.py - the naive discrete-log attack (add G repeatedly until the public point is reached), plus shared-secret reconstruction showing that recovering one private scalar breaks the whole exchange.
- decrypt_improved.py - Baby-Step Giant-Step, the meet-in-the-middle method that recovers the scalar in ~sqrt(order) steps. The step counts make the contrast vivid: naive search of scalar k takes about k steps, while BSGS resolves even large k in a handful of giant strides.
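The exchange and both attacks can be sketched on a tiny curve. The curve y^2 = x^3 + 2x + 2 over F_17 with G = (5, 1) is a common textbook example whose group has 19 points; it, the private scalars, and all function names are assumptions for this sketch, not the repo's parameters.

```python
from math import isqrt

P_MOD, A = 17, 2        # curve y^2 = x^3 + 2x + 2 over F_17 (assumed teaching curve)
G = (5, 1)              # generator point
ORDER = 19              # number of points in the group generated by G


def add(p1, p2):
    """Add two curve points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None                                  # p2 is the negative of p1
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    slope %= P_MOD
    x3 = (slope * slope - x1 - x2) % P_MOD
    return x3, (slope * (x1 - x3) - y1) % P_MOD


def mul(k, point):
    """Double-and-add scalar multiplication (the cheap forward direction)."""
    result, addend = None, point
    while k:
        if k & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k >>= 1
    return result


def naive_dlog(target):
    """Add G repeatedly until the target is reached: about k steps."""
    point, k = G, 1
    while point != target:
        point, k = add(point, G), k + 1
    return k


def bsgs(target, bound):
    """Baby-Step Giant-Step: about 2*sqrt(bound) steps. Returns (k, steps)."""
    m = isqrt(bound) + 1
    baby, point = {}, None
    for j in range(m):                               # baby steps: j*G for j < m
        baby.setdefault(point, j)
        point = add(point, G)
    stride = mul(m, G)
    stride = None if stride is None else (stride[0], -stride[1] % P_MOD)  # -(m*G)
    gamma = target
    for i in range(m):                               # giant steps: target - i*m*G
        if gamma in baby:
            return i * m + baby[gamma], m + i
        gamma = add(gamma, stride)
    return None


alice_secret, bob_secret = 13, 17                    # hypothetical private scalars
alice_public, bob_public = mul(alice_secret, G), mul(bob_secret, G)
shared_alice = mul(alice_secret, bob_public)
shared_bob = mul(bob_secret, alice_public)
print(shared_alice == shared_bob)                    # True

# Recovering one private scalar is enough to rebuild the shared secret.
cracked = naive_dlog(alice_public)
print(cracked, mul(cracked, bob_public) == shared_alice)
print(bsgs(bob_public, ORDER))
```

On a 19-point group the gap between the two attacks is small; the point of the sketch is that the naive loop scales with k while the BSGS loops scale with sqrt(ORDER).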
SHA-256 has no key and no inverse, so the usual encrypt/decrypt framing does not apply. The three files instead demonstrate correctness and resistance:
- hash.py - the full FIPS 180-4 algorithm from scratch (padding, 64-word message schedule, 64-round compression function).
- hash_test.py - drives the hash and, crucially, verifies its output bit-for-bit against the published NIST test vectors. A from-scratch hash is only trustworthy if it matches the standard exactly, so this known-answer check is the heart of the test file.
- analysis.py - a resist-don't-break demonstration (in the spirit of the ChaCha20 alg_analysis.py). Rather than attempting to reverse the hash, it makes the security properties visible:
  - Avalanche effect: flipping a single input bit changes about half of the 256 output bits. In testing this averaged ~50.8%, almost exactly the ideal 50%.
  - Determinism and sensitivity: identical inputs always produce identical digests; near-identical inputs produce completely unrelated ones.
  - Preimage / collision cost: a small brute-force search (e.g. finding a digest with a 16-zero-bit prefix) takes tens of thousands of attempts (about 2^16 = 65,536 on average), setting up the scaling argument for why a full preimage (2^256) or collision (2^128, the birthday bound) is infeasible.
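The avalanche and partial-preimage measurements can be approximated with the standard library's hashlib rather than the from-scratch hash.py. This is a sketch under that assumption; the function names and the test message are hypothetical.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Number of bit positions where two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def avalanche_fraction(message: bytes) -> float:
    """Flip each input bit in turn; return the average fraction of output bits changed."""
    base = hashlib.sha256(message).digest()
    total_bits = len(message) * 8
    changed = 0
    for i in range(total_bits):
        flipped = bytearray(message)
        flipped[i // 8] ^= 1 << (i % 8)
        changed += bit_difference(base, hashlib.sha256(bytes(flipped)).digest())
    return changed / (total_bits * 256)


def zero_prefix_search(bits: int):
    """Hash successive counters until the digest starts with `bits` zero bits."""
    attempts = 0
    while True:
        attempts += 1
        digest = hashlib.sha256(str(attempts).encode()).digest()
        if int.from_bytes(digest, "big") >> (256 - bits) == 0:
            return attempts, digest.hex()


fraction = avalanche_fraction(b"HELLO WORLD")
print(f"average output bits changed: {fraction:.1%}")

attempts, digest_hex = zero_prefix_search(16)
print(attempts, digest_hex[:8])
```

Each extra zero bit demanded of the prefix doubles the expected number of attempts, which is the scaling argument behind the 2^256 preimage figure.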
A deliberate teaching caveat lives in these files: a fast hash like raw SHA-256 is the wrong tool for password storage — that calls for a slow, salted construction (bcrypt, scrypt, Argon2). This heads off a common student misconception, since password hashing is one of the motivating examples in the introduction above.
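For contrast with raw SHA-256, a slow salted construction can be sketched with hashlib.scrypt from the standard library (available when Python is built against an OpenSSL that supports scrypt). The cost parameters below are commonly used illustrative values, not a recommendation from this repo, and the function names are hypothetical.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes = None):
    """Derive a slow, salted hash. Returns (salt, digest)."""
    if salt is None:
        salt = os.urandom(16)              # fresh random salt per password
    digest = hashlib.scrypt(
        password.encode(),
        salt=salt,
        n=2**14,                           # CPU/memory cost (assumed value)
        r=8,                               # block size
        p=1,                               # parallelism
        dklen=32,
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))   # True
print(verify_password("wrong guess", salt, stored))     # False
```

The salt makes identical passwords hash differently, and the cost parameters make each guess expensive, which is exactly what a fast hash does not provide.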
On "resist-don't-break": several files in this repo demonstrate cryptanalysis by succeeding — the Caesar brute-forcer, the RC4 key search, the RSA factoring attack — where the payoff is the moment the cipher falls. A resist-don't-break demonstration is the opposite: it runs an analysis that is meant to fail, and the lesson lives in watching the algorithm hold and in understanding why the attack gains no traction. This is the only honest framing for modern primitives, where no break exists at realistic parameters; it is also the framing that keeps such demonstrations safe to publish, since an analysis that shows resistance at teaching scale has no use as an attack tool. The ChaCha20 and SHA-256 analysis files are the clearest examples.
The rest of the hash family
SHA-256 is the worked example above, but the hashing directory holds four companions, each with the same hash.py / hash_test.py / analysis.py structure:
- Merkle–Damgård (teaching construction): hash.py is a deliberately transparent implementation of the construction itself: pad with the message length encoded ("MD strengthening"), split into blocks, and chain a compression function from a fixed IV. The theorem (Merkle, Damgård, 1989) is that if the compression function is collision-resistant, the whole hash is, which is why hash design reduces to compression-function design. analysis.py demonstrates the construction's inherited flaw live: the length-extension attack, possible because the output is the final chaining value.
- MD5: follows RFC 1321 and is verified against hashlib. analysis.py tells the collision story (Wang et al. 2004, the rogue CA certificate, Flame) and includes a real collision pair, so students can hash two different inputs and watch the digests match.
- SHA-1: follows FIPS 180-4 and is verified against hashlib. analysis.py covers SHATTERED (2017) and the migration lesson: how long a wounded primitive can limp on in deployed systems after "no attack yet" stops being true.
- CRC32: the non-cryptographic counterexample, verified against zlib.crc32. analysis.py demonstrates why it is not a cryptographic hash: its linearity over GF(2) (CRC(a^b) = CRC(a) ^ CRC(b) ^ CRC(0)) and the trivial forgery that linearity allows. It is excellent at its actual job (detecting accidental errors) and useless against an adversary, which is the cleanest way to make the checksum/hash distinction stick.
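The Merkle–Damgård construction and its length-extension flaw, as described in the first bullet above, can be sketched with a toy hash. Everything here is an assumption for illustration: the 16-byte block, the 8-byte state, the all-zero IV, and a compression function borrowed from truncated SHA-256. It is not the repo's hash.py.

```python
import hashlib
import struct

BLOCK = 16              # toy block size in bytes (assumed)
IV = bytes(8)           # fixed initial chaining value (assumed all zeros)


def compress(state: bytes, block: bytes) -> bytes:
    """Stand-in compression function: truncated SHA-256 of state + block."""
    return hashlib.sha256(state + block).digest()[:8]


def md_padding(total_len: int) -> bytes:
    """MD strengthening: 0x80, zero fill, then the bit length as 8 bytes."""
    zeros = (-(total_len + 9)) % BLOCK
    return b"\x80" + b"\x00" * zeros + struct.pack(">Q", total_len * 8)


def md_hash(message: bytes, state: bytes = IV, already_hashed: int = 0) -> bytes:
    """Chain the compression function over the padded message.

    `state` and `already_hashed` let a caller resume from a known digest,
    which is precisely what the length-extension attack exploits.
    """
    data = message + md_padding(already_hashed + len(message))
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state


# A naive MAC: tag = H(secret || message). The attacker never sees `secret`.
secret = b"server-secret"
message = b"amount=10"
tag = md_hash(secret + message)

# The attacker knows the tag, the message, and the secret's length only.
known_len = len(secret) + len(message)
glue = md_padding(known_len)                 # padding the server already hashed
suffix = b"&amount=9999"
forged_message = message + glue + suffix
forged_tag = md_hash(suffix, state=tag, already_hashed=known_len + len(glue))

print(forged_tag == md_hash(secret + forged_message))   # True
```

Because the published digest is the final chaining value, the attacker can keep hashing from it and produce a valid tag for an extended message without ever learning the secret.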
Reading order matters here: Merkle–Damgård first (the skeleton), then MD5 and SHA-1 (the skeleton broken twice, for different reasons and on different timelines), then SHA-256 (the skeleton holding), with CRC32 off to the side as the "this was never even playing the same game" comparison.
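The CRC32 linearity identity quoted above is easy to check with zlib.crc32. It holds for inputs of equal length, with CRC(0) read as the CRC of an all-zero message of that same length. The messages below are hypothetical.

```python
import zlib


def xor_bytes(a: bytes, b: bytes) -> bytes:
    assert len(a) == len(b), "the identity needs equal-length inputs"
    return bytes(x ^ y for x, y in zip(a, b))


a = b"PAY ALICE 100"
b = b"PAY MALLORY 9"
zeros = bytes(len(a))

# Linearity over GF(2): CRC(a ^ b) == CRC(a) ^ CRC(b) ^ CRC(0...0)
lhs = zlib.crc32(xor_bytes(a, b))
rhs = zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(zeros)
print(lhs == rhs)                                   # True

# Forgery: knowing only CRC(a) and the change `delta` to apply, an attacker
# can compute the checksum of the modified message without rehashing it.
delta = xor_bytes(a, b)                             # turns a into b when XORed in
predicted = zlib.crc32(a) ^ zlib.crc32(delta) ^ zlib.crc32(zeros)
print(predicted == zlib.crc32(b))                   # True
```

A cryptographic hash offers no such shortcut: changing the input gives no usable relation between the old and new digests.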
A paid version of Claude AI (https://claude.ai) was used in this project as detailed below. To keep things consistent, no other AI tools were used.
1. Evaluate the viability of using an AI tool to generate examples & toy problems for demonstrations on this topic.
   - Is it efficient?
   - Is it accurate?
   - Is it readable and easy to follow?
2. Given a template and limited instruction, can an AI tool be used to address the following:
   - Does this make this topic more accessible to students?
   - Does this provide more experimental opportunities?
   - Could students use this tool to reasonably test their own ciphers with a similar format?
   - At what algorithm complexity (or type of algorithm) does the AI tool usage conflict with goal 1?
- Claude AI was provided the Caesar Cipher encryption and brute force decryption algorithms as a template/starting point. I wrote the initial version of those algorithms and then asked Claude to improve on the structure and readability. It took several iterations to get a balance between extra features and the format that I wanted for modularity purposes (see the Pandas dataframe for importing algorithm specific variables for the encrypt class). The substitution ciphers were written prior to this exercise, though not in the current class structure.
- The bulk of the encrypt and decrypt classes were written first locally, and then the AI tool was used to:
  - Create some consistency between naming within a class, and across the different algorithms
  - Add (and spell check) comments, though this took manual editing to add some information relating to the unit material this repo was developed for
  - Make minor edits when the encryption and decryption algorithms were slow or did not appear to output correctly.
    - The algorithms were overall correct, though the stream ciphers had some minor errors.
    - Claude AI's preview/debug of algorithm steps where the visual representation is printed out were a vast improvement on mine.
- Claude AI was used to generate the encrypt_test.py and decrypt_test.py classes. I made very few edits to those (primarily to correct some factual errors or non-existent function calls).
- For comparison purposes, Claude AI was told to make a version of transposition/block/decrypt.py and corresponding test function where it could make changes, additions, and commentary to improve the accuracy of the brute force decryption.
  - This had an increased PASS rate of test cases, but upon closer inspection this is because specific words unique to the test cases were inserted into the dictionary by Claude AI.
  - On the given test cases there was no noticeable improvement on the brute force algorithm in general.
- Overall, an AI tool was useful, but it had a tendency to insert excessive commentary, remove key comments in the code about modification, and make edits to the code to falsely increase the pass rate of test cases. These algorithms are also very well known, so there were no issues with training data behind the LLM.
- From a programming and editing perspective:
  - The function renaming to a common structure was appreciated. This made the process of checking that all functions were consistently named very fast. The code also looks cleaner.
  - Asking Claude AI to comment, or to increase the number of comments in the code I provided, was ultimately mixed. It added a lot of comments that distracted from the actual implementation. This was rolled back.
  - Comments provided on the encrypt_test.py functions were largely left as-is, since these files are relatively self-contained and don't need to be included in every use of the algorithms.
- From a content perspective:
  - Generating the test cases for encryption and decryption was overall a positive experience. The test cases are unique enough from my demonstrations that they can be released publicly and still demonstrate the encryption and decryption process. They also included display and analysis that I was not expecting.
  - The commentary, when in the test files, was overall interesting and did explain individual algorithm steps fairly well
  - Some of the code generation was just for algorithm explanation and previewing internal steps/process. This was useful for demonstration in this use case, but would be excessive in normal use.
- Debug and preview
  - Claude AI's printout of algorithm progress, and the visualization of the algorithms were better than mine, no competition. This was especially true with the Rail Fence and grid algorithms that required some ASCII art.
- Surprise code & comment changes
  - Claude AI tended to change code even when instructed not to. Some of this was for readability, or to increase modularity. Other times it was additional to the initial algorithm.
  - In many cases, Claude AI removed my comments completely or replaced them with function descriptions. While this is not an instant negative thing, it did notably remove algorithm discussion and some troubleshooting tips relevant to lesson materials.
  - Some additions were overly specific to the test cases that Claude AI generated for the test scripts. While this increased the pass rate of the test cases, it did not actually improve the algorithm.
- AI edits to falsely pass test cases
  - Somewhat expected, Claude would make edits to the dictionaries in the decryption process (and sometimes the code structure itself) for the test cases to pass.
- AI Generated Commentary
  - (disclaimer: no attempt was made to instruct otherwise)
  - Claude inserted a lot of commentary into the test cases for encryption and decryption, some of which was off topic.
  - When fed multiple algorithms in the same chat, it would begin to compare them unprompted. This was not a bad thing, but discussion was intended to be reserved for outside the encryption and decryption process. When Claude was allowed to make modifications (see transposition/block/decrypt_improved_test.py), it inserted a lot of commentary and tended to make modifications that did not apply to the current problem/topic, or code was modified to allow test cases to pass.
These are general references. As related tutorials and some demos become public, information will be added.
Types of Encryption:
- Wikipedia, “Public-key cryptography,” Wikipedia, Jul. 19, 2019. https://en.wikipedia.org/wiki/Public-key_cryptography
  - Has links to other pages of interest, and some history. Current (6/2025) images are also good.
- pago, “Symmetric, Asymmetric, and Hashing: Exploring the Different Types of Cryptography,” Medium, Mar. 21, 2023. https://pagorun.medium.com/symmetric-asymmetric-and-hashing-exploring-the-different-types-of-cryptography-c4b5b9590b44
- ClickSSL, “What is Encryption? – A Detailed Guide,” ClickSSL, Feb. 11, 2024. https://www.clickssl.net/blog/what-is-encryption
  - Explanation of different types of encryption, some examples for both.
- R. Awati, “What is a Public Key and How Does it Work?,” SearchSecurity. https://www.techtarget.com/searchsecurity/definition/public-key
- GeeksforGeeks, “Difference between Private key and Public key,” GeeksforGeeks, May 13, 2019. https://www.geeksforgeeks.org/computer-networks/difference-between-private-key-and-public-key/
- GeeksforGeeks, “What is Hashing?,” GeeksforGeeks, Jun. 12, 2014. https://www.geeksforgeeks.org/dsa/what-is-hashing/ (accessed Jun. 25, 2025).
  - This is an explanation of Hashing in terms of the process and as data storage. The process of generating a hash is covered, but not in terms of usage with passwords.
- C. Crane, “What Is a Hash Function in Cryptography? A Beginner’s Guide,” Hashed Out by The SSL Store™, Jan. 25, 2021. https://www.thesslstore.com/blog/what-is-a-hash-function-in-cryptography-a-beginners-guide/
History, Cryptography, and Ciphers
-
J. Schneider, “The History of Cryptography | IBM,” Ibm.com, Jul. 17, 2024. https://www.ibm.com/think/topics/cryptography-history
- A sample of cryptography through history, includes some of IBM's ideas on quantum computing
-
H. Sidhpurwala, “A Brief History of Cryptography,” www.redhat.com, Jan. 12, 2023. https://www.redhat.com/en/blog/brief-history-cryptography
-
Wikipedia, “History of cryptography,” Wikipedia, Apr. 07, 2019. https://en.wikipedia.org/wiki/History_of_cryptography
- As always, not a primary source itself, but includes a collection of interesting references and links to many, many interesting related topics
-
T. M. P. Reader, “The Clue to the Labyrinth: Francis Bacon and the Decryption of Nature,” The MIT Press Reader, Feb. 27, 2023. https://thereader.mitpress.mit.edu/the-clue-to-the-labyrinth-francis-bacon-and-the-decryption-of-nature/
-
GeeksforGeeks, “Baconian Cipher,” GeeksforGeeks, Jun. 27, 2017. https://www.geeksforgeeks.org/python/baconian-cipher/
- Includes code example
-
“Baconian Cipher - Francis Bacon Code AABAA - Online Decoder, Solver,” www.dcode.fr. https://www.dcode.fr/bacon-cipher
- Live demo encode and decode from a website that has a large selection of ciphers, games, and cryptography facts
-
GeeksforGeeks, “Caesar Cipher in Cryptography,” GeeksforGeeks, Jun. 02, 2016. https://www.geeksforgeeks.org/caesar-cipher-in-cryptography/
- Good explanation, demonstration, and code example
-
“Mono-Alphabetic Substitution Cipher,” 101 Computing, Nov. 09, 2019. https://www.101computing.net/mono-alphabetic-substitution-cipher/
- Live demo website for testing monoalphabetic ciphers
-
“Monoalphabetic Substitution Cipher - Online Cryptogram Decoder, Solver,” www.dcode.fr. https://www.dcode.fr/monoalphabetic-substitution
- Live demo encode and decode from a website that has a large selection of ciphers, games, and cryptography facts
-
GeeksforGeeks, “What is Monoalphabetic Cipher?,” GeeksforGeeks, May 07, 2024. https://www.geeksforgeeks.org/computer-networks/what-is-monoalphabetic-cipher/
- Also includes affine cipher definition
-
GeeksforGeeks, “Difference between Monoalphabetic Cipher and Polyalphabetic Cipher,” GeeksforGeeks, Jun. 08, 2020. https://www.geeksforgeeks.org/computer-networks/difference-between-monoalphabetic-cipher-and-polyalphabetic-cipher/
-
“Rail Fence Cipher,” Crypto Corner, 2017. https://crypto.interactive-maths.com/rail-fence-cipher.html
- Interactive encode and decode
-
“Rail Fence,” rumkin.com. https://rumkin.com/tools/cipher/rail-fence/
- Interactive encode and decode
-
“Rail Fence Cipher - Encryption and Decryption,” GeeksforGeeks, Jan. 20, 2017. https://www.geeksforgeeks.org/rail-fence-cipher-encryption-decryption/
- Good explanation, demonstration, and code example
-
“Polybius square cipher – Encrypt and decrypt online,” cryptii. https://cryptii.com/pipes/polybius-square
- Online encryption and decryption
-
“Polybius Square Cipher - GeeksforGeeks,” GeeksforGeeks, Feb. 18, 2018. https://www.geeksforgeeks.org/polybius-square-cipher/
- Good explanation, demonstration, and code example
-
Wikipedia, “Feistel cipher,” Wikipedia, Nov. 25, 2019. https://en.wikipedia.org/wiki/Feistel_cipher
- The block cipher structure used in this repo
-
GeeksforGeeks, “Feistel Cipher,” GeeksforGeeks, Mar. 02, 2020. https://www.geeksforgeeks.org/python/feistel-cipher/ (accessed Jun. 25, 2025).
-
“ADFGVX cipher,” Wikipedia, Feb. 24, 2022. https://en.wikipedia.org/wiki/ADFGVX_cipher
-
“ADFGVX Cipher - Decoder, Encoder, Solver, Translator,” www.dcode.fr. https://www.dcode.fr/adfgvx-cipher
- Online encryption and decryption
-
“ADFGVX Cipher,” Crypto Corner. https://crypto.interactive-maths.com/adfgvx-cipher.html
- Good explanation, demonstration, and code example. Links on website to other interesting ciphers
-
“Practical Cryptography,” Practicalcryptography.com, 2009. http://practicalcryptography.com/ciphers/adfgvx-cipher/
-
GeeksforGeeks, “RC4 Encryption Algorithm,” GeeksforGeeks, Mar. 23, 2018. https://www.geeksforgeeks.org/rc4-encryption-algorithm/
-
“RC4 Cipher - ArcFour - Online Decoder, Encryption,” Dcode.fr, 2025. https://www.dcode.fr/rc4-cipher
-
Wikipedia, “RC4,” Wikipedia, Dec. 31, 2019. https://en.wikipedia.org/wiki/RC4
-
“The ChaCha family of stream ciphers,” Cr.yp.to, 2024. https://cr.yp.to/chacha.html
-
Wikipedia, “Salsa20,” Wikipedia, Mar. 16, 2023. https://en.wikipedia.org/wiki/Salsa20
-
“Salsa20 — PyCryptodome 3.23.0 documentation,” Readthedocs.io, 2025. https://pycryptodome.readthedocs.io/en/latest/src/cipher/salsa20.html (accessed Jun. 25, 2025).
- Example implementation for generating and using Salsa20 from PyCryptodome's Crypto.Cipher package in Python
-
GeeksforGeeks, “Stream Ciphers,” GeeksforGeeks, Oct. 09, 2020. https://www.geeksforgeeks.org/computer-networks/stream-ciphers/
Asymmetric (Public-Key) Cryptography, Hashing, and the Mathematics
These are the primary sources and standard references for the public-key, hashing, and advanced-mathematics material. Where a foundational paper exists, it is cited directly.
-
R. L. Rivest, A. Shamir, and L. Adleman, “A method for obtaining digital signatures and public-key cryptosystems,” Communications of the ACM, vol. 21, no. 2, pp. 120–126, 1978. https://doi.org/10.1145/359340.359342
- The original RSA paper. Remarkably readable; worth assigning directly.
-
W. Diffie and M. E. Hellman, “New directions in cryptography,” IEEE Transactions on Information Theory, vol. 22, no. 6, pp. 644–654, 1976. https://doi.org/10.1109/TIT.1976.1055638
- The paper that introduced public-key cryptography and the Diffie–Hellman key exchange.
-
N. Koblitz, “Elliptic curve cryptosystems,” Mathematics of Computation, vol. 48, no. 177, pp. 203–209, 1987. https://doi.org/10.1090/S0025-5718-1987-0866109-5
- One of the two independent introductions of elliptic-curve cryptography.
-
V. S. Miller, “Use of elliptic curves in cryptography,” in Advances in Cryptology — CRYPTO ’85, LNCS 218, pp. 417–426, 1986. https://doi.org/10.1007/3-540-39799-X_31
- The other independent introduction of ECC, same era as Koblitz.
-
D. Shanks, “Class number, a theory of factorization, and genera,” in Proc. Symp. Pure Math., vol. 20, pp. 415–440, 1971.
- Origin of the Baby-Step Giant-Step algorithm used in the ECDH decrypt_improved.py.
-
National Institute of Standards and Technology, “Secure Hash Standard (SHS),” FIPS PUB 180-4, 2015. https://doi.org/10.6028/NIST.FIPS.180-4
- The authoritative SHA-256 specification. The hash.py implementation follows this document, and hash_test.py checks against its test vectors.
-
J. Katz and Y. Lindell, Introduction to Modern Cryptography, 3rd ed. Boca Raton, FL: CRC Press, 2020.
- A rigorous, widely used graduate/advanced-undergraduate textbook. Strong on definitions, proofs, and the reduction-based view of security; good backbone for the "done rigorously" material.
-
D. Hankerson, A. Menezes, and S. Vanstone, Guide to Elliptic Curve Cryptography. New York: Springer, 2004. https://doi.org/10.1007/b97644
- The standard reference for elliptic curves done properly — the group law, point counting, and the discrete-log problem in full.
-
A. J. Menezes, P. C. van Oorschot, and S. A. Vanstone, Handbook of Applied Cryptography. Boca Raton, FL: CRC Press, 1996. (Freely available: https://cacr.uwaterloo.ca/hac/)
- Encyclopedic and free. Excellent for the number-theoretic core of RSA: Euler's theorem, the Chinese Remainder Theorem, and primality testing (Miller–Rabin).
-
O. Regev, “On lattices, learning with errors, random linear codes, and cryptography,” Journal of the ACM, vol. 56, no. 6, pp. 1–40, 2009. https://doi.org/10.1145/1568318.1568324
- The foundational Learning With Errors (LWE) paper; the basis for the lattice-based material.
-
V. Lyubashevsky, C. Peikert, and O. Regev, “On ideal lattices and learning with errors over rings,” in EUROCRYPT 2010, LNCS 6110, pp. 1–23, 2010. https://doi.org/10.1007/978-3-642-13190-5_1
- The Ring-LWE paper, the efficiency-oriented variant underpinning modern post-quantum schemes.
-
C. Peikert, “A decade of lattice cryptography,” Foundations and Trends in Theoretical Computer Science, vol. 10, no. 4, pp. 283–424, 2016. https://doi.org/10.1561/0400000074
- A thorough, readable survey — a good bridge from the original LWE papers to a working understanding.
-
S. Goldwasser, S. Micali, and C. Rackoff, “The knowledge complexity of interactive proof systems,” SIAM Journal on Computing, vol. 18, no. 1, pp. 186–208, 1989. https://doi.org/10.1137/0218012
- The paper that introduced zero-knowledge proofs.
-
A. Fiat and A. Shamir, “How to prove yourself: practical solutions to identification and signature problems,” in CRYPTO ’86, LNCS 263, pp. 186–194, 1987. https://doi.org/10.1007/3-540-47721-7_12
- The Fiat–Shamir heuristic, turning interactive proofs into non-interactive ones.
-
C. P. Schnorr, “Efficient signature generation by smart cards,” Journal of Cryptology, vol. 4, no. 3, pp. 161–174, 1991. https://doi.org/10.1007/BF00196725
- The Schnorr identification/signature protocol — the cleanest concrete example of a zero-knowledge-style proof of knowledge.
