Introduction

RSA is the first practical public key cryptosystem, published in 1977 by Ron Rivest, Adi Shamir, and Leonard Adleman at MIT. Named after its inventors, RSA can be used for both encryption (confidentiality) and digital signatures (authentication and non-repudiation). It remains one of the most widely deployed cryptographic algorithms, securing TLS certificates, SSH authentication, email encryption, code signing, and more.

The security of RSA rests on a single mathematical assumption: integer factorization is hard. While it is trivial to multiply two large prime numbers together, there is no known efficient algorithm to factor their product back into the original primes. The largest RSA number factored to date (RSA-250, 829 bits) required approximately 2,700 CPU core-years of computation. Modern RSA keys use 2048 or 4096 bits, making factorization far beyond current or foreseeable classical computing capability.

"The RSA system has the novel property that publicly revealing an encryption key does not thereby reveal the corresponding decryption key. This separation is the essence of public key cryptography." -- Rivest, Shamir, and Adleman, original 1977 paper

Mathematical Foundations

Modular Arithmetic

Modular arithmetic is the mathematics of remainders. The expression a mod n gives the remainder when a is divided by n. For example, 17 mod 5 = 2 because 17 divided by 5 leaves a remainder of 2. In RSA, all operations are performed modulo n (the product of two primes), which means the result is always between 0 and n - 1.

Modular exponentiation -- computing m^e mod n -- is the core operation of RSA. Despite the enormous numbers involved (2048-bit integers), modular exponentiation can be computed efficiently using the square-and-multiply algorithm (also called binary exponentiation), which reduces the number of multiplications from exponential to logarithmic in the exponent.

// Square-and-multiply algorithm for modular exponentiation// Computes base^exponent mod modulus efficientlyfunction mod_exp(base, exponent, modulus): result = 1 base = base mod modulus while exponent > 0: if exponent is odd: result = (result * base) mod modulus exponent = exponent >> 1 // integer division by 2 base = (base * base) mod modulus return result// Example: 7^13 mod 11// Binary of 13 = 1101// Requires only 5 multiplications instead of 12

Euler's Theorem and Totient Function

Euler's totient functionphi(n) counts the number of integers from 1 to n that are coprime to n (i.e., share no common factor with n other than 1). For a prime p, phi(p) = p - 1. For the product of two distinct primes p and q: phi(n) = phi(p * q) = (p - 1)(q - 1).

Euler's theorem states that for any integer a coprime to n:

a^phi(n) = 1 (mod n)

This theorem is the mathematical foundation of RSA. It guarantees that if e and d are chosen such that e * d = 1 (mod phi(n)), then for any message m:

(m^e)^d = m^(e*d) = m^(1 + k*phi(n)) = m * (m^phi(n))^k = m * 1^k = m (mod n)

In other words, encrypting with exponent e and then decrypting with exponent d recovers the original message.

Modular Multiplicative Inverse

The modular multiplicative inverse of e modulo phi(n) is the integer d such that e * d = 1 (mod phi(n)). This is computed using the Extended Euclidean Algorithm, which not only finds the GCD of two numbers but also expresses it as a linear combination of those numbers.

The inverse exists if and only if e and phi(n) are coprime, which is why the public exponent e must be chosen such that gcd(e, phi(n)) = 1.

Key Generation

RSA key generation produces a public key (used by anyone to encrypt) and a private key (used only by the owner to decrypt):

  1. Choose two large random primes p and q of approximately equal size. For a 2048-bit RSA key, each prime is approximately 1024 bits. Primality is verified using probabilistic tests (Miller-Rabin).
  2. Compute n = p * q. This is the modulus, shared between the public and private keys. The bit length of n is the "key size."
  3. Compute phi(n) = (p - 1)(q - 1). This value is kept secret. (Modern implementations use the Carmichael function lambda(n) = lcm(p-1, q-1) instead, which is equally correct and produces smaller private exponents.)
  4. Choose the public exponent e such that 1 < e < phi(n) and gcd(e, phi(n)) = 1. The most common choice is e = 65537 (2^16 + 1), which is prime, provides fast encryption (only 17 multiplications due to its binary representation 10000000000000001), and has been standardized in FIPS 186.
  5. Compute the private exponent d using the Extended Euclidean Algorithm: d = e^(-1) mod phi(n).
  6. Destroy p, q, and phi(n). Only n, e, and d are retained. (In practice, p and q are often kept for CRT-based decryption optimization.)
ComponentPublic/PrivateDescriptionTypical Size (2048-bit RSA)
n (modulus)PublicProduct of p and q2048 bits
e (public exponent)PublicEncryption exponent17 bits (65537)
d (private exponent)PrivateDecryption exponent~2048 bits
p, q (primes)Private (destroy or retain for CRT)The two prime factors~1024 bits each
// RSA key generation example (small numbers for illustration)p = 61, q = 53n = p * q = 3233phi(n) = (p-1)(q-1) = 60 * 52 = 3120e = 17 (coprime to 3120: gcd(17, 3120) = 1)d = 2753 (because 17 * 2753 = 46801 = 15 * 3120 + 1)Public key: (n=3233, e=17)Private key: (n=3233, d=2753)

Encryption and Decryption

Encryption: To encrypt a message m (where m is an integer with 0 <= m < n), the sender computes:

c = m^e mod n

Decryption: To decrypt ciphertext c, the receiver computes:

m = c^d mod n

// Encryption example (using the small key above)m = 65 (plaintext message, ASCII 'A')c = 65^17 mod 3233 = 2790 (ciphertext)// Decryptionm = 2790^2753 mod 3233 = 65 (recovered plaintext)

Chinese Remainder Theorem (CRT) Optimization: Decryption can be made approximately 4x faster by performing the modular exponentiation separately modulo p and modulo q, then combining the results using the CRT. This is why p and q are typically retained in the private key file. Most RSA implementations (OpenSSL, Java, .NET) use CRT by default.

Digital Signatures with RSA: Signing uses the private key, and verification uses the public key -- the reverse of encryption. To sign a message, the signer computes: s = hash(m)^d mod n. To verify, the verifier computes: hash'= s^e mod n and checks that hash' == hash(m). For details on signature schemes and padding, see Digital Signatures.

Why RSA Works

The correctness of RSA follows directly from Euler's theorem. We need to prove that (m^e)^d mod n = m for all valid messages m.

Since e * d = 1 mod phi(n), there exists an integer k such that e * d = 1 + k * phi(n). Therefore:

(m^e)^d = m^(e*d) = m^(1 + k*phi(n)) = m * (m^phi(n))^k

By Euler's theorem, if m is coprime to n, then m^phi(n) = 1 (mod n). Substituting:

m * (1)^k = m (mod n)

This proof assumes m is coprime to n. For the rare case where m shares a factor with n (i.e., m is a multiple of p or q), the result still holds, as can be shown using the Chinese Remainder Theorem. The probability of randomly selecting such an m is approximately 1/p + 1/q, which for 1024-bit primes is astronomically small (around 2^(-1023)).

Padding Schemes

Textbook RSA -- applying the raw m^e mod n operation directly -- is insecure for several reasons: it is deterministic (identical messages produce identical ciphertexts), it is malleable (an attacker can manipulate ciphertexts to produce predictable changes in plaintexts), and it leaks information about the message (e.g., if m is 0 or 1, the ciphertext equals the plaintext). Padding schemes are essential.

PKCS#1 v1.5

PKCS#1 v1.5, published in 1993, adds random bytes to the message before encryption:

EM = 0x00 || 0x02 || [random non-zero padding bytes] || 0x00 || [message]

The padding ensures that even identical messages produce different ciphertexts and that the padded message occupies the full modulus length. However, in 1998, Daniel Bleichenbacher demonstrated a devastating adaptive chosen-ciphertext attack against PKCS#1 v1.5 encryption. By sending carefully crafted ciphertexts to a server and observing whether it returned a padding error, an attacker could decrypt any ciphertext in approximately one million queries. Variants of this attack (DROWN, ROBOT) continue to affect systems decades later.

OAEP (Optimal Asymmetric Encryption Padding)

RSA-OAEP (Optimal Asymmetric Encryption Padding), designed by Bellare and Rogaway in 1994 and standardized in PKCS#1 v2, provides provably secure encryption under the RSA assumption. OAEP uses a two-round Feistel-like structure with hash functions to process the message and random seed before RSA encryption:

  1. Generate a random seed r
  2. Compute a mask from the seed using a mask generation function (MGF1, based on SHA-256)
  3. XOR the padded message with the mask
  4. Compute a second mask from the result and XOR it with the seed
  5. Concatenate and encrypt with RSA

RSA-OAEP is the recommended padding for all new RSA encryption implementations. For signatures, RSA-PSS (Probabilistic Signature Scheme) provides equivalent provable security.

"Never use textbook RSA. Never use PKCS#1 v1.5 for new systems. OAEP and PSS exist precisely because raw RSA is broken in practice." -- Dan Boneh, Stanford University

Key Sizes and Security Levels

RSA Key SizeSecurity Level (bits)StatusNIST Recommendation
512 bits~30 bitsTrivially broken (factorable in hours)Prohibited
768 bits~45 bitsBroken (factored in 2009)Prohibited
1024 bits~80 bitsDeprecated (theoretically feasible for well-funded adversaries)Disallowed after 2013
2048 bits~112 bitsCurrent minimum standardAcceptable through 2030
3072 bits~128 bitsRecommended for new deploymentsAcceptable beyond 2030
4096 bits~140 bitsMaximum commonly usedHigh-security applications

The progression from 1024 to 2048 to 3072 bits reflects the steady improvement in factoring algorithms and computing power. Unlike symmetric key sizes (where each additional bit doubles the difficulty), RSA key sizes must grow much faster -- doubling the security level requires roughly tripling the key size. This is why elliptic curve cryptography (ECC), which provides equivalent security with much smaller keys, is increasingly preferred. See Symmetric Encryption for the key size comparison.

Known Attacks on RSA

While RSA's mathematical foundation remains sound, numerous attacks exploit implementation weaknesses, poor parameter choices, or mathematical shortcuts:

Factoring Attacks: The most fundamental attack is factoring n to recover p and q. The best known classical algorithm is the General Number Field Sieve (GNFS), which runs in sub-exponential time. The largest RSA modulus publicly factored is RSA-250 (829 bits), requiring approximately 2,700 CPU core-years. Factoring a 2048-bit modulus with GNFS is estimated to require 10^18 to 10^20 operations -- far beyond feasibility.

Small Public Exponent Attacks: Using e = 3 with no padding allows an attacker to recover the plaintext by computing the cube root if the message is small. This is called Hastad's broadcast attack when the same message is encrypted to three or more recipients with e = 3. Using e = 65537 with proper padding prevents this.

Wiener's Attack: If the private exponent d is too small (specifically, d < n^0.25 / 3), Michael Wiener showed in 1990 that the continued fraction expansion of e/n reveals d. Proper key generation (using standard-length d) prevents this attack.

Common Factor Attack: If two different RSA moduli share a prime factor (due to faulty random number generation), computing gcd(n1, n2) reveals the shared factor, breaking both keys instantly. A 2012 study by Lenstra et al. found that approximately 0.2% of RSA keys on the internet shared factors due to weak entropy during key generation.

Timing and Side-Channel Attacks: The time taken to perform RSA decryption can reveal information about the private key. Kocher's timing attack (1996) exploited the square-and-multiply algorithm's variable execution time. Countermeasures include constant-time implementations and RSA blinding (multiplying the ciphertext by a random factor before decryption).

Quantum Threat: Shor's algorithm (1994) can factor integers in polynomial time on a quantum computer, which would break RSA completely. A quantum computer with approximately 4,000 logical qubits could factor a 2048-bit RSA modulus. Current quantum computers have far fewer qubits and high error rates, but the threat is taken seriously for long-term security planning. NIST has published post-quantum cryptographic standards (ML-KEM, ML-DSA) as eventual replacements.

Implementation Considerations

Correct RSA implementation is as important as the mathematical algorithm itself:

Random Number Quality: RSA key generation requires cryptographically secure random numbers for prime generation. Weak randomness has caused real-world key compromises. Always use OS-provided CSPRNGs (/dev/urandom, CryptGenRandom, getrandom()).

Constant-Time Operations: All operations involving the private key must execute in constant time to prevent timing side-channel attacks. This includes modular exponentiation, comparison operations, and padding verification.

CRT Fault Attacks: When using CRT-based decryption, a single computational fault during the computation can reveal the private key. If an attacker can induce a fault (via voltage glitching, for example), they can factor n from the faulty and correct signatures. The countermeasure is to verify the result before outputting it.

Key Storage: RSA private keys should be stored in hardware security modules (HSMs) or trusted platform modules (TPMs) when possible. PKCS#8 and PKCS#12 formats with strong password-based encryption are used for software key storage.

Never Implement Your Own RSA: Use established libraries (OpenSSL, libsodium, BoringSSL, Go's crypto/rsa, Java's JCA). Custom implementations are virtually guaranteed to contain subtle vulnerabilities.

For related topics, see Digital Signatures (RSA-PSS signatures), Public Key Infrastructure (RSA in TLS certificates), and AES (the symmetric cipher typically paired with RSA in hybrid encryption).

References

  • Rivest, R., Shamir, A., & Adleman, L. (1978). "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems." Communications of the ACM, 21(2), 120-126.
  • Boneh, D. (1999). "Twenty Years of Attacks on the RSA Cryptosystem." Notices of the American Mathematical Society, 46(2), 203-213.
  • Bleichenbacher, D. (1998). "Chosen Ciphertext Attacks Against Protocols Based on the RSA Encryption Standard PKCS#1." CRYPTO '98, Springer.
  • Bellare, M., & Rogaway, P. (1994). "Optimal Asymmetric Encryption -- How to Encrypt with RSA." EUROCRYPT '94, Springer.
  • Wiener, M. (1990). "Cryptanalysis of Short RSA Secret Exponents." IEEE Transactions on Information Theory, 36(3), 553-558.
  • Lenstra, A., et al. (2012). "Ron was Wrong, Whit is Right." IACR ePrint Archive, 2012/064.
  • Kocher, P. (1996). "Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS, and Other Systems." CRYPTO '96, Springer.
  • Shor, P. (1994). "Algorithms for Quantum Computation: Discrete Logarithms and Factoring." IEEE FOCS, 124-134.
  • Jonsson, J., & Kaliski, B. (2003). RFC 3447: Public-Key Cryptography Standards (PKCS) #1: RSA Cryptography Specifications Version 2.1.
  • Barker, E. (2020). NIST SP 800-57 Part 1 Rev. 5: Recommendation for Key Management.