Jump to content

Prime factorization to defeat bad RSA


Recommended Posts

I recently gave a presentation for my cryptology class on cryptanalysis and for a portion of it I talked about prime factorization and breaking RSA.

Note: RSA is still a secure cipher and nothing in this post will allow you to break it if it is used properly (assuming you don't have a quantum computer and can just run Shor's algorithm).

That said, RSA is only secure insofar as it is difficult to solve the seemingly simple problem of what two prime numbers (p, q) were used to obtain the modulus (n). While seemingly simple, the solution to this problem is very difficult (thanks P vs NP) but not impossible. 

One algorithm that finds the prime factors of a number is Direct Search Factorization, which is essentially the greedy brute-force method of trial division. Simply try dividing n by 2, 3, 5, 7, 11, 13... until you find one that it divides evenly by. 

#!/usr/bin/perl -w
use strict;
my $x;

print "Enter the number to factor into primes:\n> ";
chomp($x = <stdin>);

my @allNums = (0)x($x / 2);		#Getting primes with the Sieve of Eratosthenes
my @primes = ();
for(my $i = 2; $i < scalar(@allNums); $i++){
	if($allNums[$i] == 0){		#0 represents being prime
		push(@primes, $i);
		for(my $j = 2*$i; $j < scalar(@allNums); $j += $i){
			$allNums[$j] = 1;	#Flag all multiples as composites

foreach my $i (@primes){
	if($x % $i == 0){
		print "$i\n";
		$x /= $i;
		print "$x\n";

And running this yields the following:


Now unfortunately, this method sucks. To be fair, the worst part of this program is the fact that I do so much work generating a list of prime numbers and storing them all, but I don't know that I have too much of a choice given the nature of the algorithm. 


A much better method would be something like Pollard's Rho algorithm. This method is fairly confusing, and fair warning: my explanation will not do it any justice. I would recommend reading about it yourself here. Essentially, you have two numbers moving through a finite field (that's like a subset of the integer space ℤ) at different but related rates. When a cycle-finding algorithm (in this case Floyd's algorithm) detects a cycle (duh) then you've found a factor. Since the number's we're attempting to factor are the products of two primes then they have NO factors other than those two primes. That's nice because we don't have to waste any time trying to factor our factors or using primality tests on them. (Credits to Wikipedia because the Rho subroutine in the following program is largely just translated from their example).

#!/usr/bin/perl -w
use strict;
use Math::Prime::Util qw(gcd);
my $c = 1;

print "Enter the number to factor into primes:\n> ";
chomp(my $theNum = <stdin>);

	my $aFactor = rho($theNum);
	if($aFactor == $theNum){
		print $aFactor."\n";
		$theNum /= $aFactor;
		print $theNum."\n";

sub rho{
	my $num = $_[0];
	my $x = 2;
	my $x_fixed = 2;
	my $cycle_size = 2;
	my $factor = 1;
	while($factor == 1){
		my $count = 1;
		while($count <= $cycle_size and $factor <= 1){
			$x = ($x*$x + $c) % $num;
			$factor = gcd($x - $x_fixed, $num);
		$cycle_size *= 2;
		$x_fixed = $x;
	return $factor

And running this yields the following:


Please ignore the negative microseconds. It's because of the dumb way I subtracted the start time from the end time, and I didn't feel like fixing it. This should demonstrate how much better this method is. The previous method would use all of my memory if I tried a number much bigger than the example. That was only 103 million compared to 5 quadrillion.

As impressive as that is, RSA ciphers should generally use numbers with over 1000 digits for their modulus. Needless to say, that's simply not going to be broken by my laptop this century. There are multiple algorithms which are faster at prime factorization than even Pollard's Rho algorithm, but none of them can do this in polynomial time.

Once you have the prime factors p and q, the decryption key is then simply e^(-1) % (p-1)(q-1)


An interesting footnote is that a previous version of the Rho program I wrote would have taken well over a minute to factor that number because I had written my own GCD subroutine recursively like this:

sub euclidGCD{
	if($_[1] == 0){
		return $_[0];
		return euclidGCD($_[1], ($_[0] % $_[1]));

(Yes, that's not readable but I'm going for speed)

While optimizing the program I immediately noted that the GCD calculation would easily be the most computationally expensive. I rewrote the subroutine iteratively and made it about twice as fast. After that I looked further into it and was going to implement the Binary GCD algorithm iteratively before finding numerous online benchmarks showing that the native C implementation of the euclidean algorithm is generally significantly faster (at least for numbers this order of magnitude).

Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

  • Create New...