Factoring integers in Perl

Recently I've been working on factoring and primality proofs (with certificates) for Math::Prime::Util. I thought I'd give a brief summary and comparison of the modules I know of for factoring integers from Perl.

Let me take a moment to promote the factor.pl utility installed by Math::Prime::Util. It works just like the BSD or GNU factor utility, except it supports bigints on all platforms and parses simple expressions. So you can run commands like:
 factor.pl 2**63-1
 factor.pl 'nth_prime(100000)*pn_primorial(10)*random_nbit_prime(90)'
If you want to factor 30+ digit numbers I recommend additionally installing Math::BigInt::GMP and Math::Prime::Util::GMP.

This is just a brief look so I chose some easy numbers:
  • 2**38-1 (3,174763,524287)
  • 2**90-1 (3,3,3,7,11,19,31,73,151,331,631,23311,18837001)
  • 2**150-1 (3,3,7,11,31,151,251,331,601,1801,4051,100801,
              10567201,1133836730401)
and some "hard" numbers:
  • 2**62-1 (3,715827883,2147483647)
  • 2**95-1 (31,191,524287,420778751,30327152671)
  • 2**195-1 (7,31,79,151,8191,121369,145295143558111,
              134304196845099262572814573351)
  • 2**250-1 (3,11,31,251,601,1801,4051,229668251,269089806001,
              4710883168879506001,5519485418336288303251)

An aside: everyone's thoughts on what "small", "big", "easy", "hard" mean differ. To some, a 19-digit number is big, while to many people working on modern factoring that size is completely trivial, and anything under 100-digits is a yawn. My goal is to make it easy to factor 50-80 digit numbers in a reasonable time from Perl. If you are seriously looking for larger or better methods, I recommend yafu, msieve, GMP-ECM, and GGNFS.

So first let's look at the modules:

  • Math::Big::Factors. I believe this was intended as an example of using Math::BigInt, but there are some modules actually using it. The main problem is that is super slow (slower than a simple trial division loop in many cases).
  • Math:Factor::XS. Nicely coded trial division in C. Great for easy native size integers, slower than need be for ones with two large factors, and doesn't support bigints.
  • Math::Factoring. Leto's placeholder for factoring algorithms in Perl+Math::GMPz. Unfortunately it currently only does trial division, so it only works well on easy numbers. Also be careful to give it only native or Math::GMPz inputs or it blows up.
  • Math::Pari. Perl interface to the Pari library (2.1 by default, 2.3 possible if hand-built, 2.5+ not supported). Downside: licensing and portability issues on some platforms. Upside: lots of functions, integer factoring is fast and has a relatively predictable slowdown as the input gets larger. Uses SQUFOF, Pollard Rho, ECM, and MPQS.
  • Math::Prime::Util. The module I've been working on lately. Small numbers are factored in C using Pollard Rho, SQUFOF, Pollard p-1, and HOLF. For BigInts we see if Math::Prime::Util::GMP is installed and use that if so, otherwise various Perl methods are used: Trial, HOLF, Pollard's Rho, Pollard's p-1, and 2-stage ECM.
  • Math::Prime::Util::GMP. This module contains C+GMP code for various tasks and is meant to be a backend for Math::Prime::Util -- install it if you have GMP and most big integer functions speed up. For factoring this is much faster than Perl+Math::BigInt::GMP. It uses a combination of Trial, native SQUFOF, 2-stage Pollard p-1, 2-stage ECM, Pollard Rho, and a simple Quadratic Sieve.

I took a fresh 5.16.2 perlbrew on an x86 Ubuntu machine and installed Math::BigInt::GMP to start, then tried various modules on some simple examples. I installed fresh versions of each module with CPAN before running. Timing results, in seconds, from the command line:

               38      90     150        62      95     195     250
------------ ------  ------  ------    ------  ------  ------  ------
Math::Big     13.7   369     >600      >600    >600
MF::XS         0.02                      1.78
M::Factoring   0.22    0.12    7.86    493     302     >600
MPU            0.05    0.16    1.39      0.05    2.03  120+    600+
MPU::GMP       0.05    0.05    0.06      0.05    0.06    0.40    1.57
Math::Pari     0.05    0.06    0.06      0.06    0.06    0.35    5.65

This is just a gross look, factoring a single number, so anything under 0.1 seconds is reflecting the overhead of starting Perl more than the actual time taken. Both Math::Pari and Math::Prime::Util do some startup work such as creating a small set of primes used for other functions, which the other modules don't do.

Math::Big::Factor: I tried different sizes for the factor wheel and 5 was the fastest for these examples. It took over 6 minutes to factor 2^90-1, which a trial division loop can do in under a second. I stopped it after 15 minutes on 2^150-1, which a trial division loop can do in a bit over 5 minutes.

Math::Factor::XS is super fast for the small example, but certainly slower than need be for 2^62-1. Its maximum size is MAX(sizeof(unsigned long), sizeof(UV)), so either 32-bit or 64-bit.

Math::Factoring will be pretty interesting once algorithms get added, and I'm thinking of submitting a patch to support some of the simpler methods from MPU like Pollard Rho (with Brent's improvements), 2-stage p-1, Fermat, HOLF, and possibly ECM. Using Math::GMPz directly should make it faster than using Math::BigInt objects.

Math::Prime::Util using pure Perl does a pretty good job on most of the numbers. The two largest examples are found using ECM, which is why the '+' sign on the time -- exactly how fast they get found depends on the random curves selected.

Math::Pari and Math::Prime::Util::GMP are the clear time winners, finding all the factors of the small examples basically instantly. Both of them factor random 30-digit numbers in under 10ms, and average under a second for random 60 digit numbers.

Edit for reference: command lines used:
perl -Mbigint=lib,GMP -MMath::Big::Factors=factors_wheel -E 'my $n = 2**38-1; say join ",", factors_wheel($n, 5);'
perl -MMath::Factor::XS=prime_factors -E 'my $n = 2**38-1; say join ",", prime_factors($n);'
perl -MMath::Factoring=factor -Mbigint=lib,GMP -E 'my $n = 2**38-1; say join ",", factor(Math::GMPz->new("$n"));'
perl -MMath::Prime::Util=factor -Mbigint=lib,GMP -E 'my $n = 2**38-1; say join ",", factor($n);
perl -MMath::Pari=factorint -Mbigint=lib,GMP -E 'my $n = 2**38-1; my ($pn,$pc) = @{factorint($n)}; say join ",", map { ($pn->[$_]) x $pc->[$_] } 0 .. $#$pn'

1 Comment

Leave a comment

About Dana Jacobsen

user-pic I've been using Perl since 1991 (4.010 or so). At some point I realized that while I have used many languages, I enjoy using Perl more than any others.