Actions from Dana Jacobsen

Commented on Alternatives to rand() in Dana Jacobsen

2017-04-20T01:00:42Z

Thanks for the note. You are correct that if run on machines with AES support (which is a *lot*, and OpenSSL supports a huge number of them) and if the software supports it, then indeed it's very fast. I added a parenthetical note.

None of the Perl modules do this. Crypt::Rijndael's AES C code is the standard reference C code. CryptX uses LibTomCrypt which is based on the original reference code. So they are not very fast (relative to other PRNGs). Fortuna also includes running SHA2, which further slows down the output. Crypt::PRNG (Fortuna) runs at 0.14 GB/s on this machine, and a naive use of Crypt::Rijndael runs at 0.07 GB/s. ChaCha20 (standard C) runs at 0.394 GB/s, while an AVX2 version found on github runs at 2.1 GB/s.

The ChaCha20 implementation I wrote is bog standard from the spec. It runs very slightly faster than OpenBSD's implementation, but 5x slower than the 'chacha-opt' AVX2 version. OpenSSL includes asm, SSE2, SSE3, AVX2, and AVS512 (!) accelerations.

Commented on Alternatives to rand() in Dana Jacobsen

2017-04-18T13:50:35Z

Peter, some modules offer this (ntheory, Crypt::Random, Crypt::Urandom) but none of them do a rand() style interface to it. It would be easy enough to create. On my Macbook, bulk reads from /dev/urandom run at 12 MB/s, which would make it quite slow compared to most of these modules. It is very system dependent -- my Fedora 20 desktop runs at a similarly slow 17 MB/s while the Fedora 23 desktop at 185 MB/s. A new CentOS 7 160-core Power 8 machine runs at 5 MB/s.

In raw speed the older /dev/urandom device is about 40x slower than standard ChaCha20. Fast PRNGs like we would expect in the Perl core would be another 10x faster.

In quality it should match the CSPRNG methods for underlying data, with the (big) exception that it's skipping a middle userspace layer. The new Linux and BSD /dev/urandom are nearly identical to the ChaCha20 code I use, though we differ in things like seeding and reseeding. For rand() calls without excessive overhead we'd need to either buffer or use a system call rather than literally opening the file each time.

Posted Alternatives to rand() to Dana Jacobsen

2017-04-17T20:34:18Z

I recently added some functionality for random number generation to my modules, which led me on a digression about rand(). This is a short look at some modules for getting random floating point values. A later one will look at random ints and bytes.

Modules for rand()
Module Algorithm	Stat Quality	Interval	Bits in output	Auto Seed	Predictable	Speed
CORE::rand drand48	Bad	[0,1)	48	OK	Yes	29000 k/s
Math::Random::MTwist MT XS	OK	[0,1)	53	OK	Yes	14000 k/s
ntheory ChaCha20 XS	Good	[0,1)	53-113	Full	No	12000 k/s
Math::Prime::Util::GMP ISAAC XS	Good	[0,1)	53-64	Good	No	14000 k/s
Crypt::PRNG Fortuna XS	Good	[0,1)	53	Good	No	700 k/s
Math::Random::MT::Auto MT XS	OK	[0,1)	52	Full	Yes	5000 k/s
Math::Random::MT MT XS	OK	[0,1)	32	rand()	Yes	2200 k/s
Math::Random::Secure Math::Random::ISAAC	Good	[0,1)	32	Full	No	460 k/s
Math::Random::ISAAC ISAAC XS or PP	Good	[0,1]	32	None	No	5700 k/s 480 k/s
Math::Random::Xorshift Xorshift XS	OK	[0,1]	32	Time	Yes	16000 k/s

]]> Some things we generally want from our random number generators:

Statistically random. Any good CSPRNG will qualify, as will a much faster modern PRNG such as PCG or Xoroshiro128+. The Mersenne Twister is quite good though inferior to the newer choices in a few ways. Xorshift has been found to not be very good but the latest in its family, Xoroshiro128+ is considered excellent. drand48, used by Perl for CORE::rand since 5.19.4, has numerous issues. It is, however, better than the crap-shoot we had in older Perls. Both are listed as having "serious defects" in Jones' guide.

Consistent semantics. rand() returns a uniform floating point number in the half-open interval [0,1). This means we can get 0 but will not every get a 1. If given an argument, then our range goes up to but not including it -- we will get a random representable real from 0 up to but not including the number. This makes things like int(rand(6)) return 0-5 and not occasionally give 6. This also lets us combine adjacent intervals knowing there is no overlap. Two of the modules get this wrong. This has nothing to do with the underlying generator, but with the method chosen to create the return value.

Quality outputs. An NV in Perl is, by default, a double which can store up to 53 bits in the range [0,1). Some modules only fill the result with 32 bits. While this is fine for rolling dice, it won't give good results in some other scenarios. A Perl build with long double support may have 64 bits of mantissa, and with Quadmath support in 5.22 and later we get a full 113 bits to fill. I decided in my module that the best solution was to fill up the NV, giving them the maximum number of bits. It's no extra work for double and most long double builds. It is a little slower for quadmath, but clearly the customer wants lots of precision so we should honor that. I think anything less than 52 bits is substandard.

Good seeding available. Some modules strive to seed with an O/S entropy source including Windows and other fallbacks if /dev/urandom isn't available. Perl's internal seed function tries to use /dev/urandom as well as a bit of random process state. Its main issue is that it only uses 32 bits. A couple modules use the current time, which is generally not a good method. Math::Random::ISAAC punts the problem to the caller on the theory that no seeding is better than bad seeding, so modules using it need a seeding layer. There are some subtleties in seeding ISAAC that some ignore.

Repeatable. For testing or simulations we need to be able to repeat the result, therefore we have to be able to manually seed. All the modules except Crypt::PRNG support this. For true cryptographic use we really ought to have a forced-system-entropy-seed mode plus periodic reseeding. Crypt::PRNG operates in this way with no ability for user seeding.

Unpredictibility. For many uses, such as simulations, this isn't necessary at all. For others it is absolutely critical (breaking online Poker in 1999). This is one of the things that separates CSPRNGs from their lesser (but faster) brethren. It is analagous to using a cryptographic hash (e.g. SHA2) vs. generic hash (e.g. SipHash). The former has stronger guarantees but is often an order of magnitude or more slower. For drand48 or Xorshift, after seeing a single output we know every number that will come out in the future as well as all the previous outputs. With Mersenne Twister it takes 624 results. Methods like PCG and Xoroshiro128+ are much harder to predict though it is not impossible. This feature is a CSPRNG requirement so ISAAC, Salsa20, ChaCha20, and Fortuna completely meet it.

Performant. All other things being equal, we'd like it to be fast. A vast amount of time can be spent in the Perl / Module interface, making some of the solutions much slower than the underlying algorithm requires. Even when the interface is well optimized, it masks a lot of the difference, allowing algorithms like ChaCha20 and ISAAC to have not-dissimilar speeds from some fast regular PRNGs and substantially outperform much faster underlying algorithms in some modules. With a bulk interface such as random_bytes() things look slightly different. Note that the worst performing solution is still 100,000 calls per second, so this may be moot for your application if you just need a few numbers.

CORE::rand. Very fast. Separate context for each thread. 32-bit seed, 48-bit state. Very poor quality -- for example the 48th bit will always alternate 1,0,1,0,1,0,... while the 47th goes 0,0,1,1,0,0,1,1,... etc. The bottom bits are not at all random, which is a fundamental property of the type of generator using a non-prime modulus (2^48 in this case). It would be very interesting to see some alternatives (e.g. PCG and Xoroshiro128+) in the Perl core, similar to what we do with hash functions.

Math::Random::MTwist. Excellent alternative to core. It's fast and good quality. Random quality is what one would expect from Mersenne Twister. Lots of options. Threading should be done with OO interface.

ntheory. My module, which I added a ChaCha20 CSPRNG to. It was originally intended for speeding up large random prime construction vs. using Bytes::Random::Secure, but it is useful for a standard rand interface as well (srand, rand, irand, irand64, random_bytes, plus some more). ChaCha20 is the CSPRNG which is being used in new arc4random functions in *BSD, is the chosen method for /dev/urandom in Linux 4.18, is used in some new TLS methods, and is increasingly popular as a CSPRNG or stream cipher. The standard implementation isn't as fast as the ISAAC core, but there are AVX2 implementations that are a lot faster. Performance here doesn't suffer much vs. other alternatives, and it's in the top performance tier (albeit at the bottom of the leaders, but 2x faster than many other modules). Seeds at startup using a number of possible sources, and allows easy reseeding with a decent amount of system entropy. It also contains a full ChaCha20 implementation in Perl, as well as matching XS and Perl implementations of ISAAC, though using ISAAC is a build-time configuration option likely to be removed. Uses a single state and mutex lock for threads, which will be replaced by per-thread state in a later release.

Math::Prime::Util::GMP. Uses ISAAC in XS and runs quite a bit faster than the other ISAAC modules. Seeds at startup using /dev/urandom if possible. It wasn't really intended to be a primary module for this work, but it's an interesting comparison with the other ISAAC modules, especially the performance. No threading support.

Crypt::PRNG. An API for LibTomCrypt and part of the large CryptX package. It has a few choices, with the default being Fortuna -- an entropy pool framework with multiple entropy input streams, using AES in counter mode for the actual stream. Very high quality results, well done interface. The only downside is the slow performance, which is typical of the AES-CTR algorithm (when implemented without specific hardware support).

Math::Random::MT::AutoNot a bad module, but it seems to be superceded by MTwist, in that the latter is over 2x faster and doesn't seem to be missing many (any?) features.

Math::Random::MT. An earlier module than the above. Simpler, slower, and seeds from CORE::rand. Only allows 32-bits of state to be set by the user. Only fills returned doubles with 32-bits.

Math::Random::Secure. Gives a (srand,irand,rand) interface to Math::Random::ISAAC, including proper seeding. Large dependency list. Only fills returned doubles with 32-bits. Very slow, and even slower if Math::Random::ISAAC::XS isn't installed.

Math::Random::ISAAC. The first pure Perl implementation of ISAAC, widely copied or used by other modules. There is also an XS back end. Creates double output for rand() by dividing a random 32-bit int by 2^32-1 which makes the non-standard interval [0,1] and only uses 32-bits. No auto seeding is done.

Math::Random::Xorshift. A thin wrapper around Marsaglia's 2003 Xorshift PRNG. As the module author points out, this has some statistical flaws, though the later xorshift* algorithm fixed most of them and xoroshiro128+ is significantly better. Per thread context. Creates double output for rand() by dividing a random 32-bit int by 2^32-1 which makes the non-standard interval [0,1] and only uses 32-bits. Seeds with time. The primary advantage is that it runs faster than the other modules.

Commented on Base conversion in Dana Jacobsen

2016-08-18T16:25:58Z

Thanks! I added a small paragraph about Perl 6. The example on RosettaCode (http://rosettacode.org/wiki/Non-decimal_radices/Convert#Perl_6) shows how easy it is for standard bases 2-36, and I love how there's no wonkiness with int/double/bigint.

Posted Base conversion to Dana Jacobsen

2016-08-17T16:59:47Z

Earlier this year I added some commands to the ntheory module, inspired by Pari/GP and Mathematica, to split numbers into digits and put them together again. Of course this isn't terribly exciting by itself since Perl has split and join, but with optional bases it gets more interesting.

I recently took a look at the various modules that do base conversion (at least 9 modules, plus various standalone subroutines). Each has slightly different features and interfaces, and the performance at the extremes differs by over 10,000x. I've made some internal changes to ntheory based on my tests, which should show up in the next release.

Add: This is for Perl 5. Perl6 has native support for base conversions for bases 2-36 and seamless Bigint support. It Just Works. For larger bases or alternate encodings, bbkr's TinyID module can be used.

]]> Base conversion is used for different reasons. My interest has been purely for math reasons, rather than the more (in my view) web-oriented things such as base-64, base-85, and ASCII. ntheory will do base conversions independent of encoding if desired (an array of digit values), while most of the other modules here only deal with encoded numbers. Some modules are very flexible with encoding character sets, while others (including ntheory) stick with 0..9,a..z. Bigint handling varies greatly.

Since I'm always interested in performance, here are some benchmarks. 1000 random 60-bit integers are created, then converted to the input base. The benchmark sub does @output = map { convert } @input, which helps remove overhead vs. converting only one in a sub. The output arrays are verified to match. ntheory is an XS module, the rest (other than builtin and POSIX) are all pure Perl.

Base 6 to base 26:

Mastering Alg     3.35e-02/s
simple               0.602/s
Math::BaseConvert    0.605/s
Math::BaseCnv        0.611/s
Convert::AnyBase      57.4/s
Math::NumberBase      78.4/s
Math::BaseCalc        89.3/s
Math::Int2Base         105/s
ntheory               1627/s

Convert to base 2:

Mastering Alg       0.113/s
Math::BaseConvert   0.299/s
Math::BaseCnv       0.301/s
Math::Base::Convert  27.9/s
Convert::AnyBase     36.4/s
Math::NumberBase     46.7/s
simple               64.0/s
Math::BaseCalc       72.7/s
Math::Int2Base       91.7/s
builtin              1953/s
ntheory              2424/s

Convert from base 16:

Mastering Alg       0.271/s
Math::BaseConvert    1.92/s
simple               1.95/s
Math::BaseCnv        1.97/s
Math::Base::Convert  22.5/s
Math::BaseCalc        180/s
Math::NumberBase      190/s
Math::Int2Base        216/s
Convert::AnyBase      380/s
POSIX                3099/s
ntheory              3709/s

There are some builtins for common conversions. For instance:

sub to2  { sprintf "%b", shift; }
sub to16 { sprintf "%x", shift; }
sub from2  { unpack("N", pack("B32", substr("0" x 32 . shift, -32))); }
sub from16 { hex(shift); }

though they are limited to bases 2,8,10,16 and limited to the native word size (hex will warn if given a 33+bit input). The POSIX core module includes strtoul that will work with bases 2-36, but limited to the size of an unsigned long (which may or may not be the same size as Perl's UV integers!). These are quite fast.

There is some simple Perl code on Rosetta Code: Non-decimal radices/Convert. It works for bases 2-36, though not terribly quickly. It should be clear how to adjust it to handle other bases / encodings.

The book "Mastering Algorithms with Perl" (1999, errata1, errata2) has a base conversion routine. It's interesting, it works, but it's very slow -- over 10x slower than the much shorter RosettaCode routines. Bases 2-36, no bigints.

ntheory (2012-2016) is the one XS module in this list, so it's no surprise that it is the fastest, typically over 10x faster than the next fastest, and about the same speed as the builtins. It supports bases 2-2³¹ for arrays, but string encodings only 2-36 (lower case on output, either case for input).

fromdigits( $n );                  # implied base 10, number/string/bigint
fromdigits( "1c8", 16 );           # bases 2-36, any length string
fromdigits( [1,2,0,0,2,1], 3 );    # bases 2-2^31, digits of number

This takes a number in the given base and turns it into a single base-10 integer, possibly a bigint.

For todigits, the input $n is always an integer, hence base 10. It can be an actual int, a string, or a bigint. The result is an array of digits in the optional base, which is 2-2³¹.

todigits( $n );          # implied base 10, returns array of digits
todigits( $n, 2);        # turns $n into an array of binary digits
todigits( $n, 23, 14);   # exactly 14 base-23 digits

The optional third argument will either pad with leading zeros or truncate leading digits. It would be most common to use for things like needing exactly 32 binary digits.

Rather than rely on context (which too often goes wrong or requires wrapping in scalar()), I decided to use a separate function to return a string.

todigitstring( $n );       # implied base 10, basically a no-op.
todigitstring( $n, 23 );   # bases 2-36, encoded lower case
todigitstring( $n, 2, 4);  # bases 2-36, pads or truncates to 4 characters

Putting them together, we can convert strings in base 6 to base 27 like:

$n27 = todigitstring( fromdigits( $n6, 6 ), 27 );

Math::Int2Base (2008-2011) is a nice simple module that does base conversions with bases 2-62. In my benchmarks it is typically one of the fastest (barring ntheory and builtins), and the interface is very simple.

$s = int2base( $n, $b );   # converts $n (base-10) to $s (base-$b)
$n = base2int( $s, $b );   # converts $s (base-$b) to $n (base-10)

An optional third argument for int2base will pad with leading zeros if desired. Input and output are hard-coded to 0..9,A..Z,a..z, so bases under 36 must use upper case for input, and upper case will be output.

BigInts are supported for int2base if the input number is a bigint, and for base2int if the base is a bigint, with a caveat. Math::BigInt and Math::Pari work, but Math::GMP and Math::GMPz will both give incorrect results due to some weird interaction with int(). v0.59 and earlier of ntheory has the exact same issue.

With v5.21 and newer, int2base outputs a warning. I've filed an RT.

Math::BaseCnv (2003-2016) is also easy to use, and supports bases 2-64 by default, and many more encodings by name as well as custom settings if desired. By default upper case must be used for input and will be used in output.

$n27 = cnv( $n6, 6, 27 );   # convert input from first base into second.

As mentioned, really easy to use. The downside is that it is quite slow. If speed wasn't an issue and I wanted web-ish or arbitrary encodings, this is probably the module I'd use.

Math::BaseConvert is a fork of Math::BaseCnv to fix version/metadata issues that seem to now be resolved in the original. No surprise, the performance and functionality are almost identical. While the description says "fast functions [...]", it is not fast (Math::BaseCnv changed to "simple functions" a while back).

Math::BaseCalc (1999-2013) creates a conversion object that is then used in conversions to/from the base. There are few presets, but typically an array ref of characters is given to denote the encoding.

my $cnv7 = Math::BaseCalc->new(digits=>[0..6]);
$n = $cnv7->from_base("6543210");

Two objects are needed to convert between two non-decimal bases. Speed isn't bad. Bigints are supported by to_base but not from_base.

Math::NumberBase (2009) also uses a conversion object. Bases 2-36 use default lower-case symbols, but alternate sets can be given. There is a way to wrap two conversions together, though still using two objects.

my $cnv7 = Math::NumberBase->new(7);
$n = $cnv7->to_decimal("6543210");

Math::Base::Convert (2012-2015) has quite a few features, including both function and object interfaces. The object method performs pre-optimizations, and it has a lot of layers trying to map to fast conversion functions.

Strangely it does not seem to support bases other than powers of 2 and its named bases. I have filed an RT, because this is not the documented functionality. The cnv function returns either a value or an array depending on context, meaning I had to remember to wrap scalar() around it in most uses. I'm not a fan of using context here.

My benchmarks did not show it as particularly fast in most conversions, whether using the function or object method. For Bigints it does look better, and I made changes to ntheory based on this.

Convert::AnyBase (2009) uses Moose, which makes it a monster for dependencies compared to anything else. On the other hand, it uses this for some unique features. In particular, a user-defined sub can be applied to any input string, which allows things like case adjustment, handling negative inputs, invalid input, transforms (e.g. o to 0), etc.

A convenience downside is that a full symbol set must be given when creating the object. I used a shortcut thus:

my $fb=Convert::AnyBase->new(set => substr("0123456789abcdefghijklmnopqrstuvwxyz", 0, $base));

and as usual with most OO modules, two objects are needed for conversion between non-decimal bases.

Convert::BaseN (2008) is quite different than the other modules here, as it expects input and output as binary strings. I did not use it, as it really doesn't fit the use model here.

Math::Fleximal (2001-2005) is another module that doesn't really fit with the others. One creates flex objects that have a particular base, then one can do math with them, or convert to flexes with different bases. I didn't use this much, as it is overkill for simple individual number base conversions.

Posted Short survey of modules for combinations and permutations to Dana Jacobsen

2015-02-14T05:32:58Z

This is a short look at some modules for generating combinations and permutations. There are likely more modules that aren't listed. RosettaCode has examples of writing the combinations and permutations functions by hand.

Combinations and Permutations
Module	Impl	Comb	Perm	Comb w/rep	Perm w/rep	Derange	Speed	Order	Comments
Algorithm::Combinatorics	XS	yes	yes	yes	yes	yes	+	Lexico	Fast iterator or array
ntheory	XS	yes	yes	no	no	no	++	Lexico	Fast block call
Math::Combinatorics	Perl	yes	yes	no	no	yes	- -	Impl	Iterator or array
Algorithm::FastPermute	XS	no	yes	no	no	no	+++	Impl	Fast block call
Algorithm::Permute	XS	no	yes	no	no	no	+	Impl	Iterator or fast block call
Algorithm::Loops	Perl	no	yes	no	no	no	+	Impl	Iterator
List::Permutor	Perl	no	yes	no	no	no	-	Lexico	Iterator
Iterator::Misc	Perl	no	yes	no	no	no	- -	Lexico	Iterator
Math::Permute::Array	Perl	no	yes	no	no	no	- -	Impl	Iterator or index
Math::Permute::List	Perl	no	yes	no	no	no		Impl	Block call
Math::GSL::Permutation	XS	no	yes	no	no	no	-	Lexico	function interface
Math::Disarrange::List	Perl	no	no	no	no	yes		Impl	Block call
Math::GSL::Combination	XS	yes	no	no	no	no	+	Lexico	iterator or by index

]]> For counting the total number rather than generating actual vectors, Math::Counting, ntheory, or Math::Pari probably have what you need. Also see the RosettaCode task.

Some modules such as Algorithm::Combinatorics, ntheory, and List::Permutor give results in guaranteed lexicographic order. The other modules return data in an order corresponding to whatever internal algorithm is used. For an example unsorted 7 element array, each of the "Impl"-order modules gave a unique sequence (that is, each modules gave a different sequence from any other), while all "Lexico"-order modules gave identical sequences.

The speed is an approximate rating of how fast the permutations or combinations are generated with a relatively large set. Looping over the 479 million permutations of a 12 item set takes only 12 seconds for Algorithm::FastPermute, 1 minute for ntheory, 6 minutes for Algorithm::Permute, 12 minutes for Algorithm::Combinatorics and Algorithm::Loops, 30 minutes for List::Permutor, 37 minutes for Math::Combinatorics, 39 minutes for Math::GSL::Permutation, 42 minutes for the common example tsc-permute, 48 minutes for Iterator::Misc.

The perlfaq recommends List::Permutor, Algorithm::Permute, and Algorithm::Loops. I believe Algorithm::Combinatorics to be a better choice to point people to, as it is likely to cover all needs and calling styles, not just permutations. The results come in lexicographic order rather than implementation defined. It also has excellent documentation.

In all examples, assume we have done something like this setup, and wish to see all permutations of the data, or all combinations of 3 elements.

        use feature 'say';
        my @data = (qw/apple bread curry donut éclair/);

Algorithm::Combinatorics. This is probably what you're looking for. It has nearly everything you need and is pretty fast. Recommended. If you need the highest speed, Algorithm::FastPermute and ntheory are faster.

        use Algorithm::Combinatorics qw/combinations permutations/;

        my $citer = combinations(\@data, 3);

        while (my $c = $citer->next) { say "@$c"; }

        my $piter = permutations(\@data);

        while (my $p = $piter->next) { say "@$p"; }

ntheory. XS and Perl block calls for permutations and combinations. Note that the source array isn't directly used -- each block invocation is given an array of indices rather than a direct permutation/combination of the source array.

        use ntheory qw/forcomb forperm/;

        forcomb { say "@data[@_]" } scalar(@data),3;

        forperm { say "@data[@_]" } scalar(@data);

Math::Combinatorics. One of the slowest of the modules tested, but it does combinations, permutations, and derangements all without XS.

        use Math::Combinatorics;
        my $comb = Math::Combinatorics->new(count => 3, data => [@data]);

        while (my @c = $comb->next_combination) { say "@c" }

        while (my @p = $comb->next_permutation) { say "@p" }

Algorithm::FastPermute. The fastest permutation generator -- for large arrays it is about 10x faster than ntheory, 50x faster than Algorithm::Permute, 60-500x faster than the other modules. Modifies the source array, but after a full permutation it will be in the original order.

        use Algorithm::FastPermute qw/permute/;
        permute { say "@data" } @data;

Algorithm::Permute. Decent permutation iterator. Also includes the FastPermute block generator that works just like that example.

        use Algorithm::Permute;
        my $perm = Algorithm::Permute->new(\@data);
        while (my @set = $perm->next) { say "@set" }

Algorithm::Loops. Permutations through an iterator that modifies the source array. Combinations possible with some code. The array must be sorted to work correctly, and if using sorted numbers, you must use NextPermuteNum.

        use Algorithm::Loops qw/NextPermute/;
        do { say "@data" } while NextPermute(@data);

List::Permutor. Yet another permutation iterator.

        use List::Permutor;
        my $perm = List::Permutor->new(@data);
        while (my @set = $perm->next) { say "@set" }

Iterator::Misc. Yet another permutation iterator, also includes various other iteration functions.

        use Iterator::Misc;
        my $iter = ipermute(@data);
        while (!$iter->is_exhausted) { say "@{$iter->value}" }

Math::Permute::Array. Yet another permutation iterator. Has a rather different syntax than most other modules. Also has a block call. Also allows direct access to permutation index, though without defined order.

        use Math::Permute::Array;
        my $perm = Math::Permute::Array->new(\@data);
        say "@{$perm->cur()}";
        for (1 .. $perm->cardinal()-1) { say "@{$perm->next()}" }

Math::Permute::List. A block permutation generator. Permission issue recently fixed, so it installs correctly now.

        use Math::Permute::List;
        permute { say "@_" } @data;

Math::GSL::Permutation. Uses the GSL permutation API, which is function based, with a few (incomplete) helper methods. Not recommended for this task unless you're already using GSL. Uses a different API than Combination. Permutes reasonably fast, but no quick way to retrieve the array of permutations (calling as_list takes 95% of the time). Also note we have to use private class data.

        use Math::GSL::Permutation qw/:all/;
        my $p = Math::GSL::Permutation->new(scalar(@data));
        do {
          say "@data[$p->as_list]";
        } while !gsl_permutation_next($p->{_permutation});

Math::Disarrange::List. A block derangement generator. Permission issue recently fixed, so it installs correctly now.

        use Math::Disarrange::List;
        disarrange { say "@_" } @data;

Math::GSL::Combination. Documentation a bit incomplete. Not recommended for this task unless you're already using GSL. Inconsistent API with Permutation.

        use Math::GSL::Combination qw/:all/;
        my $c = Math::GSL::Combination->new(scalar(@data),3);
        do {
          say "@data[$c->as_list]";
          $c->next()
        } while !$c->status();

Commented on Stop Putting AUTO_INCREMENT IDs in URLs in Ovid

2014-11-06T19:13:42Z

laumers, by "guessable hashes" you mean 32 random hex digits? How is that guessable? Note I did not say to hash the id, as that would indeed be guessable. I said encrypt, e.g. $cbc->encrypt($id,$key,$iv);

Commented on Stop Putting AUTO_INCREMENT IDs in URLs in Ovid

2014-11-05T18:36:36Z

If you want to use random-like ids, you could use your convenient autoincrement internal application number, then do a simple encryption of the number. That makes more sense to me than a random UUID, in that it will look and act random for external users, but it is deterministic and reversible on your side in case you want that. No need to store what big random UUID mapped to what id, for instance.

That said, I like the human-readable bit more than random hex digits, but it's something to consider.

Commented on Perl6 Pi and continued fractions in Brad Gilbert

2014-10-31T21:32:27Z

This is tangentially related as it isn't looking at rationals.

I've seen a few places that use something like: Pi(n) that gives n digits of Pi. I added that to the Perl5 ntheory library a couple months ago, though I'm sure a clean Perl6 implementation wouldn't be as fractured. The performance difference is minor at a few digits, but at larger sizes it's quite dramatic (e.g. for 1M digits it can be a couple seconds vs. hours vs. weeks).

Chudnovsky / Ramanujan binary splitting. Looks like 4-5x faster than AGM, though more complicated. Pari uses this (they have AGM commented out after doing both and comparing). This is also what is used in the example on the GMP library site. I've tried it with Math::BigFloat and it was slower than AGM for me (probably due to more but smaller operations, which kills you with overhead).
AGM (Gauss-Brent-Salamin). Fast and easy (~15 lines of Perl 5), not too many lines with GMP. Obviously the Perl5 implementation is oodles slower than C+GMP, but it still has the good growth rate. MPFR uses this, as does ntheory's GMP and Perl5 code. There's a patch for Math::BigFloat to use this for larger inputs, but it hasn't been accepted. There's a Perl6 implementation on RosettaCode, but pretty slow.
Machin type formulas. Math::BigFloat uses this. It's pretty good for small amounts but AGM kills it for many digits because the growth rate is better. With the default Calc backend there are faster ways at all sizes.
Spigot style. The nice thing is that it doesn't need bigints and can be easily done in standard C or Perl. There is something similar for Perl6 on RosettaCode. Terrible growth rate vs. the others, but it's pretty fast for small sizes: at 2000 digits the pure Perl version is over 10x faster than either Machin or AGM with Math::BigFloat's Calc backend.

As for a rational, it would be nice to see something that produced a Rational or FatRat of the desired accuracy. Rationals are limited to 64-bit denominators (S02) so the fixed continuous fractions would seem fine. I think using the Chudnosky formula plus the RC trick for square roots could do this for FatRats.

As an aside, I'm happy that Perl6 is using arbitrary size ints, but it seems Rat vs. FatRat is akin to the native vs. bigint that drives me nuts with Perl5 and Perl6 gets rid of. For a library that expects exact answers, now I have to constantly deal with input and output conversion, and performance vs. correctness tradeoffs.

Posted Partitions of an Integer to Dana Jacobsen

2014-10-05T00:01:09Z

The subject of integer partitions comes up regularly, with multiple threads over the years at PerlMonks, a big chunk of Chapter 5 of Higher-Order Perl, and some pages from Knuth TAOCP volume 4A. Last year I added a partitions function to my ntheory Perl module, for counting integer partitions (OEIS A000041). A few months ago I added a partition iterator (forpart) since I liked the one Pari/GP added last year.

Integer partitions
Solution	Impl	Order anti-lex	Order lexico	Restrict count	Restrict size	Max in 10s	Count in 10s
ntheory 0.45	XS	yes	no	yes	yes	87	223,000
ntheory 0.45	Perl	yes	no	yes	yes	72	7,300
Integer::Partition 0.05	Perl	yes	yes	no	no	67	-
Integer::Partition::Unrestricted (unreleased, from Limbic 2004)	Perl	no	yes	no	no	62	6,000
MJD 2013	Perl	no	no	no	no	71	-
blokhead 2007	Perl	yes	no	no	no	63	-
kvale 2004	Perl	yes	no	no	no	62	-
sfink 2004	Perl	yes	no	no	no	58	-
tye 2001	Perl	no	no	no	no	58	-
jimt/liverpole/blazar 2006 (golfed, 73 chrs)	Perl	no (73) yes(90)	no	no	no	21	-
Pari/GP 2.8.0 (not a Perl module!)	C/Pari	no	no	yes	yes	100	34,000,000

]]> This is a look at some different solutions, including two CPAN modules, one unreleased module, and six solutions from forum posts. I also include the latest Pari/GP for comparison. All forum solutions were modified if necessary to take a sub called with the partition. This matches the behavior of the two modules and allows printing, filtering, counting, etc. without modifying the original code. Only iterators were considered, as the number of partitions grows quite rapidly.

For counting, the fastest solutions use the Hardy-Ramanujan-Rademacher formula. The state of the art is Johansson's ARB, which is thousands of times faster than Pari. Pari also uses the Rademacher formula and is quite fast. Jonathan Bober has GPL code using GMP and MPFR that is a little faster than Pari, but MPFR isn't installed on most platforms (meaning it's hard to incorporate into a portable library). I'm using a triangle solution, which isn't too bad in C+GMP compared to Perl's bigints, but way off the fast formula. Integer::Partitions doesn't have a counting method.

ntheory (aka Math::Prime::Util) has a function taking a block (aka an anonymous sub), the number, and an optional hash of restriction parameters. The block is called for each partition with the argument list set to the partition. The restriction parameters are similar to Pari/GP's, with min/max count and min/max element size. This can save quite a bit of time doing filtering (sometimes with shortcuts) inside the XS code.

I debated call by value (a new argument list for each call) vs. call by reference (giving the caller access to internal data). Some XS iterators, e.g. Algorithm::Permute's fast permute, do the latter, and it is faster. I decided on the former because I like the idea that the caller can manipulate its arguments as desired without worrying about segfaults, incorrect iteration, infinite iteration, etc.

Typically the XS code would be used, but there are also pure Perl implementations for everything. They are used if the XS Loader fails, the environment variable MPU_NO_XS exists and is true, or in cases where the arguments would overflow or are not properly parseable.


        use ntheory qw/forpart partitions/;
        forpart { say "@_"; } 8;                     # All partitions of 8
        forpart { say "@_"; } 10,{nmin=>5,amax=>4};  # Only 5+ parts, all <= 4
        say partitions(2000);                        # Counting

The Integer::Partition module has been on CPAN for a number of years, and is the only solution giving both lexicographic and anti-lexicographic ordering choices. It's reasonably fast unless you need larger values with restrictions, or want counts.


        use Integer::Partition;
        my $iter = Integer::Partition->new(8);
        while (my $p = $iter->next) {
          say "@$p";
        }

Math::Pari isn't in the table because it builds with Pari 2.1.7 by default. numbpart (partition count) was added in version 2.2.5 (Nov 2003), and forpart was added in 2.6.1 (Sep 2013). It's possible to build Math::Pari with version 2.3.5 so we could get numbpart but not forpart.

Pari's forpart is quite fast, and has some nice optimizations for restrictions as well. The ordering is by number of elements rather than a lexicographic ordering. The only way to use this from Perl would be a system call to gp.

There are also golfed solutions in under 100 characters. We can even add a callback sub and anti-lexicographic sort and still come in at 93 characters. As usual with heavily golfed code, these are quite slow, managing only 21 partitions in under 10 seconds. This also uses an internal hash for the partitions, which means memory use will grow (though time grows faster so this isn't really an issue).

Here is my simple modification to the golfed solutions, taking 90 characters
for integer partitions in anti-lexicographic order with a callback sub. It's
very slow however, so just for fun.



sub p{my($s,@e,$o,@f)=@_;@f=sort{$b<=>$a}@e;$_{"@f"}||=$s->(@f);p($s,++$o,@e)while--$e[0]}

p(5);

Commented on Learning Perl 6 - Find in a Large File in :m)

2014-09-10T11:31:32Z

I was having a similar issue, but on a larger file (32M 64-bit integers, about 120x larger than /usr/share/dict/words on my machine). This is more than I'd like to pull into an array, as it uses 13.5GB of memory.

Using the 'for $in -> $x { ... }' style was going quite slow, but the helpful people on #perl6 got be to try .get in a loop, e.g. 'while (my $x = $in.get) { ... }' which turns out to be much faster. Not only does it use almost no memory, it's 50% faster than the latest @d = "file".IO.lines.

BTW, this was meant to share my experience with using .get for reading a large file. Big thanks to Liz and others for speeding up Str.lines!

Commented on A comparison of memory use for primality modules in Dana Jacobsen

2014-09-01T05:51:18Z

I had tried adding trial division with no better performance, but that last was the clue. For small sizes, when not using Montgomery math, I have trial division using unsigned ints. This almost halves the time on the AIX/Power7 machine -- down to 5.0s. As I expected, trial division is slower on x86_64. I'm interested in hearing how your tests went.

Commented on A comparison of memory use for primality modules in Dana Jacobsen

2014-08-31T21:28:25Z

Perl 5.21.3 built on AIX 7.1 Power 7, 64-bit. Unfortunately building things on AIX is a PITA -- I gave up trying to get GMP, Math::Pari, or ExtUtils::CppGuess (used by Math::Prime::FastSieve) to work. Measuring memory use is non-obvious for fast-running tasks. I'll just do some times for the same 1-10M isprime to at least show the different architecture.

132.1 Perl trial division mod 6
291.7 Perl trial division
9.8 Math::Prime::Util
2.5 Math::Prime::Util with precalc
6.7 Math::Prime::XS

On this machine, Math::Prime::XS's simple trial division loop is faster than the non-cached routine I use in MPU until 3e7. Part of this is that MPU uses UV internally while MPXS uses "unsigned long". On this machine UV is "unsigned long long" (64-bit) and unsigned long is only 32-bit. That means MPXS is 32-bit, so doesn't work past 2^32 and probably explains the speed difference as well.

Commented on Link to public version control system (GitHub or otherwise) in Gábor Szabó - גאבור סבו

2014-08-31T01:36:37Z

Thanks for the reminder. It looks like two modules I updated for CPAN day aren't showing up because I forgot to add the meta-spec entry. I made sure I added the repository, but it seems without that it just tosses everything but the homepage.

Commented on A comparison of memory use for primality modules in Dana Jacobsen

2014-08-30T01:10:32Z

Brian, that's really a different question. This started out as just a "how much memory is used just to load the module without doing anything" but then I figured I should call a single function. Then figured I may as well do a toy task.

These aren't huge numbers, but from the Math::Prime::Util documentation:

is_prime from 10^100 to 10^100 + 0.2M
    2.2s  Math::Prime::Util          (BPSW + 1 random M-R)
    2.7s  Math::Pari w/2.3.5         (BPSW)
   13.0s  Math::Primality            (BPSW)
   35.2s  Math::Pari                 (10 random M-R)
   38.6s  Math::Prime::Util w/o GMP  (BPSW)
   70.7s  Math::Prime::Util          (n-1 or ECPP proof)
  102.9s  Math::Pari w/2.3.5         (APR-CL proof)

Math::Prime::Util with the GMP backend will support hundreds of thousands of digits, and is probably the fastest code for large numbers other than OpenPFGW's Fermat test, and is substantially faster than any of the other Perl modules. See this stackexchange challenge, or Nicely's list of first occurrence prime gaps where I used this module.

Caveat being that without Math::Prime::Util::GMP installed, it uses Math::BigInt (with GMP or Pari backend), which is super slow. My todo list has some sort of replacement to get a bigint solution that is both (1) portable assuming XS, and (2) reasonably fast. Also, there are some nice optimizations for x86_64 as well as 64-bit in general. It is still fast on non-x86 machines, but it will miss some of the better optimizations (asm mulmod, montgomery math).

Math::Pari, Math::GMP, Math::GMPz, and Math::Primality will support bigints pretty well. For the two GMP methods you'll have to decide how many tests to use. Math::Pari really needs to be updated to use a newer Pari by default -- the current version will do 10 M-R tests and is quite a bit slower than when built with Pari 2.3.5.

Math::Prime::XS does not support bigints. For 64-bit primes it is about 3-4 million times slower than MPU on my machine (but should be fast for most composites).

Math::Prime::FastSieve is going to eat a lot of memory and time making the sieve once we're past 10^8 or so. The answers are fast once done, but it's not the best solution. It took me 2 minutes to sieve to 10^10, and beyond that will take GB of memory.

Trial division is exponential time so even with C+GMP is not going to be practical past 25-30 digits (and is hideously slow at those sizes). The Perl code is just going to get worse.

Time for primality proofs is another discussion -- I'm writing some web pages on that since I realized I keep writing the same thing on forums.

For the largest known primes, we'd want to use a Lucas-Lehmer test since they are Mersenne primes. I have not added any special form tests (nor have the other modules), but the LL test is pretty straightforward. They would still take a long time. The largest currently known prime has 17,425,170 digits. Using code specifically made for this, it took 6 days on a 32-core server and 3.6 days on a GPU.

For a general form numbers, last year some people ran tests on a couple Wagstaff PRPs with ~4 million digits. OpenPFGW took 4-70 hours to show they were Fermat PRPs, and 5 days for the Lucas test. A fast Frobenius test implemented with GMP took slightly over one month.