PDL in Rust -- A Native Reimplementation of the Perl Data Language



A few days ago, when we announced pperl's native module strategy on Reddit, someone asked about PDL support. We replied: "PDL will be supported." To which u/fuzzmonkey35 responded: "We will live in glorious times when this happens."

Well. It happened.

We (as in: we and our AIs) reimplemented PDL (the Perl Data Language) from scratch in Rust — not a binding, not an FFI wrapper, but a ground-up reimplementation of the core engine. 15 data types, N-dimensional arrays, broadcasting, operator overloading, reductions, linear algebra, transcendental math — all in pure Rust, integrated as a native module into our pperl next-gen Perl5 platform.

use PDL;
my $a = pdl([1, 2, 3]);
my $b = pdl([10, 20, 30]);
say $a + $b;           # [11 22 33]
say(($a * $b)->sum);   # 140
say sin(pdl([0, 3.14159/2, 3.14159]));

45 tests, all green. Same Perl syntax. No XS. No C. No libpdl.

Why Reimplement PDL?

PDL is one of Perl's crown jewels: a fast, expressive N-dimensional array language that has served the scientific Perl community since 1996. It is also one of the most deeply entangled XS modules in the ecosystem: C code, PP code generation, Inline::C, compiler toolchain dependencies, and a build process that assumes intimate knowledge of perl's internals.

Our Perl runtime, pperl (Parallel Perl), is a from-scratch Perl 5 interpreter written in Rust with a Cranelift JIT and automatic parallelization via Rayon. It does not support XS. That is a deliberate architectural decision: XS ties modules to perl5's C internals, which makes JIT compilation across module boundaries impossible and thread safety a minefield.

But "no XS" means PDL cannot simply be loaded. It must be reimplemented. And since we already have a Rust runtime, the natural answer is: reimplement PDL's engine in Rust, with zero C dependencies, and expose it through pperl's native module interface — the same mechanism we use for List::Util, Storable, Fcntl, and other core XS modules.

What's Implemented

The Rust PDL crate (rust-pdl) implements the core engine. The pperl native module (src/native/PDL/) bridges it into Perl space as a blessed PDL object, with operator overloading and method dispatch working exactly as you'd expect.

Core Infrastructure

  • 15 data types: Byte, SByte, Short, UShort, Long, ULong, LongLong, ULongLong, Indx, Float, Double, LDouble, CFloat, CDouble, CLDouble
  • N-dimensional arrays: arbitrary rank, with dims/strides/nelem metadata
  • Automatic type promotion: integer → float → complex, following PDL's complexity ordering
  • Broadcasting (formerly "threading"): implicit dimension negotiation for binary operations
  • Bad value handling: sentinel values for integers, NaN for floats
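
To make the dims/strides metadata and the broadcasting rule concrete, here is a minimal Rust sketch. It is not taken from rust-pdl: the function names strides_for and broadcast_dims are hypothetical, and it assumes PDL's usual conventions (dim 0 varies fastest, dimensions are matched starting at dim 0, and a missing or size-1 dimension stretches to fit the other operand).

```rust
/// Element strides for a shape, assuming dim 0 is the fastest-varying one.
/// (Hypothetical helper, not the rust-pdl API.)
fn strides_for(dims: &[usize]) -> Vec<usize> {
    let mut strides = Vec::with_capacity(dims.len());
    let mut step = 1;
    for &d in dims {
        strides.push(step);
        step *= d;
    }
    strides
}

/// Negotiate the output shape of a binary operation.
/// Dimensions are compared starting at dim 0; a missing dimension or a
/// dimension of size 1 is stretched to match the other operand.
fn broadcast_dims(a: &[usize], b: &[usize]) -> Result<Vec<usize>, String> {
    let rank = a.len().max(b.len());
    let mut out = Vec::with_capacity(rank);
    for i in 0..rank {
        let da = a.get(i).copied().unwrap_or(1);
        let db = b.get(i).copied().unwrap_or(1);
        if da == db || db == 1 {
            out.push(da);
        } else if da == 1 {
            out.push(db);
        } else {
            return Err(format!("dim {}: size {} does not match {}", i, da, db));
        }
    }
    Ok(out)
}
```

Under these assumptions a 3-element vector combined with a 3x2 matrix yields a 3x2 result, while a 3-element and a 4-element vector are rejected.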

Constructors

my $v = pdl([1, 2, 3]);           # 1D vector
my $m = pdl([[1,2],[3,4]]);       # 2D matrix
my $z = zeroes(4);                # [0 0 0 0]
my $o = ones(3);                  # [1 1 1]
my $s = sequence(5);              # [0 1 2 3 4]
my $x = xvals(3, 3);             # x-coordinate grid

Arithmetic & Comparison (Overloaded)

my $a = pdl([1, 2, 3]);
my $b = pdl([10, 20, 30]);
$a + $b;      # [11 22 33]
$b - $a;      # [9 18 27]
$a * $b;      # [10 40 90]
$b / $a;      # [10 10 10]
$a == $b;     # [0 0 0]   (element-wise, returns Byte PDL)
sqrt($a);     # [1 1.414.. 1.732..]
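
On the engine side, the overloaded operators above boil down to element-wise loops plus a result-type decision. The following is a rough sketch under stated assumptions, not the rust-pdl code: DType is a hypothetical five-type subset of the 15 types, promotion is modeled as "take the higher-ranked type", and the data is held as f64 for simplicity.

```rust
/// Hypothetical subset of the element types, declared in promotion order
/// (integer, then float, then complex).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum DType {
    Byte,
    Long,
    Float,
    Double,
    CDouble,
}

/// Result type of a binary operation: the higher-ranked operand type wins.
/// This is a simplification; real promotion rules may have more cases.
fn promote(a: DType, b: DType) -> DType {
    a.max(b)
}

/// Element-wise addition of two equally sized arrays.
fn add(a: &[f64], b: &[f64]) -> Vec<f64> {
    assert_eq!(a.len(), b.len(), "operands must have the same length");
    a.iter().zip(b).map(|(x, y)| x + y).collect()
}

/// Element-wise equality, returned as a byte mask (1 = equal, 0 = not equal).
fn eq_mask(a: &[f64], b: &[f64]) -> Vec<u8> {
    assert_eq!(a.len(), b.len(), "operands must have the same length");
    a.iter().zip(b).map(|(x, y)| u8::from(x == y)).collect()
}
```

The byte mask mirrors the "returns Byte PDL" behavior noted for == above.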

Reductions

my $x = pdl([1, 2, 3, 4, 5]);
$x->sum;       # 15
$x->avg;       # 3
$x->min;       # 1
$x->max;       # 5

my $m = pdl([[1,2],[3,4]]);
$m->sumover; # [3 7] (reduce along the first dimension, i.e. sum each row)
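
As an illustration of what a sumover-style reduction does to the underlying buffer, here is a small sketch for the 2D case. The function name sumover_2d is hypothetical, and it assumes the data is stored with dim 0 varying fastest, so each run of n0 consecutive elements is one row.

```rust
/// Sum along dim 0 of a 2D array stored with dim 0 varying fastest.
/// `dims` is [n0, n1]; the result has n1 elements, one per row.
/// (Hypothetical helper, not the rust-pdl API.)
fn sumover_2d(data: &[f64], dims: [usize; 2]) -> Vec<f64> {
    let [n0, n1] = dims;
    assert_eq!(data.len(), n0 * n1, "data length must equal n0 * n1");
    if n0 == 0 {
        // Summing over an empty dimension gives zero for every row.
        return vec![0.0; n1];
    }
    data.chunks(n0).map(|row| row.iter().sum::<f64>()).collect()
}
```

For the matrix [[1,2],[3,4]] the flat buffer is 1, 2, 3, 4 with dims [2, 2], and the sketch produces 3 and 7.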

Linear Algebra & Selection

pdl([1,2,3])->inner(pdl([4,5,6]));   # 32 (dot product)

my $mask = pdl([0, 1, 0, 1, 1]);
$mask->which; # [1 3 4] (indices of true)
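
For readers who think in Rust, the two operations above correspond roughly to the following 1D sketch. The free functions inner and which are illustrative only and make no claim about the actual rust-pdl signatures.

```rust
/// Dot product of two equally sized vectors.
fn inner(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len(), "operands must have the same length");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Indices of all non-zero (true) elements of a mask.
fn which(mask: &[f64]) -> Vec<usize> {
    mask.iter()
        .enumerate()
        .filter_map(|(i, &v)| if v != 0.0 { Some(i) } else { None })
        .collect()
}
```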

Transcendental Math

sin(pdl([0, 3.14159/2]));    # [0 1]
cos(pdl([0, 3.14159]));      # [1 -1]
exp(pdl([0, 1]));             # [1 2.71828..]
log(pdl([1, 2.71828]));      # [0 1]

Additional math functions include tan, asin, acos, atan, sinh, cosh, tanh, ceil, floor, rint, erf, erfc, lgamma.

Architecture: Two Layers

The implementation is split into two clean layers:

1. The rust-pdl crate — a standalone Rust library (~7,100 lines across 23 modules) that knows nothing about Perl. It implements the PDL data structure, all operations, type dispatch, broadcasting, and display. It could be used from any Rust project.

2. The pperl native module (src/native/PDL/) — the bridge layer. It converts between Perl SVs and Rust Pdl structs, registers XS-style functions, manages operator overloading, and handles the blessed hash pattern that Perl-side PDL code expects. PDL objects are blessed hashrefs containing a raw pointer to a heap-allocated Box<Pdl>.
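
The raw-pointer handle pattern mentioned in the second layer can be sketched in plain Rust as follows. This is an assumption-laden illustration: the Pdl struct here is a stand-in with a single field, the handle is modeled as a usize (what the Perl-side hash would store), and the function names are hypothetical rather than pperl's real bridge API.

```rust
/// Hypothetical stand-in for the engine's array struct.
struct Pdl {
    data: Vec<f64>,
}

/// Move a Pdl to the heap and return an opaque handle for the Perl side.
fn into_handle(p: Pdl) -> usize {
    Box::into_raw(Box::new(p)) as usize
}

/// Borrow the array behind a handle.
///
/// Safety: `handle` must come from `into_handle` and must not have been
/// passed to `destroy_handle` yet.
unsafe fn borrow_handle<'a>(handle: usize) -> &'a Pdl {
    unsafe { &*(handle as *const Pdl) }
}

/// What a DESTROY hook would do: take ownership back so the Box is
/// dropped exactly once.
///
/// Safety: same contract as `borrow_handle`; the handle is invalid afterwards.
unsafe fn destroy_handle(handle: usize) {
    unsafe { drop(Box::from_raw(handle as *mut Pdl)) };
}
```

The design point is that Rust gives up ownership tracking at the Box::into_raw boundary, so the bridge layer has to guarantee that DESTROY runs once per object.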

The module hierarchy mirrors PDL's own:

PDL              — top-level boot, exports pdl/zeroes/ones/sum/...
PDL::Core        — constructors, accessors, DESTROY
PDL::Ops         — arithmetic & comparison operators
PDL::Ufunc       — reductions (sum, avg, min, max, sumover, ...)
PDL::Math        — transcendental functions
PDL::Primitive   — inner, matmult, which, where, clip, append
PDL::Slices      — slice("1:3"), slice("0:-1:2"), etc.
PDL::Basic       — sequence, xvals, yvals, zvals
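
To give a feel for what the PDL::Slices layer has to do with a spec such as "1:3" or "0:-1:2", here is a deliberately small sketch for a single dimension. It assumes PDL's slice conventions (both ends inclusive, negative indices counted from the end) and only handles positive steps; slice_indices is a hypothetical name, not a function from rust-pdl.

```rust
/// Resolve a "start:end[:step]" spec against a dimension of length `n`.
/// Both ends are inclusive and negative indices count from the end.
/// Reversed ranges and negative steps are rejected in this sketch.
fn slice_indices(spec: &str, n: usize) -> Result<Vec<usize>, String> {
    let parts: Vec<&str> = spec.split(':').collect();
    if parts.len() < 2 || parts.len() > 3 {
        return Err(format!("unsupported slice spec: {}", spec));
    }
    let parse = |s: &str| {
        s.trim()
            .parse::<i64>()
            .map_err(|e| format!("bad index '{}': {}", s, e))
    };
    let len = n as i64;
    let resolve = |i: i64| if i < 0 { i + len } else { i };
    let start = resolve(parse(parts[0])?);
    let end = resolve(parse(parts[1])?);
    let step = if parts.len() == 3 { parse(parts[2])? } else { 1 };
    if step <= 0 {
        return Err("only positive steps are supported in this sketch".to_string());
    }
    if start < 0 || end >= len || start > end {
        return Err(format!("slice {} out of range for length {}", spec, n));
    }
    Ok((start..=end).step_by(step as usize).map(|i| i as usize).collect())
}
```

On a 5-element dimension, "1:3" selects indices 1, 2, 3 and "0:-1:2" selects 0, 2, 4.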

Why Not Just Bind to C PDL?

Three reasons:

  1. JIT integration. pperl's Cranelift JIT can compile hot loops operating on PDL data — but only if the data structures are Rust-native. A C library behind FFI is an opaque wall to the JIT. With rust-pdl, the JIT can in principle inline array operations into compiled machine code.

  2. Thread safety. pperl's Rayon-based auto-parallelization needs data structures that are safe to share across threads. Rust's ownership model provides this at compile time. C PDL's reference-counted, globally-mutated state does not.

  3. No toolchain dependency. A pure Rust PDL means no C compiler, no make, no XS, no Inline::C, no PP code generation. cargo build and it's done. The entire PDL engine compiles as part of the pperl binary.
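
The thread-safety argument can be illustrated with nothing but the standard library. The sketch below is not pperl's parallelizer: it swaps Rayon for std::thread::scope so it stays dependency-free, and parallel_sum is a hypothetical function. The point is that the compiler itself verifies that the borrowed array data may be read from several threads at once.

```rust
use std::thread;

/// Sum a slice in parallel: split it into chunks and sum each chunk on
/// its own scoped thread. The borrow checker guarantees `data` outlives
/// every worker and is only read, never mutated, while they run.
fn parallel_sum(data: &[f64], workers: usize) -> f64 {
    let workers = workers.max(1);
    // Ceiling division, with a minimum chunk size of 1 for empty input.
    let chunk = ((data.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(chunk)
            .map(|part| s.spawn(move || part.iter().sum::<f64>()))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .sum::<f64>()
    })
}
```

Trying to mutate data from inside one of the workers would be rejected at compile time, which is the guarantee the second reason relies on.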


Performance

Look at the test harness output:

PDL/010-constructor.t ... +5:ok (17/17) (p5: 81ms / +5: 5ms)
PDL/020-arithmetic.t  ... +5:ok (12/12) (p5: 82ms / +5: 4ms)
PDL/030-reductions.t  ... +5:ok (10/10) (p5: 81ms / +5: 5ms)
PDL/040-math.t        ... +5:ok  (6/6)  (p5: 85ms / +5: 4ms)

These numbers deserve context. The p5 column is standard perl5 with PDL loaded from CPAN: 81-85ms per test file, because PDL's startup pulls in dozens of modules, compiles PP code, and initializes the type system. The +5 column is pperl: 4-5ms. Per file, that is a 16-21x advantage (81/5 ≈ 16, 85/4 ≈ 21), and for test files this small it is essentially a startup advantage: the entire PDL engine is compiled into the pperl binary at build time. There is no module loading, no PP compilation, no dynamic linking, just a function pointer table registration.
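
For anyone who wants to check the ratio, the per-file speedups can be recomputed from the harness output with a few lines of Rust. The timings are copied from the four lines above; the constant and function names are just for illustration.

```rust
/// (test file, perl5 ms, pperl ms) as reported by the harness output.
const TIMINGS: [(&str, f64, f64); 4] = [
    ("010-constructor.t", 81.0, 5.0),
    ("020-arithmetic.t", 82.0, 4.0),
    ("030-reductions.t", 81.0, 5.0),
    ("040-math.t", 85.0, 4.0),
];

/// Speedup factor per test file: perl5 time divided by pperl time.
fn speedups() -> Vec<(&'static str, f64)> {
    TIMINGS
        .iter()
        .map(|&(name, p5_ms, pperl_ms)| (name, p5_ms / pperl_ms))
        .collect()
}
```

The resulting factors are 16.2, 20.5, 16.2 and 21.25. With millisecond-resolution timings of 4-5ms, these ratios are coarse and should be read as an order of magnitude rather than a precise benchmark.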

For large-array compute workloads, the Rust engine's performance matches PDL's C core. That is unsurprising: tight numeric loops in Rust compile to essentially the same machine code as their C equivalents. The real win comes when pperl's JIT can fuse Perl-level loops with PDL array operations, something that is structurally impossible with XS-based PDL.

What's Next

The current implementation covers PDL's core operations — enough to run real scientific code. The roadmap:

  • Slicing and dataflow: The slice() infrastructure exists but the full lazy-evaluation dataflow engine (parent↔child transformations) is Phase 3
  • PDL::PP equivalent: A Rust macro system for declaring typed operations — the foundation for community-contributed domain modules
  • Domain modules: Signal processing, image processing, fitting — following PDL's module ecosystem
  • JIT fusion: Cranelift compilation of PDL operations within Perl loops — the ultimate payoff of the pure-Rust architecture
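
What a Rust-side PDL::PP equivalent might look like is still open. Purely as a thought experiment, and not as a description of the planned design, a macro_rules! sketch could declare one element-wise operation and stamp it out for a list of element types. The macro name declare_unary_op, the generated Square trait, and the square_all helper are all hypothetical.

```rust
/// Declare an element-wise unary operation once and implement it for
/// every listed element type. (Hypothetical macro, for illustration only.)
macro_rules! declare_unary_op {
    ($trait_name:ident, $method:ident, |$x:ident| $body:expr, [$($t:ty),+ $(,)?]) => {
        trait $trait_name: Sized {
            fn $method(data: &[Self]) -> Vec<Self>;
        }
        $(
            impl $trait_name for $t {
                fn $method(data: &[Self]) -> Vec<Self> {
                    data.iter().map(|&$x| $body).collect()
                }
            }
        )+
    };
}

// One declaration yields typed implementations for four element types.
declare_unary_op!(Square, square, |v| v * v, [i32, i64, f32, f64]);

/// Generic entry point that dispatches on the element type at compile time.
fn square_all<T: Square>(data: &[T]) -> Vec<T> {
    T::square(data)
}
```

A real equivalent would also need to cover broadcasting signatures, bad values and multi-argument operations, which this sketch ignores.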

The Bigger Picture

PDL in Rust is a proof point for pperl's native module strategy. The Perl community has long been told that "no XS support" means "no real modules". We disagree. List::Util, Storable, Fcntl, Scalar::Util, Sub::Util, Sys::Hostname, PadWalker — all reimplemented as native Rust modules in pperl, all passing their test suites. PDL is the most ambitious one yet: a full numerical computing engine, not just a utility module.

The pattern is always the same: read the original C/XS implementation, understand the data structures and control flow, and transliterate faithfully into Rust. No shortcuts. No "simplified subset". The Perl interface must behave identically — because existing Perl code must run unchanged.

Pure Perl will never be as fast as XS? We are not sure about that. When the JIT can see through the module boundary and the parallelizer can distribute across cores, "pure Perl + Rust engine" may well surpass "Perl + C library behind an opaque FFI wall".

PDL in Rust is the beginning of that argument.

The rust-pdl crate and its pperl integration are part of the pperl codebase, maintained by PetaMem s.r.o.

— Richard C. Jelinek, PetaMem s.r.o.
