PDL in Rust -- A Native Reimplementation of the Perl Data Language

A few days ago, when we announced pperl's native module strategy
on Reddit, someone asked about PDL support.
We replied:
"PDL will be supported." — to which u/fuzzmonkey35
responded: "We will live in glorious times when this happens."
Well. It happened.
We (as in: we and our AIs) reimplemented PDL (the Perl Data Language) from scratch in Rust — not a binding, not an FFI wrapper, but a ground-up reimplementation of the core engine. 15 data types, N-dimensional arrays, broadcasting, operator overloading, reductions, linear algebra, transcendental math — all in pure Rust, integrated as a native module into our pperl next-gen Perl5 platform.
    use PDL;
    my $a = pdl([1, 2, 3]);
    my $b = pdl([10, 20, 30]);
    say $a + $b;                              # [11 22 33]
    say(($a * $b)->sum);                      # 140
    say sin(pdl([0, 3.14159/2, 3.14159]));
45 tests, all green. Same Perl syntax. No XS. No C. No libpdl.
Why Reimplement PDL?
PDL is one of Perl's crown jewels: a fast, expressive N-dimensional array language that has served the scientific Perl community since 1996. It is also one of the most deeply entangled XS modules in the ecosystem: C code, PP code generation, Inline::C, compiler toolchain dependencies, and a build process that assumes intimate knowledge of perl's internals.
Our Perl runtime, pperl (Parallel Perl), is a from-scratch Perl 5 interpreter written in Rust with a Cranelift JIT and automatic parallelization via Rayon. It does not support XS. That is a deliberate architectural decision: XS ties modules to the C internals of the reference perl interpreter, which makes JIT compilation across module boundaries impossible and thread safety a minefield.
But "no XS" means PDL cannot simply be loaded. It must be reimplemented. And since we already have a Rust runtime, the natural answer is: reimplement PDL's engine in Rust, with zero C dependencies, and expose it through pperl's native module interface — the same mechanism we use for List::Util, Storable, Fcntl, and other core XS modules.
What's Implemented
The Rust PDL crate (rust-pdl) implements the core engine.
The pperl native module (src/native/PDL/) bridges it into
Perl space as a blessed PDL object, with operator
overloading and method dispatch working exactly as you'd expect.
Core Infrastructure
- 15 data types: Byte, SByte, Short, Ushort, Long, ULong, LongLong, ULongLong, Indx, Float, Double, LDouble, CFloat, CDouble, CLDouble
- N-dimensional arrays: arbitrary rank, with dims/strides/nelem metadata
- Automatic type promotion: integer → float → complex, following PDL's complexity ordering
- Broadcasting (formerly "threading"): implicit dimension negotiation for binary operations
- Bad value handling: sentinel values for integers, NaN for floats
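The dimension negotiation behind broadcasting can be sketched in a few lines of Rust. This is a minimal illustration, not the rust-pdl code: the function name `broadcast_dims` is hypothetical, and the sketch assumes PDL's convention that dimensions are matched starting from dim 0, that a size-1 dimension stretches to fit, and that a missing dimension counts as size 1.

```rust
/// Negotiate the result shape of a binary operation on two arrays.
/// Assumption: dims are listed PDL-style (dim 0 first) and aligned from
/// dim 0; a dim that one operand lacks is treated as size 1.
pub fn broadcast_dims(a: &[usize], b: &[usize]) -> Result<Vec<usize>, String> {
    let rank = a.len().max(b.len());
    let mut out = Vec::with_capacity(rank);
    for i in 0..rank {
        let da = a.get(i).copied().unwrap_or(1);
        let db = b.get(i).copied().unwrap_or(1);
        let d = match (da, db) {
            (x, y) if x == y => x, // sizes agree
            (1, y) => y,           // stretch the left operand
            (x, 1) => x,           // stretch the right operand
            (x, y) => {
                return Err(format!("dim {}: size {} does not match {}", i, x, y));
            }
        };
        out.push(d);
    }
    Ok(out)
}
```

Under these assumptions a 3-element vector combined with a 3x2 array yields a 3x2 result, while a 3-element and a 4-element vector are rejected.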
Constructors
    my $v = pdl([1, 2, 3]);        # 1D vector
    my $m = pdl([[1,2],[3,4]]);    # 2D matrix
    my $z = zeroes(4);             # [0 0 0 0]
    my $o = ones(3);               # [1 1 1]
    my $s = sequence(5);           # [0 1 2 3 4]
    my $x = xvals(3, 3);           # x-coordinate grid
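To make the last two constructors concrete, here is a rough Rust sketch of what sequence and xvals compute. It assumes a flat f64 buffer in which dim 0 varies fastest; the free functions below are illustrative stand-ins, not the rust-pdl API.

```rust
/// Values 0, 1, ..., n-1, as sequence(n) produces.
pub fn sequence(n: usize) -> Vec<f64> {
    (0..n).map(|i| i as f64).collect()
}

/// For an nx-by-ny grid stored with dim 0 fastest, every element holds its
/// own coordinate along dim 0 (the "x" coordinate).
pub fn xvals(nx: usize, ny: usize) -> Vec<f64> {
    (0..nx * ny).map(|i| (i % nx) as f64).collect()
}
```

With this layout, a 3x3 grid from `xvals` is the row 0 1 2 repeated three times.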
Arithmetic & Comparison (Overloaded)
    my $a = pdl([1, 2, 3]);
    my $b = pdl([10, 20, 30]);
    $a + $b;     # [11 22 33]
    $b - $a;     # [9 18 27]
    $a * $b;     # [10 40 90]
    $b / $a;     # [10 10 10]
    $a == $b;    # [0 0 0]  (element-wise, returns Byte PDL)
    sqrt($a);    # [1 1.414.. 1.732..]
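On the Rust side, an overloaded Perl operator ultimately has to land in an element-wise loop. The sketch below shows one plausible shape for that, using Rust's own operator traits. The `Vector` type and the `eq_elems` method are hypothetical and far simpler than a real N-dimensional, multi-type Pdl struct.

```rust
use std::ops::Add;

/// Hypothetical 1D stand-in for a Pdl of doubles.
#[derive(Debug, Clone, PartialEq)]
pub struct Vector(pub Vec<f64>);

impl<'a> Add for &'a Vector {
    type Output = Vector;

    /// Element-wise addition of two equally sized vectors.
    fn add(self, rhs: Self) -> Vector {
        assert_eq!(self.0.len(), rhs.0.len(), "length mismatch");
        Vector(self.0.iter().zip(&rhs.0).map(|(x, y)| x + y).collect())
    }
}

impl Vector {
    /// Element-wise equality, returning one 0/1 byte per element,
    /// in the spirit of the Byte result of `==` shown above.
    pub fn eq_elems(&self, rhs: &Vector) -> Vec<u8> {
        self.0.iter().zip(&rhs.0).map(|(x, y)| (x == y) as u8).collect()
    }
}
```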
Reductions
    my $x = pdl([1, 2, 3, 4, 5]);
    $x->sum;        # 15
    $x->avg;        # 3
    $x->min;        # 1
    $x->max;        # 5

    my $m = pdl([[1,2],[3,4]]);
    $m->sumover;    # [3 7]  (reduce along the first dimension, dim 0)
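The sumover result can be reproduced with a short Rust sketch. It assumes a flat buffer in which dim 0 (the innermost row, in PDL's dimension ordering) varies fastest, so each chunk of `dim0` elements collapses to one sum. The signature is illustrative only.

```rust
/// Sum along dim 0 of a flat buffer whose dim 0 has length `dim0`.
/// Assumption: dim 0 is the fastest-varying dimension in memory.
pub fn sumover(data: &[f64], dim0: usize) -> Vec<f64> {
    assert!(
        dim0 > 0 && data.len() % dim0 == 0,
        "buffer must hold a whole number of rows"
    );
    data.chunks(dim0)
        .map(|row| row.iter().sum::<f64>())
        .collect()
}
```

For the 2x2 example, the buffer 1 2 3 4 with `dim0 = 2` gives the two row sums 3 and 7.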
Linear Algebra & Selection
    pdl([1,2,3])->inner(pdl([4,5,6]));    # 32  (dot product)

    my $mask = pdl([0, 1, 0, 1, 1]);
    $mask->which;                         # [1 3 4]  (indices of true)
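Both operations reduce to simple loops over the underlying buffers. The following Rust sketch covers the 1D case on plain slices; the function signatures are assumptions made for illustration and ignore broadcasting and type dispatch.

```rust
/// Dot product of two equally sized 1D buffers.
pub fn inner(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len(), "length mismatch");
    a.iter().zip(b).map(|(x, y)| x * y).sum::<f64>()
}

/// Indices of all non-zero ("true") elements of a mask.
pub fn which(mask: &[f64]) -> Vec<usize> {
    mask.iter()
        .enumerate()
        .filter(|(_, v)| **v != 0.0)
        .map(|(i, _)| i)
        .collect()
}
```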
Transcendental Math
    sin(pdl([0, 3.14159/2]));    # [0 1]
    cos(pdl([0, 3.14159]));      # [1 -1]
    exp(pdl([0, 1]));            # [1 2.71828..]
    log(pdl([1, 2.71828]));      # [0 1]
Additional math functions include tan, asin, acos, atan, sinh, cosh, tanh, ceil, floor, rint, erf, erfc, lgamma.
Architecture: Two Layers
The implementation is split into two clean layers:
1. The rust-pdl crate — a standalone
Rust library (~7,100 lines across 23 modules) that knows nothing about
Perl. It implements the PDL data structure, all operations, type
dispatch, broadcasting, and display. It could be used from any Rust
project.
2. The pperl native module
(src/native/PDL/) — the bridge layer. It converts between
Perl SVs and Rust Pdl structs, registers XS-style
functions, manages operator overloading, and handles the blessed hash
pattern that Perl-side PDL code expects. PDL objects are blessed
hashrefs containing a raw pointer to a heap-allocated
Box<Pdl>.
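The raw-pointer handoff described in the second layer can be sketched as follows. This is a simplified illustration under stated assumptions: the `Pdl` struct here is a one-field stand-in, the handle is stored as a plain integer, and the three function names are hypothetical. The real bridge code may differ in all of these.

```rust
/// Hypothetical stand-in for the engine's array type.
pub struct Pdl {
    pub data: Vec<f64>,
}

/// Move a Pdl to the heap and hand out an integer-sized handle that
/// the Perl-side object could store.
pub fn into_handle(p: Pdl) -> usize {
    Box::into_raw(Box::new(p)) as usize
}

/// Borrow the Pdl behind a handle without taking ownership.
/// Safety: `handle` must come from `into_handle` and must not have
/// been released yet.
pub unsafe fn borrow_handle<'a>(handle: usize) -> &'a Pdl {
    unsafe { &*(handle as *const Pdl) }
}

/// Reclaim ownership and drop the Pdl, the analogue of DESTROY.
/// Safety: `handle` must come from `into_handle` and be released once.
pub unsafe fn release_handle(handle: usize) {
    unsafe { drop(Box::from_raw(handle as *mut Pdl)) };
}
```

The key design point is that ownership leaves Rust's borrow checker while the handle lives on the Perl side, so the bridge must guarantee exactly one release per handle.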
The module hierarchy mirrors PDL's own:
PDL — top-level boot, exports pdl/zeroes/ones/sum/...
PDL::Core — constructors, accessors, DESTROY
PDL::Ops — arithmetic & comparison operators
PDL::Ufunc — reductions (sum, avg, min, max, sumover, ...)
PDL::Math — transcendental functions
PDL::Primitive — inner, matmult, which, where, clip, append
PDL::Slices — slice("1:3"), slice("0:-1:2"), etc.
PDL::Basic — sequence, xvals, yvals, zvals
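As an illustration of what the slice strings in the PDL::Slices entry mean, here is a small Rust sketch that resolves a one-dimensional spec into concrete indices. It is deliberately simplified: it assumes inclusive ends, negative indices counting from the end, and a positive step only, and the function name `resolve_slice` is hypothetical.

```rust
/// Resolve a 1D slice spec such as "1:3" or "0:-1:2" into the selected
/// indices for a dimension of length `n`. Returns None for specs this
/// simplified sketch does not handle.
pub fn resolve_slice(spec: &str, n: usize) -> Option<Vec<usize>> {
    let parts = spec
        .split(':')
        .map(|p| p.trim().parse::<i64>())
        .collect::<Result<Vec<i64>, _>>()
        .ok()?;
    let (start, end, step) = match parts.as_slice() {
        [s, e] => (*s, *e, 1),
        [s, e, st] if *st > 0 => (*s, *e, *st),
        _ => return None,
    };
    let len = n as i64;
    // Negative indices count back from the end of the dimension.
    let norm = |i: i64| if i < 0 { i + len } else { i };
    let (start, end) = (norm(start), norm(end));
    if start < 0 || end >= len || start > end {
        return None;
    }
    Some(
        (start..=end)
            .step_by(step as usize)
            .map(|i| i as usize)
            .collect(),
    )
}
```

On a dimension of length 5, "1:3" selects indices 1, 2, 3 and "0:-1:2" selects 0, 2, 4.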
Why Not Just Bind to C PDL?
Three reasons:
- JIT integration: pperl's Cranelift JIT can compile hot loops operating on PDL data, but only if the data structures are Rust-native. A C library behind FFI is an opaque wall to the JIT. With rust-pdl, the JIT can in principle inline array operations into compiled machine code.
- Thread safety: pperl's Rayon-based auto-parallelization needs data structures that are safe to share across threads. Rust's ownership model provides this at compile time. C PDL's reference-counted, globally-mutated state does not.
- No toolchain dependency: A pure Rust PDL means no C compiler, no make, no XS, no Inline::C, no PP code generation. Run cargo build and it's done. The entire PDL engine compiles as part of the pperl binary.
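The thread-safety point can be illustrated with the standard library alone. The sketch below shares one read-only buffer across several worker threads, and the compiler checks that the borrow outlives every worker. It uses std scoped threads as a stand-in for Rayon, and `parallel_sum` is a hypothetical name, not part of pperl or rust-pdl.

```rust
use std::thread;

/// Sum a shared, read-only buffer in up to `parts` chunks, one scoped
/// thread per chunk. No locks or reference counts are needed because the
/// buffer is only borrowed immutably for the lifetime of the scope.
pub fn parallel_sum(data: &[f64], parts: usize) -> f64 {
    let parts = parts.max(1);
    // Ceiling division, with a minimum chunk size of 1 for empty input.
    let chunk = ((data.len() + parts - 1) / parts).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(chunk)
            .map(|c| s.spawn(move || c.iter().sum::<f64>()))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .sum::<f64>()
    })
}
```

Trying to mutate `data` from inside a worker would be rejected at compile time, which is the guarantee the parallelizer relies on.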
Performance
Look at the test harness output:
    PDL/010-constructor.t ... +5:ok (17/17) (p5: 81ms / +5: 5ms)
    PDL/020-arithmetic.t  ... +5:ok (12/12) (p5: 82ms / +5: 4ms)
    PDL/030-reductions.t  ... +5:ok (10/10) (p5: 81ms / +5: 5ms)
    PDL/040-math.t        ... +5:ok (6/6)   (p5: 85ms / +5: 4ms)
These numbers deserve context. The p5 column is standard perl5 with PDL loaded from CPAN: 81-85ms per test file, because PDL's startup pulls in dozens of modules, compiles PP code, and initializes the type system. The +5 column is pperl: 4-5ms. That's a roughly 16-21x startup advantage (81/5 ≈ 16, 85/4 ≈ 21), because the entire PDL engine is compiled into the pperl binary at build time. There is no module loading, no PP compilation, no dynamic linking, just a function pointer table registration.
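For readers who want to check the ratio, the per-file arithmetic is easy to reproduce. The snippet below simply divides the perl5 time by the pperl time for each line of the harness output; the constant and function names are made up for this illustration.

```rust
/// (test file, perl5 ms, pperl ms), copied from the harness output above.
pub const TIMINGS_MS: [(&str, f64, f64); 4] = [
    ("010-constructor.t", 81.0, 5.0),
    ("020-arithmetic.t", 82.0, 4.0),
    ("030-reductions.t", 81.0, 5.0),
    ("040-math.t", 85.0, 4.0),
];

/// Ratio of perl5 time to pperl time for each test file, in table order.
pub fn speedups() -> Vec<f64> {
    TIMINGS_MS.iter().map(|(_, p5, pperl)| p5 / pperl).collect()
}
```

The four ratios come out as 16.2, 20.5, 16.2 and 21.25. With timings this close to the millisecond resolution of the harness, they are best read as an order of magnitude rather than a precise benchmark.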
For large-array compute workloads, the Rust engine's performance matches PDL's C core. That is unsurprising, since tight numeric loops in Rust compile to essentially the same machine code as their C equivalents. The real win comes when pperl's JIT can fuse Perl-level loops with PDL array operations, something that is structurally impossible with XS-based PDL.
What's Next
The current implementation covers PDL's core operations — enough to run real scientific code. The roadmap:
- Slicing and dataflow: The slice() infrastructure exists, but the full lazy-evaluation dataflow engine (parent↔child transformations) is Phase 3
- PDL::PP equivalent: A Rust macro system for declaring typed operations, the foundation for community-contributed domain modules
- Domain modules: Signal processing, image processing, fitting — following PDL's module ecosystem
- JIT fusion: Cranelift compilation of PDL operations within Perl loops — the ultimate payoff of the pure-Rust architecture
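To give a feel for the "PDL::PP equivalent" item, here is a purely speculative sketch of how a declarative macro could stamp out one typed element-wise operation per data type. The macro name `elementwise_op` and its syntax are invented for this example. The planned macro system is not described in enough detail here to say what it will actually look like.

```rust
/// Hypothetical macro: declare an element-wise unary operation for one
/// element type from a function name, a type, and a closure.
macro_rules! elementwise_op {
    ($name:ident, $t:ty, $body:expr) => {
        pub fn $name(input: &[$t]) -> Vec<$t> {
            // The non-capturing closure coerces to a plain function pointer.
            let f: fn($t) -> $t = $body;
            input.iter().map(|&x| f(x)).collect()
        }
    };
}

// One declaration per supported type; a fuller system would loop over
// the whole type list instead of repeating the line.
elementwise_op!(square_f64, f64, |x| x * x);
elementwise_op!(square_i32, i32, |x| x * x);
```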
The Bigger Picture
PDL in Rust is a proof point for pperl's native module strategy. The Perl community has long been told that "no XS support" means "no real modules". We disagree. List::Util, Storable, Fcntl, Scalar::Util, Sub::Util, Sys::Hostname, PadWalker — all reimplemented as native Rust modules in pperl, all passing their test suites. PDL is the most ambitious one yet: a full numerical computing engine, not just a utility module.
The pattern is always the same: read the original C/XS implementation, understand the data structures and control flow, and transliterate faithfully into Rust. No shortcuts. No "simplified subset". The Perl interface must behave identically — because existing Perl code must run unchanged.
Pure Perl will never be as fast as XS? We are not sure about that. When the JIT can see through the module boundary and the parallelizer can distribute across cores, "pure Perl + Rust engine" may well surpass "Perl + C library behind an opaque FFI wall".
PDL in Rust is the beginning of that argument.
The rust-pdl crate and its pperl integration are part
of the pperl codebase, maintained by PetaMem s.r.o.
— Richard C. Jelinek, PetaMem s.r.o.