Deep Cloning

I've been using Storable's dclone for years, but there's a module on CPAN called Clone that is said to be much faster. However, it doesn't seem to work.


use strict;
use warnings;
use Clone qw(clone);
use Data::Printer;
use 5.010;

my $a = { blue => '#0000ff', one => 1 };

say "A";
p $a; 

my $b = clone($a);

say "B";
p $b; 

$b->{car} = 'Ford';

say "B";
p $b; 

say "A";
p $a; 

Given the above code, I'd expect $a to retain it's structure, but $b to have the added new element of car = Ford. That's what happens when I use dclone, but for some reason it doesn't happen using  clone. Am I doing something wrong? 

Instead what happens for me with clone is this:



A
\ {
    blue   "BLUE",
    one   1
}
B
\ {
    blue   "BLUE",
    one   1
}
B
\ {
    blue   "#0000ff",
    car   "Ford",
    one   1
}
A
\ {
    blue   "#0000ff",
    car   "Ford",
    one   1
}

13 Comments

It looks like you've found a bug in Data::Printer.

use strict;
use warnings;
use Clone qw(clone);
use Data::Dumper;
use 5.010;

my $a = { blue => '#0000ff', one => 1 };

say "A";
say Dumper( $a );

my $b = clone($a);

say "B";
say Dumper( $b );

$b->{car} = 'Ford';

say "B";
say Dumper( $b );

say "A";
say Dumper( $a );
__END__
A
$VAR1 = {
'one' => 1,
'blue' => '#0000ff'
};

B
$VAR1 = {
'blue' => '#0000ff',
'one' => 1
};

B
$VAR1 = {
'blue' => '#0000ff',
'one' => 1,
'car' => 'Ford'
};

A
$VAR1 = {
'one' => 1,
'blue' => '#0000ff'
};

Data::Printer does some weird stuff to $a and $b. Use Devel::Peek to Dump the variables after calling p().

use strict;
use warnings;
use Clone qw(clone);
use Data::Printer;
use Devel::Peek;
use 5.010;

my $a = { blue => '#0000ff', one => 1 };

say "A";
p $a;

my $b = clone($a);

say "B";
p $b;

$b->{car} = 'Ford';

say "B";
p $b;

Dump( $b );

say "A";
p $a;

Dump( $a );


Then make the same change in the Data::Dumper version and you'll see a huge difference.

Here's a script which shows it explicitly:

use strict;
use warnings;
use Clone qw(clone);
#use Data::Printer;
use 5.010;

my $a = { blue => '#0000ff', one => 1 };

#p $a;

my $b = clone($a);

say "$a";
say "$b";
if ($a == $b) {
say "Bad stuff dude.";
}
else {
say "The world works again";
}

print eval {
use Data::Printer;
my $c = { red => '#ff0000', two => 2, };
p $c;
my $d = clone($c);
say "$c";
say "$d";
if ($c == $d) {
say "Bad stuff dude";
}
else {
say "The world works again";
}
}

My current favorite is Data::Clone. Quite a bit faster than Storable's dclone.

$ bench -MStorable=dclone -MData::Clone 'clone([1..10])' 'dclone([1..10])'
Benchmarking a => sub { clone([1..10]) }, b => sub { dclone([1..10]) } ...
a: 875001 calls (841945/s), 1.039s (0.0012ms/call)
b: 239130 calls (238848/s), 1.001s (0.0042ms/call)
Fastest is a (3.525x b)

It's definitely Data::Printer doing it. It replaces your IV with a PVMG.

use strict;
use warnings;
use Data::Printer;
use Devel::Peek;
use 5.010;

my $a = { one => 1 };
Dump $a;
p $a;
Dump $a;

__END__
SV = IV(0x43e920) at 0x43e924
REFCNT = 1
FLAGS = (PADMY,ROK)
RV = 0x54c18c
SV = PVHV(0x4300fc) at 0x54c18c
REFCNT = 1
FLAGS = (SHAREKEYS)
ARRAY = 0x43bad4 (0:7, 1:1)
hash quality = 100.0%
KEYS = 1
FILL = 1
MAX = 7
RITER = -1
EITER = 0x0
Elt "one" HASH = 0x7f17f79a
SV = IV(0x54c228) at 0x54c22c
REFCNT = 1
FLAGS = (IOK,pIOK)
IV = 1
\ {
one 1
}
SV = PVMG(0x4f655c) at 0x43e924
REFCNT = 1
FLAGS = (PADMY,RMG,ROK)
IV = 0
NV = 0
RV = 0x54c18c
SV = PVHV(0x4300fc) at 0x54c18c
REFCNT = 1
FLAGS = (RMG,OOK,SHAREKEYS)
MAGIC = 0x444a94
MG_VIRTUAL = 0
MG_PRIVATE = 18756
MG_TYPE = PERL_MAGIC_ext(~)
MG_FLAGS = 0x02
REFCOUNTED
MG_OBJ = 0x25a7114
SV = PVIV(0x43205c) at 0x25a7114
REFCNT = 2
FLAGS = (IOK,POK,pIOK,pPOK)
IV = 5554572
PV = 0x25cefdc "5554572"\0
CUR = 7
LEN = 8
ARRAY = 0x25cb954 (0:7, 1:1)
hash quality = 100.0%
KEYS = 1
FILL = 1
MAX = 7
RITER = -1
EITER = 0x0
BACKREFS = 0x25a70f4
SV = PVAV(0x25432dc) at 0x25a70f4
REFCNT = 2
FLAGS = ()
ARRAY = 0x25cd1f4
FILL = 0
MAX = 3
ARYLEN = 0x0
FLAGS = ()
Elt No. 0
SV = PVMG(0x4f65bc) at 0x25a7104
REFCNT = 1
FLAGS = (GMG,SMG,ROK,WEAKREF)
UV = 0
NV = 0
RV = 0x54c18c
SV = PVHV(0x4300fc) at 0x54c18c
REFCNT = 1
FLAGS = (RMG,OOK,SHAREKEYS)
MAGIC = 0x444a94
MG_VIRTUAL = 0
MG_PRIVATE = 18756
MG_TYPE = PERL_MAGIC_ext(~)
MG_FLAGS = 0x02
REFCOUNTED
MG_OBJ = 0x25a7114
SV = PVIV(0x43205c) at 0x25a7114
REFCNT = 2
FLAGS = (IOK,POK,pIOK,pPOK)
IV = 5554572
PV = 0x25cefdc "5554572"\0
CUR = 7
LEN = 8
ARRAY = 0x25cb954 (0:7, 1:1)
hash quality = 100.0%
KEYS = 1
FILL = 1
MAX = 7
RITER = -1
EITER = 0x0
BACKREFS = 0x25a70f4
SV = PVAV(0x25432dc) at 0x25a70f4
REFCNT = 2
FLAGS = ()
ARRAY = 0x25cd1f4
FILL = 0
MAX = 3
ARYLEN = 0x0
FLAGS = ()
PV = 0x54c18c ""
CUR = 0
LEN = 0
MAGIC = 0x444ac4
MG_VIRTUAL = &PL_vtbl_uvar
MG_TYPE = PERL_MAGIC_uvar(U)
MG_FLAGS = 0x02
REFCOUNTED
MG_OBJ = 0x25a70e4
SV = PVAV(0x25432f4) at 0x25a70e4
REFCNT = 1
FLAGS = ()
ARRAY = 0x25cd21c
FILL = 1
MAX = 3
ARYLEN = 0x0
FLAGS = (REAL)
MG_LEN = 12
MG_PTR = 0x25cd244 "\0\0\0\0\0205!\"\0\0\0\0"
Elt "one" HASH = 0x7f17f79a
SV = IV(0x54c228) at 0x54c22c
REFCNT = 1
FLAGS = (IOK,pIOK)
IV = 1
PV = 0x54c18c ""
CUR = 0
LEN = 0
MAGIC = 0x444a04
MG_VIRTUAL = &PL_vtbl_backref
MG_TYPE = PERL_MAGIC_backref( MG_FLAGS = 0x02
REFCOUNTED
MG_OBJ = 0x54c29c
SV = PVAV(0x254327c) at 0x54c29c
REFCNT = 2
FLAGS = ()
ARRAY = 0x25ccb64
FILL = 0
MAX = 3
ARYLEN = 0x0
FLAGS = ()
Elt No. 0
SV = PVMG(0x4f657c) at 0x54c27c
REFCNT = 1
FLAGS = (GMG,SMG,ROK,WEAKREF)
UV = 0
NV = 0
RV = 0x43e924
SV = PVMG(0x4f655c) at 0x43e924
REFCNT = 1
FLAGS = (PADMY,RMG,ROK)
IV = 0
NV = 0
RV = 0x54c18c
SV = PVHV(0x4300fc) at 0x54c18c
REFCNT = 1
FLAGS = (RMG,OOK,SHAREKEYS)
MAGIC = 0x444a94
MG_VIRTUAL = 0
MG_PRIVATE = 18756
MG_TYPE = PERL_MAGIC_ext(~)
MG_FLAGS = 0x02
REFCOUNTED
MG_OBJ = 0x25a7114
SV = PVIV(0x43205c) at 0x25a7114
REFCNT = 2
FLAGS = (IOK,POK,pIOK,pPOK)
IV = 5554572
PV = 0x25cefdc "5554572"\0
CUR = 7
LEN = 8
ARRAY = 0x25cb954 (0:7, 1:1)
hash quality = 100.0%
KEYS = 1
FILL = 1
MAX = 7
RITER = -1
EITER = 0x0
BACKREFS = 0x25a70f4
SV = PVAV(0x25432dc) at 0x25a70f4
REFCNT = 2
FLAGS = ()
ARRAY = 0x25cd1f4
FILL = 0
MAX = 3
ARYLEN = 0x0
FLAGS = ()
PV = 0x54c18c ""
CUR = 0
LEN = 0
MAGIC = 0x444a04
MG_VIRTUAL = &PL_vtbl_backref
MG_TYPE = PERL_MAGIC_backref( MG_FLAGS = 0x02
REFCOUNTED
MG_OBJ = 0x54c29c
SV = PVAV(0x254327c) at 0x54c29c
REFCNT = 2
FLAGS = ()
ARRAY = 0x25ccb64
FILL = 0
MAX = 3
ARYLEN = 0x0
FLAGS = ()
MAGIC = 0x25926d4
MG_VIRTUAL = 0
MG_PRIVATE = 18756
MG_TYPE = PERL_MAGIC_ext(~)
MG_FLAGS = 0x02
REFCOUNTED
MG_OBJ = 0x43e884
SV = PVIV(0x43203c) at 0x43e884
REFCNT = 2
FLAGS = (IOK,POK,pIOK,pPOK)
IV = 4450596
PV = 0x25ceefc "4450596"\0
CUR = 7
LEN = 8
PV = 0x43e924 ""
CUR = 0
LEN = 0
MAGIC = 0x4449a4
MG_VIRTUAL = &PL_vtbl_uvar
MG_TYPE = PERL_MAGIC_uvar(U)
MG_FLAGS = 0x02
REFCOUNTED
MG_OBJ = 0x25a9b0c
SV = PVAV(0x2543294) at 0x25a9b0c
REFCNT = 1
FLAGS = ()
ARRAY = 0x25cce5c
FILL = 1
MAX = 3
ARYLEN = 0x0
FLAGS = (REAL)
Elt No. 0
SV = PVIV(0x43203c) at 0x43e884
REFCNT = 2
FLAGS = (IOK,POK,pIOK,pPOK)
IV = 4450596
PV = 0x25ceefc "4450596"\0
CUR = 7
LEN = 8
Elt No. 1
SV = PVHV(0x24fb87c) at 0x25a9cec
REFCNT = 1
FLAGS = (SHAREKEYS)
ARRAY = 0x25a017c (0:7, 1:1)
hash quality = 100.0%
KEYS = 1
FILL = 1
MAX = 7
RITER = -1
EITER = 0x0
MG_LEN = 12
MG_PTR = 0x25cce84 "\0\0\0\0\0205!\"\0\0\0\0"
MAGIC = 0x25926d4
MG_VIRTUAL = 0
MG_PRIVATE = 18756
MG_TYPE = PERL_MAGIC_ext(~)
MG_FLAGS = 0x02
REFCOUNTED
MG_OBJ = 0x43e884
SV = PVIV(0x43203c) at 0x43e884
REFCNT = 2
FLAGS = (IOK,POK,pIOK,pPOK)
IV = 4450596
PV = 0x25ceefc "4450596"\0
CUR = 7
LEN = 8

So the workaround is to hold off on calling p() until after you've cloned.

use strict;
use warnings;
use Clone qw(clone);
use Data::Printer;
use 5.010;

my $a = { blue => '#0000ff', one => 1 };

my $b = clone($a);

say "A";
p $a;

say "B";
p $b;

$b->{car} = 'Ford';

say "B";
p $b;

say "A";
p $a;

$a and $b are global variables, which are reassigned on any sort invocation. I am pretty sure Data::Printer does not localize $a and $b when it does its sorting, hence the weird effects you see.

The real workaround is to never use $a and $b unless in a sort{} block.

Peter, if you had actually tried it you would find that it happens regardless of what variables are used.

Hey guys, thanks for the feedback. I think there are two issues at play here:

  1. why is Data::Printer changing the internal representation of the variable from an IV to a PVMG;
  2. why is Clone not able to clone that representation.

colink was kind enough to open an RT ticket in Data::Printer's queue, which is how I learned about the issue and started investigating it on my end. But it's 2am here so I'm going to have to call it a night and continue digging tomorrow.

Feel free to update the ticket there, email me or ping me on irc.perl.org if you have any insights that might help me nail this one out and make Data::Printer better every day.

(for the record, I also updated the ticket in Clone's own queue regarding why clone() doesn't seem able to clone the variable)

Thanks!

Ok guys. The more I look at this the more I feel this is *not* a bug in Data::Printer.

First of all, as I'm sure you know, this is not "Data::Printer's doing" in the pure sense of the word. Data::Printer does not write to your variable (it would defeat the purpose, right?), and indeed your variable's contents remain absolutely the same before and after using it.

Perl itself, on the other hand, does not guarantee that the internal representation of your variable will remain the same, and that's expected. What's happening is that, for whatever reason, Clone is unable to understand a particular representation (my wild guess? unblessed IVs with MAGIC data) and clone it properly.

Below is an example code that reproduces the issue without using Data::Printer at all. It died on my perl 5.16.1 with Clone 0.31 and the core module Hash::Util::FieldHash 1.10:


----------8 use strict;
use warnings;

use Hash::Util::FieldHash qw(fieldhash);
use Clone qw(clone);

my $var = {};

fieldhash my %hash;

# this line is changing the
# internal representation of $var
exists $hash{ \$var };

my $cloned = clone($var);

die "uh-oh" if $var == $cloned;
---------->8--------

If you add Devel::Peek calls into the code above, you'll see a similar change as the original dumps from Mr.Muskrat.

Could be a bug in the Hash::Util::FieldHash (core) module, in the perl binary (I don't know enough internals at this point to understand whether the dumped representation contains invalid or corrupted data), or in Clone's clone() function. Definitely not a bug in Data::Printer itself :)

Sadly, the only thing I can do at this point (other than notify the proper authors/maintainers) would be change Data::Printer to not use fieldhash() to spot circular references. But this approach would only mask the problem.

As an extra bit of information, fieldhash() changing the internal representation of the variable is expected.

This issue with Clone seems to be another manifestation of a 2-year old open ticket in Clone's RT queue.

Data::Printer is not even close to being the only module using Hash::Util::FieldHash, so I guess JT's initial assessment is likely right, and Clone is not working properly in all cases. Maybe for now the best approach to avoid surprises would be to use dclone, Clone::PP (which doesn't seem to manifest the issue), Data::Clone and friends.

Please don't hesitate to let me know if you think found a bug in Data::Printer, or if you have any feature requests. I'm putting a lot of effort into trying to make it as reliable and thorough as possible :)

Cheers!

So, turns out it really was a long-standing bug in Clone. Florian Ragwitz was kind enough to spot and fix it.

Now we just have to wait for the patch to be applied. Until it happens, you should be mindful of the data you clone with clone(), and might be better off using other sorts of deep cloning modules/techniques.

Thanks!

Leave a comment

About JT Smith

user-pic My little part in the greater Perl world.