[sf-perl] Perl references to arrays vs. references to shared arrays
David Christensen
dpchrist at holgerdanske.com
Tue Aug 11 16:55:14 PDT 2020
sanfrancisco-pm:
I have a computer:
2020-08-11 16:02:50 dpchrist at tinkywinky ~/sandbox/perl
$ cat /etc/debian_version ; uname -a ; perl -V | head -n 1
9.13
Linux tinkywinky 4.9.0-12-amd64 #1 SMP Debian 4.9.210-1+deb9u1 (2020-06-07) x86_64 GNU/Linux
Summary of my perl5 (revision 5 version 24 subversion 1) configuration:
I have been working with nested data structures. Some data structures
can contain loops -- e.g. a data structure that contains a reference to
itself, a data structure that contains a reference to another data
structure that refers to the first data structure, etc.
To detect such loops, I need a way to uniquely identify each data
structure. I have noted that if I stringify a reference to a data
structure, the result appears to be the type and a hexadecimal memory
address. For example:
2020-08-11 16:09:21 dpchrist at tinkywinky ~/sandbox/perl
$ perl -e 'my @a; print \@a, "\n"'
ARRAY(0x55d9a7410be0)
This also works if another data structure contains a reference to the first:
2020-08-11 16:35:31 dpchrist at tinkywinky ~/sandbox/perl
$ perl -e 'my (@a, @b); print \@a, "\n"; $b[0] = \@a; print $b[0], "\n"'
ARRAY(0x55b10cedbc10)
ARRAY(0x55b10cedbc10)
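For unshared data, the loop check described above can be sketched as
follows. This is only an illustration: has_cycle is a hypothetical
helper name, and it uses refaddr and reftype from Scalar::Util to get
the numeric address and underlying type rather than parsing the
stringified reference. It assumes only arrays and hashes need to be
descended into.

```perl
#!perl
use strict;
use warnings;
use Scalar::Util qw(refaddr reftype);

# Hypothetical helper: return 1 if following references from $ref leads
# back to a container that is still being walked, 0 otherwise.
sub has_cycle {
    my ($ref, $active) = @_;
    $active //= {};
    return 0 unless ref $ref;            # plain scalars cannot loop

    my $id = refaddr($ref);              # numeric address of the referent
    return 1 if $active->{$id};          # already on the current path

    my $type = reftype($ref) // '';
    my @children = $type eq 'ARRAY' ? @$ref
                 : $type eq 'HASH'  ? values %$ref
                 :                    ();

    $active->{$id} = 1;                  # mark while walking the children
    my $found = 0;
    for my $child (@children) {
        if (has_cycle($child, $active)) {
            $found = 1;
            last;
        }
    }
    delete $active->{$id};               # unmark on the way back out
    return $found;
}

my (@a, @b);
$b[0] = \@a;                             # @b refers to @a only
print has_cycle(\@b) ? "loop\n" : "no loop\n";
$a[0] = \@b;                             # now @a and @b refer to each other
print has_cycle(\@b) ? "loop\n" : "no loop\n";
```

Keying the hash on refaddr gives the same identity as the stringified
form for unblessed references, without the type prefix.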
But the technique fails for data structures built from shared variables.
It appears that Perl is copying and/or moving shared arrays, hashes,
etc., whenever their references are accessed (read):
2020-08-11 16:18:39 dpchrist at tinkywinky ~/sandbox/perl
$ cat circular-arrayref-vs-shared-arrayref.pl
#!perl

use strict;
use warnings;
use threads;
use threads::shared;

my @a0;
my @a1;

my @sa0 :shared;
my @sa1 :shared;

sub d {
    no warnings 'uninitialized';
    printf "%i %s=[%21s] %s=[%21s]\n",
        (caller)[2], \@a0, $a0[0], \@a1, $a1[0];
}

sub e {
    no warnings 'uninitialized';
    printf "%i %s=[%21s] %s=[%21s]\n",
        (caller)[2], \@sa0, $sa0[0], \@sa1, $sa1[0];
}

print "\nArrays with circular references:\n";
d; d; d;
$a0[0] = \@a1; d; d; d;
$a1[0] = \@a0; d; d; d;

print "\nShared arrays with circular references:\n";
e; e; e;
$sa0[0] = \@sa1; e; e; e;
$sa1[0] = \@sa0; e; e; e;
2020-08-11 16:18:42 dpchrist at tinkywinky ~/sandbox/perl
$ perl circular-arrayref-vs-shared-arrayref.pl
Arrays with circular references:
27 ARRAY(0x5569415e7df8)=[                     ] ARRAY(0x5569415e7d80)=[                     ]
27 ARRAY(0x5569415e7df8)=[                     ] ARRAY(0x5569415e7d80)=[                     ]
27 ARRAY(0x5569415e7df8)=[                     ] ARRAY(0x5569415e7d80)=[                     ]
28 ARRAY(0x5569415e7df8)=[ARRAY(0x5569415e7d80)] ARRAY(0x5569415e7d80)=[                     ]
28 ARRAY(0x5569415e7df8)=[ARRAY(0x5569415e7d80)] ARRAY(0x5569415e7d80)=[                     ]
28 ARRAY(0x5569415e7df8)=[ARRAY(0x5569415e7d80)] ARRAY(0x5569415e7d80)=[                     ]
29 ARRAY(0x5569415e7df8)=[ARRAY(0x5569415e7d80)] ARRAY(0x5569415e7d80)=[ARRAY(0x5569415e7df8)]
29 ARRAY(0x5569415e7df8)=[ARRAY(0x5569415e7d80)] ARRAY(0x5569415e7d80)=[ARRAY(0x5569415e7df8)]
29 ARRAY(0x5569415e7df8)=[ARRAY(0x5569415e7d80)] ARRAY(0x5569415e7d80)=[ARRAY(0x5569415e7df8)]
Shared arrays with circular references:
32 ARRAY(0x5569415e7de0)=[                     ] ARRAY(0x5569415e7e70)=[                     ]
32 ARRAY(0x5569415e7de0)=[                     ] ARRAY(0x5569415e7e70)=[                     ]
32 ARRAY(0x5569415e7de0)=[                     ] ARRAY(0x5569415e7e70)=[                     ]
33 ARRAY(0x5569415e7de0)=[ARRAY(0x5569416de0d8)] ARRAY(0x5569415e7e70)=[                     ]
33 ARRAY(0x5569415e7de0)=[ARRAY(0x5569416de048)] ARRAY(0x5569415e7e70)=[                     ]
33 ARRAY(0x5569415e7de0)=[ARRAY(0x5569416a00c0)] ARRAY(0x5569415e7e70)=[                     ]
34 ARRAY(0x5569415e7de0)=[ARRAY(0x5569416de048)] ARRAY(0x5569415e7e70)=[ARRAY(0x556941690c40)]
34 ARRAY(0x5569415e7de0)=[ARRAY(0x5569416a00c0)] ARRAY(0x5569415e7e70)=[ARRAY(0x556941690c58)]
34 ARRAY(0x5569415e7de0)=[ARRAY(0x5569416de048)] ARRAY(0x5569415e7e70)=[ARRAY(0x556941690c88)]
For the non-shared arrays, the stringified references are unchanging and
match up -- e.g. both \@a0 and $a1[0] stringify to
ARRAY(0x5569415e7df8), similarly so for \@a1 and $a0[0].
But the stringified references for shared types do not match, and the
references stored in another data structure change on every read --
\@sa0 stringifies to ARRAY(0x5569415e7de0) every time, but what should
be the same reference in $sa1[0] stringifies to ARRAY(0x556941690c40),
ARRAY(0x556941690c58), and ARRAY(0x556941690c88) (!). Similarly so for
\@sa1 and $sa0[0].
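One thing that may be worth trying here: the threads::shared
documentation describes is_shared() as returning an internal ID for
shared variables (and undef for unshared ones), and suggests it in place
of refaddr() when comparing shared references. The sketch below assumes
that behaviour; container_id is a hypothetical helper name, not part of
any module.

```perl
#!perl
use strict;
use warnings;
use threads;
use threads::shared;
use Scalar::Util qw(refaddr);

# Hypothetical helper: an identifier intended to stay the same for a
# given container, whether or not it is shared.
sub container_id {
    my ($ref) = @_;                      # copy into a plain lexical first
    # is_shared() yields threads::shared's internal ID for shared data
    # and undef otherwise; fall back to the memory address in that case.
    return is_shared($ref) || refaddr($ref);
}

my @sa0 :shared;
my @sa1 :shared;
$sa0[0] = \@sa1;
$sa1[0] = \@sa0;

# Read the stored reference several times and compare it with \@sa0.
for my $pass (1 .. 3) {
    printf "pass %d: refaddr %s, container_id %s\n", $pass,
        refaddr(\@sa0)      == refaddr($sa1[0])      ? "same" : "differs",
        container_id(\@sa0) == container_id($sa1[0]) ? "same" : "differs";
}
```

The element is copied into a lexical before calling is_shared(), because
when is_shared() is applied directly to an array element it reports
whether the element belongs to a shared array rather than examining
what the element refers to.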
Consequently, loop detection fails and my code falls into an infinite
loop when walking such data structures.
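A minimal sketch of a walker that should terminate on such structures,
assuming is_shared() from threads::shared returns a stable internal ID
for shared data as its documentation describes. count_containers is a
hypothetical name, and only arrays and hashes are descended into.

```perl
#!perl
use strict;
use warnings;
use threads;
use threads::shared;
use Scalar::Util qw(refaddr reftype);

# Hypothetical walker: visit each array or hash reachable from $ref once
# and return how many distinct containers were found.
sub count_containers {
    my ($ref, $seen) = @_;
    $seen //= {};
    my $type = reftype($ref) // '';      # undef for non-references
    return 0 unless $type eq 'ARRAY' || $type eq 'HASH';

    # Shared data: threads::shared's internal ID; otherwise the address.
    my $id = is_shared($ref) || refaddr($ref);
    return 0 if $seen->{$id}++;          # already visited, stop here

    my $count = 1;
    my @children = $type eq 'ARRAY' ? @$ref : values %$ref;
    $count += count_containers($_, $seen) for @children;
    return $count;
}

my @sa0 :shared;
my @sa1 :shared;
$sa0[0] = \@sa1;
$sa1[0] = \@sa0;

print "containers reachable from \@sa0: ", count_containers(\@sa0), "\n";
```

Keying %$seen on refaddr() alone would presumably never match for the
shared arrays, since each read of $sa0[0] or $sa1[0] appears to produce
a reference with a new address.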
Comments or suggestions?
David