[sf-perl] Perl references to arrays vs. references to shared arrays

David Christensen dpchrist at holgerdanske.com
Tue Aug 11 16:55:14 PDT 2020


sanfrancisco-pm:

I have a computer:

2020-08-11 16:02:50 dpchrist at tinkywinky ~/sandbox/perl
$ cat /etc/debian_version ; uname -a ; perl -V | head -n 1
9.13
Linux tinkywinky 4.9.0-12-amd64 #1 SMP Debian 4.9.210-1+deb9u1 (2020-06-07) x86_64 GNU/Linux
Summary of my perl5 (revision 5 version 24 subversion 1) configuration:


I have been working with nested data structures.  Some data structures 
can contain loops -- e.g. a data structure that contains a reference to 
itself, a data structure that contains a reference to another data 
structure that refers to the first data structure, etc.


To detect loops, I need a way to uniquely identify each data 
structure.  I have noticed that if I stringify a reference to a data 
structure, the result appears to be the type and a hexadecimal memory 
address.  For example:

2020-08-11 16:09:21 dpchrist at tinkywinky ~/sandbox/perl
$ perl -e 'my @a; print \@a, "\n"'
ARRAY(0x55d9a7410be0)


This also works if another data structure contains a reference to the first:

2020-08-11 16:35:31 dpchrist at tinkywinky ~/sandbox/perl
$ perl -e 'my (@a, @b); print \@a, "\n"; $b[0] = \@a; print $b[0], "\n"'
ARRAY(0x55b10cedbc10)
ARRAY(0x55b10cedbc10)
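

A minimal sketch of loop detection built on this idea, for ordinary 
(non-shared) arrays only.  It uses refaddr() from the core Scalar::Util 
module instead of the stringified form, because refaddr() returns just 
the numeric address and is not affected by blessing or overloaded 
stringification.  The helper name has_loop is hypothetical, not taken 
from any existing code:

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my (@a, @b);
$a[0] = \@b;
$b[0] = \@a;    # @a -> @b -> @a forms a loop

# Hypothetical helper: returns 1 if following array references from
# $ref leads back to an array that is already on the current path.
sub has_loop {
    my ($ref, $on_path) = @_;
    return 0 unless ref($ref) eq 'ARRAY';

    my $id = refaddr($ref);             # numeric address of the array
    return 1 if $on_path->{$id};        # already on this path: a loop

    $on_path->{$id} = 1;
    for my $elem (@$ref) {
        return 1 if has_loop($elem, $on_path);
    }
    delete $on_path->{$id};             # leaving this array again
    return 0;
}

print has_loop(\@a, {}) ? "loop\n" : "no loop\n";
```

Tracking only the arrays on the current path, rather than every array 
ever visited, avoids reporting a loop when the same array is merely 
referenced twice from different places.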


But the technique fails for data structures built from shared 
variables.  It appears that Perl is copying and/or moving shared arrays, 
hashes, etc., whenever their references are accessed (read):

2020-08-11 16:18:39 dpchrist at tinkywinky ~/sandbox/perl
$ cat circular-arrayref-vs-shared-arrayref.pl
#!perl

use strict;
use warnings;
use threads;
use threads::shared;

my @a0;
my @a1;

my @sa0 :shared;
my @sa1 :shared;

sub d {
     no warnings 'uninitialized';
     printf "%i %s=[%21s] %s=[%21s]\n",
	(caller)[2], \@a0, $a0[0], \@a1, $a1[0];
}

sub e {
     no warnings 'uninitialized';
     printf "%i %s=[%21s] %s=[%21s]\n",
	(caller)[2], \@sa0, $sa0[0], \@sa1, $sa1[0];
}

print "\nArrays with circular references:\n";
				d; d; d;
$a0[0]  = \@a1;			d; d; d;
$a1[0]  = \@a0;			d; d; d;

print "\nShared arrays with circular references:\n";
				e; e; e;
$sa0[0] = \@sa1;		e; e; e;
$sa1[0] = \@sa0;		e; e; e;

2020-08-11 16:18:42 dpchrist at tinkywinky ~/sandbox/perl
$ perl circular-arrayref-vs-shared-arrayref.pl

Arrays with circular references:
27 ARRAY(0x5569415e7df8)=[                     ] ARRAY(0x5569415e7d80)=[                     ]
27 ARRAY(0x5569415e7df8)=[                     ] ARRAY(0x5569415e7d80)=[                     ]
27 ARRAY(0x5569415e7df8)=[                     ] ARRAY(0x5569415e7d80)=[                     ]
28 ARRAY(0x5569415e7df8)=[ARRAY(0x5569415e7d80)] ARRAY(0x5569415e7d80)=[                     ]
28 ARRAY(0x5569415e7df8)=[ARRAY(0x5569415e7d80)] ARRAY(0x5569415e7d80)=[                     ]
28 ARRAY(0x5569415e7df8)=[ARRAY(0x5569415e7d80)] ARRAY(0x5569415e7d80)=[                     ]
29 ARRAY(0x5569415e7df8)=[ARRAY(0x5569415e7d80)] ARRAY(0x5569415e7d80)=[ARRAY(0x5569415e7df8)]
29 ARRAY(0x5569415e7df8)=[ARRAY(0x5569415e7d80)] ARRAY(0x5569415e7d80)=[ARRAY(0x5569415e7df8)]
29 ARRAY(0x5569415e7df8)=[ARRAY(0x5569415e7d80)] ARRAY(0x5569415e7d80)=[ARRAY(0x5569415e7df8)]

Shared arrays with circular references:
32 ARRAY(0x5569415e7de0)=[                     ] ARRAY(0x5569415e7e70)=[                     ]
32 ARRAY(0x5569415e7de0)=[                     ] ARRAY(0x5569415e7e70)=[                     ]
32 ARRAY(0x5569415e7de0)=[                     ] ARRAY(0x5569415e7e70)=[                     ]
33 ARRAY(0x5569415e7de0)=[ARRAY(0x5569416de0d8)] ARRAY(0x5569415e7e70)=[                     ]
33 ARRAY(0x5569415e7de0)=[ARRAY(0x5569416de048)] ARRAY(0x5569415e7e70)=[                     ]
33 ARRAY(0x5569415e7de0)=[ARRAY(0x5569416a00c0)] ARRAY(0x5569415e7e70)=[                     ]
34 ARRAY(0x5569415e7de0)=[ARRAY(0x5569416de048)] ARRAY(0x5569415e7e70)=[ARRAY(0x556941690c40)]
34 ARRAY(0x5569415e7de0)=[ARRAY(0x5569416a00c0)] ARRAY(0x5569415e7e70)=[ARRAY(0x556941690c58)]
34 ARRAY(0x5569415e7de0)=[ARRAY(0x5569416de048)] ARRAY(0x5569415e7e70)=[ARRAY(0x556941690c88)]


For the non-shared arrays, the stringified references are unchanging and 
match up -- e.g. both \@a0 and $a1[0] stringify to 
ARRAY(0x5569415e7df8), similarly so for \@a1 and $a0[0].


But the stringified references for shared types do not match, and the 
references stored in another data structure change on every read -- 
\@sa0 stringifies to ARRAY(0x5569415e7de0) every time, but what should 
be the same reference in $sa1[0] stringifies to ARRAY(0x556941690c40), 
ARRAY(0x556941690c58), and ARRAY(0x556941690c88) (!).  Similarly so for 
\@sa1 and $sa0[0].  Consequently, loop detection fails and my code falls 
into an infinite loop when walking such data structures.
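

One possible approach, sketched under assumptions: the threads::shared 
documentation notes that refaddr() is unreliable for comparing shared 
references and suggests is_shared() instead, which returns an internal 
ID for a shared variable and undef for a non-shared one.  The helper 
names ref_id and count_arrays below are hypothetical, and the sketch 
assumes a perl built with thread support, as the script above already 
requires:

```perl
use strict;
use warnings;
use threads;            # must be loaded before threads::shared
use threads::shared;
use Scalar::Util qw(refaddr);

my @sa0 :shared;
my @sa1 :shared;
$sa0[0] = \@sa1;
$sa1[0] = \@sa0;

# Hypothetical helper: an identity that stays the same across reads.
# For shared data, use the ID from is_shared(); otherwise fall back
# to refaddr().  $ref must be a plain scalar holding the reference.
sub ref_id {
    my ($ref) = @_;
    return is_shared($ref) || refaddr($ref);
}

# Hypothetical walker: counts the distinct arrays reachable from $ref,
# visiting each array once even when the structure contains loops.
sub count_arrays {
    my ($ref, $seen) = @_;
    return 0 unless ref($ref) eq 'ARRAY';
    return 0 if $seen->{ ref_id($ref) }++;

    my $n = 1;
    for my $i (0 .. $#$ref) {
        my $elem = $ref->[$i];      # copy the element into a lexical
        $n += count_arrays($elem, $seen);
    }
    return $n;
}

print "distinct arrays: ", count_arrays(\@sa0, {}), "\n";
```

Each element is copied into a lexical before the lookup because 
is_shared() applied directly to an array or hash element reports 
whether the containing aggregate is shared, not what the element 
refers to.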


Comments or suggestions?


David

