[sf-perl] Perl references to arrays vs. references to shared arrays

Joseph Brenner doomvox at gmail.com
Thu Aug 13 14:01:01 PDT 2020


Well, the result you've found is pretty much what I would've expected
from other things I've heard...

For example, back when people were experimenting a lot with "inside
out objects", which use the stringified form of a scalar ref as an
object ID, the word was that this had problems if you were trying to
use threads.  That was supposed to be a problem with Damian Conway's
"Class::Std", for example.

I see that Class::InsideOut and Object::InsideOut both claim to be threadsafe:

  https://metacpan.org/pod/Class::InsideOut
  https://metacpan.org/pod/distribution/Object-InsideOut/lib/Object/InsideOut.pod#THREAD-SUPPORT#

You might take a look under the hood there to see how they manage
it...  I would guess that if you're rolling your own object system it's
relatively easy to deal with: you'd just need your own system of unique
IDs for each object.

I would think you could do pretty well with just a probabilistic
solution where collisions are extremely unlikely, like combining a
memory location, the current time in seconds, and a "random" number.
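
A rough sketch of that kind of ID, just to make the idea concrete.
The name make_node_id and the ID format are made up for illustration,
and I haven't tried this against shared variables.  The point is that
the ID is computed once, when the node is created, and stored as a
plain string, so it doesn't matter if the address looks different
later:

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Hypothetical helper: combine the node's memory address, the current
# time in seconds, and a random number into one ID string.
sub make_node_id {
    my ($node) = @_;
    return sprintf '%x-%d-%08x', refaddr($node), time, int(rand(2**31));
}

my %node = (name => 'root');
$node{id} = make_node_id(\%node);    # computed once, then kept with the node
print "$node{id}\n";
```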

Alternately, if you have some sort of central data store for node ids,
you could check for collisions when you create them, but then you need
to worry about how you're going to handle race-conditions.
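
In a single thread the central-store version might look something
like the following (claim_id and %registry are names I made up).
With threads, %registry would have to be a shared variable, and the
check-then-claim step would need to happen under a lock, which is
exactly the race-condition worry; this sketch leaves that part out:

```perl
use strict;
use warnings;

my %registry;    # central store of IDs already handed out

# Hypothetical helper: draw random IDs until an unused one turns up,
# then record it in the registry.
sub claim_id {
    my $id;
    do {
        $id = sprintf '%08x', int(rand(2**31));
    } while (exists $registry{$id});    # collision: try again
    $registry{$id} = 1;
    return $id;
}

my @ids = map { claim_id() } 1 .. 5;
print "$_\n" for @ids;
```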

(One reason I like relational databases is that most headaches like
these have been handled already by someone else.)

But this all depends on being willing to munge your data structure
and store a node ID in it somewhere.
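
For what it's worth, here's a minimal sketch of loop detection that
keys on a stored ID instead of the stringified reference.  The node
layout (a hash with "id" and "children") and the names new_node and
has_cycle are assumptions of mine, not anything from your code, and it
uses ordinary unshared hashes, so treat it as untested for the
threads::shared case:

```perl
use strict;
use warnings;

my $next_id = 0;

# Hypothetical constructor: every node carries its own ID.
sub new_node {
    my ($name) = @_;
    return { id => ++$next_id, name => $name, children => [] };
}

# Returns 1 if a cycle is reachable from $node, 0 otherwise.
# $on_path holds the IDs of the nodes on the current path from the root.
sub has_cycle {
    my ($node, $on_path) = @_;
    $on_path ||= {};
    return 1 if $on_path->{ $node->{id} };    # same ID seen again on this path
    $on_path->{ $node->{id} } = 1;
    for my $child (@{ $node->{children} }) {
        return 1 if has_cycle($child, $on_path);
    }
    delete $on_path->{ $node->{id} };         # done with this branch
    return 0;
}

my $n1 = new_node('n1');
my $n2 = new_node('n2');
push @{ $n1->{children} }, $n2;
print has_cycle($n1) ? "cycle\n" : "no cycle\n";    # prints "no cycle"
push @{ $n2->{children} }, $n1;                     # close the loop
print has_cycle($n1) ? "cycle\n" : "no cycle\n";    # prints "cycle"
```

Tracking only the current path (rather than every node ever visited)
means a node reachable by two different routes isn't mistaken for a
loop.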

If you didn't want to go there, it occurs to me that if you're just
trying to detect circular references you could check many times (1000?
More?) and call it circular if you find a loop just once.  Oddities
with threads might cause a detection failure *some* times, but I don't
think they'd do so consistently, would they?

(Interestingly, Conway's more recent attempt Dios doesn't seem to say
anything about thread support: https://metacpan.org/pod/Dios).


On 8/11/20, David Christensen <dpchrist at holgerdanske.com> wrote:
> sanfrancisco-pm:
>
> I have a computer:
>
> 2020-08-11 16:02:50 dpchrist at tinkywinky ~/sandbox/perl
> $ cat /etc/debian_version ; uname -a ; perl -V | head -n 1
> 9.13
> Linux tinkywinky 4.9.0-12-amd64 #1 SMP Debian 4.9.210-1+deb9u1
> (2020-06-07) x86_64 GNU/Linux
> Summary of my perl5 (revision 5 version 24 subversion 1) configuration:
>
>
> I have been working with nested data structures.  Some data structures
> can contain loops -- e.g. a data structure that contains a reference to
> itself, a data structure that contains a reference to another data
> structure that refers to the first data structure, etc.
>
>
> To detect circular loops, I need a way to uniquely identify each data
> structure.  I have noted that if I stringify a reference to a data
> structure, the result appears to be the type and a hexadecimal memory
> address.  For example:
>
> 2020-08-11 16:09:21 dpchrist at tinkywinky ~/sandbox/perl
> $ perl -e 'my @a; print \@a, "\n"'
> ARRAY(0x55d9a7410be0)
>
>
> This also works if another data structure contains a reference to the
> first:
>
> 2020-08-11 16:35:31 dpchrist at tinkywinky ~/sandbox/perl
> $ perl -e 'my (@a, @b); print \@a, "\n"; $b[0] = \@a; print $b[0], "\n"'
> ARRAY(0x55b10cedbc10)
> ARRAY(0x55b10cedbc10)
>
>
> But the technique fails for data structures built from shared variables.
> It appears that Perl is copying and/or moving shared arrays, hashes,
> etc., whenever their references are accessed (read):
>
> 2020-08-11 16:18:39 dpchrist at tinkywinky ~/sandbox/perl
> $ cat circular-arrayref-vs-shared-arrayref.pl
> #!perl
>
> use strict;
> use warnings;
> use threads;
> use threads::shared;
>
> my @a0;
> my @a1;
>
> my @sa0 :shared;
> my @sa1 :shared;
>
> sub d {
>      no warnings 'uninitialized';
>      printf "%i %s=[%21s] %s=[%21s]\n",
> 	(caller)[2], \@a0, $a0[0], \@a1, $a1[0];
> }
>
> sub e {
>      no warnings 'uninitialized';
>      printf "%i %s=[%21s] %s=[%21s]\n",
> 	(caller)[2], \@sa0, $sa0[0], \@sa1, $sa1[0];
> }
>
> print "\nArrays with circular references:\n";
> 				d; d; d;
> $a0[0]  = \@a1;			d; d; d;
> $a1[0]  = \@a0;			d; d; d;
>
> print "\nShared arrays with circular references:\n";
> 				e; e; e;
> $sa0[0] = \@sa1;		e; e; e;
> $sa1[0] = \@sa0;		e; e; e;
>
> 2020-08-11 16:18:42 dpchrist at tinkywinky ~/sandbox/perl
> $ perl circular-arrayref-vs-shared-arrayref.pl
>
> Arrays with circular references:
> 27 ARRAY(0x5569415e7df8)=[                     ] ARRAY(0x5569415e7d80)=[                     ]
> 27 ARRAY(0x5569415e7df8)=[                     ] ARRAY(0x5569415e7d80)=[                     ]
> 27 ARRAY(0x5569415e7df8)=[                     ] ARRAY(0x5569415e7d80)=[                     ]
> 28 ARRAY(0x5569415e7df8)=[ARRAY(0x5569415e7d80)] ARRAY(0x5569415e7d80)=[                     ]
> 28 ARRAY(0x5569415e7df8)=[ARRAY(0x5569415e7d80)] ARRAY(0x5569415e7d80)=[                     ]
> 28 ARRAY(0x5569415e7df8)=[ARRAY(0x5569415e7d80)] ARRAY(0x5569415e7d80)=[                     ]
> 29 ARRAY(0x5569415e7df8)=[ARRAY(0x5569415e7d80)] ARRAY(0x5569415e7d80)=[ARRAY(0x5569415e7df8)]
> 29 ARRAY(0x5569415e7df8)=[ARRAY(0x5569415e7d80)] ARRAY(0x5569415e7d80)=[ARRAY(0x5569415e7df8)]
> 29 ARRAY(0x5569415e7df8)=[ARRAY(0x5569415e7d80)] ARRAY(0x5569415e7d80)=[ARRAY(0x5569415e7df8)]
>
> Shared arrays with circular references:
> 32 ARRAY(0x5569415e7de0)=[                     ] ARRAY(0x5569415e7e70)=[                     ]
> 32 ARRAY(0x5569415e7de0)=[                     ] ARRAY(0x5569415e7e70)=[                     ]
> 32 ARRAY(0x5569415e7de0)=[                     ] ARRAY(0x5569415e7e70)=[                     ]
> 33 ARRAY(0x5569415e7de0)=[ARRAY(0x5569416de0d8)] ARRAY(0x5569415e7e70)=[                     ]
> 33 ARRAY(0x5569415e7de0)=[ARRAY(0x5569416de048)] ARRAY(0x5569415e7e70)=[                     ]
> 33 ARRAY(0x5569415e7de0)=[ARRAY(0x5569416a00c0)] ARRAY(0x5569415e7e70)=[                     ]
> 34 ARRAY(0x5569415e7de0)=[ARRAY(0x5569416de048)] ARRAY(0x5569415e7e70)=[ARRAY(0x556941690c40)]
> 34 ARRAY(0x5569415e7de0)=[ARRAY(0x5569416a00c0)] ARRAY(0x5569415e7e70)=[ARRAY(0x556941690c58)]
> 34 ARRAY(0x5569415e7de0)=[ARRAY(0x5569416de048)] ARRAY(0x5569415e7e70)=[ARRAY(0x556941690c88)]
>
>
> For the non-shared arrays, the stringified references are unchanging and
> match up -- e.g. both \@a0 and $a1[0] stringify to
> ARRAY(0x5569415e7df8), similarly so for \@a1 and $a0[0].
>
>
> But the stringified references for shared types do not match, and the
> references in another data structure change -- \@sa0 stringifies to
> ARRAY(0x5569415e7de0) every time, but what should be the same reference
> in $sa1[0] stringifies to ARRAY(0x556941690c40), ARRAY(0x556941690c58),
> and ARRAY(0x556941690c88) (!).  Similarly so for \@sa1 and $sa0[0].
> Consequently, loop detection fails and my code falls into an infinite
> loop when walking such data structures.
>
>
> Comments or suggestions?
>
>
> David

