[Melbourne-pm] Perl forking & Copy-on-write

Alex Balhatchet kaoru at slackwise.net
Thu Jan 19 00:23:49 PST 2017

I've had very good experiences using Linux Copy on Write semantics in Perl
programs, both in the context of Apache 2.4 + mod_perl2 and Perl scripts
using Parallel::ForkManager.

The key is to set things up in the parent process - load modules and config
files, pre-warm in-process caches or whatever. Then do not touch those
variables again after you fork. Watch out for writes that look like reads,
e.g. hash key autovivification! Make good use of exists(), checking each
level of a nested structure before you look inside it.
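
To illustrate the "writes that look like reads" point, here is a minimal
sketch. The %config hash and its keys are made up for the example. Note that
exists() on a nested key still creates the intermediate level, so the guard
has to be applied one level at a time.

```perl
use strict;
use warnings;

# Hypothetical config, loaded in the parent before forking.
my %config = (db => { host => 'localhost' });

# Looks like a read, but the nested lookup creates $config{cache} = {}.
my $ttl = $config{cache}{ttl};
my $autovivified = exists $config{cache} ? 1 : 0;

# Guarding the outer level with exists() leaves the hash untouched.
my $size = exists $config{queue} ? $config{queue}{size} : undef;
my $untouched = exists $config{queue} ? 0 : 1;

print "cache key created by the 'read': $autovivified\n";
print "queue key still absent:          $untouched\n";
```

In a forked child, the first lookup would write to the hash and so dirty the
memory pages holding it, while the guarded lookup would not modify the hash.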

In Apache you can use the PerlPostConfigHandler config option to run things
pre-fork in the parent.[1]

As an aside, you can use PerlChildInitHandler for running things post-fork
in the child - I recommend running 'srand' at the very least, since
otherwise children can inherit the parent's random seed.[1]
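
The reason for the srand advice can be shown outside Apache with plain
fork(). This is only a sketch, not mod_perl code: in_child is a helper
invented for the example, and it assumes a Unix-like system where fork and
pipe are available.

```perl
use strict;
use warnings;

# Run $code in a forked child and return whatever it produces, via a pipe.
sub in_child {
    my ($code) = @_;
    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        close $reader;
        print {$writer} $code->();
        close $writer;
        exit 0;
    }
    close $writer;
    my $out = do { local $/; <$reader> };
    close $reader;
    waitpid($pid, 0);
    return $out;
}

# The parent uses the RNG once, which seeds it implicitly.
my $warm_up = rand();

# Children that just call rand() continue from the parent's RNG state.
my @inherited = map { in_child(sub { rand() }) } 1 .. 2;

# Children that call srand() first pick a fresh seed of their own.
my @reseeded = map { in_child(sub { srand(); rand() }) } 1 .. 2;

print "inherited seed: @inherited\n";
print "after srand:    @reseeded\n";
```

The two "inherited seed" values are identical because both children start
from the same RNG state; the "after srand" values should differ.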

One of the most useful things you can do is to learn how different ops
tools treat CoW. Unfortunately the answer is basically that they all[3]
ignore CoW and will show all your child processes as using lots of memory.
The most reliable source of truth is /proc/<pid>/smaps, which has
Shared_Clean, Shared_Dirty, Private_Clean and Private_Dirty values for
each memory mapping.[4]
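
As a rough sketch of reading those values, the script below sums the four
fields across all mappings of a process. The sum_smaps name is made up for
the example, it only works on Linux, and it defaults to inspecting itself
when no pid is given on the command line.

```perl
use strict;
use warnings;

# Sum the Shared_*/Private_* fields (in kB) from smaps-formatted text.
sub sum_smaps {
    my ($text) = @_;
    my %total = map { $_ => 0 }
        qw(Shared_Clean Shared_Dirty Private_Clean Private_Dirty);
    while ($text =~ /^(Shared_Clean|Shared_Dirty|Private_Clean|Private_Dirty):\s+(\d+) kB/mg) {
        $total{$1} += $2;
    }
    return \%total;
}

# Inspect the given pid, or this process if none was given.
my $pid  = @ARGV ? $ARGV[0] : $$;
my $path = "/proc/$pid/smaps";

if (open my $fh, '<', $path) {
    my $totals = sum_smaps(do { local $/; <$fh> });
    printf "%-14s %8d kB\n", $_, $totals->{$_} for sort keys %$totals;
}
else {
    warn "cannot read $path (Linux only?): $!\n";
}
```

Running it against a parent and one of its forked children should show how
much of the child's memory is still shared rather than private.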

One final point - I specified at the top of the email that I was talking
about "Linux Copy on Write". That's because Perl has its own concept of
Copy on Write[5] which is used to optimise assignments which are
semantically copies but might not need to be.

sub foo {
    my $arg = $_[0];
    return $arg + 4;
}

@_ contains aliases, and my $arg = $_[0] makes a new scalar containing a
"copy" of the contents. But Perl doesn't actually copy immediately: for
string values it can set up a CoW variable that shares the underlying buffer
and avoids duplicating the memory until it absolutely has to, i.e. when one
of the scalars is modified.

Hope that helps :) I also hope it's all accurate... if anybody notices any
errors please correct me :)

- Alex

[1] https://perl.apache.org/docs/2.0/user/handlers/server.html
[3] top, free, etc.
[4] https://www.brightbox.com/blog/2012/11/28/measuring-shared-ram-usage/
[5] http://perldoc.perl.org/perlguts.html#Copy-on-Write

On 19 January 2017 at 06:03, Dean Hamstead <dean at fragfest.com.au> wrote:

> perhaps TonyC on #australia (irc.perl.org) can provide some feedback
> he is seemingly perpetually working on tpf grants fixing core bugs
> D
> On 19/01/17 12:24, Mathew Robertson wrote:
> Here is an interesting read on how Python's garbage collector causes
> Linux's copy-on-write to become less effective than it otherwise would be:
> https://engineering.instagram.com/dismissing-python-garbage-collection-at-instagram-4dca40b29172#.25rzyh6im
> Essentially, the GC walks over Python's read-only variables but still
> adjusts their reference counters, so Linux sees the memory backing the
> Python instance as having been written to.
> Python's cycle-detection could also cause memory-writes due to the
> mark+sweep job.
> It also has a generational collector, for short-term vs long-term
> objects. Adjusting object lifetimes would also cause a memory-write.
> Perl also uses reference counting. So I can also see that Perl's
> reference-counting would cause the references to be written, when a given
> Perl variable is copied.
> The question is... does anyone have any insight into the same CoW failings
> within the context of Perl 5 ?
> Perl 6 uses a generational collector (does it *also* use reference
> counting?), so the generational migration would impact CoW. But given the
> rather smart people working on that project, I can envisage that some
> solution may eventually get implemented.
> _______________________________________________
> Melbourne-pm mailing list
> Melbourne-pm at pm.org
> http://mail.pm.org/mailman/listinfo/melbourne-pm