[Philadelphia-pm] Devel-Git-MultiBisect: follow-up to Dec 12 2016 meeting

Thu Dec 15 07:32:52 PST 2016

This is a follow-up to the discussion of CPAN distribution 
Devel-Git-MultiBisect from the Philadelphia Perlmongers meeting on 
Monday, December 12 2016.  I would once again like to thank 
phiadelphia.pm for offering me that opportunity to present my work.

At approximately this point in the slides -- 
http://thenceforward.net/perl/talks/phlpm20161212/slide034.html -- Walt 
Mankowski asked why I felt it necessary, after testing commit 110 out of 
220 -- i.e., the first bisection point -- I then proceeded to test 
commit 109 out of 220 rather than proceeding directly to the next 
bisection point which would, in this case, be commit 55 out of 220.  The 
inference was that some of my test runs might have been unnecessary.  I 
believe Geoff Avery and others -- speak up here, so you all get credit! 
-- concurred.

Another way of putting this:  Suppose that we populated a list of the 
index numbers of commits visited in the course of multisection.  Suppose 
that that list was 0-based rather than 1-based.  Then that list would 
start like this:

     [ 0, 219, 109, 108, 54, 81, 80, 67, 66, 60, 59, 57, 56, 55 ... ]

Walt -- correct me if I'm misstating this -- was wondering why visits to 
steps 108, 80, 66, etc., were needed.

I didn't have a good response then, and I still don't now.  I can state 
that, in terms of getting to my desired final result (see 
http://thenceforward.net/perl/talks/phlpm20161212/slide050.html), my 
multiple bisection code is correct, even if it is somewhat inefficient.

Thinking about this over the past few days, I realized that it would be 
helpful to create, for my benefit and yours, a simplified framework for 
approaching the problem.  In my github repo for this distribution 
(though not yet on CPAN), I have added this file: 
https://github.com/jkeenan/devel-git-multibisect/blob/master/examples/generalized-multisection.t. 
  This program abstracts away the presenting problem and git in order to 
just focus on the procedure for multisection of a list of strings in the 
same format as produced by 
Devel::Git::MultiBisect::Transitions::multisect_all_targets().  The 
bisection procedure I am currently using in that method can be found in 
multisect_list(), which starts at line 265, and more specifically in the 
if block starting at line 317.

Now, I suspect that some of the bisection steps could probably be 
eliminated -- but different steps from those people were pointing out 
during the meeting.  If we took a look at the complete list of indexes 
visited, it would be:

     [ 0, 219, 109, 108, 54, 81, 80, 67, 66, 60, 59, 57, 56, 55,
       137, 136, 96, 95, 75, 74, 65, 64, 58 ]

That holds the full 23 steps taken to identify all 3 transition points 
in the 220-item commit range discussed at the meeting.  But if we know 
that the value at the first real bisection point (index 109) is 
identical to that at the end-of-range point (index 219), then we should 
be able to avoid ever visiting any index between 109 and 219.  Hence, 
indexes 137 and 136 could be eliminated.

But that's a different set of potentially eliminatable steps from what 
was raised at the meeting.

So, if any of you would like to stare at the code and offer a 
suggestion, feel free to clone that repo, poke around and offer a patch.

Thank you very much.
Jim Keenan