From kehoea at parhasard.net Tue Oct 12 09:48:21 2004
From: kehoea at parhasard.net (Aidan Kehoe)
Date: Tue Oct 12 09:48:24 2004
Subject: [Dub-pm] Quick pack question.
Message-ID: <16747.61109.379775.387459@ns5.nestdesign.com>

How do I take a string formatted like so, bitwise-speaking;

11111112 22222233 33333444 55556666 ...

(that is, the first seven bits of the first octet correspond to the first
character, the last bit of the first octet concatenated with the first six
of the next correspond to the second character ...) and convert it into a
Perl string?

I can do it algorithmically, but something tells me unpack() _should_ have
the capability to do this. I've yet to have a eureka moment from reading
the docs, though.

-- 
Like the early Christians, Marx expected the millennium very soon; like
their successors, his have been disappointed--once more, the world has
shown itself recalcitrant to a tidy formula embodying the hopes of some
section of mankind. (Russell)

From John.Mcnamara at snamprogetti.eni.it Wed Oct 13 03:43:47 2004
From: John.Mcnamara at snamprogetti.eni.it (Mcnamara John)
Date: Wed Oct 13 03:43:56 2004
Subject: [Dub-pm] Quick pack question.
Message-ID: <14DEDFCCB554B04CBA517637B61D781F4083B8@spsv00r6.snamprogettirf.res.prirf>

> How do I take a string formatted like so, bitwise-speaking;
>
> 11111112 22222233 33333444 55556666 ...
>
> (that is, the first seven bits of the first octet correspond to the
> first character, the last bit of the first octet concatenated with
> the first six of the next correspond to the second character ...)

Here is one way to do it (if I have understood correctly):

#!/usr/bin/perl -wl

my $str = sprintf "%032b", 123456789;
print $str, "\n";

my $offset = 0;

while ($offset < length $str) {
    # Pack up to seven bits into the high end of one byte, then shift
    # the unused low bit away.
    my $septet = unpack 'C', pack 'B7', substr($str, $offset, 7);
    $septet >>= 1;
    printf "%07b %d\n", ($septet) x 2;
    $offset += 7;
}

__END__

Prints:

00000111010110111100110100010101

0000011 3
1010110 86
1111001 121
1010001 81
0101000 40

It should also be possible to generate a longer "B7" template and have only
one call to pack(). I'll post an example of that if the above is close to
what you want.

John.
-- 
*************************E-MAIL CONFIDENTIALITY FOOTER**********************
This message may contain confidential information and must not be copied,
disclosed or used by anybody other than the intended recipient. If you have
received this message in error, please notify us writing to
postmaster@snamprogetti.eni.it and delete the message and any copies of it.
Thank you for your assistance.

From John.Mcnamara at snamprogetti.eni.it Wed Oct 13 04:07:36 2004
From: John.Mcnamara at snamprogetti.eni.it (Mcnamara John)
Date: Wed Oct 13 04:07:45 2004
Subject: FW: [Dub-pm] Quick pack question.
Message-ID: <14DEDFCCB554B04CBA517637B61D781F4083B9@spsv00r6.snamprogettirf.res.prirf>

> How do I take a string formatted like so, bitwise-speaking;
>
> 11111112 22222233 33333444 55556666 ...
>
> (that is, the first seven bits of the first octet correspond to the
> first character, the last bit of the first octet concatenated with
> the first six of the next correspond to the second character ...)

Or without pack:

#!/usr/bin/perl -wl

my $str = sprintf "%032b", 123456789;
print $str, "\n";

my $offset = 0;

while ($offset < length $str) {
    my $septet = oct 'b' .
        substr($str, $offset, 7);
    printf "%07b %d\n", ($septet) x 2;
    $offset += 7;
}

__END__

Prints:

00000111010110111100110100010101

0000011 3
1010110 86
1111001 121
1010001 81
0000101 5

This handles a trailing group of fewer than seven bits differently from the
other example. Bug or feature, you decide. ;-)

John.

From kehoea at parhasard.net Wed Oct 13 08:17:06 2004
From: kehoea at parhasard.net (Aidan Kehoe)
Date: Wed Oct 13 08:17:11 2004
Subject: [Dub-pm] Quick pack question.
In-Reply-To: <14DEDFCCB554B04CBA517637B61D781F4083B8@spsv00r6.snamprogettirf.res.prirf>
References: <14DEDFCCB554B04CBA517637B61D781F4083B8@spsv00r6.snamprogettirf.res.prirf>
Message-ID: <16749.10962.550796.36712@ns5.nestdesign.com>

On the thirteenth day of October, Mcnamara John wrote:

 > Here is one way to do it (if I have understood correctly):

Hmm, thanks for that, but there's relatively little pack magic compared to
what I had a gut feeling was possible. Ah well. Pack seems more oriented to
quantities greater than a byte.

I ended up doing it algorithmically; here's the PHP (don't look so shocked,
the Perl pack works fine in PHP, so asking here was relevant). If you read
it closely, you'll realise I didn't even pose the question here properly.
:-)

/* Take $input, which is a GSM-encoded string packed according to the
   3G TS 23.038 specifications section 6.1.2, and return a string where
   each octet corresponds to one and only one character, and no octet
   has the high bit set.
   */
function seven_bit_packed_to_octets($input)
{
    $from_array = preg_split('//', $input, -1, PREG_SPLIT_NO_EMPTY);
    $output = '';
    $count_from = count($from_array);

    /* Move through the octets in the input string. */
    for ($index = 0; $index < $count_from; ++$index) {

        /* We start taking seven bits from the current, none from the
           previous, then take six from this, one from the previous, then
           five from this, two from the previous ... */
        $bits_from_previous = $index % 7;

        /* Can't use -1 as an array index, so we check that $index is
           non-zero before taking the bits from the previous entry in the
           array. */
        if (0 != $index)
            $from_previous = (0xff & ord($from_array[$index - 1]))
                >> (8 - $bits_from_previous);
        else
            $from_previous = 0;

        $from_this = ord($from_array[$index]) << $bits_from_previous;
        $output .= chr(0x7f & ($from_this | $from_previous));

        /* If we're taking six bits from the previous octet, then there
           are seven bits in this octet that we need to consider as a
           character as well. */
        if (6 == $bits_from_previous)
            $output .= chr(0x7f & (ord($from_array[$index]) >> 1));
    }
    return $output;
}

From John.Mcnamara at snamprogetti.eni.it Wed Oct 13 08:39:52 2004
From: John.Mcnamara at snamprogetti.eni.it (Mcnamara John)
Date: Wed Oct 13 08:39:59 2004
Subject: [Dub-pm] Quick pack question.
Message-ID: <14DEDFCCB554B04CBA517637B61D781F4083BB@spsv00r6.snamprogettirf.res.prirf>

Aidan Kehoe wrote:
> Hmm, thanks for that, but there's relatively little pack magic
> compared to what I had a gut feeling was possible. Ah well. Pack
> seems more oriented to quantities greater than a byte.

Yes, in fact here is a laconic entry from perltodo (5.8.0): "bitfields in
pack"

Anyway, it is a little late but here is a pack-ier example just to generate
some traffic:

#!/usr/bin/perl -w

my $str = sprintf "%032b", 123456789;
print "$str\n";

# One template item per complete septet; a trailing partial group is dropped.
my $template_A = 'A7' x int(length($str) / 7);
my $template_B = 'B7' x int(length($str) / 7);

my @nums = map { $_ >> 1 }
           unpack 'C*', pack $template_B, unpack $template_A, $str;

printf "%07b %d\n", $_, $_ for @nums;

__END__

Prints:

00000111010110111100110100010101
0000011 3
1010110 86
1111001 121
1010001 81

> here's the PHP (don't look so shocked,

"My eyes, my eyes. The goggles do nothing". :-)

John.

From fergal at esatclear.ie Wed Oct 13 09:01:00 2004
From: fergal at esatclear.ie (Fergal Daly)
Date: Wed Oct 13 09:01:11 2004
Subject: [Dub-pm] Quick pack question.
In-Reply-To: <16749.10962.550796.36712@ns5.nestdesign.com>
References: <14DEDFCCB554B04CBA517637B61D781F4083B8@spsv00r6.snamprogettirf.res.prirf>
 <16749.10962.550796.36712@ns5.nestdesign.com>
Message-ID: <20041013140100.GD2844@dyn.fergaldaly.com>

On Wed, Oct 13, 2004 at 02:17:06PM +0100, Aidan Kehoe wrote:
>
> On the thirteenth day of October, Mcnamara John wrote:
>
>  > Here is one way to do it (if I have understood correctly):
>
> Hmm, thanks for that, but there's relatively little pack magic compared to
> what I had a gut feeling was possible. Ah well. Pack seems more oriented to
> quantities greater than a byte.

pack does have some bit oriented features but it always takes a list and
unpack always returns a list so it would be out of character for pack to
take one kind of string and produce another.

You can do what you're after with an unpack, a substitution and then a
pack. Here's a subroutine to do it, another to undo it and some code to
show it in action.

I don't know about the performance. I'd guess it's fairly snappy although
very long strings might chew memory a bit. If you need to handle long
strings, you could always split them into smaller chunks.

Not sure how it ports to PHP,

F

sub seven2eight {
    my $in = shift;
    $in = unpack("b*", $in);    # turns it into a string of 1s and 0s
    $in =~ s/(.{7})/${1}0/g;    # slip a 0 in after every 7th bit
    return pack("b*", $in);     # turn it back into binary
}

sub eight2seven {
    my $in = shift;
    $in = unpack("b*", $in);
    $in =~ s/(.{7})./${1}/g;    # remove the 0 that comes after every 7th bit
    return pack("b*", $in);
}

my $string = "abcdefgh";
print unpack("b*", $string)."\n";
my $new_string = eight2seven($string);
print unpack("b*", $new_string)."\n";
my $undone = seven2eight($new_string);
print "reversing the process\n";
print "'$undone' eq '$string'\n";
if ($undone eq $string) {
    print "ok\n";
} else {
    print "not ok!!!\n";
}

From fergal at esatclear.ie Wed Oct 13 09:19:06 2004
From: fergal at esatclear.ie (Fergal Daly)
Date: Wed Oct 13 09:19:19 2004
Subject: [Dub-pm] Quick pack question.
In-Reply-To: <20041013140100.GD2844@dyn.fergaldaly.com>
References: <14DEDFCCB554B04CBA517637B61D781F4083B8@spsv00r6.snamprogettirf.res.prirf>
 <16749.10962.550796.36712@ns5.nestdesign.com>
 <20041013140100.GD2844@dyn.fergaldaly.com>
Message-ID: <20041013141906.GE2844@dyn.fergaldaly.com>

On Wed, Oct 13, 2004 at 03:01:00PM +0100, Fergal Daly wrote:
> On Wed, Oct 13, 2004 at 02:17:06PM +0100, Aidan Kehoe wrote:
> >
> > Ar an triú lá déag de mí
Deireadh Fómhair, scríobh Mcnamara John:
> >
> >  > Here is one way to do it (if I have understood correctly):
> >
> > Hmm, thanks for that, but there's relatively little pack magic compared to
> > what I had a gut feeling was possible. Ah well. Pack seems more oriented to
> > quantities greater than a byte.
>
> pack does have some bit oriented features but it always takes a list and
> unpack always returns a list so it would be out of character for pack to
> take one kind of string and produce another.

Actually that's not true at all, in my own code below unpack returns a
string (not a list) and pack takes a string (not a list)!

It's probably more correct to say that pack usually takes something nice
and readable and turns it into binary and unpack usually takes something
binary and turns it into something nice and readable. So going from binary
to binary requires one of each,

F

From kehoea at parhasard.net Wed Oct 13 10:29:56 2004
From: kehoea at parhasard.net (Aidan Kehoe)
Date: Wed Oct 13 10:29:59 2004
Subject: [Dub-pm] Quick pack question.
In-Reply-To: <20041013141906.GE2844@dyn.fergaldaly.com>
References: <14DEDFCCB554B04CBA517637B61D781F4083B8@spsv00r6.snamprogettirf.res.prirf>
 <16749.10962.550796.36712@ns5.nestdesign.com>
 <20041013140100.GD2844@dyn.fergaldaly.com>
 <20041013141906.GE2844@dyn.fergaldaly.com>
Message-ID: <16749.18932.268022.420648@ns5.nestdesign.com>

On the thirteenth day of October, Fergal Daly wrote:

 > It's probably more correct to say that pack usually takes something nice
 > and readable and turns it into binary and unpack usually takes something
 > binary and turns it into something nice and readable. So going from
 > binary to binary requires one of each,

[OT, and not to pick on you, but this use of "binary" drives me mad. All
the data in question are on a computer, they've thus got an inherent
digital, binary nature, and octets with the high bit set can be as valid as
text as those in the range 0x20-0x7e.
And it's this sort of limited thinking that means people in China can't use
Chinese in their domain names.]

From fergal at esatclear.ie Wed Oct 13 11:50:47 2004
From: fergal at esatclear.ie (Fergal Daly)
Date: Wed Oct 13 11:50:56 2004
Subject: [Dub-pm] Quick pack question.
In-Reply-To: <16749.18932.268022.420648@ns5.nestdesign.com>
References: <14DEDFCCB554B04CBA517637B61D781F4083B8@spsv00r6.snamprogettirf.res.prirf>
 <16749.10962.550796.36712@ns5.nestdesign.com>
 <20041013140100.GD2844@dyn.fergaldaly.com>
 <20041013141906.GE2844@dyn.fergaldaly.com>
 <16749.18932.268022.420648@ns5.nestdesign.com>
Message-ID: <20041013165047.GA3825@dyn.fergaldaly.com>

On Wed, Oct 13, 2004 at 04:29:56PM +0100, Aidan Kehoe wrote:
>
> On the thirteenth day of October, Fergal Daly wrote:
>
>  > It's probably more correct to say that pack usually takes something nice
>  > and readable and turns it into binary and unpack usually takes something
>  > binary and turns it into something nice and readable. So going from
>  > binary to binary requires one of each,
>
> [OT, and not to pick on you, but this use of "binary" drives me mad. All the
> data in question are on a computer, they've thus got an inherent digital,
> binary nature, and octets with the high bit set can be as valid as text as
> those in the range 0x20-0x7e. And it's this sort of limited thinking that
> means people in China can't use Chinese in their domain names.]

I didn't say that binary implies "octets with the high bit set" or vice
versa. In fact in the seven2eight() subroutine the "binary" output of
pack() is actually an ASCII string.
Binary here just means a sequence of raw bits which need further
interpretation as opposed to a list of integers or string of characters
(these being the "nice and readable" things I was referring to).

Someone could add an encoding to pack/unpack similar to the "b" encoding
but instead of taking strings of "0" and "1"s it took strings of "零" and
"一"s (the Chinese characters for "0" and "1" - assuming my mailer doesn't
mangle them). I'd still say its output is binary and its input isn't (even
though its input is distinctly non-ASCII).

F

From david at cantrell.org.uk Tue Oct 19 05:46:42 2004
From: david at cantrell.org.uk (David Cantrell)
Date: Tue Oct 19 05:46:46 2004
Subject: [Dub-pm] Phone numbers
Message-ID: <20041019104640.GA2015@bytemark.barnyard.co.uk>

I'm putting together a big patch to bring Number::Phone::Country up to
date. It takes a phone number in "international" format (ie, with country
code) and tells you what country it is in.

This is "interesting" for all the countries in +1, and there are a few
other codes with similar issues - +7 is shared by Russia and Kazakhstan,
Gibraltar (+350) is also an area code (+34 9567) in Spain, Vatican City is
both +379 and also accessible via Italy's code.

Now, I know that it *used* to be possible to dial phones in Northern
Ireland from the Republic using a shortcut in the Republic's national
numbering plan instead of dialling internationally using +44 blah. Is this
still possible? The latest data I have is from 2000, and indicates that
048 NNNNNNNN is equivalent to +44 28 NNNNNNNN.
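
For illustration only, the 048 equivalence described above could be written
as a one-rule rewrite. This is a rough sketch, not code from
Number::Phone::Country: the function name ni_shortcut_to_international is
made up, and the only rule it assumes is the one quoted above (048 followed
by eight digits maps to +44 28 and the same eight digits).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any Number::Phone module): rewrite the
# Republic's 048 shortcut for Northern Ireland into international format.
# Assumes only the rule quoted above: 048 NNNNNNNN == +44 28 NNNNNNNN.
sub ni_shortcut_to_international {
    my ($dialled) = @_;
    (my $digits = $dialled) =~ s/\D//g;    # drop spaces and punctuation
    return $digits =~ /^048(\d{8})$/ ? "+4428$1" : undef;
}

print ni_shortcut_to_international('048 9012 3456'), "\n";
```

Anything that is not 048 plus exactly eight digits comes back as undef, so
the caller can fall through to its normal lookup.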

-- 
David Cantrell | http://www.cantrell.org.uk/david

  Perl may be the best solution for processing a text file, but
  asking a group of Perl Mongers clearly isn't
    -- aef, in #london.pm

From paul.barry at itcarlow.ie Tue Oct 19 05:59:41 2004
From: paul.barry at itcarlow.ie (Paul Barry)
Date: Tue Oct 19 06:01:11 2004
Subject: [Dub-pm] Phone numbers
In-Reply-To: <20041019104640.GA2015@bytemark.barnyard.co.uk>
References: <20041019104640.GA2015@bytemark.barnyard.co.uk>
Message-ID: <4174F39D.3020907@itcarlow.ie>

Duh! Forgot to copy the list on this reply ... sorry.

David Cantrell wrote:

> Now, I know that it *used* to be possible to dial phones in Northern
> Ireland from the Republic using a shortcut in the Republic's national
> numbering plan instead of dialling internationally using +44 blah. Is
> this still possible? The latest data I have is from 2000, and indicates
> that 048 NNNNNNNN is equivalent to +44 28 NNNNNNNN.

Yes, I use 04890 plus the number for Belfast. Interestingly, Vodafone bar
my attempts to use 00+44+, but let me use 04890 ...

Paul.

-- 
Paul Barry, Dept. of Computing & Networking,
Institute of Technology, Carlow,
Kilkenny Road, Carlow, Ireland.
E-mail: paul.barry@itcarlow.ie
Telephone: +353+59+9170400.
Website: http://glasnost.itcarlow.ie/~barryp/index.html
Public-key is: http://glasnost.itcarlow.ie/~barryp/paulbarry-key.asc

From nick at netability.ie Tue Oct 19 06:04:38 2004
From: nick at netability.ie (Nick Hilliard)
Date: Tue Oct 19 06:04:50 2004
Subject: [Dub-pm] Phone numbers
In-Reply-To: <20041019104640.GA2015@bytemark.barnyard.co.uk>
References: <20041019104640.GA2015@bytemark.barnyard.co.uk>
Message-ID: <4174F4C6.1070508@netability.ie>

David Cantrell wrote:
> The latest data I have is from 2000, and indicates
> that 048 NNNNNNNN is equivalent to +44 28 NNNNNNNN

That rule still holds.
There was some talk a while back of having a common prefix system for the
entire island of Ireland, but the regulators backed away from that on the
grounds that it would be too troublesome to implement.

Nick

From dermot at directski.com Tue Oct 19 08:36:32 2004
From: dermot at directski.com (Dermot McNally)
Date: Tue Oct 19 08:34:49 2004
Subject: [Dub-pm] Phone numbers
In-Reply-To: <20041019104640.GA2015@bytemark.barnyard.co.uk>
References: <20041019104640.GA2015@bytemark.barnyard.co.uk>
Message-ID: <41751860.9040704@directski.com>

David Cantrell wrote:
> I'm putting together a big patch to bring Number::Phone::Country up to
> date. It takes a phone number in "international" format (ie, with
> country code) and tells you what country it is in.

What the other folks said about 048 stands, but there's a snag trying to
incorporate enough clevers to handle this into Number::Phone::Country. The
lookup expects the number you provide to be in full international format.
Arguably, the number +353-48-12345678 is only a pseudo-international
number, as I'm not sure if you can dial it from anywhere outside ROI.

What are you planning? To have numbers of that format as well as the
+44-28-12345678 equivalent map to a mock-ISO code for NI?

FWIW, a while ago I went looking for some modules that would help me
process phone numbers and help me to decompose them into their constituent
country, area and base number. (Which is hard without country-specific
logic to tell you how to infer the area code). I ended up growing my own
for the most part. My work in progress also has - unlike
Number::Phone::Country - the concept of where-in-the-world you are running
it from, which was designed to allow you to render numbers in your own
country according to local convention. Oh, it supports pretty-printing with
support for national norms.

However, I ended up pulling some nasty stunts in the country-specific
classes in ways that are not very scalable.
It might benefit from a fresh set of eyes, so I'll see if I can pull it
together if anybody is interested. It's a POD-free zone, but not hard to
see what's going on.

Dermot

-- 
------------------------------------------------------------------------
Dermot McNally, Chief Technical Officer, Directski.com
dermot@directski.com
http://www.directski.com - ski for less

From david at cantrell.org.uk Tue Oct 19 09:21:29 2004
From: david at cantrell.org.uk (David Cantrell)
Date: Tue Oct 19 09:21:45 2004
Subject: [Dub-pm] Phone numbers
In-Reply-To: <41751860.9040704@directski.com>
References: <20041019104640.GA2015@bytemark.barnyard.co.uk>
 <41751860.9040704@directski.com>
Message-ID: <20041019142128.GA6240@bytemark.barnyard.co.uk>

On Tue, Oct 19, 2004 at 02:36:32PM +0100, Dermot McNally wrote:

> What the other folks said about 048, stands, but there's a snag trying
> to incorporate enough clevers to handle this into
> Number::Phone::Country. The lookup expects the number you provide to be
> in full international format. Arguably, the number +353-48-12345678 is
> only a pseudo-international number, as I'm not sure if you can dial it
> from anywhere outside ROI.

I'll have to try it. Anyone got a company in the north that they
particularly hate and whose time I can waste trying it out? :-)

> What are you planning? To have numbers of that format as well as the
> +44-28-12345678 equivalent map to a mock-ISO code for NI?

Look at the source for N::P::Country, and how he handles the craziness
that is +1. I'd handle +353 the same way. And +7, +34, and +39.

> FWIW, a while ago I went looking for some modules that would help me
> process phone numbers and help me to decompose them into their
> constituent country, area and base number. (Which is hard without
> country-specific logic to tell you how to infer the area code). I ended
> up growing my own for the most part.
My work in progress also has -
> unlike Number::Phone::Country - the concept of where-in-the-world you
> are running it from, which was designed to allow you to render numbers
> in your own country according to local convention. Oh, it supports
> pretty-printing with support for national norms.

Check out my Number::Phone base class, and Number::Phone::UK for an
example that uses it. Yes, you need country-specific logic. Lots of it.
And for most countries, a database. The authors of some of the other
Number::Phone::* modules are very slowly moving over to that new API, but
implementations for more countries are more than welcome :-)

-- 
David Cantrell | http://www.cantrell.org.uk/david

  Deck of Cards: $1.29.
  "101 Solitaire Variations" book: $6.59.
  Cheap replacement for the one thing Windows is good at: priceless
    -- Shane Lazarus

From fergal at esatclear.ie Tue Oct 19 10:00:30 2004
From: fergal at esatclear.ie (Fergal Daly)
Date: Tue Oct 19 10:00:58 2004
Subject: [Dub-pm] Phone numbers
In-Reply-To: <20041019142128.GA6240@bytemark.barnyard.co.uk>
References: <20041019104640.GA2015@bytemark.barnyard.co.uk>
 <41751860.9040704@directski.com>
 <20041019142128.GA6240@bytemark.barnyard.co.uk>
Message-ID: <20041019150030.GA14189@dyn.fergaldaly.com>

On Tue, Oct 19, 2004 at 03:21:29PM +0100, David Cantrell wrote:
> On Tue, Oct 19, 2004 at 02:36:32PM +0100, Dermot McNally wrote:
>
> > What the other folks said about 048, stands, but there's a snag trying
> > to incorporate enough clevers to handle this into
> > Number::Phone::Country. The lookup expects the number you provide to be
> > in full international format. Arguably, the number +353-48-12345678 is
> > only a pseudo-international number, as I'm not sure if you can dial it
> > from anywhere outside ROI.
>
> I'll have to try it. Anyone got a company in the north that they
> particularly hate and whose time I can waste trying it out? :-)
>
> > What are you planning?
To have numbers of that format as well as the
> > +44-28-12345678 equivalent map to a mock-ISO code for NI?
>
> Look at the source for N::P::Country, and how he handles the craziness
> that is +1. I'd handle +353 the same way. And +7, +34, and +39.

That certainly does look crazy.

I did this a few years ago and I did what you're doing, see if the whole
number matches, then chop off the last number and try again, then chop off
and try again... I also tried building a trie (although I didn't know it at
the time). If you have the following prefixes

12 = GB
13 = IT
121 = IE
245 = CA

your trie looks like

1 - 2 = GB - 1 = IE
  - 3 = IT
2 - 4 - 5 = CA

and say the number 1278453434 comes along. Then you try to go into 1, which
succeeds, then you try to go into the 2 branch of 1 which succeeds, then
you try to go into the 7 branch of 2 which doesn't succeed because there is
no 7 branch so you're finished and the matching prefix is 12 and the
country code is GB.

You can actually do all this with a regex and a hash. For the above, the
regex is

/^(121|12|13|245)/

the key is that the longer prefix comes before the shorter prefix. If you
want more efficiency you can make this

/^(1(?:2(?:1?)|3)|245)/

Obviously you don't want to go generating them by hand but once you've got
it, it's very nippy.
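
A minimal sketch of the regex-and-hash idea above, generating the flat
alternation from the table rather than writing it by hand. It uses the
made-up prefixes from the example, and the names %country_for and
country_of are hypothetical, not anything from Number::Phone::Country.
It does not attempt the nested, trie-shaped form of the regex.

```perl
use strict;
use warnings;

# Made-up prefix table, copied from the example above.
my %country_for = (
    12  => 'GB',
    13  => 'IT',
    121 => 'IE',
    245 => 'CA',
);

# Sort longest prefix first so that 121 is tried before 12.
my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) || $a cmp $b }
    keys %country_for;
my $prefix_re = qr/^($alternation)/;

# Return the country for the longest matching prefix, or undef if none.
sub country_of {
    my ($number) = @_;
    return $number =~ $prefix_re ? $country_for{$1} : undef;
}

print country_of('1278453434'), "\n";    # prefix 12, so GB
print country_of('1213334444'), "\n";    # prefix 121, so IE
```

Because the alternation is ordered longest first, the first alternative
that matches is also the longest matching prefix, which is the property
the hand-written regex relies on.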

I know someone working for a telco who implemented his tries in C which was
much more efficient, time and space-wise but he can't release the code,

F

From david at cantrell.org.uk Tue Oct 19 10:58:05 2004
From: david at cantrell.org.uk (David Cantrell)
Date: Tue Oct 19 10:58:11 2004
Subject: [Dub-pm] Phone numbers
In-Reply-To: <20041019150030.GA14189@dyn.fergaldaly.com>
References: <20041019104640.GA2015@bytemark.barnyard.co.uk>
 <41751860.9040704@directski.com>
 <20041019142128.GA6240@bytemark.barnyard.co.uk>
 <20041019150030.GA14189@dyn.fergaldaly.com>
Message-ID: <20041019155805.GA7892@bytemark.barnyard.co.uk>

On Tue, Oct 19, 2004 at 04:00:30PM +0100, Fergal Daly wrote:
> On Tue, Oct 19, 2004 at 03:21:29PM +0100, David Cantrell wrote:
> > Look at the source for N::P::Country, and how he handles the craziness
> > that is +1. I'd handle +353 the same way. And +7, +34, and +39.
> That certainly does look crazy.

+1 is just a silly idea. Pity that the NANP countries' phone systems are so
backward that they can't realistically upgrade to something that makes
sense :-)

> I did this a few years ago and I did what you're doing, see if the whole
> number matches, then chop off the last number and try again, then chop off
> and try again... I also tried building a trie (although I didn't know it at
> the time).

Yes, that works too.

> I know someone working for a telco who implemented his
> tries in C which was much more efficient, time and space-wise but he can't
> release the code,

The technique that I use (given 123456789, first look for 123456789, then
for 12345678, then 1234567 and so on until you get a match) was certainly
quick enough that I could switch a call - using perl* and mysql! - in
hundredths of a second. OK, admittedly we cheated by skipping a chunk in
the middle.
Given a number like 0ABCDEFGHIJ we'd look up the whole number, then go
straight to 0ABCDEF cos we knew that the intervening digits weren't
significant, as the smallest block assigned to us was 10000 numbers.

[yes, I've simplified international calls out of this]

When I was using similar code for billing, I used the same technique, but
the time taken was negligible compared to the database lookups required to
figure out tariffs. In the worst case I could still bill calls 30 times
faster than customers were making them, on pretty low endian hardware.

* a couple of years ago I worked for a telco which used perl for pretty
  much everything

-- 
David Cantrell | Reality Engineer, Ministry of Information

  attractivating: inducing the quality of being attractive,
    especially to members of the appropriate sex.
      -- Henrik Levkowetz