[VPM] Comparing XML File

abez abez at abez.ca
Wed Dec 8 14:36:02 CST 2004


Parse the XML into a tree made of nested arrays (arrays, because an
element can have more than one subelement of the same type). Since I
couldn't care less about XML, I'll just discuss the algorithm.

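You want to see if a is in b and b is in a (therefore they are equal).
For each element in a, an equal element should exist in b. The problem
is that subtrees are array references, and two references to different
arrays never compare equal with eq, even when their contents match, so
nested elements have to be compared recursively.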
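
To see the reference problem concretely, here is a minimal sketch (the
variable names are made up for illustration). Two separately built
arrays with the same contents are still different references, so eq on
the references reports a mismatch.

```perl
use strict;
use warnings;

# Two distinct arrays that happen to hold the same values.
my $x = [qw(a b)];
my $y = [qw(a b)];

# eq compares the stringified references (something like ARRAY(0x...)),
# which differ because $x and $y point at different arrays.
my $same_ref = ($x eq $y) ? 1 : 0;

# Comparing the contents means looking inside the arrays.
my $same_contents = ("@$x" eq "@$y") ? 1 : 0;

print "same reference: $same_ref, same contents: $same_contents\n";
```

This is why the comparison has to descend into each array rather than
compare the array values directly.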

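This is actually a graph isomorphism problem on a tree. General graph
isomorphism is in NP (nondeterministic polynomial time) and no
polynomial-time algorithm is known for it, although it isn't known to
be NP-complete either. Trees are an easier special case that can be
solved in polynomial time, but the naive recursive search below does a
lot of repeated work and can still take a LONG TIME on big inputs. I
would've rather written this in ML :(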
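
One way to avoid the repeated pairwise search, sketched here as an
alternative and not as the code from this post: build an
order-independent canonical string for each tree and compare the
strings. The helpers canon and isEqualCanon are hypothetical names, and
the sketch assumes every leaf is a defined plain scalar.

```perl
use strict;
use warnings;

# canon() is a hypothetical helper, not part of the code in this post.
# It turns a nested array into a string that ignores child order.
sub canon {
    my ($node) = @_;
    # Leaves are length-prefixed so odd characters can't fake structure.
    return length($node) . ':' . $node unless ref($node) eq 'ARRAY';
    # Canonicalise every child, then sort so that order no longer matters.
    my @kids = map { canon($_) } @$node;
    return '[' . join(',', sort @kids) . ']';
}

# Two trees are equal (ignoring order) when their canonical strings match.
sub isEqualCanon {
    my ($t1, $t2) = @_;
    return canon($t1) eq canon($t2) ? 1 : 0;
}

print isEqualCanon([ [qw(a b)], 'c' ], [ 'c', [qw(b a)] ]), "\n";   # 1
print isEqualCanon([qw(a a b)], [qw(a b b)]), "\n";                 # 0
```

Because the children are sorted as whole strings, repeated elements are
counted, so [a,a,b] and [a,b,b] come out different.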

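There are probably cases where the tests fail. For example, contains()
only checks membership and never counts matches, so [a,a,b] and [a,b,b]
compare as equal. And if you parsed XML you'd probably want to enforce
that the first element of each array is the tag name.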
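
A rough sketch of what enforcing the tag name could look like. The
layout [ tag, child, child, ... ] and the helper name canonTagged are
assumptions made for this example. The tag stays pinned in first
position, and only the children after it are treated as unordered.

```perl
use strict;
use warnings;

# Assumed layout for this sketch: [ tag, child, child, ... ], where each
# child is either a text string or another node of the same shape.
sub canonTagged {
    my ($node) = @_;
    # Text leaves get a 't' marker and a length prefix.
    return 't' . length($node) . ':' . $node unless ref($node) eq 'ARRAY';
    my ($tag, @kids) = @$node;
    $tag = '' unless defined $tag;    # tolerate an empty array
    # Only the children are sorted; the tag is kept out of the shuffle.
    my @parts = map { canonTagged($_) } @kids;
    return '<' . length($tag) . ':' . $tag . '>[' . join(',', sort @parts) . ']';
}

my $doc1 = [ 'list', [ 'item', 'x' ], [ 'item', 'y' ] ];
my $doc2 = [ 'list', [ 'item', 'y' ], [ 'item', 'x' ] ];   # children swapped
my $doc3 = [ 'item', [ 'list', 'x' ], [ 'list', 'y' ] ];   # tags swapped

print canonTagged($doc1) eq canonTagged($doc2) ? 1 : 0, "\n";   # 1
print canonTagged($doc1) eq canonTagged($doc3) ? 1 : 0, "\n";   # 0
```

Without the pinned tag, a node like [item, x] would compare equal to
[x, item], which is not what you want for XML.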

my ($a,$b,$c) = qw(a b c);
#true
print isEqual(
        [
                [qw(a b c)],
                [$a,[$b],$c],
                [[$a],$b,$c],
                [[[$a],$b],$c]
        ],
        [
                [[$a],$b,$c],
                [[$b],$a,$c],
                [$c,[$b,[$a]]],
                [qw(a b c)],
        ]
),$/;
#true
print isEqual(
        [ [ [ [$a,$b], [$a,$b], [$a,$b], ] ], ],
        [ [ [ [$b,$a], [$b,$a], [$b,$a], ] ], ]
),$/;
#false
print isEqual(
        [ [ [ [$a,$b], [$a,$b], [$a,$b], ] ], ],
        [ [ [ [$b,$a], [$b,$a], ] ], ]
),$/;
#false
print isEqual(
        [ [ [ [$a,$b],[$c,$b],[$a,$b] ] ], ],
        [ [ [ [$b,$a],[$c,$a],[$b,$a] ] ], ]
),$/;
#false
print isEqual(
        [
                [qw(a b c)],
                [qw(a b c)],
                [$a,[$b],$c],
                [[$a],$b,$c],
                [[[$a],$b],$c]
        ],
        [
                [[$a],$b,$c],
                [[$b],$a,$c],
                [$c,[$b,[$a]]],
                [qw(a b c)],
        ]
),$/;
                                                                                                              
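# Two trees are equal when each one contains the other.
sub isEqual {
        my ($a,$b) = @_;
        return contains($a,$b) && contains($b,$a);
}
# True if the plain scalar $e occurs somewhere in @$list.
sub notRefTest {
        my ($e,$list) = @_;
        foreach my $elm (@$list) {
                if ($elm eq $e) {
                        return 1;
                }
        }
        return 0;
}
# True if some array in @$list equals the array $ref (order ignored).
sub refTest {
        my ($ref,$list) = @_;
        foreach my $elm (@$list) {
                if (ref($elm) eq 'ARRAY') {
                        if (contains($ref,$elm) && contains($elm,$ref)) {
                                return 1;
                        }
                }
        }
        return 0;
}
# True if both arrays have the same length and every element of @$arra
# has a match in @$arrb. Matches are not counted, so duplicates can
# fool it.
sub contains {
        my ($arra,$arrb) = @_;
        if (@$arra != @$arrb) { return 0; }
        foreach my $elm (@$arra) {
                my $ret = 0;
                if (ref($elm) eq 'ARRAY') {
                        $ret = refTest($elm,$arrb);
                } else {
                        $ret = notRefTest($elm,$arrb);
                }
                if (!$ret) { return 0; }
        }
        return 1;
}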




On Wed, 8 Dec 2004, Philip Yuson wrote:

> Does anyone know of a script that compares 2 XML files?
> The tags will be in a different order.
> 
> Thanks.
> 
> 

-- 
abez ------------------------------------------
http://www.abez.ca/ Abram Hindle (abez at abez.ca)
------------------------------------------ abez


