# [VPM] Comparing XML File

abez abez at abez.ca
Wed Dec 8 14:36:02 CST 2004

Parse the XML into a tree composed of arrays (because you can have more
than one subelement of the same type). Since I couldn't care less about
XML, I'll just discuss the algorithm.

You want to see if a is in b and b is in a (therefore they are equal).
For each element in a, that element should exist in b. The problem is
that array references will not compare equal even when their contents
match, so nested arrays have to be compared recursively.
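
To see why the references get in the way, here is a minimal sketch (the
variable names are only for illustration): two anonymous arrays with
identical contents are still distinct references, so comparing the
references themselves says "different".

```perl
use strict;
use warnings;

# Two separate anonymous arrays holding the same values.
my $x = [qw(a b)];
my $y = [qw(a b)];

# Comparing the references compares addresses, not contents.
my $same_ref = ($x == $y) ? 1 : 0;

# Comparing the flattened contents works here only because the arrays are flat.
my $same_contents = ("@$x" eq "@$y") ? 1 : 0;

print "same reference: $same_ref\n";
print "same contents:  $same_contents\n";
```

The string trick in the second comparison stops working as soon as an
element is itself an array reference, which is why a recursive test is
needed.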

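This is actually a graph isomorphism problem on a tree. Graph
isomorphism is in NP (nondeterministic polynomial time), and no
polynomial-time algorithm is known for general graphs (aka it can take
a LONG TIME to compute). Trees are a special case that can be tested in
polynomial time, though the code below doesn't try to be clever about
it. I would've rather written this in ML :(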
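
For comparison, a different technique from the one in this post is to
reduce each tree to a canonical string and compare the strings. This is
only a sketch under assumptions: canon and sameTree are made-up names,
and leaves are assumed to be plain defined strings. Sorting the
children's canonical forms makes sibling order irrelevant while still
counting duplicates.

```perl
use strict;
use warnings;

# Build an order-independent string for a nested array.
# Leaves are length-prefixed so that odd characters in the data
# cannot be confused with the brackets and commas used for structure.
sub canon {
    my ($node) = @_;
    if (ref($node) eq 'ARRAY') {
        my @parts = map { canon($_) } @$node;
        return '[' . join(',', sort @parts) . ']';
    }
    return length($node) . ':' . $node;
}

# Two trees match when their canonical strings are identical.
sub sameTree {
    my ($left, $right) = @_;
    return canon($left) eq canon($right) ? 1 : 0;
}

print sameTree([ 'a', [ 'b', 'c' ] ], [ [ 'c', 'b' ], 'a' ]), "\n";  # 1
print sameTree([ 'a', 'a', 'b' ],     [ 'a', 'b', 'b' ]),     "\n";  # 0
```

Each node is canonicalized once and its children are sorted once, so the
work stays polynomial in the size of the tree.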

There are probably cases where the tests fail: duplicates are not
counted, for instance, so [a, a, b] and [a, b, b] compare as equal. If
you parsed XML you'd probably also want to enforce that the first
element of each array is the tag name.
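
One way the tag-name convention might look, as a rough sketch rather
than anything taken from the code in this post: each element is assumed
to be an array of the form [tagName, child, ...], where a child is
either another such array or a text string. The names canonTagged and
sameElement are hypothetical, and the sketch assumes sibling order never
matters, which may not hold for mixed text content.

```perl
use strict;
use warnings;

# Assumed layout (not from the original post's code):
#   [ tagName, child, child, ... ]
# where each child is another such array or a plain text string.
sub canonTagged {
    my ($node) = @_;
    # Text leaf: length-prefixed so its content cannot mimic structure.
    return 'T' . length($node) . ':' . $node if ref($node) ne 'ARRAY';

    my ($tag, @children) = @$node;
    die "first element must be a tag name\n"
        if !defined($tag) || ref($tag);

    # The tag stays in front; only the children are put in sorted order.
    my @parts = map { canonTagged($_) } @children;
    return '<' . length($tag) . ':' . $tag . '>[' . join(',', sort @parts) . ']';
}

sub sameElement {
    my ($left, $right) = @_;
    return canonTagged($left) eq canonTagged($right) ? 1 : 0;
}

my $doc1 = [ 'root', [ 'item', 'x' ], [ 'name', 'y' ] ];
my $doc2 = [ 'root', [ 'name', 'y' ], [ 'item', 'x' ] ];
my $doc3 = [ 'root', [ 'x', 'item' ], [ 'name', 'y' ] ];

print sameElement($doc1, $doc2), "\n";  # 1: same children, different order
print sameElement($doc1, $doc3), "\n";  # 0: tag and text swapped in one child
```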

use strict;
use warnings;

my ($a,$b,$c) = qw(a b c);
#true
print isEqual(
    [
        [qw(a b c)],
        [$a,[$b],$c],
        [[$a],$b,$c],
        [[[$a],$b],$c]
    ],
    [
        [[$a],$b,$c],
        [[$b],$a,$c],
        [$c,[$b,[$a]]],
        [qw(a b c)],
    ]
),$/;
#true
print isEqual(
    [ [ [ [$a,$b], [$a,$b], [$a,$b], ] ], ],
    [ [ [ [$b,$a], [$b,$a], [$b,$a], ] ], ]
),$/;
#false
print isEqual(
    [ [ [ [$a,$b], [$a,$b], [$a,$b], ] ], ],
    [ [ [ [$b,$a], [$b,$a], ] ], ]
),$/;
#false
print isEqual(
    [ [ [ [$a,$b],[$c,$b],[$a,$b] ] ], ],
    [ [ [ [$b,$a],[$c,$a],[$b,$a] ] ], ]
),$/;
#false
print isEqual(
    [
        [qw(a b c)],
        [qw(a b c)],
        [$a,[$b],$c],
        [[$a],$b,$c],
        [[[$a],$b],$c]
    ],
    [
        [[$a],$b,$c],
        [[$b],$a,$c],
        [$c,[$b,[$a]]],
        [qw(a b c)],
    ]
),$/;

# Two trees are equal if each one contains the other.
sub isEqual {
    my ($a,$b) = @_;
    return contains($a,$b) && contains($b,$a);
}
# Is the plain (non-reference) value $e somewhere in @$list?
sub notRefTest {
    my ($e,$list) = @_;
    foreach my $elm (@$list) {
        if ($elm eq $e) {
            return 1;
        }
    }
    return 0;
}
# Is there an array in @$list that matches the array $ref in both directions?
sub refTest {
    my ($ref,$list) = @_;
    foreach my $elm (@$list) {
        if (ref($elm) eq 'ARRAY') {
            if (contains($ref,$elm) && contains($elm,$ref))
            {
                return 1;
            }
        }
    }
    return 0;
}
# Does every element of @$arra have a match in @$arrb?
# The arrays must be the same length, but matches are not used up,
# so repeated elements are not counted.
sub contains {
    my ($arra,$arrb) = @_;
    if (@$arra != @$arrb) { return 0; }
    foreach my $elm (@$arra) {
        my $ret = 0;
        if (ref($elm) eq 'ARRAY') {
            $ret = refTest($elm,$arrb);
        } else {
            $ret = notRefTest($elm,$arrb);
        }
        if (!$ret) { return 0; }
    }
    return 1;
}

On Wed, 8 Dec 2004, Philip Yuson wrote:

> Anyone knows of a script that compares 2 xml files
> The tags will be in a different order.
>
> Thanks.
>
>

--
abez ------------------------------------------
http://www.abez.ca/ Abram Hindle (abez at abez.ca)
------------------------------------------ abez