Santosh Benjamin's Weblog

Adventures with AppFabric, BizTalk & Clouds

Using INTERSECT with LINQ to XML

leave a comment »


In terms of hands-on coding (not general awareness) I’m a bit of a newbie to the world of LINQ actually, having only dabbled with a little LINQ to XML in MockingBird and even there I wasnt too impressed with it in the area of XPath queries. But I came across something yesterday that is a testimony to the power of LINQ.

My scenario was that I wanted to compare two XML documents that followed the same schema, but I wanted to do this in  a fairly generic way without writing code to explicitly pick up every element in the hierarchy. My requirement was to find all common elements between the two documents and also the elements in one and not the other.

Take the following example:

Document-1
<Authors>
  <Author ID=”1″ Name=”AuthorA” JoinDate=”3/1/2009″/>
  <Author ID=”2″ Name=”AuthorC” JoinDate=”3/1/2009″/>
</Authors>
Document-2
<Authors>
  <Author ID=”1″ Name=”AuthorA” JoinDate=”3/1/2009″/>
  <Author ID=”2″ Name=”AuthorB” JoinDate=”3/1/2009″/>
</Authors>

I quickly found that LINQ has this powerful INTERSECT function which would allow me to find the common elements and the EXCEPT function which will find the distinct elements.

My first attempt (at finding the common elements) was like this:

var commonFromA = aDoc.Descendants(“Authors”).Intersect(bDoc.Descendants(“Authors”));

But this did not work. After much more attempts and discussions with colleagues, it was beginning to look like i could only use INTERSECT with native types and I would either have to write a custom IEqualityComparer<T> or write more complex code involving anonymous types (which are , by the way, a brilliant feature of the framework).

But LINQ is supposed to be elegant, right? So I posted the question on the MSDN Forums and got an immediate reply from Martin Honnen   a MVP in this area, and yes, the solution was elegant and just in one line.

var commonFromA = aDoc.Descendants(“Author”).Cast<XNode>().Intersect(bDoc.Descendants(“Author”).Cast<XNode>(), new XNodeEqualityComparer());

As Martin explained, the set operators like INTERSECT and EXCEPT work on object identity not value comparisons and as I had distinct XElement objects in different documents my initial attempt would not work. However, the XNodeEqualityComparer comes to the rescue and casting the XElement to XNode was all that was required.

What’s even more interesting is that in .NET 4.0, we have something called “contravariance” which will allow the INTERSECT code above to work without the explicit cast. Martin explains this very well in this post on “Exploiting Contravariance with LINQ to XML”. I always wanted to understand what Covariance and Contravariance were all about and this is a great explanation.

Essentially, with Contravariance, you can pass in the base type XElement even though the comparison (with XNodeComparer) is expecting an XNode , (the derived type) and you dont need to mess with casting etc. With Contravariance you are also not mutating the object itself (actually, you cannot change the object) so this works.

On the same subject, also check out Eric Lipperts blog article.  I had come across that post earlier but didnt have any immediate need for that functionality so I didnt pay attention, but this time, I did.

So, there you have it. A one line solution for comparing XML documents. (The “EXCEPT” code was also one line). Of course if you want to find out specific attribute values and changes, then the code becomes more involved, but you’ve gotta admit that this is elegant. Can you imagine how much code this would need in the Xml DOM world!!

I’m starting to get hooked on LINQ!  🙂

Advertisements

Written by santoshbenjamin

November 28, 2009 at 6:21 PM

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: