Santosh Benjamin’s Weblog

November 28, 2009

Using INTERSECT with LINQ to XML

Filed under: Coding, General, LINQ — santoshbenjamin @ 6:21 pm

In terms of hands-on coding (as opposed to general awareness), I’m a bit of a newbie to the world of LINQ, having only dabbled with a little LINQ to XML in MockingBird, and even there I wasn’t too impressed with it in the area of XPath queries. But I came across something yesterday that is a testimony to the power of LINQ.

My scenario was that I wanted to compare two XML documents that followed the same schema, but in a fairly generic way, without writing code to explicitly pick up every element in the hierarchy. My requirement was to find all the elements common to the two documents and also the elements present in one but not the other.

Take the following example:

Document-1
<Authors>
  <Author ID="1" Name="AuthorA" JoinDate="3/1/2009"/>
  <Author ID="2" Name="AuthorC" JoinDate="3/1/2009"/>
</Authors>
Document-2
<Authors>
  <Author ID="1" Name="AuthorA" JoinDate="3/1/2009"/>
  <Author ID="2" Name="AuthorB" JoinDate="3/1/2009"/>
</Authors>

I quickly found that LINQ has a powerful INTERSECT function, which would allow me to find the common elements, and an EXCEPT function, which finds the elements that are in one sequence but not in the other.

My first attempt (at finding the common elements) was like this:

var commonFromA = aDoc.Descendants("Authors").Intersect(bDoc.Descendants("Authors"));

But this did not work. After many more attempts and discussions with colleagues, it was beginning to look like I could only use INTERSECT with native types, and I would either have to write a custom IEqualityComparer<T> or write more complex code involving anonymous types (which are, by the way, a brilliant feature of the framework).

But LINQ is supposed to be elegant, right? So I posted the question on the MSDN Forums and got an immediate reply from Martin Honnen, an MVP in this area, and yes, the solution was elegant and just one line.

var commonFromA = aDoc.Descendants("Author").Cast<XNode>().Intersect(bDoc.Descendants("Author").Cast<XNode>(), new XNodeEqualityComparer());

As Martin explained, set operators like INTERSECT and EXCEPT use the default equality comparer unless told otherwise, and for XElement that means object identity, not value comparison. As I had distinct XElement objects in different documents, my initial attempt could not work. However, XNodeEqualityComparer comes to the rescue: it compares nodes by value, and casting the XElement to XNode was all that was required to use it.
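For anyone who wants to try this end to end, here is a minimal sketch built around the two sample documents above. The class and method names (AuthorDiff, Common, OnlyInFirst) are purely illustrative and not from Martin's reply; the EXCEPT version simply follows the same pattern as the INTERSECT one.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Illustrative helper (hypothetical names): compares <Author> elements by value.
public static class AuthorDiff
{
    // Authors that appear, with identical attributes and content, in both documents.
    public static List<XNode> Common(XDocument aDoc, XDocument bDoc)
    {
        return aDoc.Descendants("Author").Cast<XNode>()
                   .Intersect(bDoc.Descendants("Author").Cast<XNode>(), new XNodeEqualityComparer())
                   .ToList();
    }

    // Authors that appear in the first document but have no equal in the second.
    public static List<XNode> OnlyInFirst(XDocument aDoc, XDocument bDoc)
    {
        return aDoc.Descendants("Author").Cast<XNode>()
                   .Except(bDoc.Descendants("Author").Cast<XNode>(), new XNodeEqualityComparer())
                   .ToList();
    }
}
```

With Document-1 and Document-2 loaded via XDocument.Parse, Common returns the AuthorA element and OnlyInFirst returns the AuthorC element.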

What’s even more interesting is that in .NET 4.0, we have something called “contravariance” which will allow the INTERSECT code above to work without the explicit cast. Martin explains this very well in this post on “Exploiting Contravariance with LINQ to XML”. I always wanted to understand what Covariance and Contravariance were all about and this is a great explanation.

Essentially, with contravariance, you can use a comparer written for the base type XNode (here, XNodeEqualityComparer) even though the sequences contain the derived type XElement, and you don’t need to mess with casting. This is safe because the comparer only takes the objects in as input and never hands them back out, so anything that can compare two XNodes can also compare two XElements.

On the same subject, also check out Eric Lippert’s blog article. I had come across that post earlier but didn’t have any immediate need for that functionality, so I didn’t pay attention. This time, I did.
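A small sketch of what the cast-free version might look like on .NET 4.0 or later. The class name AuthorDiffNoCast is made up for illustration, and the explicit local variable is only there to make the contravariant conversion visible; it is not taken from Martin's post.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Illustrative helper (hypothetical name); needs .NET 4.0 or later.
public static class AuthorDiffNoCast
{
    public static List<XElement> Common(XDocument aDoc, XDocument bDoc)
    {
        // IEqualityComparer<in T> is contravariant from .NET 4.0 onwards, so a
        // comparer of XNode can be used where a comparer of XElement is expected.
        IEqualityComparer<XElement> byValue = new XNodeEqualityComparer();

        // No Cast<XNode>() on either side, and the result stays typed as XElement.
        return aDoc.Descendants("Author")
                   .Intersect(bDoc.Descendants("Author"), byValue)
                   .ToList();
    }
}
```

A nice side effect is that the result is a list of XElement rather than XNode, so attributes can be read without casting back.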

So, there you have it. A one line solution for comparing XML documents. (The “EXCEPT” code was also one line). Of course if you want to find out specific attribute values and changes, then the code becomes more involved, but you’ve gotta admit that this is elegant. Can you imagine how much code this would need in the Xml DOM world!!

I’m starting to get hooked on LINQ!  :-)

November 6, 2009

To DAL or not to DAL

Filed under: Architecture, Biztalk, Coding — santoshbenjamin @ 9:17 pm

Do BizTalk consultants need to care about Data Access Layers? Does a BizTalk solution really need a DAL?  These are the questions that I’ve been mulling over in the past few weeks. Let me explain.

There are a couple of places where a BizTalk solution encounters a DAL. The first is where the DAL acts as an integration enabler. Here the endpoint of the LOB application we are integrating with happens to be a database. The second is where the DAL acts as a process enabler. Here the DAL provides the underpinning of the business process (that is, as part of the business process, it is frequently necessary to update a database with the state of the business document being operated on).

In my current gig, we are using both BizTalk and SSIS. SSIS is great for the ETL and various data related actions. BizTalk then takes over and passes the data to an LOB application, doing various business processes as part of that communication. The nature of the processes is such that there is a significant DAL. Early on in the project we went through the usual debate on whether a custom DAL was necessary or if we should just use the requisite database adapters. Isn’t the database adapter an obvious choice? Maybe, or maybe not. In an earlier post, I talked about just such a situation a few years ago where we had to choose whether to link directly to the DB or wrap the system in a web service first and, as I explained, things didn’t turn out the way they were expected to.

So, what are the considerations?

  1. Firstly (as I explained in the post and the follow up posts), one of the key issues is the level of abstraction you are given. Especially when dealing with the scenario of integration enablers, a database endpoint is very rarely coarse grained enough to support a service oriented approach. It’s more likely that you will be provided with CRUD level interfaces. Even if you decide to direct all communication via an orchestration that wraps all this, how does the orchestration actually call the backend system? Go via the adapter or use a DAL?
  2. For the scenario of process enablers, abstraction comes into play again. You don’t want to be cluttering up your orchestrations with bits and pieces of database schema related stuff. You could choose to wrap the database calls in a coarser stored proc, but this leads to the next key point.
  3. Performance. If you have a number of send ports (for all these stored procs) in the middle of your orchestrations, there is a cost associated with all those persistence points. If your transaction handling requirements permit, you could think about wrapping some of those calls in atomic scopes, but you have to be very careful with this. If you do encounter an issue and everything gets rolled back, are your processes really designed to start at the right place all over again without compromising data integrity?
  4. If your DAL is designed well, your orchestrations get to call methods on business level entities instead, and, on the persistence point consideration alone, will in my opinion be better off.
  5. Transaction bridging: there were a few situations where we had to bridge the transaction across the database and the segment of the business process. Fortunately, the DAL being of extremely high quality (courtesy of an expert colleague) made this very easy to do.
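To give a flavour of the transaction bridging point, here is a very rough sketch using System.Transactions. It is not the DAL from the project: the class name and the two delegates (updateDatabase, advanceProcess) are hypothetical placeholders, and the sketch assumes that whatever they touch is able to enlist in the ambient transaction.

```csharp
using System;
using System.Transactions;

// Hypothetical helper: runs a database update and a process step as one unit of work.
public static class TransactionBridge
{
    public static void RunAtomically(Action updateDatabase, Action advanceProcess)
    {
        // Everything inside the using block shares one ambient transaction
        // (Transaction.Current), provided the resources involved enlist in it.
        using (var scope = new TransactionScope())
        {
            updateDatabase();
            advanceProcess();

            // Only reached if both steps succeeded; otherwise disposing the
            // scope rolls the transaction back and the exception propagates.
            scope.Complete();
        }
    }
}
```

The point of the shape is that the caller never sees a half-done update: either both steps commit together or neither does.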

But, having said all this, a DAL doesn’t come free. You have to write code. Sometimes lots of it. The more code you write, the higher the probable bug density. If the functionality can be satisfied with a code-generator then that will reduce the code you have to write, but it DOES NOT reduce the amount of code you have to MAINTAIN. I think many developers forget about this last point. I’m all in favour of code-gen, but don’t forget the maintenance cost.  (Further, if the functionality in the middle of your processes can be satisfied with boiler plate code, perhaps it’s an opportunity to question what it’s doing there in the first place. Can it be pushed to a later stage and componentized? )

I must confess, at one point, when wading through a sea of DAL code early on in the project, I was quite tempted to throw it all away and go for the adapters, but the considerations above outweighed the pain at that point. Now much later, with everything having stabilized, we know just where to go to make any changes and the productivity is quite high.

But I’ve seen cases where BizTalk developers didn’t care about the SQL they wrote and they ended up in a mess with locking and poor performance. And it takes a really good developer to write a first class DAL, and having interviewed and worked with a number of devs, I can say that it’s hard to find good skills in this area. Pop quiz: Do you know how to use System.Transactions yet?  :-)

There is always the option of using something like NHibernate. If you use some coarse grained stored procs and some business entities, you could kill all the “goo” in the middle by letting NH take care of the persistence. That, I wager, would reduce the bug count in that area. But watch out for the maintenance times and the bug fixing. When there’s a component in the middle that you don’t know the internals of, it can make life very hard when trying to track down bugs.

That leads me on to the point of making choices based on knowledge and not ignorance. If you want to adopt “persistence ignorance”, don’t do it because you can’t write proper DAL code yourself. Do it for the right reasons.

So I hope the points above have given some food for thought. Custom code is not always bad as long as it is approached and implemented correctly. Whether you choose to use a DAL or not, do it with careful thought on issues like the ones above. As always, your feedback is welcome.


June 4, 2009

Dev10 Dive: 1 – Emphasizing TestFirst

Filed under: Coding, Dev10 Dive, Training Kits, Visual Studio 2010, WF — santoshbenjamin @ 3:43 pm

I recently downloaded and installed Dev10 Beta-1 and created some images for my team. The Channel-9 video guiding us step by step through the whole process was invaluable. One thing that had me in trouble was the installation of Full Text Search in SQL 2008 (as TFS requires this feature). When I captured the ISO image (as we usually do in Virtual PC) and installed from there, the installation failed. It turns out that the installation media needed to be inside the VM. That done, the rest of the installation was fine.

Anyway, I then got hold of the VS 10 Training Kit and started with the WF labs. Got one full exercise done. The thing that impressed me most was not actually WF itself (at this particular time), but the fact that when writing the custom activity, the instructions were to first write a test to check the output of the activity. Not only that, there is also a nod to the BDD side of things, as the name of the test was “ShouldReturnExpectedGreeting” (or something along those lines). Now, if you’ve looked at the various blogs around BDD, one of the first steps (or baby steps, if you like) towards proper BDD is to start naming tests like this rather than the staid old “TestGreeting” or “GreetingTest”. It may seem like a small thing (and that was my opinion when I started down this route as well), but to me it made a lot of difference to the way I approached my tests and helped me nail the purpose of the test better, thus also keeping it concise. Aside from this, it serves as a form of documentation, so a quick glance over your code base (even your own code, when you look at it after a few weeks or months) will bring you or the reviewer up to speed faster than dodgy or less meaningful names would.

In keeping with this emphasis on the test-first approach, there is another, older video on Channel 9, part of the same series, titled “Code Focussed in VS10”, which shows some of the new features that allow us to write the test first and then have the IDE generate the class stubs and method stubs from the test itself. Of course, for those devs using R# and other refactoring tools this is nothing new, but lots of developers don’t use them. This is a nice addition that lets us really write the tests first and stay within the test, fleshing out the class as we go along, rather than just writing a failing stub and then switching attention to the class. Unless you are very disciplined, once you start working on the class you tend to leave the tests behind and revisit them later, with the attendant refactoring of code and tests.

So there it was: a rather pleasant discovery of a development discipline in a rather unlikely area (considering how design and IDE driven WF is). I’m looking forward to the other labs and I hope this emphasis is in them as well.
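To illustrate the naming point only, here is a tiny framework-free sketch. The Greeter class is a made-up stand-in for the custom activity in the lab, and the greeting text is an assumption; in a real test project the method would carry whatever test attribute your framework uses.

```csharp
using System;

// Hypothetical stand-in for the logic of the custom activity from the lab.
public static class Greeter
{
    public static string Greet(string name)
    {
        return "Hello, " + name + "!";
    }
}

public static class GreeterTests
{
    // The name states the expected behaviour, not just the thing under test
    // (compare with "TestGreeting" or "GreetingTest").
    public static void ShouldReturnExpectedGreeting()
    {
        string actual = Greeter.Greet("World");

        if (actual != "Hello, World!")
        {
            throw new InvalidOperationException(
                "Expected 'Hello, World!' but got '" + actual + "'");
        }
    }
}
```

Reading the test list then tells you what the code is supposed to do, which is the documentation benefit mentioned above.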

June 1, 2009

VS Color Schemes : Rejuvenating Development

Filed under: Coding, General — santoshbenjamin @ 2:14 pm

Ok, so I’ve been really late to this particular party, but I gotta say, I’m absolutely thrilled with the effect that changing the color schemes of VS has had on my coding morale!! I’ve been using several schemes from Tomas Restrepo’s collection and it’s done wonders for me (specifically Ragnarok Blue, Grey and Moria Alternate). Since I’m using VS 2005 and 2008 side by side, I have quite different color schemes for them and it makes things more interesting than the mundane white background. Maybe it’s also age and the fact that my eyes get tired more easily but hey, Consolas at 15pt looks awesome. :-)

Having said this, I also started work on Dev10 and I must say, the OOB color scheme is nice. The new WPF editor renders the fonts much crisper and neater so I’ve been content to leave it without changing to a dark background. I guess we’ll have to wait a while for some new schemes to emerge. Quite sure the new editor has various new options for color schemes.

Another thing that it’s done, aside from making my IDE look nicer, is give me a coding boost. In fact, my releasing BizUnitExtensions 3.0 is more down to the new color scheme than anything else :-)

So, if you haven’t taken this particular plunge yet, why not try it out?

February 5, 2009

BizTalk Testing and Mocks

Filed under: BizUnit, Biztalk, Coding, Mock Objects, Testing, Tools — santoshbenjamin @ 11:08 pm

In an earlier article, I had briefly mentioned that some folk had used mocks with BizTalk, notably to test pipeline components. Since I didn’t have the bookmarks at hand then, I didn’t provide the links, but I have since found them again, so here they are (and I can also now use this as a note to self if I want to refer to them again or expand on any of the material they have written).

While the  blog posts pointing to the Pipeline Testing Library are useful, if you want to go straight to the source, check out the WIKI page that Tomas has set up on GitHub. That page has more samples on how to use the API.

I’m going to have a play around with MoQ and pipeline components in the next couple of days. I think MoQ’s approach is a bit more elegant than Rhino’s (particularly the absence of record and replay). I’m also going to link into Tomas’s excellent pipeline testing library from BizUnitExtensions. This has been a long overdue item on my roadmap.

UPDATE: Bram Veldhoen has already done some work on linking the Pipeline Testing Library into BizUnit and has very graciously contributed his code to be put into BizUnitExtensions so that will be released soon with Extensions 3.0.

Enjoy the links and if you find others of a similar ilk that are also useful feel free to put them in the feedback section here and I will update the post.

February 3, 2009

VS2008 – Generate XML Instances

Filed under: Automation, Coding, MockingBird Diary, Testing — santoshbenjamin @ 12:07 pm

It’s funny how we take things for granted. As BizTalk developers, we get used to the idea of being able to right click on a schema and generate an instance. In non-BizTalk projects, however, this couldn’t be done. Till now.

I was playing around with writing some XML Instance Generation for MockingBird to finish off the next release and spent a lot of time poking around the Schema Object Model etc. While doing that, I quite accidentally opened the XML Schema Explorer tool window. Now, I had seen that in the past and navigated through some types etc, but thinking it was just a simple add-on to the old VS, I kind of took it for granted and didn’t investigate further.

What I did not realize is that for Elements, you can generate sample instances.  Check out the following screenshot.

[Screenshot: Generate Sample XML]

As you can see from the tree behind the popup window, elements are colored differently as well.

This is a great time-saver. I think this can be done only in VS2008 SP1.

Unfortunately, the downside is that there is no API into this tool window (or rather, the library behind it), so we cannot programmatically generate instances in bulk. Also, it will not open WSDL files, so you have to extract the XSD from the WSDL (if not already available separately) in order to work with this tool window. But I think it’s cool, as we no longer have to depend on third party XML editors to get sample instances.

By the way, if you are looking for help in this area (of instance generation), there is some sample code available in the MSDN article, Generating XML Documents from Schemas, which is quite well written. While there are license restrictions on modification/derivation (and then redistribution), plain redistribution without modification, I gather, is permitted, so the easiest thing for MockingBird would be to just redistribute the binaries of that sample with the GUI. No sense in reinventing the wheel.

In terms of the BizTalk Schema Editor and its instance generation, if any BizTalk folks know if there’s a programmatic way of doing that, please let me know (Update: I mean specifically for 2006 and R2). I did a lot of digging around in the Developer Tools folder for an assembly that would allow it, but all the classes were internal. I did finally come across one public class (I don’t remember the assembly off the top of my head now) which had a public method, but it required some interface to be passed in and didn’t work when I tried calling it from custom code. It would be useful to do this programmatically so we can generate instances in bulk for a given set of schemas (useful when updating instances to correspond to schema changes etc). So, if you’ve managed to do this and are happy to share info, then drop me a line.

January 22, 2009

Waltzing with WSDL

Filed under: Coding, General, Mock Objects, MockingBird Diary, Tools, WCF — santoshbenjamin @ 12:55 am

One of my immediate goals for MockingBird is to have the “Configurator” UI done, where you can just pick a WSDL on file or point to a URL and have the tool shred it and generate sample request and response messages into the correct folders (and generate the Handler Configuration), after which the user can do any extra XPath based configurations to make the system more dynamic. Sounds easy enough, doesn’t it? Well, that’s what I thought too, until I delved into the intricacies of WSDL parsing.

The class most people use is ServiceDescription and, to be fair, it gives some information about the WSDL, but navigating it is not easy at all and it’s not all that intuitive once you get past listing all the Operations inside the PortTypeCollection. There is another big issue with trying to use ServiceDescription alone, and that is the fact that WCF by default gives you multipart WSDLs. So getting hold of the WSDL from a WCF endpoint is not straightforward, but one class that comes to the rescue is DiscoveryClientProtocol, as Mike Hadlow points out in his article. The multipart behavior can be changed, as pointed out by Tomas Restrepo in his post on “Inline XSD in WSDL with WCF”, and Christian Weyer takes it up a notch in his article on Flat WSDL in WCF.

Anyway, after pondering this a while, I decided it would be in my best interests to factor the code into a WSDLRetriever and a WSDLDescriptor. The Descriptor takes a dependency on WSDLRetriever (an interface of course, so it’s easy to mock) and expects a single WSDL XDocument which it can work with. With this approach I don’t care whether the endpoint (if the WSDL is not on file) is a WCF endpoint or not; the Retriever does all that work. With that sorted, I wrote up the first few tests for the Descriptor using a mock Retriever. So far so good.
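A rough sketch of how that split could look. This is not the MockingBird code: the interface and class names (IWsdlRetriever, WsdlDescriptor, StubRetriever) and the GetOperationNames method are assumptions made up for illustration, and only the WSDL 1.1 namespace is a known constant.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical interface: hides whether the WSDL comes from a file, a URL or a WCF endpoint.
public interface IWsdlRetriever
{
    XDocument Retrieve(string location);
}

// Hypothetical descriptor: only ever sees a single, already assembled WSDL document.
public class WsdlDescriptor
{
    private static readonly XNamespace Wsdl = "http://schemas.xmlsoap.org/wsdl/";
    private readonly IWsdlRetriever retriever;

    public WsdlDescriptor(IWsdlRetriever retriever)
    {
        this.retriever = retriever;
    }

    // Lists the names of the operations declared under every portType.
    public List<string> GetOperationNames(string location)
    {
        XDocument wsdl = retriever.Retrieve(location);

        return wsdl.Descendants(Wsdl + "portType")
                   .Elements(Wsdl + "operation")
                   .Select(operation => (string)operation.Attribute("name"))
                   .ToList();
    }
}

// Hand-rolled stub for tests; a mocking library would do the same job.
public class StubRetriever : IWsdlRetriever
{
    private readonly string xml;

    public StubRetriever(string xml)
    {
        this.xml = xml;
    }

    public XDocument Retrieve(string location)
    {
        return XDocument.Parse(xml);
    }
}
```

Because the Descriptor only depends on the interface, its tests can feed it a canned WSDL string and never touch the network.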

Now, along with the requirement stated above, I also want to build a nice UI where you can not only do a single click to generate messages, but also exercise finer control over the message content before storing the messages in the appropriate folders. So imagine, if you will, a tree view which shows all the operations and the messages in them, and immediately under the message name, the name of the complex type and additionally the body of the type (not in the tree node, but in an associated panel).

Of course, I’m not trying to write a completely generic WSDL parser, but something that’s immediately usable by MockingBird and by other tools which can use a WSDL as input.

I had a look around at some blog posts and samples but none seemed to go as far as I wanted. I also took a look through the code for WSCF Blue but couldn’t make immediate use of that API. (I’ve since chatted with Buddhike about that, so there may be something I could do with the WSCF libraries in future.) I looked at WebServiceStudio as the license allows reuse of the code there, but the code structure is simply awful. It works as is, but everything is so deeply tied into the UI that it will take a feat of coding to extend it or pull out any reusable logic from it. I tried refactoring a local copy but gave up pretty soon. :-( So I had to work through the requirement myself.

Anyway, parsing WSDL is not easy at all. Once you get to the operation and message, you then have to walk through all the schemas and their associated object tables to pick out the correct complex type corresponding to the message. This took me the best part of a few hours. Given that the user will not supply any namespaces, it gets very hard indeed. Thankfully the Schema Object Model is very decent. I used LINQ to XML for little bits such as extracting endpoint information (which also required me to resort to the XPath extension methods, because LINQ insisted I give it qualified names but I didn’t know what the namespaces were!!), but for the schemas in WSDL, I think the SOM is better than LINQ. But I’m happy to be proved wrong!!

So I’ve got it to the point where it brings back the complex types. Next is to generate sample instances from them, for which I’ve got some resources. Once that’s done and a simple Retriever is implemented, it won’t be hard to put a UI on top, because none of this is tied to the UI.

That’s it for now: a sort of ‘progress’ post for those waiting for the GUI for MockingBird, and also a pointer to some of the dragons that await those who venture into the WSDL arena. Talk about ‘design by committee’... it’s a minefield!!

A couple of really good articles are Understanding WSDL by Aaron Skonnard (an old one, but an excellent read) and Walking the SOM by Stan Kitsis.
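As an illustration of the namespace problem, here is one way the endpoint lookup might be written with the XPath extension methods, matching on local names so that no namespaces need to be known up front. The class and method names are hypothetical and this is not the MockingBird implementation.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

// Hypothetical helper: pulls endpoint addresses out of a WSDL without knowing its namespaces.
public static class WsdlEndpoints
{
    // Returns the location attribute of every <address> element sitting under a <port>,
    // whatever binding namespace (SOAP 1.1, SOAP 1.2, HTTP) the address happens to use.
    public static List<string> GetAddresses(XDocument wsdl)
    {
        return wsdl.XPathSelectElements("//*[local-name()='port']/*[local-name()='address']")
                   .Select(address => (string)address.Attribute("location"))
                   .Where(location => location != null)
                   .ToList();
    }
}
```

The same effect can be had in plain LINQ by filtering on element.Name.LocalName, at the cost of slightly more verbose queries.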

Till next time :-)

May 31, 2008

CodeStyleEnforcer

Filed under: Coding, General, Standards — santoshbenjamin @ 7:33 pm

You gotta love this. It’s like buses. You wait for one forever and then two come along at the same time. No sooner had I finished looking through StyleCop than I happened to go over to the Visual Studio Gallery (this is different from the MSDN Code Gallery) and came across the CodeStyleEnforcer. Downloads can be done from the homepage of the tool here.

According to the blurb:

“Code Style Enforcer is a DXCore plug-in for Visual Studio 2005 / 2008 that checks the code against a configurable code standard and best practices. It is developed for C#, but some of the rules will also work for VB .NET, though not tested. The code standard is currently configurable.”

To paraphrase the rest – it supports name rules, visibility rules & implementation rules. It is based on the IDesign C# coding standard, which is also a freely available and popular standard.

On taking a look at the rather grainy screenshot provided, it appears that this is a ‘pro-active’ tool and highlights issues in code as they are written. ReSharper provides some of this kind of checking as well but, of course, that’s commercial whereas this is free. I’ve also been impressed with the way the DXCore system almost blankets the entire CodeModel, allowing you to write plugins easily. More on this later.

It also seems, from various sources on the net, that loads of people are moving away from ReSharper to Refactor Pro (written by the DXCore guys, DevExpress), especially with the instability of R# 3.x. I still use R# 2.5.2 and will wait for reviews of 4.x before I stump up for an upgrade, but in general, I love the tool. Still, it’s good to have different options. Anyway, this post is not about R#. Check out CodeStyleEnforcer and see how you like it.

StyleCop – first impressions

Filed under: Coding, General — santoshbenjamin @ 2:20 pm

A friend alerted me to the fact that StyleCop (formally named Microsoft Source Analysis for C#) is now available in the public domain and can be downloaded from the MSDN Code Gallery. It’s been used as an internal tool in Microsoft for a long time.

It differs from FxCop which is targeted at compiled assemblies while this looks at source code and is only aimed at C#. Information on the tool can be found here and the comments are quite interesting. It looks like a controversial tool (for example, the preference for spaces over tabs) so now we are going to have more layout wars. 

This is delivered as a VS Package so when you install it, it automatically gets loaded into the VS 2005 Tools Menu under the caption – Run Source Analysis. There is a SourceAnalysisSettingsEditor application in the Program Files directory. No shortcuts are available on the desktop.

There is a gotcha when you try to launch the editor application from that folder directly. It will immediately throw up an error message saying “The command line arguments are incorrect. Include a path to the Source Analysis Settings file.”

To overcome this, simply drag the “Settings.SourceAnalysis” file onto the application and it will open up correctly. Alternatively, you can open up a command prompt at the folder and enter “SourceAnalysisSettingsEditor.exe Settings.SourceAnalysis” to launch the application. The third method, which I guess is the way it was intended to be used (as it is a VSPackage), is to simply right click on the Project and choose the Source Analysis Settings option.

Ultimately, however you choose to launch it, the main screen looks like this

[Screenshot: StyleCop_UI]

I ran it on a small application and it immediately threw up at least a hundred errors – a bit like FxCop – very daunting. The interesting thing about the settings file is that there is hardly anything there. It’s an XML file (aren’t they all?) and I was expecting to see all the options that had been selected in the application showing up there. Actually the application launches with all the default settings, and once you unselect or select one of them, the settings file immediately gets updated. However, we are not expected to play around with this that much. According to the blurb on the MSDN post:

The ultimate goal of Source Analysis is to allow you to produce elegant, consistent code that your team members and others who view your code will find highly readable. In order to accomplish this, Source Analysis does not allow its rules to be very configurable. Source Analysis takes a one-size-fits-all approach to code style, layout, and readability rules. 

Lone Ranger developers with strong opinions on code layout (especially if they differ from these) will most probably be up in arms at some of this, but the tool is aimed at team use, and when you need consistency in a big code base, tools such as these are a definite asset (of course, in some teams you gotta be prepared to start arguing about the number of spaces/tabs etc :-) .. sigh!! you can’t win ’em all).

It is also not an extensible tool, so you cannot add your own rules to it. FxCop does allow that, but then you need a stout heart to even attempt writing custom rules for FxCop, so I don’t think the absence of a facility for defining custom rules is that much of a downside (in a free tool, that is; commercial tools are a different story).

There’s also a couple of good posts on the MSDN blog documenting the following items in more detail.

There is also an MSBuild task available for this in the Microsoft.Sdc.Tasks library (Microsoft.Sdc.Tasks.Tools namespace) and the CHM file for the library indicates that the task has the following parameters

<Tools.StyleCop FullAnalysis="fullAnalysis"
                PathToAnalyze="pathToAnalyze"
                ViolationsFile="violationsFile"
                UseVSBuildFiles="useVSBuildFiles"
                FileToAnalyze="fileToAnalyze"
                OptionsFile="optionsFile"
                UseCoreXTFiles="useCoreXTFiles">
    <Output TaskParameter="TotalViolations" ItemName="itemName" />
</Tools.StyleCop>

So, it’s nice to have this, and I’m sure that now it’s in the open, there will be some enhancements made in due course. Perhaps someone will work out a way to plug new rules (maybe using extension methods!!) into the existing code base, or maybe we’ll get a new version if it starts getting popular. Check it out and see if it helps your team.