SOA-2: Debating the nature of the Integration Layer
In a previous post on project goings-on I talked about our choice of adapters and the lessons we learned. One thing we are never short on is stress, excitement and debate (mostly the last one).
Before you read any further, let me warn you this is going to be a long one. I’ve been chewing over it for a couple of months now. Ok, back to business.
Once that project was over we had to start on the next assignment. The requirement was this: there is an existing backend system, call it System A (which we integrated with in the previous project), which was used by a particular department, and the business wanted to build another system based on some processes which had hitherto been paper based. Let’s call that System B.
The way they wanted it to work was that System A and B would actually be subsystems of one big application and that they would share some core database entities such as person & address. Now we have a portal, which is a sort of mash-up presenting a view of the customer across multiple systems including A and B. (You could argue that this is not really a mash-up because it processes transactions across systems rather than just presenting an aggregated view.) System B did not need its own UI because it was to be presented via the portal, so all we needed was some web services.
We take our lessons seriously and once we had decided that we would avoid the database adapter in the service layer, a more profound argument arose as to the value and maybe more importantly the location of the service layer.
In a general integration situation where you have existing systems and you are placing a hub into the mix, the design is fairly straightforward. The hub obviously goes in the middle. But when you have a total green field what do you do?
The key thing that got people tied in knots for a while is that if the portal is to call web services and the backend system exposes web services then why put a web service layer (backed by orchestrations) in the middle? If the backend system had a different granularity from what we needed, then the hub would provide the mapping and a reason for its existence, but we had full freedom to design the backend service with coarse enough granularity from the beginning. Yes, we had to send some notifications about data changes in system B to other applications as well, but why do that from a central mediation layer? Why not just emit notifications from system B to the hub and have it then publish them to other systems?
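The options are shown in the block diagram below. Option 1 puts the hub’s web service layer (backed by orchestrations) between the portal and the backend systems. Option 2 has the portal call System B’s web services directly, with System B emitting notifications to the hub for publishing to other systems.
From the portal point of view, it’s just connecting to a web service and it doesn’t particularly care if Biztalk is behind the web service or some vanilla .NET components, but when you are designing those services, what’s behind the service is very important.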
So, what are the considerations in this situation? Here are some of my thoughts on the items that influenced our decision.
Granularity: The way I normally approach things is to categorise the backend service as an Application Service, which can legitimately expose CRUD operations, while the main Business Service that would span System A and B and others would be designed with more coarse grained functionality (for instance, ProcessCustomerNotification, which could be mapped on to one or more Application Service calls such as SearchCustomer & Insert/UpdateCustomer). Of course, in this particular project, as I mentioned above, the granularity was up for grabs so System B might as well expose a ProcessCustomerNotification instead of CRUD.
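To make the granularity point concrete, here is a minimal sketch in Python (not what we actually built, which sits on Biztalk and .NET). The class names and the in-memory store are assumptions for illustration only; the operation names simply mirror ProcessCustomerNotification, SearchCustomer and Insert/UpdateCustomer from above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CustomerApplicationService:
    """Hypothetical fine-grained (CRUD-style) Application Service."""

    # In-memory stand-in for the backend's customer table (an assumption).
    _rows: dict = field(default_factory=dict)

    def search_customer(self, customer_id: str) -> Optional[dict]:
        return self._rows.get(customer_id)

    def insert_customer(self, customer_id: str, data: dict) -> None:
        self._rows[customer_id] = dict(data)

    def update_customer(self, customer_id: str, data: dict) -> None:
        self._rows[customer_id].update(data)


class CustomerBusinessService:
    """Hypothetical coarse-grained Business Service living in the hub."""

    def __init__(self, app_service: CustomerApplicationService):
        self._app = app_service

    def process_customer_notification(self, customer_id: str, data: dict) -> str:
        # One coarse-grained call maps on to a search plus an insert or update.
        if self._app.search_customer(customer_id) is None:
            self._app.insert_customer(customer_id, data)
            return "inserted"
        self._app.update_customer(customer_id, data)
        return "updated"
```

The consumer only ever sees process_customer_notification; whether that fans out to one CRUD call or three is the business service's problem, not the portal's.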
Consistency: From the point of view of consistency, option 2 is awful. Just because you have web services doesn’t automatically mean you have SOA. It becomes a mess of point to point web service connections very easily, which is why we go for hubs in the first place. “So what about ‘distributed SOA’ then?” you may legitimately ask. I have heard that term bandied about a lot in various blogs by folk who don’t seem to like hubs for various reasons. Now option 2 when extended to the nth degree may or may not resemble a distributed SOA, but I don’t know because I haven’t actually seen a distributed SOA or a solid reference architecture for the same which covers all the bases.
Isolation of systems: One of the classic arguments for the hub is “what about the time when the backend system changes? You need to avoid breakage in consumers”. That still holds good.
One point offered by a colleague was that you could always introduce some mapping into the WS to cater to future changes in System B and it didn’t have to necessarily be in a Biztalk layer. I agree with that but the implementation of the backend service needs to be really well partitioned to allow inserting a translation layer without causing issues.
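For what it's worth, the kind of translation layer my colleague had in mind might look something like the sketch below. Everything here is hypothetical: the function names, the field names and the idea that System B renamed its fields are all assumptions made up for the example.

```python
def backend_get_person_v2(person_id: str) -> dict:
    """Stand-in for System B after a hypothetical schema change."""
    return {"id": person_id, "given_name": "Ann", "family_name": "Lee"}


def to_consumer_contract(backend_person: dict) -> dict:
    """Map the backend shape on to the contract the portal already uses."""
    full_name = backend_person["given_name"] + " " + backend_person["family_name"]
    return {"PersonId": backend_person["id"], "FullName": full_name}


def get_person(person_id: str) -> dict:
    """Operation seen by consumers; only the map changes when System B does."""
    return to_consumer_contract(backend_get_person_v2(person_id))
```

The sketch only works because the map is a separate, well partitioned step. If the backend shape leaks straight into the service contract, there is nowhere to put the translation later.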
Cost: This is an interesting consideration. Option 1 may cost more when you consider that System B could well expose whatever the portal needs so why pay to build something in between that would only (at least initially) be a pass through layer? But then again, in the short term point to point is always attractive from the cost point of view and mediation is more expensive. It begins to bite soon enough when the inevitable changes begin. If it didn’t, there wouldn’t be an integration business at all. If you were to design System B to be well factored so you could cater to potential change, then the cost of doing that could well be equal to the cost of putting in the hub components in the first place (assuming that the hub already exists – if it didn’t, then the cost of acquisition could prove to be too much). Secondly, when it comes to the notifications to other systems, you are actually adding extra costs to write the notification emitting component (which is not always a simple database polling option).
Opportunity: (or is it opportunism :-)). This issue is somewhat related to cost. When things start to break and the business starts to worry, there is hardly any time to bring in a proper mediation layer. Further, if the business didn’t want to pay for something like Biztalk early on (because they were convinced that all they needed was the web services) they are hardly likely to change their minds and pay for it at the end. In our case it helps that the hub is existing infrastructure, so we are plugging everything into it early and routing all messages through it, and there are no last minute purchasing considerations to worry about.
Composition (vs.) Mediation: Depending on your needs, providing joined-up processing could take the form of Process Composition or just ESB based mediation/routing. If all you need to do is inform other systems of an occurrence in one system, publishing a notification to a bus and transmitting from there will work fine. If the clients (like our portal) actually want to compose the data and view & act on it as a single unit, then elements like orchestrations (with various different patterns) come into play. This is where option 1 is more suitable.
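The difference between the two can be sketched in a few lines of Python. This is an in-memory toy, not an ESB or an orchestration engine; the Bus class, the topic name and get_customer_view are all names I have made up for illustration.

```python
from collections import defaultdict


class Bus:
    """Minimal in-memory publish/subscribe bus (mediation/routing)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # The publisher neither knows nor cares who is listening.
        for handler in self._subscribers[topic]:
            handler(message)


def get_customer_view(customer_id, system_a, system_b):
    """Composition: call both systems and return one joined-up result."""
    view = {"customer_id": customer_id}
    view.update(system_a(customer_id))
    view.update(system_b(customer_id))
    return view
```

With the bus, nothing comes back to the caller. With composition, the caller gets a single unit it can display and act on, which is what a transactional consumer like the portal needs.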
Management & Operations: It’s a heck of a lot more work to try and track messages in option 2 than it is in option 1. Even if you call out to the BAM Interceptor (in the Biztalk world) from custom components, it will be a significant effort to track all the way. Option 1 would be much easier because the hub is in control right from the start.
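As a rough illustration of why tracking is easier when the hub owns the call from the start, here is a toy sketch. It does not use or imitate the BAM API; TrackingHub, its log format and the stage names are assumptions for the example.

```python
import uuid


class TrackingHub:
    """Hypothetical hub that records every stage of each call it mediates."""

    def __init__(self):
        self.log = []  # list of (correlation_id, stage) tuples

    def call(self, operation, backend, payload):
        # The hub assigns the correlation id, so every hop shares it.
        correlation_id = str(uuid.uuid4())
        self.log.append((correlation_id, operation + ":received"))
        result = backend(payload)
        self.log.append((correlation_id, operation + ":completed"))
        return correlation_id, result

    def trace(self, correlation_id):
        """Return the recorded stages for one message, in order."""
        return [stage for cid, stage in self.log if cid == correlation_id]
```

In option 2 each point to point service would have to agree on and propagate that correlation id itself, which is where the effort goes.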
The nature of the consumer: I used the term mash-up rather loosely when I described my portal above. IMO mash-up refers to more of a read-only approach and so you can use various techniques to pick up the data and present it but when the consumer is a transactional system (to an extent) and when the data elements on the portal (in various portlets) are linked in some way, then things are not so easy. Dealing with a set of business services in one place is much better than pointing all over the place to get/set data.
Performance: This is a big one, especially when you consider that SOAP over HTTP doesn’t make for rocket speed. Calling one web service can be bad enough but passing through one and on to another can be painful. I’ll post about some considerations here and where we have got to in my next post.
Our conclusions and the present: Congratulations if you have come this far! It may seem obvious what our conclusion was: we went for option 1. It has proved to be correct (as I knew all along :-)) because the core business service that links System A and B is now extending to pull in assorted data sets from related systems and display them in the portal, as well as managing changes to those data sets. For the portal, there is no change at all (except new element blocks being added to the service schemas that relate to distinct segments on the UI) and managing them is much easier. The service layer, which was initially a pass through, is, as anticipated, much more solid. System B still needs only pass through maps because we designed its API to be coarse grained enough from the start, but we can handle any changes easily. System A, being an existing one, has an API that’s more fine grained so it’s easy to see where the responsibilities of all the components lie.
So, those are my opinions on the value and location of the services layer. I hope it was a good (and maybe even vaguely useful) read. I’d love to hear your thoughts on this issue and how you make similar decisions. And if you have successfully done a distributed SOA and would like to share your thoughts, or pointers to where I can learn more about it, do let me know.