Monday, November 12, 2007

Backwards compatibility in services

With all the hoopla over web services (including the WS-* and REST varieties) and asynchronous messaging, there seems to be little discussion of how to keep these services (I'm using the term very generically here) compatible with existing service consumers. This isn't really a problem for vendors or tools providers to solve, as it pertains to the problem domain at hand and how the engineers and architects want to manage it. In other words, it's yer own damn problem.

After speaking with engineers and architects about how they handle service evolution, I've noticed a few common approaches:
  • the no-harm, no-foul approach
  • the big-bang approach
  • the multiple service instances approach
  • the maintain-it-yourself approach
First up, the no-harm, no-foul approach (a/k/a close-your-eyes-and-hope-everything-is-cool) updates the services and expects the clients to "just work". Funny, life just doesn't work out that nicely. What frequently happens is that after the new service code rolls into QA (or, god forbid, production), clients start breaking - and then it's either revert the service or update the rogue clients.

Second, the big-bang approach updates the service and all of its clients at the same time. While certainly handy, it requires that you control both the service and all of the clients, and that your QA and systems teams can accommodate simultaneous releases. If all the clients are internal, this might not necessarily be a bad approach, but that point notwithstanding, I've found that either not all the clients get properly updated (whether through a code migration problem or simply forgetting to update some client's code) or, more likely, not every client gets tested thoroughly.

The multiple service instances approach is perhaps the most common deployment approach I've seen. When you need to release a new version of the service, you just deploy a new web-app with the updated code at a new URL (typically, a version number appears in the URL itself). On the surface, this works rather well because old services are never removed (hence, their clients keep on working as they should), but I see two major drawbacks. The first is the database. Assuming the service uses a relational database (as most do), database schemas are not backwards compatible unless you really work at it with triggers, synonyms, and so on (check out Scott Ambler and Pramod Sadalage's Refactoring Databases: Evolutionary Database Design; it's truly brilliant). The second problem is resource consumption - memory, disk space, power, etc. These costs show up most obviously in small shops, and many medium to large enterprises can ignore them; however, large industry players like Dan Pritchett, architect at eBay, are now intimately concerned with these issues as they build out more and more redundant data centers.
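To make the side-by-side idea concrete, here is a minimal sketch in Java using the JDK's built-in HttpServer. The paths, payloads, and class name are all hypothetical, and in a real shop each version would likely be its own web-app deployment rather than two contexts in one process; co-hosting them here just keeps the sketch self-contained:

    import com.sun.net.httpserver.HttpExchange;
    import com.sun.net.httpserver.HttpServer;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;

    // Hypothetical illustration of the multiple-service-instances approach:
    // two versions of the same "account" service, each at its own versioned URL.
    public class VersionedDeployments {

        static void reply(HttpExchange exchange, String body) throws IOException {
            byte[] bytes = body.getBytes("UTF-8");
            exchange.sendResponseHeaders(200, bytes.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(bytes);
            }
        }

        public static void main(String[] args) throws IOException {
            HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);

            // v1 stays deployed, so existing clients keep working untouched...
            server.createContext("/service/v1/account",
                    exchange -> reply(exchange,
                            "<account><name>John Smith</name></account>"));

            // ...while v2 is free to change the representation at a new URL.
            server.createContext("/service/v2/account",
                    exchange -> reply(exchange,
                            "<account><first>John</first><last>Smith</last></account>"));

            server.start();
        }
    }

Note how v1 clients never see a change: the new representation only exists at the new URL, which is exactly why the old schema and the old deployment have to stick around.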

Lastly, the maintain-it-yourself approach deploys just a single version of the service (the most recent one), which can successfully process and reply to clients at all version levels. Typically, the client indicates a service version somewhere (in the URL, an XML namespace, and so on); the service then uses that piece of information to translate the incoming parameters, as needed, in a version-specific manner, performs its normal functions at the current operating version level, and finally returns the result to the client in the version they are expecting. This approach clearly has an impact on the engineers, who need to create new functionality yet maintain backwards compatibility, and that is the biggest cost. It could be argued that the processing cycles required to determine the requested version, convert up, process, then convert back down could be a drag on performance. I tend to think that, with everything else going on in a service, this cost should not be a significant factor - or something else squirrelly may be going on. This approach does, however, allow the database to evolve at a natural pace, and that cannot be overstated (at least while we use databases in the manner in which we traditionally have).
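Here is a minimal sketch of that up-convert/process/down-convert flow, again in Java; the version scheme, field names, and XML shapes are my own invention, not from any particular framework:

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical sketch of the maintain-it-yourself approach: one service
    // implementation that always works at the current version internally, with
    // thin adapters that lift old requests up and push responses back down.
    public class VersionAdaptingService {

        // The current internal representation; only this version carries the
        // service's real business logic.
        record AccountRequest(String firstName, String lastName) {}
        record AccountResponse(String firstName, String lastName, String id) {}

        // Step 1: figure out which version the client speaks (here from a URL
        // fragment; an XML namespace or header would work just as well).
        static int versionOf(String path) {
            if (path.contains("/v2/")) return 2;
            return 1; // default to the oldest supported version
        }

        // Step 2: convert the incoming parameters up to the current version.
        // v1 clients sent a single "name" field; v2 split it in two.
        static AccountRequest upConvert(int version, Map<String, String> params) {
            if (version == 1) {
                String[] parts = params.getOrDefault("name", " ").split(" ", 2);
                return new AccountRequest(parts[0], parts.length > 1 ? parts[1] : "");
            }
            return new AccountRequest(params.get("firstName"), params.get("lastName"));
        }

        // Step 3: normal processing, always at the current version.
        static AccountResponse process(AccountRequest req) {
            return new AccountResponse(req.firstName(), req.lastName(), "acct-42");
        }

        // Step 4: convert the response back down to what the client expects.
        static String downConvert(int version, AccountResponse resp) {
            if (version == 1) {
                return "<account><name>" + resp.firstName() + " " + resp.lastName()
                        + "</name><id>" + resp.id() + "</id></account>";
            }
            return "<account><first>" + resp.firstName() + "</first><last>"
                    + resp.lastName() + "</last><id>" + resp.id() + "</id></account>";
        }

        public static void main(String[] args) {
            int version = versionOf("/service/v1/account");
            Map<String, String> params = new HashMap<>();
            params.put("name", "John Smith");
            System.out.println(downConvert(version, process(upConvert(version, params))));
        }
    }

The point of the structure is that new functionality only ever touches the current version; each old version costs exactly one pair of converters, which is the maintenance burden this approach trades for a single deployment and a database that's free to evolve.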

As a full disclaimer, I have, consciously or unconsciously, used all four of these approaches at one time or another - and in the order of discovery laid out above. After the initial batch of services was born, I naturally needed to keep adding functionality. After making a whole lot of mistakes in the process of maturing the services (I screwed up the production servers, services, web-apps, and databases several times), the best solution for my system was to maintain backwards compatibility for all clients. Granted, we have quite a good view of what/who all the clients of the services are, so we can be a bit more proactive in phasing out the old client implementations.

1 comment:

Pankaj said...

In the "maintain-it-yourself" approach, don't we have the problem of DB not being backward compatible. You can write your service to be intelligent enough to return the result to the client in the version they are expecting. But how will you handle the DB version in your project?