6 April 2010

Before I write what I write before the next time I write

It seems my last poll revealed that there are still people in the library world who haven't rejected me, or, perhaps a stronger theory, like to watch road accidents. So my next piece will be about why the library world fails so badly at technology and at seeing the future (or even its own relevance to it), but I'm somewhat busy these days with real work, so give me a few more days, ok?

However, all is not lost. I've got a few things to say about, well, the stuff I work with, that bucket I stick my head in every day to see if the crap I put in it yesterday has turned to gold yet. No luck so far.

There's a peculiar discussion going on in the Semantic Web mailing-list at the W3C, which Bernard Vatant will fill you in on. It's funny to watch; where are the success stories, where is the commercial viability, does it even work in academia, has it got traction, what do we do now? Why aren't more people doing it? Why hasn't the world adopted this specific and undoubtedly brilliant world-view yet? Are we all mad!?

I'm sure you can fill in with our own Topic Maps echo here, but the more you dig, the more you discover that most of the sillies put up as reasons or scapegoats for the lack of world dominance are things that, frankly, the Topic Maps community figured out long ago, and some of the features missing in their world are dominant features in ours. And we haven't taken over the world, either. Bummer.

It's frustrating, I know, but what can we do? There's no amount of technology suave that can beat any status quo that feeds upon itself. No new ideas can beat old ones that seem to work, because, well, the definition of "works" is so multi-faceted and complex and, eh, making lots of money for lots of people. Semantic Web and Topic Maps don't make lots of money. Heck, they don't make money, period. They're convenient little technologies that will stay small and insignificant.

I have a plan, though, and it will piss off some of the Topic Maps purists (or, let's face it, even pragmatists) and hopefully some Semantic Web people as well. First, I'll rename it something cool - maybe something like NoSQL or something - and then rename the integral concepts, strip away the jargon, and make it web-friendly by injecting it straight into HTML5 based technology, and relate all queries through SQL. Mwuahaha, I might even throw some REST APIs in there, just to stir it up some more. And I shall call it: the web.

Man, I hate these technical wars over standards and ways of doing things. The thing I love about Topic Maps isn't the standard or the specs. No, it's the thinking I'm forced to do in rejecting some parts, while loving others. It's what I take from it. It's the epiphanies it yields.

NoSQL? Semantic Web? Topic Maps? SQL? They're all just abstract interfaces into a set of memory positions shaped by various registers, stacks and pops. Standardizing our ways is just a step on the ladder of the future, not a platform upon which we have to stand firm.

Anyway, the whole NoSQL thing is something I'll have to write about more later. Right now dinner and kids and cleaning the house beckon.


20 March 2009

Resurrection : xSiteable Framework

I've just started in my new job (yes, more on that later, I know, I know) and was flipping through a lot of my old projects over the years, big and small, and I was looking for some old Information Architecture / prototyping tool / website generator application I made with some help from IA superstar Donna Spencer (nee Maurer) back when I lived in Canberra, Australia.

I found three generations of the xSiteable project. Generation 1 is the one a lot of people have found online and used, the XSLT framework for generating Topic Maps based websites. I meant to continue with generation 2, the xSiteable Publishing Framework (which runs the Topic Maps-based National Treasures website for the National Library of Australia), but never got around to polishing it enough for publication, and before I came to my senses I was way into developing generation 3, which I now call the xSiteable Framework (which sports a full REST stack and Topic Maps). And yes, I'm still too lazy to polish it enough for publication (which includes writing tons of documentation), at least as of now, but I showed this latest incarnation to a friend lately, and he said I had to write something about it. Well, specifically how my object model is set up, because it's quite different from the normal way to deal with OO paradigms.

First of all, PHP is dynamic, and has some cool "magic" functions in the OO model which one can use for funky stuff. Instead of just extending the normal stuff with some extras I've gone and embraced it completely, and changed my programming paradigms and conventions at the same time. Let's just jump in with some example code;
// Check (and fetch) all users with a given email
$usercheck = $this->database->users->_find ( 'email', 'lucky@bingo.com' ) ;
Tables are contextually defined in databases, so $this->database->users points directly to the 'users' table in the database. (Well, they're not really table names, but for this example it works that way.) The framework checks all levels of granularity, and will always return FALSE or the part-object you want, so for example ;
// Get the domain of a user's email address
$domain = $this->database->users->ajohanne->email->__after ( '@' ) ;
Again, it's like a tree-structure of data, a stream of granularity to get in and out of the data. This does require you to know the schema (and change the code if you change the schema), but apart from that, in a stable environment, this really is helpful (it's also cached, so it's really fast, too).
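To make the "always FALSE or the part you asked for" behaviour concrete, here is a minimal sketch, written in Python rather than PHP and not taken from xSiteable itself. The names `resolve` and `after`, the nested dictionary, and the email address are all assumptions for illustration; the real framework does this through PHP magic methods against its database.

```python
# Hypothetical stand-in data; the real framework reads from a database.
DATABASE = {
    "users": {
        "ajohanne": {"email": "ajohanne@example.com"},
    },
}


def resolve(tree, *path):
    """Walk the tree one level at a time; return False as soon as a level is missing."""
    node = tree
    for step in path:
        if not isinstance(node, dict) or step not in node:
            return False
        node = node[step]
    return node


def after(text, pattern):
    """Return what follows the first occurrence of pattern in text, or False."""
    if not isinstance(text, str):
        return False
    index = text.find(pattern)
    if index == -1:
        return False
    return text[index + len(pattern):]


# Same shape as the users->ajohanne->email->__after ( '@' ) chain above.
domain = after(resolve(DATABASE, "users", "ajohanne", "email"), "@")
missing = after(resolve(DATABASE, "users", "nobody", "email"), "@")
```

The point of the sketch is only that a miss at any level of granularity falls through as a plain false value instead of an error, so the caller can check one result at the end of the chain.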

You might also have noticed ... users->ajohanne->email .... Where did that "ajohanne" bit come from? Well, as things are designed, again the framework will try to find stuff that isn't already found, so it will automatically look up "ajohanne" in designated fields. All objects that extend the framework have two very important fields, one being the integer primary identifier, the second one the qualified unique name (so not a normal name as such, but most often a computer generated one that isn't normally a number). Often systems will use things like a username, say, as a qualified name, and hence "ajohanne" was my username in one such system. Why do this?

Well, PHP is dynamic, so explicitly using 'ajohanne' as part of the query, as in my static example above, isn't the best way to go in more flexible systems; just pop your found user in dynamically instead ;
$domain = $this->database->users->$username->email->__after ( '@' ) ;
Easy. And this applies to all parts of the tree, so this works as well ;
$domain = $this->database->$some_table->$some_id->$some_field->__after ( '@' ) ;
Now, from the two examples above we might see a different pattern, too. All data parts have unrestrained names, all query operations use an underscore, and all string operations use two underscores. (__after is a shortcut for substr ( $str, strpos ( $str, $pattern ) + strlen ( $pattern ) ), and I've got a heap of little helpers like this built in.) Through this I always know what the type of the object interface is, and with PHP magic functions these types are easy to pull down and react to. As some of my objects are extendable, I need to pass _* and __* functionality up and down the object tree.

Traditionally, we use getters and setters ;
$u = $obj->getUsername() ;
$obj->setUsername ( $u ) ;
I turn them all into properties, so ;
$u = $obj->username ;
$obj->username = $u ;
But they are still full internal functions to the object, and this is another magic function in PHP ;
class obj extends xs_SimpleObject {
    function getUsername () { /* ... */ }
    function setUsername ( $value ) { /* ... */ }
}
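For readers outside PHP, the same property-to-accessor routing can be sketched in Python. This is an analog under assumptions, not the xs_SimpleObject implementation: the class names, the `_accessor` helper, and the normalising done in the setter are all made up for illustration. Python's `__getattr__` and `__setattr__` play roughly the role that `__get` and `__set` play in PHP.

```python
class SimpleObject:
    """Route attribute access to getX / setX methods when they exist."""

    @staticmethod
    def _accessor(prefix, name):
        # "username" -> "getUsername" / "setUsername"
        return prefix + name[:1].upper() + name[1:]

    def __getattr__(self, name):
        # Only called when normal attribute lookup has already failed.
        getter = getattr(type(self), self._accessor("get", name), None)
        if getter is None:
            raise AttributeError(name)
        return getter(self)

    def __setattr__(self, name, value):
        setter = getattr(type(self), self._accessor("set", name), None)
        if setter is None:
            # No matching setter: store the value as a plain attribute.
            object.__setattr__(self, name, value)
        else:
            setter(self, value)


class User(SimpleObject):
    def getUsername(self):
        return self._username

    def setUsername(self, value):
        # The setter is still a full function, so it can validate or normalise.
        self._username = value.strip().lower()


user = User()
user.username = "  AJohanne "
```

Reading `user.username` afterwards goes through `getUsername`, so callers see a property while the object keeps full control over what happens on each access.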
The framework isn't just about object persistence. In fact, it is not about that. I hate ORMs in the sense that they still drag your OO applications back into the relational database stoneage with some sugar on top. In fact what I've done is to implement a TMRM model in a relational database layer, so it's a generic meta model (Topic Maps) driving that backend and not tables, table names, lookup tables, and all that mess. In fact, crazy as it sounds, there are only four tables in the whole darn thing. I'm relying on backend RDBMS systems to be good at what they should be good at; clever indexes, and easier joins in a recursive environment (which, when all data is in the one table, it indeed is), where systems use filters to derive joins instead of doing complex cross-operations (which take lots of time and resources to pull off, and are the main bottleneck in pretty much any application ever created which has a database backend).

A long time ago I thought that the link between persistent URIs for identity management in Topic Maps and the URI (and using links as application state) in REST were made for each other, and I wanted to try it out. In fact, that fact alone was the very inspiration for me to do the 3rd generation of xSiteable, hacking out code that basically has one URI for every part of the Topic Map, for every part of the TM API, and for other parts of your application. Here are some sample URIs ;
At each of these there are GET, PUT, POST and DELETE options, so when I create a new prospect, it's a POST to http://mysite.com/prospect or a direct PUT to http://mysite.com/prospect/[new_id], for example.
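The POST-to-collection versus PUT-to-new-id distinction can be sketched with a tiny in-memory dispatcher. This is a Python illustration only, not xSiteable code: the `ResourceStore` class, the status codes chosen, and the id scheme are assumptions, and there is no real HTTP server or topic map behind it.

```python
import itertools


class ResourceStore:
    """Toy dispatcher: one collection per first path segment, one item per second."""

    def __init__(self):
        self._items = {}
        self._ids = itertools.count(1)

    def handle(self, method, path, body=None):
        parts = [p for p in path.strip("/").split("/") if p]
        if not parts or len(parts) > 2:
            return 404, None
        kind = parts[0]
        collection = self._items.setdefault(kind, {})

        if len(parts) == 1:
            if method == "POST":
                # The server picks the id; skip any id a client already PUT.
                new_id = next(i for i in map(str, self._ids) if i not in collection)
                collection[new_id] = body
                return 201, "/%s/%s" % (kind, new_id)
            if method == "GET":
                return 200, sorted(collection)
            return 405, None

        item_id = parts[1]
        if method == "PUT":
            # The client picks the id; create or replace.
            created = item_id not in collection
            collection[item_id] = body
            return (201 if created else 200), "/%s/%s" % (kind, item_id)
        if item_id not in collection:
            return 404, None
        if method == "GET":
            return 200, collection[item_id]
        if method == "DELETE":
            del collection[item_id]
            return 204, None
        return 405, None


store = ResourceStore()
posted = store.handle("POST", "/prospect", {"name": "Acme"})
put = store.handle("PUT", "/prospect/acme-2", {"name": "Acme 2"})
```

Either way in, the caller ends up with a URI for the new prospect, which is the only thing later requests need.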

All in all, this means I have many ways into the system and its data, none of them more correct than the other as they all resolve to topics in the topic map. This lowers the need for error checking greatly, and the application is more like a huge proxy for a Topic Map with a REST interface. It's a cute and very effective way of doing it. I'm trying various scaling tests, and with the latest Topic Maps distribution protocols that I can use for distributing the map across any cluster, it's looking really sexy (although I still have some work to do in this area, but the basics rock!).

Anyway, there's a quick intro. I guess I should follow this up with some more coded details of examples. Yeah, maybe next week, as I need to get some other stuff done now, but I like the object model I've got in place, and it's so easy to work with without losing the ability to do complex stuff. Take care.


7 September 2007

REST and SOA as a process for application design

I'm going to stray a bit from the library theme, and talk about design of RESTful SOA. It's a topic close to my heart, as most SOA talk these days is full of vendors claiming money can buy you not only love, but immortality. With SOAP? Hah!

No, I think reinventing what the Web does really well already is a) a waste of time, b) doomed to make a bad copy (as the web is constantly moving, while the SOAP / WS-* stack is immersed in slow-moving standards), and c) overcomplicating things (I like elegant simplicity such as the innards of the Web).


Roy Fielding's REST dissertation has swooped upon the middle and higher layers of the IT world lately, making a lot of them admit that, perhaps, this whole deal about using HTTP and loose XML (often XHTML) to create scalable, fast, simple and dynamic applications (well, as an architectural style, to be specific) might have something going for it. REST has been around for a long while, being the very fabric of what the internet is based on, slowly extended and refined over the last 15 years (even though a lot of these concepts are again based on earlier technology).

Service Oriented Architecture (SOA) is a little bit trickier to define, especially these days when big corporations have discovered it and use it as a buzzword, but basically it is technical architecture creating loosely coupled (meaning; the items in question know very little of each other) services, where a service is a piece of software that some other piece of software might use (as opposed to direct human usage). Now, a lot of people already talk about this stuff, so I'm not going to add to that. I'd rather talk about what I think when I do this stuff, to talk about actual implementation.

Working in both these worlds, putting them together to design and create applications, is quite different from the normal software development processes that are so popular these days. The most striking difference is that during application design you think in terms of resource orientation (as opposed to object orientation, or functional design) and how to represent services (as opposed to a program, or a module).

You can either plan a big-bang approach to this (standard waterfall models) per service, or you timebox a more agile approach of creating one or several services that do the simplest thing needed to service your proposed application. The world spins around the axis of identifying applications to solve problems; let's turn things around (and this is a big part of SOA) and see if we can come up with services that solve problems instead.

Typically you have a slew of applications that all have common functionality, such as user management, database storage, configuration, session handling, search and a few other bits and pieces depending on the business you're in. There are many ways to deal with reuse of these "things", and I deliberately call them "things" at this stage, because as soon as you call them "modules", or "libraries", or "reusable code" you're setting the scene for quite implementation specific stuff, such as what language you're going to use, or what platform it runs on. I don't want to deal with "libraries" for example, because if some library is written in Java then I need to make my other solutions in Java, too. If I have a "module" that does X in Windows using C#, the chance that "module" is linked to that technology is quite high.


No, I want to talk about "things". For example, let's talk about users. A lot of applications deal with users in some way or another, whether it's displaying information about them or for them, authenticating them, creating properties on them, or otherwise working with their user data. How can we create a service that applications might have good use for?

Since we meddle here in all things REST, the first thing we do is to think of the service in terms of resources (as being resource oriented is extremely important; expose URIs for every resource, as small / atomic as need be). I usually create two sub domains to hold services, one for internal behind the firewall services (soa.domain) and one external (ws.domain; 'ws' for web services), and I also try to have a trim set of basic elements that express generic functionality (search, user, session, database, properties, etc) wrapped in an even smaller and more generic set of domains (x, y, z, a, o, a, etc.). Through this, the first part of my design process is to play around with URIs and hierarchical taxonomical ideas to see what feels right ;


Balance this with ideas on premature optimization (what, you thought that was axiomatically bad? It's allowed to think about these things, you know :) in terms of request times for a domain (the more domains involved in a series of calls, the longer the overall response time, generally speaking) and what feels right.

In my case, the first one seemed the most right. I've developed a small set of root categories in which I "place" my services, such as /search, /publishing, /identity, and so on. These categories are not canon; they are placeholders for loose ideas and thoughts, bound to change in the future as your SOA evolves.


Evolution in your SOA is very important, so you should design with it in mind. For example, what about version control of services? Some talk about versioning being part of the XML schemas that services deliver, others talk about content negotiation (crazies :). I take a rather pragmatic and somewhat naughty approach (in the sense that you shouldn't put semantics in your URLs which humans will look at and try to pry apart and use / misuse) and put versioning into the URL at the base of the service defined. For example ;


I also set a rule to service development ; maintain backwards compatibility as far as you can. There's no need to ever update the version number if you design the XML schemas that pass through your services in smart ways, and this reduces the overhead of deployment, introspection and dependency. Another rule to service digestion is to only react to what you understand, and ignore all that you don't; this again enables backwards compatibility as you, say, add a new (but non-critical) element to your metadata which older service users don't understand and simply ignore.
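The "react to what you understand, ignore the rest" rule is easy to show from the consuming side. The following is a minimal Python sketch; the `<user>` document shape, the field names, and the added `<timezone>` element are assumptions invented for the example.

```python
import xml.etree.ElementTree as ET

# What this (older) client understands; everything else is skipped.
KNOWN_FIELDS = ("id", "name", "email")


def read_user(xml_text):
    """Pick out the known child elements and silently ignore unknown ones."""
    root = ET.fromstring(xml_text)
    user = {}
    for child in root:
        if child.tag in KNOWN_FIELDS:
            user[child.tag] = (child.text or "").strip()
    return user


# The service as the client was written against it ...
OLD_RESPONSE = "<user><id>42</id><name>Bob</name></user>"
# ... and after the service grew a new, non-critical element.
NEW_RESPONSE = (
    "<user><id>42</id><name>Bob</name>"
    "<timezone>Europe/Oslo</timezone></user>"
)
```

An old client built like this reads both responses identically, which is what lets the service add elements without a version bump.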

For proper development of a RESTful SOA, though, I'd suggest two things as a minimum ;
  1. use test-driven development for the service definitions (and use whatever methodology you like for the actual code for the service, although test-driven there too won't hurt you), so write your tests for your service (I use XPath with XSLT scripts for this) first and then develop the actual service until it passes all tests, and
  2. collect your services' tests into a large test suite ; whenever you add, subtract or change a service, make sure all tests pass. (If you can sneak this into a build farm of sorts, all the better. Automation for this type of development will probably save a lot of gray hairs) Through this you know what breaks and what's backwards compatible with your changes across the whole SOA. Don't deploy anything from development into test or production unless all tests pass. This is not a trivial task, and should be in the hands of someone who is full-time responsible for the SOA's well-being.
Now, in the evolution of SOAs as well as in nature, don't be afraid of screwing things up. We don't want perfection. We will never get perfection. And we certainly won't get anything near it in the first go. All these services must be allowed to change over time, dramatically at first, even to the point of deleting them completely and starting from scratch making something different. (In fact, I'd advocate making all these first-generation services with version number /v0-ALPHA/ in all caps, as in http://soa.example.com/identity/user/v0-ALPHA[/{userid}] ; this will mark them as experimental and trigger other developers to tread gently. If they work great, just update them to a /v1/ version.)
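A client or monitoring tool can pick the version straight out of a URL shaped like the one above. This Python sketch assumes the layout /category/service/version[/id] read off that single example; the regular expression, the function name, and the `experimental` flag are illustrative guesses, not a fixed scheme.

```python
import re
from urllib.parse import urlparse

# Assumed layout: /<category>/<service>/<version>[/<resource id>]
SERVICE_PATH = re.compile(
    r"^/(?P<category>[^/]+)"
    r"/(?P<service>[^/]+)"
    r"/(?P<version>v\d+(?:-ALPHA)?)"
    r"(?:/(?P<resource_id>[^/]+))?/?$"
)


def parse_service_url(url):
    """Split a versioned service URL into its parts, or return None if it doesn't fit."""
    match = SERVICE_PATH.match(urlparse(url).path)
    if match is None:
        return None
    info = match.groupdict()
    # Flag first-generation services so callers know to tread gently.
    info["experimental"] = info["version"].endswith("-ALPHA")
    return info


alpha = parse_service_url("http://soa.example.com/identity/user/v0-ALPHA/1234")
stable = parse_service_url("http://soa.example.com/identity/user/v1")
```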

Time management of this development is also important. Because services must be allowed to break, be allowed to screw up, we must also allocate time for these screw-ups to happen. Trust me, it's a good thing ; a smaller failure now ensures we don't screw up big time later. (And this very point is probably the cause of so much bad management and so many failed [enterprise] projects, as it's very easy to overlook or not take seriously enough. I can write a whole book on this topic alone!)

And people who have some sort of ownership of a service (as developers, or analysts, or whatever) must be given time for short iterative development, for little updates, modifications and tweaks. Services won't be successful if you treat them as small bangs (meaning; gather requirements, write spec, make it, sign it off), and probably can only work through continuous tinkering. Such tinkering doesn't have to be time-consuming nor difficult to manage, but it does require you to plan for it. When Bob goes on to his next project, remember that he also needs a half-day per week to tweak and fiddle with his service.


One feature that I can't emphasize enough is service introspection, an area that most writers I've seen gloss over. And sure, you don't need it in order to create a SOA or a web service. But I'll assert that you need it if you're a) smart and b) want to create a healthy SOA that can stand the test of time.

Introspection in my world does three important things ;
  1. Handle the client state through hyperlinks (part of the REST paradigm)
  2. Documentation of interface, use and dependencies
  3. Provide test suite
Asking a service for introspection in my world goes something like this ;


or, if you want to split the three up ;


1. Handling state of a client through hyperlinks is a somewhat forgotten part of REST, which is easy to miss when your design is at an early stage (and it usually stays that way because you don't think you need it by the stage you're made aware of it). It basically comes down to either URL-driven or FORM-driven hyperlinks that take you from whatever state the current URL gave you to the next one. For example, a resource soa.domain/search?q=fish might give back a list of URLs to pages of results, or a form to do a sub-search, all documented through hyperlinks. I personally think the use of XHTML is good for this, but a bit more formal and equally elegant is the use of the Atom Publishing Protocol (not to be confused with the Atom Syndication Format).
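As a rough illustration of URL-driven state, here is a Python sketch in which the search resource hands back a link to the next page and the client simply follows it. The `page` parameter, the `rel="next"` convention, and both function names are assumptions; only the soa.domain/search?q=fish address comes from the text.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def search_page(query, page, last_page):
    """Build a tiny XHTML-style response that links to the next state, if any."""
    root = ET.Element("div", {"id": "search-results"})
    ET.SubElement(root, "p").text = "Results for %s, page %d" % (query, page)
    if page < last_page:
        href = "http://soa.domain/search?" + urlencode({"q": query, "page": page + 1})
        link = ET.SubElement(root, "a", {"rel": "next", "href": href})
        link.text = "Next page"
    return ET.tostring(root, encoding="unicode")


def next_state(xhtml):
    """Client side: find where the service says we can go next, or None."""
    root = ET.fromstring(xhtml)
    for link in root.iter("a"):
        if link.get("rel") == "next":
            return link.get("href")
    return None


first = next_state(search_page("fish", 1, 3))
last = next_state(search_page("fish", 3, 3))
```

The client never builds the paging URL itself; it only knows how to recognise the link, so the service stays free to change how its URLs look.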

2. Documentation is important, and could be as easy as just returning an XHTML page with some text about what it is, how to use it, and so forth. However, I see a major part of documentation as to what dependencies the service has got, so I've got a section that looks a bit like this ;

<ul id="SOA-dependencies">
<li><a href="http://soa.domain/some_service/v2">Some service</a></li>
<li><a href="http://ws.google.com/wdsl/service/1.0">Some Google service</a></li>
</ul>

Notice that this is perfect XHTML. All that's required to understand this list is understanding the identifier for the list, the "SOA-dependencies", which I can locate easily through DOM or XPath. Through this mechanism in services you can now map the whole dang thing, plot in your dependencies, check it against your test suite (talked about earlier) for ultimate coolness and power.
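Pulling the dependency list out of such a page takes only a few lines. The Python sketch below assumes a well-formed page without an XML namespace (real XHTML usually declares one, which would change the tag names) and wraps the list in an invented `<div>` so there is something to search in; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical introspection page; only the <ul> comes from the example above.
PAGE = """<div>
  <h1>Some service</h1>
  <ul id="SOA-dependencies">
    <li><a href="http://soa.domain/some_service/v2">Some service</a></li>
    <li><a href="http://ws.google.com/wdsl/service/1.0">Some Google service</a></li>
  </ul>
</div>"""


def dependencies(xhtml):
    """Return the URLs listed under the element whose id is SOA-dependencies."""
    root = ET.fromstring(xhtml)
    for element in root.iter("ul"):
        if element.get("id") == "SOA-dependencies":
            return [a.get("href") for a in element.iter("a") if a.get("href")]
    return []


found = dependencies(PAGE)
```

Run this over every service's introspection page and you have the edges of the dependency graph for the whole SOA.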

In this section I might add that I often incorporate a ping parameter which testing and monitoring systems can use to check the health of a running SOA, something like ;


or, if you've got the RESTful chutzpah required, use the HTTP method OPTIONS instead of a GET on a URL. I actually do both. The HTTP response code hence talks about the generic health of the service as far as it knows, and you can use this info not only for monitoring and testing, but also for automatic systems and smart clients.

3. It may seem a bit strange to ask a service to give you a test-suite, but it actually is a very encapsulating and clever thing to do, making sure that tests are all handled at the same place where development takes place. I can do ;


and I'll get back something like this ;

<test name="My first test"
is-true="2456325786234985" />
... [more tests here]

Basic test-case skills are probably a plus at this point to understand what this is about, but basically we assert that the XML/XHTML that the URL returns will give the result "2456325786234985" when the XPath expression "/response/item[@name='user']/id" is run.

Your testing framework for the SOA simply collects these test files at intervals to build a larger test-suite that stands as the controller for the whole system.
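A collector for these test files could look roughly like the Python sketch below. It is built on assumptions: the `<tests>` wrapper, the `url` and `xpath` attribute names, and the example URL are guesses (only `name` and `is-true` appear in the snippet above), and `fake_fetch` stands in for a real HTTP call. Python's ElementTree supports only a subset of XPath, so the absolute path is handled by checking the root element first and is split naively on the first slash.

```python
import xml.etree.ElementTree as ET


def evaluate(xml_text, xpath):
    """Evaluate a simple absolute path like /response/item[@name='user']/id."""
    root = ET.fromstring(xml_text)
    first, _, rest = xpath.lstrip("/").partition("/")
    if first != root.tag:
        return None
    node = root if not rest else root.find("./" + rest)
    if node is None:
        return None
    return (node.text or "").strip()


def run_suite(suite_xml, fetch):
    """Run every <test> in a suite; fetch(url) must return the response body."""
    results = {}
    for test in ET.fromstring(suite_xml).iter("test"):
        actual = evaluate(fetch(test.get("url")), test.get("xpath"))
        results[test.get("name")] = (actual == test.get("is-true"))
    return results


# Assumed suite layout; url and xpath attribute names are hypothetical.
SUITE = """<tests>
  <test name="My first test"
        url="http://soa.domain/identity/user/v1/ajohanne"
        xpath="/response/item[@name='user']/id"
        is-true="2456325786234985" />
</tests>"""


def fake_fetch(url):
    # Stand-in for an HTTP GET against the service under test.
    return "<response><item name='user'><id>2456325786234985</id></item></response>"


results = run_suite(SUITE, fake_fetch)
```

Swapping `fake_fetch` for a real HTTP client and looping over every service's test URL gives the larger suite described above.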


Just a few finishing thoughts about rigidity, complexity and management of a RESTful SOA ;

If you don't have dedicated SOA people, then don't do it. If your people (developers, analysts, managers) aren't very flexible, then don't do it. If you don't understand REST, either really learn it (this book is the best there is on this subject!), or don't do it. If you think you need complex systems, don't do it. If you can't wrap your head around resource-orientation, then don't do it.

The thing is, you can perfectly well live without it, create SOA or some other well-meaning version of that concept with SOAP/WS-*/BPEL/ESB or whatever big vendors are more than happy to help you with. You can create POX services just fine. You won't be RESTful, but you will probably survive without it. You don't need it in as much as you can live on only water and bread for years and years, but of course I wouldn't recommend it. :)

Anyways, a few thoughts there on RESTful SOA design and implementation. I haven't dug into the semantics of modeling a full SOA yet, nor talked much about pipeline XML schemas (although the APP protocol is a good hint), system introspection through things like WADL, or even the hidden benefits of ROA (resource-oriented architectures). So. More to come, then. Until then, happy hacking.


28 March 2007

Quiet, oh so quiet ...

Those who wonder why I've been so quiet of late should go read this Lorcan Dempsey bit. Now you also know why I might be a bit quiet for a little while as well, as there are big deadlines and interesting stuff coming up. Watch this space.
