6 April 2010

Before I write what I write before the next time I write

It seems my last poll revealed that there are still people in the library world who haven't rejected me or, perhaps a stronger theory, who like to watch road accidents. So my next piece will be about why the library world fails so badly at technology and at seeing the future (or even its own relevance to it), but I'm somewhat busy these days with real work, so give me a few more days, ok?

However, all is not lost. I've got a few things to say about, well, the stuff I work with, that bucket I stick my head in every day to see if the crap I put in it yesterday has turned to gold yet. No luck so far.

There's a peculiar discussion going on on the Semantic Web mailing-list at the W3C, which Bernard Vatant can fill you in on. It's funny to watch: where are the success stories, where is the commercial viability, does it even work in academia, has it got traction, what do we do now? Why aren't more people doing it? Why hasn't the world adopted this specific and undoubtedly brilliant world-view yet? Are we all mad!?

I'm sure you can fill in our own Topic Maps echo here, but the more you dig, the more you discover that most of the sillies put up as reasons or scapegoats for the lack of world dominance are things that, frankly, the Topic Maps community figured out long ago, and some of the features missing from their world are dominant features in ours. And we haven't taken over the world, either. Bummer.

It's frustrating, I know, but what can we do? There's no amount of technological suaveness that can beat a status quo that feeds upon itself. No new ideas can beat old ones that seem to work, because, well, the definition of "works" is so multi-faceted and complex and, eh, makes lots of money for lots of people. The Semantic Web and Topic Maps don't make lots of money. Heck, they don't make money, period. They're convenient little technologies that will stay small and insignificant.

I have a plan, though, and it will piss off some of the Topic Maps purists (or, let's face it, even pragmatists) and hopefully some Semantic Web people as well. First, I'll rename it something cool - maybe something like NoSQL or something - and then rename the integral concepts, strip away the jargon, make it web-friendly by injecting it straight into HTML5-based technology, and relate all queries through SQL. Mwuahaha, I might even throw some REST APIs in there, just to stir it up some more. And I shall call it: the web.

Man, I hate these technical wars over standards and ways of doing things. The thing I love about Topic Maps isn't the standard or the specs. No, it's the thinking I'm forced to do in rejecting some parts, while loving others. It's what I take from it. It's the epiphanies it yields.

NoSQL? Semantic Web? Topic Maps? SQL? They're all just abstract interfaces into a set of memory positions shaped by various registers, stacks and pops. Standardizing our ways is just a step on the ladder of the future, not a platform upon which we have to stand firm.

Anyway, the whole NoSQL thing is something I'll have to write about more later. Right now dinner and kids and cleaning the house beckon.


20 March 2009

Resurrection: xSiteable Framework

I've just started in my new job (yes, more on that later, I know, I know) and was flipping through a lot of my old projects over the years, big and small, looking for an old Information Architecture / prototyping tool / website generator application I made with some help from IA superstar Donna Spencer (née Maurer) back when I lived in Canberra, Australia.

I found three generations of the xSiteable project. Generation 1 is the one a lot of people have found online and used, the XSLT framework for generating Topic Maps based websites. I meant to continue with generation 2, the xSiteable Publishing Framework (which runs the Topic Maps-based National Treasures website for the National Library of Australia) but never got around to polishing it enough for publication, and before I came to my senses I was way into developing generation 3, which I now call the xSiteable Framework (which sports a full REST stack and Topic Maps). And yes, I'm still too lazy to polish it enough for publication (which includes writing tons of documentation), at least as of now, but I showed this latest incarnation to a friend lately, and he said I had to write something about it. Well, specifically how my object model is set up, because it's quite different from the normal way of dealing with OO paradigms.

First of all, PHP is dynamic, and has some cool "magic" functions in the OO model which one can use for funky stuff. Instead of just extending the normal stuff with some extras I've gone and embraced it completely, and changed my programming paradigms and conventions at the same time. Let's just jump in with some example code;
// Check (and fetch) all users with a given email
$usercheck = $this->database->users->_find ( 'email', 'lucky@bingo.com' ) ;
Tables are contextually defined in databases, so $this->database->users points directly to the 'users' table in the database. (Well, they're not really table names, but for this example it works that way.) The framework checks all levels of granularity, and will always return FALSE or the part-object you want, so for example ;
// Get the domain of a users email address
$domain = $this->database->users->ajohanne->email->__after ( '@' ) ;
Again, it's like a tree-structure of data, a stream of granularity to get in and out of the data. This does require you to know the schema (and change the code if you change the schema), but apart from that, in a stable environment, this really is helpful (it's also cached, so it's really fast, too).
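As a sketch of how that chained lookup could hang together, here's a minimal stand-in built on PHP's __get magic method; the class name xs_Node and the in-memory array are my inventions for illustration, not the actual framework internals ;

```php
<?php
// Hypothetical sketch of chained lookups via __get. xs_Node and the
// sample data are stand-ins, not the real xSiteable classes.
class xs_Node {
    private $data;
    public function __construct(array $data) {
        $this->data = $data;
    }
    // __get fires for any undefined property: descend one level into
    // the tree, or return FALSE when the key doesn't exist.
    public function __get($name) {
        if (!array_key_exists($name, $this->data)) {
            return false;
        }
        $value = $this->data[$name];
        return is_array($value) ? new xs_Node($value) : $value;
    }
}

// A tiny in-memory "database": database -> table -> row -> field
$db = new xs_Node(array(
    'users' => array(
        'ajohanne' => array('email' => 'lucky@bingo.com'),
    ),
));

$email   = $db->users->ajohanne->email;   // 'lucky@bingo.com'
$missing = $db->users->nobody;            // FALSE
```

Note that the dynamic $username variant later in the post comes for free here, since $db->users->$username is just another __get call.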

You might also have noticed ... users->ajohanne->email ... Where did that "ajohanne" bit come from? Well, as things are designed, the framework will again try to find stuff that isn't already found, so "ajohanne" will automatically be looked up in designated fields. All objects that extend the framework have two very important fields: one is the integer primary identifier, the second the qualified unique name (not a normal name as such, but most often a computer-generated one that isn't a number. Systems will often use something like a username as a qualified name, and hence "ajohanne" was my username in one such system). Why do this?

Well, PHP is dynamic, so in my static example above, explicitly using 'ajohanne' as part of the query isn't the best way to go in more flexible systems; just pop your found user in dynamically instead ;
$domain = $this->database->users->$username->email->__after ( '@' ) ;
Easy. And this applies to all parts of the tree, so this works as well ;
$domain = $this->database->$some_table->$some_id->$some_field->__after ( '@' ) ;
Now, from the two examples above we might see a different pattern, too. All data parts have unrestrained names, all query operations use a single underscore, and all string operations use two underscores. (__after is a shortcut for substr ( $str, strpos ( $str, $pattern ) + strlen ( $pattern ) ), and I've got a heap of little helpers like this built in.) Through this I always know what the type of the object interface is, and with PHP magic functions these types are easy to pull down and react to. As some of my objects are extendable, I need to pass _* and __* functionality up and down the object tree.
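To make the underscore convention concrete, here's a hedged sketch of how double-underscore string helpers could be routed through PHP's __call magic method; xs_Value and the helper bodies are my guesses, not the framework's actual code ;

```php
<?php
// Hypothetical routing of __-prefixed string helpers through __call.
// Since __after()/__before() aren't defined as methods, PHP hands the
// call to __call, where we can dispatch on the name.
class xs_Value {
    private $str;
    public function __construct($str) {
        $this->str = $str;
    }
    public function __call($name, $args) {
        $pattern = $args[0];
        if ($name === '__after') {
            // Everything after the pattern; + strlen so the pattern
            // itself isn't included in the result.
            return substr($this->str, strpos($this->str, $pattern) + strlen($pattern));
        }
        if ($name === '__before') {
            // Everything before the pattern.
            return substr($this->str, 0, strpos($this->str, $pattern));
        }
        return false;
    }
}

$email  = new xs_Value('lucky@bingo.com');
$domain = $email->__after('@');    // 'bingo.com'
$user   = $email->__before('@');   // 'lucky'
```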

Traditionally, we use getters and setters ;
$u = $obj->getUsername() ;
$obj->setUsername ( $u ) ;
I turn them all into properties, so ;
$u = $obj->username ;
$obj->username = $u ;
But behind the scenes they are still full internal functions on the object, courtesy of another set of magic functions in PHP ;
class obj extends xs_SimpleObject {
    function getUsername () {
        ...
    }
    function setUsername ( $value ) {
        ...
    }
}
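For completeness, here's one guess at what a base class like xs_SimpleObject might do to make that work: route property reads and writes through __get/__set to any matching getX()/setX() pair. The internals here (the $data array, the ucfirst naming rule) are assumptions, not the real framework ;

```php
<?php
// Hypothetical base class: property access falls through to getter/
// setter methods when they exist. Not the actual xSiteable code.
class xs_SimpleObject {
    public function __get($name) {
        $getter = 'get' . ucfirst($name);
        return method_exists($this, $getter) ? $this->$getter() : false;
    }
    public function __set($name, $value) {
        $setter = 'set' . ucfirst($name);
        if (method_exists($this, $setter)) {
            $this->$setter($value);
        }
    }
}

class obj extends xs_SimpleObject {
    private $data = array();
    function getUsername() {
        return isset($this->data['username']) ? $this->data['username'] : false;
    }
    function setUsername($value) {
        $this->data['username'] = $value;
    }
}

$o = new obj();
$o->username = 'ajohanne';   // goes through __set, then setUsername()
$u = $o->username;           // goes through __get, then getUsername()
```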
The framework isn't just about object persistence. In fact, it is not about that at all. I hate ORMs in the sense that they still drag your OO applications back into the relational database stone age with some sugar on top. What I've done instead is to implement a TMRM model in a relational database layer, so it's a generic meta model (Topic Maps) driving that backend, and not tables, table names, lookup tables, and all that mess. In fact, crazy as it sounds, there are only four tables in the whole darn thing. I'm relying on backend RDBMS's to be good at what they should be good at: clever indices, and easier joins in a recursive environment (which, when all the data is in the one table, it indeed is), where the system uses filters to derive joins instead of doing complex cross-operations (which take lots of time and resources to pull off, and are the main bottleneck in pretty much any application ever created with a database backend).
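The post doesn't spell out what the four tables are, so I won't guess at them, but the filter-instead-of-join idea itself can be sketched in a few lines: keep all statements in one recursive list, and derive a "join" from two successive filters. Every field name and value here is hypothetical ;

```php
<?php
// Hypothetical single-statement-table model: a "join" is two filters
// over the same list rather than a cross-table operation.
$statements = array(
    array('subject' => 'ajohanne', 'property' => 'type',  'object' => 'user'),
    array('subject' => 'ajohanne', 'property' => 'email', 'object' => 'lucky@bingo.com'),
    array('subject' => 'booking1', 'property' => 'type',  'object' => 'booking'),
    array('subject' => 'booking1', 'property' => 'owner', 'object' => 'ajohanne'),
);

// Filter 1: every subject of type 'user'
$users = array();
foreach ($statements as $s) {
    if ($s['property'] === 'type' && $s['object'] === 'user') {
        $users[] = $s['subject'];
    }
}

// Filter 2: bookings owned by any of those users - the derived "join"
$bookings = array();
foreach ($statements as $s) {
    if ($s['property'] === 'owner' && in_array($s['object'], $users)) {
        $bookings[] = $s['subject'];
    }
}
```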

A long time ago I thought that persistent URIs for identity management in Topic Maps and the URI (and using links as application state) in REST were made for each other, and I wanted to try it out. In fact, that alone was the very inspiration for me to do the 3rd generation of xSiteable, hacking out code that basically has one URI for every part of the Topic Map, for every part of the TM API, and for other parts of your application. Here are some sample URIs ;
http://mysite.com/prospect/12
http://mysite.com/api/tm/topics/of_type:booking
http://mysite.com/admin/db/prospects
At each of these there are GET, PUT, POST and DELETE options, so when I create a new prospect, it's a POST to http://mysite.com/prospect or a direct PUT to http://mysite.com/prospect/[new_id], for example.
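A rough sketch of that method-per-URI dispatch might look like the following; the dispatch() helper, the handler table and the handler names are all hypothetical stand-ins, not xSiteable's actual API ;

```php
<?php
// Hypothetical dispatcher: each resource answers to GET/PUT/POST/DELETE,
// and unhandled combinations fall through to a 405.
function dispatch($method, $path, array $handlers) {
    // e.g. '/prospect/12' -> resource 'prospect', id '12'
    $parts = explode('/', trim($path, '/'));
    $resource = $parts[0];
    $id = isset($parts[1]) ? $parts[1] : null;
    if (isset($handlers[$resource][$method])) {
        return call_user_func($handlers[$resource][$method], $id);
    }
    return '405 Method Not Allowed';
}

$handlers = array(
    'prospect' => array(
        'GET'  => function ($id) { return "fetch prospect $id"; },
        'POST' => function ($id) { return "create new prospect"; },
    ),
);

echo dispatch('GET', '/prospect/12', $handlers);   // fetch prospect 12
```

In a live setup $method and $path would come from $_SERVER['REQUEST_METHOD'] and $_SERVER['REQUEST_URI'] instead of being passed in by hand.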

All in all, this means I have many ways into the system and its data, none of them more correct than the others, as they all resolve to topics in the topic map. This greatly lowers the need for error checking, and the application becomes more like a huge proxy for a Topic Map with a REST interface. It's a cute and very effective way of doing it. I'm trying various scaling tests, and with the latest Topic Maps distribution protocols, which I can use to distribute the map across any cluster, it's looking really sexy (although I still have some work to do in this area, but the basics rock!).

Anyway, there's a quick intro. I guess I should follow this up with some more coded details of examples. Yeah, maybe next week, as I need to get some other stuff done now, but I like the object model I've got in place, and it's so easy to work with without losing the need for complex stuff. Take care.


10 October 2008

Keynote speaking at TMRA 2008

Oops, I totally forgot to mention to the world that I'm the intro keynote speaker at the TMRA 2008 conference (one of the two yearly Topic Maps conferences) in Leipzig next week (15-17 October). My talk is titled "We're all crazy - subjectively speaking" and will contain at least one bad joke, two pretty good ones, some philosophical ranting and hopefully lots of community building. I really, really hope to see you there; find me, say hello, let's have tea and discuss whether my two jokes really were good or not.

The big question is, how did I forget to tell you about this? I'll let you know that in a few days time or so.


29 April 2008

What's wrong with Topic Maps people and their tools?

Hoo boy, I'm feeling a bit testy today, but seriously; how many who read this blog know the difference between an association type, an association role type, and a reified topic type? When we humans try to put into "words" - in a Topic Map, as we're dealing with here - our various models, do you know when to create a new topic type, how to use role types in multi-member associations of given types, what a "type" really is, and what an "occurrence" of anything really is all about? And - more importantly! - do you really need to know?

The Topic Maps paradigm takes a good while to sink into your brain. It takes a few goes and lots of patience to get what it is really all about, and when you get there you discover - surprise, surprise! - that "topic maps" are irrelevant to the real nugget of wisdom within; shareable models.

Here's a model of me. Does it seem right to you?

No, not the glossy models on the covers of magazines, but the kind of models the brain uses to shape an immensely complex reality for a brain that can only hold bits of small distorted representations at a time. Here's a piece of this, a representation of that, held together with some contextual duct tape, and voila! you've got some "kinda" understanding of how the world works. Kinda.

The world does not want to be modeled, and anyone who has ever attempted ontology work (another phrase that means everything and nothing at the same time, and which - of course - I use all the time!) soon discovers how impossibly crazy it is to model anything to any degree of correctness. We, as human beings, see this infinite tool for modeling our world, and fall promptly into the trap of actually trying it out. In some ways the amazing things you can do with Topic Maps are its own death! Don't try to model anything to any degree of detail; there's crazy in there.

We model all the time, but until I learned all the ins and outs of Topic Maps I didn't have the words for what I was doing, nor did I have the knowledge that I was even doing it. But I was, and I still am. You are, right now even by just reading these words. Just by writing this blog posting I'm modeling my opinion, phrasing it through prose, interacting with my computer, and posting it contextually, all in the hope of you understanding this model. I have a purpose, a model in my head if you will, of what this message is and how it should come across. The fun is, of course, that no matter how accurately and carefully I try to communicate the model in my head to you, every single person who ever comes across these words will have their own unique special - and, dare I say, still correct - version of it. Every single sound, vision, thought and feeling ever expressed is uniquely interpreted.

So, maybe I describe my model to the best of my ability in an attempt to remove tacit knowledge surrounding my message. Maybe I say things like "I use my blog to post this message, choosing a title that might catch your attention, and I'm talking about more important issues than a selected technology; human perception, communication between us in a way that is as close to truth as we can hope to achieve, that what we all do is communicate models from our brain to, hopefully, the next one. And of course, Topic Maps is one really good way of doing this." That's my mental model for writing this stuff, but you can be quite sure I've forgotten to say bucket loads of stuff, neatly hidden away in prose, innuendo, lost context and the mere fact that my brain is impossible to model, much less understand. These are tremendously complicated things that are, by their very nature, bloody impossible to get to! It just can't be done!

But, because it looks like it has some element of truth to it, we pursue it as if it is the truth. It looks like we can model things well, so we try to model things well. Ouch. Bummer. No, don't do that; you need to have less precision in your models for them to actually be taken seriously, at least by people. But let's talk about computer to computer communication. Surely there we can have precision?

This is the area which for me really brings Topic Maps into its own, where we - as digital communicators - create artificial models which we can attach our data to, and share around. Forget all this human knowledge stuff; if anyone who is to model their idea needs to know what an association type is, forget it. Sure, we allude to what it might be, hold a course in it, or explain it at length, and then we throw ourselves heavily into the Topic Maps Data Model and explain that it is just another model that's mapped to the Topic Maps Reference Model, which is a model framed in frames theory (or thereabouts), which is a model of key/value pairs in a table setting, which is a model of simplistic systems, which is in a computer model setting, which is a digital model of ... and so on. Models, models, models. And you know what? There's translation and interpretation at every single step of the way.

Steve Pepper claims that Topic Maps are great because Topic Maps are closer to how the brain works than other means of mapping information, and I agree with him, but only in the "sure thing" kind of way, not the "correct" kind of way. The human brain doesn't think about how it models things. In it, there are no kinds, or types, or roles, or occurrences, or reification, or identifiers of any kind. So let's agree that Topic Maps, at the moment, is perhaps the closest standardized thing we've got, in computer terms, to the way the brain works.

Translations. That's what it's all about. I say "hei", you say "hi", she says "hola", he says "yo." Some use Java, some use C#, some PHP or Perl. Business people use business speak, designers talk the talk, programmers code. There's translations up and down and back and forth between them all, and then some. And wherever there is translation, there's a margin of error that gets higher the further the translated models are from each other. No wonder there's problems in the world.

As technologists we care a lot about these things, and from a technical standpoint, Topic Maps kicks ass! Seriously! But I've worked extensively with Topic Maps over the years, and if there's one thing I've learned it is that people don't give a toss about what the underlying technology is, and they certainly don't care what type or association concept any relationship might be. When people perk up about Topic Maps, it's not because of the data model but because of some underlying modeling ideas, the promise of sharing models and bringing tacit knowledge up from traditional taxonomic and document-centric ways of dealing with "knowledge" in computer systems. They do not care about how to "map" their concepts, and seriously, don't care whether the modeling tools are standardized, shareable or not. They care about the models, for sure, but only as a conceptual thing they wish to share. So let's forget the technology and the data model and Topic Maps, because it really doesn't matter.

What matters is that communication happens, and it doesn't happen on the Topic Maps Data Model level. There is a higher, sloppier, fuzzier level that we humans live on, and that's the model we should try to get closer to. Sure, we can create cool systems using the Topic Maps standard to do all this, but it sucks as a platform of expression! It really, really sucks! Try right now to model the simple concept which is "freedom"; what are your topics, associations, role types, occurrences? One can appreciate that we can try, but there's no one answer for how to do this. The TMDM does not support human thinking, don't let yourself be fooled; it can only represent some misguided attempt at jotting it down in some machine-exchangeable way.

Topincs, Omnigator and the other tools we Topic Mappers give to people who are to model things are something only a technologist can love, and, in addition, a technologist who understands all the ins and outs of the Topic Maps Data Model. This is a huge limitation! People, the real group we have been trying to sell this concept to for years, just can't wrap their heads around the Data Model to such a degree as to conceptualize their models! It's lunacy to think so, but of course, the Topic Maps community is chock full of technologists, so that's kinda expected. But I really wish we had outside help. The Data Model is there for technologists and tool-makers, not people.

Can we rethink this part of the problem? I'm often embarrassed to give these tools to people, as it runs extremely counter to the claims we lay down about the greatness of Topic Maps. Don't get me wrong; TM is fantastic stuff, but the tools suck for normal people, and by "normal people" I mean almost anybody but us.

We need to be even more human in our approach to knowledge representation. Topic Maps is a good foundation to build our systems on, but it sucks as a knowledge representation system for humans. What can we do?
