6 April 2010

Before I write what I write before the next time I write

It seems my last poll revealed that there are still people in the library world who haven't rejected me, or, perhaps a stronger theory, who like to watch road accidents. So my next piece will be about why the library world fails so badly at technology and at seeing the future (or even their own relevance to it), but I'm somewhat busy these days with real work, so give me a few more days, ok?

However, all is not lost. I've got a few things to say about, well, the stuff I work with, that bucket I stick my head in every day to see if the crap I put in it yesterday has turned to gold yet. No luck so far.

There's a peculiar discussion going on in the Semantic Web mailing-list at the W3C, which Bernard Vatant will fill you in on. It's funny to watch; where are the success stories, where is the commercial viability, does it even work in academia, has it got traction, what do we do now? Why aren't more people doing it? Why hasn't the world adopted this specific and undoubtedly brilliant world-view yet? Are we all mad!?

I'm sure you can fill in our own Topic Maps echo here, but the more you dig, the more you discover that most of the sillies put up as a reason or a scapegoat for the lack of world dominance are things that, frankly, the Topic Maps community figured out long ago, and some of those missing features in their world are dominant features in ours. And we haven't taken over the world, either. Bummer.

It's frustrating, I know, but what can we do? There's no amount of technological suaveness that can beat a status quo that feeds upon itself. No new ideas can beat old ones that seem to work, because, well, the definition of "works" is so multi-faceted and complex and, eh, making lots of money for lots of people. The Semantic Web and Topic Maps don't make lots of money. Heck, they don't make money, period. They're convenient little technologies that will stay small and insignificant.

I have a plan, though, and it will piss off some of the Topic Maps purists (or, let's face it, even pragmatists) and hopefully some Semantic Web people as well. First, I'll rename it something cool - maybe something like NoSQL or something - and then rename the integral concepts, strip away the jargon, and make it web-friendly by injecting it straight into HTML5-based technology, and relate all queries through SQL. Mwuahaha, I might even throw some REST APIs in there, just to stir it up some more. And I shall call it ; the web.

Man, I hate these technical wars over standards and ways of doing things. The thing I love about Topic Maps isn't the standard or the specs. No, it's the thinking I'm forced to do in rejecting some parts, while loving others. It's what I take from it. It's the epiphanies it yields.

NoSQL? Semantic Web? Topic Maps? SQL? They're all just abstract interfaces into a set of memory positions shaped by various registers, stacks and pops. Standardizing our ways is just a step on the ladder of the future, not a platform upon which we have to stand firm.

Anyway, the whole NoSQL thing is something I'll have to write about more later. Right now dinner and kids and cleaning the house beckon.


19 March 2010

Newbie tips for Topic Maps bliss

Well, hi there, pilgrim! So, you've noticed this fandangled thing called "Topic Maps", and you heard it was an interesting, smart or new way of solving hard problems of some sort? Well, you've come to the right place. Let me, as an elder of the movement, give you a few hints and tips on how to go about this complex notion ;

1. Don't do it

Yeah, I won't lie to you; unless you know more than a smidgen about information science and / or knowledge management, and especially unless you know quite a bit about data models and how to interact with them in complex computer systems, I'd urge you to stay well clear of it. Topic Maps is full of complex models, silly jargon, weird people and technical APIs. Nothing in the normal world is easy, and the Topic Maps world only makes it harder. Topic Maps won't solve your problem, unless you already know how to solve it, at which point find some other technology that people actually know, ok? What's the point of building the best system out there with amazing technology that no one knows how to use, extend, appreciate or even keep a straight face while talking about?

Also, at what technical level do you think you need it? If the answer is not very technical at all, why care about the underlying technology? An Excel spreadsheet might fix your problem much better. Use that. Unless you know that multi-dimensional graph-based technology will save you, look the other way. If you don't know this stuff, it will lure you in with magical, wistful promises of a better tomorrow that will never see the light of day.

2. Make someone else do it if you must

Ok, so this one isn't all that different from the first tip, but since Topic Maps indeed can solve hard problems in brilliant ways, there are people out there who could use it and help you solve them. The truth is that people who are steeped in this stuff, who know it inside out, can indeed solve pretty much any complex issue you might have with it, and even help you become smarter in doing it, even teach you how it all works. And there are other benefits to letting them do it for you; you don't have to become one of them in the process.

The truth is that Topic Maps really is cool and brilliant and all that, but it is ridiculously hard to grasp and even harder to master, and it will change you into a weirdo in the process. Have you really got the time, resources and personality traits it takes to get into this stuff? Really?

3. Topic Maps people are few and rare, and, uh, strange

I kid you not; these guys are not your average cup of tea, so tread gently, and expect to be surprised in some way or another. Expect them to say things that make no sense whatsoever; they have their own language littered with technical jargon even technologists wouldn't understand. I don't think they get out much, at least not outside their own field, so you need to translate their terminology into your own if they are to be made sense of, as they themselves rarely compromise and adapt to how the rest of the world sees things.

Also, expect some slight geeky behavior, like mistaking things like pages, tags, websites, business objectives, servers, networks, computers and pasta for topics, topics, topics, topics, topics, topics and topics, in that order. It's quite similar to Asperger's syndrome, and if you know how, you can use it to your advantage, but caution and patience must be urged, and just like with the real thing there is unfortunately no cure, only workarounds.

4. Patience is not a virtue, but stubbornness might be

Look, the mystical world of Topic Maps is full of concepts you never dreamed existed, things that make you ask fundamental questions about identity and philosophy, about what knowledge is, and about paradigms of models and technological culture. Patience is not enough to grasp this stuff; you need sheer stubbornness and bloody-mindedness to get anywhere, and - dare I say it? - perhaps some weird personality trait. Maybe a limp or a monocle. Not only do you have to understand the technicality of the stuff, but also the weirdness of the culture itself and - perhaps even more important - the personal implications this knowledge might have upon your own thought processes. You may end up getting a cape.

Once you tread down the path of Topic Maps and actually get anywhere (and that in itself is a hallmark of your stubbornness), your brain will change; you will see things differently. I'm not going to say that that is a good thing, but it can be, especially if you like uprooting your preconceived notions and planting new ones. The world is built on foundations far removed from the Topic Maps world, but once you grasp this other world it is hard not to see your old world in a new light, and this can be challenging. You might even start to sound like one of these blubbering idiots yourself, saying topics, topics, topics, topics, topics when you used to speak a coherent language people around you actually understood. There's great danger in getting an epiphany or two.

5. If you like your job, stay clear

Hmm, I see a pattern in my tips, but the thing is that once you have converted, your old job might look boring and infantile by comparison. You might get (morally repulsive) urges to work on Topic Maps, but your organisation probably won't understand what the hell you're on about (remember, the whole painful stubborn process you went through has to happen to each and every person in your whole organisation!), and you might start looking around for another job where the Topic Maps goodness is practiced. Don't be fooled!

These jobs don't really exist. No one thinks Topic Maps on your CV is a good thing, because they, too, haven't walked that stupid painful stubborn path to enlightenment, and you'll come across as a bit of a show-off with nothing to show for it. (The exception to this tip is if you live in Norway. If you want to know why, the answer is that, again, the Topic Maps culture is repellingly weird.) No one who does real business gives a rat's ass about Topic Maps, and no one who does real business in the future will either. Even people who do weird but similar things and have modest success (like people doing Semantic Web / RDF work) shun Topic Mappers. For your own job security, stay clear.

6. However ...

However, if you are weird, not scared by overly complex or outlandish technologies, if you think that strange new cultures only make you stronger (and you've got a strong immune system to boot), if you think job security is only the stuff of boring people, and, indeed, if you have a monocle, cape and a glass eye, perhaps this might be the place for you after all.

And if so, contact me; I get off on this stuff. Otherwise, you have been warned.


4 February 2010

Topic Maps, 10 years down the line

I'm told, by way of my own imagination based on loose rumors put out by flying pink fairies, that Topic Maps is a waning technology, poorly supported by the IT industry at large, hard to wrap your head around, and generally icky to deal with.

All of this is, unfortunately, true.

But, as in all stories told by only one side, there is another side just waiting to come out into the light, just one day, real soon now. This day may never come, but here is my own little attempt to shed some light on a few of the issues with the Topic Maps world. It was about 10 years ago I first got a whiff of Topic Maps, so it seems fitting that my first post in 2010 takes some Topic Maps rumors, loose observations and vague statements, and makes some comments along the way. Here we go ;

1. Topic Maps are hard

Why, yes, to a commoner or some person with a somewhat traditional approach to computing, Topic Maps can indeed seem like an alien concept at first. The first time I started reading up on it I was mesmerized and frightened at the same time, wondering where the magic would bring me and just how painful it would be for me when reality would kick in (and me) ; there were new notions and concepts, new words, new paradigms everywhere! Reification, role types, associations, occurrences, occurrence types, typified information, subjects and topics, ontologies (upper, lower, specialized ones); the list goes on. It is terrifying indeed, and for many, many people it is so terrifying that SQL and C# and .Net and C and PHP seem like a comforting auntie lulling you back into things we know and know well, no hard thinking required (just lots of hair to pull out).

Until you realize a few things, that is. For example, the vocabulary is anchored in information science, and with a bit of research or learning it shouldn't take that long to get familiar with it. Even the complex issues of reification and ontologies will after some time be as normal and self-explanatory as second cousins and language. (And yes, there is a correlation between the examples given! See if you can find it!) And perhaps more importantly, the problems you can solve with Topic Maps can completely and utterly eradicate the major problems those traditional methods give us, some of the biggest bug-bears I've ever had! (Anyone wish to offer me a book deal on how to solve most of the main IT development problems in seriously interesting ways? :)

Can I just mention that having a small epiphany about Topic Maps has the effect that you never return to the real world and look at it the same way, ever again? I have never met a person who got Topic Maps and returned to the old ways, at least not without making huge compromises. Getting it will change you in good ways, and is most definitely worth the effort despite the pain.

Tips to newbies: It's not really hard, even if it seems hard. But it requires you to change your mind on some key issues.

2. Topic Maps are poorly supported in the real-world

Oh yes, indeed. If you talk to anyone, any company in your immediate serenity (yes, a tautological pun) and ask them about their use of Topic Maps, you'd most likely get a blank stare back and a careful "What would we need maps for?"

There's the odd technically-inclined person who might know a toddle about what these fabled Topic Maps are all about, but very, very few people understand what they are, and even fewer have implemented them into something useful. (The exception to this is, oddly enough, the country of Norway, and some scantily-clad areas of southern Germany.) No mainstream software package comes with the stuff wrapped in, no word-processor touts its amazingness, no operating system comes with support for it, and no popular software of any kind uses it.

But then, there's the odd system that uses it. You'll find it also in the odd Norwegian government portal, which is bizarre in its own right, and perhaps deep down in some underfunded academic project, or perhaps some commercial project where parts of the data-model masquerade as it. My old website uses it. I have a framework or two. There's the odd other open-source project, a few APIs, and a host of other well-meaning but obscure projects that perhaps have got it, albeit well hidden and kept away from children.

For a technology that stands out as something that can fix it all, I find it bizarre that it is found so seldom, but then bizarre is not the same as surprised. And when you look at the "competition", the well-funded, well-marketed, well-established world of the Semantic Web, championed by none other than the W3C and Tim Berners-Lee, well, you have to concede that it shouldn't be much of a surprise at all, really. Topic Maps is a tiny group of enthusiasts (a few hundred, being liberal with statistics) who'll saw off their right leg if it meant we could get the specs done in time, while the Semantic Web world is littered with academia, organisations and companies (we're talking thousands upon thousands of people actively working on it), so no, you should not be surprised.

Tips to newbies: As the saying goes, if a million flies eat it ... surely, it has some nutritional value or greater worth over, say, that green grass the cows are dumping it on?

3. Topic Maps is dying and obsolete; use RDF instead

There was a period about 10 years ago which I regard as the Topic Maps time of bloom ; the trees had beautiful flowers on them, the pink and purple petals falling over the world of IT like a slow-motion rainfall of beauty. Everywhere you turned there were people talking about it, and potential projects popping up all the time.

But time went by. Topic Maps was too hard for most (see points 1 and 2), and not just the technical implications themselves and the language and terms used, but also the philosophy of it, the very idea of why we should be using it over, say, any relational database or traditional software stack. I mean, what's the point, really?

The point is easy to miss, admittedly. A technology that can be used for everything is hard to pin down and call good for something. And we have focused just too damn much on knowledge management systems, and not only that, but we've used our own special language in the process, which often is quite remote from knowledge management speech in the enterprise arena (though you find it rife in academia). When the world looks to Topic Maps, all they see is a difficult way to do knowledge management. Ugh.

Myself, I'm using Topic Maps in highly non-traditional ways. I use maps for my application (definitions, actions and functionality), for functional topology (generic functionality in hyper-systems based on typification), for business logic (rules, conditions, interactions) and, perhaps just as important, for the actual development itself (modules and plugins, deployment, versioning, services) which makes for a highly (and this "highly" is quite higher than any normally used "highly") customizable and flexible framework for making great semantic applications. But more on the details at some later stage.

Tips for newbies: No, it's not dead nor dying, just not as popular as stuff that's easier or more accessible.

4. Topic Maps is nothing new

Well, given its roughly 20-year history (and I'm counting from the early days of HyTime), in Internet years it's an old, old dog, so by that alone we can't say there's anything new, but most people would mean "new" here to mean something like "we've been doing X for years, so why do we need this?", where X usually points to some bit of the Topic Maps paradigm that indeed has been done before. Of course it has. There is nothing new in Topic Maps except, of course, putting it all together and standardizing one cohesive and complete way of doing pretty damn most of what you would need for your complex data-model, identity management, semantic or otherwise relational, interoperable information and / or structural needs, chucking in knowledge management, too, for good measure.

There is of course nothing new with Topic Maps, except that all that old stuff is bundled into a new thing, if you allow a 20-year-old standard to be called "new." But then again, "the standard" is really a family of standards, all evolving and changing with the times. There's always a sub-standard (no pun intended ... well, not a lot of pun intended) in the woodworks, always some half-baked document to explain something or other, always something that is so damn specific and concise that the overall grooviness and funky bits are pushed to the side-lines.

Topic Maps is new and old at the same time, but it really is groovy and funky once you overcome the technical jargon and the concise nature of the standards.

Tips to newbies: The king is dead. Long live the king!

5. The Topic Maps community is, um, a bit tricky

Oh, yes indeed. And this one is the hardest to write about as I'm part of this community and know pretty much everyone, some more than others.

So let's say it this way; I'm a difficult person in certain ways, for example I talk a lot, I overflow with ideas rather than code, I don't care too much about political correctness, and I speak my mind and use language that could alienate people with too strong attachments to their ties or their social buckets.

And the core of the Topic Maps community is loaded with weirdos like me; highly opinionated, rough ideas, hard on woo, and soft on business. But the problem isn't the weirdos, it's the low number of them. For any successful community covering such a wide-ranging and all-encompassing area as what Topic Maps is all about (which is, uh, almost anything), going from epistemology to identity management to ontology work, well, you need a lot of personalities to match them all to make it seem like a lively place. We, on the other hand, have a handful of people, and the contrast between us all is sometimes just too great. And, I've noticed, we're not very good with newbies, either, so even if we answer their questions, quite often our answers are just too far out there for normal people to comprehend (and I've got a ton of circumstantial and anecdotal evidence to back that up).

I'm part of many different communities on the web, but there is only one champion of how fast an online discussion goes private (and it's not of the good kind; it's the kind where we need to express our frustrations in private [because, ultimately, we're nice people who don't want to offend anyone even when they deserve it, those bastards], lest we blow up and our eyes will bleed!), and that's the community which is located on a private server where you must write to the list owner in an email to be added. *sigh*

I tried my "question of the week" thing on the mailing-list for a while, and some of those went well, but too many of those question quickly descended into nothing or private arenas. So, I'm officially giving up on it for now. Maybe I'll come back stronger once my spine grows back, who knows?

Tips for newbies: Be strong, keep at it, ask for clarification! We don't know just how alien we are. And please join in as we need more weirdos.

6. What, exactly, is Topic Maps, anyways? I don't get it!

Yes, indeed, what exactly is this darn Topic Maps thing? The funny thing is that there is no correct answer to that question. First of all, it's a family of standards that we collectively call "Topic Maps", but it could also mean either the TMDM (Topic Maps Data Model) standard or the XTM (Topic Maps XML exchange format) standard, depending on your non-sexual preferences. Some might even go out on a limb (obviously not the limb cut off in point no. 2) and claim that it means the TMRM (Topic Maps Reference Model), which is a more abstract framework, or possibly even just the philosophical direction - or, dare I say it, zeitgeist? - of the thing, like a blueprint for how to build a recursive key-value property framework with identity and knowledge management built in. Your mileage may vary.

But then we have a problem, as it is not a technology nor a format. It is more akin to a language, a model or a direction of sorts. No, not a language like SQL (even though TMQL (Topic Maps Query Language) could be said to hold that place) that is to be parsed by a computer, nor a language like Norwegian or English. No, we're talking about a language that sits right in the middle between the computer and the human, a kind of mediator or translator, a model in which both machine and human can do things that each party understands equally well, a model which is defined through information science, math and human language.

So what is it? It's a language that both computers and humans can use without pulling too much in either direction, a language in the middle that, if spoken by many parties (computers and humans both), lets them all join hands and sing beautiful knowledge management songs together, sharing and propagating with ease. But of course, Topic Maps isn't limited to just knowledge management, oh no. You can solve insurmountable things with it, as you can make it represent whatever you want it to, and I really, truly mean anything. If you want a topic to represent your thing, off you go. It's that flexible.

It can work as the basis for pretty much any system that has structures in it of any kind or shape, and that, by and large, is pretty much any system ever built. So it's actually quite hard to explain just what you can use it for, even though traditionally it's content management, portals and knowledge management.

Tips to newbies: It's only a model ...

So there you go, a quick summary of bits and bobs about Topic Maps. In my next installment, I'll summarize my navel fluff collection, then the timetable changes at Minnamurra station over the last 10 years, and finally I thought I'd summarize all the redundant technology that's gathering dust in my garage. Stay tuned for exciting times ahead!


15 October 2009

Ontological Ponderings

The last few months have been interesting for me in a philosophical sense. My job is at an architectural level, using ontologies in software development, both in the process (development, deployment, documentation), the infrastructure (SOA, servers, clusters) and the end result of it (business applications). So needless to say, I've been going a bit epistemental, and I promised myself yesterday to jot down my thoughts and worries, if for no other reason than for future reference.

One big thing that seems to go through my ponderings like a theme is the linguistic flow of the definition language itself, in how the mode of definition changes the relative inference of the results of using that ontology over static data (not to mention how it gets even trickier with dynamic data). We usually say that the two main ontological expressions (is_a, has_a) of most triplets (I use the example of triplets / RDF as they are the most common ones, although I use Topic Maps association statements myself) define a flat world from which we further classify the round world. But how do we do this? We make up statements like this ;

Alex is_a Person
Alex has_a Son

Anyone who works in this field understands what's going on, and that things like "Alex" and "Person" and "Son" are entities, and defined with URIs, so actually they become ;

https://shelter.nu/me.html is_a http://psi.ontopedia.net/Person
https://shelter.nu/me.html has_a http://en.wikipedia.org/wiki/Son

Well, in RDF they do. In Topic Maps we have these as subject identifiers, but pretty much the same deal (except some subtleties I won't go into here). But our work is not done. Even those ontological expressions have their URIs as well, giving us ;

https://shelter.nu/me.html https://shelter.nu/psi/is_a http://psi.ontopedia.net/Person
https://shelter.nu/me.html https://shelter.nu/psi/has_a http://en.wikipedia.org/wiki/Son

Right, so now we've got triplets of URIs we can do inferencing over. But there are a few snags. Firstly, a tuple like this is nothing but a set of properties for a non-virtual property and does not function like a proxy (like, for instance, the Topic Maps Reference Model does), and transforming between these two forms gives us a lot of ambiguity that quickly becomes a bit of a problem if you're not careful (it can completely render inferencing useless, which is kinda sucky). Now, given that most ontological expressions are defined by people, things can get hairy even quicker. People are funny that way.
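(A quick aside: to make "inferencing over triplets" concrete, here's a tiny, hand-rolled PHP sketch of a naive transitive is_a walk. The URIs are the ones from my example above plus an invented Mammal one; this is illustrative only, not how any real RDF or Topic Maps engine works.)

// A tiny in-memory triple store with naive transitive inference over is_a.
$is_a = 'https://shelter.nu/psi/is_a' ;
$triples = array (
    array ( 'https://shelter.nu/me.html', $is_a, 'http://psi.ontopedia.net/Person' ),
    array ( 'http://psi.ontopedia.net/Person', $is_a, 'http://psi.ontopedia.net/Mammal' ),
) ;

// Infer every type a subject has by following is_a chains.
function inferTypes ( $triples, $subject, $predicate ) {
    $types = array () ;
    foreach ( $triples as $t ) {
        list ( $s, $p, $o ) = $t ;
        if ( $s === $subject && $p === $predicate ) {
            $types[] = $o ;
            // Whatever the object is_a, the subject is_a too.
            $types = array_merge ( $types, inferTypes ( $triples, $o, $predicate ) ) ;
        }
    }
    return array_unique ( $types ) ;
}

print_r ( inferTypes ( $triples, 'https://shelter.nu/me.html', $is_a ) ) ; // Person, Mammal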

So I've been thinking about the implications of more ambiguous statement definitions; instead of saying is_a, what about was_a, will_be_a, can_be_a, is_a_kindof_a? What are the ontological implications of playing around with the language itself like this? It's just another property, and as such will create a different inferred result, but that's the easy answer. The hard answer lies between a formal definition language and the language in which I'm writing this blog post.

We tend to define that "this is_a that", this being the focal point from which our definition flows. So, instead of listing all Persons of the world, we list this one thing which is a Person, and move on to the next. And for practical reasons, that's the way it must be, especially considering the scope of the Semantic Web itself. But what if this creates bias we do not want?

Alex is_a Person, for sure, but at some point I shall die, and then I change from an is_a to a was_a. What implications will this, if any, have on things? Should is_a and was_a be synonyms, antonyms, allegoric of, or projection through? Do we need special ontologies that deal with discrepancies over time, a clean-up mechanism that alters data and subsequently changes queries and results? Because it's one thing to define and use data as is, another thing completely to deal with an ever-changing world, and I see most - if not all - ontology work break when faced with a changing world.

I think I've decided to go with a kind_of ontology (an ontology where there is no defined truth, only an inferred kind-system), for no other reason than that it makes cognitive sense to me and hopefully to the other people who will be using the ontologies. This resonates with me especially these days as I'm sick of the distinction people make between language and society, that the two are different. They are not. Our languages are just like music, with the ebb and flow, drama and silence that make words mean different things. By adding the ambiguity of "kind of" instead of truth statements I'm hoping to add a bit of semiotics to the mix.
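(If it helps, here is roughly what I mean by "kind of" instead of truth, as a deliberately naive PHP sketch; the predicate, names and numbers are all invented on the spot.)

// Statements carry a degree of kind-ness instead of a hard truth value.
$statements = array (
    array ( 'Alex', 'kind_of', 'Person', 1.0 ),    // as sure as it gets
    array ( 'Alex', 'kind_of', 'Musician', 0.7 ),  // kind of, most days
    array ( 'Alex', 'kind_of', 'Librarian', 0.2 ), // only by association
) ;

// How strongly does the ontology "kind of" believe subject -> object?
function kindness ( $statements, $subject, $object ) {
    foreach ( $statements as $st ) {
        list ( $s, $p, $o, $degree ) = $st ;
        if ( $s === $subject && $o === $object ) return $degree ;
    }
    return 0.0 ; // no statement, no opinion
}

echo kindness ( $statements, 'Alex', 'Musician' ) ; // 0.7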

But I know it won't fix any real problems, because the problem is that we are human, and as humans we're very good at reading between the lines, at being vague, clever with words, and don't need our information to be true in order to live with it. Computers suck at all these things.

This is where I'm having a semi-crisis of belief, where I'm not sure that epistemological thinking will ever get past the stage of basic tinkering with identity, in which we create a false world of digital identities to make up for any real identity of things. I'm not sure how we can properly create proxies of identity in a meaningful way, nor in a practical way. If you're with me so far, the problem is that we need to give special attention to every context, something machines simply aren't capable of doing. Even the most kick-ass inferencing machines break down under epistemological pressure, and it's starting to bug me. Well, bug me in a philosophical kind of way. (As for mere software development and such, we can get away with a lot of murder.)

I'm currently looking into how we can replicate the warm, fuzzy impreciseness of human thinking through cumulative histograms over ontological expressions. I'm hoping that there is a way to create small blobs of "thinking" programs (small software programs or, probably more correctly, scripts) that can work over ontological expressions without the use of formal logic at all (first-order logic, go to hell!), that can be shared, and that can learn what data can and can't be trusted to have some truthiness. Here's to hoping.
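(Again, a deliberately simple sketch of the histogram idea: count how often independent sources agree or disagree with a statement and derive a truthiness score from the tally, no formal logic involved. The statements and votes below are invented.)

// Tally agreement and disagreement per statement across sources.
$observations = array (
    array ( 'Alex is_a Person', 'agree' ),
    array ( 'Alex is_a Person', 'agree' ),
    array ( 'Alex is_a Person', 'disagree' ),
    array ( 'Alex has_a Monocle', 'agree' ),
) ;

$histogram = array () ;
foreach ( $observations as $obs ) {
    list ( $statement, $vote ) = $obs ;
    if ( ! isset ( $histogram[$statement][$vote] ) ) $histogram[$statement][$vote] = 0 ;
    $histogram[$statement][$vote]++ ;
}

// Truthiness is simply the share of agreeing observations.
foreach ( $histogram as $statement => $votes ) {
    $agree = isset ( $votes['agree'] ) ? $votes['agree'] : 0 ;
    $total = $agree + ( isset ( $votes['disagree'] ) ? $votes['disagree'] : 0 ) ;
    printf ( "%s: %.2f\n", $statement, $total > 0 ? $agree / $total : 0.0 ) ;
}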

The next issue is directional linguistics, in how the vectors of knowledge are defined. There's importance in what order you gain your knowledge, just like there's great importance in how you sort it. This is mostly ignored, and the data is treated as it's found and entered. I'm not happy with that state of things at all, and I know that if I had been taught about axioms before I got sick of math, my understanding of axiomatic value systems would be quite different. Not because I can't sit down now and figure it out, but because I've built a foundation which is hard to re-learn when wrong, hard to break free from. Any foundation sucks in that way; even our brains work this way, making it very hard to un-learn and re-train your brain. Ontological systems are no different; they build up a belief-system which may prove to be wrong further down the line, and I doubt these systems know how to deal with that, nor do the people who use such systems. I'm not happy.

Change is the key to all this, and I don't see many systems designed to cope with change. Well, small changes, for sure, but big, walloping changes? Changes in the fundamentals? Nope, not so much.

We humans can actually deal with humongous change pretty well, even though it may be a painful process to go through. Death, devastation, sickness and other large changes we adapt to. There's the saying, "when you've lost everything, there's nothing more to lose and everything to gain", and it holds remarkably true for the human adventure on this planet (look it up; the Earth is not really all that glad to have us around). But our computer systems can't deal with a CRC failure, let alone a hard-drive crash just before tax-time.

There's something about the foundations of our computer systems that is terribly rigid. Now, of course, them being based on bits and bytes and hard-core logic, there's not too much you can do about the underlying stuff (apart from creating quantum machines; they're pretty awesome, and can alter the way we compute far more than the mere efficiency claims tell us) to make it more human. But we can put human genius on top of it. Heck, the ontological paradigm is one such important step in the right direction, but as long as the ontologies are defined in first-order logic and truth-statements, it is not going to work. It's going to break. It's going to suck.

Ok, enough for now. I'm heading for Canberra over the weekend, so see you on the other side, for my next ponder.


29 September 2009

Library Pontifications

Once in a while I get some email from people who ask me questions or ask me to clarify something I've said in some setting. The other day I ranted on the NGC4LIB (Next-generation catalog 4 libraries) mailing-list about, uh, something or other. And I got email, which I answered, but since I got no reply I'm posting it here in a blog-edited form so that it doesn't go to waste ;
I think I am starting to understand your rants against the culture of MARC, and I'd probably feel offended if I knew what all of the above meant.
Hmm. Well, it wasn't meant to offend anyone. I guess if people thought they were hardcore into persistent identity management, then maybe they would feel I've either overlooked their hard work or don't think what they're doing is the right kind, or something.

I usually have two goals with my "rants"; 1. flush out those who already are on the right track, and make them more vocal and visible, and 2. if no one is on the right track, inspire people in the library world to at least have a look at it. I can do this because I have no vested interest in the library world as such; I cannot lose my library job as I'm not working for a library. :)
Naturally, to feel outside of the mainstream creates a crisis of confidence in one's abilities. What does it mean these days to say that one is a cataloger or that one works in tech services, and is it perceived as a joke for those on the outside? Oh yeah...they still produce cards. What do they know about databases?
Librarians are, from the outside, an incredibly gifted bunch of people who know what they're doing; they have been granted powers outside the realm of normal people (including professionals like software developers, believe it or not), and they know stuff we normal folks don't.

However, having been on the inside you get to glimpse the reality of an underfunded, underprioritized sub-culture of society which knows as little about the "real world" as normal folks know of the library world. There is a great divide between them, and very little has been done to open up. The blame for this I put squarely on the library world (as the real world is, well, real and out there), which for many years has demanded a library degree even for software development positions, and when we finally get there we are treated as second-class citizens because we don't have that mark of librarianship that comes from library school. It's a bizarre thing, really, and perhaps the most damaging one you've got, this notion that librarians must have a library degree, as if normal people will never understand the beauty of why a 245 c is needed, or the secret of why shelves must be called stacks, and so on.

One thing that has got me very disillusioned about the library way is philosophy. I deliberately sought out the library as a place to work because I have a few passions mixed with my skills which I thought were a good match, and one of the strongest passions was epistemology. One would think that if there was one institutional string of places that could appreciate the finer details of epistemology, it would be the libraries and the people within. That's what they concern themselves with, no?

Err, no. No, they don't. There's the odd person who ponders how an OCLC number can verify some book's identity, but these are very plain, boring questions of database management. Then along came FRBR, which not only dips its toes into epistemology but outright talks about it! The authors of it clearly had knowledge and wisdom about such things. So, one would think there was hope. Like, when it came out in 1993. That's more than 15 years ago. And people still haven't got it. How much time do you reckon it's going to take, and more importantly, how many years until it's way too late?

But no, RDA comes out of the woodwork and proves once and for all that there is no hope of libraries ever taking on the issues at either a philosophical or a practical level. Let me explain this one, as it sits at the core of much of my "ranting."

FRBR defines work, expression, manifestation, item, and these are semi-philosophical definitions that we're supposed to attach semantics and knowledge to. There are primarily two ways to do that; define entities of knowledge, or create relationships between entities. (Note these two basic ways of doing knowledge management, entities and relationships, as they spring up in all areas of knowledge representation. There's a naive little sketch of the two just below.)
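[editor's note for the blog version: here's that deliberately naive PHP sketch of the two mechanisms side by side, entities of knowledge and relationships between them; the FRBR-ish example data is invented purely for illustration]

// Entities of knowledge ...
$entities = array (
    'w1' => array ( 'type' => 'work', 'label' => 'Hamlet' ),
    'e1' => array ( 'type' => 'expression', 'label' => 'Hamlet, the English text' ),
    'm1' => array ( 'type' => 'manifestation', 'label' => 'A particular paperback edition' ),
    'i1' => array ( 'type' => 'item', 'label' => 'The dog-eared copy on my shelf' ),
) ;

// ... and relationships between them.
$relationships = array (
    array ( 'e1', 'realises', 'w1' ),
    array ( 'm1', 'embodies', 'e1' ),
    array ( 'i1', 'exemplifies', 'm1' ),
) ;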

Now, can you, without looking stuff up, tell me the difference between a work and an expression? Or between a manifestation and an item? Sure, we can discuss whether this or that thing is an item or something else, back and forth, but is that a good foundation upon which to lay all future library philosophy? Because that's just what it is; a philosophical model we use to make sense of the real world. FRBR is confusing, even if it is a great leap forward in epistemological thinking. For example, when it comes down to identity management (persistent identifiers for one thing can be expressed through a multitude, like a proxy, which FRBR fails at miserably), it is right there in the centre of it, but a lot of it focuses on the wrong part, the part that involves human cognition to make decisions about identity.

Anyway, I guess at this point all I'm trying to say is that there are glimpses of what I'm talking about in the library world, and I was attracted to it; I wanted to dedicate parts of my life to fixing a lot of what was broken in the real world. I came to the library because they are the shining beacon of light in our society.

So, what happened?
Which is why I am interested smarting up about some of these things. Where should one go for a decent but not mind-blowing introduction to the types of things you have described lately?
It's hard to say what will blow your mind, and what will not. But since you're a library type person I'm going to go out on a limb here, and assume you're a smart person. :) So, I'm going to assume that http://en.wikipedia.org/wiki/Epistemology won't blow your mind. So let's assume we're using the definition for "subject" as such ;
  • An area of knowledge, a topic, an area of interest or study
In terms of philosophy we usually expand that definition a bit wider (so it will also include most discourse and literature) but I'll try to keep it simple. First, a question?

"What does it mean that something
is something?"

This is the basic question of identity, that something exists and that we can talk about and refer to it. Referring to things is a huge portion of what the library does, not only as an archive, but as a living institution where knowledge is harboured. We're talking about subjects put into systems, about being subject-centric in the way we deal with things. Just like our brains do.

Now, for me there are a few things that have happened over the last 20-30 years. The world has become more and more knowledge-centric (it has gone from "all knowledge is in books" to "knowledge can be found in many places", and the advent of computers and the internet plays no small part in that), while libraries have become more book-specific, more focused on the collection part rather than on what the collection actually harbours in terms of knowledge (and I suspect this is because there are no traditional tracks within the library world for technology), probably because it's easier and fits better into budget-driven, government-run institutions.

However, this isn't beneficial to the knowledge management part. Libraries are moving steadily towards being archives, but the world wants them to become knowledge specialists. Ouch. And so the libraries will be closed down when they don't deliver knowledge. Archives are what Google does best, and they're not that bad at harbouring basic knowledge either. What hope in hell have you got then?

I'm running out of time right now, but feel free to ask any question and point out any of my wrongs, and laugh at them as well; I need the discourse as much as (I hope) you do. Let me just quickly run through that list with comments and pointers ; [editor's note: this is a list of things I felt the library world 'has no clue about', from my mail to the mailing-list]
  • No idea about digital persistent identification.
What happens to identifiers when people stop maintaining them? They lose their semantic and intrinsic value, and become moot. How many libraries maintain their age-old software? No, a more human, less technological means of resolving is needed, and when the world went digital the choice of multiple identities became not only possible but inevitable. Yet, when the library world manages identities as OCLC / LOC record numbers at the item level, things go horribly wrong and you cannot take what you've defined and learned into the philosophical space. Even if the OCLC / LOC numbers are maintained till the end of the world, they do not solve basic epistemological problems.
  • No subject-centricity.
FRBR does actually provide some, but it is not focused on the epistemological problems; it only identifies the problem of identification without providing a mechanism (real or philosophical) for dealing with it.
  • No understanding of semantics in data modeling.
The AACR2 / RDA world is, in some definition of the terms, a data model. And between entities in data models there are semantics, meaning the relationships themselves, their names, roles and intended purpose. But you have to understand, as a human, all of AACR2 / RDA to be able to model anything with it; there's no platform on which to stand, there are no atomic parts you can use to build molecules and then cells and then beings. The whole model is, in fact, a cobbled-together set of fields without structure (and no, numbering them is not a structure :), and without structure there are only rules. And rules without structure are only human-enforceable.
  • No clue about ontologies, inferencing, guides by analogy
This is a stab at what the Semantic Web people are doing. They have a long background in AI and knowledge management, and if you guys were at least on par with that group, there could be some better understanding of the issues. The SemWeb crowd understands a lot about first-order logic, inferencing, analogy, case-based reasoning, and so forth, all stuff you need if computers are to understand a tad bit better how your data is cobbled together, how it all interacts, how entities and relationships (remember those? :) are mapped.

I should of course make a note here that I think that the SemWeb efforts are mostly wrong, and that they could learn an awful lot from librarians in how to deal with collections and access, but that's a different discourse for some other time. :)
  • no real knowledge about collection management ( ... wait for it ...) with multiple hooks and identities
I was actually hoping people would jump on this one, getting offended that I said they had no real knowledge of collection management (which is their forte, it is what they do!), but I guess they saw the hook and line of *identities*, and jumped over it. Dang.

It's all about the identity of what you are collecting. Crikey, publishers haven't even got ISBN to work (how many times do I put in one ISBN and get a completely different book ...), and one would think that would provide hints as to why this is hard, and perhaps what to do otherwise. Hmm.

-- end of mail except some more personal ramblings not fit for generic consumption --


24 August 2009

What event model ontology?

Hmm, it seems that no one has blogged, tweeted or otherwise mentioned the blog post from my last plea, which I'm quite disappointed with. However, I'll chalk this one down to the complexity of what I'm trying to accomplish, and my failed attempt at explaining what it is.

In the meantime I've been working at it, converging various models from all sorts of weird places (anything from WebServices and SOAP stacks, to operating systems like Linux, to event models in Java and .Net, to more conceptual stuff in the Semantic Web world), but boy, you can tell that we live in a world shaped by iterative, imperative paradigms of approaching the software world.

One thing I learned quite early was declarative and functional programming, introduced to me, of all places, through using XSLT many years ago. It may not be the most obvious place to find it, and this is one of those hidden gems of the language which still doesn't enjoy much of a following. And no wonder; people come into it from the imperative stuff that dominates the world, polluting us all with filthy thoughts of changing variables (at least in Scala you can choose between var and val), functions that aren't truly functional, and the classical idea in object-oriented programming of a taxonomical structure that doesn't hold up to scrutiny.
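(To make the "changing variables" bit concrete, here's a toy contrast in PHP, nothing more than an illustration: the functional version simply states what the result is instead of mutating its way there.)

$prices = array ( 10, 25, 40 ) ;

// Imperative: loop, mutate, accumulate.
$doubled_imperative = array () ;
foreach ( $prices as $p ) {
    $doubled_imperative[] = $p * 2 ;
}

// Declarative-ish: describe the transformation, let the runtime do the iterating.
$doubled_declarative = array_map ( function ( $p ) { return $p * 2 ; }, $prices ) ;

var_dump ( $doubled_imperative === $doubled_declarative ) ; // bool(true)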

Let me clarify that last point. Why are we doing this stuff? Why are we creating computer programs?

To solve problems. And who are we solving problems for? For humans. It's the classical example (albeit extrapolated) of garbage in, garbage out. I've talked about this a lot in the past, about the constant translation that happens between human and machine, and how we are creating translation models in both worlds in order to "move forward" and solve problems better. But this exercise becomes increasingly harder as our legacy grows, so trying to teach functional programming to people who don't understand certain basic principles of lambda calculus is going to be hard, just like it's hard to teach Topic Maps to people who live in a SQL world. Or like it's hard to teach auto-generating user-interfaces to a user-interface developer.

These are usually called paradigm shifts, where some important part of your existing world is totally changed as you learn some other, even more important knowledge. You must shift your thinking from one way to a rather different other. And this is hard. Patterns of knowledge in your brain are maintained by traversing certain paths often, and as such strengthening those paths (following the pattern that an often-travelled path must be the right path). But if the path is wrong, there are some pretty strong paths you need to unlearn. Damn, that is hard! Which is why I urge you to try it out.

I'm currently using Topic Maps and human-behaviour-driven ontologies for auto-generating applications and user-interfaces over functionally complete models of both virtual and concrete human domains, all with temporality and continuous change as the central paradigms. Yeah, pretty hefty stuff, and I've spent years trying to unlearn stuff I learnt in the years before that. And those years were spent unlearning some other stuff before that. My whole life has been one huge unlearning experience, and I don't think any other way conceptually grasps the beauty of life better; nature and life both are in perpetual change. Needless to say, I'm enjoying every single crazy second of it!

But back to my event model ontology. I've learned one important thing in all this; Sowa has suggested a shift from logical inference to analogy, and this, coupled with the OODA loop, can create an intriguing platform for knowledge management and an eco-system for software applications. I'll let you know more as things progress from here. I'm excited!

And as always, I'd love to hear your comments on all of this. I beg you. Again. :)


5 August 2009

Can I ask you a favour? (Does social media actually work?)

Hi everybody. Could I ask you a favour? I'm not getting much response to my quest for a unified software architecture ontology, so could I humbly ask you to blog, tag, link or otherwise gossip about my previous post on the matter? I would really appreciate it, and I promise I'll share my findings with you all.

(My subtitle "Does social media actually work?" is a blatant attempt to get circulation going by mocking the whole debacle which I try to, ahem, you know, promote. Thanks.)


15 May 2009

Spilling a few beans

I think enough time has passed, don't you? I've been hinting to what I'm up to these days, but I've been rather careful about spilling the beans, I guess because, well, it's a brand new adventure and every storyteller should get their story together well before writing it down. I'm keen to talk about this stuff, though, because it is wickedly cool and I'm keen to not only do it, but to talk about it and involve more people in it as well.

As you probably saw from my last post I'm currently in India, and yes, my new employer is an Indian company, but I work from home (in gorgeous Kiama, Australia, 1.5 hours south of Sydney) and travel to India every so often (4-5 times a year as a rough guide). We work over the Internet, including video conferencing and remote controlling and the like, and as such it's an interesting new challenge for me to be somewhat isolated from the smiles and sideways nods and the tacit knowledge floating down the hallways of our headquarters in Mumbai. I've got plenty of ideas on how to deal with that, so we'll see how it goes.

My company is Free Systems Technology Labs, a nifty medium-sized IT development company with main offices in Mumbai (from where I'm writing this) and most R&D and development in Bangalore (where I've been the last week). It's a daughter-company of another company mostly known for more hardware-orientated stuff, like computer building, server hosting and various gadgets, but they have a number of software outlets as well. I'll be working with anything from planning to execution, and mostly in the domain of Topic Maps. Yes, the very thing I've been talking about for the last 9 years is now going to be my main concern, as opposed to secondary or third (or, in some periods, not at all) at the whims of other jobs, and I can't even begin to tell you how exciting the prospect of that is to me; I believe in the ideals and practice of Topic Maps so strongly, and it's going to be good for my soul to pour it into something as cool as what we're going to do. (More on that later.) The guys here also happen to share many of my own ideals (open-source, development methods, goals, community and societal building, and so much more), and they've been spoiling me. I'll miss the tea, that's for sure.

I became part of this through a weird mix of happenstance, but mostly because the people involved here have been, put simply, a fantastic bunch, in terms of technical brilliance, sincerity and honesty, and in convincing me that I should join (they obviously think I'm good for something :). I've been with the company almost three months now, where the first two months were more like a warm-up, but it's been a very good ride so far.

But I need to talk about something that's been on my mind ever since they got in touch with me last year, and that's prejudice. The world is full of it, and I entered this adventure with a slight degree of scepticism. No, not the bad kind, but a certain carefulness, because, you know, they're Indians, and Indians got their mouth full of rice, and you're not getting any! (A joke I got from an Indian friend, so that makes it alright, yeah? :) Not only did they have to convince me, but also my wife. "Honey, how about I drop my great-paying, safe, cushy job in one of the richest countries in the world, and rather work for strangers from a strange land full of poverty and strong smells and interesting hairdos, and do it over the internet?" Yeah, she was keen, as you can imagine.

You can't work for Indians! They are supposed to work for us!

Sure. But they kept talking with me, flew me to Belgium (they own half of a company there) and were not only completely honest with me but simply blew me away with their knowledge, seriousness, and most importantly their friendliness and openness. Me and the wife thought long and hard about it (probably longer and harder than my company wanted me to :), and here we are.

Everything I knew about India was either heavily adjusted, or simply wrong, but I've seriously enjoyed being corrected. I've embraced everything that's been thrown at me, including very hot food, weird drinks, amazingly crazy traffic and the sweltering heat, the chaos, the smells, the meetings and the way they interact, the attitudes and the values. I think the tagline "Incredible India" is truer than they think.

Ok, that's enough for a first intro, now I have to get to bed. I'm flying home tomorrow and I'm looking forward to seeing the wife and kids again (Lilje just won an award for her art at school, so I'm mighty proud as well), and we'll be spending the weekend together, and on Sunday celebrate Norway's national day in Sydney.

And then, a little bit later, I'll tell you about the wickedly cool stuff we're going to do with Topic Maps.


9 May 2009

Where in the world is Alexander?

Short answer; Bangalore, India.

Longer answer; my new employer, which I started with a couple of months back, is an Indian company with strong ties to back-end systems and support, hardware manufacture and design, and software services. I'll tell more as things progress, and I'll probably talk a lot more about how they plan to use Topic Maps to solve some really crazy and hard problems. But before I get into that kind of detailed stuff, I wanted to just quickly show you this picture, which pretty much summarises my first impression of this crazy, lively, contrasting, weird, interesting place, and if you can't read the sign, it says "Follow traffic rules." I realise that in India, if you ask kindly, they just might do what you ask, but riding as a passenger in a car through this traffic was, err, an experience I won't forget anytime soon. However, it's interesting that in a language such as my own (English, or Norwegian, or Swedish, or Danish) we base our expression mostly on words alone, while in India the reason traffic works is that they've got such a strong foothold in semiotics. A honk here, two honks there as we pass a car, a blink of our beam lights racing past a "moto" (a small scooter that's kinda rebuilt as a tiny car) ... I still have much to learn about this language. The cool thing is that it's global; even I can do it. Except I would never drive here. Never. Ever.

Anyway, I'm in India to train staff and meet and plan with them in all things black magic, drink their excellent Indian tea and eat their amazing food, and generally get a feel for the country, the culture, and most importantly, the people I'm working with, who so far have turned out to be a fantastic bunch. I'm here for another week or so, and I'll suss out the details and let you know all about it in due time. Until then, there's a chapati drenched in yummy chutney with my name on it. India is, truly, an amazing place.

P.S. Hey Barta, where can I get my sweaty hands on your TM as a filesystem? Would love a play with it right about now. Oh, and that near NLP query stuff you mentioned that one time in the back-alley while drinking gin and discussing the meaning of wife. Or life. Or whatever.


20 March 2009

Resurrection : xSiteable Framework

I've just started in my new job (yes, more on that later, I know, I know) and was flipping through a lot of my old projects over the years, big and small, looking for an old Information Architecture / prototyping tool / website generator application I made with some help from IA superstar Donna Spencer (née Maurer) back when I lived in Canberra, Australia.

I found three generations of the xSiteable project. Generation 1 is the one a lot of people have found online and used, the XSLT framework for generating Topic Maps-based websites. I meant to continue with generation 2, the xSiteable Publishing Framework (which runs the Topic Maps-based National Treasures website for the National Library of Australia), but never got around to polishing it enough for publication, and before I came to my senses I was way into developing generation 3, which I now call the xSiteable Framework (which sports a full REST stack and Topic Maps). And yes, I'm still too lazy to polish it enough for publication (which includes writing tons of documentation), at least as of now, but I showed this latest incarnation to a friend lately, and he said I had to write something about it. Well, specifically about how my object model is set up, because it's quite different from the normal way of dealing with OO paradigms.

First of all, PHP is dynamic, and has some cool "magic" functions in the OO model which one can use for funky stuff. Instead of just extending the normal stuff with some extras I've gone and embraced it completely, and changed my programming paradigms and conventions at the same time. Let's just jump in with some example code;
// Check (and fetch) all users with a given email
$usercheck = $this->database->users->_find ( 'email', 'lucky@bingo.com' ) ;
Tables are contextually defined in databases, so $this->database->users points directly to the 'users' table in the database. (Well, they're not really table names, but for this example it works that way.) The framework checks all levels of granularity, and will always return FALSE or the part of the object you asked for, so, for example ;
// Get the domain of a users email address
$domain = $this->database->users->ajohanne->email->__after ( '@' ) ;
Again, it's like a tree-structure of data, a stream of granularity to get in and out of the data. This does require you to know the schema (and change the code if you change the schema), but apart from that, in a stable environment, this really is helpful (it's also cached, so it's really fast, too).

You might also have noticed ... users->ajohanne->email ... Where did that "ajohanne" bit come from? Well, as things are designed, the framework will again try to find stuff that isn't already found, so it will automatically look up "ajohanne" in designated fields. All objects that extend the framework have two very important fields: one is the integer primary identifier, the other is the qualified unique name (not a normal name as such, but most often a computer-generated one that isn't a plain number. Systems will often use something like a username as the qualified name, and hence "ajohanne" was my username in one such system). Why do this?

Well, PHP is dynamic, so in my static example above, explicitly using 'ajohanne' as part of the query isn't the best way to go in more flexible systems; just pop your found user in dynamically instead ;
$domain = $this->database->users->$username->email->__after ( '@' ) ;
Easy. And this applies to all parts of the tree, so this works as well ;
$domain = $this->database->$some_table->$some_id->$some_field->__after ( '@' ) ;
Now, from the two examples above you might see a different pattern, too. All data parts have unrestrained names, all query operations use a single underscore, and all string operations use two underscores. (__after is a shortcut for substr ( $str, strpos ( $str, $pattern ) + strlen ( $pattern ) ), and I've got a heap of little helpers like this built in.) Through this I always know what the type of the object interface is, and with PHP magic functions these types are easy to pull down and react to. As some of my objects are extendable, I need to pass _* and __* functionality up and down the object tree.
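To give you an idea of how those conventions hang together, here's a stripped-down sketch of such a dispatcher. This is not the actual xSiteable code - the class name and internals are made up - it just shows how PHP's __get and __call make the _* / __* convention possible ;
// Hypothetical sketch - not the real xSiteable classes
class xs_Node {
    protected $value ;
    protected $children ;

    function __construct ( $value, $children = array () ) {
        $this->value = $value ;
        $this->children = $children ;
    }

    // Plain property names navigate one level down the tree; FALSE if nothing is there
    function __get ( $name ) {
        return isset ( $this->children[$name] ) ? $this->children[$name] : FALSE ;
    }

    // One underscore = query operation, two underscores = string operation
    function __call ( $name, $args ) {
        if ( strpos ( $name, '__' ) === 0 ) return $this->stringOp ( substr ( $name, 2 ), $args ) ;
        if ( strpos ( $name, '_' ) === 0 ) return $this->queryOp ( substr ( $name, 1 ), $args ) ;
        return FALSE ;
    }

    function stringOp ( $op, $args ) {
        $str = (string) $this->value ;
        if ( $op == 'after' ) {
            $pos = strpos ( $str, $args[0] ) ;
            return $pos === FALSE ? FALSE : substr ( $str, $pos + strlen ( $args[0] ) ) ;
        }
        return FALSE ;
    }

    function queryOp ( $op, $args ) {
        if ( $op == 'find' ) {
            list ( $field, $value ) = $args ;
            $hits = array () ;
            foreach ( $this->children as $key => $child ) {
                $f = $child->$field ;
                if ( $f !== FALSE && $f->value == $value ) $hits[$key] = $child ;
            }
            return $hits ? $hits : FALSE ;
        }
        return FALSE ;
    }
}

// Tiny usage example for the sketch
$users = new xs_Node ( NULL, array (
    'ajohanne' => new xs_Node ( NULL, array ( 'email' => new xs_Node ( 'lucky@bingo.com' ) ) )
) ) ;
$domain = $users->ajohanne->email->__after ( '@' ) ;      // "bingo.com"
$found  = $users->_find ( 'email', 'lucky@bingo.com' ) ;  // array holding the ajohanne node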

Traditionally, we use getters and setters ;
$u = $obj->getUsername() ;
$obj->setUsername ( $u ) ;
I turn them all into properties, so ;
$u = $obj->username ;
$obj->username = $u ;
But they are still backed by full internal functions on the object, thanks to another of PHP's magic functions ;
class obj extends xs_SimpleObject {
    function getUsername () {
        ...
    }
    function setUsername ( $value ) {
        ...
    }
}
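What makes that work is the usual __get / __set trick; something along these lines is all the base class needs (a sketch of the idea, not the actual xs_SimpleObject) ;
// Sketch only - not the real xs_SimpleObject
class xs_SimpleObject {
    // $obj->username is routed to getUsername()
    function __get ( $name ) {
        $getter = 'get' . ucfirst ( $name ) ;
        return method_exists ( $this, $getter ) ? $this->$getter () : FALSE ;
    }

    // $obj->username = $u is routed to setUsername ( $u )
    function __set ( $name, $value ) {
        $setter = 'set' . ucfirst ( $name ) ;
        if ( method_exists ( $this, $setter ) ) $this->$setter ( $value ) ;
    }
}

class obj extends xs_SimpleObject {
    private $username ;
    function getUsername () { return $this->username ; }
    function setUsername ( $value ) { $this->username = $value ; }
}

$o = new obj () ;
$o->username = 'ajohanne' ;   // setUsername() behind the scenes
echo $o->username ;           // getUsername(), prints "ajohanne"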
The framework isn't just about object persistence. In fact, it is not about that at all. I hate ORMs in the sense that they still drag your OO applications back into the relational database stone age with some sugar on top. What I've done instead is implement a TMRM model in a relational database layer, so it's a generic meta model (Topic Maps) driving that backend, not tables, table names, lookup tables and all that mess. In fact, crazy as it sounds, there are only four tables in the whole darn thing. I'm relying on backend RDBMS to be good at what they should be good at: clever indices and easier joins in a recursive environment (which, when all data is in the one table, it indeed is), where the system uses filters to derive joins instead of doing complex cross-operations (which take lots of time and resources to pull off, and are the main bottleneck in pretty much any application ever created that has a database backend).
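To give a rough flavour of what I mean by filters instead of cross-operations, here's a toy version of the idea - one generic assertions table, with "joins" expressed as filtered self-joins. The schema, table name and use of PDO with SQLite are made up for the example; the real thing looks different ;
// Toy illustration only - made-up schema, not the actual four tables
$db = new PDO ( 'sqlite::memory:' ) ;

$db->exec ( "CREATE TABLE assertions (
    id      INTEGER PRIMARY KEY,
    subject TEXT,   -- what the assertion is about
    kind    TEXT,   -- the type of assertion ('type', 'email', ...)
    value   TEXT    -- a literal, or a reference to another subject
)" ) ;

$db->exec ( "INSERT INTO assertions ( subject, kind, value ) VALUES ( 'ajohanne', 'type', 'user' )" ) ;
$db->exec ( "INSERT INTO assertions ( subject, kind, value ) VALUES ( 'ajohanne', 'email', 'lucky@bingo.com' )" ) ;

// "the email of everything typed as a user" becomes a filtered self-join
// on the one table, instead of a cross-operation over many lookup tables
$rows = $db->query ( "
    SELECT t.subject, e.value AS email
    FROM assertions t
    JOIN assertions e ON e.subject = t.subject AND e.kind = 'email'
    WHERE t.kind = 'type' AND t.value = 'user'
" )->fetchAll ( PDO::FETCH_ASSOC ) ;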

A long time ago I thought that the link between persistent URIs for identity management in Topic Maps and the URI (and using links as application state) in REST were made for each other, and I wanted to try it out. In fact, that alone was the very inspiration for me to do the 3rd generation of xSiteable, hacking out code that basically has one URI for every part of the Topic Map, for every part of the TM API, and for other parts of your application. Here are some sample URIs ;
http://mysite.com/prospect/12
http://mysite.com/api/tm/topics/of_type:booking
http://mysite.com/admin/db/prospects
At each of these there are GET, PUT, POST and DELETE options, so when I create a new prospect, it's a POST to http://mysite.com/prospect or a direct PUT to http://mysite.com/prospect/[new_id], for example.
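If you're wondering what the dispatch amounts to, it's conceptually not much more than this bare-bones sketch (the handler functions here are made-up stand-ins, not the framework's real ones) ;
// Bare-bones sketch of the dispatch idea - handlers are made-up stand-ins
function handle_get ( $resource, $id )        { return "GET $resource/$id" ; }
function handle_create ( $resource, $body )   { return "POST $resource" ; }
function handle_put ( $resource, $id, $body ) { return "PUT $resource/$id" ; }
function handle_delete ( $resource, $id )     { return "DELETE $resource/$id" ; }

$method = $_SERVER['REQUEST_METHOD'] ;
$parts  = explode ( '/', trim ( parse_url ( $_SERVER['REQUEST_URI'], PHP_URL_PATH ), '/' ) ) ;

$resource = isset ( $parts[0] ) ? $parts[0] : '' ;    // e.g. 'prospect'
$id       = isset ( $parts[1] ) ? $parts[1] : NULL ;  // e.g. '12'
$body     = file_get_contents ( 'php://input' ) ;

switch ( $method ) {
    case 'GET'    : echo handle_get ( $resource, $id ) ;        break ;
    case 'POST'   : echo handle_create ( $resource, $body ) ;   break ;  // id assigned by the map
    case 'PUT'    : echo handle_put ( $resource, $id, $body ) ; break ;  // create/replace at a known id
    case 'DELETE' : echo handle_delete ( $resource, $id ) ;     break ;
}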

All in all, this means I have many ways into the system and its data, none of them more correct than the other as they all resolve to topics in the topic map. This lowers the need for error checking greatly, and the application is more like a huge proxy for a Topic Map with a REST interface. It's a cute and very effective way of doing it. I'm trying various scaling tests, and with the latest Topic Maps distribution protocols that I can use for distributing the map across any cluster, it's looking really sexy (although I still have some work to do in this area, but the basics rock!).

Anyway, there's a quick intro. I guess I should follow this up with some more detailed code examples. Yeah, maybe next week, as I need to get some other stuff done now, but I like the object model I've got in place, and it's so easy to work with without losing the ability to do the complex stuff. Take care.

Labels: , , , ,

20 October 2008

I went to TMRA 2008, and all I got was the best days of my life ...

Update: I've added an embedded version of the slides at the bottom of the post; my cool animations and lots of fonts are wrong, but hey, you can read it at least. :)

Not to put too much sugar in your otherwise fine brew of tea, but being at TMRA 2008 this year was one of the most fantastic experiences I've had so far. Not only did I catch up with some old friends, I met some new ones I know I'll stay in touch with. So many smart and easy-going folks gathered in one place ... I'm surprised it didn't disintegrate in a puff of logic, as there really must be some cosmic law against it. Although, I see the TED conferences still churning out good stuff, so it must be allowed. And yes, I do equate TMRA with TED; it was that great.

This year I was invited to hold the opening keynote speech, which I called "You're all crazy - subjectively speaking", a romp on the Topic Maps community, a plea to remember epistemology in all things data modeling, and the message that being "subject-centric" is not a technical feat; it's about social processes and agreement (or, at least, rough understanding of each other).

I used a few cheap interactive ploys to hold the audience's attention, making them audibly disagree or agree with certain assertions I put up on the screen. It was very effective at raising the collective awareness of the issues I was trying to point out, and especially helpful when I needed to show that there are some things we all disagree on. And not only that, but things we should disagree on. I think people in general thought it was a good speech, and the feedback was great, so thanks to all for that.

I'd like to thank Lars Marius Garshol and Lutz Maicher for inviting and encouraging me, Patrick Durusau, Jack Park (you need a website or blog, mate!) and Robert Barta for just being who you are, and everyone else for making me once again believe so strongly that the Topic Maps community is the best thing since recursive properties and frames theory!

I'm sure I'll write more on what went down at TMRA 2008, but right now I need to make porridge for my kids. Later.

Labels: , ,

10 October 2008

Keynote speaking at TMRA 2008

Oops, I totally forgot to mention to the world that I'm the intro keynote speaker at the TMRA 2008 conference (one of the two yearly Topic Maps conferences) in Leipzig next week (15-17 October). My talk is titled "We're all crazy - subjectively speaking" and will contain at least one bad joke, two pretty good ones, some philosophical ranting and hopefully lots of community building. I really, really hope to see you there; find me, say hello, let's have tea and discuss whether my two jokes really were good or not.

The big question is, how did I forget to tell you about this? I'll let you know that in a few days time or so.

Labels: , , , , ,

3 July 2008

Round and round it goes

This morning was a good one. I got on the bus, armed with breakfast banana in hand, and right there in front of me sat fellow Topic Mapper Stian Danenbarger (from Bouvet), who happened to be living just literally down the road from me. I've been living at Korsvoll (in Oslo) for 6 months now without bumping into him, how odd is that?

Anyways, the last few days I've written about Language and Semantics and about context for understanding communication (all with strong relations to programming languages), and needless to say this became the topic (heh) of discussion on the bus this morning as well.

In this post I'll try to summarize the discussion so far, fold in the discussion I had on the bus this morning, and couple it with a discussion I've had with Reginald Braithwaite on his blog, from "My mixed feelings about Ruby". Let's start with Reginald and move backwards.

Background
Matz has said that Ruby is an attempt to solve the problem of making programmers happy. So maybe we aren’t happy with some of the accidental complexity. But can we be happy overall? Can we find a way to program in harmony with Ruby rather than trying to Greenspun it into Lisp?
I think that the goal of making programmers happy is a good one, although I suspect there's more than one way to please a programmer. One way is perhaps rooted in the syntax of the language at hand. Then there's the semantics of your language keywords. Another is to have good APIs to work with. Another is how meta the language is (i.e. how much freedom the programmer has in changing the semantics of the language, where Lisp is very meta while Java is not at all), and yet another is the community around it. Or the type and amount of documentation. Or its run-time environment. Or how the code is run (interpreted? compiled? half-compiled to bytecodes?).

Can we find ways in programming that would make all programmers happy? I need to point back to my first post about Language and Semantics and simply reiterate that there's a tremendous lack of focus on why we program in most modern programming languages. Their idea is to shift bits around, and seldom to satisfy some overall, more abstract problem. So for me it becomes more important to convey semantics (i.e. meaning) through my programming than merely having the ability to do so. Most languages will solve any problem you have, so what do the different languages actually offer us? In fact, how different are they most of the time?
At this moment in time I have extremely mixed feelings about Ruby. I sorely miss the elegance and purity of languages like Scheme and Smalltalk. But at the same time, I am trying to keep my mind open to some of the ways in which Ruby is a great programming language.
I think we really agree here. My own experiences with over 8 years of professional XSLT development (yes, look it up :) have taught me some valuable lessons about how elegant functional programming can be, just like Lisp and the mix-a-lot Smalltalk (which I like the less of the two). But then I like certain ways that Ruby does things too, with a better syntax for one. I like to bicker about syntax. Yeah, I'm one of those. And I think I bicker about syntax for very good reasons, too ;

Context

In "just enough to make some sense" I talk about context; how many hints do we need to provide in order to communicate well? Make no mistake; when we program, we are doing more than solving the shifting of bits and bytes back and forth. We are giving hints to 1) a computer to run the code, and 2) the programmer (either the original developer, or someone else looking at her code). Most arguments about syntax seems to stem from 1) in which 2) becomes a personal opinion of individuals rather than a communal excericse. In other words, syntax seems to come from some human designer trying to express hints best to the computer in order to shift bits about, instead of focusing entirly on their programming brothers and sisters.

In the first quote, about Ruby being designed to please the programmer, that would imply that 2) was in focus, but the focus of that quoted statement is all wrong; it pleases some programmers, but certainly not all, otherwise why are we even talking about this stuff?

Ok, we're ready to move on to the crux of the matter, I think.
I am arguing that while it is easy to agree that languages ought to facilitate writing readable programs, it is not easy to derive any tangible heuristics for language design from this extremely motherhood and apple pie sentiment.
Readability is an important and strong word. And it is very important, indeed. We need everything to be readable, from syntax to APIs to environments and onwards. I think we all want this pipe-dream, but we all see different ways of accomplishing it. Some say it's impossible, others say it's easy, while people like Reginald are, I think, right there in the middle, the ultimate pragmatic stance. And if I had never done Topic Maps I would be right there with him. Like Stian Danenbarger said this morning, there's more to readability than just reading the code well.

Topic Maps

Yeah, it's time to talk about what happens when you drink the kool-aid and you accept the paradigm shift that comes with it. There are mainly three things I've learned through Topic Maps ;
  • Everything is a model, from the business ideals and processes, to design and definition, our programming languages, our databases, the interaction against our systems, and the human aspect of business and customers. Models, models, everywhere ...
  • All we want to do is to work with models, and be able to change those models at will
  • All programming is to satisfy recreating those models
Have you ever looked at model-driven architecture or domain-driven design? These are somewhat abstract principles for creating complex systems. Now, I'm not going to delve into the pros and cons of these approaches, but merely point out that they were "invented" out of a need that programming languages didn't solve, namely the focus on models.

Think about it; in every aspect of our programming life, all we do is try to capture models which somehow mimic the real-life problem-space. The shifting of bits wouldn't be necessary if there wasn't a model we're working towards. We create abstract models of programming that we use in order to translate between us humans and those pesky computers that aren't smart enough to understand "buy cheap, sell expensive" as a command. This is the main purpose of our jobs - to make models that translate human problems into computer-speak - and then we choose our programming language to do this in. In other words, the direction is not language first, then the problem, but the other way around. In my first post in this series I talked about tools, and about choosing the "right tool for the job." This is a good moment to lament some of what I see as the real problems of modern programming languages.

What objects?

Object-oriented programming. Now, don't get me wrong, I think OOP is a huge improvement over the process-oriented imperative ways of the olden days. But as I said in my last post, it looks so much like the truth that we mistakenly treat it as truth. The truth is there's something fundamentally wrong with what we know as object-oriented programming.

First of all, it's not labeled right. Stian Danenbarger mentioned that someone (can't remember the name; Morten someone?) said it should be called "class-based programming", or - if you know the Linnaean world - taxonomical programming. If you know about RDF and the Semantic Web, it too is based loosely on recursive key/value pairs, creating those tree-structures as the operative model. This is dangerously deceitful, as I've written about in my two previous posts. The world is not a tree-structure, but a mix of trees, graphs and vectors, with some semi-ordered chaos thrown in.

Every single programming approach, be it a language or a paradigm like OOP or functional, comes with its own meta model of how to translate between computers and the humans that use them. Every single approach is an attempt to recreate those models, to make it efficient and user-friendly to use and reuse those models, and make it easy to change the models, remove the models, make new ones, add others, mix them, and so on. My last post goes into much detail about what those meta models are, and those meta models define the communication from human to computer to human to computer to human, and on and on and on.

It's a bit of a puzzle, then, why our programming languages focus less on the models and more on shifting those bits around. When shifting bits is the modus operandi and we leave the models in the hands of programmers who normally don't think too much about those models (and, perhaps by inference, programmers who don't think about those models go on to design programming languages in which they want to shift bits around ...), you end up with some odd models, which most of the time are incompatible with each other. This is how all models are shifted to the API level.

Everyone who has ever designed an API knows how hard it can be. Most of the time you start in one corner of your API thinking it's going smoothly until you meet the other end, and you hack and polish your API as best you can, and release version 1.0. If anyone but you uses that API, how long until requests for change, bugs, "wouldn't it make more sense to ...", "What do you mean by 'construct objects' here?", and on and on and on. Creating APIs is a test of all the skills you've got. And all of the same can be said about creating a programming language.

Could the problem simply be that we're using a taxonomic programming language paradigm in which we try to create graph-structured applications? I like to think so. Why isn't there native support in languages for typed objects, the most basic building block of categorisation and graphing?

$mice = all objects of type 'mouse' ;

Or cleanups?

set free $mice of type 'lab' ;

Or relationships (with implicit cardinality)?

with $mice of type ('woodland')
add relationship 'is food' to objects of type 'owl' ;

Or prowling?

with $mice that has relationship to objects of type ('owl')
add type ('owl food') ;

Or workflow models?

in $workflow at option ('is milk fresh?') add possible response ('maybe')
with task ('smell it') and path back to parent ;

[disclaimer : these are all tongue-in-cheek examples]

I know you can extend some languages to do the basic bidding here - for example, in JavaScript I can change the prototype for basic objects and types - but it's an extension each programmer must make, and the syntax is bound to the limits of the meta model of the language, making most such extensions look kludgy and inelegant. And unless they know all the problems I think we've been talking about here, they really won't do this. This sort of discussion certainly does not appear where people learn programming skills.
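To be fair, you can fake the simplest of those constructs in most languages already. Here's a throwaway PHP sketch of an "all objects of type X" query - a made-up registry, nothing standard - which also shows why it still reads nothing like the pseudo-syntax above ;
// Made-up registry, just to show how far (and how kludgily) you get without native support
class TypedRegistry {
    private $entries = array () ;

    function add ( $object, $types ) {
        $this->entries[] = array ( 'object' => $object, 'types' => $types ) ;
    }

    // $mice = all objects of type 'mouse'
    function ofType ( $type ) {
        $hits = array () ;
        foreach ( $this->entries as $entry ) {
            if ( in_array ( $type, $entry['types'] ) ) $hits[] = $entry['object'] ;
        }
        return $hits ;
    }
}

$registry = new TypedRegistry () ;
$registry->add ( 'mickey', array ( 'mouse' ) ) ;
$registry->add ( 'pinky', array ( 'mouse', 'lab' ) ) ;
$registry->add ( 'archimedes', array ( 'owl' ) ) ;

$mice = $registry->ofType ( 'mouse' ) ;   // array ( 'mickey', 'pinky' )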

No, most programming languages follow the tree-structure quite faithfully, or more precisely the taxonomic model (which is mostly trees, but with the odd jump (relationship) sideways in order to deal with the kludges that didn't fit the tree). Our programs are exactly that; data and code, and the programming languages define not only the syntax for how to deal with the data and code, but the very way we think about dealing with blobs of data and code.

They define the readability of our programs. So, Reginald closes;
Again we come down to this: readability is a property of programs, and the influence of a language on the readability of the programs is indirect. That does not mean the language doesn't matter, but it does make me suspicious of the argument that we can look at one language and say it produces readable programs and look at another language and say it does not.
Agreed, except I think most of the languages we do discuss are all forged over the same OOP and functional anvil, in the same "shifting the bits and bytes back and forth" kind of thinking. I think we need to think in terms of the reason we program; those pesky models. Therein lies the key to readability: when the code resembles the models we are trying to recreate.

Syntax for shifting bits around

Yes, syntax is perhaps more important than we like to admit. Syntax defines the nitty-gritty way we shift those bits around in order to accomplish those modeling ideals. It's all in the eyes of the beholder, of course, just as every programming language meta model has its own answer. What is the general consensus on good syntax that conveys the right amount of semantics in order for us all to agree on its meaning?

There are certain things which seem to be agreed on. Using angle brackets and the equals sign for comparators of basic types, for example, or using colon-equals to assign values (although there's a 50/50 split on that one), using curly brackets to denote blocks (but not closures), using square brackets for arrays or lists (but not in functional languages), using parentheses for functional lists, certain keywords such as const for constants, var for variables (mostly in loosely typed languages, for some reason) or int or Int for integers (basic types or basic type classes), and so on. But does any of this really matter?

As far as shifting bytes around goes, I'd say they don't matter. What matters is why we're shifting the bytes around. And most languages don't care about that. And so I don't care about the syntax or the language quirks of inner closures when inner closures are a symptom of us using the wrong tools for the modeling job at hand. We're bickering about how to best do it wrong instead of focusing on doing it right. Um, IMHO, of course, but that's just the Topic Maps drugs talking.

Just like Robert Barta (who I wish would come to dinner more often), I too dream of a Topic Maps (or graph based) programming language. Maybe it's time to dream one up. :)

Labels: , , , ,

2 July 2008

Just enough to make some sense

I've realized that my previous post on language and semantics could possibly be a bit hard to understand without having the proper context wrapped around it, so today I'll continue my journey of explaining life, universe and everything. Today I want to talk about "just enough complexity for understanding, but not more."

Mouses

Let's talk about mouse. Or a mouse. Mice. Let's talk about this ;

One can argue whether this is really enough context for us to talk about this thing. What does "mouse" mean here? The Disney mouse? A computer mouse? The mouse shadow in the second moon? In order for me to communicate clearly with my fellow human beings I need to provide just enough information so that we can figure this out, so I say "mouse, you know the furry, multivorous, small critter that ..." ;


This is too much information, at least for most cases. I'm not trying to give you all the information I know about mice, but just enough for me to say "I saw a mouse yesterday in the pantry." Talking about context is incredibly hard, because, frankly, what does context mean? And how much background information do I need to provide to you in order for you to understand what I'm talking about?

In terms of language "context" means verbal context as words and expressions that surrounds a word, and social context as the connection between the words and those who hear or read them based on the human constraints (age, gender, knowledge, etc.) There's also some controversy about this, and we often also imply certain mental models (social context of understanding).

In general, though, we talk about context as "that stuff that surrounds the issue": solid objects, ideas, my mental state, what I see, what I know, what my audience sees and knows, hears, smells, cultural and political history, musical tastes, and on and on and on. Everything in the moment and everything in the past, in order to understand the current communication that takes us to the future.

Yup, it's pretty big and heady stuff, and it's a darn interesting question; how much context do you need in order to communicate well? My previous post was indeed about how much context we need to put into our language and definition in order to communicate well.

A bit of background

Back in 1956 a paper by the cognitive psychologist George A. Miller changed a lot of how we think about our own capacity for juggling stuff in our heads. It's a most famous paper, where further research since has added to and confirmed the basic premise that there's only so much we're able to remember at the same time. And the figure that came up was 7, plus / minus 2.

Of course that number is specific to that research, and may mean very little in the scheme of more specific settings. It's a general rule, though, that hints to the limits we have in cognition, in the way we observe and respond to communication. And it certainly helps us understand the way we deal with context. Context can be overly complex, or overly simple. Maybe the right amount of context is 7, plus / minus 2?

Just right



I'm not going to speculate much in what it means that "between 5 and 9 equally-weighted error-less choices" defines arbitrary constraints on our mental storage capacity (short-term especially), but I'll for sure speculate that it guides the way we can understand context, and perhaps especially where it's loosely defined.

We humans have a tendency to think that those things that look like the truth must be the truth. We do this perhaps especially in the way we deal with computer systems, because, frankly, it's easy to define structures and limitations there. It's what we do.

An example of this is how we observe anything as containers that may contain things, which in themselves might be containers holding yet more things or containers, and so on. Our world is filled with this notion, from taxonomies, to object-oriented programming, to XML, to how we talk about structures and things, to how science was defined, and on and on and on. Tree-structures, basically.

But as anyone with a decent taxonomic background knows, taxonomies don't always work as a strict tree-structure. Neither, as anyone who's meddled in OO for too long or fiddled with XML until the angle-brackets break will know, do our models. These things look so much like the truth that we pursue them as truth.

Things are more chaotic than we like. They're more, in fact, like graph structures, where relationships between things go back and forth, up and down, over and under already established relationships. It can be quite tricky, because the simple "this container contains these containers" mentality is gone, and a more complex model appears ;


This is the world of the Semantic Web and Topic Maps, of course, and many of the reasons why these emerging technologies are, er, emerging is of course that all containers aren't containers at all, and that the semantics of "this thing belongs to that thing" isn't precise enough when we want to communicate well. Explaining the world in terms of tree-structures puts too many constraints on us, so many that we spend most of our time trying to fit our communication into them rather than simply expressing it.

We could go back to frames theory as well, with the recursive key/value properties that you find naturally in b-trees, where a value is either a literal or another property. RDF is based on this model, for example, where the recursiveness is used for creating graph structures. (Which is one reason I hate RDF: using anonymous nodes for literals.)
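In code, that frames idea is nothing more than nested key/value pairs where a value is either a literal or another frame - and the moment you allow a value to point back at an existing frame, your tree has quietly become a graph. A throwaway PHP sketch (made-up data, obviously) ;
// A value is either a literal or another frame (here just nested arrays)
$mouse = array (
    'name'    => 'wood mouse',
    'habitat' => array (
        'name'    => 'woodland',
        'part_of' => 'temperate forest',
    ),
) ;

// The reference back into an existing frame is what breaks the pure tree
$owl = array (
    'name' => 'tawny owl',
    'eats' => &$mouse,
) ;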

Programming languages and meta models

Programming languages don't extend the basic pre-defined model of the language much. Some languages allow some degree of flexibility (such as Ruby, Lisp and Python), some offer tweaking (such as PHP, Lua and Perl), while others offer macroing and overloading of syntax (mostly the C family), and yet more are just stuck in their modeling ways (Java). [note: don't take these notions too strictly; there's a host of features to these languages that mix and match various terms, both within and outside of the OO paradigm]

What they all have in common is that the defined meta model is linked to shifting bits and bytes around a computer program, and that all human communication and / or understanding is left in the hands of programmers. Let's talk about meta models.

Most programming languages have a set of keywords and syntax that make up a model of programming. This is the meta model; it's the foundation of a language, a set of things on which you build your programs. All programming languages have more or less of them, and the more they have, the stricter they usually are as well. Some are object-oriented languages, others functional, some imperative, and yet others mix things up. If I write ;

Integer i = new Integer ( 34 ) ;

in Java, there's only so many ways to interpret that. It's basically an instance of the Integer class that holds the integer number 34. But what about

$i = new Int ( 34 ) ;

in PHP? There is no built-in class called Int in PHP, so this code either fails or produces an instance of some class called Int, but we do not know what that means, at least not at this point. And this is what the meta model defines; built-in types, classes, APIs and the overall framework, how things are glued together.

As such, Java and .Net have huge meta models defined, so huge that you can spend your whole career in just one part of them. PHP has a medium-sized meta model, Perl an even smaller one, all the way down to assembler with its rather puny meta model. Syntax and keywords are not just how we program; they define the constraints of our language. There are things that are easy and things that are hard in every language, and there is no one answer to what the best programming language is. They all do things differently.

The object-oriented ways of Java differ from those of Ruby, which differ from the ways of C++, which differ from the ways of PHP. The functional ways of Erlang differ from XSLT, which differs from Lisp.

The right answer?

There is no right answer. One can always argue about the little differences between all these meta models, and we do, all the time. We bicker about operator overloading, about whether multiple inheritance is better than single inheritance, about the real difference between interfaces and abstract classes, about getter and setter methods (or the lack thereof), about whether types should be first-class objects or not, about what closures are, about whether to use curly brackets or define program structure through whitespace, and on and on and on.

My previous post was another way of saying that we perhaps should argue less about the meta model of our language, and worry more about the reason the program was created than about how a certain problem was solved. We don't have the mental capacity to juggle too much stuff around in our brains, and if the meta model is huge, our ability to focus on perhaps the important bits becomes less.

There are so many levels of communication in our development stack. Maybe we should introduce a more semantically sane model into it, to move a few steps closer to the real problem: the communication between man and machine? I'm not convinced that either OO or functional programming solves the human communication problem. Let's speculate and draw sketches on napkins.

Labels: , , , , , , ,