The last few months have been interesting for me in a philosophical sense. My job is, on an architectural level, about using ontologies in software development: in the process (development, deployment, documentation), the infrastructure (SOA, servers, clusters) and the end result of it all (business applications). So, needless to say, I’ve been going a bit epistemental, and I promised myself yesterday to jot down my thoughts and worries, if for no other reason than for future reference.
One big thing that runs through my ponderings like a theme is the linguistic flow of the definition language itself: how the mode of definition changes what can be inferred when that ontology is used over static data (not to mention how it gets even trickier with dynamic data). We usually say that the two main ontological expressions (is_a, has_a) of most triplets (I use triplets / RDF as the example since they are the most common, although I use Topic Maps association statements myself) define a flat world from which we further classify the round world. But how do we do this? We make up statements like this:
Alex is_a Person
Alex has_a Son
Anyone who works in this field understands what’s going on, and that things like “Alex” and “Person” and “Son” are entities, defined with URIs, so actually they become:
https://shelter.nu/me.html is_a http://psi.ontopedia.net/Person
https://shelter.nu/me.html has_a http://en.wikipedia.org/wiki/Son
Well, in RDF they do. In Topic Maps we have these as subject identifiers, but it’s pretty much the same deal (except for some subtleties I won’t go into here). But our work is not done. Even those ontological expressions have their own URIs, giving us:
https://shelter.nu/me.html https://shelter.nu/psi/is_a http://psi.ontopedia.net/Person
https://shelter.nu/me.html https://shelter.nu/psi/has_a http://en.wikipedia.org/wiki/Son
Right, so now we’ve got triplets of URIs we can do inferencing over. But there are a few snags. For one, a tuple like this is nothing but a set of properties for a non-virtual property and does not function like a proxy (as, for instance, the Topic Maps Reference Model does), and transforming between these two forms gives us a lot of ambiguity that quickly becomes a problem if you’re not careful (it can render inferencing completely useless, which is kinda sucky). Now, given that most ontological expressions are defined by people, things can get hairy even quicker. People are funny that way.
So I’ve been thinking about the implications of more ambiguous statement definitions: instead of saying is_a, what about was_a, will_be_a, can_be_a, is_a_kindof_a? What are the ontological implications of playing around with the language itself like this? It’s just another property, and as such will create a different inferred result, but that’s the easy answer. The hard answer lies somewhere between a formal definition language and the language in which I’m writing this blog post.
We tend to define that “this is_a that”, this being the focal point from which our definition flows. So, instead of listing all Persons of the world, we list this one thing that is a Person, and move on to the next. And for practical reasons, that’s the way it must be, especially considering the scope of the Semantic Web itself. But what if this creates bias we do not want?
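To make the URI triplets above a bit more concrete, here is a minimal sketch in Python. The `Triple` type and the `objects_of` helper are names I made up for illustration; they are not part of RDF or of any Topic Maps toolkit, and a real triple store does far more than this.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate and object are all plain URIs."""
    subject: str
    predicate: str
    obj: str


# URIs taken from the statements above.
ME = "https://shelter.nu/me.html"
IS_A = "https://shelter.nu/psi/is_a"
HAS_A = "https://shelter.nu/psi/has_a"

TRIPLES = [
    Triple(ME, IS_A, "http://psi.ontopedia.net/Person"),
    Triple(ME, HAS_A, "http://en.wikipedia.org/wiki/Son"),
]


def objects_of(triples: List[Triple], subject: str, predicate: str) -> List[str]:
    """Hypothetical helper: every object asserted for (subject, predicate)."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


if __name__ == "__main__":
    print(objects_of(TRIPLES, ME, IS_A))
```

Note that the lookup is pure string matching on URIs. Nothing here knows what a Person is, which is rather the point of the snags described above.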
Alex is_a Person, for sure, but at some point I shall die, and then I change from an is_a to a was_a. What implications, if any, will this have on things? Should is_a and was_a be synonyms, antonyms, allegoric of, or projection through? Do we need special ontologies that deal with discrepancies over time, a clean-up mechanism that alters data and subsequently changes queries and results? Because it’s one thing to define and use data as is, and another thing completely to deal with an ever-changing world, and I see most, if not all, ontology work break when faced with a changing world.
I think I’ve decided to go with a kind_of ontology (an ontology where there is no defined truth, only an inferred kind-system), for no other reason than that it makes cognitive sense to me and hopefully to other people who will be using the ontologies. This resonates with me especially these days as I’m sick of the distinction people make between language and society, as if the two were different. They are not. Our languages are just like music, with the ebb and flow, drama and silence that make words mean different things. By adding the ambiguity of “kind of” instead of truth statements I’m hoping to add a bit of semiotics to the mix.
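One way to picture the is_a / was_a question is to stop storing the tense at all and derive it at query time from a validity interval. This is only a sketch under my own assumptions: `TimedStatement`, `valid_from`, `valid_to` and `tense` are hypothetical names, and the dates are invented purely for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TimedStatement:
    """A 'subject <tense> kind' statement scoped to a validity interval."""
    subject: str
    kind: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the statement still holds


def tense(stmt: TimedStatement, when: date) -> str:
    """Derive the predicate from the point in time the question is asked."""
    if when < stmt.valid_from:
        return "will_be_a"
    if stmt.valid_to is not None and when >= stmt.valid_to:
        return "was_a"
    return "is_a"


# Made-up interval, just to exercise the three cases.
EXAMPLE = TimedStatement("Alex", "Person", date(2000, 1, 1), date(2100, 1, 1))

if __name__ == "__main__":
    for when in (date(1990, 1, 1), date(2050, 1, 1), date(2150, 1, 1)):
        print(when, EXAMPLE.subject, tense(EXAMPLE, when), EXAMPLE.kind)
```

In this reading is_a and was_a are neither synonyms nor antonyms but the same statement seen from different moments, so no clean-up mechanism has to rewrite the data. It does nothing for changes in the ontology itself, which is the harder problem.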
But I know it won’t fix any real problems, because the problem is that we are human, and as humans we’re very good at reading between the lines, at being vague, clever with words, and don’t need our information to be true in order to live with it. Computers suck at all these things.
This is where I’m having a semi-crisis of belief, where I’m not sure that epistemological thinking will ever get past the stage of basic tinkering with identity in which we create a false world of digital identities to make up for any real identity of things. I’m not sure how we can properly create proxies of identity in a meaningful way, nor in a practical way. If you’re with me so far, the problem is that we need to give special attention to every context, something machines simply aren’t capable of doing. Even the most kick-ass inferencing machines break down under epistemological pressure, and it’s starting to bug me. Well, bug me in a philosophical kind of way. (As for mere software development and such, we can get away with a lot of murder.)
I’m currently looking into how we can replicate the warm, fuzzy impreciseness of human thinking through cumulative histograms over ontological expressions. I’m hoping that there is a way to create small blobs of “thinking” programs (small software programs or, probably more correctly, script languages) that can work over ontological expressions without the use of formal logic at all (first-order logic, go to hell!) that can be shared, that can learn what data can and can’t be trusted to have some truthiness. Here’s to hoping.
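As a very loose reading of that idea, here is a sketch that keeps an accumulated tally of how often each kind has been asserted for a subject, and reads "truthiness" off as a share of the total rather than as a true/false fact. It is a plain running histogram, not a cumulative histogram in the strict statistical sense, and `KindHistogram`, `observe`, `truthiness` and the `weight` parameter are all names and choices I am assuming for illustration.

```python
from collections import Counter


class KindHistogram:
    """Accumulates weighted 'subject kind_of kind' observations."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def observe(self, subject: str, kind: str, weight: float = 1.0) -> None:
        """Record one assertion; weight could stand for trust in the source."""
        self._counts[(subject, kind)] += weight

    def truthiness(self, subject: str, kind: str) -> float:
        """Share of this subject's accumulated weight that supports `kind`."""
        total = sum(w for (s, _), w in self._counts.items() if s == subject)
        if total == 0:
            return 0.0
        return self._counts[(subject, kind)] / total


if __name__ == "__main__":
    h = KindHistogram()
    for _ in range(3):
        h.observe("Alex", "Person")
    h.observe("Alex", "Robot")
    print(h.truthiness("Alex", "Person"), h.truthiness("Alex", "Robot"))
```

There is no formal logic in here, only counting, so a statement never becomes true; it just becomes more or less kind-of. Whether that survives contact with real data is exactly the thing I’m hoping to find out.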
The next issue is directional linguistics: how the vectors of knowledge are defined. The order in which you gain your knowledge matters a great deal, just as it matters a great deal how you sort it. This is mostly ignored, and the data is treated as it’s found and entered. I’m not happy with that state of things at all, and I know that if I had been taught about axioms before I got sick of math, my understanding of axiomatic value systems would be quite different. Not because I can’t sit down now and figure it out, but because I’ve built a foundation which is hard to re-learn when wrong, hard to break free from. Any foundation sucks in that way; even our brains work this way, making it very hard to un-learn and re-train them. Ontological systems are no different; they build up a belief-system which may prove to be wrong further down the line, and I doubt these systems know how to deal with that, nor do the people who use such systems. I’m not happy.
Change is the key to all this, and I don’t see many systems designed to cope with change. Well, small changes, for sure, but big, walloping changes? Changes in the fundamentals? Nope, not so much.
We humans can actually deal with humongous change pretty well, even though it may be a painful process to go through. Death, devastation, sickness and other large changes we adapt to. There’s the saying, “when you’ve lost everything, there’s nothing more to lose and everything to gain”, and it holds remarkably true for the human adventure on this planet (look it up; the Earth is not really all that glad to have us around). But our computer systems can’t deal with a CRC failure, let alone a hard-drive crash just before tax-time.
There’s something about the foundations of our computer systems that is terribly rigid. Now, of course, them being based on bits and bytes and hard-core logic, there’s not too much you can do about the underlying stuff (apart from creating quantum machines; they’re pretty awesome, and can alter the way we compute far more than the mere efficiency claims tell us) to make it more human. But we can put human genius on top of it. Heck, the ontological paradigm is one such important step in the right direction, but as long as the ontologies are defined in first-order logic and truth-statements, it is not going to work. It’s going to break. It’s going to suck.
Ok, enough for now. I’m heading for Canberra over the weekend, so see you on the other side, for my next ponder.