Part 6. Applying reusable SST patterns for graph data modelling
With the basic tools and conceptual issues in place, we can begin to apply them more seriously to build database-backed models of the world, including (machine) learning structures. The purpose of storing data is then not just to build an archaeological record of events, but to structure observations in ways that offer future value to possibly related processes. This is where Semantic Spacetime helps, but there are other practical considerations too. In this installment, I’ll walk through the Semantic Spacetime library package for Go to explain what’s going on.
Semantic Spacetime treats every imaginable process realm as a knowledge representation in its own right. It could be in human infrastructure space, sales product space, linguistic word space, protein space, or all of these together. In part 5, I showed how certain data characteristics naturally belong at nodes (locations) while others belong to the links in between (transitions).
How could we decide this rationally? Promise Theory offers one approach, and explains quite explicitly how models can be a “simple” representation of process causality.
Choosing Node and Link data fields
When we observe the world and try to characterize it, we have to distinguish between modelling things and modelling their states. These are two different ideas. Trying to think about objects led to the troubles in Object Oriented Analysis mentioned in the previous post. If we think about state, some information is qualitative or symbolic in nature (definitive and descriptive, and based on fixed-point semantics), whereas other information measures quantitative characteristics, degrees, extents, and amplitudes of phenomena. The latter obey some algebra: they may combine additively or multiplicatively, for instance.
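To make the distinction concrete, here is a minimal sketch in Go of how qualitative and quantitative fields might sit side by side on nodes and links. The type names, field names, and the Reinforce helper are hypothetical, chosen for illustration only; they are not the definitions used by the Semantic Spacetime library.

```go
package main

import "fmt"

// Node describes a thing or location.
// Hypothetical type for illustration, not the SST library's definition.
type Node struct {
	Key         string  // qualitative: unique symbolic name
	Description string  // qualitative: descriptive text
	Weight      float64 // quantitative: e.g. how often the node was observed
}

// Link describes a transition between two nodes.
// Hypothetical type for illustration, not the SST library's definition.
type Link struct {
	From, To string  // keys of the nodes being joined
	Relation string  // qualitative: the kind of relationship
	Weight   float64 // quantitative: strength or frequency of the relationship
}

// Reinforce combines a new observation additively with the existing weight.
// Symbolic fields are left alone: they identify the link, they do not accumulate.
func (l *Link) Reinforce(amount float64) {
	l.Weight += amount
}

func main() {
	a := Node{Key: "server1", Description: "web server", Weight: 1}
	b := Node{Key: "db1", Description: "database", Weight: 1}

	l := Link{From: a.Key, To: b.Key, Relation: "depends on", Weight: 1}
	l.Reinforce(2.5) // a later observation strengthens the same link

	fmt.Println(a.Key, l.Relation, b.Key, l.Weight)
}
```

The symbolic fields act as fixed points that identify what we are talking about, while the weights are the only parts that obey an algebra. Here that algebra is assumed to be simple addition, which is just one possible choice.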
In data analysis, and machine learning in particular, probabilistic methods stand out as a popular way of modelling. Probabilities are an even mixture of qualitative and quantitative values. The probability of a qualitative outcome is a quantitative estimate. Using probability is partly a cultural issue — scientists traditionally feel more comfortable if there are quantitative degrees to work with, because it makes uncertainty straightforward to incorporate. Thus one tends to neglect the detailed role of the spanning sets of outcomes. In probabilistic networks, people might refer to such outcomes in terms of convergence points in a belief network. But it’s important to understand the sometimes hidden limitations of probabilities in modelling. Sometimes we need to separate qualitative and quantitative aspects in a rational way, without completely divorcing them!
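As a small illustration of that mixture, the sketch below turns counts of symbolic outcomes into relative frequencies. The function name Estimate and the outcome labels are invented for the example, and relative frequency is only one simple way of estimating a probability. The point is that the numbers mean nothing without the set of named outcomes they are spread over.

```go
package main

import (
	"fmt"
	"sort"
)

// Estimate converts counts of qualitative outcomes into relative frequencies.
// The map keys are the spanning set of outcomes (qualitative);
// the values are the estimates attached to them (quantitative).
// Hypothetical helper for illustration only.
func Estimate(counts map[string]int) map[string]float64 {
	total := 0
	for _, c := range counts {
		total += c
	}
	probs := make(map[string]float64, len(counts))
	if total == 0 {
		return probs // no observations, so nothing to estimate
	}
	for outcome, c := range counts {
		probs[outcome] = float64(c) / float64(total)
	}
	return probs
}

func main() {
	// Invented observations of a service's state
	counts := map[string]int{"up": 6, "degraded": 3, "down": 1}
	probs := Estimate(counts)

	// Sort the outcome names so the printed order is stable
	names := make([]string, 0, len(probs))
	for name := range probs {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%-9s %.2f\n", name, probs[name])
	}
}
```

If a new outcome is added to the set, every estimate shifts, even though nothing about the earlier observations has changed. That dependence on the spanning set is easy to overlook when only the numbers are kept.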
In IT, there are databases galore, vying for our attention and trying to persuade users with impressive features. Yet, you shouldn’t head straight for a particular database and start playing with its query language expecting to find answers. Wisdom lies in understanding how to model and to capture processes with integrity and high or low fidelity. My choice of tools for this series (ArangoDB and Go) was based on trial and error, after five other attempts.