Part 10: Big data, coarse graining, and “effective” approximations
In the previous post, I looked at how large scale properties of graphs could be calculated from smaller scales, and how the graph itself reveals the computation! Before we end this series of posts, we need to finish addressing the elephant in the room, which is scale, in all its interpretations! We already saw in part 8 what happens when we zoom into a graph of even moderate size, when it has dense interconnectivity: we lose our way, and find it hard to track causality. For some, the solution is to use one of several inscrutable Artificial Neural Network techniques being developed in the ML research community, which mistakenly (in my opinion) try to replace graphs with Euclidean embeddings. But there’s a lot we can do without throwing away causal explanation. In this post, I want to show how the concept of semantic spacetime lets us stay in control with a more traditional, causal alternative: we can apply classic methods from statistical mechanics to integrate out degrees of freedom, leaving a controllable approximation over a scale parameter.
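To make the idea of integrating out degrees of freedom concrete, here is a minimal sketch of coarse graining a weighted graph: nodes are merged into “supernodes” according to a grouping map, and the edge weights between groups are summed while links internal to a group are absorbed. The toy graph, the `coarse_grain` helper, and the grouping are my own illustrative assumptions, not code from this series.

```python
# Minimal coarse-graining sketch: merge nodes into supernodes and sum
# the edge weights that cross group boundaries. Edges internal to a
# group are the "degrees of freedom" that get integrated out.

from collections import defaultdict

def coarse_grain(edges, group):
    """Aggregate (u, v, w) edges into supernode edges by summing weights."""
    coarse = defaultdict(float)
    for u, v, w in edges:
        gu, gv = group[u], group[v]
        if gu != gv:  # drop intra-group links (integrated out)
            coarse[tuple(sorted((gu, gv)))] += w
    return dict(coarse)

# A toy 6-node graph with two natural clusters {a,b,c} and {d,e,f},
# weakly coupled by two bridging links.
edges = [("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 1.0),
         ("d", "e", 1.0), ("e", "f", 1.0),
         ("c", "d", 0.5), ("b", "e", 0.5)]
group = {"a": "A", "b": "A", "c": "A", "d": "B", "e": "B", "f": "B"}

print(coarse_grain(edges, group))  # {('A', 'B'): 1.0}
```

The choice of grouping is the scale parameter here: coarser groupings throw away more internal structure in exchange for a smaller, more tractable effective graph.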
Big data — why do we want it?
The fact is that accumulating Big Data mainly became popular when disk prices fell below a certain threshold of affordability in the tech industry. Supply drove demand more than any clear sense of direction did. We can blame tech marketing for some gratuitous over-usage of storage during those times, but, even if data hoarding was advocated indiscriminately in the past, today there’s a higher level of sophistication involved in the use of large caches of data. Perhaps the same can’t be said of CPU time.
Let’s recall the two main reasons for using data:
To seek answers about empirically gathered data, i.e. to determine (repeatable) causal relationships.
To combine data through models in order to generate composite outcomes (e.g. for modern “AI” syntheses) like an expert system.
The former is a traditional view of scientific inquiry — high fidelity signal analysis, as “objective” as can be. In the latter, there’s a kind of artistic license, like making photo-fit mug shots from body parts — we don’t have full causal integrity in the story between input and output, but we apply superpositions over possible “worlds” or “alternative states” to compose an outcome to be judged on the merits of later applicability. Not every data set (large or small) will contain all the information we need to make it meaningful by itself. Semantics are not always apparent from observable dynamics. Nonetheless, we might still be able to combine one source with another, to “imagine” possible ideas, as we do in “AI” methods — again, a bit like a photo-fit face pieced together to identify a criminal suspect.