Semem POC
First-thing morning ramblings, deep sea trawling for actionables again
I have two upcoming deadlines I've set myself:
- #:PoggiPivot 2025-01-05 - get my #:hyperdata project docs sorted
- #:ProofPivot 2025-01-14 - proofs of concept for projects
Both racing towards me. But that's the whole idea, use the purely artificial urgency to motivate myself to focus. Success at achieving the goals is secondary, reflected in the fact I haven't actually written down those criteria yet.
Anyway, since setting these I've made a good start on them for Semem (Semantic Memory). Which has highlighted one particular flaw in my current processes: proof of concept isn't actually the phrase I intended in the above.
Somewhere, probably in a Claude project, hopefully condensed in a vocab, I have (work on) better terms. But I need to get my docs sorted...
The phrase is/should be closer to minimum viable product. But I don't like this phrase because product, in this context, suggests to my mind a marketable commodity. As a dyed-in-the-wool Capitalism-sceptic, I have a knee-jerk reaction to this kind of thing. In society at large there is an overarching conflation between a thing's value in a "noble" utilitarian sense and its bottom-line $ value.
Aside : I have a vague memory of having the same kind of reaction while tech-reviewing some book material by Shelley Powers a good few years ago. (This may be a completely false memory, apologies to Shelley if that's the case, but it is plausible and moreover nicely fits this narrative, "don't let the truth get in the way..." and all that). Shelley is one of my heroes of the Web-tech world, and a bloody good writer. But in this particular chapter she made repeated use of phrases like "business rules". Which I repeatedly criticised. There is a legitimate definition of "business" in this context, to paraphrase "the important stuff". But still in my mind the conflation with business in the common-English commercial activity sense made my hackles rise.
(Btw, I am an experienced reviewer, and key to that role is drawing attention to text that might not be quite right. It might well be fine. But I'd say the diligent reviewer should err a little on the side of caution, flag things where suspicions are raised. As an experienced writer, Shelley would know many such flags were tentative, a legitimate response being to reject them after a cursory mental check, which she probably did in this instance, if it ever happened. For sure, iffy crit is a pain and any crit can be demoralising when you get a downpour (I'd be one of many reviewers doing this). But the patience of Job and thick skin are job requirements.)
Any road... I do hope I did get some clear and accurate terminology nailed down for POC/MVP. Neutral language in the sense of not carrying misleading baggage. What I'm after is more like "demonstration of utility". Clunky, eh.
Semem Demonstration of Utility
I delighted myself with the hubris I got into my working description of #:semem :
Semem is an LLM-compatible context-aware, open-ended graph knowledgebase system combining the advantages of vector embeddings and Linked Data technologies.
But what am I aiming for by 2025-01-14? I actually have an immediate use case with low-end requirements :
A system I can point to my local FS containing scattered markdown notes etc etc so I can use the stuff efficiently.
I reckon a baseline #:Semem needs to :
- provide simple similarity search
- cluster related topics at different scales
- create quasi-formalised annotation (metadata)
- make a disorganised, heterogeneous corpus addressable as a coherent entity
- exploit LLMs (and maybe trad reasoning) to facilitate new knowledge discovery
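To make the first item on that list concrete, here is a rough sketch of simple similarity search over a folder of markdown notes. It is not how Semem does it: plain TF-IDF word counts stand in for real vector embeddings, only the Python standard library is used, and every name in it (`load_notes`, `build_index`, `search`, the `notes` directory) is made up for illustration.

```python
import math
import re
from collections import Counter
from pathlib import Path


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def load_notes(root):
    """Read every .md file under root into a {path: text} dict."""
    return {
        str(p): p.read_text(encoding="utf-8", errors="ignore")
        for p in Path(root).rglob("*.md")
    }


def build_index(notes):
    """Turn each note into a sparse TF-IDF vector (dict of term -> weight)."""
    counts = {name: Counter(tokenize(text)) for name, text in notes.items()}
    doc_freq = Counter()
    for c in counts.values():
        doc_freq.update(c.keys())
    n = len(notes)
    # Smoothed IDF, so a term found in every note still gets a small weight.
    idf = {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = {
        name: {t: tf * idf[t] for t, tf in c.items()}
        for name, c in counts.items()
    }
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, vectors, idf, top_k=3):
    """Return up to top_k (name, score) pairs, best match first."""
    q_counts = Counter(tokenize(query))
    # Query terms that never occur in the corpus are ignored.
    q_vec = {t: tf * idf[t] for t, tf in q_counts.items() if t in idf}
    scored = sorted(
        ((cosine(q_vec, v), name) for name, v in vectors.items()),
        reverse=True,
    )
    return [(name, score) for score, name in scored[:top_k] if score > 0.0]


if __name__ == "__main__":
    # "notes" is a placeholder directory; point it at wherever the markdown lives.
    vectors, idf = build_index(load_notes("notes"))
    for name, score in search("semantic memory", vectors, idf):
        print(f"{score:.3f}  {name}")
```

Swapping the TF-IDF vectors for embeddings from a model would leave `cosine` and `search` more or less unchanged, which is the appeal of starting this small. Clustering, annotation and the Linked Data side are not touched here.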
Ok, I have some actionables. Mari's picking me up in 15 mins to go for a coffee and I'm not yet out of bed.