Last time we constructed a wide variety of Jaccard-like distance functions $d(m,n)$ on the set of all notes in our vault $V = \{ k,l,m,n,\dots \}$. That is, $d(m,n) \geq 0$, and for each triple of notes $k,l,m$ we have a triangle inequality
$$d(k,l)+d(l,m) \geq d(k,m)$$
By construction we had $d(m,n)=d(n,m)$, but we can modify any of these distances by setting $d'(m,n)=\infty$ if there is no path of internal links from note $m$ to note $n$, and $d'(m,n)=d(m,n)$ otherwise. This new generalised distance is no longer symmetric, but still satisfies the triangle inequality, and turns $V$ into a Lawvere space.
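The modification can be tried out on a toy vault. The following is a minimal sketch, assuming the distances are stored in a dictionary keyed by pairs of notes and the internal links in an adjacency dictionary; the names `reachable`, `directed_distance`, `NOTES`, `D` and `LINKS` are hypothetical and the numbers are made up for illustration.

```python
import math

def reachable(links, start):
    """Return the set of notes reachable from `start` by following internal links."""
    seen, stack = {start}, [start]
    while stack:
        note = stack.pop()
        for target in links.get(note, ()):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen

def directed_distance(d, links, notes):
    """Keep d(m, n) if a link path from m to n exists, otherwise use infinity."""
    d_prime = {}
    for m in notes:
        reach = reachable(links, m)
        for n in notes:
            d_prime[(m, n)] = d[(m, n)] if n in reach else math.inf
    return d_prime

def satisfies_triangle(dist, notes):
    """Check dist(k, l) + dist(l, m) >= dist(k, m) for all triples of notes."""
    return all(dist[(k, l)] + dist[(l, m)] >= dist[(k, m)]
               for k in notes for l in notes for m in notes)

# Hypothetical three-note vault with a symmetric distance and links a -> b -> c.
NOTES = ["a", "b", "c"]
D = {(x, x): 0.0 for x in NOTES}
for (x, y), value in {("a", "b"): 0.5, ("b", "c"): 0.5, ("a", "c"): 0.8}.items():
    D[(x, y)] = D[(y, x)] = value
LINKS = {"a": ["b"], "b": ["c"]}

D_PRIME = directed_distance(D, LINKS, NOTES)
```

In this toy vault `D_PRIME[("a", "c")]` keeps its finite value while `D_PRIME[("c", "a")]` becomes infinite, and `satisfies_triangle(D_PRIME, NOTES)` still holds.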
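$V$ becomes an enriched category over the monoidal category $[0,\infty] = \mathbb{R}_+ \cup \{ \infty \}$ (the poset-category for the reverse ordering ($a \rightarrow b$ iff $a \geq b$) with $+$ as ‘tensor product’ and $0$ as unit). The ‘enrichment’ is the map
$$V \times V \rightarrow [0,\infty] \qquad (m,n) \mapsto d(m,n)$$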
Writers (just like children) have always loved colimits. They want to morph their notes into a compelling story. Sadly, such colimits do not always exist yet in our vault category. They are among too many notes still missing from it.
For ordinary categories, the way forward is to ‘upgrade’ your category to the presheaf category. In it, ‘the child can cobble together crazy constructions to his heart’s content’. For our ‘enriched’ vault $V$ we should look at the (enriched) category $\widehat{V}$ of enriched presheaves. In it, the writer will find inspiration on how to cobble together her texts.
An enriched presheaf is a map $p : V \rightarrow [0,\infty]$ such that for all notes $m,n \in V$ we have
$$d(m,n)+p(n) \geq p(m)$$
Think of $p(n)$ as the distance (or similarity) of the virtual note $p$ to the existing note $n$; then this condition is just an extension of the triangle inequality. The lower the value of $p(n)$, the more closely $p$ resembles $n$.
Each note $n \in V$ determines its Yoneda presheaf $y_n : V \rightarrow [0,\infty]$ by $m \mapsto d(m,n)$. By the triangle inequality this is indeed an enriched presheaf in $\widehat{V}$.
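The set $\widehat{V}$ of all enriched presheaves has a lot of extra structure. It is a poset
$$p \leq q \quad \text{iff} \quad \forall m \in V~:~p(m) \leq q(m)$$
with minimal element the constant presheaf $0$, and maximal element the constant presheaf $\infty$.
It is even a lattice with $(p \vee q)(n) = \max(p(n),q(n))$ and $(p \wedge q)(n) = \min(p(n),q(n))$. It is easy to check that $p \vee q$ and $p \wedge q$ are again enriched presheaves.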
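These claims are easy to test numerically. Here is a small sketch, assuming a hypothetical three-note vault with integer distances; the names `NOTES`, `D`, `is_presheaf`, `yoneda`, `join` and `meet` are introduced only for illustration. A presheaf is stored as a dictionary from notes to values in $[0,\infty]$.

```python
import math
from itertools import product

# Hypothetical symmetric distances satisfying the triangle inequality.
NOTES = ["k", "l", "m"]
D = {(a, a): 0 for a in NOTES}
for (a, b), value in {("k", "l"): 1, ("l", "m"): 2, ("k", "m"): 3}.items():
    D[(a, b)] = D[(b, a)] = value

def is_presheaf(p):
    """Check the presheaf condition d(m, n) + p(n) >= p(m) for all notes m, n."""
    return all(D[(m, n)] + p[n] >= p[m] for m, n in product(NOTES, repeat=2))

def yoneda(n):
    """The Yoneda presheaf of note n, sending m to d(m, n)."""
    return {m: D[(m, n)] for m in NOTES}

def join(p, q):
    """Pointwise maximum of two presheaves."""
    return {n: max(p[n], q[n]) for n in NOTES}

def meet(p, q):
    """Pointwise minimum of two presheaves."""
    return {n: min(p[n], q[n]) for n in NOTES}

ZERO = {n: 0 for n in NOTES}             # minimal element
TOP = {n: math.inf for n in NOTES}       # maximal element
NOT_A_PRESHEAF = {"k": 0, "l": 5, "m": 0}  # violates d(l, k) + p(k) >= p(l)
```

With these definitions `is_presheaf` accepts every `yoneda(n)`, their joins and meets, and both constant maps, while it rejects `NOT_A_PRESHEAF`.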
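Here’s $\widehat{V}$ when the vault consists of just two notes $V = \{ m,n \}$ of non-zero distance to each other (whether symmetric or not) as a subset of $[0,\infty] \times [0,\infty]$.
This vault $\widehat{V}$ of all missing (and existing) notes is again enriched over $[0,\infty]$ via
$$\widehat{d} : \widehat{V} \times \widehat{V} \rightarrow [0,\infty] \qquad \widehat{d}(p,q) = \max\big(0, \sup_{m \in V}(q(m)-p(m))\big)$$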
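The triangle inequality follows because $\widehat{d}(p,q)$ is, by definition, the smallest value in $[0,\infty]$ such that $\widehat{d}(p,q)+p(m) \geq q(m)$ for all $m \in V$. Even if we start from a symmetric distance function on $V$, it is clear that this extended distance on $\widehat{V}$ is far from symmetric. The Yoneda map
$$y : V \rightarrow \widehat{V} \qquad n \mapsto y_n$$
is an isometry and the enriched version of the Yoneda lemma says that for all $p \in \widehat{V}$
$$p(n) = \widehat{d}(y_n,p)$$
Indeed, taking $m=n$ in $\widehat{d}(y_n,p)+y_n(m) \geq p(m)$ gives $\widehat{d}(y_n,p) \geq p(n)$, because $y_n(n)=d(n,n)=0$. Conversely,
from the presheaf condition $d(m,n)+p(n) \geq p(m)$ for all $m,n$ follows
$$p(n) \geq \max\big(0, \sup_{m \in V}(p(m)-d(m,n))\big) = \widehat{d}(y_n,p)$$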
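A quick numerical check of the Yoneda lemma and of the isometry claim, as a sketch under the assumption of a hypothetical three-note vault with integer distances. The helper names `d_hat` and `yoneda` are introduced here for illustration; the case distinction in `d_hat` only serves to avoid computing $\infty - \infty$.

```python
import math

# Hypothetical symmetric distances satisfying the triangle inequality.
NOTES = ["k", "l", "m"]
D = {(a, a): 0 for a in NOTES}
for (a, b), value in {("k", "l"): 1, ("l", "m"): 2, ("k", "m"): 3}.items():
    D[(a, b)] = D[(b, a)] = value

def yoneda(n):
    """The Yoneda presheaf of note n, sending m to d(m, n)."""
    return {m: D[(m, n)] for m in NOTES}

def d_hat(p, q):
    """max(0, sup over m of q(m) - p(m)), with a guard against inf - inf."""
    gaps = []
    for m in NOTES:
        if q[m] <= p[m]:
            gaps.append(0)
        elif q[m] == math.inf:
            gaps.append(math.inf)
        else:
            gaps.append(q[m] - p[m])
    return max(gaps)

# A presheaf that is not a Yoneda presheaf: the meet of y_k and y_m.
P = {n: min(yoneda("k")[n], yoneda("m")[n]) for n in NOTES}

YONEDA_LEMMA_HOLDS = all(d_hat(yoneda(n), P) == P[n] for n in NOTES)
YONEDA_IS_ISOMETRY = all(d_hat(yoneda(a), yoneda(b)) == D[(a, b)]
                         for a in NOTES for b in NOTES)
```

In this example `d_hat(yoneda("k"), P)` is `0` whereas `d_hat(P, yoneda("k"))` is `3`, which illustrates that the extended distance is not symmetric even though `D` is.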
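For every subset of notes $X \subset V$ we can consider the presheaf (use the triangle inequality)
$$p_X : V \rightarrow [0,\infty] \qquad p_X(m) = \inf_{n \in X} d(m,n)$$
then its zero set $Z(p_X) = \{ m \in V~:~p_X(m)=0 \}$ can be thought of as the closure of $X$, and the collection of all such closed subsets defines a topology on $V$.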
In our simple example of the two-note vault $V = \{ m,n \}$ this is just the discrete topology, but we can get more interesting spaces. If $d(n,m)=0$ but $d(m,n) > 0$
we get the Sierpinski space: $n$ is the only closed point, and lies in the closure of $m$. Of course, if your vault contains thousands of notes, you might get more interesting topologies.
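The Sierpinski example can be reproduced in a few lines. This is only a sketch: the two-note distance table `D` below is a hypothetical choice with $d(n,m)=0$ and $d(m,n)=1$, and the names `closure` and `closed_sets` are introduced for illustration. The infimum over an empty set is taken to be $\infty$, so the empty set is closed.

```python
import math
from itertools import combinations

NOTES = ["m", "n"]
# Hypothetical non-symmetric distance: d(n, m) = 0 but d(m, n) = 1.
D = {("m", "m"): 0, ("n", "n"): 0, ("n", "m"): 0, ("m", "n"): 1}

def closure(subset):
    """Zero set of p_X, where p_X(m) is the infimum of d(m, n) over n in X."""
    return {m for m in NOTES
            if min((D[(m, n)] for n in subset), default=math.inf) == 0}

def closed_sets():
    """All subsets of the vault that coincide with their own closure."""
    subsets = [set(c) for size in range(len(NOTES) + 1)
               for c in combinations(NOTES, size)]
    return [x for x in subsets if closure(x) == x]

CLOSED = closed_sets()
```

Here `closure({"n"})` is just `{"n"}`, while `closure({"m"})` is the whole vault, so `{"m"}` does not appear among the closed sets.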
In the special case when $V$ is a poset-category, as was the case in the shape of languages post, this topology is the down-set (or up-set) topology.
Now, what is this topology when you start with the Lawvere-space $\widehat{V}$? From the definitions we see that
$$\widehat{d}(p,q) = 0 \quad \text{iff} \quad \forall m \in V~:~p(m) \geq q(m)$$
So, all presheaves in the up-set $\uparrow p$ lie in the closure of $p$, and $p$ lies in the closure of everything in the down-set $\downarrow p$ of $p$. So, this time the topology has as its closed sets all down-sets of the poset $\widehat{V}$.
What’s missing is a good definition for the implication $p \Rightarrow q$ between two enriched presheaves $p,q \in \widehat{V}$. In An enriched category theory of language: from syntax to semantics it is said that this should be, perhaps only to be used in their special poset situation (with adapted notations)
but I can’t even show that this is a presheaf. I may be horribly wrong, but in their proof of this (lemma 5) they seem to use their lemma 4, but with the two factors swapped.
If you have suggestions, please let me know. And if you throw Kelly’s Basic concepts of enriched category theory at me, please add some guidelines on how to use it. I’m just a passer-by.
Lately, I’ve been reading up a bit on psychoanalysis, tried to get through Grothendieck’s La clef des songes (the key to dreams), and I’m in the process of writing a series of blogposts on how to construct a topos of the unconscious.
Cormac McCarthy’s novel Stella Maris is set in 1972, when the math-prodigy Alicia Western, suffering from hallucinations, admits herself to a psychiatric hospital, carrying a plastic bag containing forty thousand dollars. The book consists entirely of dialogues, the transcripts of seven sessions with her psychiatrist Dr. Cohen (nomen est omen).
Alicia is a doctoral candidate at the University of Chicago who got a scholarship to visit the IHES to work with Grothendieck on toposes.
During the psychiatric sessions, they talk on a wide variety of topics, including the nature of mathematics, quantum mechanics, music theory, dreams, and the unconscious (and its role in doing mathematics).
The core question is not how you do math but how does the unconscious do it. How it is that it’s demonstrably better at it than you are? You work on a problem and then you put it away for a while. But it doesnt go away. It reappears at lunch. Or while you’re taking a shower. It says: Take a look at this. What do you think? Then you wonder why the shower is cold. Or the soup. Is this doing math? I’m afraid it is. How is it doing it? We dont know. How does the unconscious do math? (page 99)
Before going to the IHES she had to send Grothendieck a paper (‘It was an explication of topos theory that I thought he probably hadn’t considered.’ page 136, and ‘while it proved three problems in topos theory it then set about dismantling the mechanism of the proofs.’ page 151). At the IHES ‘I met three men that I could talk to: Grothendieck, Deligne, and Oscar Zariski.’ (page 136).
I don’t know whether Zariski visited the IHES in the early seventies. Most historical allusions (to Grothendieck’s life, his role in Bourbaki etc.) are correct, and the ‘Langlands project’ that Alicia mentions (page 66) may very well have been the talk of the town at the IHES in 1972, but the mention of Witten, ‘Grothendieck writes everything down. Witten nothing.’ (page 100), raised an eyebrow.
The book also contains these two nice attempts to capture some of the essence of topos theory:
When you get to topos theory you are at the edge of another universe.
You have found a place to stand where you can look back at the world from nowhere. It’s not just some gestalt. It’s fundamental. (page 13)
You asked me about Grothendieck. The topos theory he came up with is a witches’ brew of topology and algebra and mathematical logic.
It doesnt even have a clear identity. The power of the theory is still speculative. But it’s there.
You have a sense that it is waiting quietly with answers to questions that nobody has asked yet. (page 68)
I did read ‘The Passenger’ first, which is probably the better order, as you’d then already know some of the ghosts haunting Alicia, but it’s not a must if you are only interested in their discussions about the nature of mathematics. Be warned that it is a pretty dark book, better not read when you’re already feeling low, and it should come with a link to a suicide prevention line.
This allowed us to view the text-snippets as points in a Lawvere pseudoquasi metric space, and to define a ‘topos’ of enriched presheaves on it, including the Yoneda-presheaves containing semantic information of the snippets.
In Obsidian, a vault is a collection of notes (with their tags and other meta-data), together with all links between them.
The vault of the language-poset will have one note for every text-snippet, and have a link from note $n$ to note $m$ if $m$ is a text-fragment in $n$.
In their paper, Bradley, Terilla and Vlassopoulos use the enrichment structure where $\mu(n,m) \in [0,1]$ is the conditional probability of the fragment $m$ to be extended to the larger text $n$.
Most Obsidian vaults are a lot more complicated, possibly having oriented cycles in their internal link structure.
Still, it is always possible to turn the notes of the vault into a category enriched over $[0,1]$, in multiple ways, depending on whether we want to focus on the internal link-structure or rather on the semantic similarity between notes, or any combination of these.
Let $X$ be a set of searchable data from your vault. Elements of $X$ may be words, tags, links, and so on.
Assign a non-negative real number $r_x \geq 0$ to every $x \in X$. We see $r_x$ as the ‘relevance’ we attach to the search term $x$. So, it is possible to emphasise certain key-words or tags, find certain links more important than others, and so on.
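For this relevance function $r : X \rightarrow \mathbb{R}_+$, we have a function defined on all subsets $Y$ of $X$
$$f_r~:~\mathcal{P}(X) \rightarrow \mathbb{R}_+ \qquad Y \mapsto f_r(Y) = \sum_{x \in Y} r_x$$
Take a note $n$ from the vault $V$ and let $X_n$ be the set of search terms from $X$ contained in $n$.
We can then define a (generalised) Jaccard distance for any pair of notes $n$ and $m$ in $V$:
$$d_r(n,m) = \begin{cases} 0 & \text{if $f_r(X_n \cup X_m)=0$} \\ 1-\dfrac{f_r(X_n \cap X_m)}{f_r(X_n \cup X_m)} & \text{otherwise} \end{cases}$$
This distance is symmetric, $d_r(n,n)=0$ for all notes $n$, and the crucial property is that it satisfies the triangle inequality, that is, for all triples of notes $l$, $m$ and $n$ we have
$$d_r(l,n) \leq d_r(l,m)+d_r(m,n)$$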
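As a rough illustration of such a relevance-weighted Jaccard distance, here is a minimal sketch. The relevance values, the search terms and the note names are all made up, and the convention of returning `0.0` when the union carries no weight is an assumption; `weight` and `jaccard_distance` are hypothetical helper names.

```python
from itertools import product

def weight(relevance, terms):
    """Total relevance of a set of search terms."""
    return sum(relevance[x] for x in terms)

def jaccard_distance(relevance, terms_n, terms_m):
    """Relevance-weighted Jaccard distance between two sets of search terms."""
    union = weight(relevance, terms_n | terms_m)
    if union == 0:
        # Assumed convention: notes without any relevant term are at distance 0.
        return 0.0
    return 1 - weight(relevance, terms_n & terms_m) / union

# Hypothetical relevance function and search terms per note.
RELEVANCE = {"topos": 2, "sheaf": 1, "vault": 1}
TERMS = {
    "note-a": {"topos", "sheaf"},
    "note-b": {"topos", "vault"},
    "note-c": {"vault"},
}

DIST = {(n, m): jaccard_distance(RELEVANCE, TERMS[n], TERMS[m])
        for n, m in product(TERMS, repeat=2)}

TRIANGLE_HOLDS = all(DIST[(l, n)] <= DIST[(l, m)] + DIST[(m, n)] + 1e-12
                     for l, m, n in product(TERMS, repeat=3))
```

Because `"topos"` counts double here, `note-a` and `note-b` end up at distance `0.5`, closer than the unweighted count of shared terms would suggest.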
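How does this help to make the vault $V$ into a category enriched over $[0,1]$?
The poset $([0,1],\leq)$ is the category with objects all numbers $a \in [0,1]$, and a unique morphism $a \rightarrow b$ between two numbers iff $a \leq b$. This category has limits (infs) and colimits (sups), has a monoidal structure $a \otimes b = a \times b$ with unit object $1$, and an internal hom
$$Hom_{[0,1]}(a,b) = \begin{cases} \dfrac{b}{a} & \text{if $b \leq a$} \\ 1 & \text{otherwise} \end{cases}$$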
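What makes this an internal hom is the adjunction $c \times a \leq b$ iff $c \leq Hom(a,b)$. The sketch below checks it on a small grid of exact fractions, to avoid rounding issues. The function names are hypothetical, and returning $1$ when $a = 0$ is an assumption made here to sidestep the quotient $0/0$.

```python
from fractions import Fraction
from itertools import product

def tensor(a, b):
    """Monoidal product on [0, 1]: ordinary multiplication."""
    return a * b

def internal_hom(a, b):
    """b / a if b <= a (and a > 0), otherwise 1."""
    if b <= a and a > 0:
        return b / a
    # Covers b > a, and the assumed convention for a == 0.
    return Fraction(1)

# Exact grid 0, 1/4, 1/2, 3/4, 1.
GRID = [Fraction(i, 4) for i in range(5)]

ADJUNCTION_HOLDS = all(
    (tensor(c, a) <= b) == (c <= internal_hom(a, b))
    for a, b, c in product(GRID, repeat=3)
)
```

On this grid `ADJUNCTION_HOLDS` evaluates to `True`, and for instance `internal_hom(Fraction(1, 2), Fraction(1, 4))` is `Fraction(1, 2)`.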
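We say that the vault is an enriched category over $[0,1]$ if for every pair of notes $n$ and $m$ we have a number $\mu(n,m) \in [0,1]$ satisfying
$$\mu(n,n)=1 \quad \text{and} \quad \mu(m,l) \times \mu(n,m) \leq \mu(n,l)$$
for all notes $n$ and all triples of notes $l,m$ and $n$.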
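Starting from any relevance function $r : X \rightarrow \mathbb{R}_+$ we define for every pair $n$ and $m$ of notes the distance function $d_r(m,n)$ satisfying the triangle inequality. If we now take
$$\mu_r(m,n) = e^{-d_r(m,n)}$$
then the triangle inequality translates for every triple of notes $l,m$ and $n$ into
$$\mu_r(m,l) \times \mu_r(n,m) \leq \mu_r(n,l)$$
That is, every relevance function makes $V$ into a category enriched over $[0,1]$.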
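The passage from a distance to an enrichment can be sketched as follows, assuming the standard exponential map $\mu = e^{-d}$ and a hypothetical three-note distance table. The names `enrichment` and `is_enriched` are introduced for illustration, and the small tolerance only absorbs floating-point rounding.

```python
import math
from itertools import product

def enrichment(d):
    """Turn a distance table into enrichment values mu = exp(-d)."""
    return {pair: math.exp(-value) for pair, value in d.items()}

def is_enriched(mu, notes, tol=1e-12):
    """Check mu(n, n) = 1 and mu(m, l) * mu(n, m) <= mu(n, l) for all triples."""
    unit = all(mu[(n, n)] == 1.0 for n in notes)
    composition = all(mu[(m, l)] * mu[(n, m)] <= mu[(n, l)] + tol
                      for l, m, n in product(notes, repeat=3))
    return unit and composition

# Hypothetical symmetric distance satisfying the triangle inequality.
NOTES = ["a", "b", "c"]
D = {(x, x): 0.0 for x in NOTES}
for (x, y), value in {("a", "b"): 0.5, ("b", "c"): 0.5, ("a", "c"): 0.8}.items():
    D[(x, y)] = D[(y, x)] = value

MU = enrichment(D)

# The same table with d(a, c) = 2.0 violates the triangle inequality.
D_BAD = dict(D)
D_BAD[("a", "c")] = D_BAD[("c", "a")] = 2.0
MU_BAD = enrichment(D_BAD)
```

Here `is_enriched(MU, NOTES)` holds, while the table `MU_BAD`, built from a distance that breaks the triangle inequality, fails the composition condition.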
Two simple relevance functions, and their corresponding distance and enrichment functions are available from Obsidian’s Graph Analysis community plugin.
To get structural information on the link-structure take as $X$ the set of all incoming and outgoing links in your vault, with relevance function the constant function $1$.
‘Jaccard’ in Graph Analysis computes for the current note $n$ the value of $1-d_r(n,m)$ for all notes $m$, so if this value is $a$, then the corresponding enrichment value is $\mu_r(m,n)=e^{a-1}$.
To get semantic information on the similarity between notes, let $X$ be the set of all words in all notes and take again as relevance function the constant function $1$.
To access ‘BoW’ (Bags of Words) in Graph Analysis, you must first install the (non-community) NLP plugin which enables various types of natural language processing in the vault. The install is best done via the BRAT plugin (perhaps I’ll do a couple of posts on Obsidian someday).
If it gives for the current note $n$ the value $a$ for a note $m$, then again we can take as the enrichment structure $\mu_r(n,m)=e^{a-1}$.
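Converting such similarity values by hand is a one-liner. The sketch below assumes the values have been copied manually from the plugin's panel into a dictionary; the note names and scores are made up, and `enrichment_from_similarity` is a hypothetical helper, not part of Graph Analysis.

```python
import math

def enrichment_from_similarity(a):
    """Map a similarity value a = 1 - d in [0, 1] to exp(a - 1) = exp(-d)."""
    return math.exp(a - 1)

# Hypothetical similarity values of the current note to three other notes.
SCORES = {"note-a": 1.0, "note-b": 0.25, "note-c": 0.0}

MU_CURRENT = {note: enrichment_from_similarity(a) for note, a in SCORES.items()}
```

A similarity of `1.0` gives the enrichment value `1.0`, and a similarity of `0.0` gives $e^{-1}$, the smallest value this construction can produce.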
Graph Analysis offers more functionality, and a good introduction is given in this clip:
Calculating the enrichment data for custom designed relevance functions takes a lot more work, but is doable. Perhaps I’ll return to this later.
Mathematically, it is probably more interesting to start with a given enrichment structure on the vault $V$, describe the category $\widehat{V}$ of all enriched presheaves and find out what we can do with it.