
Learners’ logic

In the Learners and Poly-post we’ve seen that learners from $A$ to $B$ correspond to set-valued representations of a directed graph $G$ and therefore form a presheaf topos.

Any topos comes with its Mitchell-Benabou language, allowing us to speak of formulas, propositions and their truth values. Two objects play a special role in this: the terminal object $1$, and the subobject classifier $\Omega$. It is a fun exercise to determine these special learners.

$T$ is the free rooted tree with branches sprouting from every node $n \in T_0$, one for each element in $A \times B$. $C$ will be our set of colours, one for each element of $Maps(A,B) \times Maps(A \times B, A)$.

For every map $\lambda : T_0 \rightarrow C$ we get a coloured rooted tree $T_\lambda$, and for each branch $(a,b)$ from the root we get another rooted sub-tree $T_\lambda(a,b)$ which is again of the form $T_\mu$ for a certain map $\mu : T_0 \rightarrow C$.
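A small Python sketch may help fix the combinatorics (the tuple encoding of tree nodes and the names `nodes` and `subtree_colouring` are my own, not from the post): nodes of the free rooted tree are tuples of branch labels, a colouring is any map from nodes to colours, and re-rooting at a branch $(a,b)$ gives the colouring $\mu$ of the sub-tree $T_\lambda(a,b)$.

```python
from itertools import product

# Hypothetical finite data: branches of T are labelled by A x B.
A = [0, 1]
B = [0, 1]
BRANCHES = list(product(A, B))

def nodes(depth):
    """Nodes of the free rooted tree T up to a finite depth, encoded as
    tuples of branch labels; the root is the empty tuple."""
    result = [()]
    for d in range(1, depth + 1):
        result += [tuple(p) for p in product(BRANCHES, repeat=d)]
    return result

def subtree_colouring(lam, branch):
    """The colouring mu of the sub-tree T_lambda(a,b): the node w of
    T_mu is the node (a,b).w of T_lambda."""
    return lambda w: lam((branch,) + w)
```

With $\#A = \#B = 2$ every node has $4$ branches, so up to depth $2$ there are $1 + 4 + 16$ nodes.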

The directed graph $G$ has a vertex $v_\lambda \in V$ for each isomorphism class of coloured rooted trees $T_\lambda$, and a directed edge $v_\lambda \rightarrow v_\mu$ if $T_\mu$ is the isomorphism class of the coloured rooted sub-tree $T_\lambda(a,b)$ for some $(a,b) \in A \times B$.

There are exactly $\#(A \times B)$ directed edges leaving every vertex in $G$, but there may be (many) more incoming edges. We can colour each vertex $v_\lambda$ with the colour of the root of $T_\lambda$.



The coloured directed graph $G$ depicts the learning process of a neural network being trained to find a suitable map $A \rightarrow B$. The colour of a vertex $v_\lambda$ gives a map $f \in Maps(A,B)$ (and a request function). If the network now gives as output $b \in B$ for a given input $a \in A$, we can move on to the end-vertex $v_\mu$ of the directed edge labelled $(a,b)$ out of $v_\lambda$. The colour of $v_\mu$ gives us a new (hopefully improved) map $f_{new} \in Maps(A,B)$ (and a new request function). A new training pair $(a,b)$ brings us to a new vertex and map, and so on.
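This walk on $G$ can be sketched in a few lines of Python (a hypothetical encoding, with my own names `subtree` and `walk`): a vertex $v_\lambda$ is represented by its colouring $\lambda$, the colour at the root packs $(f, \text{request})$, and each training pair moves us along the corresponding edge.

```python
def subtree(lam, branch):
    """Move along the edge labelled (a,b): the colouring of T_lambda(a,b)."""
    return lambda w: lam((branch,) + w)

def walk(lam, stream):
    """Feed training pairs (a,b) one by one and record the colour at
    every vertex visited, i.e. the successive candidate maps."""
    trace = [lam(())]
    for ab in stream:
        lam = subtree(lam, ab)
        trace.append(lam(()))
    return trace
```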

Clearly, some parts of $G$ are more efficient at finding the desired map than others, and the aim of the game is to distinguish efficient from inefficient learners. This is a first hint that Grothendieck topologies and their corresponding sheafifications will turn out to be important.

We’ve seen that a learner, that is, a morphism $Py^P \rightarrow Cy^{A \times B}$ in $\mathbf{Poly}$, assigns a set $P_\lambda$ to every vertex $v_\lambda$ (this set may be empty) and a map $P_\lambda \rightarrow P_\mu$ to every directed edge $v_\lambda \rightarrow v_\mu$ in $G$.

The terminal object $1$ in this setting assigns to each vertex a singleton $\{ \ast \}$, and the obvious maps for each directed edge. In $\mathbf{Poly}$-speak, the terminal object is the morphism
$$1~:~Vy^V \rightarrow Cy^{A \times B}$$
which sends each vertex $v_\lambda \in V$ to its colour $c \in C$, and where the backtrack map $\varphi^{\#}_{v_\lambda}[c]$ maps $(a,b)$ to $v_\mu$ if this is the end-vertex of the edge labelled $(a,b)$ out of $v_\lambda$. That is, $1$ contains all information about the coloured directed graph $G$.
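Concretely, the two components of $1$ amount to a colour lookup and an edge lookup. A minimal sketch on a tiny hand-made graph (all data below is hypothetical, just to make the shapes visible):

```python
# A tiny coloured directed graph G: vertex colours, and for each vertex
# the end-vertex of the edge labelled (a,b) out of it.
colour = {"u": "red", "v": "blue"}
target = {("u", (0, 0)): "v", ("u", (0, 1)): "u",
          ("v", (0, 0)): "u", ("v", (0, 1)): "v"}

def one_positions(vertex):
    """Forward part of 1 : Vy^V -> Cy^{AxB}: send v_lambda to its colour."""
    return colour[vertex]

def one_backtrack(vertex, ab):
    """Backtrack map: the direction (a,b) at the colour of v_lambda pulls
    back to the end-vertex of the edge labelled (a,b) out of v_lambda."""
    return target[(vertex, ab)]
```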

The subobject classifier $\Omega$ assigns to each vertex $v_\lambda$ the set $\Omega(v_\lambda)$ of all subsets $S$ of directed paths in $G$ starting at $v_\lambda$, such that if $p \in S$ then all prolongations of $p$ also belong to $S$. Note that the empty set satisfies this requirement, so it is an element of this vertex set. Another special element in $\Omega(v_\lambda)$ is the set $1_\lambda$ of all directed paths starting at $v_\lambda$.

$\Omega(v_\lambda)$ is a Heyting algebra with $1 = 1_\lambda$ and $0 = \emptyset$, partially ordered via inclusion, with logical operations $\wedge$ (intersection), $\vee$ (union), $\neg$ (where $\neg S$ is the largest $S' \in \Omega(v_\lambda)$ disjoint from $S$) and $\Rightarrow$, where $S \Rightarrow S'$ is the union of all $S'' \in \Omega(v_\lambda)$ such that $S'' \wedge S \subseteq S'$.
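These operations are computable once we truncate the paths at a finite length (a sketch under that assumption; the real $\Omega(v_\lambda)$ contains arbitrarily long paths, and all names below are my own):

```python
from itertools import product

# Truncated sketch of Omega(v_lambda): directed paths out of v_lambda,
# encoded as tuples of edge labels, cut off at length 3.
LABELS = [(0, 0), (0, 1)]          # hypothetical A x B
PATHS = [tuple(p) for n in (1, 2, 3) for p in product(LABELS, repeat=n)]

def prolongs(q, p):
    """q prolongs p (extends it; q == p allowed)."""
    return q[:len(p)] == p

def up(paths):
    """Smallest prolongation-closed set containing `paths`."""
    return frozenset(q for q in PATHS if any(prolongs(q, p) for p in paths))

TOP = frozenset(PATHS)             # 1_lambda
BOT = frozenset()                  # 0

# /\ and \/ are plain intersection and union (both preserve closure).
def neg(S):
    """Largest prolongation-closed set disjoint from S: keep p only if
    no prolongation of p lies in S."""
    return frozenset(p for p in PATHS if not any(prolongs(q, p) for q in S))

def implies(S, S2):
    """S => S2 : union of all S'' with S'' /\ S contained in S2."""
    return frozenset(p for p in PATHS
                     if all(q in S2 for q in S if prolongs(q, p)))
```

For instance, if $S$ is generated by one path of length two, then the length-one prefix of that path lies in neither $S$ nor $\neg S$, so $S \vee \neg S \neq 1$.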



$S \vee \neg S$ is not always equal to $1$. Here, the union misses the left edge from the root. So, we will not be able to prove things by contradiction.

If $v_\lambda \rightarrow v_\mu$ is the directed edge labelled $(a,b)$, then the corresponding map $\Omega(v_\lambda) \rightarrow \Omega(v_\mu)$ takes an $S \in \Omega(v_\lambda)$, drops all paths which do not pass through $v_\mu$, and removes from those that do the initial edge $(a,b)$. If no paths in $S$ pass through $v_\mu$, then $S$ is mapped to $\emptyset \in \Omega(v_\mu)$.
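With the same tuple encoding of paths as above, this restriction map is a one-liner (a sketch; the name `restrict` is mine):

```python
def restrict(S, ab):
    """Omega(v_lambda) -> Omega(v_mu) along the edge labelled (a,b):
    keep the paths of S that pass through v_mu, i.e. start with the
    edge (a,b), and strip that initial edge; if no path in S passes
    through v_mu the image is the empty set."""
    return frozenset(p[1:] for p in S if p[:1] == (ab,))
```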

If $\Omega = \bigsqcup_\lambda \Omega(v_\lambda)$, then the subobject classifier is the morphism in $\mathbf{Poly}$
$$\Omega~:~\Omega y^{\Omega} \rightarrow Cy^{A \times B}$$
sending a set of paths starting in $v_\lambda$ to the colour of $v_\lambda$, and where the backtrack map of $(a,b)$ gives the image of that set under the map $\Omega(v_\lambda) \rightarrow \Omega(v_\mu)$.

Ok, let’s define the Learner’s Mitchell-Benabou language.

We’ll view a learner $Py^P \rightarrow Cy^{A \times B}$ as a set-valued representation $P$ of the directed graph $G$, with the vertex set $P_\lambda$ placed at vertex $v_\lambda$.

A formula $\phi(p)$ of the language with a free variable $p$ is a morphism (of representations of $G$) from a learner $P$ to the subobject classifier
$$\phi~:~P \rightarrow \Omega$$
Such a morphism determines a sub-representation of $P$, which we can denote $\{ p | \phi(p) \}$, with vertex sets
$$\{ p | \phi(p) \}_\lambda = \{ p \in P_\lambda~|~\phi(v_\lambda)(p) = 1_\lambda \}$$
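At a single vertex this is just a filter: keep the $p$ whose truth value is the top element $1_\lambda$. A sketch with hypothetical finite data (the dictionaries and the name `truth_set` are mine, with the string `"1"` standing in for $1_\lambda$):

```python
# P_lambda per vertex, and phi(v_lambda)(p) as an element of
# Omega(v_lambda); "1" stands in for the top element 1_lambda.
P = {"u": {1, 2, 3}, "v": {4}}
phi = {("u", 1): "1", ("u", 2): "S", ("u", 3): "1", ("v", 4): "S"}

def truth_set(vertex):
    """{p | phi(p)}_lambda: the p in P_lambda with phi(v_lambda)(p) = 1_lambda."""
    return {p for p in P[vertex] if phi[(vertex, p)] == "1"}
```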

On formulas we can apply logical connectives to get more formulas. For example, the formula $\phi(p) \wedge \psi(q)$ is the composition
$$P \times Q \xrightarrow{\phi \times \psi} \Omega \times \Omega \xrightarrow{\wedge} \Omega$$

By quantifying all free variables we get a formula without free variables, and those correspond to morphisms $1 \rightarrow \Omega$, that is, to sub-representations of the terminal object $1$.

For example, if $\phi(p)$ is the formula with free variable $p$ corresponding to the morphism $\phi : P \rightarrow \Omega$, then we have
$$\forall p : \phi(p) = \{ v_\lambda \in V~|~\{ p | \phi(p) \}_\lambda = P_\lambda \}$$
and
$$\exists p : \phi(p) = \{ v_\lambda \in V~|~\{ p | \phi(p) \}_\lambda \neq \emptyset \}$$
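Both quantifiers therefore just pick out a set of vertices. A sketch with hypothetical data (note that a vertex with $P_\lambda = \emptyset$ satisfies the universal formula vacuously but not the existential one):

```python
# P_lambda per vertex, and the precomputed truth sets {p | phi(p)}_lambda.
P = {"u": {1, 2}, "v": {3}, "w": set()}
truth = {"u": {1, 2}, "v": set(), "w": set()}

forall_phi = {v for v in P if truth[v] == P[v]}   # every p satisfies phi
exists_phi = {v for v in P if truth[v]}           # some p satisfies phi
```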

Sub-representations of $1$ again form a Heyting algebra in the obvious way, so we can assign a “truth value” to a formula without free variables as that sub-object of $1$.

There’s a lot more to say, so perhaps this will be continued.
