<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Poly &#8211; neverendingbooks</title>
	<atom:link href="https://lievenlebruyn.github.io/neverendingbooks/tag/poly/feed/" rel="self" type="application/rss+xml" />
	<link>https://lievenlebruyn.github.io/neverendingbooks/</link>
	<description></description>
	<lastBuildDate>Sat, 31 Aug 2024 11:08:25 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.6.1</generator>
	<item>
		<title>Learners and Poly</title>
		<link>https://lievenlebruyn.github.io/neverendingbooks/learners-and-poly/</link>
					<comments>https://lievenlebruyn.github.io/neverendingbooks/learners-and-poly/#comments</comments>
		
		<dc:creator><![CDATA[lieven]]></dc:creator>
		<pubDate>Sat, 29 Jan 2022 10:03:09 +0000</pubDate>
				<category><![CDATA[geometry]]></category>
		<category><![CDATA[math]]></category>
		<category><![CDATA[Gavranovic]]></category>
		<category><![CDATA[learners]]></category>
		<category><![CDATA[neural networks]]></category>
		<category><![CDATA[Poly]]></category>
		<category><![CDATA[Spivak]]></category>
		<category><![CDATA[topos]]></category>
		<guid isPermaLink="false">http://www.neverendingbooks.org/?p=10133</guid>

					<description><![CDATA[Brendan Fong, David Spivak and Remy Tuyeras cooked up a vast generalisation of neural networks in their paper Backprop as Functor: A compositional perspective on&#8230;]]></description>
										<content:encoded><![CDATA[<p>Brendan Fong, David Spivak and Remy Tuyeras cooked up a vast generalisation of neural networks in their paper <a href="https://arxiv.org/abs/1711.10455">Backprop as Functor: A compositional perspective on supervised learning</a>.</p>
<p>Here&#8217;s a nice introduction to neural networks for category theorists by <a href="https://www.brunogavranovic.com/">Bruno Gavranovic</a>. At 1:49 he tries to explain supervised learning with neural networks in one slide. Learners show up later in the talk.</p>
<p><iframe width="560" height="315" src="https://www.youtube.com/embed/ji8MHKlQZ9w?start=109" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></p>
<p>$\mathbf{Poly}$ is the category of all polynomial functors, that is, things of the form<br />
\[<br />
p = \sum_{i \in p(1)} y^{p[i]}~:~\mathbf{Sets} \rightarrow \mathbf{Sets} \qquad S \mapsto \bigsqcup_{i \in p(1)} Maps(p[i],S) \]<br />
with $p(1)$ and all $p[i]$ sets.</p>
<p><a href="https://lievenlebruyn.github.io/neverendingbooks/poly">Last time</a> I gave Spivak&#8217;s &#8216;corolla&#8217; picture to think about such functors.</p>
<p>I prefer to view $p \in \mathbf{Poly}$ as a horribly discrete &#8216;sheaf&#8217; $\mathcal{P}$ over the &#8216;space&#8217; $p(1)$ with stalk $p[i]=\mathcal{P}_i$ at point $i \in p(1)$.</p>
<p><center><br />
<img decoding="async" src="https://lievenlebruyn.github.io/neverendingbooks/DATA3/poly1.png" width="40%" /><br />
</center></p>
<p>A morphism $p \rightarrow q$ in $\mathbf{Poly}$ is a map $\varphi_1 : p(1) \rightarrow q(1)$, together with for all $i \in p(1)$ a map $\varphi^{\#}_i : q[\varphi_1(i)] \rightarrow p[i]$.</p>
<p>In the sheaf picture, this gives a map of sheaves over the space $p(1)$ from the inverse image sheaf $\varphi_1^* \mathcal{Q}$ to $\mathcal{P}$.</p>
<p><center><br />
<img decoding="async" src="https://lievenlebruyn.github.io/neverendingbooks/DATA3/poly2.png" width="70%" /><br />
</center></p>
<p>But, unless you dream of sheaves in the night, by all means stick to Spivak&#8217;s corolla picture.</p>
<p>A <em>learner</em> $A \rightarrow B$ between two sets $A$ and $B$ is a complicated tuple of things $(P,I,U,R)$:</p>
<ul>
<li>$P$ is a set, a <em>parameter space</em> of some maps from $A$ to $B$.</li>
<li>$I$ is the <em>interpretation map</em> $I : P \times A \rightarrow B$ describing the maps in $P$.</li>
<li>$U$ is the <em>update map</em> $U : P \times A \times B \rightarrow P$, the learning procedure. The idea is that $U(p,a,b)$ is a map which sends $a$ closer to $b$ than the map $p$ did.</li>
<li>$R$ is the <em>request map</em> $R : P \times A \times B \rightarrow A$.</li>
</ul>
<p>Here&#8217;s a nice application of $\mathbf{Poly}$&#8217;s set-up:</p>
<p><strong>Morphisms $\mathbf{P y^P \rightarrow Maps(A,B) \times Maps(A \times B,A) y^{A \times B}}$ in $\mathbf{Poly}$ coincide with learners $\mathbf{A \rightarrow B}$ with parameter space $\mathbf{P}$.</strong></p>
<p>This follows from unpacking the definition of morphism in $\mathbf{Poly}$ and the process CT-ers prefer to call <a href="https://en.wikipedia.org/wiki/Currying">Currying</a>.</p>
<p>The space-map $\varphi_1 : P \rightarrow Maps(A,B) \times Maps(A \times B,A)$ gives us the interpretation and request-map, whereas the sheaf-map $\varphi^{\#}$ gives us the more mysterious update-map $P \times A \times B \rightarrow P$.</p>
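<p>To make the unpacking concrete, here is a minimal Python sketch, assuming finite sets $A$, $B$, $P$ and a made-up toy learner. The names <code>to_poly_morphism</code> and <code>to_learner</code> are illustrative and not taken from the papers. It curries $(I,U,R)$ into the pair $(\varphi_1,\varphi^{\#})$ and checks that uncurrying recovers the learner.</p>

```python
# Hypothetical toy learner A -> B; all concrete choices are assumptions for illustration.
A = [0, 1]
B = [0, 1]
P = ["const0", "ident"]

def I(p, a):            # interpretation map P x A -> B
    return 0 if p == "const0" else a

def U(p, a, b):         # update map P x A x B -> P: switch parameter on a wrong answer
    return p if I(p, a) == b else ("ident" if p == "const0" else "const0")

def R(p, a, b):         # request map P x A x B -> A
    return a

def to_poly_morphism(I, U, R):
    """Curry a learner into a morphism P y^P -> C y^(A x B)."""
    def phi_1(p):
        # a position of the target: a pair (map A -> B, map A x B -> A)
        interpret = {a: I(p, a) for a in A}
        request = {(a, b): R(p, a, b) for a in A for b in B}
        return (interpret, request)
    def phi_sharp(p):
        # the map on directions at p, going backwards: A x B -> P
        return {(a, b): U(p, a, b) for a in A for b in B}
    return phi_1, phi_sharp

def to_learner(phi_1, phi_sharp):
    """Uncurry: recover (I, U, R) from the morphism."""
    return (lambda p, a: phi_1(p)[0][a],
            lambda p, a, b: phi_sharp(p)[(a, b)],
            lambda p, a, b: phi_1(p)[1][(a, b)])

phi_1, phi_sharp = to_poly_morphism(I, U, R)
I2, U2, R2 = to_learner(phi_1, phi_sharp)
round_trip_ok = all(
    I2(p, a) == I(p, a) and U2(p, a, b) == U(p, a, b) and R2(p, a, b) == R(p, a, b)
    for p in P for a in A for b in B
)
print(round_trip_ok)
```

<p>The dictionaries play the role of elements of $Maps(A,B)$, $Maps(A \times B,A)$ and $Maps(A \times B,P)$, so this only works as written for finite sets.</p>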
<p>$\mathbf{Learn(A,B)}$ is the category with objects all the learners $A \rightarrow B$ (for all parameter-sets $P$), and with morphisms defined naturally, that is, maps between the parameter-sets, compatible with the structural maps.</p>
<p>A surprising result from David Spivak&#8217;s paper <a href="https://arxiv.org/abs/2103.01189">Learners&#8217; Languages</a> is</p>
<p><strong>$\mathbf{Learn(A,B)}$ is a topos. In fact, it is the topos of all set-valued representations of a (huge) directed graph $\mathbf{G_{AB}}$.</strong></p>
<p>This will take some time.</p>
<p>Let&#8217;s bring some dynamics in. Take any polynomial functor $p \in \mathbf{Poly}$ and fix a morphism in $\mathbf{Poly}$<br />
\[<br />
\varphi = (\varphi_1,\varphi[-])~:~p(1) y^{p(1)} \rightarrow p \]<br />
with space-map $\varphi_1$ the identity map.</p>
<p>We form a directed graph:</p>
<ul>
<li> the vertices are the elements of $p(1)$,</li>
<li> vertex $i \in p(1)$ is the source vertex of exactly one arrow for every $a \in p[i]$,</li>
<li> the target vertex of that arrow is the vertex $\varphi[i](a) \in p(1)$.</li>
</ul>
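<p>The three rules above can be sketched in a few lines of Python. This is only an illustration: the dictionary encoding and the particular arrows are assumptions, chosen for a polynomial of shape $2y^2+1$ (two positions with two directions each, and one position with none).</p>

```python
# p is a dict: position i -> set of directions p[i].
# phi[i] is a dict p[i] -> p(1): where each direction out of i leads.
# The arrows chosen here are hypothetical.
p = {"green": {0, 1}, "yellow": {0, 1}, "red": set()}   # p = 2y^2 + 1
phi = {
    "green": {0: "green", 1: "yellow"},
    "yellow": {0: "red", 1: "green"},
    "red": {},
}

def dynamics_graph(p, phi):
    """One arrow i -> phi[i][a] for every position i and every direction a in p[i]."""
    return [(i, a, phi[i][a]) for i in sorted(p) for a in sorted(p[i])]

edges = dynamics_graph(p, phi)
# Each vertex i is the source of exactly as many arrows as p[i] has elements.
out_degree = {i: sum(1 for (s, _, _) in edges if s == i) for i in p}
for edge in edges:
    print(edge)
print(out_degree)
```

<p>Each triple is (source, direction, target), so loops and parallel arrows are allowed, and a position without directions is a dead end.</p>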
<p>Here&#8217;s one possibility from Spivak&#8217;s paper for $p = 2y^2 + 1$, with the coefficient $2$-set $\{ \text{green dot, yellow dot} \}$, and with $1$ the singleton $\{ \text{red dot} \}$.</p>
<p><center><br />
<img decoding="async" src="https://lievenlebruyn.github.io/neverendingbooks/DATA3/poly3.png" width="40%" /><br />
</center></p>
<p>Start at one vertex and move after a minute along a directed edge to the next (possibly the same) vertex. The potential evolutions in time will then form a tree, with each node given a label in $p(1)$.</p>
<p>If we start at the green dot, we get this tree of potential time-evolutions</p>
<p><center><br />
<img decoding="async" src="https://lievenlebruyn.github.io/neverendingbooks/DATA3/poly4.png" width="80%" /><br />
</center></p>
<p>There are exactly $\# p[i]$ branches leaving a node labeled $i \in p(1)$, and all subtrees emanating from equal labelled nodes are isomorphic.</p>
<p>If we had started at the yellow dot, we would have obtained a labelled tree isomorphic to the subtree emanating here from any yellow dot.</p>
<p>We can do the same things for any morphism in $\mathbf{Poly}$ of the form<br />
\[<br />
\varphi = (\varphi_1,\varphi[-])~:~Sy^S \rightarrow p \]<br />
Now, we have a directed graph with vertices the elements $s \in S$, with as many edges leaving vertex $s$ as there are elements $a \in p[\varphi_1(s)]$, and with the target vertex of the edge labeled $a$ starting in $s$ the vertex $\varphi[\varphi_1(s)](a)$.</p>
<p>Once we have this directed graph on $\# S$ vertices we can label vertex $s$ with the label $\varphi_1(s)$ from $p(1)$.</p>
<p>In this way, the time evolutions starting at a vertex $s \in S$ will give us a $p(1)$-labelled rooted tree.</p>
<p>But now, it is possible that two distinct vertices have the same $p(1)$-labeled tree of evolutions. Conversely, trees corresponding to equally labeled vertices can be different.</p>
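<p>Both phenomena already show up in a tiny example. The following Python sketch uses a made-up morphism $Sy^S \rightarrow p$ with $p = y^2+1$ and four states. The names <code>label</code>, <code>step</code> and <code>tree</code> are assumptions for illustration, and the trees are cut off at a finite depth so that they can be compared.</p>

```python
# Hypothetical morphism S y^S -> p, encoded by
#   label[s]    = phi_1(s), a position of p, and
#   step[s][a]  = target state of the edge labelled a leaving s.
p = {"x": {0, 1}, "z": set()}                 # p = y^2 + 1
label = {"s1": "x", "s2": "x", "s3": "x", "s4": "z"}
step = {
    "s1": {0: "s4", 1: "s1"},
    "s2": {0: "s4", 1: "s2"},
    "s3": {0: "s3", 1: "s3"},
    "s4": {},
}

def tree(s, depth):
    """The p(1)-labelled tree of evolutions from s, cut off after `depth` steps."""
    if depth == 0:
        return (label[s], ())
    children = tuple(tree(step[s][a], depth - 1) for a in sorted(p[label[s]]))
    return (label[s], children)

# Distinct states s1 and s2 have the same labelled tree ...
same_tree = tree("s1", 4) == tree("s2", 4)
# ... while s1 and s3 carry the same label but have different trees.
same_label_different_tree = label["s1"] == label["s3"] and tree("s1", 4) != tree("s3", 4)
print(same_tree, same_label_different_tree)
```

<p>Comparing truncated trees only approximates equality of the full (possibly infinite) trees. In this example the states $s_1$ and $s_2$ agree at every depth, by induction on the depth.</p>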
<p>Right, I guess we&#8217;re ready to define the graph $G_{AB}$ and prove that $\mathbf{Learn(A,B)}$ is a topos.</p>
<p>In the case of learners, we have the target polynomial functor $p=C y^{A \times B}$ with $C = Maps(A,B) \times Maps(A \times B,A)$, that is<br />
\[<br />
p(1) = C \quad \text{and all} \quad p[i]=A \times B \]</p>
<p>Start with the free rooted tree $T$ having exactly $\#(A \times B)$ branches growing from each node.</p>
<p>Here&#8217;s the directed graph $G_{AB}$:</p>
<ul>
<li><em>vertices</em> $v_{\chi}$ correspond to the different $C$-labelings of $T$, one $C$-labeled rooted tree $T_{\chi}$ for every map $\chi : vtx(T) \rightarrow C$,</li>
<li><em>arrows</em> $v_{\chi} \rightarrow v_{\omega}$ if and only if $T_{\omega}$ is the rooted $C$-labelled tree isomorphic to the subtree of $T_{\chi}$ rooted at one step from the root.</li>
</ul>
<p><strong>A learner $\mathbf{A \rightarrow B}$ gives a set-valued representation of $\mathbf{G_{AB}}$.</strong></p>
<p>We saw that a learner $A \rightarrow B$ is the same thing as a morphism in $\mathbf{Poly}$<br />
\[<br />
\varphi = (\varphi_1,\varphi[-])~:~P y^P \rightarrow C y^{A \times B} \]<br />
with $P$ the parameter set of maps.</p>
<p>Here&#8217;s what we have to do:</p>
<p>1. Draw the directed graph on vertices $p \in P$ giving the dynamics of the morphism $\varphi$. This graph describes how the learner can cycle through the parameter-set.</p>
<p>2. Use the map $\varphi_1$ to label the vertices with elements from $C$.</p>
<p><center><br />
<img decoding="async" src="https://lievenlebruyn.github.io/neverendingbooks/DATA3/poly6.png" width="80%" /><br />
</center></p>
<p>3. For each vertex draw the rooted $C$-labeled tree of potential time-evolutions starting in that vertex.</p>
<p>In this example the time-evolutions of the two green vertices are the same, but in general they can be different.</p>
<p><center><br />
<img decoding="async" src="https://lievenlebruyn.github.io/neverendingbooks/DATA3/poly7.png" width="80%" /><br />
</center></p>
<p>4. Find the vertices in $G_{AB}$ determined by these $C$-labeled trees and note that they span a full subgraph of $G_{AB}$.</p>
<p><center><br />
<img decoding="async" src="https://lievenlebruyn.github.io/neverendingbooks/DATA3/poly8.png" width="60%" /><br />
</center></p>
<p>5. The vertex-set $P_v$ consists of all elements from $P$ whose ($C$-labeled) vertex has evolution-tree $T_v$. If $v \rightarrow w$ is a directed edge in $G_{AB}$ corresponding to an element $(a,b) \in A \times B$, then the map on the vertex-sets corresponding to this edge is<br />
\[<br />
f_{v,(a,b)}~:~P_v \rightarrow P_w \qquad p \mapsto \varphi[\varphi_1(p)](a,b) \]</p>
<p><center><br />
<img decoding="async" src="https://lievenlebruyn.github.io/neverendingbooks/DATA3/poly9.png" width="60%" /><br />
</center></p>
<p><strong>A set-valued representation of $\mathbf{G_{AB}}$ gives a learner $\mathbf{A \rightarrow B}$.</strong></p>
<p>1. Take a set-valued representation of $G_{AB}$, and let $V$ be the finite or infinite collection of vertices $v$ in $G_{AB}$ whose vertex-set $P_v$ is non-empty. Note that these vertices span a full subgraph of $G_{AB}$.</p>
<p>And, for each directed arrow $v \rightarrow w$ in this subgraph, labeled by an element $(a,b) \in A \times B$ we have a map<br />
\[<br />
f_{v,(a,b)}~:~P_v \rightarrow P_w \]</p>
<p>2. The parameter set of our learner will be $P = \sqcup_v P_v$, the disjoint union of the non-empty vertex-sets.</p>
<p>3. The space-map $\varphi_1 : P \rightarrow C$ will send an element in $P_v$ to the $C$-label of the root of the tree $T_v$. This gives us already the interpretation and request maps<br />
\[<br />
I : P \times A \rightarrow B \quad \text{and} \quad R : P \times A \times B \rightarrow A \]</p>
<p>4. The update map $U : P \times A \times B \rightarrow P$ follows from the sheaf-map we can define stalk-wise<br />
\[<br />
\varphi[\varphi_1(p)](a,b) = f_{v,(a,b)}(p) \]<br />
if $p \in P_v$.</p>
<p>That&#8217;s all folks!</p>
<p>$\mathbf{Learn(A,B)}$ is equivalent to the (covariant) functors $\mathbf{G_{AB} \rightarrow Sets}$.</p>
<p>Changing the directions of all arrows in $G_{AB}$, any covariant functor $\mathbf{G_{AB} \rightarrow Sets}$ becomes a contravariant functor $\mathbf{G_{AB}^o \rightarrow Sets}$, making $\mathbf{Learn(A,B)}$ an honest to Groth topos!</p>
<p>Every topos comes with its own logic, so we have a &#8216;learners&#8217; logic&#8217;. (to be continued)</p>
]]></content:encoded>
					
					<wfw:commentRss>https://lievenlebruyn.github.io/neverendingbooks/learners-and-poly/feed/</wfw:commentRss>
			<slash:comments>1</slash:comments>
		
		
			</item>
		<item>
		<title>Poly</title>
		<link>https://lievenlebruyn.github.io/neverendingbooks/poly/</link>
		
		<dc:creator><![CDATA[lieven]]></dc:creator>
		<pubDate>Wed, 26 Jan 2022 12:07:40 +0000</pubDate>
				<category><![CDATA[geometry]]></category>
		<category><![CDATA[math]]></category>
		<category><![CDATA[stories]]></category>
		<category><![CDATA[applied category theory]]></category>
		<category><![CDATA[Poly]]></category>
		<category><![CDATA[Spivak]]></category>
		<guid isPermaLink="false">http://www.neverendingbooks.org/?p=10109</guid>

					<description><![CDATA[Following up on the deep learning and toposes-post, I was planning to do something on the logic of neural networks. Prepping for this I saw&#8230;]]></description>
										<content:encoded><![CDATA[<p>Following up on the <a href="https://lievenlebruyn.github.io/neverendingbooks/deep-learning-and-toposes">deep learning and toposes</a>-post, I was planning to do something on the logic of neural networks.</p>
<p>Prepping for this I saw David Spivak&#8217;s paper <a href="https://arxiv.org/abs/2103.01189">Learners&#8217; Languages</a> doing exactly that, but in the more general setting of &#8216;learners&#8217; (see also the deep learning post).</p>
<p>And then &#8230; I fell under the spell of $\mathbf{Poly}$.</p>
<p>Spivak is a story-telling talent. A long time ago I copied his short story (actually his abstract for a talk) &#8220;Presheaf, the cobbler&#8221; in the <a href="https://lievenlebruyn.github.io/neverendingbooks/children-have-always-loved-colimits">Children have always loved colimits</a>-post.</p>
<p>Last week, he posted <a href="https://topos.site/blog/2022/01/poly-makes-me-happy-and-smart/">Poly makes me happy and smart</a> on the blog of the Topos Institute, which is another great read.</p>
<p>If this is way too &#8216;fluffy&#8217; for you, perhaps you should watch his talk <a href="https://www.youtube.com/watch?v=Cp5_o2lDqj0">Poly: a category of remarkable abundance</a>.</p>
<p><iframe width="560" height="315" src="https://www.youtube.com/embed/Cp5_o2lDqj0" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></p>
<p>If you like (applied) category theory and have some days to waste, you can binge-watch all 15 episodes of the Poly-course <a href="https://topos.site/poly-course/">Polynomial Functors: A General Theory of Interaction</a>.</p>
<p>If you are more the reading-type, the 273 pages of the <a href="https://topos.site/poly-book.pdf">Poly-book</a> will also kill a good number of your living hours.</p>
<p>Personally, I have no great appetite for category theory, I prefer to digest it in homeopathic doses. And, I&#8217;m allergic to co-terminology.</p>
<p>So then, how to define $\mathbf{Poly}$ for the likes of me?</p>
<p>$\mathbf{Poly}$, you might have surmised, is a category. So, we need &#8216;objects&#8217; and &#8216;morphisms&#8217; between them.</p>
<p>Any set $A$ has a corresponding &#8216;representable functor&#8217; sending a given set $S$ to the set of all maps from $A$ to $S$<br />
\[<br />
y^A~:~\mathbf{Sets} \rightarrow \mathbf{Sets} \qquad S \mapsto S^A=Maps(A,S) \]<br />
This looks like a monomial in a variable $y$ ($y$ for Yoneda, of course), but does it work?</p>
<p>What is $y^1$, where $1$ stands for the one-element set $\{ \ast \}$? $Maps(1,S)=S$, so $y^1$ is the identity functor sending $S$ to $S$.</p>
<p>What is $y^0$, where $0$ is the empty set $\emptyset$? Well, for any set $S$ there is just one map $\emptyset \rightarrow S$, so $y^0$ is the constant functor sending any set $S$ to $1$. That is, $y^0=1$.</p>
<p>Going from monomials to <em>polynomials</em> we need an addition. We add such representable functors by taking disjoint unions (finite or infinite), that is<br />
\[<br />
\sum_{i \in I} y^{A_i}~:~\mathbf{Sets} \rightarrow \mathbf{Sets} \qquad S \mapsto \bigsqcup_{i \in I} Maps(A_i,S) \]<br />
If all $A_i$ are equal (meaning, they have the same cardinality) we use the shorthand $Iy^A$ for this sum.</p>
<p>The <em>objects</em> in $\mathbf{Poly}$ are exactly these &#8216;polynomial functors&#8217;<br />
\[<br />
p = \sum_{i \in I} y^{p[i]} \]<br />
with all $p[i] \in \mathbf{Sets}$. Remark that $p(1)=I$ as for any set $A$ there is just one map to $1$, that is $y^A(1) = Maps(A,1) = 1$, and we can write<br />
\[<br />
p = \sum_{i \in p(1)} y^{p[i]} \]<br />
An object $p \in \mathbf{Poly}$ is thus described by the couple $(p(1),p[-])$ with $p(1)$ a set, and a functor $p[-] : p(1) \rightarrow \mathbf{Sets}$ where $p(1)$ is now a category with objects the elements of $p(1)$ and no morphisms apart from the identities.</p>
<p>We can depict $p$ by a trimmed down forest, Spivak calls it the <em>corolla</em> of $p$, where the tree roots are the elements of $p(1)$ and the tree with root $i \in p(1)$ has one branch from the root for any element in $p[i]$. The corolla of $p=y^2+2y+1$ looks like</p>
<p><center><br />
<img decoding="async" src="https://lievenlebruyn.github.io/neverendingbooks/DATA3/corollaPoly.png" width="50%" /><br />
</center></p>
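<p>For finite sets one can simply enumerate $p(S)$. Here is a minimal Python sketch, assuming a polynomial is stored as a dictionary from roots $i \in p(1)$ to the sets $p[i]$. The encoding and the function names are illustrative choices, nothing canonical. It evaluates $p = y^2+2y+1$ at a one-element set, recovering $p(1)$, and at a three-element set.</p>

```python
from itertools import product

# A polynomial p is modelled as a dict: position i in p(1) -> finite set p[i].
# Illustrative encoding only; the names below are assumptions.
def maps(domain, codomain):
    """All maps domain -> codomain, each encoded as a tuple of (x, f(x)) pairs."""
    domain = sorted(domain)
    return [tuple(zip(domain, values))
            for values in product(sorted(codomain), repeat=len(domain))]

def evaluate(p, S):
    """p(S): the disjoint union over i in p(1) of Maps(p[i], S)."""
    return [(i, f) for i in p for f in maps(p[i], S)]

# p = y^2 + 2y + 1: one root with two branches, two with one branch, one with none.
p = {"a": {0, 1}, "b": {0}, "c": {0}, "d": set()}

positions = evaluate(p, {"*"})           # p(1): one element per summand
values = evaluate(p, {"x", "y", "z"})    # 3^2 + 2*3 + 1 elements
print(len(positions), len(values))
```

<p>Tagging every map with its root $i$ is what makes the union disjoint. The empty set of branches contributes exactly one map, in line with $y^0=1$.</p>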
<p>If $M$ is an $m$-dimensional manifold, then you might view its tangent bundle $TM$ set-theoretically as the &#8216;corolla&#8217; of the polynomial functor $M y^{\mathbb{R}^m}$, the tree-roots corresponding to the points of the manifold, and the branches to the different tangent vectors in these points.</p>
<p><em>Morphisms</em> in $\mathbf{Poly}$ are a bit strange. For two polynomial functors $p=(p(1),p[-])$ and $q=(q(1),q[-])$ a map $p \rightarrow q$ in $\mathbf{Poly}$ consists of</p>
<ul>
<li>a map $\phi_1 : p(1) \rightarrow q(1)$ on the tree-roots in the right direction, and</li>
<li>for any $i \in p(1)$ a map $q[\phi_1(i)] \rightarrow p[i]$ on the branches in the opposite direction</li>
</ul>
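<p>The two directions are easy to mix up, so here is a small Python sketch, assuming finite sets and a dictionary encoding that is purely illustrative. It checks the shape of a morphism $p \rightarrow q$ and shows how the backward maps on branches still give a forward map $p(S) \rightarrow q(S)$, by precomposition.</p>

```python
# A polynomial is a dict: position i -> set of directions p[i].
# A morphism p -> q is a pair (on_positions, on_directions) where
#   on_positions[i]      is a position of q, and
#   on_directions[i][b]  is a direction in p[i] for each direction b in q[on_positions[i]].
# This encoding is an illustrative assumption.

def is_morphism(p, q, on_positions, on_directions):
    """Check the shape of the data: forward on roots, backward on branches."""
    for i in p:
        j = on_positions[i]
        if j not in q:
            return False
        if set(on_directions[i]) != set(q[j]):
            return False
        if not set(on_directions[i].values()).issubset(p[i]):
            return False
    return True

def apply_at(on_positions, on_directions, element):
    """Component at a set S: send (i, f : p[i] -> S) to (phi_1(i), f after the branch map)."""
    i, f = element
    pulled_back = {b: f[a] for b, a in on_directions[i].items()}
    return (on_positions[i], pulled_back)

p = {"u": {0, 1}, "v": {0}}      # p = y^2 + y
q = {"w": {"left", "right"}}     # q = y^2
phi_1 = {"u": "w", "v": "w"}
phi_sharp = {"u": {"left": 0, "right": 1}, "v": {"left": 0, "right": 0}}

print(is_morphism(p, q, phi_1, phi_sharp))
print(apply_at(phi_1, phi_sharp, ("v", {0: "s"})))
```

<p>An element of $p(S)$ is a root $i$ together with a map $p[i] \rightarrow S$. Composing that map with the branch map $q[\phi_1(i)] \rightarrow p[i]$ gives an element of $q(S)$, which is why the branch maps have to run in the opposite direction.</p>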
<p>In our manifold/tangent bundle example, a morphism $My^{\mathbb{R}^m} \rightarrow y^1$ sends every point $p \in M$ to the unique root of $y^1$ and the unique branch in $y^1$ picks out a unique tangent-vector for every point of $M$. That is, vector fields on $M$ are very special (smooth) morphisms $My^{\mathbb{R}^m} \rightarrow y^1$ in $\mathbf{Poly}$.</p>
<p>A smooth map between manifolds $M \rightarrow N$, does <em>not</em> determine a morphism $My^{\mathbb{R}^m} \rightarrow N y^{\mathbb{R}^n}$ in $\mathbf{Poly}$ because tangent vectors are pushed forward, not pulled back.</p>
<p>If instead we view the cotangent bundle $T^*M$ as the corolla of the polynomial functor $My^{\mathbb{R}^m}$, then everything works well.</p>
<p>But then, I promised not to use co-terminology&#8230;</p>
<p>Another time I hope to tell you how $\mathbf{Poly}$ helps us to understand the logic of learners.</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
