Skip to content →

Tag: logic

Stephen Wolfram on ChatGPT

A month ago, Stephen Wolfram put out a little booklet (140 pages) What Is ChatGPT Doing … and Why Does It Work?.



It gives a gentle introduction to large language models and the architecture and training of neural networks.

The entire book is freely available:

The advantage of these online texts is that you can click on any of the images, copy their content into a Mathematica notebook, and play with the code.

This really gives a good idea of how an extremely simplified version of ChatGPT (based on GPT-2) works.

Downloading the model (within Mathematica) uses about 500Mb, but afterwards you can complete any prompt quickly, and see how the results change if you turn up the ‘temperature’.

You should’t expect too much from this model. Here’s what it came up with from the prompt “The major results obtained by non-commutative geometry include …” after 20 steps, at temperature 0.8:


NestList[StringJoin[#, model[#, {"RandomSample", "Temperature" -> 0.8}]] &,
"The major results obtained by non-commutative geometry include ", 20]

The major results obtained by non-commutative geometry include vernacular accuracy of math and arithmetic, a stable balance between simplicity and complexity and a relatively low level of violence.

Lol.

In the more philosophical sections of the book, Wolfram speculates about the secret rules of language that ChatGPT must have found if we want to explain its apparent succes. One of these rules, he argues, must be the ‘logic’ of languages:

But is there a general way to tell if a sentence is meaningful? There’s no traditional overall theory for that. But it’s something that one can think of ChatGPT as having implicitly “developed a theory for” after being trained with billions of (presumably meaningful) sentences from the web, etc.

What might this theory be like? Well, there’s one tiny corner that’s basically been known for two millennia, and that’s logic. And certainly in the syllogistic form in which Aristotle discovered it, logic is basically a way of saying that sentences that follow certain patterns are reasonable, while others are not.

Something else ChatGPT may have discovered are language’s ‘semantic laws of motion’, being able to complete sentences by following ‘geodesics’:

And, yes, this seems like a mess—and doesn’t do anything to particularly encourage the idea that one can expect to identify “mathematical-physics-like” “semantic laws of motion” by empirically studying “what ChatGPT is doing inside”. But perhaps we’re just looking at the “wrong variables” (or wrong coordinate system) and if only we looked at the right one, we’d immediately see that ChatGPT is doing something “mathematical-physics-simple” like following geodesics. But as of now, we’re not ready to “empirically decode” from its “internal behavior” what ChatGPT has “discovered” about how human language is “put together”.

So, the ‘hidden secret’ of successful large language models may very well be a combination of logic and geometry. Does this sound familiar?

If you prefer watching YouTube over reading a book, or if you want to see the examples in action, here’s a video by Stephen Wolfram. The stream starts about 10 minutes into the clip, and the whole lecture is pretty long, well over 3 hours (about as long as it takes to read What Is ChatGPT Doing … and Why Does It Work?).

Comments closed

The strange logic of subways

“A subway is just a hole in the ground, and that hole is a maze.”

“The map is the last vestige of the old system. If you can’t read the map, you can’t use the subway.”

Eddie Jabbour in Can he get there from here? (NYT)

Sometimes, lines between adjacent stations can be uni-directional (as in the Paris Metro map below in the right upper corner, 7bis). So, it is best to view a subway map as a directed graph, with vertices the different stations, and directed arrows when there’s a service connecting two adjacent stations.



Aha! But, directed graphs form a presheaf topos. So, each and every every subway in the world comes with its own logic, its own bi-Heyting algebra!

Come again…?

Let’s say Wally (or Waldo, or Charlie) is somewhere in the Paris metro, and we want to find him. One can make statements like:

$P$ = “Wally is on line 3bis from Gambetta to Porte des Lilas.”, or

$Q$ = “Wally is traveling along line 11.”

Each sentence pinpoints Wally’s location to some directed subgraph of the full Paris metro digraph, let’s call this subgraph the ‘scope’ of the sentence.

We can connect such sentences with logical connectives $\vee$ or $\wedge$ and the scope will then be the union or intersection of the respective scopes.

The scope of $P \vee Q$ is the directed subgraph of line 11 (in both directions) together with the directed subgraph of line 3bis from Gambetta to Porte des Lilas.

The scope of $P \wedge Q$ is just the vertex corresponding to Porte des Lilas.

The scope of the negation $\neg R$ of a sentence $R$ is the subgraph complement of the scope of $R$, so it is the full metro graph minus all vertices and directed edges in $R$-scope, together with all directed edges starting or ending in one of the deleted vertices.

For example, the scope of $\neg P$ does not contain directed edges along 3bis in the reverse direction, nor the edges connecting Gambetta to Pere Lachaise, and so on.

In the Paris metro logic the law of double negation does not hold.



$\neg \neg P \not= P$ as both statements have different scopes. For example, the reverse direction of line 3bis is part of the scope of $\neg \neg P$, but not of $P$.

So, although the scope of $P \wedge \neg P$ is empty, that of $P \vee \neg P$ is not the full digraph.

The logical operations $\vee$, $\wedge$ and $\neg$ do not turn the partially ordered set of all directed subgraphs of the Paris metro into a Boolean algebra structure, but rather a Heyting algebra.

Perhaps we were too drastic in removing all “problematic edges” from the scope of $\neg R$ (those with a source or target station belonging to the scope of $R$)?

We might have kept all problematic edges, and added the missing source and/or target stations to get the scope of another negation of $R$: $\sim R$.

Whereas the scope of $\neg \neg R$ always contains that of $R$, the scope of $\sim \sim R$ is contained in $R$’s scope.

The scope of $R \vee \sim R$ will indeed be the whole graph, but now $R \wedge \sim R$ does no longer have to be empty. For example, $P \wedge \sim P$ has as its scope all stations on line 3bis.

In general $R \wedge \sim R$ will be called the ‘boundary’ $\partial(R)$ of $R$. It consists of all stations within $R$’s scope that are connected to the outside of $R$’s scope.

The logical operations $\vee$, $\wedge$, $\neg$ and $\sim$ make the partially ordered set of all directed subgraphs of the Paris metro into a bi-Heyting algebra.

There’s plenty more to say about all of this (and I may come back to it later). For the impatient, there’s the paper by Reyes and Zolfaghari: Bi-Heyting Algebras, Toposes and Modalities.

Right now, I’m more into exploring whether this setting can be used to revive an old project of mine: Heyting Smullyanesque problems (btw. the algebra in that post is not Heyting, oops!).

Comments closed

Knights and Knaves, the Heyting way

(image credit: Joe Blitzstein via Twitter)

Smullyan’s Knights and Knaves problems are classics. On an island all inhabitants are either Knights (who only tell true things) and Knaves (who always lie). You have to determine their nature from a few statements. Here’s a very simple problem:

“Abercrombie met just two inhabitants, A and B. A made the following statement: “Both of us are Knaves.” What is A and what is B?”

Now, this one is simple enough to solve, but for more complicated problems a generic way to solve the puzzles is to use propositional calculas, as explained in Smullyan’s Logical Labyrinths”, chapter 8 “Liars, truth-tellers and propositional logic’.

If an inhabitants $A$ asserts a proposition $P$, and if $k_A$ is the assertion ‘$A$ is a Knight’, then the statement can be rephrased as

\[
k_A \Leftrightarrow P \]

for if $A$ is a Knight, $P$ must be true and if $A$ is a Knave $P$ must be false.

Usually, one can express $P$ as a propositional statement involving $k_A,k_B,k_C,\dots$.
The example above can be rephrased as

\[
k_A \Leftrightarrow (\neg k_A \wedge \neg k_B) \]

Assigning truth values to $k_A$ and $k_B$ and setting up the truth-table for this sentence, one sees that the only possibility for this to be true is that $k_A$ equals $0$ and $k_B$ equals $1$. So, $A$ is a Knave and $B$ is a Knight.

Clearly, one only requires this approach for far more difficult problems.

In almost all Smullyan puzzles, the only truth values are $0$ and $1$. There’s a short excursion to Boolean algebras (sorry, Boolean islands) in chapter 9 ‘Variable Liars’ in Logical Labyrinths. But then, the type of problems are about finding equivalent notions of Boolean algebras, rather that generalised Knights&Knaves puzzles.

Did anyone pursue the idea of Smullyanesque puzzles with truth values in a proper Heyting algebra?

I only found one blog-post on this: Non-Classical Knights and Knaves by Jason Rosenhouse.

He considers three valued logic (the Heyting algebra corresponding to the poset 0-N-1, and logical connectives as in the example on the Wiki-page on Heyting algebras.

On his island the natives cycle, repeatedly and unpredictably, between the two states. They are knights for a while, then they enter a transitional phase during which they are partly knight and partly knave, and then they emerge on the other side as knaves.

“If Joe is in the transitional phase, and you say, “Joe is a knight,” or “Joe is a knave,” what truth value should we assign to your statement? Since Joe is partly knight and partly knave, neither of the classical truth values seems appropriate. So we shall assign a third truth value, “N” to such statements. Think of N as standing for “neutral” or “neither true nor false.” On the island, vague statements are assigned the truth value N.

Just to be clear, it’s not just any statement that can be assigned the truth value N. It is only vague statements that receive that truth value, and for now our only examples of such statements are attributions of knight-hood and knave-hood to people in the transitional phase.

For the natives, entering the transitional phase implied a disconcerting loss of identity. Uncertain of how to behave, they hedged their bets by only making statements with truth value N. People in the transitional phase were referred to as neutrals. So there are now three kinds of people: Knights, who only make true statements; Knaves, who only make false statements; and Neutrals, who only make statements with the truth value N.”

He gives one example of a possible problem:

“Suppose you meet three people, named Dave, Evan and Ford. They make the following statements:

Dave: Evan is a knight.
Evan: Ford is a knave.
Ford: Dave is a neutral.

Can you determine the types of all three people?”

If you know of more of these Smullanesque problems using Heyting algebras, please leave a comment.

Comments closed