Posts by Tag

attention

Attention Diagnostics: Testing KL and Susceptibility on the IOI Circuit

9 minute read

The previous post introduced KL selectivity and susceptibility χ as per-head diagnostics derivable from attention weights alone. Here I test them on GPT-2-small’s IOI circuit: can two scalar statistics, computed from a single forward pass, distinguish the 23 known circuit heads from the other 121 heads? It seems so!

Why Softmax? A Hypothesis Testing Perspective on Attention Weights

8 minute read

Softmax is ubiquitous in transformers, yet its role in attention can feel more heuristic than inevitable. In this post, I try to make the choice of softmax feel more natural by reading attention weights through the lens of hypothesis testing, and show how this interpretation suggests useful diagnostics for the often circuit-like behavior of attention heads.

softmax

Why Softmax? A Hypothesis Testing Perspective on Attention Weights

8 minute read

Softmax is ubiquitous in transformers, yet its role in attention can feel more heuristic than inevitable. In this post, I try to make the choice of softmax feel more natural by reading attention weights through the lens of hypothesis testing, and show how this interpretation suggests useful diagnostics for the often circuit-like behavior of attention heads.

hypothesis testing

Why Softmax? A Hypothesis Testing Perspective on Attention Weights

8 minute read

Softmax is ubiquitous in transformers, yet its role in attention can feel more heuristic than inevitable. In this post, I try to make the choice of softmax feel more natural by reading attention weights through the lens of hypothesis testing, and show how this interpretation suggests useful diagnostics for the often circuit-like behavior of attention heads.

KL divergence

Why Softmax? A Hypothesis Testing Perspective on Attention Weights

8 minute read

Softmax is ubiquitous in transformers, yet its role in attention can feel more heuristic than inevitable. In this post, I try to make the choice of softmax feel more natural by reading attention weights through the lens of hypothesis testing, and show how this interpretation suggests useful diagnostics for the often circuit-like behavior of attention heads.

machine learning

Why Softmax? A Hypothesis Testing Perspective on Attention Weights

8 minute read

Softmax is ubiquitous in transformers, yet its role in attention can feel more heuristic than inevitable. In this post, I try to make the choice of softmax feel more natural by reading attention weights through the lens of hypothesis testing, and show how this interpretation suggests useful diagnostics for the often circuit-like behavior of attention heads.

deep learning

Why Softmax? A Hypothesis Testing Perspective on Attention Weights

8 minute read

Softmax is ubiquitous in transformers, yet its role in attention can feel more heuristic than inevitable. In this post, I try to make the choice of softmax feel more natural by reading attention weights through the lens of hypothesis testing, and show how this interpretation suggests useful diagnostics for the often circuit-like behavior of attention heads.

mechanistic-interpretability

Attention Diagnostics: Testing KL and Susceptibility on the IOI Circuit

9 minute read

The previous post introduced KL selectivity and susceptibility χ as per-head diagnostics derivable from attention weights alone. Here I test them on GPT-2-small’s IOI circuit: can two scalar statistics, computed from a single forward pass, distinguish the 23 known circuit heads from the other 121 heads? It seems so!

transformers

Attention Diagnostics: Testing KL and Susceptibility on the IOI Circuit

9 minute read

The previous post introduced KL selectivity and susceptibility χ as per-head diagnostics derivable from attention weights alone. Here I test them on GPT-2-small’s IOI circuit: can two scalar statistics, computed from a single forward pass, distinguish the 23 known circuit heads from the other 121 heads? It seems so!

IOI-circuit

Attention Diagnostics: Testing KL and Susceptibility on the IOI Circuit

9 minute read

The previous post introduced KL selectivity and susceptibility χ as per-head diagnostics derivable from attention weights alone. Here I test them on GPT-2-small’s IOI circuit: can two scalar statistics, computed from a single forward pass, distinguish the 23 known circuit heads from the other 121 heads? It seems so!

diagnostics

Attention Diagnostics: Testing KL and Susceptibility on the IOI Circuit

9 minute read

The previous post introduced KL selectivity and susceptibility χ as per-head diagnostics derivable from attention weights alone. Here I test them on GPT-2-small’s IOI circuit: can two scalar statistics, computed from a single forward pass, distinguish the 23 known circuit heads from the other 121 heads? It seems so!
