The proof that π = 4
Recreational math: why a troll proof involving circles is less wrong than it seems.
In an article published here last week, I discussed the perils of thinking about infinity as a number. More specifically, I criticized the structure of some of the elementary proofs that 0.9999… = 1.
As a teaching prop, I wheeled out the following equation: (1 - 1) + (1 - 1) + (1 - 1) + … = 0
This is an endless sum of alternating +1 and -1 terms. Pairwise, they all work out to zero, so the equation seems to make sense.
At the same time, there’s no risk of running out of terms in an infinite sum, so it seems harmless to shift the annotations one position to the right: 1 + (-1 + 1) + (-1 + 1) + … = 1
This seems to be saying that 1 = 0. Oops.
The reason I like this “proof” is that it’s hard to reflexively dismiss. A common reaction is “oh yeah, but this left an unpaired -1 at infinity”, but what does this mean? If there’s a single, specific element at the ∞-th position in the sum, what do we find at position ∞ + 1?…
In the earlier article, we concluded that in contexts like these, infinity must be understood as a process metaphor, not a number. We’re not talking about an infinite number of steps as much as we’re talking about the outcome of an ill-specified number of steps.
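To see the "process" view in action, here is a minimal Python sketch of the alternating sum. The helper name `partial_sums` is made up for illustration; the code simply keeps a running total of the terms +1, -1, +1, -1, and so on.

```python
from itertools import accumulate, cycle, islice

def partial_sums(n_terms):
    """Return the running totals of the first n_terms of 1 - 1 + 1 - 1 + ..."""
    terms = islice(cycle([1, -1]), n_terms)
    return list(accumulate(terms))

print(partial_sums(8))  # [1, 0, 1, 0, 1, 0, 1, 0]
```

The running total flips between 1 and 0 depending on where we stop, so the two groupings are really two different ways of deciding when to pause an unfinished process.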
Got it, are we done now?
Well, sort of. Thinking of infinity as a process helps us make sense of a fair chunk of higher math, but it’s not always enough. Sometimes, it’s easy to fixate on the notion of infinity and miss more basic flaws in our reasoning.
Consider the following troll proof that π = 4:
We begin by drawing a circle with a diameter of 1; it follows that the circumference of this circle is 1 · π. We then draw a 1×1 square around the circle. The perimeter of the square is a sum of the lengths of its sides: 1 + 1 + 1 + 1 = 4.
Next, we “fold back” small sections near the corners of the square. We trim and reorient segments of length a such that the inverted corner just touches the circumference of the circle. Critically, this operation doesn’t change the perimeter of the outer shape. This should be fairly clear, but we can also double-check the result: each of the remaining long edges has a length of b = 1 - 2a, and the newly added corner sections are 8a in total. The sum is 4b + 8a = 4 · (1 - 2a) + 8a = 4 - 8a + 8a = 4, regardless of the exact value of a.
Yet, the outer shape is now evidently a better approximation of the circle. If we perform another iteration, folding back the eight protruding corners, we get even closer to a circle without changing the perimeter in any way. If we keep doing this forever, the seemingly inescapable conclusion is that we’ll get infinitely close to the shape of the circle while keeping the perimeter of 4. In other words, the circle’s circumference must also be equal to 4. Or, to put it more bluntly: π = 4.
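The perimeter bookkeeping for this first fold is easy to check numerically. The sketch below is only an illustration: the function name `folded_perimeter` is invented, and the value `a_touch` is my own derivation, assuming the square spans [0, 1] × [0, 1], the circle is centered at (0.5, 0.5), and the folded corner lands at (1 - a, 1 - a).

```python
import math

def folded_perimeter(a):
    """Perimeter after folding all four corners by a: four long edges plus eight short ones."""
    b = 1 - 2 * a          # remaining long edge
    return 4 * b + 8 * a   # 4 long edges + 8 corner segments of length a

# Assumed value of a for which the inverted corner just touches the circle:
# the distance from (0.5, 0.5) to (1 - a, 1 - a) must equal the radius 0.5.
a_touch = (1 - 1 / math.sqrt(2)) / 2

for a in (0.05, a_touch, 0.25):
    print(f"a = {a:.4f}  perimeter = {folded_perimeter(a):.4f}")
```

Every row reports a perimeter of 4.0000, which matches the algebra: the choice of a affects how close the shape hugs the circle, not how long the outline is.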
Help me, Mr. Internet!
The “proof” is fascinating in part because it’s multilayered; it trips up both novices and people who are quite conversant in math. For example, many popular YouTube videos offer explanations that are unsatisfying, incomplete, or outright wrong.
If you ask on a math forum, you can simultaneously get closer to and farther away from the truth. The usual response is something along the lines of:
I get it: mathematics doesn’t concern itself with intuition or reality. It operates in a closed universe of axioms; the main thrust of the discipline is to make these axioms as abstract as possible, and specify them as precisely as possible. So, if you don’t want to learn the lingo of mathematical analysis, what are you doing here?
At the same time, you might be one of these entitled bozos who just want to know why the π = 4 proof is wrong. If so, to peel off the first layer, don’t get distracted by the part about infinity. We start by distilling the troll proof to a simpler but functionally identical case — trying to find the length of the diagonal of a 1×1 square:
We have a diagonal of some unknown length. We make the first approximation with a path consisting of a single horizontal segment and a single vertical segment (arrows, left). The overall length of this path is 1 + 1 = 2.
Next, similarly to the circle scenario, we fold back the corner where the two segments intersect. This seemingly gives us four identical sections, each half as long as before (middle diagram); the overall length of the stairstep path appears to be the same as before. We keep going; the shape gets closer and closer to the diagonal, but the walking distance along the jagged path evidently doesn’t budge. As before, the conclusion is that the diagonal has a length of 2, rather than the ~1.41 value you can measure with a ruler or calculate from the Pythagorean theorem.
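The stairstep construction can be sketched in a few lines of Python. The names `staircase_vertices` and `path_length` are made up for this example, and the indexing is an assumption: c = 1 is the initial two-segment path, and each further iteration doubles the number of segments while halving their length.

```python
import math

def staircase_vertices(c):
    """Vertices of the iteration-c staircase from (0, 0) to (1, 1).

    Assumed convention: iteration c has 2**c segments, each 2**(1 - c) long,
    alternating between a step to the right and a step up.
    """
    step = 2.0 ** (1 - c)
    x = y = 0.0
    vertices = [(x, y)]
    for i in range(2 ** c):
        if i % 2 == 0:
            x += step
        else:
            y += step
        vertices.append((x, y))
    return vertices

def path_length(vertices):
    """Total walking distance along consecutive vertices."""
    return sum(math.dist(p, q) for p, q in zip(vertices, vertices[1:]))

for c in range(1, 6):
    v = staircase_vertices(c)
    print(f"iteration {c}: {len(v) - 1} segments, length {path_length(v)}")
print("diagonal:", math.sqrt(2))
```

The segment count doubles on every line, but the reported length stays at 2, well above the diagonal's roughly 1.41.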
So, what’s wrong with these proofs? Some of the highly-ranked attempts to debunk the troll proof claim that the resulting shape never actually gets close to what it purports to approximate. If so, the first thing we should confirm is that the construction process actually works the way the troll proof claims it does.
It helps to develop some well-defined metric for that. Most simply, we can analyze the pointwise distance between the stairstep approximation and the corresponding point of the diagonal. The following diagram should help:
On the left, I marked the peak pointwise distance between the diagonal and the initial approximation, labeled x. We could solve for it using the Pythagorean theorem, but we don’t really need to.
If we look at the rotated view in the bottom left of the figure, the actual distance between the curves changes linearly from 0 to x and back to zero. Because the ramps are linear, the average distance is simply one half the maximum. I’m going to invent a symbol for this error and write εshape = x/2.
In the center panel, the situation repeats for the next approximation: we have two triangles that are precisely half the size of the earlier one. Within the span of each of these triangles, peak deviation is x/2, so the average εshape = x/4. Finally, after one more iteration (right), we get εshape = x/8.
The deviation remaining after iteration c, counting the initial two-segment path as c = 1, can be generalized as εshape = x / 2^c.
Again, in this equation, x is some positive constant that we couldn’t be bothered to calculate. Either way, this constant is divided by a denominator that grows without bound as the number of iterations grows, so the measured distance between the shapes robustly approaches zero over time. On a pointwise basis, the troll shape approximation algorithm looks fine.
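If you'd rather measure than trust the geometry, here is a rough numerical check. It is a sketch under a few assumptions of mine: the square spans (0, 0) to (1, 1), the initial two-segment path counts as iteration c = 1, and the average is taken over evenly spaced sample points along the staircase. The function names are invented for the example. For this particular square, the constant x works out to 1/√2, the distance from the corner (1, 0) to the line y = x.

```python
import math

def staircase_samples(c, samples_per_segment=100):
    """Evenly spaced sample points along the iteration-c staircase."""
    step = 2.0 ** (1 - c)           # length of each segment
    x = y = 0.0
    points = []
    for i in range(2 ** c):
        dx, dy = (step, 0.0) if i % 2 == 0 else (0.0, step)
        for k in range(samples_per_segment):
            t = (k + 0.5) / samples_per_segment   # midpoint sampling
            points.append((x + t * dx, y + t * dy))
        x, y = x + dx, y + dy
    return points

def mean_deviation(c):
    """Average distance from the staircase to the diagonal y = x."""
    dists = [abs(px - py) / math.sqrt(2) for px, py in staircase_samples(c)]
    return sum(dists) / len(dists)

peak = 1 / math.sqrt(2)   # the constant x for the 1x1 square
for c in range(1, 7):
    print(f"c = {c}: measured {mean_deviation(c):.6f}, x / 2^c = {peak / 2 ** c:.6f}")
```

The two columns agree, and both halve with every iteration, which is the "robustly approaches zero" behavior described above.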
If not this, then what?
If the method of constructing the approximation is correct, perhaps we’re mistaken about the length of the constructed curve? It doesn’t feel that way, but once again, it’s best to have a real metric in place:
On the left, we have a diagonal of some length n and a two-segment path (total length 2). The resulting length error — the difference in the walking distances associated with each of the two routes — is εwalk = 2 - n.
Next, let’s have a look at the middle diagram. Here, the length of the diagonal is obviously the same as before (n), while the stairstep curve has a length of 4 · ½; this yields εwalk = 4 · ½ - n, which is no change from before. The situation repeats on the right: εwalk = 8 · ¼ - n. The general formula for the length error after c steps, with the initial two-segment path counted as c = 1, is εwalk = 2^c · (1/2)^(c-1) - n = 2 - n.
That’s to say, as long as the jaggies exist, εwalk appears to be constant (and pretty big) as we iterate.
So, what’s going on? Well, the actual problem isn’t the construction method at all: it’s that the proof implied a contradiction where none exists! When we were first presented with the troll argument, we should have asked ourselves if pointwise proximity and walking-path distance must be correlated to begin with. After all, you can take many routes from home to work or school that are geometrically distant, but have similar lengths. Conversely, two nearby walking paths can have vastly different lengths if one is straight as an arrow and the other zig-zags a lot. Halving the amplitude of every zig and zag, but then doubling their number, doesn’t really change anything.
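That last sentence can be backed by a short worked formula. This is only a sketch: the symbols N (number of teeth), h (height of each tooth), and L (total path length) are introduced here for illustration, and the setup assumes a symmetric zig-zag drawn over a straight baseline of length 1.

```latex
% Each of the N teeth consists of two straight segments,
% each with a horizontal run of 1/(2N) and a rise of h.
\[
L(N, h) = 2N \sqrt{\left(\frac{1}{2N}\right)^{2} + h^{2}} = \sqrt{1 + 4N^{2}h^{2}}
\]
% Halving the height while doubling the count leaves the product N h,
% and therefore the length, unchanged:
\[
L\!\left(2N, \tfrac{h}{2}\right)
  = \sqrt{1 + 4\,(2N)^{2}\left(\tfrac{h}{2}\right)^{2}}
  = \sqrt{1 + 4N^{2}h^{2}}
  = L(N, h)
\]
```

The peak distance from the baseline is h, which keeps shrinking, while L stays exactly where it started.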
So… the result is a different shape?
Yes and no. That’s where the proof trips up many folks who are more proficient in math. For a finite (but arbitrarily large) number of iterations, the answer is a firm yes: despite visual similarity, jagged circles and smooth circles are two wholly separate things. Making these jaggies small doesn’t make them disappear.
But if we take the “repeat forever” part of the troll proof literally, we enter the realm of mathematical fiction; in that realm, the answer can be different. In standard analysis — the prevailing flavor of fiction used to deal with infinity in algebraic contexts — attempts to formally analyze the scenario will show that our increasingly jagged curve somehow collapses to a smooth diagonal (or a smooth circle) the moment we start talking about the hypothetical outcome “at infinity”. In this view, the troll proof is incorrect in a different way: it implies that the outcome of an infinite process must bear some resemblance to what we’ve seen after an arbitrarily high but finite number of steps.
The best way to develop intuition about this gotcha is to have another look at the earlier formula for the pointwise (shape) error between the stairstep pattern and the diagonal: εshape = x / 2^c, with the initial two-segment path counted as c = 1.
We’d be forgiven for saying that as c (the iteration count) tends to infinity, the value of εshape becomes infinitely small. It’s not wrong, but this kind of talk is verboten: as outlined in the earlier article, infinitesimals have no place on the real number line.
In essence, real numbers must obey the Archimedean property: for any two positive real numbers a < b, there must be some integer n that lets you flip the inequality by multiplying (a · n > b). Infinitesimal numbers, which could be very loosely visualized as fractions with “infinity in the denominator”, can’t possibly obey this rule. Allowing them in reals would cause a wide range of thorny algebraic issues that almost everyone would rather avoid.
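Stated symbolically, the property looks like this. It's the standard textbook formulation rather than anything specific to this article, and the symbol ε for a would-be infinitesimal is just an illustrative label.

```latex
% Archimedean property of the real numbers
\[
\forall\, a, b \in \mathbb{R}:\quad 0 < a < b
  \;\Longrightarrow\;
  \exists\, n \in \mathbb{N} \ \text{such that}\ n \cdot a > b
\]
% A positive infinitesimal would have to break it, for example with b = 1:
\[
\varepsilon > 0
  \quad \text{and} \quad
  n \cdot \varepsilon < 1 \ \text{for every}\ n \in \mathbb{N}
\]
```

No real number can satisfy the second line, which is why infinitesimals need a different number system.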
This means that in standard mathematical discourse, “infinitely close to zero” effectively means the same as “equal to zero”. Because of this, the limit of εshape is zero. And if εshape = 0, then we must conclude that the two figures no longer differ in any way. This also implies that “at infinity”, and not a moment sooner, the jaggies vanish and the length of the constructed curve jumps from 2 to √2 (in the case of a diagonal), or from 4 to π (in the case of a circle).
The presence of this jump may sound weird, but there’s nothing that prohibits such a discontinuity. Infinity is not a part of the continuum of real numbers: it’s an abstraction for a place as distant from it as you can get. Sudden shifts can happen as we take a gigantic leap from “here” to “there”.
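One way to write the jump down, using notation that isn't in the original argument: let γ_c be the stairstep path after c iterations, γ the diagonal, and ℓ(·) the length of a path.

```latex
\[
\lim_{c \to \infty} \ell(\gamma_c) = \lim_{c \to \infty} 2 = 2
\qquad \text{but} \qquad
\ell\!\left(\lim_{c \to \infty} \gamma_c\right) = \ell(\gamma) = \sqrt{2}
\]
```

In other words, taking the limit and measuring the length don't commute here: the shapes converge, but that says nothing about what their lengths converge to.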
The apparent collapse of our infinitely jagged shape doesn’t have any profound meaning; it’s just an outcome of a thought experiment in a framework where numbers must be finite, but processes can continue without end. This asymmetry can produce wacky results elsewhere, too; the previously-discussed case of 0.9999… = 1 is another manifestation of the same phenomenon.
If we’re in a philosophical mood, we could insist that the geometric fine structure of the curve survives but just becomes too small to ever exert any influence on real numbers. That’s not just grasping at straws: there are nonstandard analysis approaches that allow infinitesimals and that would keep the two curves distinguishable, for some definitions of infinity.

Another interpretation of the “infinity times infinitesimal error” statement is just to realize that the staircase only _appears_ to converge to the diagonal; if you zoom in closely enough, you see that the staircase never actually converges. So there is no real mystery; it’s just a matter of scale.
For some reason, to me, this makes sense intuitively, while on your previous article I mentioned that 0.(9) still bothers me somehow.