The proof that π = 4
Recreational math: why a troll proof involving circles is less wrong than it seems.
In an article published here last week, I discussed the perils of thinking about infinity as a number. More specifically, I criticized the structure of some of the elementary proofs that 0.9999… = 1.
As a teaching prop, I wheeled out the following equation:
(1 - 1) + (1 - 1) + (1 - 1) + … = 0
This is an endless sum of alternating +1 and -1 terms. Pairwise, they all work out to zero, so the equation seems to make sense.
At the same time, there’s no risk of running out of terms in an infinite sum, so it seems harmless to shift the annotations one position to the right:
1 + (-1 + 1) + (-1 + 1) + … = 1
This seems to be saying that 1 = 0. Oops.
The reason I like this “proof” is that it’s hard to reflexively dismiss. A common reaction is “oh yeah, but this left an unpaired - 1 at infinity”, but what does this mean? If there’s a single, specific element at the ∞-th position in the sum, what do we find at position ∞ + 1?…
In the earlier article, we concluded that in contexts like these, infinity must be understood as a process metaphor, not a quantity. We’re not talking about an infinite number of steps as much as we’re talking about an ill-specified number of steps.
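To make the "process" reading concrete, here is a minimal Python sketch that simply keeps adding terms of the alternating sum. The helper name `partial_sums` is made up for this illustration. The running total flips between 1 and 0 and never settles, which is why neither regrouping gets to claim "the" answer.

```python
from itertools import accumulate, cycle, islice

def partial_sums(count):
    """Running totals of the first `count` terms of 1 - 1 + 1 - 1 + ..."""
    terms = islice(cycle([1, -1]), count)
    return list(accumulate(terms))

print(partial_sums(8))  # [1, 0, 1, 0, 1, 0, 1, 0]
```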
Got it, are we done now?
Well, sort of. Thinking of infinity as a process helps us make sense of a fair chunk of higher math, but it’s not always enough. Sometimes, it’s easy to fixate on the notion of infinity and miss more basic flaws in our reasoning.
Consider the following troll proof that π = 4:
We begin by drawing a circle with a diameter of 1; it follows that the circumference of this circle is 1π. We then draw a 1×1 square around the circle. The perimeter of the square is a sum of the lengths of its sides: 1 + 1 + 1 + 1 = 4.
Next, we “fold back” small sections near the corners of the square. We trim and reorient segments of length a such that the inverted corner just touches the circumference of the circle. Critically, this operation doesn’t change the perimeter of the outer shape. This should be fairly clear, but we can also double-check the result: each of the remaining long edges has a length of b = 1 - 2a; the newly-added corner sections are 8a in total. The sum of 4b + 8a works out back to 4.
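The bookkeeping for the first fold can be double-checked with a few lines of Python. This is a sketch under stated assumptions: the square sits at (0, 0) to (1, 1), the circle is centered at (0.5, 0.5), and the fold is a square notch of side a, so the inverted corner lands at (a, a). The names `folded_perimeter` and `a_touch` are invented for the example.

```python
import math

CENTER = (0.5, 0.5)   # circle of diameter 1 inscribed in the 1x1 square
RADIUS = 0.5

def folded_perimeter(a):
    """Perimeter after folding back all four corners by a."""
    b = 1 - 2 * a          # what is left of each long edge
    return 4 * b + 8 * a   # four long edges plus eight short corner segments

# Assumption: the notch is square, so the inverted corner sits at (a, a).
# Solving sqrt(2) * (0.5 - a) = 0.5 for a gives the value below (about 0.146).
a_touch = (2 - math.sqrt(2)) / 4

print(a_touch)
print(math.dist((a_touch, a_touch), CENTER))  # distance to the center, should match RADIUS
print(folded_perimeter(a_touch))              # still 4, up to rounding
```

The perimeter comes out as 4 for any a between 0 and 0.5; touching the circle only pins down which a the construction uses.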
Yet, the outer shape is now evidently a better approximation of the circle. If we perform another iteration, folding back the eight protruding corners, we get even closer to a circle without changing the perimeter in any way. If we keep doing this forever, the seemingly inescapable conclusion is that we’ll get infinitely close to the shape of the circle while keeping the perimeter of 4. In other words, the circle’s circumference must also be equal to 4. Or, to put it more bluntly: π = 4.
Help me, Mr. Internet!
The “proof” is fascinating in part because it’s multilayered; it trips up both novices and people who are quite conversant in math. For example, many popular YouTube videos offer explanations that are unsatisfying, incomplete, or outright wrong.
If you ask on a math forum, you get both closer and farther away from truth, because the usual response is something along the lines of:
I get it: mathematics doesn’t concern itself with intuition or reality. It operates in a closed universe of axioms; the main thrust of the discipline is to make these axioms as abstract as possible, and specify them as precisely as possible. So, if you don’t want to learn the lingo of mathematical analysis, what are you doing here?
At the same time, you might be one of these entitled bozos who just want to know why the π = 4 proof is wrong. If so, to peel off the first layer of the proof, don’t get distracted by the part about infinity. We start by distilling the troll proof to a simpler but functionally identical case — trying to find the length of the diagonal of a 1×1 square:
We have a diagonal of some unknown length. We make the first approximation with a path consisting of a single horizontal segment and a single vertical segment (arrows, left). The overall length of this path is 1 + 1 = 2.
Next, similarly to the circle scenario, we fold back the corner where the two segments intersect. This seemingly gives us four identical sections, each half as long as before (middle diagram); the overall length of the stairstep path appears to be the same as before. We keep going; the shape gets closer and closer to the diagonal, but the walking distance along the jagged path evidently doesn’t budge. As before, the conclusion is that the diagonal has a length of 2, rather than the ~1.41 value you can measure with a ruler or calculate from the Pythagorean theorem.
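The stairstep construction is easy to reproduce numerically. The sketch below assumes the square spans (0, 0) to (1, 1) and counts the initial two-segment path as iteration c = 1; the function names are hypothetical. It builds the vertices of the path and adds up the segment lengths.

```python
import math

def staircase(c):
    """Vertices of the stairstep path after iteration c (c = 1 is the initial path)."""
    steps = 2 ** (c - 1)
    size = 1 / steps
    points = [(0.0, 0.0)]
    for k in range(steps):
        points.append(((k + 1) * size, k * size))        # walk right
        points.append(((k + 1) * size, (k + 1) * size))  # walk up
    return points

def path_length(points):
    """Total walking distance along consecutive vertices."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

for c in range(1, 6):
    print(c, len(staircase(c)) - 1, path_length(staircase(c)))

print(math.sqrt(2))  # the diagonal, for comparison
```

The segment count doubles with every iteration, but the total walking distance stays put at 2.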
So, what’s wrong with these proofs? The first thing we should confirm is that the construction process properly converges on what it claims to converge on. If not, maybe we’re just looking at some incarnation of Zeno’s paradox?
It helps to develop some well-defined metric for that. Most simply, we can analyze the pointwise distance between the stairstep approximation and the diagonal. The following diagram should help:
On the left, I marked the peak distance between the diagonal and the initial approximation; this is labeled x. If we look at the rotated view in the lower part of the figure, the actual distance changes linearly from 0 to x (and back). So, finding the average deviation is akin to calculating the average water level in a bucket that’s steadily filling up from empty to full. The average deviation is simply 50% of the maximum. I’m going to invent a symbol for the average error and write εshape = x/2.
In the center panel, the situation repeats: we have two triangles that are precisely half the size of the previous one. Within the span of each of these triangles, peak deviation is x/2, so the average is x/4; this doesn’t change if we line up both triangles side-by-side. The calculated average deviation is εshape = x/4. Finally, after one more iteration (right), we get εshape = x/8.
The deviation remaining after iteration c (counting the initial approximation as c = 1) can be generalized as:
εshape = x / 2^c
In this equation, x is just some finite constant value; we could calculate it, but we don’t need to. Either way, on the face of it, the expression robustly moves toward zero as we iterate — so from the perspective of pointwise shape error, we can get arbitrarily close to the target, and our shape approximation algorithm looks fine.
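For readers who would rather measure than derive, here is a rough numerical check of the shape error. It assumes the diagonal is the line y = x in a 1×1 square, which makes the initial peak deviation x = 1/√2 (about 0.707), and it counts the initial path as c = 1. Because every step of the staircase is identical, the sketch samples just one of them; `mean_deviation` is a made-up name.

```python
import math

def mean_deviation(c, samples_per_segment=1000):
    """Average distance between the stairstep path and the diagonal y = x."""
    steps = 2 ** (c - 1)   # assumption: c = 1 is the initial two-segment path
    size = 1 / steps
    total = 0.0
    for i in range(samples_per_segment):
        t = (i + 0.5) / samples_per_segment * size  # distance walked along a segment
        on_horizontal = (t, 0.0)                    # point on the horizontal run
        on_vertical = (size, t)                     # point on the vertical run
        for px, py in (on_horizontal, on_vertical):
            total += abs(px - py) / math.sqrt(2)    # distance to the line y = x
    return total / (2 * samples_per_segment)

x = 1 / math.sqrt(2)  # peak deviation of the initial path for a 1x1 square
for c in range(1, 6):
    print(c, mean_deviation(c), x / 2 ** c)
```

The two columns should agree up to rounding, and both halve with each iteration.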
If not this, then what?
If the method of constructing the approximation is correct, perhaps we’re mistaken about the length of the constructed shape? It doesn’t feel that way, but once again, it’s best to have a firm metric in place:
On the left, we have a diagonal of some length n and a two-segment path (total length 2). The resulting path error — the walking-distance difference between the two routes — is εpath = 2 - n.
Next, let’s have a look at the middle diagram. Here, the length of the diagonal is obviously the same as before (n), while the stairstep curve has a length of 4 · ½; this yields εpath = 4 · ½ - n, which is no change from before. The situation repeats on the right: εpath = 8 · ¼ - n. The general formula for the error after c steps is:
εpath = 2^c · (½)^(c-1) - n = 2 - n
That’s to say, εpath appears independent of εshape; it remains constant (and pretty big) as we iterate.
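Spelled out as a worked equation, and again counting the initial path as c = 1, the doubling and the halving cancel exactly. Plugging in n = √2 from the Pythagorean theorem, and the analogous values for the circle, gives a sense of how big the stubborn error is:

```latex
\varepsilon_{\mathrm{path}}(c)
  = \underbrace{2^{c}}_{\text{segments}} \cdot
    \underbrace{\frac{1}{2^{c-1}}}_{\text{length of each}} - n
  = 2 - n

\text{diagonal: } 2 - \sqrt{2} \approx 0.586
\qquad
\text{circle: } 4 - \pi \approx 0.858
```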
So, what’s actually going on? Well, we can observe that the diagonal is smooth while the stairstep approximation gets increasingly jagged. In each iteration, the size of each “detour” is halved, but the number of detours doubles.
In this view, the core claim of the troll proof checks out. The problem lies elsewhere: to a layperson, the proof implies a contradiction where none exists. Who says that pointwise proximity and walking-path distance must be correlated to begin with? You can probably take many routes from home to work or school that are geometrically distant, but have similar lengths. Conversely, two nearby walking paths can have vastly different lengths if one is straight as an arrow and the other zig-zags a lot.
So the result is a different shape?
Well… that’s where the proof trips up many folks who are more proficient in math. For a finite (but arbitrarily large) number of iterations, the answer is yes: despite visual similarity, jagged circles and smooth circles are just two wholly separate things.
But if we take the “repeat forever” part of the troll proof literally, we enter the realm of mathematical fiction. In standard analysis — the prevailing flavor of fiction used to deal with infinity in algebraic contexts — most attempts to formally analyze the scenario would show that our increasingly jagged curve somehow collapses to a smooth diagonal (or a smooth circle) the moment we start talking about the hypothetical outcome “at infinity”. In other words, the troll proof becomes invalid in a different (and less explicable) way.
The best way to develop intuition about this scenario is to have another look at the earlier formula for the pointwise error between the stairstep pattern and the diagonal:
εshape = x / 2^c
We’d be forgiven for saying that as c (the iteration count) tends to infinity, the value of εshape becomes infinitely small. It’s not wrong, but this kind of talk is verboten: as outlined in the earlier article, infinitesimals have no place on the real number line. In standard mathematical discourse, “infinitely close to zero” and “equal to zero” are effectively the same, so the limit of εshape is zero. And if εshape = 0, then we must conclude that the two figures no longer differ in any way.
This implies that “at infinity”, and not a moment sooner, the length of the constructed curve must jump from 2 to √2 (in the case of a diagonal), or from 4 to π (in the case of a circle). This sounds weird, but there’s nothing that prohibits such a result. Keep in mind that infinity is not a point on the number line: it’s an abstraction for a place as distant from finite numbers as you can get. Discontinuities can happen as we take a leap from here to there.
The apparent collapse of our kinda-would-be-fractal doesn’t have any profound meaning; it’s just an outcome of a thought experiment in a framework where numbers must be finite, but processes can continue without end. This asymmetry can produce wacky results elsewhere, too; the earlier case of 0.9999… = 1 is another manifestation of the same phenomenon.
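In symbols, the jump amounts to saying that "take the limit" and "measure the length" can't be swapped. The notation here is invented for the illustration: S_c is the stairstep path after c iterations, D is the diagonal, and len is the walking distance.

```latex
\lim_{c \to \infty} \mathrm{len}(S_c) = \lim_{c \to \infty} 2 = 2
\qquad \text{but} \qquad
\mathrm{len}\Bigl(\lim_{c \to \infty} S_c\Bigr) = \mathrm{len}(D) = \sqrt{2}
```

The left side only ever looks at finite iterations; the right side measures whatever the shapes converge to.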
If we’re in a philosophical mood, we could insist that the geometric fine structure of the “infinitely jagged” curve survives, just becomes too small to ever exert any influence on real numbers. That’s not just grasping at straws: there are nonstandard analysis approaches that allow infinitesimals and that would keep the two curves distinguishable, for some definitions of infinity.

Another interpretation of the “infinity times infinitesimal error” statement is just to realize that the staircase only _appears_ to converge to the diagonal; if you zoom in closely enough, you see that the staircase never actually converges. So there is no real mystery; it’s just a matter of scale.
For some reason, to me, this makes sense intuitively, while on your previous article I mentioned that 0.(9) still bothers me somehow.