We are in the middle of a series of posts on π. Our journey began with A Piece of the Pi, which was followed by Off On a Tangent. Last week, I posted The Rewards of Repetition, in which I promised that we would break down the Gauss-Legendre algorithm into its distinct parts. Of course, as mentioned in the previous post, the algorithm uses mathematics that is way above the level of this blog. I will not be explaining any of that; I will explain only those parts that are at the level of this blog. So let’s proceed.
The Gauss-Legendre Iteration
First, let us recall the algorithm. The Gauss-Legendre algorithm begins with the initial values

a₀ = 1, b₀ = 1/√2, t₀ = 1/4, p₀ = 1.

The iterative steps are defined as follows:

aₙ₊₁ = (aₙ + bₙ)/2, bₙ₊₁ = √(aₙbₙ), tₙ₊₁ = tₙ − pₙ(aₙ − aₙ₊₁)², pₙ₊₁ = 2pₙ.

Then the approximation for π is defined as

π ≈ (aₙ₊₁ + bₙ₊₁)²/(4tₙ₊₁).
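Before asking why, we can at least check that the process does work. Here is a minimal floating-point sketch of the iteration, using the standard starting values a₀ = 1, b₀ = 1/√2, t₀ = 1/4, p₀ = 1:

```python
from math import sqrt

def gauss_legendre(iterations):
    """Approximate pi with the Gauss-Legendre iteration."""
    a, b, t, p = 1.0, 1.0 / sqrt(2.0), 0.25, 1.0
    for _ in range(iterations):
        a_next = (a + b) / 2       # arithmetic mean
        b = sqrt(a * b)            # geometric mean
        t -= p * (a - a_next) ** 2
        p *= 2
        a = a_next
    return (a + b) ** 2 / (4 * t)

for k in range(1, 4):
    print(k, gauss_legendre(k))  # correct digits roughly double every iteration
```

Even in ordinary double precision, three iterations already exhaust the available digits.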
The question, of course, is why this process even works. Surely this can’t be some random method that Gauss and Legendre independently stumbled upon! So what is the rationale behind this strange algorithm? After all, if we are honest, there is nothing in the steps that leads self-evidently to the conclusion that the iterations converge to π. So let us consider the steps involved and why these initial values and this iterative method lead to an approximation of π.
Comparison of Means
The method begins with the realization that, given two distinct positive numbers a and b, their arithmetic mean (AM) is always greater than their geometric mean (GM). Of course, the AM and GM are defined as

AM = (a + b)/2 and GM = √(ab).
So, since a and b are distinct, we can proceed as follows:
Convergence of an and bn
Now the iterative formulas
assure us that
and
This means that the sequence bn is steadily (monotonically) increasing while the sequence an is steadily (monotonically) decreasing. Since bn is always less than an, both sequences are bounded between the fixed values b0 and a0, and a bounded monotonic sequence must converge. Hence the two sequences will necessarily converge.
Now if we define cₙ = aₙ − bₙ,
we can obtain
which further yields
This means that the sequence cn also converges. And since cn cannot be negative, it must converge to 0. This means that an and bn converge to the same value. Since this common value is obtained by converging sequences of arithmetic and geometric means, it is called the arithmetic-geometric mean (AGM). Let us designate it with m.
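As a quick numerical illustration (the starting values 1 and 0.5 are arbitrary), a few iterations show aₙ and bₙ pinching together:

```python
from math import sqrt

a, b = 1.0, 0.5            # any two distinct positive numbers will do
for n in range(6):
    print(n, a, b, a - b)  # the gap c_n = a_n - b_n collapses very quickly
    a, b = (a + b) / 2, sqrt(a * b)
```

After only five or six steps the gap is already below the resolution of double-precision arithmetic, which hints at the quadratic convergence that makes the algorithm so fast.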
The Magic of Calculus
The next stage involves some calculus. Since I haven’t introduced calculus in the blog yet, let me just present the results. Those who wish to read in detail can look at the proofs here. Anyway, the method begins by defining
Then with some nifty substitutions and change of limits, we are able to prove
The observant reader will now realize why we had done all the stuff related to AM and GM and the convergence of the same. Now it is a trivial step to conclude that
From this we can conclude that
where m, as defined earlier, is the AGM of a0 and b0. What this means is that
and
Now using the Gamma and Beta functions it is possible to prove
What this means is that if we start with the initial values as stated earlier, each iteration of
gets us closer to the value of π.
The iterative process of the Gauss-Legendre method converges so quickly that, after only 25 iterations, the algorithm generates 45 million digits of π.
Conclusion
I apologize to the reader for not including all the calculus behind the algorithm. I have done that in the interests of not making this post too heavy. I know that it is still quite heavy. But that is the nature of the beast! There are no rapidly converging methods for approximating the value of π that do not involve a lot of calculus. Indeed, some algorithms derived using calculus are really quite strange – so strange that even the Gauss-Legendre algorithm would seem quite sensible! We will tackle some of them in the next post, after which I will lay this π to rest.
The past week got quite busy all of a sudden, and the week ahead also looks like it will be the same. I know I’m in the middle of a series of posts on π. I will continue with the series in a couple of weeks. For now, I leave you with some collections of earlier posts.
It’s frustrating. Yes, you heard me right. It’s frustrating. You may be wondering why I am saying this and in what context. So here goes. Just the other day, I was looking up some perspectives on a mathematics discussion forum. Quite naturally, there were many mathematics teachers in the forum. And quite naturally there was some discussion about questions asked by students.
As a mathematics teacher, I find these questions quite illuminating. In fact, this is one of the reasons I love being a teacher. You never know what would spark the interest of some student enough to formulate a question. So even though what I teach might be much the same from year to year, because who I teach changes, it is never boring. Indeed, even when you teach the same group of students, as I did from 2016 to 2018 and then 2020 to 2023, you can see their questions change as they themselves mature and change.
But it’s time to stop this nostalgic reverie and return to my frustration!
Case I: Equivalent Fractions
One question on the forum was asked by the teacher of a student in elementary school. I presume this is between Grade 1 and Grade 5. The student was learning fractions. When they came to the concept of equivalent fractions, the student asked the teacher what was different between the fractions 6/8 and 3/4. The teacher asked the question on the forum, requesting advice on how to respond to the student.
The poor teacher! The backlash she faced was unconscionable. So many of her fellow mathematics teachers pounced on her to tell her not just that both fractions are the same but that any mathematics teacher worth his/her salt should know this. Some questioned her competence to teach the subject. I commented, taking her side, and posted a link to a previous post in which I demonstrate that equivalent fractions might have the same value but do not contain the same information.
I faced the same issue in a recent class. A student asked me what the difference between various equivalent fractions was. That this student is currently in Grade 12 and is a reasonably astute student is an indictment against all his mathematics teachers.
Anyway, the question arose when, in answer to a problem, the student gave me the following numbers:
The student had calculated accurately, and the numerical value of each of these fractions was spot on. However, I told him that, by reducing the fractions to their lowest forms, he had lost something crucial. It was then that he asked me the question about equivalent fractions.
So I gave him another set of fractions:
I asked him what links this set of fractions. He struggled, unsuccessfully, for a few minutes. Then I told him that I had generated this set of fractions in a very similar way to the one with which he had generated the first set. I gave him a few more minutes to try his luck, to no avail.
What was it that gave rise to these two sets of numbers? There is clearly no discernible pattern in either of the sets and they may seem to be just some random fractions generated by an addled mind. However, what if I told you that the first set, before reducing to equivalent fractions, was
Right away, you will recognize that there is a pattern. The numerator is composed of multiples of 5, while the denominator is composed of factorials, starting with the factorial of 3. If I asked you to give me a formula for generating the terms, you will quickly be able to tell me:
Similarly, if, for the second set of numbers, I gave you
you will equally quickly be able to tell me
You see, while the numbers in the first set and third set have the same corresponding values (and similarly with the second and fourth sets), something is lost by reducing the fractions to their lowest terms. And with this loss, we are rendered incapable of generating further terms in the sequence because the process of reducing the fractions destroys the pattern.
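The text above describes the unreduced first sequence as multiples of 5 over factorials starting at 3!. Taking that description at face value (the general term would then be 5n/(n + 2)!, which is my assumption, not necessarily the exact sequence from the class), a few lines of Python show how reduction destroys the pattern, because the Fraction type always reduces to lowest terms:

```python
from fractions import Fraction
from math import factorial

# Assumed stand-in for the unreduced sequence: 5n over (n + 2)!
unreduced = [(5 * n, factorial(n + 2)) for n in range(1, 5)]
print(unreduced)   # [(5, 6), (10, 24), (15, 120), (20, 720)] -- pattern visible

# Fraction always reduces to lowest terms, and the pattern vanishes:
reduced = [Fraction(p, q) for p, q in unreduced]
print(reduced)     # [Fraction(5, 6), Fraction(5, 12), Fraction(1, 8), Fraction(1, 36)]
```

From 5/6, 5/12, 1/8, 1/36 alone, no student could guess the next term; from (5, 6), (10, 24), (15, 120), (20, 720), anyone can.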
Case II: Quadratic Expressions
This shows up in other places too. For example, one of the students I am currently tutoring is studying quadratics at school. Now this student is exceptionally bright. When I was concluding a class, I asked her to think about the difference between
and
For those unfamiliar with the above, let me give an example.
The above is an example of the fact that any quadratic expression of the form
can be expressed in the form
But what was the point of the re-expression? Why should we bother to do it? In the next class I asked my student if she had given this some thought. She said she had but that she could not figure out in what way
is different from
Since we had just started learning about quadratics, I did not fault her for not being able to answer the question. After all, that’s what learning is. But she told me that she had asked her teacher about the difference. And her teacher had told her that there was no difference because
I’m glad the class was online. Otherwise, my student would have felt my fury palpably. It is likely that she did sense it even over our Zoom call! The statement of the teacher is absolute rot and betrays a failure to look below the surface. For, while both expressions are indeed equivalent, the expression on the right tells me that this quadratic expression, if plotted as a function of x, has a vertex at the point (1, -13). You may wonder, “What’s the big deal? So what if the expression on the right gives me the coordinates of the vertex? I could do a few steps of algebra on the left one and get it.” Absolutely true.
However, if you were given the functions
you would not know that they all have the same vertex. However, if they were written as
the fact that the vertices are at (1, -13) is indisputable. But more than that, the second set of equations also tells us, from the signs on the coefficient pairs 3 and -13, 5 and -13, and -6 and -13, that f(x) = 0 and g(x) = 0 will have real roots, while h(x) = 0 will have only complex roots.
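A short sketch makes the conversion concrete. The coefficient triples below are hypothetical examples of my own, chosen only so that each quadratic has its vertex at (1, -13), matching the discussion above; the conversion itself is just the standard completing-the-square formulas h = -b/(2a) and k = c - b²/(4a):

```python
def vertex_form(a, b, c):
    """Complete the square: a*x**2 + b*x + c == a*(x - h)**2 + k."""
    h = -b / (2 * a)
    k = c - b ** 2 / (4 * a)
    return h, k

# Hypothetical coefficients, each giving vertex (1, -13):
for a, b, c in [(1, -2, -12), (3, -6, -10), (5, -10, -8), (-6, 12, -19)]:
    print(vertex_form(a, b, c))  # each prints (1.0, -13.0)
```

In the expanded forms the shared vertex is invisible; the vertex form makes it immediate.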
Case III: Theory of Equations
The same kind of issue shows up when we attempt to solve equations in general. Suppose, for example, you are asked to solve the equation
How would you proceed? Of course, you can use some general results from the theory of equations and conclude
But after this, what? Every student who has fumbled with these results will know that, if you try to solve these equations by some standard methods like elimination or substitution, you will revert to the original equation. So while the above four results give some valuable insights about sums and products of the roots of the equation, they do not bring us any closer to actually solving the equation.
When it comes to just simple quadratic equations, of course, the above method is powerful. After all, suppose we are given the equation
A little back-of-the-napkin calculation will tell me that I need to find two numbers α and β such that
With a little effort, I can determine that the divisors of 3705 are 1, 3, 5, 13, 15, 19, 39, 57, 65, 95, 195, 247, 285, 741, 1235, and 3705. I can place them in pairs as follows: (1, 3705), (3, 1235), (5, 741), (13, 285), (15, 247), (19, 195), (39, 95), and (57, 65), where the numbers in each pair give a product of 3705. Now since I have -3705 and not 3705, I need two numbers of opposite signs. And since I have -56 and not 56, I need the number with the larger magnitude to be negative. This gives the pair of numbers as 39 and -95. Now we can proceed to split the middle term and factorize, giving
This allows us to see that we have the product of two factors equalling zero, which must mean that at least one of the factors is zero. This allows us to conclude
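The divisor-pair search can be sketched in a few lines. Since the equation itself is not reproduced in the text, the code assumes the quadratic was x² - 56x - 3705 = 0, which is consistent with the numbers -56 and -3705 and the pair (39, -95) found above:

```python
# Hedged reconstruction: assumes the worked equation was x**2 - 56*x - 3705 = 0.
def split_pair(b, c):
    """Find integers alpha, beta with alpha + beta == b and alpha * beta == c,
    i.e. the pair used to split the middle term of x**2 + b*x + c."""
    for alpha in range(-abs(c), abs(c) + 1):
        if alpha != 0 and c % alpha == 0 and alpha + c // alpha == b:
            return alpha, c // alpha
    return None

print(split_pair(-56, -3705))  # (-95, 39)
```

The pair (-95, 39) is exactly the split 39x - 95x of the middle term, giving the factors (x + 39)(x - 95) and hence the roots -39 and 95.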
While this method is relatively simple and straightforward in the case of quadratics, there is absolutely no way to use it when it comes to any generic polynomial equation of higher degree, like the quartic equation we saw earlier.
However, the actual way of solving the quadratic, by having a product equal to zero, should allow us to see that, if we had a quartic equation of similar form, we would be able to solve it. Indeed, the earlier quartic equation can be written as
This will allow us to conclude that
In other words, though it is almost child’s play for a student in grade 9 or above to show that
can be expanded to give
it would take even a skilled mathematician quite a few iterations of trial and error before she was able to solve
However, most high school students and perhaps even some middle school students would be able to solve
It is, therefore, quite frustrating when students come to me in Grade 11 and tell me that their teachers either found no difference between the expanded forms of expressions and the factorized forms or that they actually preferred the expanded form. Granted, the parentheses seem intrusive at times, and a person may certainly prefer the visual appeal of the expanded form to the factorized form. However, when it is clear that the expanded form is difficult, if not impossible, to solve, this preference would be like preferring opacity to lucidity.
Case IV: Permutations and Combinations
I attribute this unfortunate preference to the noose-like grip that textbook publishers and calculator manufacturers have on the global high school examination boards and schools. This fourth case should highlight the point well. Suppose we were faced with the following problem. In how many ways can a committee of 7 people and another of 8 people be chosen out of 25 people such that 2 members of each committee are to be designated as president and secretary and such that no person belongs to both committees? How would we proceed?
We could first select the 7 member committee, which can be done in 25C7 ways. Now among these 7 members, we have to choose 1 to be president and 1 to be secretary, which can be done in 7P2 ways. We use a permutation because it matters whether a person is president or if she is secretary. Now we have 18 people remaining. We can choose the second committee of 8 members in 18C8 ways and the president and secretary in 8P2 ways. So in all we have
Most textbooks will state that the answer is 49473074851200 or 4.947 × 10¹³ ways or something of the sort, where each individual term has been calculated and then the product evaluated. But this is an onerous task to do by hand. Hence, expecting the student to get either answer is expecting them to use a calculator. But perhaps the above number is too large for most of us to even process. Let’s deal with some smaller numbers.
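For the record, the product can be evaluated exactly with Python's built-in math.comb and math.perm, confirming the figure above:

```python
from math import comb, perm

# Choose the 7-member committee, assign its president and secretary,
# then choose and staff the 8-member committee from the remaining 18:
ways = comb(25, 7) * perm(7, 2) * comb(18, 8) * perm(8, 2)
print(ways)  # 49473074851200
```

Note that the expression itself, not the 14-digit number, is what records the reasoning.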
Suppose I have 5 students. On one day, I wish to send a team of 3 of them for a debate and to name one of the remaining 2 as a reserve. Quite naturally, this can be done in 5C3 × 2C1 = 10 × 2 = 20 ways. On another day, I need to send one student to the staff room to collect my books and another to the library to collect a journal. This can be done in 5C1 × 4C1 = 5 × 4 = 20 ways. Both tasks can be done in 20 ways. However, the number 20 obscures how it was obtained. If the textbook gives the answer as 20 for the first situation, but a student used the second method and obtained 20, she would think she has understood the concepts since she obtained the ‘correct’ numerical answer. However, her thinking is obviously flawed. However, if the textbook gave the answers as 5C3 × 2C1 and 5C1 × 4C1 respectively, any diligent student would be able to see where she had gone wrong.
When I have taught large classes of students, I have often given them a random number and asked them to give me a series of operations with only a particular single digit number that would yield the random number. For example, suppose the random number is 23 and the chosen single digit number is 4. Then I could have
There are infinitely many ways in which I can use just the number 4 and obtain 23. The challenge would be for each student to give a unique set of operations. And when I have done this exercise, never once have we needed to repeat a set of operations.
What I am trying to say is that expressions such as 5C1 × 4C1 or 5C3 × 2C1 or, as in the original problem,
may be numerically equal to 20, 20, and 49473074851200 but contain far more information and give far more insight than the explicit numbers. However, if you are a calculator manufacturer, none of these insightful forms will help you sell your products!
Escaping the Trap
So you, that is, the calculator manufacturer, and the examination boards and the textbook publishers form a nexus in which these explicit but uninsightful forms are given preference over the forms that can be used to yield mathematical insight and promote mathematical understanding.
Unfortunately, most teachers go ‘by the book’. And so over time they forget that 3/4 and 6/8 may be numerically equivalent but have quite different meanings. They forget that 20 is not just a number but could have been obtained in many ways, most of which are incorrect. They toe the line and look at the bottom line, that is, the final answer, concluding, if the answers match, that the student must have correctly executed the right sequence of operations. But as we have seen, this need not be the case.
It is time that the examination boards took a firm stance and actually favored the development of mathematical understanding of students rather than their ability to use a tool that is obsolete. I mean, where in the world does anyone use a calculator these days? Some old style retail outlets may use four function calculators to add the prices of items on a list and maybe also to calculate any sales tax or GST. However, where does anyone use a calculator for trigonometric or logarithmic or exponential functions except in school where it is forced onto the students? If we want students to be able to use some kind of technology to evaluate complicated functions, then we should teach them a programming language rather than make them use calculators. At least the former has real world relevance for many jobs while the latter has absolutely none!
Smoke and Mirrors. Photo by theilr. (Source: Flickr)
I have long been fascinated by patterns that exist in numerical sequences. Early exploration led me to the discovery of the Fibonacci series and the associated Golden Ratio. I also came across Pascal’s Triangle, which remains a source of intrigue. Hence, when I was introduced to the tests for divisibility, I devoured them with a voraciousness unmatched till then in my life. By some accounts, I stumbled upon the tests for divisibility by 3 and 9 all by myself, without the direct involvement of my teachers.
And when my teachers ‘failed’ – or so it seemed to my young mind – to give me a test for divisibility by 7, I attempted to derive one such test. Of course, I failed at the task. Try as I might, I could come away with no universal test that would hold true for a number having any number of digits. What was it about 7?
You see, we have tests for divisibility by 2 (number ends with 0, 2, 4, 6, or 8), 3 (sum of digits is divisible by 3), 4 (last 2 digits are divisible by 4), 5 (number ends in 0 or 5), 6 (number satisfies conditions for 2 and 3), 8 (last 3 digits are divisible by 8), 9 (sum of digits is divisible by 9), 10 (number ends with 0), 11 (difference of the sums of alternate digits is divisible by 11), and 12 (number satisfies conditions for 3 and 4). Why was 7 so stubborn?
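Every test in that list can be checked empirically against direct division before we derive any of them. A quick brute-force sketch (the helper names are my own):

```python
def digit_sum(n):
    """Sum of the decimal digits of n."""
    return sum(int(d) for d in str(n))

def alternating_difference(n):
    """Sum of digits in even places minus sum in odd places (from the right)."""
    digits = [int(d) for d in str(n)][::-1]
    return sum(digits[0::2]) - sum(digits[1::2])

for n in range(1, 10_000):
    assert (n % 2 == 0) == (n % 10 in (0, 2, 4, 6, 8))
    assert (n % 3 == 0) == (digit_sum(n) % 3 == 0)
    assert (n % 4 == 0) == (n % 100 % 4 == 0)
    assert (n % 5 == 0) == (n % 10 in (0, 5))
    assert (n % 6 == 0) == (n % 2 == 0 and n % 3 == 0)
    assert (n % 8 == 0) == (n % 1000 % 8 == 0)
    assert (n % 9 == 0) == (digit_sum(n) % 9 == 0)
    assert (n % 11 == 0) == (alternating_difference(n) % 11 == 0)
    assert (n % 12 == 0) == (n % 3 == 0 and n % 4 == 0)

print("all digit tests agree with direct division")
```

Of course, checking that the tests work is not the same as understanding why, which is where we turn next.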
Moreover, why did these tests work out the way they did? Perhaps if we uncovered the reasons for which the tests for 2, 3, 4, 5, 6, 8, etc. are what they are, we can gain some insight into why 7 refuses to submit to our efforts.
In what follows, I will use the notation [wxyz] to denote a 4 digit number which has w in the thousands position, x in the hundreds position, y in the tens position, and z in the units position. Hence, the value of the number is [wxyz] = 1000w + 100x + 10y + z.
Naturally, the first test of divisibility is that by 2. In order to derive this test, we do the following:
Here it is clear that the sum of the first three terms is divisible by 2. Hence, the number [wxyz] as a whole will be divisible by 2 if and only if z is divisible by 2, which happens when z = 0, 2, 4, 6, or 8, giving the test for divisibility by 2.
Divisibility by 3
When we reach divisibility by 3, we can do the following:
That is
Here it is clear that the first term is divisible by 3 and 9. Hence, the number [wxyz] as a whole would be divisible by 3 or 9 if the second term, that is the sum of the digits, is divisible by 3 or 9 respectively. This gives rise to the tests for 3 and 9.
Divisibility by 4
For the test for divisibility by 4, we manipulate our number as follows:
Here it is clear that the first term is divisible by 4. Hence the number [wxyz] as a whole will be divisible by 4 if the second term, which is the number [yz] formed by the last two digits, is divisible by 4. This yields the test for 4.
Divisibility by 5
The test for divisibility by 5 is easy to remember. But it is obtained by doing the following:
Here it is clear that the first term is divisible by 5. Hence, the number [wxyz] as a whole will be divisible by 5 if z is divisible by 5, which happens when z is either 0 or 5, giving us the test for 5.
Divisibility by 6 and 12
Since 6 is the product of 2 and 3, a number that satisfies the condition for 2 and 3 will be divisible by 6. Similarly, since 12 is the product of 3 and 4, a number that satisfies the condition for 3 and 4 will be divisible by 12.
Divisibility by 8
The test for divisibility by 8 is an extension of that derived for 4. We proceed as follows:
Here it is clear that the first term is divisible by 8. Hence, the number [wxyz] as a whole will be divisible by 8 if the second term, which is the number formed by the last three digits, is divisible by 8. Hence, we get the test for 8.
Divisibility by 10
The test for divisibility by 10 follows the same logic as that for 5. In order to derive it we do the following:
Here it is clear that the first term is divisible by 10. Hence, the number [wxyz] as a whole will be divisible by 10 if z is divisible by 10, which happens when z is 0, giving us the test for 10.
Divisibility by 11
When we come to divisibility by 11, we realize that it is not as straightforward as the ones that came before. Indeed, while the test for divisibility by 11 was taught when I was in school, it has mostly been removed from the mathematics curriculum today. Nevertheless, the test for divisibility by 11 is quite instructive. So let us see what we can learn from it. Given the number [wxyz], we can do the following:
Here it is clear that the term in red is divisible by 11. Hence, the number [wxyz] will be divisible by 11 if the term in green, which is the difference of the sums of alternate digits, is divisible by 11, yielding the test for 11.
Divisibility by 7
When we come to the number 7, there is indeed a test for divisibility by 7. It goes as follows. Suppose we have the number […xyz]. Then if […xy] – 2z is divisible by 7, the original number is divisible by 7.
For example, consider the number 658. We can proceed as follows:
Since 49 is divisible by 7, it follows that 658 is divisible by 7. But how does this work? Consider the number […xyz]. We can proceed as follows:
Here the term in green is obviously divisible by 7. The term in blue is the one we are asked to test for divisibility by 7. If it is divisible by 7, then the whole number is divisible by 7. But what if we had a much larger number? How would we proceed? Suppose, for example, that we were testing for 548975. We would have to proceed as follows:
Since 35 is divisible by 7, we can conclude that 548975 is divisible by 7. This is a recursive method that is foolproof. It is also easy to remember. However, due to the recursion and the fact that most of us probably barely remember multiples of 7 that are 3 or more digits long, for a number with n digits, we will have to perform n – 2 recursions before reaching a 2 digit number that we can recognize as a multiple of 7 or not.
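A minimal sketch of this recursive test:

```python
def divisible_by_7(n):
    """Apply the rule: replace [...xyz] by [...xy] - 2*z until 2 digits remain."""
    n = abs(n)
    while n >= 100:
        n = abs(n // 10 - 2 * (n % 10))  # intermediate values can go negative
    return n % 7 == 0

print(divisible_by_7(658))     # True: 65 - 2*8 = 49 = 7*7
print(divisible_by_7(548975))  # True: 54887 -> 5474 -> 539 -> 35
```

The rule is sound because 10a + z = 10(a - 2z) + 21z, and 21z is always a multiple of 7.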
Each recursion involves a multiplication and a subtraction. This results in 2n – 4 operations for the n – 2 recursions.
Attempt at Divisibility by 7
We can see that in most of the tests, we generate 2 terms, the first of which is automatically divisible by the number under consideration. Then the test devolves into testing if the second term is divisible by that number. For the tests for 3 and 9 we saw that one less than a power of 10 is automatically divisible by 3 and 9. For the tests for 4 and 8 we saw that multiples of 100 and 1000 are automatically divisible by 4 and 8 respectively.
Even for the test for 7 we followed the same basic idea indicated by the 21z term. However, while the other tests yielded the answer in a single step, the test for divisibility by 7 involved a recursive approach. Hence, for a 6 digit number, we had to perform 4 recursions to obtain a 2 digit number for which divisibility by 7 can be placed in memory.
The test for 11 was most illuminating. We saw that some powers of 10 (i.e. 10 and 1000) are 1 less than a multiple of 11 while other powers of 10 (i.e. 1 and 100) are 1 more than a multiple of 11. Could we use a similar approach to obtain another test for divisibility by 7?
To attempt that, let us see how we fare when we divide powers of 10 by 7. The table below shows the remainders.
After the 6th power of 10, that is 1,000,000, the pattern repeats. And this is because
Hence, when we have a number of the form [xyzxyz], it is necessarily divisible by 7, since [xyzxyz] = [xyz] × 1001 and 1001 = 7 × 11 × 13. And since 1,000,000 is 1 more than a number that follows this pattern (i.e. 999999), the pattern of remainders will repeat after every 6th power of 10.
Now we observe that the first three powers of 10 give remainders of 1, 3, and 2 respectively. The next three powers of 10 give negative remainders in the same order.
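These remainders are easy to reproduce with Python's three-argument pow:

```python
# Remainders of 10**k on division by 7:
remainders = [pow(10, k, 7) for k in range(6)]
print(remainders)  # [1, 3, 2, 6, 4, 5]

# Rewriting 6, 4, 5 as -1, -3, -2 (remainder minus 7) gives the signed pattern:
signed = [r if r <= 3 else r - 7 for r in remainders]
print(signed)      # [1, 3, 2, -1, -3, -2]
```

Treating 6, 4, and 5 as -1, -3, and -2 is legitimate because adding or subtracting a multiple of 7 does not change divisibility by 7.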
So if I had the number [uvwxyz], we can rewrite it as
Here the term in red is obviously divisible by 7. Hence, the whole number [uvwxyz] is divisible by 7 if the term in green is divisible by 7. Let’s put this to the test and let’s label the term in braces as D.
Consider a 2 digit number (say 42). Here, u = v = w = x = 0, y = 4 and z = 2.
Since 14 is divisible by 7, 42 is divisible by 7.
Consider 91. Here
Since 28 is divisible by 7, 91 is divisible by 7.
Consider a 3 digit number (say 434). Here, u = v = w = 0, x = 4, y = 3 and z = 4.
Since 21 is divisible by 7, 434 is divisible by 7.
Consider a 4 digit number (say 2072). Here, u = v = 0, w = 2, x = 0, y = 7 and z = 2.
Since 21 is divisible by 7, 2072 is divisible by 7.
Consider a 5 digit number (say 24962). Here, u = 0, v = 2, w = 4, x = 9, y = 6 and z = 2.
Since 28 is divisible by 7, 24962 is divisible by 7.
Consider a 6 digit number (say 548975). Here, u = 5, v = 4, w = 8, x = 9, y = 7 and z = 5.
Since 14 is divisible by 7, 548975 is divisible by 7.
Now I grant that this process is not intuitive. There is no identifiable uniformity that would allow it to function as a test that is as easy and quick to use as the other tests. There is, of course, a pattern. In each block of 3 digits, the digits are multiplied by 1, 3, and 2, respectively, going from right to left. Alternate blocks are added together, and then the difference between the two resulting sums is evaluated. The resulting number, which will, in all likelihood, be a 2-digit number, is tested for divisibility by 7, and the result tells us about the divisibility by 7 of the original number.
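The whole procedure fits in a few lines; the repeating weights 1, 3, 2, -1, -3, -2 are exactly the signed remainders of the powers of 10:

```python
def weighted_test_7(n):
    """Weigh the digits of n (right to left) by 1, 3, 2, -1, -3, -2, repeating."""
    weights = [1, 3, 2, -1, -3, -2]
    return sum(int(d) * weights[i % 6]
               for i, d in enumerate(str(n)[::-1]))

# 548975: (5 + 3*7 + 2*9) - (8 + 3*4 + 2*5) = 44 - 30 = 14
print(weighted_test_7(548975))  # 14, which is divisible by 7
```

The original number is divisible by 7 exactly when this weighted sum is, because each weight is congruent to the corresponding power of 10 modulo 7.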
If we count the number of operations involved, we will see that each of the parentheses in the green term above involves 4 operations. For a number that has 3m digits, we will have m groups with m – 1 operations linking the groups, yielding 5m – 1 operations in all. With the previous test, a number of similar length (i.e. 3m digits) would require 6m – 4 operations. For longer numbers, then, the second test is quicker than the first (5m – 1 < 6m – 4 once m exceeds 3), though the first test remains the more straightforward of the two.
Stepping Back
However, despite the fact that the second test is not intuitive, such a process reveals the logic behind the other tests. Those tests are quite easy to implement, allowing us to take for granted their efficacy. And if my memory serves me right, none of my teachers ever really explained to me why those tests work, as I have done above. Hence, these algorithms come across to many students as some kind of mathematical wizardry, where reliable results are obtained by some numerical sleight of hand. Actually, I have a nagging feeling that many mathematics teachers themselves have no idea why the tests for 3, 4, 6, 8, and 9 work and most probably have no clue about the test for 11.
If we take a step back to understand what happened when we derived this test for divisibility by 7 we can see that, when we divided powers of 10 by 7, we obtained 6 different remainders. We took powers of 10 because that is precisely what the place value system of writing numbers shows us as we have seen when we expanded [uvwxyz] and other numbers. Concerning the remainders, we can further observe that in fact we obtain all the possible remainders we can get when we divide by 7. This is actually revealed in the recurring pattern ‘142857’ when we divide any integer that is not a multiple of 7 by 7. The increasing powers of 10 only shift this pattern rightward to give us the remainders as follows
Here the numbers with the same color indicate how the pattern is generated, that is, how one remainder morphs into the next as division is continued. Note especially the numbers in parentheses in the last column and compare them with the numbers in the second column of the previous table.
Since the recurring pattern for division by 7 is a string of 6 numbers (142857), we get 6 different remainders, which, in the case for 7, are all the remainders we could obtain. Due to this our test involved 6 different numbers u, v, w, x, y, and z. Since the pattern of remainders 1, 3, 2, -1, -3, -2 conveniently formed 2 blocks of 3 numbers, we were able to reduce the test to a replicating pattern involving the sum of 3 numbers for which the difference is finally taken.
Now, when we obtain tests for divisibility, we need to focus only on the prime numbers, as seen in the very brief descriptions for divisibility by 6 and 12. Powers of 2 present a unique situation, as seen in the tests for 2, 4, and 8. However, if we focus on the prime numbers, the next one we encounter is 13. What would a test for divisibility by 13 look like? It pays to note that the recurring pattern for division by 13 is ‘076923’ and then to generate the following table:
Now it helps to not rush to any conclusions. While division by 7 and 13 produced patterns of length 6, this is not the case for all primes, as we should expect. For example, division by 17, the next prime, produces the full complement of 16 possible remainders. Anyway, getting back to 13, I encourage the reader to use the table above and the ones before it to obtain a test for divisibility by 13. Also do attempt a test similar to the recursive first one. Please do put your findings in the comments. And I will see you next week.
In the last three posts, we have been dealing with some issues related to counting. In Learning to Count, I introduced us to the area of combinatorics and specifically combinatorial arguments. In the second post, Abjuring Double Counting, we looked at the inclusion-exclusion principle that is used to ensure that every element is counted exactly once. In last week’s post, No Place at Home, the third in this series, we looked at the idea of derangements and used the inclusion-exclusion principle and a recursive formula to derive the expression for the number of derangements in a set of n items. To jog your memory, we proved that

Dn = n!(1 - 1/1! + 1/2! - 1/3! + ⋯ + (-1)^n/n!)
Revisiting an Old Friend
In the previous series of posts, we had dealt with the number e. In the second post in that series, Naturally Bounded, we had derived an infinite series expression for e as

e = 1 + 1/1! + 1/2! + 1/3! + ⋯
In the above equation, we make the following substitution:
This gives us
Please read the above carefully. Raising the equation to the power x we get
Using the same techniques as we used in Naturally Bounded, we will obtain
If we substitute x = -1 in the above, we will get

1/e = 1 - 1/1! + 1/2! - 1/3! + ⋯
A Surprising Result
Now consider the equation
We can rewrite this as
Taking a limit on both sides as n approaches infinity, we will get
Expanding the summation on the right side we will obtain
But we just showed that the right side is related to e. Specifically, we can write
Or taking reciprocals we get
In other words, as n increases, the ratio of the number of permutations of n items to the number of derangements of the n items approaches e. This is a surprising result as there is no obvious reason for which e should show up in this ratio.
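If you would like to watch this convergence happen, here is a small Python sketch (the names are mine) that computes Dn using the expression we derived in the previous post and prints the ratio n!/Dn:

```python
import math

def derangements(n):
    # D(n) = n! * (1 - 1/1! + 1/2! - ... + (-1)^n / n!), in exact integer arithmetic
    return sum((-1) ** k * math.factorial(n) // math.factorial(k) for k in range(n + 1))

for n in (5, 10, 15):
    print(n, math.factorial(n) / derangements(n))  # the ratio rapidly approaches e
```

Even for n as small as 10, the ratio agrees with e to several decimal places.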
An Approximate Explanation
However, let us consider that we have a set of n items. If we now place the items in random positions, the probability that item 1 occupies its own spot will be 1/n. Hence, the probability that it does not occupy its own spot will be 1-1/n. The probability that item 2 occupies its own spot will be 1/(n-1), since one spot has already been occupied by item 1. Hence, the probability that item 2 is not in its own spot is 1-1/(n-1). For large values of n, 1-1/n and 1-1/(n-1) are nearly equal. For example, if n=1,000,000, the corresponding values are 0.999999 and 0.999998999998…, where the difference is approximately 0.000000000001. Since the probability values are so close to each other, we can approximate the probability that each of the n items does not occupy its own position to be 1-1/n.
This leads us to the overall probability being
Taking the limit as n approaches infinity we get
So we can see that the reason e appears in our study of derangements is that, for large values of n, the placements of the items in positions not their own are almost independent of each other and the associated probabilities are almost equal to each other. That we have used the word ‘almost’ twice is the reason why e only shows up in the limiting case and not for any finite value of n. Of course, as we place more items in different positions, the error involved in the approximation will increase. Hence, the above reasoning cannot fully account for the appearance of e in the expressions about derangements.
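The limiting behavior above is easy to check numerically. The following sketch compares (1-1/n)^n with 1/e for increasing n:

```python
import math

for n in (10, 1000, 1_000_000):
    approx = (1 - 1 / n) ** n
    print(n, approx, abs(approx - 1 / math.e))  # the gap shrinks roughly like 1/n
```

For n equal to one million, the approximation already matches 1/e to about six decimal places.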
Hidden in Plain Sight
However, recall that, in the previous post, we had derived
If we divide throughout by n!, we will get
This reduces to
The term in red is the probability that a random permutation of n items is a derangement. The term in green is the probability that a random permutation of n-1 items is a derangement. We can see that, as n gets larger, the right side gets closer to zero at a factorial rate. Hence, we can conclude that the two probabilities on the left side converge to some specific value.
Further, we had derived the expression for the number of derangements as
which can be rearranged to give
As observed in the previous post, the nesting of the parentheses yields the alternating signs of the terms. We also saw that the nesting of the parentheses was the result of the application of the inclusion-exclusion principle. In other words, the alternating signs are the result precisely of the way the derangements are counted. Hence, even though we might not have expected the appearance of e and might even be quite surprised by it, it is actually something that was in the expression all along.
Deranged No More
I have enjoyed writing these posts on counting, especially the last two on derangements. When I first encountered the idea, it was a novelty. Why would we ever want to place something where it does not belong? Being a person who does like order (though you may not suspect it if you looked at my desk!), the thought of placing something where it did not belong seemed only to be an instance of mathematicians playing around. But, as we saw in the previous post, derangements have a wide range of applications, some not quite intuitive. My interest was further piqued when I first saw how derangements are linked to e. The startling result hammered home the idea that mathematics, though often divided into incommensurate silos while we are taught it in school, is actually a unified body of knowledge.
I hope you have also received some enjoyment and maybe some stirring of interest through these posts. I would really like to hear from you. Please do write a comment and let me know what you liked, what you disliked, and what you found confusing.
In the previous two posts, I introduced ideas related to counting. In Learning to Count we introduced the idea of combinatorial arguments, which allowed us to gain some insight into the process of selection. In Abjuring Double Counting we saw the recursive process used to ensure that, when we count members of sets that have common elements, we neither omit them nor double count them. I had signed off last time with the promise that we would look at something I called ‘derangements’. Of course, the dictionary states that derangement is “the state of being completely unable to think clearly or behave in a controlled way, especially because of mental illness.” Quite obviously, this is not what I’m referring to! So what am I referring to?
Understanding Derangements
In mathematics, specifically in the area of combinatorics, a derangement is a permutation or arrangement in which no item occupies a position it previously occupied. This, I guess, is still a little vague. So let us consider a couple of examples.
Suppose we have a set of n letters that are in the mailperson’s bag. Obviously, each of these letters has an address on the envelope. However, if the mailperson delivers them such that no letter is delivered to the address on its envelope, then we have a derangement.
Or suppose n patrons of a restaurant check their coats in with the maître d’. On their way out the patrons collect the coats, but none of them receives their own coat. In that case, we have a derangement.
We could visualize a derangement as below
Here, we have five items (A, B, C, D, and E). However, none of the items in green matches the corresponding item in red. Using the ordering ABCDE, the above derangement is BEACD. In what follows, I will use red lettering to indicate an item that is not deranged, that is, an item that is placed where it belongs. And I will use green lettering to indicate an item that is deranged, that is, an item that is not placed where it belongs.
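Checking whether a given arrangement is a derangement is straightforward. Here is a minimal Python sketch (the function name is mine) that compares an arrangement against the original ordering position by position:

```python
def is_derangement(arrangement, original):
    """True if no item sits in the position it occupies in the original."""
    return all(a != b for a, b in zip(arrangement, original))

print(is_derangement("BEACD", "ABCDE"))  # True: no letter is in its home position
```

Try it on “BACDE”: the C sits in its own position, so the function reports False.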
We can begin counting the number of possible derangements with small set sizes. For example, with 2 items, A and B, there are 2 permutations:
As shown above, there is only one derangement.
If we increase the set size to 3 items, A, B, and C, there will be 6 permutations:
As can be seen above, there are only 2 full derangements.
We can increase the set size to 4 items, A, B, C, and D, to get 24 permutations:
As seen above, there are 9 full derangements, indicated in green boxes.
Now, with a set of 5 items, we will have 5! or 120 different permutations. I don’t know about you, but I’m disinclined to even write out all these permutations, let alone scrutinize each to determine which items are deranged, then determine which permutations are indeed full derangements, and then count the number of full derangements. If you have way too much time on your hands for mind numbing tasks, please feel free to step away and undertake this not just for a set of 5 items, but for a set of (say) 10 items, for which you can write down and scrutinize all the 3,628,800 permutations!
You see, the factorial function grows at an alarming rate. What might even be manageable for a set of 5 items, with 120 permutations, becomes an onerous task for a set that is just double the size, with millions of permutations. Hence, there must be a better way of determining the number of derangements given the size of the set. And, thankfully, there is!
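Of course, what is mind numbing for us is exactly the sort of task a computer excels at, so long as n stays small. Here is a brute-force Python sketch (the names are mine) that enumerates every permutation and counts the full derangements:

```python
from itertools import permutations

def count_derangements_brute_force(n):
    # count permutations of 0..n-1 in which no item sits at its own index
    return sum(
        all(item != pos for pos, item in enumerate(perm))
        for perm in permutations(range(n))
    )

for n in range(2, 7):
    print(n, count_derangements_brute_force(n))  # counts: 1, 2, 9, 44, 265
```

The factorial growth bites quickly, though: this approach is hopeless well before n reaches 15.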
We will proceed along two routes, both yielding the same result. But because the paths are different, we will discover different insights on the process. The first route we will adopt will use the inclusion-exclusion principle that we saw in the previous post. The second route will use a recursive formula.
Derangements using the Inclusion-Exclusion Principle
Given n items, there are n! different ways of permuting them. Out of these n! ways, there are (n-1)! arrangements in which item 1 is fixed in its own position. These (n-1)! arrangements include arrangements in which some other items may also be in their own positions. Similarly, there are (n-1)! arrangements in which item 2 is fixed in its own position. Again, these (n-1)! arrangements include arrangements in which some other items may also be in their own positions.
Of course, for each of the n items there are (n-1)! arrangements in which that item is fixed in its own position. Since there are n items in all, we are looking at
arrangements in which at least 1 item is fixed in its own position.
So now we have in consideration the initial n! less the arrangements in which at least 1 item is fixed. Hence, under consideration now are
arrangements. It is clear, however, that there has been double counting for arrangements in which any pairs of items are both in their own positions. Hence, per the inclusion-exclusion principle, we now need to add all the arrangements in which pairs of items are fixed in their positions.
Suppose items 1 and 2 are fixed. There are (n-2)! such arrangements and these arrangements include arrangements in which items other than 1 and 2 are also fixed in their positions. There are, of course, nC2 different pairs of items and for each such pair there will be (n-2)! arrangements in which the items in the pair are fixed in their positions. So we are looking at
arrangements in which 2 or more items are fixed in their positions. As mentioned, due to double counting, these need to be added back. After this adjustment we will be considering
arrangements. Of course, now we have overcompensated for the arrangements in which 3 items are fixed in their positions. We will need to subtract these.
Now suppose items 1, 2, and 3 are fixed in their positions. There are (n-3)! such arrangements and these will include arrangements in which items other than 1, 2, and 3 are also fixed in their positions. There are nC3 different triplets of items and for each such triplet there will be (n-3)! arrangements in which the items of that triplet are fixed in their positions. So we are looking at
These arrangements will need to be subtracted from the running total that we have, yielding
Continuing in this manner, we can see that the final number of derangements after all the appropriate inclusion-exclusion will be
This can be written in a more compact form as
Using this formula we obtain the table below for values of n up to 10.
Derangements using a Recursive Formula
The second method for counting derangements is by using a recursive formula. We will first derive the recursive formula and then modify it to demonstrate that it is equivalent to the one we just derived. Once again, we start with a set of n items. Now consider item 1. In order for us to end with a derangement, this item must be placed in one of positions 2 to n. Hence, there are n-1 ways of placing item 1. Since, by symmetry, each of these n-1 choices contributes the same number of derangements, we can conclude that
where X is another whole number that we have to determine. Till now, we have placed 1 of the n items in a position other than its own.
Now let us focus on the kth item, since it has been dislodged by item 1. We can either place it in position 1 or in one of the remaining n-2 positions. If we place the kth item in position 1, then we still have to derange the remaining n-2 items, which can be done in Dn-2 ways. If we do not place the kth item in position 1, then there are still n-1 items, including the kth item, that have to be deranged, which can be done in Dn-1 ways. Since the kth item must go either in position 1 or in one of the other n-2 positions, the two cases are mutually exclusive and exhaustive. Hence, we can conclude that
The expression in the square brackets should remind us of the binomial identity
You will recall from two posts back that we had considered the matter of selecting r out of n+1 items and we focused on one item, concluding that the selection either does not contain this item, represented by the first term on the left, or does contain this item, represented by the second term on the left.
While deriving the recursive equation
we used the same idea of focusing on one item and deciding whether it still needs to be deranged, represented by the first term in the square brackets, or has already been deranged, represented by the second term.
This can be expanded and rearranged as follows:
Suppose we designate
Then we can see that the earlier equation can be written as
Using this recursively we get
Substituting in terms of the derangement values we get
This can be rearranged as
We can now substitute values for n to obtain the corresponding expressions for Dn. This will give us
Considering the expression for D6, we can see that it can be rewritten as
And since 0!=1!=1, we can further write it as
Using sigma notation, this can be written as
This can be generalized to
which is exactly the same expression we obtained earlier using the inclusion-exclusion principle.
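The recursive route is also easy to put to work on a computer. Here is a small Python sketch of the recursion Dn = (n-1)(Dn-1 + Dn-2), assuming the standard base cases D0 = 1 and D1 = 0:

```python
def derangements_recursive(n):
    """D(n) from the recursion D(n) = (n - 1) * (D(n-1) + D(n-2)),
    with base cases D(0) = 1 and D(1) = 0."""
    if n == 0:
        return 1
    d_prev, d_curr = 1, 0  # D(0), D(1)
    for k in range(2, n + 1):
        d_prev, d_curr = d_curr, (k - 1) * (d_curr + d_prev)
    return d_curr

print([derangements_recursive(n) for n in range(2, 8)])  # [1, 2, 9, 44, 265, 1854]
```

Unlike the brute-force enumeration, this runs in time proportional to n, so even large values are no trouble.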
Derangements and the Inclusion-Exclusion Principle
The formula for derangements can be further written as
where the nested parentheses show how the inclusion-exclusion principle works, much like the matryoshka dolls in the image at the start of this post. What this formula tells us is that we can calculate the number of derangements by considering the total number of permutations, subtracting the cases where at least 1 item is fixed, adding back the cases where at least 2 items are fixed, subtracting the cases where at least 3 items are fixed, and so on. Due to the fact that (-1)n oscillates between 1 and -1, the nested parentheses accomplish the same final result as do the alternating signs in the original formula we obtained, namely
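We can even watch the matryoshka dolls in action. Below, both forms are evaluated for n = 6: the alternating-sign sum, and one common way of writing out the nesting, in which each layer multiplies by the next integer and adds the next alternating sign (the variable names are mine):

```python
from math import factorial

n = 6
# the alternating-sign form: 6! * (1 - 1/1! + 1/2! - 1/3! + 1/4! - 1/5! + 1/6!)
alternating = sum((-1) ** k * factorial(n) // factorial(k) for k in range(n + 1))

# the nested-parentheses form, written out literally, innermost doll first
nested = 6 * (5 * (4 * (3 * (2 * (1 * 1 - 1) + 1) - 1) + 1) - 1) + 1

print(alternating, nested)  # both give 265
```

Peeling the parentheses from the inside out reproduces exactly the sequence of subtractions and additions described above.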
Why Derangements aren’t Deranged
So far the discussion has been reasonably theoretical. And you might wonder if there is any reason for understanding derangements beyond the curiosity of mathematicians. I’ve tried to determine how the idea of derangements came about but have not been able to land on a definitive answer. It is quite likely, as in many cases in mathematics, that some mathematician just toyed with the idea and that the idea later found applications. So what are some of these applications?
In the workplace, if you wish to ensure everyone in a team has the opportunity to be responsible for every process, the idea of derangements is invaluable. In fact, if the team leader intentionally uses the idea of derangements, he/she could reallocate responsibilities in a more efficient manner than just reassigning them at random and checking whether a full derangement has been obtained.
In coding, the idea of derangements is used to check for coding errors. In stochastic processes, derangements are used to determine the entropy of a system. In chemistry, the idea of derangements is used in the study of isomers, the number of which increases rapidly, especially for compounds with long carbon chains. In graph theory, the idea of derangements can be used to determine unique walks or paths in a graph. In optimization, derangements are used in situations similar to the traveling salesman problem or the assignment problem.
In other words, while it is likely that the idea of derangements began with the toying of a mathematician, it has a wide range of applications today. But what surprises me is something that I am going to deal with in the next post. It is a result that is shocking to say the least and downright mysterious. Whatever could it be? And with that cliffhanger, I bid you au revoir for a week.
In the previous post, we had looked at the basic principles of making combinatorial arguments. We saw that these arguments, while deficient in that they do not yield any numerical values, give us insights about how we can count ways of doing a task. In the conclusion of that post, I introduced the idea of the possibility of double counting and suggested that combinatorial arguments allow us to avoid doing that.
Double counting is a particularly pressing problem, especially in democracies. When we wish to determine the ‘will of the people’, we dare not forget that each citizen falls within a number of demographic categories. For example, I am a Christian male, who is currently self-employed and who has graduate degrees. If I give my opinion on some matter and am counted separately under each of these categories, then my opinion will be given undue weight and would not be fair to others.
The question then arises, “How do we ensure that we count every item without counting any item more than once?” Anyone who is involved in any kind of statistical study needs to pay attention to this so that each data point is given the same weightage. So let me introduce you to the inclusion-exclusion principle.
Basics with Two Categories
We begin with the most basic set-up with all items classified into two categories (say A and B). Since items can have multiple classifications, we can represent this using the Venn diagram below:
The Venn diagram has three regions labelled 1, 2, and 3. Set A includes regions 1 and 2, while set B includes regions 2 and 3. If we just add the number of items in sets A and B, we can see that the central region (2) will be counted twice. In other words, the items that belong in both sets will be double counted. To avoid this, we need to subtract the items that belong in the intersection of the two sets.
This can be written as
The last term on the right side of both equations recognizes that some items have been ‘included’ more than they need to be and explicitly ‘excludes’ them from the count. This adjustment by excluding what was included too many times is what gives the principle its name.
Counting with Three Categories
Of course, we know that demographics includes many more than two categories. So how would the inclusion-exclusion principle change if we had three categories? The Venn diagram below indicates the situation.
As shown in the diagram, there are now 7 distinct regions. Set A includes regions 1, 2, 4, and 5. Set B includes regions 2, 3, 4, and 6. And set C includes regions 4, 5, 6, and 7. If we add sets A, B, and C, we will count regions 2, 5, and 6 twice and region 4 thrice. We can proceed as before and exclude the intersection regions of two sets. But since region 4 belongs to A∩B, B∩C, and C∩A, we will have excluded region 4 thrice. This means that regions 1, 2, 3, 5, 6, and 7 are now counted once, but region 4 is not counted. Of course, it is a simple matter to add it back to include it. This is represented as
The terms in red we have encountered before in the case where we had only two categories, A and B, and had to compensate for overcounting the regions in the intersection of the two sets. The term in green is a new term, introduced because, by compensating for the overcounting of the regions in the intersections of pairs of sets, we ended up undercounting the intersection of all three sets.
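For the skeptical reader, here is a quick check of the three-category formula using Python’s built-in sets (the example sets are entirely made up):

```python
# three made-up sets with overlapping members
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
C = {4, 5, 6, 7}

lhs = len(A | B | C)  # size of the union, counted directly
rhs = (len(A) + len(B) + len(C)
       - len(A & B) - len(B & C) - len(C & A)
       + len(A & B & C))  # the inclusion-exclusion formula
print(lhs, rhs)  # 7 7
```

Both sides agree, with the intersection of all three sets duly added back at the end.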
Extension to Four Categories
What would happen if there were more than 3 categories? Unfortunately, a 2 dimensional representation will no longer suffice. So let us label the regions as follows: 1: only A; 2: only B; 3: only C; 4: only D; 5: only A and B; 6: only A and C; 7: only A and D; 8: only B and C; 9: only B and D; 10: only C and D; 11: only A, B, and C; 12: only A, B, and D; 13: only A, C, and D; 14: only B, C, and D; and 15: A, B, C, and D.
Now set A includes regions 1, 5, 6, 7, 11, 12, 13, and 15. Similarly set B includes regions 2, 5, 8, 9, 11, 12, 14, and 15. Set C includes regions 3, 6, 8, 10, 11, 13, 14, and 15. And D includes regions 4, 7, 9, 10, 12, 13, 14, and 15. Now, when we just add A, B, C, and D, the regions 5 to 10 will be counted twice, regions 11, 12, 13, and 14 will be counted thrice, and region 15 four times. The overcounting for regions 5 to 10 would need an ‘exclusion’ by subtracting the intersection of pairs of sets. However, this would mean that the regions that involve the intersection of exactly three sets (i.e. 11, 12, 13, and 14) would have been ‘excluded’ three times over, meaning that they would not be ‘included’ anymore. It would also mean that the region 15 has been counted negative 2 times! This can be adjusted by now ‘including’ these four regions. However, this means that region 15 is now ‘included’ 2 times. Hence, we make a final adjustment and subtract the intersection of all four sets. This is represented as
Here the red terms represent the adjustments for regions 5 to 10, the green ones for regions 11 to 14, and the blue one for region 15.
Generalization
If we continue in this manner, adding more categories, we will be able to demonstrate that
This can be written in a more compact form as
I recommend you take a close look at how the last two equations are actually the same as we will need such ‘parsing’ of formulas in future posts!
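The compact form translates almost directly into code. Here is a sketch (the names are mine) that computes the size of a union by summing the sizes of all non-empty intersections with alternating signs, and compares the result against taking the union directly:

```python
from itertools import combinations
from functools import reduce

def union_size_by_inclusion_exclusion(sets):
    # sum the sizes of all non-empty intersections, with alternating signs
    total = 0
    for r in range(1, len(sets) + 1):
        sign = (-1) ** (r + 1)  # +1 for odd-sized groups, -1 for even-sized ones
        for combo in combinations(sets, r):
            total += sign * len(reduce(set.intersection, combo))
    return total

sets = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}, {1, 5, 6}]  # made-up example sets
print(union_size_by_inclusion_exclusion(sets), len(set().union(*sets)))  # 6 6
```

Note how the loop over group sizes mirrors the sigma over intersection sizes in the compact formula, sign and all.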
What we have seen here is that we can recursively ‘exclude’ intersections with a greater number of sets and in so doing avoid the twin traps of overcounting and undercounting. In the next post we will look at a way of counting an interesting arrangement of objects known as a ‘derangement’. Hope you stay sane till then!
I have been teaching high school students for a while now. While I have taught physics and theory of knowledge, my main focus has been mathematics. One of the more fascinating areas of the high school mathematics syllabus is probability and statistics. Now don’t get me wrong. I know that, in an earlier post, I bemoaned the early inclusion of statistics, especially when it is done only to justify the use of a calculator. I have not changed my position on that. I still believe that many of our syllabuses, especially those that come from the Western hemisphere, are artificially bloated with topics that require the use of calculators, a move that reduces the priority of conceptual understanding in favor of brute force methods.
Indeed, it is precisely a method used in the area of probability and statistics that demonstrates how there is elegance in one approach to problem solving that is completely absent in another. The method I am referring to is commonly known as Combinatorial Proofs or Combinatorial Arguments.
The Meaning of ‘Combinatorics’
Now, many of you may be wondering what in the world ‘combinatorial’ means. Combinatorics is the branch of mathematics that deals with grouping of elements of finite sets of objects. Specifically, combinatorics is concerned with the number of such groups that can be formed rather than the groups per se. So, for example, if we have a set
Then the subsets with exactly 2 elements are
While I have considered the elements of the set to be the first three natural numbers, there is no reason we cannot extend this idea to any set with 3 distinct elements. Hence, in general we can say that, given a set containing 3 distinct elements, there are 3 ways to form a subset with exactly 2 elements. Another way of stating this is that the number of ways of selecting 2 out of 3 distinct objects is 3.
Backtracking a Bit
We have seen this idea before. In a post on fractions, I introduced the idea that the number of ways of selecting r out of n distinct items is
We revisited this idea in a post on the number e, in which we obtained the lower and upper bounds for e.
Using the above formula, we can prove some crucial results. For example, let us prove that
We can proceed as follows:
Or let us prove that
We can proceed as follows
Here, the proof depends, in large part, on our ability to recognize that the expression in red is common in both terms, allowing us to consider it a factor of the entire expression. However, this is not necessarily obvious. And unless one has developed some skill with manipulating binomial coefficients, it is unlikely one would be able to identify that such an expression actually is common to both terms.
The two ‘equations’ we have proved are known as ‘binomial’ identities because they involve the coefficients that arise from the binomial expansion. Some of them can be quite difficult to prove using the brute force method of substituting the formula for nCr and then manipulating the various expressions. However, the method of using combinatorial arguments allows us to make logical arguments to prove the result. This actually gives us the insight about why the identity exists, which the brute force method cannot do – or at least cannot easily do.
Selection Equals Unselection
Suppose we have to prove that
using combinatorial arguments.
We can argue as follows. Suppose we have n items and we have to select r of them for some purpose. This obviously can be done in nCr ways. However, by selecting the r items for the purpose, we automatically create a selection of the remaining n-r items not for that purpose – that is, for some other purpose. Selecting n-r items for some other purpose can obviously be done in nCn-r ways. Since the selection of r items automatically creates a selection of the remaining n-r items, the number of ways must be equal. Hence,
What this means is that selecting r out of n things or n-r items out of n things can be done in the same number of ways. Alternatively, we could say that the number of r member committees that can be formed from n candidates is the same as the number of n-r member committees that can be formed from the same n candidates.
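Although the combinatorial argument needs no computation, a one-minute numerical check never hurts. Python’s standard library computes binomial coefficients for us:

```python
from math import comb

# the identity nCr = nC(n-r), checked for every r with n up to 11
for n in range(12):
    for r in range(n + 1):
        assert comb(n, r) == comb(n, n - r)
print("selection equals unselection, verified")
```

Every committee of r members leaves behind a committee of n-r non-members, and the numbers agree.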
Included or Excluded
Suppose we have to prove that
using combinatorial arguments.
We can argue as follows. Suppose we have n+1 items and we have to select r of them. This can obviously be done in n+1Cr ways. Now let us focus on any one of the n+1 items and let us call it item X. Now there are two cases. Either X is in our final selection or X is not. If X is not, then we still have to select r items from the remaining n items, which can obviously be done in nCr ways. If X is in the final selection, then we have to select r-1 items from the remaining n items, which can obviously be done in nCr-1 ways.
Since in both cases we end up selecting r items out of n+1 items it follows that
In other words, the number of ways of selecting r items out of n+1 items equals the sum of the number of ways of selecting r items out of n items, excluding one particular item, and the number of ways of selecting r-1 items out of n items, including the same item.
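This identity, too, is easy to confirm numerically across a range of values:

```python
from math import comb

# the identity (n+1)Cr = nCr + nC(r-1), checked for every valid r with n up to 11
for n in range(12):
    for r in range(1, n + 1):
        assert comb(n + 1, r) == comb(n, r) + comb(n, r - 1)
print("included-or-excluded, verified")
```

The two terms on the right are exactly the ‘X excluded’ and ‘X included’ cases of the argument above.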
Committees and Sub-committees
Suppose we have to prove that
using combinatorial arguments.
We could argue as follows. Suppose we have n items from which we have to select r items. This can be done in nCr ways. Now suppose we select k items from the r that we originally selected. This can be done in rCk ways. The two selections can obviously be done in nCr×rCk ways.
However, we can make these selections in a different way. Let us first select the k items from the original n items. This can be done in nCk ways. Now we have to select r-k items from the remaining n-k items, which can be done in n-kCr-k ways. The two selections can obviously be done in nCk×n-kCr-k ways.
Since in both cases we end up with r-k items for one purpose and k items for a second purpose, the number of ways must be the same. Hence,
In other words, the number of ways of forming a committee with r members that contains a sub-committee with k members is equal to the number of ways of first forming the k-member sub-committee and then selecting the r-k committee members who are not a part of the sub-committee from the remaining unselected n-k candidates.
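As a final numerical sanity check, we can run through all the valid triples for small n:

```python
from math import comb

# the identity nCr * rCk = nCk * (n-k)C(r-k), checked for all valid triples with n up to 9
for n in range(10):
    for r in range(n + 1):
        for k in range(r + 1):
            assert comb(n, r) * comb(r, k) == comb(n, k) * comb(n - k, r - k)
print("committees and sub-committees, verified")
```

Whether the committee or the sub-committee comes first, the count is the same.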
Wrapping Up
Now it is true that the combinatorial arguments do not give us any idea about how many ways the selection can be done. In that sense, they are deficient. However, the combinatorial arguments give us a clear understanding of how we can go about counting some specified subsets of a given set. The insights gained by the combinatorial arguments allow us to understand how best to enumerate sets so as to avoid double-counting and omission. This is not possible with a formula because much is lost when we express things in a compact formula. The frugality and austerity of mathematics itself causes things to become opaque. By using logical proofs, such as those involved in the combinatorial arguments, we can understand how the results we memorize come about. Unfortunately, because we all too often want a numerical answer to a question, we do not spend much time teaching and learning such elegant forms of argumentation. But by doing so we miss out on yet another factor that gives rise to beauty in mathematics.
In the graphs below, I have used the data from the relevant sources and may or may not have performed simple mathematical operations (addition, subtraction, multiplication, or division) on the data. The purpose of this is to obscure the factor being discussed, before revealing it, while ensuring that the essential shape of the distribution remains the same. In most cases, as expected, the data is not strictly normally distributed. However, for the sake of making the discussion less burdensome, I have assumed a normal distribution and have done a best fit for the data.
Statistics and the Potential Abuse of Mathematics
We have perhaps all heard the saying, “Lies, damned lies, and statistics.” This is presumably from an 1894 paper read by a doctor called M. Price, who argued that there were “the proverbial kinds of falsehoods, ‘lies, damned lies, and statistics.’” According to Book Browse, “This expression is generally used in order to cast doubt on statistics produced by somebody a person does not agree with, or to describe how statistics can be manipulated to support almost any position.”
It is true that statistics can be misused. I remember when I was working at a place that coached students for the IIT-JEE. This institute had classes at about a dozen locations in the city. At each location we started out with about 40 students in each class. This meant that we began with about 500 students. However, toward the end of the 2 year program, most classes were down to between 10 and 30 students. At one location we had 25 students. When the results of the JEE were announced, it turned out that 15 of these 25 students had succeeded. From the remaining locations about another 15 had succeeded.
What should we have reported? That our success rate was 40 out of 500 students, including those who had dropped out because we had ‘failed’ them, for a success rate of 8%, which was then more than 3 times the national average? Or that at this particular location our success was 60% – about 25 times the national average? What would have made for better advertising? Obviously the second strategy! And while it told the truth, namely that at that location 60% of the students had succeeded, it did not tell the whole truth, namely that, of the students excluded from the sample, only 15 out of 475 had succeeded? Interestingly, even 15 out of 475, or 3.15%, is slightly higher than the national average. In other words, by any metric, the institute had done better than the national average. However, human greed is such that I do not have to tell you which statistic was finally used in the advertisement campaigns for the next year!
Mathematics itself is unbiased and unmoved by our preferences and prejudices. However, it can be used, misused, and abused to suit all sorts of positions. Unless we have a strong ethical foundation, then, we will abuse the realm of mathematics. And as I have explained in another post, mathematics carries great weight in our societies. Hence, if we are able to support some position using mathematics, even if the mathematics is abused, it will most likely carry a lot of weight and manage to convince many people. The only way to counter this is to delve into the mathematics to give the full picture of what is being discussed, hopefully to highlight the ways in which the mathematics has been abused.
Recently, I came across an example of just such abuse. I will, however, assume that the people who propagated this abuse actually did not understand the mathematics involved. Otherwise, I would have to question their intentions. I think the lesser accusation is that they have failed to understand how mathematics works rather than that they intentionally have misled people down a path that is, at least from the perspective of mathematics, a blatant lie. But before we get to that, let me set the stage with a few other contexts in which similar data can be used.
Compatibility of Data
Consider, for example, the figure below:
Fig. 1. Variable A on the x-axis for two groups in green and orange.
Fig. 1 shows the distribution of some measure for two groups. One group is in green, while the other is in orange. As declared in the opening disclaimer, I have represented the data as being normally distributed. Apart from this, the population size for both groups is identical. What we can see is that the green group has a lower mean, accounting for its peak being to the left of the orange peak, and a lower standard deviation, accounting for its peak being higher than the orange peak.
Can the data be combined? Of course, from the perspective of mathematics, we just have sets of numbers! So we can combine the two datasets to get:
Fig. 2. Variable A on the x-axis for two groups in green and orange. The combined distribution is in blue.
In Fig. 2, the blue graph indicates what we would get if we combined the two datasets. Since the size of both original groups was the same and because of another factor we will shortly discuss, the blue graph seems to also be normally distributed. It actually isn’t, as we will see as we continue. However, even if we assume that the blue graph is normally distributed, we can see that it has a mean and standard deviation between those of the two original datasets. As mentioned earlier, from just a number crunching perspective there is no problem doing this since we are only dealing with some measure that is valid for every element in both datasets. However, we should ask if this makes sense in the real world since we are using mathematics to represent the real world.
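For readers who like to tinker, the pooling can be sketched in a few lines of Python. The parameters below are made up purely for illustration (I am not reproducing the actual data behind the figures); the point is that the pooled curve is a mixture of the two normal curves, not itself a normal curve, even though its mean and variance are easily computed from the two groups:

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Made-up parameters standing in for the two groups
mu_g, s_g = 10.0, 1.0   # green: lower mean, lower standard deviation
mu_o, s_o = 11.0, 1.5   # orange: higher mean, higher standard deviation
w = 0.5                 # equal group sizes, so equal weights

def combined_pdf(x):
    """Density of the pooled (blue) data: a 50/50 mixture of the two normals."""
    return w * normal_pdf(x, mu_g, s_g) + (1 - w) * normal_pdf(x, mu_o, s_o)

# Pooled mean: the weighted mean of the group means
mix_mean = w * mu_g + (1 - w) * mu_o
# Pooled variance: weighted average of the variances plus the spread of the means
mix_var = (w * s_g ** 2 + (1 - w) * s_o ** 2
           + w * (mu_g - mix_mean) ** 2 + (1 - w) * (mu_o - mix_mean) ** 2)
```

Evaluating combined_pdf at mix_mean and comparing it with a true normal density of the same mean and variance shows that the two curves differ, which is why the blue graph only seems normal.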
So we have to first ask what was being measured. Suppose I told you that for both groups what was measured was the diameter of a component. What would you conclude? You may conclude that combining the two datasets is perfectly fine since we were measuring the same quantity for both groups.
However, what if I told you that the green group represents the outer diameter of a group of bolts and that the orange group represents the inner diameter of a group of nuts? Right away you would see that combining the two groups is a meaningless activity because bolts are bolts and nuts are nuts! In fact, by combining the two sets we lose the ability to determine what percentage of nuts and bolts in the two groups actually fit each other within some tolerance band. The blue line is as meaningless as any graph drawn by a random monkey with a pen!
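To see concretely what the pooled data can no longer tell us, here is a small Python sketch with made-up diameters and a hypothetical clearance band. The count below needs both labelled datasets; it cannot be recovered from a single pooled list of diameters:

```python
# Made-up measurements (mm) and a hypothetical clearance band
bolt_outer = [9.8, 10.0, 10.1, 10.3]   # green group: bolt outer diameters
nut_inner = [10.0, 10.2, 10.4]         # orange group: nut inner diameters
max_clearance = 0.2                    # a nut fits a bolt if 0 <= gap <= 0.2

# Count compatible (bolt, nut) pairs; this requires knowing which
# measurement came from a bolt and which from a nut
fitting_pairs = sum(1 for b in bolt_outer for n in nut_inner
                    if 0 <= n - b <= max_clearance)
```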
What we can conclude from this is that it is crucially important when combining two datasets to know whether or not such a combination actually makes sense. In the context of the nuts and bolts, a bolt with a diameter greater than the mean diameter of the nuts does not function as a nut! It is still a bolt, but may have fewer compatible nuts among the orange group.
Loss of Specificity
Or consider the graphs below:
Fig. 3. Variable B on the x-axis for two groups in green and orange.
Once again, we have the distribution of some measure for two groups – green and orange. Here the size of the green group is larger than the size of the orange group. What we can glean from the graphs is that each distribution has a mode, which happens to be the mean since, as mentioned in the opening disclaimer, I have adjusted the data so that the distributions are normal. We can also see that the mode of the green group is higher than that of the orange group, while the standard deviation of the green group is smaller than that of the orange group. Once again, we can combine the two datasets to get:
Fig. 4. Variable B on the x-axis for two groups in green and orange. The combined distribution is in blue.
Since the green group was larger than the orange group, the resultant blue distribution is visibly no longer normal. It is skewed toward the green graph because of the large size of the green group. The mean of the combined data is also lower than that of the green group. This is because the means of the original green and orange groups were quite distinct. If we compare Fig. 3 with Fig. 1 we will see that the peak of the orange graph in Fig. 1 is inside the peak of the green graph, whereas in Fig. 3 the two peaks are at quite distinct values.
In other words, Fig. 3 says that whatever is being measured has significantly different values for the green group than for the orange group. Hence, the two distributions do not reach their peaks or taper off near each other as they do in Fig. 1.
Because the means of the two distributions are markedly different and the green group is larger than the orange group, the effect is akin to pulling the right tail of the orange graph, resulting in the blue graph, which now has a non-normal distribution. And while the original graphs had two clear modal values, the blue graph now has a single, indistinct modal value that is much closer to the modal value of the green graph than to that of the orange graph.
But are we allowed to combine the two datasets? In this case, what we are looking at are salaries in Austria, with the green graph representing men and the orange graph representing women. Combining the two graphs is certainly permissible since it would tell us the salary distribution without sex being a factor. Such information is certainly meaningful and could, in some contexts, be relevant.
However, once the data is combined we must observe three things. First, the combined dataset is not bimodal, but has a single mode. This is not necessarily the case, as we will shortly see. However, the point made here is that it is fallacious to assume that two sets of data, each with a mode, can be combined in a meaningful way and still remain bimodal. Second, the blue graph does not tell us about the sex-based wage discrepancy that the green and orange graphs communicate. This is only to be expected. Third, once we combine the two graphs that were based on sex, we have a single dataset that does not have any sex identifiers and hence can no longer be used to make sex-based conclusions. Once we combine datasets that were separated on the basis of some factor, that factor can no longer be distinguished from within the combined dataset.
What this means, in this particular context, is that, if we wish to reduce the sex salary gap, we must not combine the two datasets, but must allow them to stand alongside each other as in Fig. 3.
Loss of Information
Suppose, though, we had the following distributions:
Fig. 5. Variable C on the x-axis for two groups in green and orange.
Once again, we have two groups, represented by green and orange, and the data in both sets is normally distributed. In actuality, the data in these two sets is very close to normal, so I have not had to ‘massage’ the data much. Here the size of the green set is slightly larger than that of the orange set. We can see that the orange set has a mean that is lower than that of the green set. It also has a smaller standard deviation than the green set. What would happen if we combined the two sets? We would get the following:
Fig. 6. Variable C on the x-axis for two groups in green and orange. The combined distribution is in blue.
If we pay close attention to the blue graph we will notice the following. First, the data has remained bimodal. As mentioned earlier, this is a possibility but not a guarantee. Second, the modes are not as pronounced as before. This is because the population size has increased, thereby reducing the associated probability for any particular value of the measure. In other words, assuming it is meaningful to combine the two datasets, then if we do, we can no longer refer to the original green and orange lines because now only the blue graph exists since we have ignored whatever it was that separated the green and orange graphs.
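Whether a combination of two normal curves keeps two modes depends on how far apart the means are relative to the spreads. A small Python sketch (with made-up parameters) counts the modes of the mixture numerically:

```python
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def count_modes(mu1, mu2, sigma=1.0, w=0.5, lo=-5.0, hi=10.0, step=0.01):
    """Count interior local maxima of the mixture density on a grid."""
    n = int((hi - lo) / step) + 1
    xs = [lo + i * step for i in range(n)]
    ys = [w * normal_pdf(x, mu1, sigma) + (1 - w) * normal_pdf(x, mu2, sigma)
          for x in xs]
    return sum(1 for i in range(1, len(ys) - 1) if ys[i - 1] < ys[i] > ys[i + 1])
```

Well-separated means (say 0 and 5, with unit standard deviations) leave the mixture bimodal, while close means (0 and 1) merge the two peaks into one.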
Hence, now, even though the blue graph is bimodal, we have actually lost the ability to determine what factor contributed to the two modes. Hence, by combining the two graphs we have set aside any discussion based on whatever it was that separated the green and orange graphs.
In this case, the measure is the weight of persons in a study of automobile accidents involving pedestrians. Here combining the data would yield the information for humans without any consideration of sex. However, given the anatomical and physiological differences between men and women, combining the data would actually make it less useful. Remember, this data is obtained from a study about automobile accidents involving pedestrians. The blue graph only tells us that there are two modes, but does not tell us what the modes represent since the data concerning sex was ignored. Indeed, for the region between the modes of the blue graph, which contains the majority of the persons studied, there is no way of knowing whether the greater representation is that of women or of men. In particular, there is no way of knowing that, for weights lower than indicated by the point of intersection of all three graphs, more than 80% of the accidents involve women. This means that any company that relies on the blue graph is in no position to design an automobile that protects women as well as men, but can only guess about who will be affected.
Loss of Distinctions
Another set of graphs I wish to deal with before proceeding to the reason for which I wrote this post is below:
Fig. 7. Variable D on the x-axis for two groups in green and orange.
Here we have some measure that has a very similar profile for the green and the orange graphs. The modal heights are almost identical, leading to the conclusion that the spreads of the data, that is, the standard deviations, are almost identical. The only major difference here is the value of the mean, with the green data having a larger mean than the orange data. Also, the size of the green dataset is only slightly larger than that of the orange set. If we combine the two sets we get:
Fig. 8. Variable D on the x-axis for two groups in green and orange. The combined distribution is in blue.
As with the change between Fig. 3 and Fig. 4, we see a stretching of the line, yielding a lower modal height. Since the size of the green and orange sets are roughly equal, the resultant is almost symmetric, like the original two datasets. However, because the modal values are so different, the blue graph is actually not a normal curve. This is different from what we saw between Fig. 1 and Fig. 2, where the proximity of the two modal values and the equal sizes of the two datasets yielded a resultant blue graph that was very close to being normally distributed.
However, as we can see from Fig. 8, the resultant actually is not normally distributed, as anyone with some familiarity with normal distributions would expect. Note that here we are not adding two normally distributed variables. If that were what we were doing, it would yield a normally distributed variable that was the sum of the two independent variables. Rather, what we are doing here is combining the datasets and then determining what the distribution of the combination will be.
Here the two original datasets represent the heights of people from 20 countries. Once again, the green graph represents men and the orange graph represents women. What does the blue graph represent? Obviously, it represents height distribution without consideration of sex. While that may be worthwhile in some contexts, what this does is get rid of something that is crucial to our understanding of humans, namely that we are sexual beings and, as a sexually reproducing species, there is something like sexual dimorphism that actually does serve to distinguish between the sexes. In other words, while it may be true that a greater percentage of men than women have a height greater than 200 cm, it does not follow, on the basis of height, that a woman who is actually 200 cm tall is more a man than a woman! The original green and orange distributions enable us to recognize this. But the blue distribution does not allow us to say anything. In fact, the Bayesian question, “Given that a particular human has a height of 200 cm, what is the probability that this human is a woman?” cannot be answered by using only the blue graph.
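That Bayesian question can be written down directly, but only from the separate, labelled distributions. A small Python sketch with made-up height parameters (not real anthropometric figures) shows the structure of the computation that the blue curve alone cannot supply:

```python
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Made-up parameters for illustration only (heights in cm)
mu_w, s_w = 165.0, 7.0   # women
mu_m, s_m = 178.0, 8.0   # men

def p_woman_given_height(h, p_w=0.5):
    """Bayes' rule: needs the two labelled densities and the prior p_w.
    None of these ingredients survive in the pooled (blue) curve."""
    num = p_w * normal_pdf(h, mu_w, s_w)
    den = num + (1 - p_w) * normal_pdf(h, mu_m, s_m)
    return num / den
```

With these made-up parameters, p_woman_given_height(200.0) comes out small but nonzero: a 200 cm human is far more likely a man, yet is not thereby any less a woman if she is one.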
Necessary Biology Excursus
In the preceding section, I have mentioned sexual dimorphism. As sexually reproducing animals, sexual dimorphism is something that is expected for humans. While some measurements, like intelligence, cannot be reliably used to distinguish between women and men, archeologists regularly use the size and shape of skeletal bones to determine whether they are studying the remains of a woman or a man. Skeletal remains are less useful when the population being studied is itself relatively unknown. The one skeletal factor that provides almost certain identification of the sex of the person is the shape of the pelvis. There are, of course, other factors that play a role in sexual dimorphism, including muscle mass, body fat, lean body mass, and fat distribution. And, from the perspective of reproduction itself, the gametes produced by women and men are considerably different.
What can be said, then, of these two different kinds of traits, one for which the difference between women and men is negligible or inconsequential, and the other for which the difference is substantial? Let us consider each of these in turn.
Suppose we consider a trait like intelligence, for which there is no significant difference between women and men. We would get the same profile for the whole human population as we did for each sex separately since there is no significant difference. In this case, the sex of the person does not matter since the data leads us to understand that men can be as intelligent as women.
But suppose we consider traits for which there are significant differences. It has been found that, in every factor that contributes to the strength of an athlete, like lean body mass, muscle length, and muscle thickness, women are considerably weaker than men. While this difference is likely partly due to the role of testosterone, recent studies indicate that another factor is the sex chromosomes that women and men possess. The XX chromosome pair characterizes the cells that constitute women’s bodies, while the XY pair characterizes the cells that constitute men’s bodies.
In other words, while there is a distribution among members of the same sex, it would be ludicrous to claim that someone with XY chromosomes in the cells of his body and who happens to be short and less muscular is less of a man or actually a woman!
Spurious Mathematics
Fig. 9. Contrived figure created without data and with spurious ‘variables’ to support claims about ‘gender spectrum’. (Source: Cade Hildreth)
Despite this, some people claim that sex exists on a spectrum. One resource, for instance, declares, “A person’s sex can be female, male, or intersex—which can present as an infinite number of biological combinations.” (sic) As an aside, as discussed elsewhere, infinity is not a number, so saying ‘an infinite number’ is misleading at best and probably indicates that the author of the article has a tenuous grasp of mathematics. Anyway, it pays to observe that, while I presented diameter (Fig. 1 & 2), income (Fig. 3 & 4), weight (Fig. 5 & 6), and height (Fig. 7 & 8) on the x-axis, the figure above refers to ‘gender spectrum’, which is itself the very issue being discussed. Since there is no quantifiable way of specifying the variable that determines one’s ‘gender spectrum’ value, this is nothing but a spurious variable and an example of circular reasoning.
Moreover, even if we assume that we can quantify this ‘gender spectrum’ variable, as we have seen, once we combine datasets, we lose the ability to identify anything on the graph. In fact, combining the datasets need not yield two modal values at all, as we saw with Figs. 2, 4, and 8. Hence, asserting that there are still two modal values is to assume the result. Indeed, to label one peak ‘Women’ and the other ‘Men’ after combining the datasets and discarding the differences is disingenuous. And without any actual data concerning what belongs on the x- and y-axes, the figure is just something concocted to give the impression that there is a mathematical basis for the claim being made.
Yet, let us assume that, were we given some data, it still would give us two modal values. That people find themselves in the region labeled ‘Other Genders Exist’ does not mean that other genders exist, any more than the existence of a short man means he belongs to some other gender. Rather, such a combination could only remain bimodal if there is sufficient distinction between the graphs for women and men, as we saw in Fig. 6 but not in Figs. 2, 4, and 8. Such distinct graphs should actually lead to the conclusion that the two sexes are markedly different and that the datasets should not be combined, rather than that there is a region in the middle that indicates an infinite variability of sexes or genders. In fact, as we saw with Fig. 2, if we have two incommensurable datasets, in that case nuts and bolts, the existence of a large-diameter bolt does not make it less of a bolt or something in between a nut and a bolt! And the fact that there are thousands of nuts and bolts that lie in the intermediate region does not mean that there are infinitely many ‘species’ between nuts and bolts!
In fact, the argument about infinite sexes and genders is shown to be specious when we consider the example of the nuts and bolts. Unless we are able to demonstrate that two datasets can be legitimately combined, as was the case with Fig. 3 & 4, combining them has no mathematical basis.
Consider, for example, what would happen if Fig. 3 & 4 did not represent the distribution of salaries but the distribution of the amount of testosterone in an athlete’s blood sample, which works since there are in general more men athletes than women athletes. The blue line in Fig. 4 would then represent absolutely nothing because the original data was obtained using the sex of the person in mind. A woman athlete who had a high testosterone level would not qualify as less than a woman on these grounds. In such a situation, any combination of the datasets would reduce our ability to determine, for example, if a woman athlete had actually doped herself with testosterone. After all, the data line appropriate for women is the orange one. However, once we combine the datasets, we only have the blue graph to refer to. But, as we can see, the loss of the left peak renders even women athletes who have doped themselves impossible to identify since they may still fall to the left of the blue peak.
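A small Python sketch (with entirely made-up numbers, not real testosterone data) shows why a doping check must reference the sex-specific distribution rather than the pooled one:

```python
# Made-up distribution parameters (arbitrary units); illustrative only
mu_w, s_w = 1.0, 0.3    # women athletes (orange)
mu_m, s_m = 8.0, 2.0    # men athletes (green)

def flagged_for_woman(level, z_cutoff=3.0):
    """Flag a woman athlete's sample as anomalous against the women's
    distribution, using a simple z-score rule."""
    return (level - mu_w) / s_w > z_cutoff

# Pooled mean, assuming equal-size groups for simplicity
pooled_mean = 0.5 * mu_w + 0.5 * mu_m

# A sample far outside the women's distribution can still sit to the left
# of the pooled mean and so raise no alarm against the blue curve
sample = 3.0
```

Here sample is wildly anomalous for the orange (women's) distribution, yet lies below the pooled mean, illustrating how the combined graph hides exactly what we need to detect.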
Conclusion
This does not mean I believe there are no people who experience discomfort with their bodies. However, we need to be careful what we mean by this. While I may experience some discomfort with my body, I may conclude that I have discomfort with being in a man’s body since this is the only body I have experienced. However, to draw the conclusion that this must mean I am a woman trapped in a man’s body is illogical because, no matter how many resources I read, no matter how many women I speak to, I will never actually know what it means to be a woman, let alone in a woman’s body.
For instance, I may surround myself with indigenous South Africans day in and day out. But that would not make me truly understand what it was for them to go through apartheid. I may immerse myself in Chinese culture, but it would not enable me to understand the Century of Humiliation. Unless we experience something in our bodies we actually cannot truly appreciate or understand what that experience entails. Anything that is experienced in our bodies, such as our sexuality or gender, requires just such an embodied experience before we can claim it is something we are experiencing.
However, returning to the mathematical side of things, what we can say is that we need some basis outside mathematics that would allow for treating the datasets obtained for women and men as commensurable. Without such a basis external to mathematics, mathematics can be abused, as I have demonstrated. What could such an external basis be? We need to be able to identify some variable that can be measured in all humans without first considering them separately as women and men. Then the single dataset should demonstrate a bimodal behavior. But this only provides mathematical support for a claim. It is certainly necessary. But mathematical support cannot be considered sufficient.
Rather, we need to be able to provide a biological explanation for the phenomenon. And if we are claiming that sex or gender is non-binary, there needs to be a biological basis for such an explanation. I doubt we can find such a basis because we exist, as a species, to propagate the species. Reproduction is the key purpose of any species and, for our species, this happens through sexual reproduction. This means that there are distinct gametes that facilitate the reproduction of the species. That some members of the species do not have any gametes or have both kinds of gametes does not mean that there are more than two gametes.
Indeed, such an argument would be like saying that, because some people are born with no limbs and others with extra limbs, there are infinitely many ways of being limbed and that a person who has no limbs represents another way of being limbed. This is a dangerous line of thought that normalizes what is clearly a physical disability. People with physical disabilities have only recently, and in not too many countries, earned hard won liberties and access to learning and physical spaces. Saying that being born with no limbs is another way of being limbed rather than recognizing that such a person deserves genuine support from society so that they can benefit and contribute as much as anyone else would only betray a lack of compassion on our part.
Hence, I would conclude by claiming that, until our species evolves to require at least a third gamete, the idea that sex and gender are not a binary is wishful thinking at best and unmathematical and unscientific propaganda at worst.
Anyone who has studied mathematics up to at least middle school will know that there are certain mathematical ‘artifacts’ called ‘proofs’. Whether we understand them or not, proofs form one of the foundational structures of mathematics, allowing us to take one idea and extend it logically in a variety of directions to obtain new, and often surprising, results. Since proofs are often just thrown at us without much of an explanation of how the argument is mathematically rigorous, I wish to devote a few posts (not consecutive) to dealing with specific approaches to proofs.
If we had a good mathematics teacher in middle and high school, she/he would have at least told us about a few different approaches to proofs. Very likely, this would have come in the context of geometry, though it is possible that some teachers experimented with introducing their students to proofs in arithmetic, such as that there are infinitely many primes or that every number has a unique prime factorization. While these may not have been done in a very formal manner, given that the students might have lacked the symbolic language for executing a formal proof, I applaud such teachers.
In this post, I wish to address a common approach to mathematical proof known as ‘proof by contradiction’ or more technically reductio ad absurdum, Latin for ‘reducing to the absurd’. I don’t know about you, but to me there’s something more visceral in the statement ‘reducing to the absurd’ than ‘proof by contradiction’. I think it’s time we recovered some of these older, more visceral statements and junked the more cerebral statements. I mean, mathematics is cerebral enough on its own! We don’t need more phrases that are cerebral. We need something that gets us in the guts! So let’s see what absurdities we can avoid.
A Note on Proof
Before we move to that, I wish to say a word about the word ‘proof’. It has a very distinct meaning in mathematical contexts. Unfortunately, the use of the word in everyday speech does two things that render our civil discourse difficult. First, most of us know that ‘proof’ is something that belongs to the rigorous realm of mathematics. Hence, when anyone uses the word ‘proof’, we assume they are speaking of the same kind of thing as mathematicians speak of when they use the word ‘proof’. In other words, we are not discriminating enough to recognize that words are equivocal and have differing meanings in different contexts. Second, we assume that rigorous ‘proof’ is possible in fields outside of mathematics. Since the word ‘proof’ is used, in my view illegitimately, in other fields, and since we have not allowed for the equivocality of words, we conclude that rigorous ‘proof’ is possible in other areas.
Hence, I have often heard claims such as that the theory of evolution has been proved or that the theory of relativity has been proved. Similarly, the legal idea of ‘proof beyond reasonable doubt’ is also not an instance of ‘proof’ per se. All these are examples of abduction, which I dealt with in an earlier post. They are inferences to the best explanation based on the available data. Abduction is a powerful tool that should not be discounted. However, since it is based on limited data, it cannot function as mathematical proof. Rather, logicians use the term ‘abduction’, or ‘inference to the best explanation’, to denote this kind of reasoning.
The conflation of the mathematical term ‘proof’ with other methods of reasoning not only undermines what the explorations in other fields actually entail, but also assumes that these theories rest on foundations as firm as those of mathematical theorems. This results in a failure to understand what is being claimed in other fields when a theory is proposed, thereby actually proving to be a hindrance to inquiry in those fields. The most notable difference is that mathematical theorems are not subject to change based on any further evidence. This is patently untrue of theories in the sciences and other non-mathematical fields, which, being data driven, are subject to revision when new data becomes available.
Mathematical proof, however, leads to a statement – possibly a theorem – that is true for all instances of the item being studied and is not subject to change. If it could change, it is not something that has been proved. For example, the theorem that, in Euclidean geometry, the internal angles of a triangle add up to 180° is not something that is tentatively held. The claim of this statement is that there is no triangle in Euclidean geometry that does not satisfy this property.
With that out of the way, we can turn our attention to the method of reductio ad absurdum. The process of this method of proof is ingenious and I would like to thank the first person who thought of it. It involves making an assumption and following that assumption logically until we reach a point where the assumption is disproved. This means that, by assuming something is true, we are able to prove that its negation is also true. Since this is absurd, the conclusion is that the assumption must be false. Hence, the bottom line for this method of proof is the understanding that a claim and its opposite cannot both be true at the same time. When we assume that statement A is true and conclude that this must mean that its negation, ¬A, is also true, we have reached something that is ‘absurd’, hence the name.
I will consider two examples of reductio ad absurdum, which will enable us to see the brilliance and elegance of this line of reasoning. After that, we will take a step back to identify some important aspects of this method of proof. Following that, I will look at an interesting third example of reductio ad absurdum before drawing this post to a close.
Rationally Irrational
As mentioned in an earlier post, if we have an isosceles right angled triangle with legs of length 1 unit, Pythagoras’ theorem yields the length of the hypotenuse as √2 units, which I claimed is an irrational number. But how do we prove it?
We start by assuming the opposite to be true, namely that √2 is a rational number. Hence, we must be able to find two integers p and q such that p÷q = √2. Using the idea of equivalent fractions, which I dealt with in an earlier post, we make the additional assumption that p and q do not have any common prime divisors. (To illustrate only this lowest-terms requirement: p = 18 and q = 8 would not be allowed since 2 is a common divisor; for the same numerical value we would instead choose p = 9 and q = 4.)
So we proceed to square the equation to get:

p² ÷ q² = 2, which gives p² = 2q²
Now, since p and q are integers, their squares must also be integers, by the closure of the integers under multiplication. Hence, the right side of the last equation must be even since q², which is an integer, is multiplied by 2. This leads us to conclude that the left side must be even as well, since an odd number cannot equal an even number. Now, since the left side is a perfect square, it can be even only if p is even, since the square of an odd number is necessarily odd.
So we assume that p = 2k, where k is an integer. This will ensure that p² is even. This leads to:

(2k)² = 2q², that is, 4k² = 2q², which simplifies to 2k² = q²
Now, the left side of the equation is necessarily even, since k² is multiplied by 2. This must mean that the right side of the equation is also even, which can happen only if q is even.
So what we have concluded is that both p and q are even. This means that they both have 2 as a divisor, which is absurd since we assumed they had no common prime divisors.
Suppose, though, that we tried this method on a number that is not irrational. So suppose we tried this with √4, which we know is equal to 2 and, hence, rational. If we proceed as before, we get:

p² ÷ q² = 4, which gives p² = 4q²
Once again, since the right side is a multiple of 4, it must be a multiple of 2 and, hence, even. This means the left side is even as well. Proceeding as we did earlier, with p = 2k, we get:

(2k)² = 4q², that is, 4k² = 4q², which simplifies to k² = q²
Here we do not have an absurdity because all we can conclude is that k = q, which does not violate anything we initially assumed.
What we can conclude is that this process is foolproof. If the number we are dealing with is irrational, it will yield an absurdity. But if the number is rational, we will not reach an absurdity. In fact, we can use this method to test any number of the form:

√(m÷n)
where both m and n are positive integers. The reader is encouraged to proceed with the proof. I will post one in the comments in a week or two.
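For those who want to experiment numerically before attempting the proof, here is a small Python check. It relies on the fact the proof generalizes to, namely that, in lowest terms, √(m÷n) is rational exactly when both the numerator and the denominator are perfect squares:

```python
from math import gcd, isqrt

def sqrt_is_rational(m, n):
    """Decide whether sqrt(m / n) is rational, for positive integers m and n."""
    g = gcd(m, n)            # reduce the fraction to lowest terms first
    m, n = m // g, n // g
    # rational exactly when both parts are perfect squares
    return isqrt(m) ** 2 == m and isqrt(n) ** 2 == n
```

For instance, sqrt_is_rational(2, 1) is False, while sqrt_is_rational(9, 4) and sqrt_is_rational(18, 8) are both True, since 18÷8 reduces to 9÷4.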
Infinitely Primed
This brings us to the second instance of reductio ad absurdum. Here I present Euclid’s proof that there are infinitely many primes. Note that I did not say, “The number of primes is infinity” because, as mentioned in an earlier post, infinity is not a number!
As with the case of the square root of two, we begin by assuming the opposite. In this case, we assume that the number of primes is finite. Let us say that there are n primes designated as p1, p2, p3, …, pn.
Now consider the number

N = (p1 × p2 × p3 × … × pn) + 1
Now, all natural numbers greater than 1 fall into two categories: they are either prime or composite. N is obviously greater than 1 and, hence, must be either prime or composite. If it is prime, then we have found another prime apart from those among p1, p2, p3, …, pn, which is absurd since we assumed that we had listed all the primes when we considered the n primes in this list.
So, perhaps N is composite. However, consider the product

p1 × p2 × p3 × … × pn
It is clear that all the primes in our original list (i.e., p1, p2, p3, …, pn) are divisors of this product. Since N is one more than this product, dividing N by any of these primes leaves a remainder of 1. And since the smallest prime is 2, a remainder of 1 means that none of the n primes is a divisor of N. However, if N is composite, it must have a divisor between 1 and itself, and this divisor must itself have one or more prime divisors, which also divide N and hence cannot be in our original list. Hence, even if N is composite, we have proved the existence of at least one prime not in our original list, which is absurd since we assumed we had listed all the primes. For example, suppose our full list of primes is 2, 3, 5, 7, 11 and 13. Then N = 30031. But 30031 = 59 × 509, both of which are primes not in the original list.
In both cases, that is, whether N is prime or composite, we have reached an absurdity, which means that our original assumption, namely that there are only finitely many primes, is false.
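The worked example above can be checked by machine. Here is a small Python sketch (my own illustration, not part of Euclid's proof) that builds N from the supposedly complete list 2, 3, 5, 7, 11, 13, confirms that every listed prime leaves a remainder of 1, and factors N to reveal the new primes:

```python
def prime_factors(n):
    """Return the prime factors of n (with multiplicity) by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

primes = [2, 3, 5, 7, 11, 13]   # our supposedly complete list of primes
product = 1
for p in primes:
    product *= p                # 2 * 3 * 5 * 7 * 11 * 13 = 30030
N = product + 1                 # Euclid's number: 30031

print(N)                                 # 30031
print(all(N % p == 1 for p in primes))   # True: each listed prime leaves remainder 1
print(prime_factors(N))                  # [59, 509] -- neither is in our list
```

Running this confirms the claim in the text: no prime in the original list divides N, yet N factors into primes (59 and 509) that the list failed to capture.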
Down to Brass Tacks
Let us now step back to see what features of both proofs allow us to pull off the reductio ad absurdum. In the case of √2, we had two options: either the number was rational or it was irrational. In the case of the primes, N was either prime or composite. We can see that, in both cases, the possibilities describe what mathematicians call mutually exclusive and exhaustive sets. What in the world does this mean?
Mutually exclusive means that there is no overlap between the sets. That is, we cannot find a single element that belongs to both sets. In the case of rational and irrational numbers, this is achieved by the definition of the rational numbers, leading to the conclusion that any number that does not satisfy the definition must be irrational. Hence, through the definitions themselves we ensure that no number can be both rational and irrational. In the case of the prime and composite numbers, once again, mutual exclusivity is achieved through the definition of a prime number. Here it pays to note that the number 1 is considered to be neither prime nor composite. And since N is greater than 1, we know then that 1 is excluded from consideration. Among all the remaining natural numbers, each number either satisfies the definition of being a prime number or it doesn’t, thereby making it a composite number. Hence, once again, through the definitions themselves we ensure that no number can be both prime and composite.
I also mentioned that the sets are exhaustive. This means that there is no number under consideration that does not belong to one of the sets that have been defined. Once again, this is achieved by the definitions themselves. In the case of the rational and irrational numbers, one set is defined as satisfying the definition, leading to the other set automatically including the numbers that do not satisfy the definition. In other words, there can be no real number that does not fall into either category. Similarly, in the case of the primes and composites, the definition of one provides the definition of the other through the negation of the first definition. This means that, barring the exception of 1, there is no natural number that is neither prime nor composite.
So what we achieve by the categorization into rational and irrational or prime and composite is that every number under consideration, real numbers in the first case and natural numbers greater than 1 in the second, belongs to one and only one of these categories. In other words, for any number under consideration, there is no ambiguity about the set to which it belongs and there can be no other heretofore undefined set to which it belongs.
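To make this concrete, here is a minimal Python sketch (my own illustration, not from the proofs above) of a mutually exclusive and exhaustive classification: every natural number greater than 1 receives exactly one of the two labels, with no overlap and no number left out.

```python
def classify(n):
    """Label a natural number n > 1 as 'prime' or 'composite'.
    A number that fails the definition of prime is, by definition, composite."""
    if n < 2:
        raise ValueError("only natural numbers greater than 1 are considered")
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return "composite"   # fails the definition of prime
    return "prime"               # satisfies the definition

# Exhaustive: every n in the range gets a label.
# Mutually exclusive: each n gets exactly one label.
labels = [classify(n) for n in range(2, 20)]
print(labels.count("prime"), labels.count("composite"))  # 8 10
```

Note how the exclusivity and exhaustiveness fall out of the structure of the function itself: the `if`/`return` logic can emit one label or the other, never both and never neither, which mirrors the role the definitions play in the proofs.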
Actual vs. Potential Infinities
The idea of defining mutually exclusive and exhaustive sets is so groundbreaking that it is used beyond reductio ad absurdum, and I will deal with these uses in a later post. However, here I wish to address one more example of reductio ad absurdum. Mathematicians differentiate between what they call ‘actual infinities’ and ‘potential infinities’. An ‘actual infinity’ refers to an infinite collection regarded as a completed whole, with all of its infinitely many elements actually listed. A ‘potential infinity’, by contrast, refers to a process of building the set that can be continued indefinitely, so that the infinitely many elements are only potentially listed.
If the set of natural numbers is an actual infinity, then consider the following mapping from the set of natural numbers to itself.
We can readily recognize that this maps each natural number to its square. Quite obviously, the top row can be incremented by 1 indefinitely, yielding infinitely many numbers in the top row. Since the set of natural numbers is closed under multiplication, each element in the top row has a corresponding square in the bottom row.
However, it is clear that the bottom row does not have quite a few numbers that actually belong to the set of natural numbers. For example 2, 3, 5, 6, 7, 8, 10, 11, … are natural numbers that are not in the bottom row.
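The two rows can be sketched in a few lines of Python (my own illustration of the mapping described above): every number in the top row is paired with its square in the bottom row, yet most naturals never appear among the squares.

```python
top = list(range(1, 11))          # 1, 2, 3, ..., 10
bottom = [n * n for n in top]     # 1, 4, 9, ..., 100

# Each top entry is paired with exactly one bottom entry...
pairs = list(zip(top, bottom))    # (1, 1), (2, 4), (3, 9), ...

# ...yet most naturals up to 100 are missing from the bottom row.
missing = [m for m in range(1, 101) if m not in bottom]
print(missing[:8])                # [2, 3, 5, 6, 7, 8, 10, 11]
print(len(bottom), len(missing))  # 10 90
```

Extending `top` further only makes the gap more striking: the farther out we go, the sparser the squares become among the naturals.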
Now suppose the set of natural numbers represents an ‘actual infinity’. This would mean that it is possible to list all the natural numbers. Now consider the fact that every number in the top row has exactly one corresponding number in the bottom row. Hence, the two sets represented by the two rows must have the same size. However, the numbers 2, 3, 5, 6, 7, 8, 10, 11, … do not appear in the bottom row. Hence, the bottom row is missing some numbers that appear in the top row. This means that the set in the bottom row is smaller than the set in the top row.
So we have proved that the two sets have the same size and that they have different sizes, which is absurd. Hence, our assumption that the set of natural numbers is an ‘actual infinity’ must be a false assumption and should be rejected.
Applicability of Reductio ad Absurdum
We can see that reductio ad absurdum can be used in a variety of contexts. It is one of the more powerful methods of proof that mathematicians use, precisely because the logic behind it is simple and elegant: if an assumption leads to an absurdity, then the assumption must be false. From the three examples I have considered, we can also see that it applies in highly symbolic contexts (e.g., the irrationality of √2) as well as in contexts where symbols are not even needed (e.g., the ‘actual or potential infinity’ of the natural numbers). Because of this, reductio ad absurdum can also be used in non-mathematical contexts, as long as the logic is followed rigorously.
Suppose, for example, someone, trying to undermine some view that I hold, says, “All opinions are equally valid.” In response, I could propose the opinion, “The opinion that ‘all opinions are equally valid’ is invalid.” Since this is itself an opinion, and the original claim was that all opinions are equally valid, my opinion must be equally valid, rendering the first invalid!
The preceding paragraph actually highlights some serious flaws in argumentation that one encounters today. Many people make all sorts of claims that, if put to the test of reductio ad absurdum, would quickly fall to pieces. This applies to claims about climate change, politics, psychology, and religion, to say nothing of gender and sexuality, where, of late, some quite laughable assertions have been made that do not withstand logical scrutiny. I plan to explore one such claim in the next post. Till then, allow the absurdities to come to the rescue!