Tag: algebra

Factorising quadratics by focusing on the sum first

This blog post is about my way of helping students factorise quadratic expressions by inspection, which is the opposite of how most people do it.

Factorising

When you multiply out two monic linear factors to make a quadratic, the same thing always happens:

$\begin{aligned} & (x+a)(x+b) \\ &= x^2+ax+bx+ab \\ &=x^2+(a+b)x+ab \end{aligned}$

You end up with the sum of the two constant terms as the coefficient of $x$, and the product of the two constant terms as the final constant term.

Therefore, if you want to do this process backwards – that is, to factorise a quadratic expression – then you need to think of two numbers that add to give the coefficient of $x$ and multiply to give the constant term.

For example, to factorise $x^2+5x+6$, you need to think of two numbers that add to give 5 and multiply to give 6.

There are at least two ways you could go about doing this systematically.

One way is to think of pairs of numbers that multiply to give 6, and then test them to see if they also add to give 5. So, you’d think of 1×6 and test 1+6=7, which isn’t right. And you’d think of 2×3 and test 2+3=5, which is right. So your factorisation is $(x+2)(x+3)$.

Another way is to think of pairs of numbers that add to give 5, and test them to see if they also multiply to give 6. So, you’d think of 1+4 and test 1×4=4, which isn’t right. And you’d think of 2+3, and test 2×3=6, which is right. So your factorisation is $(x+2)(x+3)$.
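The two searches just described can be sketched in a few lines of Python. This is only an illustration, assuming the coefficient of $x$ (called `b` here) and the constant term (called `c` here) are both positive whole numbers; the function names are made up for this sketch.

```python
def product_first(b, c):
    """List pairs that multiply to c, then test whether each pair adds to b."""
    for p in range(1, c + 1):
        if p * p > c:
            break  # every factor pair has already appeared by this point
        if c % p == 0:
            q = c // p
            if p + q == b:
                return (p, q)
    return None


def sum_first(b, c):
    """List pairs that add to b, then test whether each pair multiplies to c."""
    for p in range(1, b // 2 + 1):
        q = b - p
        if p * q == c:
            return (p, q)
    return None


print(product_first(5, 6))  # the pair for x^2 + 5x + 6
print(sum_first(5, 6))
```

Notice that `product_first` needs a divisibility test (`c % p == 0`) before it can even list a candidate, while `sum_first` only ever multiplies.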

Every maths teacher I’ve ever met tells students to list the product first and check the sum. I think that it’s much better to tell students to do the sum first and check the product.

Examples

Let me do several examples to compare the sum first approach with the product first approach.

Example 1: $x^2+13x+40$

Product first

I need two numbers that multiply to give 40, which could be 1×40, 2×20, 4×10, 5×8 and I think that’s it. The matching sums are 41, 22, 14, 13. So the numbers I need are 5 and 8 and the factorisation is $(x+5)(x+8)$.

Sum first

I need two numbers that add to give 13. I’ll start at 10+3, and the product is 10×3=30, which is too low. Now I’ll try 9+4, and the product is 9×4=36, which is higher but still too low. Now I’ll try 8+5, and the product is 8×5=40, which is just right. So the numbers I need are 8 and 5, and the factorisation is $(x+8)(x+5).$

Example 2: $x^2+20x+91$

Product first

I need two numbers that multiply to give 91. 1×91 obviously, and the matching sum is 1+91=92. So I need something else. What else? 2? Doesn’t go. 3? Doesn’t go. 5? Doesn’t go. 7? Oh yes that does work because 91=70+21, which is 10 and 3 sevens, so 91=7×13. The matching sum is 7+13=20, so that works. The factorisation is $(x+7)(x+13)$.

Sum first

I need two numbers that add to give 20. My first thought is 10+10, and the matching product is 10×10=100, which is too high. Now 11+9=20, and 11×9=99, which is still too high, but lower. Next, 12+8=20 and 12×8=80+16=96, which is still too high, but lower. Next 13+7=20, and 13×7=70+21=91, which is just right. The factorisation is $(x+13)(x+7)$.

Example 3: $x^2+30x+144$

Product first

I need two numbers that multiply to give 144. What goes into 144? It’s 12×12, so 1, 2, 3, 4, 6, 12 will all work. Have I missed anything? Oh 9, taking a 3 from each 12. Anything over 12 will go with one of the small numbers. Right, so what have we got?
1×144, but 1+144 is way too big.
2×72, but 2+72 is too big.
3×48, but 3+48 is too big.
4×36, but 4+36 is too big.
6×24, and 6+24 is just right.
So the factorisation is $(x+6)(x+24)$.

Sum first

I need two numbers that add to 30. How about starting with 10 and 20?
10+20=30, 10×20=200, too big.
11+19=30, 11×19=110+99=209, that’s worse. I should be going the other way.
9+21=30, 9×21=189, still too big, but the right direction.
8+22=30, 8×22=160+16=176, closer.
7+23=30, 7×23=140+21=161, closer.
6+24=30, 6×24=120+24=144, just right.
So the factorisation is $(x+6)(x+24)$.

Reasons for a sum first approach

The above examples point to the many reasons why I think focusing on the sum first is better than focusing on the product first. I’m going to list them, but they overlap quite a bit, so be prepared for me to repeat myself a lot in the explanations below each reason.

Reason 1

Doing the product first requires you to know or figure out what numbers divide into another. Doing the sum first doesn’t require any special knowledge about factors.

Look at Example 2. We had to figure out that 91 had 7 as a factor at all before we could get to the answer. With the sum, it just fell out along the way.

You may argue that guessing factors is a really important skill, and I don’t disagree, but honestly students don’t have much practice at that when they start factorising quadratics, and it’s a huge barrier to success. Focusing on the sum first allows them early success without the need for this skill. And you know what, they do a lot of multiplications along the way and might even notice what numbers tend to be multiples of what other numbers.

Also look at Example 3. The number 144 has a lot of factors, and you kind of need to find all of them to be able to have things to try to see if they come out to the right sum. Most worked examples for students don’t even list all the options for factors, but just zero in on the magically right one, picking from an unspoken list in the teacher’s head. With the sums first approach, it doesn’t matter if you missed a factor.

And look, all the work you’ve done in the past to get good at seeing factors isn’t wasted! If one of the sums is 7+23 so you test 7×23 going for 144, you can actually say to yourself that 144 isn’t a multiple of 7 and just skip that one. I actually think developing this instinct for ways to shortcut the process can be quite an exciting idea to students.

Reason 2

With sums first, you can get started right away.

When you do product first, you have to think of some factors to begin with, and it’s very rare that 1×something is going to work, so there’s this job to do before you can even get started. When you do sums first, it’s not hard to think of a sum that works and you can just get on with it.

And there’s no wrong place to start either. You can just do a couple and you’ll know then if you’re going in the right direction. (See what happened with the one with 144.) So there’s no need to worry about your first inspiration – you can just get going.

Reason 3

With sums first, you feel like you’re getting somewhere.

When you investigate the sum first, you systematically change them by 1 each time and the product changes along with it, getting closer and closer to the right answer. There is a real feeling of progress, like the work is paying off. And to reinforce the previous Reason, this feeling happens right at the start, rather than having to wait for finding factors first.

I will concede that you can be systematic with the product first approach too, as you saw in my example with the 144. But to many students, the examples they see seem random, or worse, go straight to the right answer with no trial and improvement. If you do want to do product first, then I recommend being more systematic about it so that students can feel like they’re getting somewhere, rather than waiting for the lightning strike of the right one.

Note that the feeling of getting somewhere has another advantage: if you’re a long way away from the right result, it makes you feel safer to skip some steps to get there quicker. This way lies developing instincts for when some combinations of numbers are unlikely to work.

Reason 4

With sums first, there’s cool things you can help students to notice.

I personally think the experience of running through several possible sums and testing the products is some excellent fuel for helping students notice cool things, which are totally lost on a products-first approach.

For example, in the example with $20x$, the highest possible product happened when the sum was 10+10=20. That is, when it was two of the same number. This is very cool and that way lies completing the square. Also the further apart the numbers were, the further away the next product is from this one. Indeed, the differences were two apart.
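One way to see where that pattern comes from (a supplementary sketch, with $k$ introduced here to stand for how far each number is from 10): write the two numbers that add to 20 as $10+k$ and $10-k$. Then

$\begin{aligned} & (10+k)(10-k) \\ &= 100-k^2 \end{aligned}$

So for $k=0,1,2,3$ the products are 100, 99, 96, 91, which drop by 1, then 3, then 5. The biggest product is at $k=0$, and each drop is two more than the last.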

And I’ve already mentioned students noticing that a certain sum would require 144 to be a multiple of 7 and skipping it. That sort of thing makes the skill of noticing factors feel like a cheat code they’ve discovered, rather than a burden upon them. That sort of noticing is empowering for lots of students.

More examples

You’ve probably noticed that all the examples I’ve shown so far have had all positive coefficients, and they’re all monic (the coefficient of $x^2$ is 1). Well it’s time for some examples to deal with that. First I’ll deal with the negative coefficients, then later I’ll deal with non-monic quadratics. Mostly I’ll just do them straight using the sum first approach as if I didn’t know the answers yet, rather than compare them to a product first approach.

Example 4: $x^2-13x+40$

We have to think of two numbers that add to -13 and multiply to 40.

Positive numbers won’t add to a negative number, so I need two negative numbers, which will indeed multiply to a positive number.

(-10)+(-3)=-13, (-10)×(-3)=30, too low.
(-11)+(-2)=-13, (-11)×(-2)=22, even lower. Need to go the other way.
(-9)+(-4)=-13, (-9)×(-4)=36, higher.
(-8)+(-5)=-13, (-8)×(-5)=40, correct!

So the factorisation is $(x-8)(x-5)$.

Example 5: $x^2+3x-40$

We have to think of two numbers that add to 3 and multiply to -40.

If you think about the product first, there’s twice as many options as there were before, because while 4×10=40, both (-4)×10 and 4×(-10) are -40 and you have to decide which one. If you only think about the product long enough to realise you need one positive and one negative, then you can start your search with sums that add to 3 like this:
4+(-1)=3, 4×(-1)=-4, not low enough.
5+(-2)=3, 5×(-2)=-10, lower, so I’m going the right way.
6+(-3)=3, 6×(-3)=-18, lower.
7+(-4)=3, 7×(-4)=-28, getting there.
8+(-5)=3, 8×(-5)=-40, and we’re there.

I probably could have skipped a couple since there was a long way to go, but it was so pleasant watching it get closer.
Anyway, the factorisation is $(x+8)(x-5)$.

Example 6: $x^2-11x-26$

We have to think of numbers that add to -11 and multiply to give -26.

I’ll need a positive and a negative number to get a negative product, so let me start with -12+1.

-12+1=-11, (-12)×1=-12, not low enough
-13+2=-11, (-13)×2=-26, correct!

So the factorisation is $(x-13)(x+2)$.

What if I had decided to start with something less obvious, like -20+9?

-20+9=-11, (-20)×9=-180, way too low.
-21+10=-11, (-21)×10=-210, even lower, so I need to go the other way.
I’ll skip some since I was so far away.
-15+4=-11, (-15)×4=-60, getting closer.
-14+3=-11, (-14)×3=-42, getting closer.
-13+2=-11, (-13)×2=-26, just right!

Example 7: $x^2+11x+26$

We have to think of numbers that add to 11 and multiply to give 26.

Two positive numbers will work, so I’ll start with 10+1.

10+1=11, 10×1=10, too low.
9+2=11, 9×2=18, higher so I’m going the right way.
8+3=11, 8×3=24, closer.
7+4=11, 7×4=28, too big.

So there’s definitely a factorisation that will work with one number somewhere between 7 and 8 and the other somewhere between 3 and 4, but there’s not one with integers.

Example 8: $x^2+8x+20$

We have to think of numbers that add to 8 and multiply to give 20.

I’ll start with 1+7=8, 1×7=7, too low.
2+6=8, 2×6=12, too low, but closer, so I’m going the right way.
3+5=8, 3×5=15, too low.
4+4=8, 4×4=16, too low.
But there’s nowhere else to go from here. I’ll never get to 20.
So this one doesn’t factorise at all.

Interlude

It’s time to stop for a short break. I’m hoping that this set of examples has convinced you that this approach has some merit for helping students understand how quadratic expressions work, and indeed making the process a bit more playful.

I also sneakily wanted to cover some objections people have brought up, such as how you could be sure it doesn’t factorise if there’s infinitely many choices for numbers that add to the x-coefficient.
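Here is a minimal Python sketch of a sum-first search that is guaranteed to stop. It makes one choice that differs from the examples above: instead of starting anywhere and finding the right direction, it always starts with the two numbers as close together as possible (where the product is highest) and moves them apart one step at a time. The name `sum_first_search` and the parameters `b` (the x-coefficient) and `c` (the constant term) are just labels for this sketch, and it only looks for whole-number answers.

```python
def sum_first_search(b, c):
    """Find whole numbers p, q with p + q == b and p * q == c, or return None."""
    p = (b + 1) // 2  # b/2 rounded up, so p and q start as close together as possible
    while True:
        q = b - p
        product = p * q
        if product == c:
            return (p, q)
        if product < c:
            # Moving the numbers further apart only lowers the product,
            # so once we are below c there is nothing left to try.
            return None
        p += 1  # move the two numbers one step further apart


print(sum_first_search(-13, 40))  # Example 4
print(sum_first_search(3, -40))   # Example 5
print(sum_first_search(11, 26))   # Example 7: no whole-number pair
print(sum_first_search(8, 20))    # Example 8: 4 × 4 is already too low
```

Because the product drops at every step, the loop either hits the constant term exactly or passes below it, so it never has to run through infinitely many sums.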

I just have one more thing to deal with, which is what to do with a non-monic quadratic. I’m just going to do one example in two ways.

Two more examples

Example 9a: $6x^2+x-12$

There is this method called by many “the ac method” which allows you to factorise a non-monic polynomial. I didn’t learn it at school, so I don’t think of it first, but it’s always something people bring up when I talk about factorising quadratics.

The way it works is you multiply the constant term and the leading coefficient, and then think of two numbers that add to the x-coefficient and multiply to give this new answer. (If your quadratic was $ax^2+bx+c$, that means making the sum $b$ and the product $ac$, hence the name of “ac method”.) Then you split the x-term into two parts with these numbers as the coefficients and continue from there.

(For a proof, consider the product of two linear factors:

$\begin{aligned} & (ax+b)(cx+d) \\ &= acx^2+adx+bcx+bd \\ &= (ac)x^2+(ad+bc)x+(bd) \end{aligned}$

Notice how the numbers $ad$ and $bc$ add to give the x-coefficient and multiply to the same answer as the x²-coefficient times the constant term. There can only be one pair of numbers with a specific sum and product, so if you find these numbers, they will be $ad$ and $bc$ and you will be able to do that algebra in reverse.)

Anyway, this still requires you to find two numbers with a specific sum and a specific product, so you can still do sum first.
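As a rough illustration, the ac method with a sum-first search inside it might be sketched in Python like this. It assumes whole-number coefficients with a non-zero leading coefficient, only looks for whole-number factorisations, and starts its search from the middle pair rather than from an arbitrary guess; `ac_factorise` and its variable names are invented for this sketch.

```python
from math import gcd


def ac_factorise(a, b, c):
    """Return (p, q, r, s) with a*x^2 + b*x + c == (p*x + q)*(r*x + s), or None."""
    target = a * c
    # Sum-first search: m + n is always b, and we watch the product m * n.
    m = (b + 1) // 2  # start with m and n as close together as possible
    while m * (b - m) != target:
        if m * (b - m) < target:
            return None  # the product only falls from here, so give up
        m += 1
    n = b - m
    # Split the x-term into m*x + n*x and factorise in two groups.
    g = gcd(a, m)            # a*x^2 + m*x == g*x * (r*x + s)
    r, s = a // g, m // g
    h = n // r               # n*x + c == h * (r*x + s)
    return (g, h, r, s)


print(ac_factorise(6, 1, -12))  # coefficients of the two linear factors
```

For $6x^2+x-12$ the search lands on 9 and -8, and the grouping step then returns the coefficients of $(3x-4)(2x+3)$.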

The quadratic is $6x^2+x-12$. So I need numbers that multiply to give 6×(-12)=-72 and add to give 1. I’ll need a positive and a negative.

2+(-1)=1, 2×(-1)=-2, which is way too high.
3+(-2)=1, 3×(-2)=-6, which is lower, so I’m going the right way, but I have a long way to go. I’ll skip some.
6+(-5)=1, 6×(-5)=-30, which still has a long way. I’ll skip some more.
10+(-9)=1, 10×(-9)=-90, which is too far, but quite close.
9+(-8)=1, 9×(-8)=-72, which is just right

So I need to split the $x$ into $9x$ and $-8x$.

(You could argue that if I went product first, I might have realised immediately that 8 and 9 would be right, but I can guarantee you that a heap of students would not realise that. This way, they’ll get there in the end.)

So,

$\begin{aligned} & 6x^2+x-12 \\ &= 6x^2+9x -8x-12\\ &= 3x(2x+3)-4(2x+3)\\ &= (3x-4)(2x+3) \end{aligned}$

If you wanted this in fully factorised form so that it shows the roots, you’d have to pull out a 3 from one factor and a 2 from the other to get

$\begin{aligned} & (3x-4)(2x+3)\\ &= 3\left(x-\tfrac43\right)\times 2\left(x+\tfrac32\right)\\ &=6\left(x-\tfrac43\right)\left(x+\tfrac32\right) \end{aligned}$

(It’s worth noting that for many people, this “splitting the middle term and then factorising twice” thing is the way that you’re supposed to do all quadratic factorisations, including the monic ones, which I can see the appeal of if I’m honest. But I’m not rewriting my entire set of examples now.)

Example 9b: $6x^2+x-12$

There is a far more prosaic approach than the ac method, which is just to do what I’ve been doing all along but with fractions. Let me show you:

$\begin{aligned} & 6x^2+x-12 \\ &= 6\left(x^2+\tfrac16 x-2\right) \end{aligned}$

Now I’ll factorise the monic quadratic in the brackets there. I need two numbers that add to give 1/6 and multiply to give -2. They’ll have to be a positive and a negative.

2/6+(-1/6)=1/6, 2/6×(-1/6)=1/3×(-1/6)=-1/18, which is not low enough.
3/6+(-2/6)=1/6, 3/6×(-2/6)=1/2×(-1/3)=-1/6, which is lower but not low enough, and I’ve got quite a long way to go, so I’ll skip some.
7/6+(-6/6)=1/6, 7/6×(-6/6)=7/6×(-1)=-7/6, so much closer.
8/6+(-7/6)=1/6, 8/6×(-7/6)=… yeah that won’t work out right.
9/6+(-8/6)=1/6, 9/6×(-8/6)=3/2×(-4/3)=-2 yay!

So the factorisation is $6\left(x+\frac32\right)\left(x-\frac43\right)$.
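It may be worth noticing that counting in sixths is the same search as Example 9a in disguise. Writing the two numbers as $\tfrac{m}{6}$ and $\tfrac{n}{6}$ (the letters $m$ and $n$ are just labels introduced for this note), the two conditions become

$\begin{aligned} \tfrac{m}{6}+\tfrac{n}{6}=\tfrac16 &\iff m+n=1 \\ \tfrac{m}{6}\times\tfrac{n}{6}=-2 &\iff mn=-72 \end{aligned}$

which are exactly the sum and product from the ac method, and $m=9$, $n=-8$ satisfies both.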

I have to say I prefer this one to the other one in a lot of ways. But yes a big fly in the ointment is the fraction arithmetic. But honestly this seems to me to be quite low stakes, and it certainly gives a lot of practice! You have to decide how you want to play it.

Oh, and why did I choose to count in sixths? Well it turns out that in a monic quadratic with rational coefficients, if there are any rational solutions, they’ll be able to be written with the common denominator of the coefficients. (But that’s another story and shall be told at another time.)

Conclusion

So, I’ve given a lot of examples to show how the reasoning works when you factorise quadratic expressions by first focusing on the sum and checking the product, rather than the other way around as is more traditional. And I’ve tried to describe why I think it has a lot of advantages. I hope you give it some consideration when you next help students with their factorising.

(And one little addendum: I think it’s worth considering this for all of your students first, rather than just reserving it for students who struggle with doing products first. Don’t let it become an othering explanation.)

25 April, 2026
Home & Away: Geometric Arithmetic

Introduction

This blog post is about a way to define addition and multiplication on a number line using the geometry of the plane that surrounds the line.

Recently I talked about how numbers have multiple purposes and one of those purposes is locating. When you draw a number line, that’s pretty much what you’re doing – saying that the numbers are exactly locations on that line. There is an issue with this idea, which is this: how do you add or multiply locations?

The usual way of dealing with this is to think of numbers as not just locations but journeys too, so that when we see something like 10+3, the first number is a location and the second one is a journey and the answer 13 is the location we arrive at by beginning at 10 and travelling onwards 3. This is fine and I like it very much, actually, but there is still the huge problem of how to multiply locations or indeed how to multiply journeys. You can say that 3×10 is three lots of a journey of 10, which makes sense, but now the 3 is neither a journey nor a location but a literal amount and we have three different things so far that the numbers mean. Is there a way that preserves the location-ness of everything?

Yes, there is.

There is indeed a way to define addition and multiplication of locations on a number line that preserves their fundamental location-ness, by looking outwards to the other lines of the plane your line is part of. The system doesn’t directly use lengths or angles, but only uses the most fundamental geometrical actions of drawing lines through two points, finding where two lines meet, and drawing lines parallel to other lines. This is my favourite thing about it, that it’s fully based on the relationships between the points and the lines as objects. As a pure mathematician, and a finite geometer specifically, I see geometry as not really about measurements at all but all about relationships, so something that focuses on the relationships deeply appeals to me.

I created this method in October 2020, heavily based on the method invented by Marshall Hall Jr in the late 1950s. Hall’s method works by adding coordinates to all the points in the plane including the ones on your number line, making equations for the lines, and referring to the coordinates of specific points on specific lines to define the arithmetic. I studied his method in 2001 while doing the honours year in my undergraduate maths degree, but it wasn’t until 19 years later that I realised there was a way to do it without referring to coordinates, by focusing my attention on the number line itself rather than the coordinate axes. My method for multiplication also has striking similarities to René Descartes’ original definition for the multiplication of lengths published in the 1630s, even though I didn’t mean it to and only found this out later. An important difference between them is that his method has the two factors on different sides of a triangle instead of along the same line. I find it very interesting that Hall’s book is called “The Theory of Groups” and Descartes’ book is called “Geometry”, highlighting the deep connection between geometry and algebra which all three methods point to.

Anyway, enough historical notes. Let’s get to it.

How to do geometric arithmetic

You can watch me doing both processes live in a video here, or you can read a text description and see screenshots from the video below.

Setting up

To do parallel line arithmetic, you need to set up a few things.

First, choose a line to be your number line. The points on the number line are your numbers.

Next, you’ll need to choose two different points on the line to call 0 and 1. The point 0 will be important for defining addition and both 0 and 1 will be important for defining multiplication.

Finally, you’ll need to choose two lines other than the number line itself that are not parallel to the number line and not parallel to each other. It doesn’t really matter where they are because the specific lines themselves aren’t important, only their directions, since during the constructions for addition and multiplication, you won’t be making them intersect with any lines. Instead you will draw several lines parallel to each of these lines. I call one direction “home” and the other “away”. (The reason why I chose these words specifically rather than “in” and “out” for example is because I am Australian and “Home and Away” means something to me.) To make it easier to focus when I’m drawing diagrams for the arithmetic, I usually put my home and away lines a bit away from the part of my number line I drew.

Addition

To add two numbers geometrically, you follow this process.

First, you need to have your two points a and b, and the point 0 on your number line. You don’t need a and b to be different from each other or different from 0, but I’ve drawn them as different to make it easier to show how the process works.

Create a journey from 0 to a first following the away direction and then following the home direction. That is, draw a line through 0 parallel to the away line.

Then draw a line parallel to the home line that passes through a.

Find the point where these two lines meet. If you follow the journey from 0 to a first along the away direction and then along the home direction, this point is where the journey turns from going away to going home.

Now draw a line parallel to the number line, through this turning point. This is the turning line for all additions that go some number plus a.

Now you are ready to do b+a.

Draw a line through b parallel to the away line and find where it meets the turning line. This is where the journey from b will turn and return home to the number line.

Now draw a line through this turning point parallel to the home line, and find where it meets the number line.

This point is the point b+a.
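If you would like to check the addition construction numerically, here is a small Python sketch. It does use coordinates, but only to carry out the drawing steps: it assumes the number line is the x-axis with 0 at the origin, and that the away and home directions are given as `(x, y)` pairs with whole-number entries that are not parallel to the number line or to each other. The names `cross`, `meet` and `geo_add` are invented for this sketch.

```python
from fractions import Fraction

NUMBER_LINE = (Fraction(1), Fraction(0))  # direction of the number line (the x-axis here)


def cross(u, v):
    return u[0] * v[1] - u[1] * v[0]


def meet(p, d, q, e):
    """Point where the line through p with direction d meets the line through q with direction e."""
    # Dividing by zero here would mean the two lines are parallel.
    t = cross((q[0] - p[0], q[1] - p[1]), e) / cross(d, e)
    return (p[0] + t * d[0], p[1] + t * d[1])


def geo_add(b, a, away, home):
    """Construct b + a using only parallel lines and meeting points."""
    zero = (Fraction(0), Fraction(0))
    point_a = (Fraction(a), Fraction(0))
    point_b = (Fraction(b), Fraction(0))
    away = (Fraction(away[0]), Fraction(away[1]))
    home = (Fraction(home[0]), Fraction(home[1]))
    turn_a = meet(zero, away, point_a, home)          # where the journey from 0 to a turns
    turn_b = meet(point_b, away, turn_a, NUMBER_LINE)  # away from b until the turning line
    return meet(turn_b, home, zero, NUMBER_LINE)[0]    # home again to the number line


print(geo_add(10, 3, away=(1, 2), home=(1, -1)))
print(geo_add(10, 3, away=(-2, 5), home=(3, 1)))  # different directions, same answer
```

Exact fractions are used instead of decimals so that the comparison with ordinary addition is not blurred by rounding.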

Multiplication

To multiply two numbers geometrically, you follow this process, which is similar to the process for addition, but with two very important differences.

First, you need to have your two points a and b, and the two points 0 and 1 on your number line. Just like before, you don’t need a and b to be different from each other or different from 0 or 1, but I have chosen them that way to make it easier to show how the process works.

Create a journey from 1 to a first following the away direction and then following the home direction. (This is the first point of difference between multiplication and addition, that the journey begins at 1 and not 0.)

That is, draw a line through 1 parallel to the away line.

Then draw a line parallel to the home line that passes through a.

Find the point where these two lines meet. If you follow the journey from 1 to a first along the away direction and then along the home direction, this point is where the journey turns from going away to going home.

Now draw a line that passes through both this turning point and 0. This is the turning line for all multiplications that go some number times a. (This is the second point of difference between multiplication and addition, that the turning line passes through 0 rather than being parallel to the number line.)

Now you are ready to do b×a.

Draw a line through b parallel to the away line and find where it meets the turning line. This is where the journey from b will turn and return home to the number line.

Now draw a line through this turning point parallel to the home line, and find where it meets the number line.

This point is the point b×a.
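The multiplication construction can be checked numerically in the same spirit. This Python sketch is self-contained and rests on the same assumptions as before: the number line is the x-axis with 0 at `(0, 0)` and 1 at `(1, 0)`, and the away and home directions are `(x, y)` pairs with whole-number entries that are not parallel to the number line or to each other. The names `cross`, `meet` and `geo_mul` are invented for this sketch.

```python
from fractions import Fraction

NUMBER_LINE = (Fraction(1), Fraction(0))  # direction of the number line (the x-axis here)


def cross(u, v):
    return u[0] * v[1] - u[1] * v[0]


def meet(p, d, q, e):
    """Point where the line through p with direction d meets the line through q with direction e."""
    # Dividing by zero here would mean the two lines are parallel.
    t = cross((q[0] - p[0], q[1] - p[1]), e) / cross(d, e)
    return (p[0] + t * d[0], p[1] + t * d[1])


def geo_mul(b, a, away, home):
    """Construct b × a using only lines through points, parallels and meeting points."""
    zero = (Fraction(0), Fraction(0))
    one = (Fraction(1), Fraction(0))
    point_a = (Fraction(a), Fraction(0))
    point_b = (Fraction(b), Fraction(0))
    away = (Fraction(away[0]), Fraction(away[1]))
    home = (Fraction(home[0]), Fraction(home[1]))
    turn_a = meet(one, away, point_a, home)        # first difference: start at 1, not 0
    turning_dir = (turn_a[0] - zero[0], turn_a[1] - zero[1])  # second difference: line through 0
    turn_b = meet(point_b, away, zero, turning_dir)  # away from b until the turning line
    return meet(turn_b, home, zero, NUMBER_LINE)[0]  # home again to the number line


print(geo_mul(2, 3, away=(1, 2), home=(1, -1)))
print(geo_mul(-3, -2, away=(1, 2), home=(1, -1)))  # two negatives
```

Comparing this with the addition construction, the only changes are the starting point of the first journey and the direction of the turning line.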

Thoughts

So that’s David Butler’s methods of geometric addition and multiplication. I love it so much, especially the multiplication. Addition is a little more complicated than you’d expect if you are used to adding numbers by joining journeys head to tail, but multiplication somehow feels so much simpler than it has a right to be.

My favourite thing to do is to convince myself that various algebraic properties of numbers have to be true using these two definitions of addition and multiplication. For example, this diagram shows that a+b = b+a (at least for positive numbers).

And this diagram shows that b+b = b×2 (at least for numbers more than 1).

And this diagram shows that when you multiply two negative numbers, you get a positive number. (I know those numbers are negative because they’re on the opposite side of 0 from 1.)

Yeah none of them are formal proofs, but I still love them.

The truly remarkable thing is that it doesn’t matter how you choose your home and away directions, it will still work! For example, here are three diagrams showing 1+1=2 and 2×2=4.

The geometry of the real plane is so neatly structured that it works every time. If our number line was in a plane with a different structure, this might not work the same every time. Indeed, you may end up with entirely different rules for how addition or multiplication work, such as multiplication not being associative. (That is, (a×b)×c not being the same as a×(b×c), which is very annoying!)

I may do future blog posts about some of that other stuff, but for now, I’m enjoying just revelling in the coolness that is the existence of a method for multiplying locations, and the even higher coolness of watching geometry cause the algebraic properties of numbers. I hope you enjoyed it too.

10 April, 2026
Digit Disguises
This blog post is about a game I invented this week, and the game is AWESOME, if I do say so myself.
- You can read the rest of this blog post in PDF form here, including game instructions and reflections on playing and creating the game. The document also contains a later blog post with instructions for how to introduce the game to a class using a smaller version of the game.
- A printable version for the game can be downloaded here, which includes rules and templates to fill in and can be turned into battleships-style stands.
21 September, 2019
The Number Dress-Up Party

I created the Number Dress-Up Party puzzle way back in 2017 and every so often I stumble across it again when searching Twitter for other stuff. When I stumbled across it today, I decided it was time to write it up in a blog post.

The puzzle goes like this:

The Number Dress-Up Party

All the numbers have come to a dress-up party in full costume. They all know themselves which costume everyone else is wearing, but you don’t know.

If you pick any two of them and ask them to combine with +, -, × or ÷, they will point out which costume is the correct answer, and they’ll happily do it as often as you want. For example, you might ask for robot + bear and they will point to unicorn. (If the answer isn’t at the party, they’ll tell you that too.)

How do you correctly identify the numbers 0, 1, 2 and 3? How do you do it in as few steps as possible?

(Photo from http://www.worldrecordacademy.com/mass/most_mascots_to_do_the_same_danc… )

It is worth clarifying now that when I say “all” numbers I originally meant it to be all the real numbers. (But it is very interesting to think about how the problem is different or even possible if it means all the rational numbers or all the integers or all the natural numbers or some other set of numbers entirely.) It’s also worth pointing out that it is actually possible for you to guarantee that you actually find these numbers – you shouldn’t have the possibility of having to go through all the infinitely many costumes to be sure of finding 0, for example.

It’s also worth clarifying that the rules say you have to ask two different costumes to combine with an operation. If you can see how using two of the same costume might help you identify actual numbers, then you are thinking along some helpful lines. However, the puzzle is much harder and much more interesting if you have to use two different costumes every time.

I love this puzzle so much! The reason I love it is that it forces you to think about numbers and algebra in a completely different way to any other puzzle or problem I have seen. You really need to think about how numbers are related to each other and how they behave under operations in order to figure out a way to correctly identify these numbers from the way the costumes interact.

Also, in order to tell someone about your solution, or even figure it out in the first place, you need to find a way of talking about or writing about this, which is a lot more difficult than it might seem at first glance. I find it really interesting to see how people attack the problem of describing what they are doing in this problem.

The feel of this puzzle to me is the feel of an abstract pure maths course like abstract algebra or number theory or real analysis, where you are digging deep into how numbers work without reference to specific numbers per se. I would love to go into such a course and use this in the first lecture/tute to get students in the right sort of mindset to attack the rest of the course. I’d also love to do an extension to this as an investigation into how the Euclidean Algorithm for natural number division works. To all those people who haven’t had that sort of training I say you are doing what a pure mathematician does when you think about this problem! You never thought it would be so much fun, did you?

Anyway, there is the Number Dress-Up Party puzzle for posterity. There are several different solutions and they are all lovely. Have fun!

PS: If you feel like seeing how people have thought about this problem, and are ok with spoilers, then check out the replies to this tweet.

These comments were left on the original blog post:

mau 4 February 2019:

I did not see much of a difference between real and rational numbers: at least, the solution I devised is the same (and no real number can be generated in a finite number of steps, unless we stumbled into a costume specially related to that number: but maybe I am wrong). Integer and natural numbers need a different approach, indeed.

David Butler 11 February 2019:

Yes I didn’t see much of a difference between the real and rational number parties. Or indeed the complex number party either. I think once they’re closed under all four of the operations we’re allowed to use, you don’t get much extra.
And yes I agree the natural numbers/integers are very interesting!

20 January, 2019
Finding an inverse function

There is a procedure that people use and teach students to use for finding the inverse of a function. My problem with it is that it doesn’t make any sense, in two ways.

You can read the rest of this blog post in PDF form here.

19 January, 2017
Things not sides

When doing algebra and solving equations, there is this move we often make which is usually called “doing the same thing to both sides”. Quite recently this phrase of “both sides” has begun to bother me.

You can read the rest of this blog post in PDF form here.

27 May, 2016
The trig functions are about multiplication

When I was taught trigonometry for the first time, I learned it as ratios of sides of right-angled triangles.

You can read the rest of this blog post in PDF form here.

25 February, 2016
There is only one kind of function that distributes over plus

There is a very common thing that students do that causes pain, distress, confusion and depression in any maths educator who witnesses it. Both the error itself and the educator’s response to it are very clearly described by this excellent picture from the blog “Math with Bad Drawings”:

You can read the rest of this blog post in PDF form here.

7 August, 2015
Splitting logs

In our bridging course (and indeed in Maths 1M and Maths 1A and several other courses) there is a section on differentiating logarithmic functions. One of the classic questions that we ask in such a section is to differentiate the log of some horrifying function, with the intention that the students use the log laws to simplify the original function first and then differentiate. There is something about this particular type of question that has long bothered me and I only just figured out how to resolve my issue with it. I’m so excited I need to share it somewhere!

You can read the rest of this blog post in PDF form here.

12 November, 2014
Two wrongs make a right

Students make a lot of mistakes when doing their maths, but sometimes they will make two mistakes in such a way that their final answer is still correct. This happened last week with one student quite spectacularly, because his doubly wrong method of doing a particular problem always produces the correct answer.

Let me explain: the Maths 1A students are currently learning about vectors in Rⁿ and one of their assignment questions gives them several lists of vectors and asks them to decide if they are linearly independent or linearly dependent.

What they are supposed to do (based on what they are shown in the lectures) is put the vectors into a matrix as columns, then do row operations until they can tell where the pivots will be. If every column has a pivot, then the vectors are linearly independent; if there is a column with no pivot, then the vectors are linearly dependent. Check out this example to see how it works (it has a little extra for later on).
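Here is a minimal sketch of that vectors-as-columns check in Python. It is only an illustration and is not taken from the lectures: the function names `pivot_columns` and `independent_as_columns` and the sample vectors are made up for this sketch, and it uses exact fractions so that rounding never hides a zero.

```python
from fractions import Fraction

def pivot_columns(matrix):
    """Row-reduce a copy of `matrix` and return the indices of its pivot columns."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    pivots = []
    r = 0  # index of the row where the next pivot will go
    for c in range(n_cols):
        # Find a row at or below r with a nonzero entry in column c.
        pivot_row = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot_row is None:
            continue  # no pivot in this column
        rows[r], rows[pivot_row] = rows[pivot_row], rows[r]
        # Clear the entries below the pivot.
        for i in range(r + 1, n_rows):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == n_rows:
            break
    return pivots

def independent_as_columns(vectors):
    """Put the vectors in as columns and check that every column has a pivot."""
    matrix = [list(coords) for coords in zip(*vectors)]  # transpose: vectors become columns
    return len(pivot_columns(matrix)) == len(vectors)

# Hypothetical examples: in the first list the third vector is the sum of the other two.
print(independent_as_columns([(1, 0, 2), (0, 1, 1), (1, 1, 3)]))
print(independent_as_columns([(1, 0, 2), (0, 1, 1), (1, 1, 4)]))
```

The first call reports dependence because the third column ends up with no pivot, and the second reports independence because all three columns get one.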

The reason for this method goes right back to the definition of linear independence: the vectors v₁, …, v_r are linearly independent when the equation x₁v₁ + … + x_rv_r = 0 has only the trivial solution of x₁ = 0, …, x_r = 0. If there are any solutions where any of the x_i’s are not zero, then they are linearly dependent.

If you write out the vectors in full using their coordinates, then you find that using the first coordinates, you can make a linear equation involving the x_i’s, and similarly each successive coordinate produces a separate equation. So solving the vector equation is the same as solving n linear equations. And the students know how to do this: you put the coefficients of the equations into a matrix with one row per equation and do row operations. The final matrix represents a reduced set of equations, and if you can get to a stage where each row has a 1 in one spot and 0’s in the others then you’ve got x₁ = 0, …, x_r = 0. This would mean your original vectors had to be independent. If any one variable doesn’t have a pivot, then you can let it be any value at all and still find a solution. This is called a free variable. Since it can be anything, it could be something other than zero, and so your vectors will be linearly dependent.

So this is what the student was supposed to do. However, what he did do was put the vectors into a matrix as rows, do row operations, and then look to see if there was a row of zeros. If he got a row of zeros, then he said the vectors were linearly dependent, and if he didn’t, then he said the vectors were linearly independent.

His first mistake was to put the vectors in as rows. I am often repeating the mantra “vectors are columns, equations are rows” to students to remind them that for most situations, vectors really ought to be columns. We usually multiply matrices on the left of a vector (Ax) which would only make sense if the vector was a column, and also this arrangement corresponds exactly to doing a specified linear combination of the columns of A. Finally, in this specific situation, the matching coordinates of each vector together make an equation, and equations are definitely rows, so this forces the vectors to line up their coordinates in rows. That is, they have to be columns (if we are to be solving equations anyway).

His second and much more fundamental mistake was to think that a row of zeros meant there had to be a free variable. (I know he was thinking this because he actually said it to me.) This is wrong because if a matrix does represent a set of equations, then whether you get rows of zeros at the end is actually not strongly related to whether there are free variables. Firstly, a row of zeros with a nonzero number in the answers position indicates no solution at all, independently of what the rest of the matrix is doing. Secondly, if there’s a pivot in every column, then there will be a unique solution regardless of how many rows of zeros there are at the bottom. Whether you get infinitely many solutions is all about the pivots in the columns, not about the zeros in the rows!

It’s not surprising that the student had this misconception because many high school students only ever see square matrices, and a row of zeros in a square matrix prevents there being a pivot in one of the columns and so does in fact indicate a free variable. But it doesn’t apply for non-square matrices, especially ones with more rows than columns.
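A small worked example may help here. It is my own illustration rather than one from the assignment: take the vectors (1,0,1) and (0,1,1), put them in as columns, and solve for the combinations that give zero.

```latex
\left[\begin{array}{cc|c} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 1 & 0 \end{array}\right]
\xrightarrow{R_3 - R_1 - R_2}
\left[\begin{array}{cc|c} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{array}\right]
```

There is a row of zeros at the bottom, but both columns have a pivot, so the only solution is x₁ = 0, x₂ = 0 and the two vectors are linearly independent. The row of zeros says nothing about free variables in this case.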

I explained all of this to him and he was happy with what I said. But then he frowned and asked, “Then why did I still get the answers right?” Why indeed?

If my student had put his vectors in as columns and looked for zero rows, he would have gotten his answers wrong. This is because with the vectors as columns, the rows are equations and the aim is to solve the equations and look for free variables. Since zero rows do not always force free variables, he would be looking for the wrong thing and would have been wrong a lot of the time.

If my student had put his vectors in as rows and looked for columns without pivots, he would also have gotten his answers wrong. This is because pivots are supposed to refer to variables in equations, whereas this matrix wouldn’t actually represent equations at all, let alone ones related to the definition of linear independence.

However, he did neither of these combinations and instead put the vectors in as rows and looked for rows of zeros, and he got his answer right every time! The key to resolving this paradox is to figure out what the row operations represent when the rows of your matrix are vectors rather than equations.

Row operations basically perform linear combinations of rows. So when you do a sequence of row operations, you are actually doing a linear combination of the original rows. Therefore any row produced by this process is a linear combination of the original rows. So if you are able to produce the zero vector then it is actually possible to produce the zero vector by linear combinations! And it’s not the trivial one either. Hence the original vectors must have been linearly dependent. Amazing huh?

An important question arises: what if you wanted to know what the actual linear combination was? With the vectors-as-columns approach, you are literally solving to find the x_i’s, and so at the end if you pick a nonzero value for the free variables, you can find the rest of them and there you have it!

With the vectors-as-rows approach, the linear combinations you do are recorded in the row operations. We need a way to keep track of the linear combinations / row operations we have done. One way to do this is to reason that if the original vectors were the standard basis, then whatever final vector we got would tell us what linear combination we had done (for example, (1,3,-2) = 1(1,0,0) + 3(0,1,0) -2 (0,0,1)). So why don’t we do the same operations on the standard basis as we do on the original matrix?

What this means is that if you place an identity matrix next to your original matrix and do the same row operations on both, then whatever vector is next to the row of zeros when it happens will be the linear combination of the original rows that produced the zero vector! Check out the same example as before but by this new method.
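The identity-matrix trick can be sketched in Python as follows. This is only an illustration under my own assumptions: the function name `find_dependence` and the sample vectors are invented for the sketch, and it assumes a non-empty list of vectors that all have the same length.

```python
from fractions import Fraction

def find_dependence(vectors):
    """Return coefficients of a nontrivial combination giving zero, or None.

    Assumes `vectors` is a non-empty list of equal-length tuples.
    """
    n = len(vectors)
    width = len(vectors[0])
    # Each row is [vector | matching row of the identity matrix].
    rows = [
        [Fraction(x) for x in v] + [Fraction(int(i == j)) for j in range(n)]
        for i, v in enumerate(vectors)
    ]
    r = 0  # index of the row where the next pivot will go
    for c in range(width):  # only look for pivots in the vector part
        pivot_row = next((i for i in range(r, n) if rows[i][c] != 0), None)
        if pivot_row is None:
            continue
        rows[r], rows[pivot_row] = rows[pivot_row], rows[r]
        # The same row operation is applied to both halves of each row.
        for i in range(r + 1, n):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == n:
            break
    for row in rows:
        if all(x == 0 for x in row[:width]):
            return row[width:]  # the identity half records the combination
    return None

# Hypothetical example: the third vector is the sum of the first two.
vectors = [(1, 0, 2), (0, 1, 1), (1, 1, 3)]
print(find_dependence(vectors))
```

For these vectors the recorded coefficients are -1, -1 and 1, which says that -v₁ - v₂ + v₃ = 0. If no row of zeros ever appears in the vector half, the function returns None and the vectors are independent.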

Isn’t it amazing the things you can learn by being wrong first!

This comment was left on the original blog post:

Stephen Wade 20 May 2014:
It’s funny how convention comes into play and that you could prove that either approach works. I had to check quickly, and I think this is right, that if you put vectors as rows in an m x n matrix A, then dim ker A^t > 0 means you have linear dependence. Using rank theorem, dim ker A^t = m – dim col A = number of rows – number of rows with pivots = number of rows of zeros in the rref of A. So if the number of rows of zeros in the rref of A > 0, you should have linear dependence. Cool 🙂

20 May, 2014