Masses of Maths: what should pupils learn by rote?

Should maths be learned by rote?

Some of the most egregious pedagogy is born when the answer to that question is ‘100% yes’ or ‘100% no’.

“100% yes” conjures up – perhaps rightly – an image of maths as a joyless subject in which pupils learn algorithms without meaning. Although it can feel like an easy way to teach, pupils are unlikely to succeed with equations such as (4a+4)/3a = 17 if the approach to linear equations has simply been ‘change side, change sign’ and practice of only the simplest problem types (e.g. 4a + 3 = 23). Automaticity with times tables and simple written calculation, and the ability to regurgitate the order of operations, are of limited help if pupils aren’t taught how to think flexibly (i.e. if they can’t see the deep structure of a question).
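As an illustration of why the harder equation needs more than ‘change side, change sign’, here is a minimal sketch in Python that checks one route to its solution. The function name left_hand_side is purely illustrative, and the three steps in the comments are one reasonable method rather than a prescribed one.

```python
from fractions import Fraction

def left_hand_side(a):
    """Evaluate (4a + 4) / 3a exactly for a given value of a."""
    return (4 * a + 4) / (3 * a)

# Multiply both sides by 3a:    4a + 4 = 51a
# Subtract 4a from both sides:  4 = 47a
# Divide both sides by 47:      a = 4/47
a = Fraction(4, 47)

print(left_hand_side(a))  # prints 17
```

Using exact fractions avoids any rounding, so the check confirms that a = 4/47 satisfies the equation exactly.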


“100% no” is also problematic. Typecast as the progressive approach to maths, it is founded on exploring maths as a way to develop deep understanding (and an assumption that fluency and confidence arise from there). It is championed by academics such as Jo Boaler and many teachers (and maths consultants…), and is at the heart of much debate. This approach argues that relational facts needn’t – and shouldn’t – be taught as such and certainly don’t need to be explicitly memorised.

[Image: Jo Boaler quotation]

Relational facts are those that can be derived from a smaller field of arbitrary conventions (such as ‘angles in a straight line sum to 180°’, which is derived from the convention that angles around a point sum to 360°) or easily understood and recalled relationships (e.g. I can calculate 3 x 8 by doubling a relationship I do recall – 3 x 4 = 12 – to get 3 x 8 = 24).

There is clear merit in an approach that builds relational understanding [1]. It is an important part of building the storage strength of concepts [2] (how well a concept or fact connects to other memories and concepts) but, used alone, it ignores what is happening in pupils’ brains as they work.

Simplistically put: as pupils work on a new problem or idea, their working memory is gradually being ‘used up’ until there is little capacity for additional processing. Take this problem:

0.8 + 0.4 x 5^2 ÷ 0.01

A pupil has to think about all of the following:

  • The order of operations (that they should complete the multiplication and division first AND, within that, that they should work from left to right)
  • What the notation []^2 means
  • The value of 5^2
  • A strategy to multiply an integer by 0.4
  • A strategy to divide by 0.01
  • How to add 0.8 to the answer

That is a lot to think about! If trying to think about each idea from scratch, their working memory will soon overload, making the calculation seem more complex than it is.

In comparison, the problem is much simpler for a pupil who confidently knows the following facts by heart:

  • 5^2 = 25
  • ÷0.01 = x100
  • To multiply an integer by a decimal, I can ignore the place value at first and adjust afterwards
  • 4 x 25 = 100
  • 0.4 x 25 = 4 x 2.5
  • 4 x 2.5 = 10

…they will see this instead:

0.8 + 0.4 x 5^2 ÷ 0.01 = 0.8 + (0.4 x 25 x 100) = 0.8 + (4 x 2.5 x 100)

A much less daunting calculation, and one where much less tricky processing or self-doubting thought has taken place.
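For anyone who wants to check the arithmetic, the following is a minimal sketch in Python that mirrors the recalled facts one at a time. The variable names are illustrative only, and exact fractions are used so that the decimals behave as they do on paper.

```python
from fractions import Fraction

# Facts recalled by heart (variable names are illustrative only)
five_squared = 5 ** 2                        # 5^2 = 25
divide_by_hundredth = 1 / Fraction("0.01")   # dividing by 0.01 is multiplying by 100

# Multiplication and division come first, working left to right
product = Fraction("0.4") * five_squared * divide_by_hundredth  # 0.4 x 25 x 100

# Adjusting place value: 0.4 x 25 is the same as 4 x 2.5, which is 10
assert Fraction("0.4") * 25 == 4 * Fraction("2.5") == 10

# Finally, add the 0.8
total = Fraction("0.8") + product
print(float(total))  # prints 1000.8
```

Each line corresponds to a single fact from the list, which is the point: no one step demands much thought once the facts are secure.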


What does a knowledge grid have to do with it?

In the Michaela maths department, we aim to identify all the facts and relationships that can be codified as a single nugget of knowledge (or set of clear steps) that will reduce pressure on pupils’ working memories. This frees them up to tackle more complex and interesting problems and allows them to feel confident in their reasoning and solutions.

This does NOT mean teaching without understanding. It is the opposite: we aim for pupils to understand why something works, or is the way it is, and then to be so confident of that fact or relationship that they can recall and use it with minimal effort and worry.

The purpose of a knowledge grid – explained in detail by Joe Kirby – is to set out what these facts and relationships are, and to support pupils in learning them by heart.

Take indices, which the Y7 pupils have just learned about:

[Image: indices knowledge grid]

This sets out what we expect pupils to know by heart if they are going to be able to tackle more complex or interesting problems involving indices (e.g. What is the final digit of 10^100 + 9^99 + 5^98?). Knowing by heart that a^b x a^c = a^(b+c) doesn’t replace knowing why this relationship is true. But knowing it by heart – and practising explaining why it is true – frees pupils up to tackle problems like ‘evaluate 2^3 x 5^2 x 2^2 x 5^3’.
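Both example problems can be checked with a short sketch in Python. It assumes the superscripts read 10^100 + 9^99 + 5^98 and 2^3 x 5^2 x 2^2 x 5^3; the variable names are illustrative only.

```python
# Assumed reading of the two problems, with the superscripts restored:
#   final digit of 10^100 + 9^99 + 5^98
#   evaluate 2^3 x 5^2 x 2^2 x 5^3

# pow(base, exponent, 10) gives the final digit of a power
# without computing the full number
final_digit = (pow(10, 100, 10) + pow(9, 99, 10) + pow(5, 98, 10)) % 10
print(final_digit)  # prints 4  (0 + 9 + 5 = 14, so the final digit is 4)

# a^b x a^c = a^(b+c): collect the powers of 2 and the powers of 5,
# then pair them up into powers of 10
product = 2**3 * 5**2 * 2**2 * 5**3
assert product == 2**(3 + 2) * 5**(2 + 3) == 10**5
print(product)  # prints 100000
```

A pupil who knows the index law by heart can spot the 2^5 x 5^5 = 10^5 structure without multiplying anything out.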

Here is the grid for Y8 pupils at the outset of learning to solve linear equations:

[Image: Y8 knowledge grid for solving linear equations]

Here is an example for Y8s learning to substitute and use formulae:

[Image: Y8 knowledge grid for substitution and formulae]

Sometimes it is solely a collection of relationships, such as the grid Y7 are about to work from:

[Image: Y7 knowledge grid of relationships]

(Shading in grey typically indicates ‘optional’ knowledge, in that it is possible to be successful in maths without knowing those facts by heart… at least at their stage!)

A useful rule of thumb is: if we, as maths teachers, know these facts by heart because they help us work more efficiently and confidently, then the pupils should know them by heart too.

How is it used?

In lessons, the knowledge grid lays out the agreed definition and procedures that we want to share with pupils. The constraint of the definition means we teach to a higher technical standard, ensuring that we stick to language like ‘eliminate this operation’ (instead of saying ‘get rid of the 4’ in a bid to make the maths feel more accessible). Knowing that the pupils must understand and use a phrase like ‘isolate the unknown’ forces us to explain it with greater clarity, check they understand it precisely, and then use it constantly.

In most lessons, pupils are quizzed on the terms and facts in the knowledge grids. This can be cold calling (asking questions and picking students), checking everyone’s answer on mini-whiteboards, or giving a 1-minute quiz in books (e.g. “write the formula for the area of each of these shapes” or “rewrite each of these as a multiplication: ÷0.5, ÷0.1, ÷0.25, ÷0.125, ÷0.01, ÷0.2”).
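As a quick check on the second quiz example, here is a minimal sketch in Python of the relationship being tested: dividing by a number is the same as multiplying by its reciprocal. The names divisors and multipliers are illustrative only.

```python
from fractions import Fraction

# Divisors from the one-minute quiz example
divisors = ["0.5", "0.1", "0.25", "0.125", "0.01", "0.2"]

# Dividing by d is the same as multiplying by its reciprocal, 1/d
multipliers = {d: 1 / Fraction(d) for d in divisors}

for d, m in multipliers.items():
    print(f"÷{d} = x{m}")
```

The printed pairs are the facts we would want recalled instantly (÷0.25 = x4, ÷0.125 = x8, and so on).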

Once a week, pupils ‘self-quiz’ at home on the definitions and facts the teacher has set for that week. Typically, this is 10-15 facts/definitions. Pupils first practise saying the facts to themselves, then cover the right-hand side and write the definitions based on the prompts on the left-hand side, and then correct their errors in green. They continue this until a page is filled. It is possible to game it by mindlessly copying, but it becomes obvious if they’re doing so because…

Once a week, pupils take a formal, but low-stakes, written quiz, of which half will be a knowledge grid test (the other half tests their ability to apply procedures and try unfamiliar problems).

The levels of scaffolding vary; these are the knowledge grid sections Y8 took recently:

[Images: two knowledge grid quiz sections taken by Y8]

Pitfalls We Fell Into

An easy temptation is to produce a ‘revision mat’ full of facts, examples, diagrams and mnemonics. Although this is close to a knowledge grid, it isn’t as useful. It must be REALLY EASY to test yourself from a knowledge grid without ‘accidentally’ seeing the answer, or having prompts. It must be really clear what they should know by heart (the definitions and terms and facts) and what is just useful for jogging their memories (examples, where appropriate).

Another easy error is to go overboard with how much you try to codify and write down. If you, as teachers, struggle to articulate the definition or steps for something, it probably isn’t useful or suitable. Make steps for a strategy (e.g. solving equations) as generalised as possible so that pupils aren’t learning multiple minimally different steps and becoming muddled and frustrated. The more generalised the steps, the more they can be used to illuminate the common features of varied problems (and thus help pupils see the underlying structure).

Pitfalls We’re Still Trying to Avoid

We are still struggling to decide which aspects of algebraic simplification can be listed as facts: here is the start of a debate I was having in my head this morning for updating the facts in the ‘expressions and simplification’ grid:

[Image: draft facts for the ‘expressions and simplification’ grid]

Any that are included are there because pupils had become faster by recalling them as facts (as opposed to working them out) or their work was slowed because they weren’t confident when simplifying a fundamentally identical expression.


I hope it goes without saying that we would love to know what you think and if you have tried anything similar. Do you have facts and rules, besides those set out in examination specifications, that make a big difference to your pupils when learned by heart?


Whether this fascinates or enrages you, get in touch and come see the pupils (and grids…!) in action. You’ll have a great time 🙂


1: See Skemp, R.R. (1976) Relational Understanding and Instrumental Understanding, Mathematics Teaching, 77: 20-26

2: See for Bjork discussing research in this field.




14 responses to “Masses of Maths: what should pupils learn by rote?”

  1. Lara

    I teach 7th grade in the U.S. and would be interested in seeing more. Do you have a downloadable copy of the knowledge grids you’ve created?

    • Hi Lara,
      We’re still very much in the midst of making these and designing our curriculum. We’ve been reluctant to share too much as – and this will be no surprise – we keep having new epiphanies and making changes. As a result, there’s a worry that people will download something thinking ‘this is a “good” resource’ and miss out on changes made later.
      So, in the long run, yes, but in the short term…not yet!

  2. I like the knowledge grids created for Year 7s and 8s. These would also be great as posters in classrooms for students to be constantly reminded. Great work.

    • Hi Angela,
      Thanks for your comment. We have debated what to use in display and have ended up displaying….almost nothing! We want pupils to commit things to memory and were concerned about them relying on glancing at the walls every time, rather than practising drawing it from their brains. Our walls have ended up quite bare!

  3. Yes, conceptual understanding vs memorization is another false dichotomy.

    Other than the working memory argument, having some things memorized makes finding more complex patterns more possible. Here’s an example of playing with Nicomachus’s theorem: Two points of recognition. It would have been difficult to make progress if we didn’t recognize square and triangular numbers.

    Two questions about implementation:
    (1) what are the best ways to memorize in math? I guess there will actually be a lot of overlap with exercises that support conceptual understanding.
    (2) what habits keep you, as a teacher, and the overall department from gravitating too strongly to one pole? Human nature seems to have a strong bias for simplification and “the one true answer.”

    • That post on squares and cubes is definitely a case in point. We’ve been designing a unit on sequences and been keenly aware of how important it is to have a really strong recognition of squares, cubes and powers of 2/3/5/10 to start de-constructing sequences like 11, 81, 271, 641 etc (which is still relatively simple!).

      Best ways to memorise… so far, I think it’s building conceptual understanding and working on multiple problem types using a procedure or concept, to build storage strength, and then spaced revisiting (in regular reviews and in new content by interleaving). The key, I think, is regularly practising remembering and using it, and creating more and more new scenarios to see an idea in action.

      Natural human bias means I think we’re very vigilant re: Q2 but, of course, I could be so wrong! A lot of reading, attending conferences, debate within the department… Being focused on the result – getting as many pupils as possible to be as confident with maths as possible – helps to avoid being wedded to the means.

      But questions like yours help 🙂

  4. Your blog post made a lot of sense and I loved the grids (very satisfying and peaceful to look at).

    To map out all of the pre-requisite knowledge (and various permutations) must take an incredible amount of thought and care. I wonder if this has already been done though in Singapore with their state textbooks?

    Btw loving the background picture to this blog too.

    • I can’t tell! We were a little disappointed when we attempted to use Singapore textbooks in our own department. They still didn’t go down to the fine level and ‘taxonomy’ we are looking for. However, my colleague Hin-Tai believes he’s seen such a document and is on the hunt for it!

  5. This is a fascinating post. I have a number of thoughts about it:

    1. I have read the Skemp paper and have it here on my PC. I think the difference between instrumental and relational understanding can be exemplified with reference to indices. I am sure that you explicitly teach why a^b x a^c = a^(b+c), and that for me is sufficient for relational understanding. An instrumental approach would require memorising the rule without explanation, and I don’t think that is what you are advocating. Therefore, I would only have students memorise the rule after it has been explicitly explained to them.

    2. Clearly there are some pieces of knowledge that don’t need to be explained first: the times-tables, the fraction and decimal equivalences, what the words base and index actually mean, BIDMAS or order of operations, what a^4 means, mathematical vocabulary in general, … and so on. Therefore, this would constitute knowledge that students could immediately and continually be memorising. My understanding of the power of a knowledge base and how it relates to learning is from Willingham and your colleague Joe Kirby.

    3. Attempting to codify how to solve a linear equation seems to me to be a tremendously difficult task, and I can understand the difficulty you must have in deciding where to draw the line. Here’s my attempt to codify the solving of a linear equation: Prerequisite knowledge: know what is meant by Coefficient, Both Sides, Term, BIDMAS, Lowest Common Multiple (LCM), Isolate, Multiply/ Divide Through by …, What a Linear Equation is. Then, Steps to Solve a Linear Equation: 1. Multiply out any brackets; 2. Simplify both sides of the equation (remembering BIDMAS); 3. If there are any fractions, multiply both sides by the LCM of the denominators; 4. If there is an unknown on both sides of the equation then choose the one with the smallest coefficient, if the coefficient is negative then add this term to both sides, and if it is positive then subtract this term from both sides; 5. Isolate the unknown by subtracting or adding a constant to both sides; 6. Divide through by the coefficient of the unknown.

    This would only make sense once it had been explicitly explained in class.

    4. I very much like this: ‘The constraint of the definition means we teach to a higher technical standard, ensuring that we stick to language like ‘eliminate this operation’ (instead of saying ‘get rid of the 4’ in a bid to make the maths feel more accessible). ‘

    I think that by explicitly teaching factual knowledge (and getting students to remember it) like this then yes, you enable students to think more and more like mathematical experts (Willingham again).

    Thank you, it’s been a very interesting and thought-provoking read. Rufus.

    • We try not to worry too much about whether the procedural explanation/practice precedes or comes after the rationale/explanation. For example, we’re comparing negative fractions with Y8 at the moment. We realised that discussing the ‘why’ before the ‘how’ was very confusing for all but those who already got it implicitly. So, we’ve completely separated the procedure (write down and compare the positive version, use the opposite for the negative version), focused on getting that fluent, and then look at technically accurate and flexible explanations of why it works.

      I would disagree, actually, with the examples in 2! I think it’s important that FDP aren’t simply presented as facts to memorise. I like for pupils to see the relationships and be able to derive them if they must, and from there build the memorisation. Otherwise it’s just random relationships and they can’t build strong mental connections (I’m sure I’m using the language of cog psych badly!).

      We have tried to codify solving linear equations – it’s in one of the knowledge grids / quizzes above. It’s worked relatively well with weaker classes, so we’re hoping it won’t need too many more changes. Fingers crossed…

      I’m not sure we’re even aiming for them to think like experts yet – we’re just trying to avoid the fuzzy language that sows terrible seeds for KS5! But time will tell on that one…

  6. Very interesting, thanks, I’ll give this some thought.

  7. Pingback: What should comes first, memorising or knowing why? – No Easy Answers

  8. Thanks for posting about your work at Michaela, Dani. Two quotes from this post stuck out to me. First:

    A useful rule of thumb is: if we, as maths teachers, know these facts by heart because they help us work more efficiently and confidently, then the pupils should know it by heart too.


    ‘The constraint of the definition means we teach to a higher technical standard, ensuring that we stick to language like ‘eliminate this operation’ (instead of saying ‘get rid of the 4’ in a bid to make the maths feel more accessible).’

    Both seem to say, “Here is a standard, either set by the example of experts or the government. Let’s take students directly there.” I suspect this is like saying, “Strong people lift very heavy weights. We want weak people to become strong people. Why would we have weak people lift light weights? Let’s give them the very heavy weights.”

    I think we agree on the standard students should reach. But we’re looking for the most productive ways to help them reach it and drawing different conclusions. I think we agree that reaching that standard will involve the guidance of an expert in many cases. But I think we disagree on when that guidance should occur. Continuing the metaphor, if “the parabola has a line of symmetry at x = 4” is the very heavy weight, could “the curve has a mirror at 4” be the light weight?

    I’m sure the analogy is incomplete. Perhaps you can help me complete it.

  9. Pingback: The Blogosphere in 2016: Roaring Tigers, Hidden Dragons | Pragmatic Education
