A Rule is any set of instructions which, given the Levels of one or more parent Variables, specifies a result in terms of the Levels of a child Variable. It is an algorithm which humans, and possibly machines, can follow in order to calculate some outcome on a child Variable in terms of the Level(s) of one or more parent Variables. We use Rules inside our Theories about Mechanisms to make more or less precise predictions about how things will turn out. Within Theories, Rules encode our more or less limited (“causal”) knowledge of how the world works.

Student satisfaction with school 

 Student feels they live up to expectations

 Student feels fulfilled

 Student feels supported and liked

The two red symbols on the predicted Variables show that we have some kind of Rule or principle, however vague, which tells us how the expected Level of the child Variable depends, at least to some extent, on the Level of the parent Variable(s). In particular, the rising red arrows (which are the default for simple Theories) show that we believe that in each case the influence of the parent Variable(s) is in some sense positive, i.e. the child Variable increases as the parent Variable(s) increase.

The upward-sloping black triangle shows that each Variable in this Theory is ordered - see xx.

We will look at the red half-circle symbol later.

We can think of the black symbol and the red arrow symbol as representing two different aspects of the same Variable: the “data Variable” represented by the black symbol and the “Rule Variable” represented by the red symbol. In a specific instance, the Rule will most often predict a Level which is more or less different from the Level recorded for the actual “data Variable”. So the two are in a kind of tension.

Specifying a Rule

In this example, two no/yes parent Variables completely control one no/yes child Variable. (In practice, other Variables will be involved … and we would have to say something about the timing of these Variables too, see xx)

Paper catches fire ((no,yes)) !Rule: see table below

 Presence of enough oxygen ((no,yes))

 Match is struck ((no,yes))

So in this case it is easy to specify the Rule completely.

Fire?                 match struck = no    match struck = yes
enough oxygen = no    no                   no
enough oxygen = yes   no                   yes
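The same table can be written as a small lookup, which is one way to see that the Rule is completely specified. This is only an illustrative Python sketch; the names `FIRE_RULE` and `paper_catches_fire` are made up for this example and are not part of Theorymaker.

```python
# Hypothetical encoding of the fire Rule as an explicit table.
# Keys are (enough_oxygen, match_struck); values are the Level of the child Variable.
FIRE_RULE = {
    ("no", "no"): "no",
    ("no", "yes"): "no",
    ("yes", "no"): "no",
    ("yes", "yes"): "yes",
}


def paper_catches_fire(enough_oxygen: str, match_struck: str) -> str:
    """Look up the Level of the child Variable for the given parent Levels."""
    return FIRE_RULE[(enough_oxygen, match_struck)]
```

Because every combination of parent Levels has an entry, nothing is left open: that is what "specifying the Rule completely" amounts to here.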

In a second example, given fixed mass, the acceleration of the car depends completely and linearly on the force applied. Double the force, double the acceleration.

Acceleration of car !Rule: directly proportional

 Force applied to car
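A minimal sketch of this "directly proportional" Rule, assuming Newton's second law (acceleration = force / mass). The function name and the mass of 1000 kg are invented for illustration.

```python
def acceleration(force_newtons: float, mass_kg: float = 1000.0) -> float:
    """Directly proportional Rule: with mass fixed, acceleration = force / mass."""
    return force_newtons / mass_kg


a_single = acceleration(2000.0)   # some force
a_double = acceleration(4000.0)   # double the force
```

With the mass held fixed, `a_double` is exactly twice `a_single`, which is all that "directly proportional" claims.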

What does it mean to say one Variable influences another?

If A influences C, we can and should draw an arrow from A to C in our diagram. What does this mean? It just means that some Rule or other connects them.

In the simplistic first sections of this book, we will usually assume that it means changing between any of the Levels of A will always affect C.

More generally, we mean there are at least two Levels of A for which the Level of C is different: C is not always indifferent to the Level of A. So if I have reasonable soundproofing, I may not notice the children playing downstairs most of the time. But there is certainly a Level at which I do notice them, i.e. my level of irritation responds to the level of their noise.
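When a Rule is available as a function and A has only a few Levels, this general definition can be checked mechanically. The sketch below is hypothetical: `influences`, `notice_children`, `indifferent` and the three noise Levels are invented for illustration, and the point at which I start to notice is an assumption.

```python
from typing import Callable, Hashable, Iterable


def influences(rule: Callable[[Hashable], Hashable],
               levels_of_a: Iterable[Hashable]) -> bool:
    """True if at least two Levels of A lead to different Levels of C."""
    outcomes = {rule(level) for level in levels_of_a}
    return len(outcomes) > 1


def notice_children(noise: str) -> str:
    """Assumed Rule: with reasonable soundproofing I only notice 'high' noise."""
    return "yes" if noise == "high" else "no"


def indifferent(noise: str) -> str:
    """A Rule under which C never responds to A."""
    return "no"


NOISE_LEVELS = ("low", "medium", "high")
```

Under these assumptions `influences(notice_children, NOISE_LEVELS)` holds even though two of the three Levels give the same result, while `influences(indifferent, NOISE_LEVELS)` does not.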

The arrow means there is some kind of Rule, however incomplete, to say how the Level of the Consequence is influenced by one or more of the Influences.

We already saw that a simple Theory may include the specification of a Rule, written as part of the Variable and preceded by !Rule.

A Rule is any set of instructions which, given the Levels of one or more parent Variables, specifies a result in terms of the Levels of a consequence Variable.

Often we have only a vague idea of how A influences B and nevertheless this influence might be critical to us.

Some Variable   !Rule: A very important influence

 Another Variable

A Rule may be vague or probabilistic or chaotic or non-linear or incomplete, provided it gives at least some information about the effects of the influence Variables on the consequence Variable. Ideally, a Rule tells us something about the overall strength of the effect of the influence Variables on the consequence Variable. For example, do we only claim the mere existence of such a link, however weak, or is the consequence Variable mostly or even completely determined by the influence Variables?

Whether I notice the children downstairs((no,yes))

 How loudly the children are playing ((lo-hi))

So there is an arrow from influence to consequence.

But, if I put in maximally effective soundproofing, I can’t hear them at all, however loud they shout.

-Context: soundproofing *high--* ((lo-hi))

Whether I notice the children downstairs ((no,yes))

How loudly the children are playing ((lo-hi))

There is no longer an arrow.

Graphically, if the only arrow to C comes from A, we are saying that just A influences C. In particular, if there is no arrow from A to C, A doesn’t influence C.

What does it mean to say several Variables influence another?

If we draw arrows from both A and B to C, this means that for at least some Levels of B, A has an influence on C (and for at least some Levels of A, B has an influence on C). It may still be that at some particular Levels of A, B has no influence on C. (Again, it also implies that any other Variables in our diagram do not influence C, at least not directly.)

We can redraw the above example, making the Variable “quality of the soundproofing” explicit.

Whether I notice the children downstairs ((no,yes)) !Rule: I notice more as volume increases   and/or as soundproofing decreases,   but beyond a certain level of soundproofing,   volume no longer has any influence on me

 How loudly the children are playing ((lo-hi))

 Quality of the soundproofing ((lo-hi))

We can’t be sure of the exact Rule about what influences what. We just write down what we know. That is fine.
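To make the soundproofing Rule concrete, here is one possible reading of it as a Python sketch. Everything numerical is an assumption made up for illustration: both parent Variables are put on a 0 to 1 scale (lo to hi), the cutoff of 0.9 and the threshold of 0.3 are arbitrary, and the names `notice` and `volume_matters` are hypothetical.

```python
def notice(volume: float, soundproofing: float, cutoff: float = 0.9) -> bool:
    """Assumed Rule on 0-1 scales (lo-hi) for both parent Variables.

    At or beyond `cutoff`, the soundproofing blocks everything. Below it,
    I notice when the noise that gets through exceeds an arbitrary 0.3.
    """
    if soundproofing >= cutoff:
        return False
    return volume * (1.0 - soundproofing) > 0.3


def volume_matters(soundproofing: float, volumes=(0.0, 0.5, 1.0)) -> bool:
    """Does changing the volume change the outcome at this Level of soundproofing?"""
    return len({notice(v, soundproofing) for v in volumes}) > 1
```

With poor soundproofing (say 0.2) the volume makes a difference; with soundproofing of 0.95 it makes none. Both arrows stay in the diagram, because each parent has an influence for at least some Levels of the other.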

Note for nerds: while the bare structure of Theories expresses graphs in the sense of Pearl (Pearl 2000), the Rules express the parameters.

Combining more than one simple Theory: how do the Rules combine?

We already saw xx that simple Theories can be combined into composite Theories. What does that actually mean in terms of Rules?

If we have a Rule about how C is influenced by A (and maybe B), and another Rule about how A is influenced by X (and maybe Y and Z), we can then work out how C is influenced by X (and maybe Y, and maybe Z). In other words, we can work out a Rule for a composite Theory if we are given the Rules for the simple Theories which make it up. So if we can control some of the Variables on the left of the diagram, we can make at least a guess at how much influence we might have on some Variable far away on the right of the diagram.
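If each Rule is written as a function, working out the Rule for a composite Theory is just a matter of feeding one Rule into the next. The two simple Rules below are invented purely for illustration (A is twice X; C is "high" once A exceeds 5), and the function names are hypothetical.

```python
def a_from_x(x: float) -> float:
    """Assumed Rule for the first simple Theory: A is twice X."""
    return 2.0 * x


def c_from_a(a: float) -> str:
    """Assumed Rule for the second simple Theory: C is 'high' once A exceeds 5."""
    return "high" if a > 5.0 else "low"


def c_from_x(x: float) -> str:
    """Rule for the composite Theory, obtained by chaining the two Rules."""
    return c_from_a(a_from_x(x))
```

So if X is something we can control, `c_from_x` gives at least a guess at what setting X to a particular Level will do to C further along the chain.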

From the point of view of someone designing, managing or evaluating a project, these basic building blocks are quite familiar, and we see them everywhere in project plans such as logical frameworks. The benefits of thinking in terms of simple Theories are obvious: we can show what needs to happen in order to achieve a certain outcome. By linking these units up, we can continue back up the chain: in order to achieve that, this needs to happen … and so on, back to something we can actually control. We end up with a composite Theory of how some desired change can come about.

Rules and functions

Mathematicians call these Rules functions but Theorymaker native speakers prefer the word Rule.

In any case, these “functions” can be very general and possibly quite vague.

Kinds of Rule

In project planning and evaluation, we meet even more different kinds of Rules than there are kinds of Variables.

There is no reason to expect that the Variables are numerical (though they might be) or that the function is exact. All we mean is that there is some kind of Rule which tells us how the consequence Variable is influenced by the influence Variables. The Rule can take any imaginable form - for example:

  • all the Variables are true/false Variables and the consequence Variable is only true if all the influence Variables are true …
  • … or the Level of the consequence Variable stays very low until the Level of the influence Variable reaches a certain tipping point, beyond which the Level of the consequence Variable soars …
  • the consequence Variable has one Level for every influence Variable and the result shows which influence Variable has the highest Level …
  • and so on.
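The three forms in the list above could be sketched like this. The function names, the threshold of 0.7 and the "very low" and "high" values are all assumptions chosen for illustration, and the tipping point is drawn as a crude step.

```python
from typing import Dict


def all_true(*influences: bool) -> bool:
    """Consequence is true only if every influence Variable is true."""
    return all(influences)


def tipping_point(level: float, threshold: float = 0.7) -> float:
    """Consequence stays very low until the influence passes the threshold."""
    return 0.05 if level < threshold else 0.95


def highest(levels: Dict[str, float]) -> str:
    """Consequence names the influence Variable with the highest Level."""
    return max(levels, key=levels.get)
```

None of these three is linear, and only the second is even numerical on both sides.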

In particular, there is no reason at all to expect that the consequence Variable is influenced by the influence Variable(s) in any kind of linear way. While linear relationships are convenient for mathematicians, they are the exception rather than the Rule in the natural and social world.

Statisticians usually think of very specific kinds of influence, e.g. A is exactly twice B, or grows exponentially as B grows. But when making a Theory of Change, we often need many other kinds of influence, perhaps more or less vague or fuzzy. For example, I am pretty sure that if my son has had a bad day at school he is more likely to start an argument with his sister. Even though I couldn’t start to formulate this in terms of a strictly mathematical function, I still may need to be able to include this influence in any Theory of Change about my son’s behaviour at home.

Moreover, in this example we suspect that a bad mood does not always result in a fight but merely increases the chances of a fight - so the Rule is probabilistic, not deterministic.
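A probabilistic Rule of this kind can still be written down, as long as we are honest that the numbers are guesses. In the sketch below the probabilities 0.6 and 0.2 are pure assumptions; only their ordering (a bad day makes an argument more likely) comes from the example. The function names are hypothetical.

```python
import random


def p_argument(bad_day: bool) -> float:
    """Assumed probabilities; only their ordering reflects the example."""
    return 0.6 if bad_day else 0.2


def argument_happens(bad_day: bool, rng: random.Random) -> bool:
    """Draw one outcome from the probabilistic Rule."""
    return rng.random() < p_argument(bad_day)
```

Calling `argument_happens(True, random.Random())` repeatedly gives an argument on some days and not on others, which is exactly the difference from a deterministic Rule.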

Independence of influence Variables

We should also remind ourselves of the very important but often overlooked condition that all the other possible paths in a Mechanism are zero. In the kind of single-step Mechanisms we are discussing here, this just means that the influence Variables are causally independent of one another. (We already know that they are logically independent, see the definition of Mechanism, above.)

Incomplete Rules

Notice that our definition above does not require that every Level of A is assigned a unique and discrete Level of B; this allows for probabilistic Rules, see below. Also the Rule might only concern some Levels of A while being agnostic about other Levels (see note 1). So we might perhaps know that very high temperatures will make a lizard sleepy, and medium temperatures might make it active, without having any idea what its behaviour is at very cold temperatures.
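The lizard example can be sketched as a Rule which simply has nothing to say about some Levels. The names `LIZARD_RULE` and `lizard_behaviour` are made up for illustration, and returning `None` is just one convention for "the Rule is silent here".

```python
from typing import Optional

# Only the Levels we know something about appear in the table.
LIZARD_RULE = {
    "very high": "sleepy",
    "medium": "active",
}


def lizard_behaviour(temperature: str) -> Optional[str]:
    """Return the predicted behaviour, or None where the Rule is silent."""
    return LIZARD_RULE.get(temperature)
```

Asking about "very cold" returns `None` rather than a wrong guess, so the incompleteness of the Rule stays visible.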

What do the arrows mean? Causation, the mysterious force?

In this book I avoid using the words “cause” and “causality”; instead we talk about one Variable influencing another. When many Earthlings hear the word “cause” they tend to think of one Variable totally controlling another, whereas in real life such cases are rare. Most influences are partial, and it seems more natural to talk about a partial “influence” than about a partial “cause”.

The presence of arrows in a diagram from A (and possibly B and C) to X implies, at a minimum, that at least some changes in A can affect the expected Level of X for at least some settings of B and C. So the arrow doesn’t represent a secret force called causality or anything else; it only says that if you are going to predict or control this consequence Variable, you are going to have to consider this influence Variable. That’s all we mean by “causal connection”.

Actually, if you like, it is the absence of an arrow between a given pair of Variables which is arguably the stronger statement, because it implies that these two Variables are causally independent.

We could say that causal explanations are explanations which follow a particular pattern, rather than explanations which invoke a particular force or phenomenon. Bertrand Russell claimed xx that scientists don’t actually talk about causation; as Pearl (Pearl 2000) points out, it is true that we don’t see words like “cause” in the body of scientific theories, but scientists use these words informally all the time. Pearl and colleagues have introduced a new and very useful kind of calculus (a mathematical language) which does indeed allow us to discuss “cause” and, for example, prove theorems about the difference between causal and merely correlational relationships.

How is the algorithm to be specified?

It doesn’t matter how the Rules are specified as long as there is intersubjective agreement. They can be as ambiguous as you like as long as there is also intersubjective agreement about the ambiguity. So, say, they might be specified only for one unusual Level of a parent Variable, for which the influence on the child Variable is set out, while for all the other Levels there is no information. As long as almost everyone agrees on this understanding of the Rule, we have a valid Rule. Equally, the Rules may give you probabilistic predictions. Traditionally, this is modelled by introducing an error variable.

Specific and general Mechanisms and Theories

When we use the word “Mechanism” in everyday life, or when we assert a Theory, and especially when talking about science, we tend to mean quite a general idea. So we might say the Theory of gravitation applies across the whole universe for almost the whole of time. The way we use the words “Mechanism” and “Theory” in this book and indeed in project monitoring and evaluation in general is a bit different because here we are much less interested in ideas which are generally true - except perhaps as a source of knowledge which we can apply - and much more interested in specific contexts. See chapter xx. Variables (and Theories) can be more or less general or specific with reference to context and in particular with reference to time. In particular they can be valid for just one time-point or set of time-points, or more generally for relative time (if you do X in this school year, you will probably get result Y in the next school year ….). We will deal with this more in the chapter on Context.

Zooming in, zooming out

The child who kept asking “and what’s inside that black box” died of exhaustion.

Theorymaker native speakers say they are zooming in when they replace a simple Theory with a composite Theory which reproduces the same Rule; and they say they are zooming out when they do the opposite: replacing a composite Theory with a simple Theory (with all the same exogenous Variables) which has the equivalent Rule, i.e. a Rule which produces the same results on the output Variable.

However, just as with transformations of (sets of) Variables, these transformations of Theories & Mechanisms are not unproblematic.

In the correlational paradigm, they are innocuous.

If you think of a Rule as purely extensional (a mapping from some finite set of input configurations to a Variable with a finite set of Levels, i.e. a probability distribution on the set of all the combined Variables), there is no problem. But when Rules may be iterative, and/or when they are subject to unpredictable interference (see open systems, xx), it is in general no longer possible to understand Rules in a purely extensional way.
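In the purely extensional case, checking whether a zoomed-out Rule is equivalent to a composite Theory just means comparing the two on every combination of exogenous Levels. The sketch below assumes two no/yes exogenous Variables X and Y and one intermediate Variable A; the Rules and all the names are invented for illustration.

```python
from itertools import product

LEVELS = ("no", "yes")


def composite(x: str, y: str) -> str:
    """Composite Theory: X and Y set an intermediate Variable A, which sets C."""
    a = "yes" if x == "yes" and y == "yes" else "no"   # assumed Rule for A
    return "yes" if a == "yes" else "no"               # assumed Rule for C


def zoomed_out(x: str, y: str) -> str:
    """Candidate simple Rule which skips the intermediate Variable."""
    return "yes" if (x, y) == ("yes", "yes") else "no"


def same_rule(f, g) -> bool:
    """Compare two Rules on every combination of exogenous Levels."""
    return all(f(x, y) == g(x, y) for x, y in product(LEVELS, repeat=2))
```

This kind of exhaustive comparison only works while the set of input configurations is finite and nothing interferes from outside; it is precisely what stops being available once Rules are iterative or open to interference.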

So when we zoom out to replace a whole composite Theory with just one simple Theory, we have to make the new, simple Rule internalise the whole Theory. If the original, composite Theory was complicated, the new Rule will probably have to be quite complicated internally too.

Reality of zoomed-out Mechanisms

The arrow from the intervention to the outcome is surely a summary of a more complex mechanism, which surely involves people’s thoughts, motivations etc. It is merely an abbreviation: not an abbreviation in the sense of a mere correlation, but an abbreviation of a mechanism. We do this all the time when we say e.g. “inviting too many guests leads to stress” or “playing music in front of the kids might encourage them to play themselves”. You can pick apart any link in any mechanism and say “this is just an abbreviation - there are more details in here to talk about”.

Rules can themselves be composite


Pearl, Judea. 2000. Causality: Models, Reasoning, and Inference. Cambridge University Press.

  1. something which mathematicians usually call a partial as opposed to a total function