# Preamble

This book introduces a simple written and visual dialect of English called “Theorymaker”. Theorymaker is based on English but adds special ways to express ideas like contribution, causal influence, feedback, evidence, etc. The purpose of Theorymaker is not to suggest another new kind of template for planning, monitoring and evaluation of projects in practice. Rather, Theorymaker provides a unified, theory-based language for conversations and discussions not only around project and programme designs but also evaluation designs, more general evaluation approaches, and specific and general theories of change. In particular, Theorymaker has radical suggestions for modelling complex, chaotic and emergent processes, providing a way to bring both closed/rigid and open/emergent evaluation paradigms under one roof. Theorymaker models of various evaluation paradigms will be presented, from Outcome Mapping & Harvesting to Logical Frameworks via some of Scriven’s Logic of Valuing.

Theorymaker can be used for expressing and analysing theoretical problems on the one hand and also for writing down some key ideas in practical evaluation designs and reports on the other.

### Warnings!

#### Work in progress!

This book is a work in progress. You are heartily invited to make comments and contributions briefly via Twitter (stevepowell99) or in full by email (steve@pogol.net).

Some sections are marked as “in-depth” and printed in lighter grey; these can also be skipped on a first reading.

There are also some chapters marked as work “in progress”, which are printed in very light grey and should be skipped at this stage.

#### What are all these Words with Capital Letters?

When we use words like “Variable” and “Theory” in the Theorymaker way, we will spell them with capitals: Variable, Theory. We call them “meta-Theorymaker” - a way to speak about Theorymaker. At the end of the book is a Theorymaker dictionary.

Most of the diagrams in this book present Theories. You will see some text in Theorymaker, followed by the corresponding diagram.

Actually, most of the diagrams are only fragments of Theories (and Theories of Change); pieces which would normally be included in a larger diagram.

## Who is this book for?

This book might be of interest for social scientists in general but I am writing in particular for monitoring and evaluation (“M&E”) professionals - people who are responsible for monitoring and evaluating the success of and processes within many different kinds of projects and programmes, for example in international development, education and so on.

This book is not meant as an introduction to monitoring and evaluation (M&E) [1]. It will make most sense to people who have some practical experience and who are familiar with the theoretical outline of more than one well-known approach to these topics such as the logical framework approach (LFA) or Outcome Mapping.

I have marked some passages as “In Depth” where the ideas are intended more for readers with some background in, and/or penchant for, maths & logic.

### What kind of M&E?

I will mostly discuss the monitoring and evaluation of “projects” and “programmes”, see below.

I will mostly write about evaluation in the field of international relief and development because that is what I know most about. I really hope it is relevant to M&E people in other fields.

### What’s a project, what’s a programme?

Roughly, we can say that a project is someone doing one thing in order to get another: a time-limited endeavour by an actor with a more or less clear intent to contribute to a specified valued result or results. A programme, by contrast, is a large, project-like endeavour which is not necessarily time-limited, often has less clearly defined and/or emerging valued results, and may contain one or more projects. We will see more precise Theorymaker definitions in Chapter x.

## Why I am writing this book

I’m a freelance evaluator as well as working for an NGO in Sarajevo, proMENTE social research. I’ve worked in quite a few different countries for a variety of clients, but most frequently I’ve worked for different wings of the Red Cross Red Crescent Movement, quite often on evaluations and evaluation frameworks related to disaster, resilience and vulnerability. Before that I studied and conducted research in psychology, especially clinical psychology, in Middlesex (briefly, just compiling my PhD) and Sarajevo and Munich (for almost ever). But before that I studied philosophy, maths and formal logic at Manchester. Some ideas I had then, like a contrastive view of causation and a fascination with both phases of Wittgenstein’s work, never quite left me alone and though I tried to shoo them away when I became a working evaluator with real clients, they have been barking at my door ever since. This book is what happens when those philosophical dogs are let loose amongst the portly and theoretically languid chickens in the evaluation hen-house.

## Motivation: what problems does Theorymaker address?

Evaluation has become a very hot field. Evaluations are often taken into account in funding decisions which together are worth billions of dollars (even if political and other considerations are even more important). Yet there is a whole range of essentially conceptual or even philosophical problems around everyday concepts in evaluation. We will look at many common challenges which plague evaluators, and we will show how they arise because the key evaluation concepts they involve are unclear and/or contested. Basically, we don’t have the right notation, the right grammar, for writing about how things influence one another. Many of the problems disappear if we use Theorymaker.

Central to project evaluation are assessments of a doubly thorny family of concepts like causal impact, attribution and contribution. This family is doubly thorny because we lack consensus both on how to assess or measure these concepts and on exactly how to use them to judge projects once we have them. This lack of consensus is not primarily political or practical; it is philosophical. Project staff and evaluators have to use concepts like attribution and causation, yet the correct use of some of the key concepts has been the subject of sometimes bitter debate at least since the time of the Greek philosophers, and substantive differences still remain.

Here are some of the kinds of problems addressed in this book.

xxproblemxx

These are not edge cases but part of the bread and butter of evaluation. A nurse might not be able to define precisely what “symptom” means, but this does not affect their daily work. But if evaluators cannot agree on what “effect” means, how can we conduct evaluations?

These are vexed questions. Learning Theorymaker will help you see and understand these kinds of issues more clearly.

## Theoretical background

Many of the key ideas [2] are based on the theory of causal networks set out in Pearl (2000) [3]. More on the similarities and differences with Pearl’s approach here.

You could say this book puts forward roughly the following theoretical manifesto.

To the extent that evaluation has a theoretical homeland, it is the social sciences.

Many social science theories, at least within the quantitative paradigm, have been expressed in the form of relationships between variables which can be numerically measured. So the theories are of this kind of form: “if the value of variable A is high, and the value of variable B is not too low, then the value of variable C will be high” - a relationship which could in principle be expressed as a mathematical function:

$C=f\left(A,B\right)$

There will always be a certain amount of noise in this prediction, as no-one thinks social science theories can enable perfect predictions. (Functions like this are referred to as “Rules” in Theorymaker.)
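To make the idea of a function with noise a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the 0-to-1 scale, the thresholds `HIGH_A` and `LOW_B`, the output values and the names `rule_c` and `noisy_c` are not part of Theorymaker.

```python
import random

# Hypothetical thresholds on a 0-to-1 scale; assumptions for illustration only.
HIGH_A = 0.7
LOW_B = 0.3


def rule_c(a, b):
    """A made-up function C = f(A, B): C is high if A is high and B is not too low."""
    if a >= HIGH_A and b >= LOW_B:
        return 0.9  # "high"
    return 0.2  # "not high"


def noisy_c(a, b, noise_sd, rng):
    """The same function with Gaussian noise added, since no prediction is perfect."""
    return rule_c(a, b) + rng.gauss(0.0, noise_sd)


rng = random.Random(0)  # fixed seed so that the run is repeatable
for a, b in [(0.9, 0.8), (0.9, 0.1), (0.2, 0.8)]:
    # Print the inputs, the noise-free prediction and one noisy realisation.
    print(a, b, rule_c(a, b), round(noisy_c(a, b, 0.05, rng), 2))
```

The noise-free part carries the “if … then …” content; the noise term only says that real observations will scatter around it.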

We would all like to understand this “if … then …” (and the equals sign in the equation above) as a causal statement. Yet we are taught to believe that the only data actually available to social scientists is observational data on numerical variables. This has made many scientists wary of even claiming that our theories involve causal links, and has led them to think that even these theories are really only correlational (in the tradition of the grandfather of statistics for social science, Karl Pearson). But if we cannot draw from correlational data the causal conclusions we actually want, we are left with a mystery: how do we, at least as ordinary citizens, actually come to have the masses of information about how social relations work (like, if you shout at your teacher you’ll get into trouble) which we use to navigate our lives?

M&E staff, by contrast, don’t usually have time for such worries: the upward links in a logframe, for example, are explicitly and unashamedly supposed to be causal, or at least to describe influence. However, as soon as we want to give a theoretical explanation of what a logframe (or any other project planning, monitoring or evaluation framework) actually is and how to validate it, as soon as we turn to a statistician or social scientist for help, we are referred back to the correlational tradition, which is obviously inadequate for our needs. We are told, for example, that it is terribly difficult to demonstrate a project’s impact. This gaping hole has been filled by just one supposedly magical tool, the “Randomised Controlled Trial”, which supposedly does, uniquely, enable us to identify causal influences - even without any proper understanding of what this means or, for example, how to distinguish formally between the statements that two variables are a) correlated and b) causally connected.

However, the work of Judea Pearl and colleagues at the end of the 20th century has finally enabled a formal treatment of causal influences, part of what Gary King has called “a causal revolution” in the social sciences.

First,

$C=f\left(A,B\right)$

is re-interpreted as a causal claim with an asymmetric equals sign.

Second, Pearl and colleagues show us how we can indeed, under certain circumstances, gain causal information from correlational data.

Third, Pearl’s analysis makes the pieces of causal knowledge which we do have seem more like relatively independent heuristics, heterogeneous pieces of causal knowledge which can self-organise for a while into a system but which have an existence independently of any given system, as follows.

The basic causal link in Pearl’s approach is functional in the sense that it says how the values of downstream variables are influenced by the upstream variables in terms of some mathematical function. What makes this relationship causal is that it is to be understood in terms of actual interventions in a system. So even though an upstream variable is usually itself causally influenced by other variables further upstream of it, Pearl says that the function explains what will happen regardless of those further-upstream variables; when someone or something intervenes, they potentially break the further-upstream links and the variable becomes “parent-less”, free to influence variables downstream of it. Formally, the downstream causal effect of an intervention on a variable in a larger system is to be understood as if the equations determining that variable were deleted from the system of equations describing the system. Pearl says that an intervention is like “doing surgery” on such a system. It is this autonomy which gives the individual causal links a subtly new role and makes them seem more like relatively independent heuristics which can survive even if their neighbours are “deleted” by an intervention.
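The “surgery” idea can be sketched in a few lines of Python. This is only a toy under stated assumptions, not Pearl’s own code or notation: the three variables Z, A and C, the functions `rule_a` and `rule_c`, and the helper `run` are all hypothetical. Z influences both A and C, and A influences C. Intervening on A means deleting the function that normally sets A and fixing A by hand, while the function for C is left untouched.

```python
from statistics import mean

# Hypothetical toy system: Z influences both A and C; A influences C.
Z_VALUES = [0, 1]  # assumed equally likely


def rule_a(z):
    """Upstream function: A simply copies Z."""
    return z


def rule_c(a, z):
    """Downstream function: C depends on both A and Z."""
    return a + 2 * z


def run(do_a=None):
    """Return one (z, a, c) row per value of Z.

    If do_a is given, the function for A is "deleted" and A is set by hand.
    """
    rows = []
    for z in Z_VALUES:
        a = rule_a(z) if do_a is None else do_a
        rows.append((z, a, rule_c(a, z)))
    return rows


def observed_mean_c(a_value):
    """Mean of C among undisturbed cases in which A happens to equal a_value."""
    return mean(c for _, a, c in run() if a == a_value)


def intervened_mean_c(a_value):
    """Mean of C when A is forced to a_value, whatever Z is."""
    return mean(c for _, _, c in run(do_a=a_value))


observed_difference = observed_mean_c(1) - observed_mean_c(0)
causal_effect = intervened_mean_c(1) - intervened_mean_c(0)
print(observed_difference, causal_effect)  # prints "3 1"
```

In this toy, simply comparing cases with high and low A suggests a difference of 3 in C, because Z is quietly moving both variables; forcing A from 0 to 1 changes C by only 1, which is what the function for C says on its own once A has been made “parent-less”.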

Just to remind ourselves: no-one has discovered anything here. Pearl just offers us a different notation. This kind of notation seems better fitted than grand, sweeping social science theories to the kind of jumble of knowledge we have in real-life projects.

Theorymaker follows this lead and gives us an English-like language for project evaluation which allows us to re-establish it as a thoroughly causal enterprise.

We don’t actually use the word “causal” very much in this book because some people are allergic to it. These people jump to the conclusion that we are suggesting that important phenomena in the social world are fully determined by other factors. But we aren’t. We prefer the word influence, to make it clear that in nearly every case we are happy to know that one factor has a partial (yet nevertheless causal) influence on another. (More subtly, some people are allergic to the word “causation” in the special case of human decisions because, they argue, the right way to understand such decisions is to look for reasons rather than causes - Theorymaker native speakers are sympathetic to this, see Chapter xx).

So this book provides a notation, a dialect, for describing projects and programmes which is explicitly and unashamedly causal rather than correlational. As such it is merely the very beginning of the splash made as Pearl’s causal revolution hits the relative backwater of evaluation science.

However, this book also extends Pearl’s ideas a little, covering for example:

• how to apply Pearl’s ideas to the kind of explicitly vague, incomplete and non-numerical data we have in evaluation;
• what distinguishes a project from other social phenomena;
• how to distinguish more clearly between the little “theories” that constitute our knowledge and the variables and “mechanisms” they describe, while also showing how those actual theories and plans and maps are themselves part of the same physical world as the variables and mechanisms they describe;
• how to understand evaluations as also a kind of causal process in which the elements which make up our plans and reports describe, more or less accurately, our projects and programmes;
• how to extend essentially the same ideas to causal links which are “wicked” in various different ways: under-determined, emergent, chaotic and so forth.

In the first three sections of this book we will introduce a simple way of looking at “simple Theories” (essentially, numerical and non-numerical functions from input to output variables) and we will see how useful they are, combined with one another, in understanding the outlines of what a project is and how we are to evaluate one. But we will also smirk heartily at the idea that projects can really be understood as simple, discrete interventions which can be implemented like pressing a button and which run a predetermined course, or even a course with predetermined probabilistic boundaries. What is exciting about the Theorymaker approach is what happens when we look beyond these simple models into realms usually described as adaptive, emergent and chaotic: we don’t need to throw up our hands and say “anything goes”; we find we can keep our simple models, and our simple understanding of what constitutes evaluation, by just making one change: being less strict about the kind of rule or function they embody.

### But can you really apply all this stuff in practice?

I don’t think for a moment that many M&E people will actually start using Theorymaker straight away in what they do, though I hope it might stimulate many to understand and even do evaluation a bit differently. The purpose of this book is more to take a good look at some of the basic concepts we rely on in evaluation than to provide immediate practical solutions, templates or anything like that.

### Doesn’t “Learning Theorymaker” skirt all the political issues around evaluation?

This book is emphatically not about the application of evaluation approaches or about the political or cultural background in which evaluation is carried out, though these things are very important. Covering them would mean writing something different. But I do need to address the narrower criticism that speaking Theorymaker intrinsically implies taking a particular, technocratic political world-view. I do that in chapter xx.

### Is there a lot of maths?

There really isn’t very much maths, nothing you don’t see at school.

While this book does take a somewhat formalist approach, it isn’t primarily numerical or “quantitative”; only a minority of the Variables which interest most evaluators are actually numerical, whereas many are what Theorymaker native speakers call “intensity” Variables, which do follow formal rules: not the rules of maths but the softer rules of what I call Soft Arithmetic, see chapter xx.

(More technically, we can say that Theorymaker takes a functional rather than a probabilistic approach, see Pearl, p. xx, and makes no assumptions about the nature of the Variables; they are not assumed to be numerical.)

### But isn’t it complicated?

Gosh, this all looks pretty complicated … I see lots of different headings talking about “‘for’ Variables” and “grouping boxes” and so on. But I only want to build really simple Theories of Change.

So do I. The point is that the different aspects of Theories of Change which we assemble here can already be found “in the wild” in ordinary logical frameworks, informal Theories of Change etc. We Earthlings already use all of the ideas I discuss here, but we have a very confused way of doing it and are constantly mixing them up. Theorymaker native speakers, by contrast, have a systematic way to construct Theories of Change, and all of them will understand the same diagram in the same way, rather than guessing what a box or an arrow actually means. That’s a big advantage.

### But why is this approach so formalistic? Isn’t life really much messier?

To learn Theorymaker you have to learn a bit of Theorymaker grammar - in a sense, a formal and theoretical business rather than a practical one. But this does not at all mean that I think that the practice of evaluation or project management should be more formalistic than it is now or that the practical side of evaluation is unimportant; quite the contrary. Rather, I think that very many practical headaches and problems in evaluation practice are actually due to getting some of these theoretical issues wrong; see chapter xx.

### Scepticism: is this approach too optimistic about plan-ability?

As you will see, Theorymaker native speakers are quite sceptical about our ability to know much about the causal links that make our projects tick. Most often, we Earthlings are overconfident in our theorising. Theorymaker native speakers know this and are good at discussing vague Theories. On the one hand I do show you how Theorymaker native speakers really use Theories when planning and evaluating projects, and I do mean real Theory [4], not just a vague or plausible story for the donors. But on the other hand they are sceptical about how much real, accurate, established and validated Theory, useful for practical projects, we actually have.

Given the average project, it is pretty difficult to make an accurate model of how it works, to isolate the Mechanisms which turn inputs into outcomes. The best model you can get is probably not able to explain very convincingly the effects that your project has. And even if you manage to find such a model, it is probably pretty different from what was planned.

And plenty of projects do too much planning, at least on paper, and often it is M&E frameworks which force them to do it. We hope to get a feel for the limits of predictability and for how to manage our projects when predictability fails us.

But as you will also see, I think we can only evaluate a project to the extent that we do have a best guess about what makes it tick: in what way and to what extent it is an engine for change. If there is no predictability at all - even in retrospect - there is no evaluation, not even in the sense of being able to say “this led to that” or “our project is succeeding”. We can do team-building, or write a song about the project, or teach Yoga, but we can’t, in the most important sense of the word, evaluate it.

### Thanks to …

1. Sometimes “M&E” is used to mean a relatively low-level function, namely the mere monitoring of projects and programmes, contrasted with the more illustrious discipline of evaluation. I make no such distinction and refer to both as “M&E”.

2. Unfortunately I am not a proper academic and cannot trace where many of the other ideas in this book came from. So I make no claim of originality.

3. Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge University Press.