# For-each Variables

Theories often include a large number of similar Variables, for example the same Variable repeated for many similar persons (e.g. the height of each child in a class). In practice we group these repeated Variables together in various ways and often represent them as if they were a single Variable. We often do this without thinking: on our Theory of Change, we put one box for “people’s attitudes to recycling” and another for “the law is passed”, without noticing that the first is probably conceived as a whole column of data, one value for each person, whereas the second is just a single datum, yes or no.

Any set of Variables which differ only by one Feature can be rewritten as one for-each Variable.

For example, each of these Variables represents the temperature outside this house at a different time point.

Temperature outside this house,  time=yesterday

Temperature outside this house,  time=today

Temperature outside this house,  time=tomorrow

These can be rewritten like this:

Temperature outside this house    !for-each time (yesterday, today, tomorrow) 

## One Variable, one data point

The word “variable” is used in different ways in M&E and even within mathematics. Not in Theorymaker. When someone says they have, or could have, simultaneously different data points for the same variable in an intervention, presumably from different times or places or persons, Theorymaker native speakers would say this is not one Variable but a set of Variables, one for each data point.

The main reason people get confused about this is that in ordinary language (and in statistics, but not most of the rest of mathematics) we also use the word “variable” for a whole set of similar Variables repeated across time (and/or across places or across groups of people).

In Theorymaker, we call sets of Variables like this “for-each Variables”. “For-each time” Variables are very common, but Variables can be “for” other things too (see below). They are actually sets of ordinary Variables, each of which belongs to only one time-point.

So that’s why Theorymaker native speakers say “One Variable for each piece of data in an intervention” (or in any other single application of a Theory). They are quite well aware that behind any particular simple Theory there might be, in fact there should be, a whole mass of prior evidence, data, collected in relevantly similar contexts. But any particular intervention is essentially just one more case, and each piece of relevant data collected by the intervention needs a dedicated Variable to store it - often, as described above, grouped into for-each Variables.

So if you look at a series of temperature measurements for your town throughout the day you can say, “look, the temperature variable is changing”. That is a perfectly reasonable thing to say, provided we are clear that we are using the word “variable” here for a whole set of states or measurements, one for each moment of time (see the note at the end of this section). But in the Theorymaker sense of “Variable” introduced above, each of these states or measurements is a Variable in its own right, at a single point of time. Such a Variable can’t change in the sense of varying across time, though it can change in the sense that it could be different (or could have been different).

In Theorymaker, it is acceptable to use singular forms with for-each Variables, saying, for example, “this ‘for’ Variable records level of interest every month”, although perfectionists prefer to be more precise and say “these ‘for’ Variables record level of interest every month”, using the plural form.

More formally, any set of Variables which differ only by one Feature can be rewritten as just one for-each Variable with the addition of the marker !for-each followed by the same criterion, and mentioning the list of possibilities (e.g. “1-3”) if this is known.

Student achievement, student=1

Student achievement, student=2

Student achievement, student=3

The diagram above can be replaced by the diagram below.

Student achievement   !for-each students 1-3
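As a rough illustration of this rewriting rule, the Python sketch below collapses a list of Variables that differ only by one Feature into a single for-each line. The `Variable` class, the `collapse` function and the exact output format are assumptions made for this example; they are not part of Theorymaker itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    """Hypothetical stand-in for a Theorymaker Variable."""
    name: str
    features: tuple  # tuple of (feature, value) pairs


def collapse(variables, feature):
    """Collapse Variables that differ only in `feature` into one for-each line."""
    names = {v.name for v in variables}
    # Everything except the chosen Feature must be identical across Variables.
    others = {tuple(p for p in v.features if p[0] != feature) for v in variables}
    if len(names) != 1 or len(others) != 1:
        raise ValueError("Variables differ by more than one Feature")
    values = [dict(v.features)[feature] for v in variables]
    listed = ", ".join(str(value) for value in values)
    return f"{names.pop()} !for-each {feature} ({listed})"


variables = [
    Variable("Student achievement", (("student", 1),)),
    Variable("Student achievement", (("student", 2),)),
    Variable("Student achievement", (("student", 3),)),
]
print(collapse(variables, "student"))
```

The check inside `collapse` is the important part: the rewrite is only allowed when the name and every other Feature are the same for all the Variables in the set.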

## Rules with for-each Variables

We can display Rules where either parent or child Variables, or both, are for-each Variables. But we should be aware that these actually represent multiple Rules.

Here are two perfectly good simple Theories which do not use for-each Variables.

Student achievement

Teacher ability

… it talks about the generic effect of the ability of some teacher on the achievement of some student.

… whereas this …

Average student achievement

Teacher ability

… talks about the effect on “Average student achievement” for some set of students. Now behind this average is presumably a set of data for each student in a class (and the individual achievement scores may themselves be composite scores from a further mass of variables per student). But these individual scores are not mentioned or considered in the Theory. Average achievement is a perfectly good, single, Variable, and it is perfectly acceptable to have a Theory about how teacher ability can affect it. We may have plenty of evidence to back up links between teacher ability and average student scores and maybe that evidence is agnostic about specific effects on individual scores.

### Rule is same across all cases

… whereas the following Theory is different; it talks about the effect of teacher ability on the individual scores of three different students.

Student achievement, student=1, !Rule same Rule as other cases

Teacher ability

Student achievement, student=2, !Rule same Rule as other cases

Teacher ability

Student achievement, student=3, !Rule same Rule as other cases

Teacher ability

In this case, there are multiple Variables which we can visually compress into one using a for-each Variable.

So in the above case, we have one teacher and one teacher Variable (ability); but we have several students, each with one Variable (student achievement). Teacher ability actually influences several different Variables, one for each student. So the above diagram can be compressed into this one:

Student achievement !for-each students 1-3

Teacher ability
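One way to picture what the compressed diagram stands for is the small Python sketch below: a single Rule is applied once per student, giving one ordinary Variable for each. The linear `rule` function and all the numbers are invented for illustration only.

```python
def rule(teacher_ability):
    # Hypothetical Rule: achievement rises linearly with teacher ability.
    return 50 + 5 * teacher_ability


students = [1, 2, 3]
teacher_ability = 4  # one teacher, one Variable

# The for-each Variable stands for one ordinary Variable per student,
# and the single arrow stands for one (identical) Rule per student.
achievement = {student: rule(teacher_ability) for student in students}
print(achievement)
```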

### Rule differs from case to case

Student achievement//student=1, !Rule stronger effect here

Teacher ability

Student achievement//student=2, !Rule stronger effect here

Teacher ability

Student achievement//student=3, !Rule weaker effect here

Teacher ability

… that could be displayed something like this:

Student achievement//!for-each student !Rule stronger for students 1 and 2

Teacher ability
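When the Rule differs from case to case, a sketch of the same idea needs one set of Rule parameters per student. In the Python fragment below, the `effect` sizes and the `baseline` are made-up numbers chosen only to mirror “stronger for students 1 and 2”.

```python
# Hypothetical effect sizes: stronger for students 1 and 2, weaker for student 3.
effect = {1: 8, 2: 8, 3: 2}


def achievement(student, teacher_ability, baseline=50):
    """Apply the Rule that belongs to this particular student."""
    return baseline + effect[student] * teacher_ability


for student in effect:
    print(student, achievement(student, teacher_ability=4))
```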

## Aggregating for-each Variables

Ordinary Theories of Change often draw lines between for-each Variables and simple Variables without reflecting that these actually join multiple to single Variables. But we can’t in fact ignore this issue; rather than just an ordinary Rule to tell us how one or more single Variables influence the Consequence Variable, we need to specify how the multiple Influence Variables are aggregated to influence the Consequence Variable (or, in the case of defined Variables, how the defining Variables are aggregated).

When an arrow leaves a for-each Variable, and in particular when it leaves a for-each box to influence a Variable outside it, this is a reminder that the Variable is now bunched, and the receiving function has to take account of this. For example, how does the level of frustration of each individual student combine to influence class climate? Sometimes we just assume it is OK to take a mean or a total. But sometimes something else might be important, like the maximum or some more complicated rule of combination.

In this example, the teacher’s ability influences the achievement of each student individually; but also student achievement collectively goes on to influence the same teacher’s feeling of work satisfaction. There is a whole range of ways in which this can happen - is he or she most influenced by the top-performing students, or the number of students scoring low, or simply by the average of all?

Aggregation is an issue for defined Variables just as much as for consequence Variables. So in the same example, we can define a Variable “Average student achievement” which aggregates the student scores.

!Rule (mean)  Average Student achievement

Student achievement   !for-each student

Teacher feeling of work satisfaction !Rule:?

Student achievement   !for-each student

Teacher ability
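The choice of aggregation matters because different choices can give quite different pictures of the same for-each Variable. The Python sketch below applies three candidate aggregations to one set of invented student scores; the scores, the labels and the threshold of 40 are all assumptions for the sake of the example.

```python
from statistics import mean

# Invented scores, one per student (a for-each student Variable).
scores = {"student 1": 72, "student 2": 55, "student 3": 38}

# Three candidate ways the scores might combine to influence the teacher.
aggregations = {
    "mean": mean,
    "top performer": max,
    "number scoring below 40": lambda xs: sum(1 for x in xs if x < 40),
}

values = list(scores.values())
for label, aggregate in aggregations.items():
    print(label, aggregate(values))
```

Which of these, if any, belongs in the Rule for “Teacher feeling of work satisfaction” is a substantive question for the Theory, not something the diagram can settle by itself.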

## Using grouping boxes with for-each Variables

It is important that the same for-each criterion can apply to different sets of Variables. So you might record student performance and student age in the same table, with one line for each student and several columns for different Variables. Each student has a unique ID and a student with this ID can have several Variables which are entered in a row beginning with that ID. In statistics, this common ID number is a source of measurement strength and allows us to use some more powerful statistical methods.

So when we say this:

Teacher skills

Teacher presence on training course

Do we just mean that overall teacher skills are influenced by overall presence on a training course? If I am a teacher and I join a course, will it help my colleagues in Brazil improve their skills?

Here are two ways we could say this in Theorymaker:

Skills. !for-each teacher in our school.

Presence on training course. !for-each teacher in our school.

When no Rule is specified, we assume that the second Variable has an influence on the first for each and every teacher. You can see this more clearly in this equivalent, alternative phrasing:

-!for-each teacher

skills

presence on training course

This version is exactly equivalent because it follows the rule that defines for-each grouping boxes as equivalent to the same diagram without the box, in which each Variable which was within the box has the !for-each added to it.

Written without the qualifier “in our school”, this becomes a more general claim: without any further information about context, it might apply to any teacher, anywhere in the world, ever.
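The rule that a for-each grouping box is equivalent to adding the marker to each Variable inside it can be sketched in a few lines of Python. The `expand_boxes` function below is a hypothetical helper, not Theorymaker's own parser, and it only handles a single level of box (opened by a line such as `-!for-each teacher` and closed by a bare `-`).

```python
def expand_boxes(lines):
    """Rewrite single-level for-each grouping boxes as per-Variable markers."""
    out, criterion = [], None
    for line in lines:
        text = line.strip()
        if not text:
            continue
        if text.startswith("-"):
            # "-!for-each teacher" opens a box; a bare "-" closes it.
            label = text.lstrip("-").strip()
            criterion = label if label.startswith("!for-each") else None
        elif criterion:
            out.append(f"{text} {criterion}")
        else:
            out.append(text)
    return out


boxed = ["-!for-each teacher", "skills", "presence on training course"]
print(expand_boxes(boxed))
```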

Above we saw how it is convenient to use grouping boxes to visually organise a Theory, grouping together Variables with the same attribute like person, organisation or time-point. In the same way, we can use grouping boxes to group together for-each Variables which share the same criterion.

!Rule (mean)  Average student achievement

- !for-each student

Student achievement

Student motivation

-

Teacher feeling of work satisfaction, !Rule ??

Student achievement

Teacher ability

Student motivation

Teacher ability

So the variable “Average student achievement” is outside the box of “for student” Variables. For this group of many students there is a single Variable representing the average score.

As there is only one teacher in this example, we don’t really need to group the teacher Variables together. But in this next example, there are many classrooms, each with one teacher and multiple students. Quantitative social scientists say that “Students are nested within classrooms”. This way of thinking is very familiar to statisticians; but we need it for project M&E too, even if we don’t have numerical variables and we are not worried about statistics.

- !for-each teacher

Teacher ability

!Rule (mean)  Average Student achievement;Teacher feeling of work satisfaction, Rule ??

-- !for-each student

Student achievement

Teacher ability

Student motivation

Teacher ability

-

Teacher ability

School-level support to teacher development
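For readers who think in data structures, the nesting in this diagram can be pictured as in the Python sketch below: one record per teacher, each holding that teacher's own Variables plus a set of per-student Variables. The teacher names, student IDs and scores are all invented.

```python
from statistics import mean

# One entry per teacher; students are nested inside their teacher's classroom.
classrooms = {
    "teacher A": {"ability": 4, "achievement": {"s1": 70, "s2": 60}},
    "teacher B": {"ability": 2, "achievement": {"s3": 50, "s4": 40, "s5": 45}},
}

# "Average student achievement" is for-each teacher, not for-each student:
# aggregating removes the student level but keeps the teacher level.
average_achievement = {
    teacher: mean(list(room["achievement"].values()))
    for teacher, room in classrooms.items()
}
print(average_achievement)
```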

## Continuous for-each

So far we have talked about “repeating Variables” as if they were always discrete, but we often think of Variables as belonging to infinite sets which repeat continuously, in particular across time and space.

## More to come!

To be completed

### Repeated and selected Variables

#### Selection

When constructing Theories of Change, we often find that sets of individuals go through a programme but some are selected in or drop out.

Repetition and selection are extremely common in programmes but standard Theories of Change almost never take account of them. Statisticians know how to do this and will try to tease them out when analysing a dataset, but most times if programme staff weren’t clear about the repetition and selection then the data won’t have been recorded using the correct “for criteria”.

But even more importantly, repetition and selection are a pretty essential part of how many programmes work. Theorymaker native speakers talk about them and diagram them quite fluently. So can we.

(more about selection and filters …)

Use WITH for selections? But even an individual can be selected, yes/no.

Sport ability !for-each child

Initial running speed !for-each child with good Sport ability

The word “with” is compulsory.

This way, selection and for-each go together allowing for lots of different patterns:

Gender !for-each Student with height above 1m/Class/School, Student/Family with more than one child ((male, female))
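The simpler selection above, “for-each child with good Sport ability”, can be read as a filter applied before the for-each. The Python sketch below illustrates that reading; the children's names and ability values are invented, and representing “good” as a string is just an assumption of the example.

```python
# Sport ability is recorded for each child.
children = {
    "Ana": {"sport_ability": "good"},
    "Ben": {"sport_ability": "poor"},
    "Caz": {"sport_ability": "good"},
}

# "with good Sport ability" selects a subset of the children ...
selected = [name for name, child in children.items()
            if child["sport_ability"] == "good"]

# ... and "Initial running speed" exists only for that subset (values not yet known).
initial_running_speed = {name: None for name in selected}
print(initial_running_speed)
```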

#### Theorymaker syntax for nesting

Theorymaker native speakers sometimes use a syntax like this to express how for-each can be nested.

Student !for-each Student/Class/School

which is equivalent to this:

-!for-each School

--!for-each Class

---!for-each Student

Student 

But sometimes the “fors” are not completely nested within one another, so we cannot use grouping boxes in each case.

Student !for-each Student/Class/School, !for-each Student/Family

So this Variable (student gender, say) is nested within classes and then within schools, and at the same time it is nested within families.
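The fully nested case, where `Student !for-each Student/Class/School` is equivalent to three nested boxes, is mechanical enough to sketch in Python. The `nest` function below is a hypothetical helper that assumes the path lists levels from innermost to outermost, as in the example; it does not attempt the partially nested case with two paths.

```python
def nest(line):
    """Expand 'Variable !for-each A/B/C' into nested grouping-box lines."""
    variable, path = line.split(" !for-each ")
    levels = path.split("/")[::-1]  # outermost level first
    boxes = [f"{'-' * depth}!for-each {level}"
             for depth, level in enumerate(levels, start=1)]
    return boxes + [variable]


for out_line in nest("Student !for-each Student/Class/School"):
    print(out_line)
```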

#### Nesting and aggregation

Boxes are a great way to show which Variables share a common for-each criterion. This is particularly useful when indices are nested within one another:

-School layer

School climate  !Rule complex, chaotic interaction

--Class layer

Class climate  !Rule complex, chaotic interaction

Teacher skills

---Student layer

Student social skills

Student frustration

---

Student social skills

Teacher skills

Student frustration

Teacher skills  

In the above example, class climate contributes to school climate, which is seen as being a separate Variable. But school climate could just as well be defined as at least partly equal to some aggregation of class climates, by definition, rather than as being something above and beyond a mere aggregation of class climates. So even though we might have a clear idea in principle of the distinction between definitional and causal rules, in practice there is often a large grey area.

#### Subgroups

There are different kinds of subgroup. What about when each of a core group of 30 educators gets to train up 10 peers in some kind of cascade training, so that all 330 end up with key skills?

-EACH teacher;N=300+30

Have key skills

--SELECT core teacher;N=30

Conduct training for peers

Know how to conduct peer education

Have key skills too

Receive training
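The group sizes in the cascade follow directly from the two numbers given in the text (30 core educators, 10 peers each). A quick Python check, with variable names chosen just for this illustration:

```python
core_educators = 30
peers_per_educator = 10

# Each core educator trains 10 peers.
peers_trained = core_educators * peers_per_educator

# Core educators and their peers all end up with key skills.
everyone_with_key_skills = core_educators + peers_trained
print(peers_trained, everyone_with_key_skills)
```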

It might be useful to distinguish between two different kinds of nested groups.

1. EACH
2. SELECT

-EACH school;N=300

sc

--EACH classroom; N=300*20

cl

---EACH student; N=300*20*30

st

----SELECT girls

Variables just for girls

----SELECT boys

Variables just for boys

-

proportion=.6

sc;label=school characteristics

cl;label=classroom characteristics

st;label=common student characteristics

edge;dir=both;color=indianred

1. Of course there might be a whole lot, or even an “infinite number” of such variables; that doesn’t really matter at this point because we certainly won’t actually be collecting an “infinite amount” of data