Evaluation: Appraisal of projects and programmes
We define Evaluations as Reports on the value of Theories of Change, i.e. Assessments of the valuable Difference made by a Project.
Evaluation in this sense therefore necessarily involves assessing to what extent an Intervention can actually add Value, i.e. maximise valued Variables.
Designing an evaluation process thus essentially means designing a complex Mechanism which takes a Project as its input and outputs a Report on the quality of the Project: did it do something valuable?
We look at evaluations as performing different kinds of reporting, from those with more clearly defined outputs (“summative evaluation”, Scriven xx) to those providing narrative judgements (closer to Scriven’s “formative evaluation”).
Reporting and value
Evaluation isn’t just about producing any old Statement: it is about producing Statements which are intrinsically about the worth or value of something.
Michael Scriven famously claimed that evaluation is a “meta-discipline” which involves finding the worth or value of anything, from fine wines to, presumably, playground jokes and football cup finals.
Essentially that is about generating a statement on what we have called a V-scale.
This table shows Scriven’s and the Theorymaker view of what Evaluation is:
| Subject of Report | Not valued (Descriptions) | Valued (Appraisals) |
|---|---|---|
| Anything | Not evaluation according to Scriven or Theorymaker | Scriven’s evaluation |
| Projects | Not evaluation according to Scriven or Theorymaker | Scriven’s evaluation; Theorymaker Evaluation |
Note that even when the reported Variable is valued, the reporting Variable will not usually be valued, at least not necessarily in the same way by the same people.
Theorymaker native speakers agree with Scriven that Evaluations essentially involve reporting Value.
| Type of reporting Variable | Not valued (Descriptions) | Valued (Appraisals) |
|---|---|---|
| Numerical | … | Scriven: summative evaluation? |
| Comparative (intensity) | … | Scriven: summative evaluation? |
| Fuzzy | … | Scriven: formative evaluation? |
About the bottom right-hand cell: of course a Report about the Value or worth of something can be multi-dimensional, i.e. it can consist of several valuing reporting Variables, not necessarily summarised by some Rule into an overall score. But what about a theatre review? Is a theatre review essentially a score on a few scales plus a non-valuative essay? Or is there a way of writing narrative which is essentially valuative without being completely reducible to scores on scales?
The evaluation Rule
The job of the evaluator is to follow the Rule set out in the Evaluation Theory (Wittgenstein xx).
This Rule-following may also be wicked in various ways, for example in the sense that part of the Rule itself (the Rule which defines the evaluation Theory) may be changed iteratively as part of the continuing evaluation process (see section xx). In a real-life evaluation, this can happen in different ways, for example when identifying emergent Variables, unexpected results, etc.
As usual, we can flip our perspective between normative (seeing the Evaluation Theory as a Theory) and descriptive (seeing it as an actual composite Mechanism which includes the actual evaluation team, various partners, pieces of evidence, and so on). The actual Mechanism might well deviate from the Theory’s principles, given fallible evaluators, etc. As in this case both the Mechanism and its Theory share the same Rule, we might find it more convenient to refer to the Evaluation Rule than to either the Evaluation Theory or the Evaluation Mechanism.
So in primitive terms we can see the whole evaluation as another Mechanism with the Evaluand as input and the evaluation Statement as output.
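This idea of the whole evaluation as a Mechanism from Evaluand to Statement can be sketched in code. This is a toy illustration only, not part of Theorymaker itself: the class names, the single reported Variable and the four Levels are all assumptions chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Evaluand:
    """The Project being evaluated, described on some reported Variables.

    Illustrative only: a real Evaluand would carry many reported Variables.
    """
    name: str
    reported: dict = field(default_factory=dict)  # reported Variable -> observed Level


def evaluation_mechanism(evaluand: Evaluand) -> str:
    """A toy Evaluation Mechanism: applies a (hypothetical) Rule mapping an
    observed Level on one valued Variable onto an ordered reporting Variable."""
    levels = ["poor", "adequate", "good", "excellent"]  # an ordered V-scale
    score = evaluand.reported.get("management_quality", 0)  # assumed 0..3
    return f"The project management was {levels[score]}"


print(evaluation_mechanism(Evaluand("Project A", {"management_quality": 2})))
# → The project management was good
```

The point of the sketch is only the type signature: the Mechanism takes the Evaluand as input and returns an evaluation Statement as output; everything interesting lives in the Rule it applies.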
The reporting Statement
Some of the most familiar kinds of Evaluation Statement are ordered Variables with just a few Levels, for example:
The project management was ((poor < adequate < good < excellent))
… and we are also familiar with ratings expressed as percentages, and with combinations of such Variables. For example, it is common for an evaluation ToR to require 1-4 ratings for each of several evaluation criteria, say “Relevance”, “Effectiveness”, etc. Sometimes the Levels may be expressed in a standard way across all criteria, sometimes not. Sometimes the ToR may specify that these sub-Statements are to be synthesised into a global Statement; sometimes the method for doing this is specified and sometimes not.
Global evaluation Statement ((lo-hi)) !Rule: some kind of average
 Relevance ((lo-hi)) (!Rule: what is the Rule?)
  Evaluand ((unlimited))
 Efficiency ((lo-hi)) (!Rule: what is the Rule?)
  Evaluand ((unlimited))
 Effectiveness ((lo-hi)) (!Rule: what is the Rule?)
  Evaluand ((unlimited))
 Sustainability ((lo-hi)) (!Rule: what is the Rule?)
  Evaluand ((unlimited))
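The synthesis step can be made concrete with a minimal sketch. Since the ToR often leaves the synthesis Rule unspecified, the unweighted mean used here is an assumption standing in for “some kind of average”; the criterion names come from the text above.

```python
# The four criteria from the diagram above; the synthesis Rule itself is
# assumed (an unweighted mean), since ToRs often leave it unspecified.
CRITERIA = ["Relevance", "Efficiency", "Effectiveness", "Sustainability"]


def global_statement(ratings: dict) -> float:
    """Synthesise 1-4 ratings on each criterion into a global score.

    Raises if any criterion is unrated, since the Rule gives no way to
    synthesise a partial set of sub-Statements.
    """
    missing = [c for c in CRITERIA if c not in ratings]
    if missing:
        raise ValueError(f"No rating for: {missing}")
    return sum(ratings[c] for c in CRITERIA) / len(CRITERIA)


ratings = {"Relevance": 4, "Efficiency": 2, "Effectiveness": 3, "Sustainability": 3}
print(global_statement(ratings))  # → 3.0
```

Even this trivial Rule forces decisions the ToR may be silent on: whether criteria are weighted equally, and what to do when a sub-Statement is missing.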
On the other hand, we also recognise evaluations in which the evaluation Statement is not a limited Variable. A good example is the Most Significant Change (MSC) evaluation process (Dart and Davies 2003), which is evaluative in nature (? question to self: why?) but whose output is in essence the identification of a narrative around a particular change, due to the implementation, which is seen (normally by the participants themselves) as being of particular significance. So there is no way of delineating in advance all the possible outcomes of the process; yet it is not difficult to recognise an MSC report, and there would also be some consensus about whether or not a given report is a good one.
Dart, Jessica, and Rick Davies. 2003. “A Dialogical, Story-Based Evaluation Tool: The Most Significant Change Technique.” American Journal of Evaluation 24 (2): 137–55.