
Technical Supplement

A.

Rationale for Government Intervention

A1.

Government generally implements expenditure programmes in order to overcome problems caused by 'market failure' or to address social/equity considerations. The following describes a number of 'market failures' and looks at the social/equity rationale.

Market Failure

A2.

The main economic justification for government intervention in an economy rests on the concept of market failure. As a result of market failure the government may choose to restructure, complement or supplement the unrestricted workings of the market economy. Several types of market failure exist:

Public goods - these are goods or services which, if they provide benefits to anyone, provide benefits to everyone. An example of a public good is national defence. An adequate defence system protects everyone, including those who would otherwise choose not to pay for it. If the free market were left to provide national defence, some people would buy the service but others would not, knowing that they would receive equal protection (the "free rider" problem). In a free market system there is no mechanism to compel payment for a public good, since there is no way of preventing a person from receiving the benefits of the good even if they choose not to pay. Only government, through its power to tax, can compel payment.

Externalities - an externality exists where not all of the costs and/or benefits of producing a good or service are reflected in the price of that good or service at its transaction. For example, firms which discharge trade effluent into a lake may impose a negative external cost, in the form of reduced catch, on those who depend on fishing from the lake for their livelihood.

Merit goods - Merit goods are those which it is socially desirable for people to consume. However, without government intervention, individuals might not consume the optimum amount in terms of benefiting society as a whole. Education and vaccination are examples of merit goods. Government usually provides education and vaccination services because it recognises that individuals, left to their own devices, might not choose to consume these.

Natural Monopolies - these are services which were originally provided by the public sector because the start-up capital costs were so high that the private sector was unwilling to undertake the risk involved, for example, the network infrastructure of the water supply, electricity and telecommunications industries. Current economic debate focuses on whether these services should be provided by the public sector or by a commercial concern which is controlled by a regulatory authority, e.g. OFREG, OFWAT, OFTEL.

Incomplete Information and Uncertainty - this form of market failure could lead to insufficient investment for the future if people or firms are unwilling to undertake the efficient amount of risky activities because, for example, there is not an adequate set of contingent or insurance markets for dealing with risk. A complete set of insurance markets would allow risk to be transferred from those who dislike risk to those who are prepared to bear risk at a price, and so build risk into the market system to achieve an efficient outcome. There is also the problem that information is incomplete because gathering information is costly, and this may lead to socially inefficient allocations. For example, a worker who is unaware that exposure to high levels of benzene, as happens in some chemical plants, might cause cancer will be willing to work for a lower wage than if he knew the risks associated with benzene. The firm's production costs will understate the true social cost and the good will be over-produced. Government can intervene, in this case, to regulate health and safety.

Social Priorities - Here the Government's concern is not about the efficiency of the market but rather about the equity or the social justice of the distribution of resources and with the quality of life of members of society. It may, for example, be considered important that income is distributed so that no individual's income falls below a level adequate for basic subsistence. The Government could therefore intervene to bring the distribution of incomes into line with that considered to be just and fair.

B.

The Base Case and Evaluation Analysis

B1.

Evaluation of a programme necessarily involves consideration of what would have happened if the programme had not been implemented, i.e. the Base Case (sometimes referred to as the Counterfactual position). This can be viewed as an attempt to predict an alternative outcome and is essential in assessing whether the activity has changed things for the better or whether they would have improved anyway. This is a crucial aspect of evaluation, bringing out its comparative nature, and it also provides the basis for analysis.

Types of Evaluation Designs
B2.The base case should be constructed within an evaluation design and quantified as far as possible. Its main purpose is to enable comparisons to be made between the actual outcomes and what would have happened in the absence of the programme under consideration. The main types of evaluation design considered here are micro (bottom up) designs as opposed to macro (top down) designs which examine changes in statistical aggregates. The designs considered here are based on combinations of time periods of measurement and the type of group under consideration. Also considered is a 'what if' type of design. A typology of main evaluation designs is summarised in figure B1.

Figure B1: Evaluation Designs

Designs are classified by type of group (rows) and period of measurement (columns: Time Series, Pre & Post Policy, Post Policy Only).

True Control Group

  • Design 1 (Time Series): Comparison of policy affected group and a randomly selected control group. Able to track policy impact over time. Able to check that the 2 groups initially have similar characteristics.
  • Design 2 (Pre & Post Policy): Comparison of policy affected group and a randomly selected control group. Able to check that the 2 groups initially have similar characteristics.
  • Design 3 (Post Policy Only): Comparison of policy affected group and a randomly selected control group.

Non-equivalent control group

  • Design 4 (Time Series): Comparison of policy affected group and a similar as possible control group. Able to track policy impact over time. Able to check that the 2 groups initially have similar characteristics.
  • Design 5 (Pre & Post Policy): Comparison of policy affected group and a similar as possible control group. Able to check that the 2 groups initially have similar characteristics.
  • Design 6 (Post Policy Only): Comparison of policy affected group and a similar as possible control group.

Single group

  • Design 7 (Time Series): Comparison of extrapolated 'policy off' trend with actual 'policy on' trend. Able to track policy impact over time.
  • Design 8 (Pre & Post Policy): 'Before and after' comparison of the policy affected group.
  • Design 9 (Post Policy Only): Comparison of what would have happened to the policy affected group in the absence of the policy with what actually happened to the group.

Description of Designs - True Control Group Designs

B3.

Two truly random groups are selected from the client population. Only one of these actually receives the programme, the other - the control group - does not. Comparisons of the selected output measure(s) can then be made between the two groups in order to determine if the policy or programme has made any significant difference. A comparison of the groups at the pre-policy period either as a time series (design 1) or at single points in time (design 2) can act as a check that the two groups initially do have identical characteristics. However, if there are statistically significant differences in the pre-programme period then it may be necessary to have, for example, a stratified sample based on the characteristics of the programme affected group. If pre-programme implementation measures for the two groups are not available, then the groups will have to be compared in the post-programme implementation period only (design 3).

Non-equivalent control group designs

B4.As with the true control group design, comparisons can be made between the two groups in the post-implementation period only (design 6). However, the designs will be stronger if there are pre-programme measures either as a time series (design 4) or at single points in time (design 5) so as to ensure that the two groups are initially alike in terms of the characteristic being measured.
B5.

In the situation where a time series is available for the non-equivalent group, it may be possible to use trend extrapolation (see paragraph B7) of the group. Here the trend of a non-equivalent control group is applied to the baseline position of the policy affected group to determine the base case situation. This design could be used in the absence of pre-programme time-series data for the policy affected group but selection of an appropriate comparison group is crucial. An example of this is where the number of small businesses in NI was not monitored prior to the programme's implementation, then the trend of small business growth in an area in Britain (the non-equivalent control group) could be applied to the baseline to determine what would have happened in the programme's absence. The use of this design assumes that the non-equivalent control group has not been affected by GB policies.
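The idea of applying a comparison group's trend to the policy affected group's baseline can be sketched in a few lines of Python. This is a minimal illustration only, assuming the proportional change observed in the comparison group carries over unchanged to the policy affected group; the function name and all figures are hypothetical and are not drawn from any actual NI or GB data.

```python
def base_case_from_comparison_trend(baseline, comparison_series):
    """Scale the policy affected group's baseline by the growth observed
    in a non-equivalent comparison group.

    comparison_series[0] is the comparison group's value at the baseline
    date; later entries are its values in subsequent periods.
    """
    if not comparison_series or comparison_series[0] == 0:
        raise ValueError("comparison series must start with a non-zero value")
    start = comparison_series[0]
    return [baseline * value / start for value in comparison_series]


# Hypothetical figures for illustration only.
comparison_area = [200, 210, 220]   # comparison group over three periods
policy_area_baseline = 1000         # policy affected group at the baseline date
base_case = base_case_from_comparison_trend(policy_area_baseline, comparison_area)
print(base_case)
```

The resulting series is the estimated base case; the gap between it and the actual outturn in each period would be the estimated programme effect, subject to the caveat in the paragraph above about the choice of comparison group.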

Single Group Designs

B6.Single group designs involve the policy affected group only. Where time series information on the group is available (design 7) then it is possible to incorporate trend extrapolation into the designs. It is essential in this design to have pre and post policy time series data for the policy affected group. The pre-programme time series is projected forward into the operation period to act as a base case. The base case is then compared with the actual trend to estimate the impact of the programme.
B7. Trend extrapolation designs involve trying to predict an alternative outcome by projecting the patterns (trends) identified before the programme began into a period when the programme is in operation. This involves the assumption that these patterns are an adequate representation of what would have happened in the absence of the policy. It should be recognised that events such as external shocks can make the past a poor predictor of what might have happened and so complicate constructing a base case.
B8.The pre and post policy single group design (design 8) is a simple comparison of measures taken 'before and after' policy implementation. The assumption in this case is that the pre-policy position would have continued in the absence of the policy. The pre-policy situation therefore acts as the base case. However, as with the trend extrapolation approach, the 'before' or pre-policy position may not be a suitable projection of what would have happened in the absence of the policy.
B9.

With the single group post-policy only design (design 9), information is necessary on what would have happened without the policy for the design to be feasible. Usually this type of information can be obtained through a survey of the policy affected group and it may even be possible to derive retrospective data on the pre-policy position and so have a 'before and after' or more appropriately, a 'with and without policy' comparison.

'What if' design
B10.

A further type of design -'what if' - is also possible but it is a design of the last resort where there is little or no information for the pre and post phases of the programme to enable proper comparisons to be made. In this case the key question is 'what would happen if the programme was terminated now?' The design then takes the form of comparing the consequences of stopping the programme with allowing it to continue. The evaluation then becomes like an appraisal (a look forward) rather than an evaluation (a look back).

Selecting and Using designs

B11.The derivation of the base case is crucial to the evaluation as it is the comparison of the base case with the actual outcome which will determine whether the policy is achieving value for money. The evaluator should therefore spend some time deciding on which design is best suited to his requirements. The choice of base case design will often hinge on data availability and whether or not the programme is already in operation.
B12.Ideally an evaluation should be planned and designed before the programme has been implemented. However, in practice the decision to evaluate may only be taken once the programme has been in existence for some time. In this case an evaluation plan will not have been incorporated into the programme's operation and use of certain evaluation designs will not be possible.
B13.The ideal situation is for the evaluation to include a true control group and a policy affected group with measures taken for both groups before and after the policy has been implemented. However, much depends on identifying a suitable comparison group together with a reasonable time series of indicators associated with that group. The pre programme comparison of the two groups serves as a check on whether the groups are at least comparable with respect to whatever the programme intends to achieve or change. True control groups are the strongest type of evaluation design. Comparisons in the pre-programme period should show no significant differences between the two groups if they have been truly randomly selected.
B14.

Comparison of the policy affected group and a non-equivalent control group in a single period (i.e. post programme) is less reliable, as the two groups may not have had identical characteristics in the period immediately prior to the programme's implementation. If, in practice, it proves problematical to find a suitable comparison group, then reliance has to be placed on the single policy affected group and on constructing a base case using trend projection and/or surveys. However, evaluations in which only the policy affected group is measured make interpretation of the results more difficult. Moreover, with respect to trend extrapolation, it should be recognised that, for example, external shocks can complicate the construction of a counterfactual position in the 'policy on' period.

Quantitative and Qualitative research methods
B15. The plan for the evaluation will consider the methods to employ and the measures to use to obtain the information needed for the analysis. Research methods are often divided into two broad categories, quantitative and qualitative.
B16.A quantitative approach emphasises the measurement of outcomes and attributes causal effects by means of comparison. Where possible, an attempt should be made to quantify the outputs of a programme. Quantitative methods may involve measuring the levels of inputs and outputs using existing monitoring records or the collection of information by means of surveys or standardised tests. Comparisons of quantitative indicators generated from surveys, tests or monitoring data are frequently a feature of experimental designs (i.e. with control groups). Surveys may take the form of questionnaires, interviews and observation (questionnaires are the most commonly used). The issues to be considered in survey research include the size of the sample, method of sampling, the acceptable level of sampling error and whether the survey should be one-off or repeated over time.
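The link between sample size and sampling error can be illustrated with a short Python sketch. It assumes a simple random sample, a large population (no finite population correction) and the usual normal approximation for a proportion at the 95 per cent level (z = 1.96); the function name and the sample sizes are illustrative choices, not part of the guidance.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a surveyed proportion.

    Assumes simple random sampling from a large population and uses the
    normal approximation; proportion=0.5 gives the most cautious figure.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


# Compare a few candidate sample sizes (illustrative values only).
for n in (100, 400, 1600):
    print(n, round(margin_of_error(n), 3))
```

Under these assumptions, quadrupling the sample size halves the margin of error, which is one reason the acceptable level of sampling error needs to be weighed against the cost of the survey.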
B17.Quantitative methods can provide reliable measurements and comparisons which can be summarised easily and accepted as representative of the population as a whole. They can also be straightforward to repeat, for example by re-running a survey or test, but have limitations in that they tend not to be able to study respondents in depth nor be adaptable to individual circumstances.
B18. Qualitative approaches emphasise the description and understanding of a programme's operation and effects and are useful for exploring concepts, attitudes and behaviour. They are concerned more with the nature of the programme than with providing quantification. The main research methods used for qualitative work are unstructured in-depth interviews (characterised by open-ended questions), group discussions with operators, participants and decision makers, focus groups, participant observation and case studies. Qualitative work may use direct quotation, careful description and open-ended narrative. This can make analysis difficult as responses are neither systematic nor standardised. However, insights may be gained through in-depth interviews which might not be revealed in a structured questionnaire. A limitation of the qualitative approach is that sample sizes tend to be small, which can prevent wider generalisation from the research.
B19. Whilst qualitative methods permit the evaluation to explore selected issues in detail, quantitative methods fit diverse experiences into predetermined response categories. An advantage of the quantitative approach is that it measures the reactions of a great many people to a limited set of questions, thus facilitating comparisons and statistical aggregation of data. However, while it is tempting to place more importance on the perceived objectivity of statistics, the figures may have limitations. Qualitative methods, on the other hand, can indicate the complexities of the change process and help in understanding how programmes work and how those involved (target groups and providers) view their success and failure. However, care is needed because their selective nature may distort the findings.
B20.

The type of evaluation design chosen and data availability will influence whether the quantitative or qualitative approach is more appropriate, although most evaluations would benefit from a combination of the two. Conclusions which are supported by a range of methods and data sources should be the most reliable. Quantitative and qualitative approaches should therefore be seen as complementary rather than as alternatives and can be used together in addressing different questions within each evaluation. The choice of methods and designs must, however, be made in the context of the questions that need to be answered and the timescale and resources for the evaluation. The selection of the most appropriate research methods can therefore be a difficult task, but the starting point is a recognition that there are options.

Analysing Net Additionality

B21.The aim of the analysis is to assess the programme's effectiveness and to give an indication of what the programme is buying. To achieve this a comparison should be made between the base case and the actual outcome using one of the designs outlined earlier. The difference between the two cases is a measure of the net additionality of the programme.
B22.

Different types of net additionality can be measured:

  • Full additionality is where the programme's benefits are wholly attributable to the programme, i.e. deadweight and displacement are zero.
  • Partial additionality is where the activity would have been carried out earlier, or on a larger scale or to a higher specification or has displaced existing activity.
B23.

The comparison between the base case and the programme's actual outcome, in terms of the output indicator(s) chosen, encapsulates activity that would have occurred in the programme's absence (deadweight) and also activity which has been displaced by the programme's existence (displacement). It also takes account of supplier multiplier and local multiplier effects. These impacts are described in more detail below.

Deadweight
B24.Deadweight is activity that would have occurred regardless of the policy. Deadweight is a difficult concept to measure as the beneficiaries of schemes may be reluctant to admit they would have produced the same outputs without the schemes. Given the difficulty of targeting expenditure, deadweight of 50 per cent or more may often be found. Attempts are often made to improve the targeting of programmes and this should reduce deadweight over time.
Displacement
B25.

Displacement of activity within a local area can occur:

  • through product markets, where the output of a supported programme takes market share from other local firms producing the same or similar goods or services;
  • through factor markets, where a supported programme uses locally scarce factors of production (e.g. certain skills or land) or bids up their prices.
B26.Displacement varies with the programme supported and with the size of the area covered. For some local services e.g. food retailing, hairdressing, vehicle repairs, displacement within a travel-to-work area (TTWA) may be close to 100 per cent. Departments who find that their spatially targeted schemes support such activities need to consider displacement effects carefully.
B27.In order to improve comparability, evaluation studies should provide information on displacement on a local, regional and national (UK) basis, not just the areas of particular interest in terms of the programme concerned. If the policy measure being evaluated applies to more than one area, credit should not be claimed for activities transferred between the areas.
Supplier Multiplier Effects
B28.The supplier multiplier effect results from one industry or sector making purchases from other sectors in the local economy and so boosting employment in these sectors. This process of sectoral interaction continues until the amount of money being re-spent during each round of activity becomes negligible. Evaluation estimates of supplier multipliers, in terms of effects on employment in local labour markets (TTWA or equivalent) have ranged from 1.05 (Enterprise Zones) to 1.11 (Regional Enterprise Grants). Estimates above that range should be supported by robust analysis and empirical evidence.
Income Multiplier Effects
B29.

Additional local activity is likely to raise local income and this will generate additional expenditure in the area adding to local employment. The wider an area is defined, (provided it remains small relative to the total national economy), the higher will be the income multiplier. For small areas it is important to estimate how many of the additional employees are resident within the policy area as those who came in from outside (e.g. for construction work) may spend little of the additional income in the area covered by the policy or programme. For most activities local income multiplier effects are fairly small: estimates are generally around 1.1. Estimates significantly above this will need strong analytical support. Regional multipliers, where relevant, may be larger: estimates have ranged from 1.2 - 1.5. If such an estimate is proposed, supporting analysis will be required for the particular region or wider area.
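Where deadweight, displacement and multipliers have been estimated separately (for example from surveys), one commonly used way of combining them is to scale gross outputs down for deadweight and displacement and then up for the multipliers. The Python sketch below illustrates that arithmetic only. The multiplicative formula is an assumption and is not stated in this guidance, and the function name and input values are hypothetical, although the multiplier values are chosen from within the ranges quoted above.

```python
def net_additional_output(gross, deadweight, displacement,
                          supplier_multiplier=1.0, income_multiplier=1.0):
    """Illustrative adjustment of gross outputs to a net additional figure.

    deadweight and displacement are shares between 0 and 1.
    Assumption (not from the guidance): the adjustments combine
    multiplicatively.
    """
    for share in (deadweight, displacement):
        if not 0.0 <= share <= 1.0:
            raise ValueError("shares must lie between 0 and 1")
    after_deadweight = gross * (1 - deadweight)
    after_displacement = after_deadweight * (1 - displacement)
    return after_displacement * supplier_multiplier * income_multiplier


# Hypothetical inputs: 100 gross jobs, 50% deadweight, 20% displacement,
# supplier multiplier 1.05 and local income multiplier 1.1.
net_jobs = net_additional_output(100, 0.5, 0.2, 1.05, 1.1)
print(round(net_jobs, 1))
```

With these hypothetical inputs roughly 46 of the 100 gross jobs would count as net additional, which shows how quickly deadweight and displacement can outweigh modest multiplier effects.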

Use of Control Groups to measure net additionality

B30.When using control groups (true or non-equivalent) to measure net additionality the output indicator is measured for both groups, preferably before the policy is implemented and during and after the policy implementation period.
Example:
B31.In February 1990, a one year training programme was introduced aimed at reducing the number of young long term unemployed in North Belfast. Prior to the programme's implementation, there were 550 long term unemployed under the age of 21 registered in the area. Of 50 trainees randomly selected to receive the programme 10 became employed on completion of the training programme. To determine the extent to which the programme contributed to the trainees finding employment, the evaluator must examine the performance of the 500 who did not receive the programme (the control group). It was found that 10% of this group found employment over the period of the programme.
B32.The effect of the programme, in terms of the output indicator chosen i.e. employment, is the difference between the control group and the policy affected group at the time of the evaluation. Assuming that the performance of the trainees would have been the same as that of the control group, then 10% i.e. 5 of the trainees would have obtained employment regardless of the programme (the base case). Comparing the base case with the actual outcome it can be concluded that the programme resulted in an additional 5 of the target group gaining employment.
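The arithmetic of the North Belfast example can be written out as a short Python sketch. The figures are those given in the example; the function and variable names are illustrative, and the calculation assumes, as the text does, that the trainees would otherwise have performed like the control group.

```python
def control_group_additionality(policy_n, policy_successes,
                                control_n, control_successes):
    """Estimate net additional outcomes from a control group comparison."""
    control_rate = control_successes / control_n
    policy_rate = policy_successes / policy_n
    # Base case: outcomes expected in the policy group without the programme.
    base_case = control_rate * policy_n
    return {
        "control_rate": control_rate,
        "policy_rate": policy_rate,
        "base_case": base_case,
        "net_additional": policy_successes - base_case,
    }


# Figures from the example: 50 trainees (10 found work) and a control
# group of 500 (10% of whom, i.e. 50, found work).
result = control_group_additionality(50, 10, 500, 50)
print(round(result["base_case"]), round(result["net_additional"]))
```

The sketch reproduces the base case of 5 trainees and the net additional 5 described in the example.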

Figure B2: Control Group Analysis

| Time period | Output Indicator | Control Group | Policy Group |
| t0 | Long term unemployed under 21 | 500 | 50 |
| t1 | Long term unemployed under 21 finding work | 50 | 10 |
|    | % finding work | 10% | 20% |

t0: Time period prior to policy/programme implementation

t1: Time period at time of evaluation

B33.The pre-policy or programme measures act as a check on the results and are essential when using a non-equivalent control group in order to ensure that the control group and policy affected group are initially comparable.
B34.

Using this approach will not allow separate identification of deadweight, displacement or the multiplier effects for the indicator chosen. Moreover, it should be recognised that the programme may impact on other factors not covered by the indicator(s) chosen, for example, it may impose external costs or benefits on third parties.

Use of Trend Extrapolation to Measure Net Additionality

B35.When a single group design is being used to measure net additionality, trend extrapolation may be possible if a time series of the output indicator is available. The effect of the programme is estimated, as described earlier, by extrapolating the policy off period trend of the indicator onto the policy on period and comparing this projected trend with the actual outcome. Again the difference encapsulates the elements of deadweight, displacement and multiplier effects.
B36.A variation of the trend extrapolation is to base the projection of the policy-on period on a non-equivalent group. This method would overcome the problem of not having a complete time series in the policy-off period but is a weaker approach.
Example:
B37. In December 1985 a programme was introduced aimed at increasing the number of small businesses in NI. In 1995 it is decided to evaluate the programme. The number of small businesses immediately prior to the programme's implementation (the baseline) is known to have been approximately 1,280. The number of small businesses in 1995 is approximately 1,560, an increase of 280. The evaluator has to determine how much of this increase was due to the programme, i.e. how much was net additional. If, prior to the programme's implementation, the number of small businesses rose on average by 1% pa, then projecting the trend would result in approximately 1,420 small businesses in 1995. Comparing the base case with the actual outcome, it can be concluded that over the operation period the programme has resulted in 140 small business start ups.
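The projection in the small business example can be sketched in Python as follows. The sketch assumes the 1% pa trend compounds annually over the ten years from 1985 to 1995; the example does not say how the trend was projected, and its figures are rounded, so the results here (a base case of about 1,414) differ slightly from the approximate 1,420 and 140 quoted. The function name is illustrative.

```python
def extrapolate_trend(baseline, annual_growth_rate, years):
    """Project a baseline forward at a constant compound annual rate."""
    return baseline * (1 + annual_growth_rate) ** years


baseline_1985 = 1280      # small businesses before the programme
actual_1995 = 1560        # small businesses at the time of evaluation
base_case_1995 = extrapolate_trend(baseline_1985, 0.01, 10)
net_additional = actual_1995 - base_case_1995

print(round(base_case_1995), round(net_additional))
```

Whichever projection method is used, the estimate is only as good as the assumption that the pre-programme trend would have continued in the programme's absence.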

Figure B3: Single group trend extrapolation

B38.

As with the control group analysis, this approach will not allow separate identification of deadweight, displacement or the multiplier effects for the indicator chosen. Moreover, it should be recognised that the programme may impact on other factors not covered by the indicator(s) chosen, for example, small business growth might be obtained at the expense of a reduction in larger businesses. This will not be picked up by the estimate of the base case but it should be considered in the analysis.

Identifying deadweight, displacement and multiplier effects
B39. The use of surveys, or the use of results from previous relevant studies, can provide information about deadweight, displacement and multiplier effects. Surveying those in receipt of the programme (e.g. UDG recipients), interviewing local businesses not in receipt of the programme and interviewing the wider community may indicate the extent of deadweight, displacement and income and supplier multiplier effects. Qualitative aspects can also be gauged using surveys. Again, surveying true control groups will generate more robust results than surveying the policy affected group only.
Ex-Post Cost-Benefit Analysis
B40. So far the analyses have been couched in terms of one indicator (usually the key output indicator for the programme concerned). In reality there are likely to be a number of key indicators, and a useful way to present these is within an ex-post cost benefit framework. With Cost-Benefit Analysis, all relevant costs and benefits over time associated with the programme can be compared with the base case (see figure B4). This is particularly useful if the programme being evaluated has multiple objectives. Where quantification is difficult, items can be listed and considered within the CBA framework alongside the quantified items. It also allows for the analysis of efficiency and effectiveness measures.
B41.

The framework for an ex-post CBA will normally follow a sequence similar to that outlined for evaluation:

  • Construct the base case;
  • Identify, quantify and, where possible, value the costs and benefits of the programme;
  • Weigh up the uncertainties;
  • Compare the outturn with the base case;
  • Present the results in terms of whether the programme has represented value for money and the extent to which the market failure which was the original justification for the programme has been resolved.
B42.

Costs and benefits covered by an evaluation will often include:

  1. initial capital costs: including buildings, equipment and land;
  2. capital cost of any buildings or equipment which need to be replaced over the evaluation period;
  3. residual values of capital assets at the end of the evaluation period;
  4. operating costs over the whole term of the evaluation;
  5. other costs or benefits which can be valued in money terms, in the form of revenues, cost savings or non-marketed outputs;
  6. measures or descriptions of those costs or benefits which cannot be valued in money terms.
B43. In wider evaluation, any significant costs and benefits which have affected other parts of the public sector or the private sector should be included and separately identified. Expenditure may have led to both gainers and losers, and information on how the costs and benefits were distributed among different organisations, sectors of the economy, or individuals can be an important part of the evaluation.
B44.Where costs and benefits can be valued, the basis for valuation should be their economic cost, i.e. their 'opportunity cost', which is the value of the resource in the most valuable alternative use. This is usually given, near enough, by market values.
B45.Economic costs do not necessarily involve spending or receiving cash, for example, an organisation may already own an asset which, if not employed in the policy or project could have been used for other purposes or sold. Use of this asset therefore has an opportunity cost.
B46.Any important costs and benefits which cannot be valued in monetary terms should at least be recorded and whenever possible quantified. The money values of costs and benefits should normally be expressed in 'real terms' at the general price level applying when the evaluation is carried out. In the absence of relative price changes, general inflation simply raises all cash values by a given percentage and thus it is convenient to express all costs and benefits at the same general price level.
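Expressing cash values at a common general price level amounts to rescaling each year's figure by the ratio of the price index in the chosen base year to the index in the year the cash was spent or received. The Python sketch below illustrates this; the price index and cash figures are hypothetical, and the function name is illustrative.

```python
def to_real_terms(cash_by_year, price_index, base_year):
    """Restate cash values at the general price level of base_year."""
    base = price_index[base_year]
    return {
        year: cash * base / price_index[year]
        for year, cash in cash_by_year.items()
    }


# Hypothetical general price index and cash costs, for illustration only.
price_index = {1993: 100.0, 1994: 103.0, 1995: 106.0}
cash_costs = {1993: 1000.0, 1994: 1030.0, 1995: 1060.0}

real_costs = to_real_terms(cash_costs, price_index, base_year=1995)
for year, value in sorted(real_costs.items()):
    print(year, round(value))
```

In this illustration the cash costs rise exactly in line with the index, so every year comes out at the same value in 1995 prices: the apparent increase in cash terms is entirely general inflation.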

Figure B4: EX-POST CBA

Example: Enhanced Skills Training Programme

|  | Base case (What would have happened without the programme) | Programme outturn (Actual outcome of the programme) | Comparison (Difference between Base case and outturn) |
| Direct Costs | Capital (£) | Capital (£) | Extra capital costs |
|  | Current (£) | Current (£) | Extra running costs |
| Direct benefits/outputs |  |  |  |
| Intermediate output | No. persons trained | No. persons trained | Extra people trained |
| Final output | No. trainees finding work | No. trainees finding work | Additional no. of trainees who find work as a result of enhanced training (this is net of deadweight, i.e. those who would have found work regardless of training) |
| Final output in money terms | Value of jobs (£) | Value of jobs (£) | Value of additional jobs (£) |
| Indirect effects (+/- spin-offs) |  |  |  |
| Displacement | Displacement of jobs as skilled workers enter the labour force (-) | Displacement of jobs as extra skilled workers enter the labour force (-) | Displacement (-) |
| Supplier Multiplier | Supplier multiplier (+) | Supplier multiplier (+) | Net Supplier multiplier (+) |
| Local income Multiplier | Local income multiplier (+) | Local income multiplier (+) | Net Local income multiplier (+) |
| Net Total (£) | Net value of base case output | Net value of programme output | Net value of additional output |

Outputs which cannot be valued in money terms: List significant items

- possibly compile an impact statement of the effect of each item**

- it may be feasible to use weighting and scoring to combine a number of outputs into a single overall measure even though they cannot be valued in money terms

*Assume programme runs at former level of provision.
** See HMT, Appraisal and Evaluation in Central Government, "The Green Book".
