1.4.6 Overview of Evaluation

by Marie Baehr (Vice President for Academic Affairs, Coe College) and
Yolanda L. Watson (Educational Consultant, Anthropology & Sociology)

Evaluation is a process for determining the quality of a performance or a product. An effective evaluation process uses reliable data, measures performance against predefined benchmarks, and monitors product or performance outcomes. This module outlines the principles of evaluation (Table 1), discusses appropriate uses of evaluation within higher education, and describes the factors that can affect the quality of an evaluation (for detailed steps for conducting an evaluation, see 1.4.7 Evaluation Methodology).
Table 1 Principles of Evaluation
  1. Evaluation focuses on the level of quality of a product or performance based on established standards and criteria.

  2. Evaluation is optimized when the performer is fully aware of the performance criteria, consequences, and rewards associated with a given performance.

  3. Evaluation criteria should be consistent throughout the evaluation, and should not be changed during the evaluation to reactively meet the needs of either the evaluator or the performer.

  4. An evaluation is optimally effective when the performer (or product) has been given sufficient opportunity to fully perform in the roles to be evaluated, based on pre-defined evaluation criteria.

  5. In order for evaluative judgments to be accepted and perceived as sound, evaluation techniques should be perceived as fair and trusted by both the evaluator and the performer.

  6. The evaluator should be able to appropriately use or develop an evaluation tool that can measure the level at which performance criteria are met.

  7. Evaluation systems should be assessed after each evaluation.

Discussion of the Principles of Evaluation

1. Evaluation focuses on the level of quality of a product or performance based on established standards and criteria.

Evaluation takes a retrospective look at a given process, program, or individual and, based upon preestablished standards, judges its utility, its value, or its applicability. It attempts both to measure performance and to monitor progress toward preestablished benchmarks. Evaluation is heavily data-driven and requires that a definitive decision be made about the process, program, or individual against fixed criteria (Chatterji, 2004). The purpose of evaluation is to determine the level of quality of a performance, regardless of the skills used or needed for the performance. The evaluator’s knowledge of the optimal performance level can be used in making judgments or decisions or in tracking the performer’s progress.

2. Evaluation is optimized when the performer is fully aware of the performance criteria, consequences, and rewards associated with a given performance.

Because the performer is rarely involved in designing the evaluation process, the evaluator should fully and clearly communicate to the performer the purposes and performance expectations of the evaluation, the decisions that will be made based on the findings, and the performance criteria upon which the evaluation will be based. In most instances the performer has no idea, or at best a limited conception, of the purposes of the evaluation and the current or future uses of its outcomes; the entire exercise, therefore, appears to be evaluation for evaluation’s sake (Chatterji, 2004; Rossi, Freeman, & Lipsey, 1999). When evaluation criteria are fully disclosed before the evaluation begins, the performer knows the basis for the decisions to be made about his or her future performance and can optimize that performance against the predefined criteria.

3. Evaluation criteria should be consistent throughout the evaluation, and should not be changed during the evaluation to reactively meet the needs of either the evaluator or the performer.

Although it is sometimes tempting to deviate from the original performance criteria, once the performance criteria are established and communicated to the performer, all decisions associated with the evaluation should be based upon these criteria, and these criteria only. If the preestablished criteria do not clearly or sufficiently link to the decisions that must be made, and the evaluator believes additional performance criteria are needed, the evaluator should defer the evaluation rather than evaluate on uncommunicated criteria: first confer with the performer about the need to change the performance criteria, then develop buy-in from the performer on the new criteria, and only then initiate a new evaluation.

4. An evaluation is optimally effective when the performer (or product) has been given sufficient opportunity to fully perform in the roles to be evaluated, based on pre-defined evaluation criteria.

An evaluation process should not be initiated without proper planning, preparation, and communication of expectations to the performers. Toward this end, a performer should be given sufficient time to demonstrate performance levels in the areas set forth in the evaluation criteria. The evaluation should not emerge as a reactive tactic to penalize or punish a performer for observed substandard performance, or as a response to a newly initiated external or internal mandate; such reactionary evaluations impinge upon the performer’s ability to demonstrate true performance levels. Nor should the evaluation occur in the midst of the performance; it should instead occur at the conclusion of a predefined performance time frame.

5. In order for evaluative judgments to be accepted and perceived as sound, evaluation techniques should be perceived as fair and trusted by both the evaluator and the performer.

Because evaluation decisions often have long-term effects, it is critical to both the evaluator and the performer that the decisions be based upon valid information. When data collection is done well and without bias, the performer may not be happy if a negative evaluation decision is rendered, but he or she will be more likely to accept the decision and use it to improve future performance if the evaluation is based upon fair, data-driven evidence. The evaluator should therefore collect evaluation data that address all areas of the performance, and should ensure that these data are reliable, quantifiable, and generalizable to the performer’s overall performance as well as to the expected performance levels for the particular task.

6. The evaluator should be able to appropriately use or develop an evaluation tool that can measure the level at which performance criteria are met.

Many fee-based and free evaluation tools are readily available for conducting evaluations. Evaluators should review and test an array of potential tools to find the most appropriate fit for the performance or product being evaluated. If no appropriate tool exists, the evaluator should develop one that uses appropriate scaling techniques, has construct and internal validity, and yields quantifiable results based on the preestablished performance criteria (Jacobs & Chase, 1992).
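
To make this concrete, the sketch below shows one minimal way a locally developed tool might turn rubric ratings into a quantifiable result. The criteria, weights, rating scale, and benchmark are hypothetical; in practice they would be derived from the actual preestablished performance criteria and checked for validity before use.

```python
# Minimal sketch of a rubric-style evaluation tool (hypothetical criteria and weights).
# Each criterion is rated on a 1-5 scale; the weighted total is compared to a
# preestablished benchmark that was communicated to the performer in advance.

from dataclasses import dataclass

@dataclass
class Criterion:
    name: str
    weight: float   # relative importance; weights should sum to 1.0
    score: int = 0  # rating on a 1-5 scale, assigned by the evaluator

def weighted_total(criteria):
    """Combine criterion ratings into a single quantifiable result."""
    return sum(c.weight * c.score for c in criteria)

# Hypothetical performance criteria, defined before the evaluation begins.
rubric = [
    Criterion("Clarity of written work", weight=0.4),
    Criterion("Use of evidence", weight=0.4),
    Criterion("Timeliness", weight=0.2),
]

BENCHMARK = 3.5  # preestablished standard on the 1-5 scale

# The evaluator assigns ratings after observing the full performance period.
rubric[0].score, rubric[1].score, rubric[2].score = 4, 3, 5

total = weighted_total(rubric)
print(f"Weighted score: {total:.2f} -> "
      f"{'meets' if total >= BENCHMARK else 'does not meet'} the standard")
```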

7. Evaluation systems should be assessed after each evaluation.

No matter how carefully an evaluator develops an evaluation process, there is always potential room for improvement. After each use of an evaluation system, therefore, the performer and evaluator should assess the evaluation’s areas of strength and areas for improvement, based on preestablished performance criteria.

Things to Consider When Conducting an Evaluation

1. Evaluation of one performance can be used to evaluate or assess another performance.

While there are many different reasons to evaluate, there are two main types of evaluation: direct and indirect. Both are used, and both are useful, within higher education.

Direct Evaluation

In a direct evaluation, quality is determined using artifacts collected directly from the performer’s performance. Examples of direct evaluations include job performance evaluations and grades earned in a course. Direct evaluations can be used to collect evidence of the quality of a performance or product, and the resulting judgment of quality can have a range of consequences. In a job evaluation, for example, the performer could receive a pay raise or a disciplinary action. In the evaluation of student performance in a class, a student can earn a passing or failing grade; this grade affects his or her grade point average and eventually his or her ability to graduate. In a tenure decision, a candidate may receive or be denied tenure. If no consequences are attached to the outcome of an evaluation, it usually makes little sense to evaluate.

Indirect Evaluation

Indirect evaluation occurs when the quality of one group’s performance (Group A) can be determined only by studying the performance of another group (Group B): Group B’s performance must be evaluated in order to judge the quality of Group A’s performance. The decision-making process resulting from an indirect evaluation does not directly affect the performers whose work is studied.

For example, a college or university’s performance, based on accreditation standards and criteria, is determined by the performance of the faculty, staff, and students. A visiting accreditation committee evaluates the performance of the key stakeholders within the academic setting. This evaluation of others is used to evaluate the overall performance of the institution, and conclusions are made regarding the quality of its educational offerings.

2. Many exogenous and endogenous factors may impact the results of an evaluation.

Context, environment, and circumstances are all key factors in the evaluation process. When a faculty member evaluates students in a course via a mid-term examination, for example, he or she should take into account testing conditions, student exam preparation time, student majors, and other prior knowledge that may help explain a large standard deviation in the exam scores.
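
As a small illustration, the following sketch (with invented scores) computes the mean and standard deviation of a set of exam results and flags a wide spread as a prompt to examine such contextual factors before interpreting the scores.

```python
# Quick check of score dispersion on a mid-term exam (scores invented for illustration).
from statistics import mean, stdev

scores = [58, 92, 74, 61, 88, 95, 47, 83, 70, 66]

avg = mean(scores)
spread = stdev(scores)
print(f"mean = {avg:.1f}, standard deviation = {spread:.1f}")

# A large spread relative to the mean suggests the evaluator should consider
# testing conditions, preparation time, and prior knowledge before drawing conclusions.
if spread > 0.2 * avg:
    print("Wide spread: review contextual factors before interpreting the results.")
```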

3. Standards can be applied based on either established criteria or norms.

Criterion-Based Standards

With criterion-based standards, quality is defined ahead of time and benchmarks are set before the evaluation takes place, independent of the sample being evaluated. For example, many colleges and universities set a minimum ACT or SAT score that students must achieve in order to meet admission standards.

Norm-Referenced Standards

With norm-referenced standards, the benchmarks for quality are affected by the sample being evaluated. For example, when scholarships or fellowships are awarded, students are often selected for having “the best” credentials in the applicant pool, irrespective of the overall quality of that pool; from one year to the next, what counts as “best” can vary widely. Grades determined on a curve are another example of norm-referencing: “excellent” is defined by the quality of work of the students in the class, not by a predetermined standard.
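
The contrast between the two kinds of standards can be sketched in a few lines of code. In the illustration below, the scores, the fixed cutoff, and the “one standard deviation above the mean” rule are all invented: the criterion-based judgment compares each score to a benchmark set before the data were seen, while the norm-referenced judgment defines “excellent” relative to the pool itself.

```python
# Criterion-based vs. norm-referenced standards (scores invented for illustration).
from statistics import mean, stdev

scores = {"Ana": 91, "Ben": 78, "Cy": 84, "Dee": 69, "Eli": 88}

# Criterion-based: the benchmark (here, 85) is fixed before looking at any scores.
CUTOFF = 85
criterion_pass = {name for name, s in scores.items() if s >= CUTOFF}

# Norm-referenced: "excellent" is defined by the pool itself, e.g. at least one
# standard deviation above the mean of this particular group of performers.
avg, spread = mean(scores.values()), stdev(scores.values())
norm_excellent = {name for name, s in scores.items() if s >= avg + spread}

print("Meets fixed criterion:", sorted(criterion_pass))
print("Excellent relative to this pool:", sorted(norm_excellent))
```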

4. Good evaluation should distinguish between effort and performance.

Evaluation methods and criteria must take into account how much of the judgment will be based upon effort and how much will be based upon performance. Oftentimes, net change (versus outcomes only) can also be evaluated. For example, a student taking English 201 may not have earned a passing grade on a portfolio examination, even though he or she exerted a tremendous amount of effort in preparing and presenting the portfolio. Though the substandard work itself will of course be taken into consideration in the evaluation, the student’s effort should not be fully discounted.

5. Problems can arise when the evaluator is not committed to the evaluation system in place.

The evaluator must be able and willing to collect the information needed for the evaluation. If not, the information collected can be incomplete, of poor quality, or misleading. For example, in a general education program an instructor must often collect information on the level of student learning in order to indirectly assess the curriculum and instruction. Unless the instructor sees value in knowing whether or not objectives have been met, the evaluations may be carried out haphazardly and therefore yield unreliable results.

6. Consistent follow-through is an important component for fostering an unbiased and trustworthy evaluation system.

If an evaluation system is put into place in which all persons involved are aware of the potential decisions and actions that may result from the evaluation, it is important that the decisions be made in a timely and professional manner, even if they are unpopular. When such follow-through does not occur, it is difficult to gain buy-in for the evaluation process from organizational stakeholders, and prospective participants may perceive it as inconsistent and ineffective. Inadequate follow-through can thus decrease morale and support for the evaluation effort.

7. The evaluator must set high, but realistic targets for performers within an evaluation that allow for success.

Setting the bar too low for defining success can diminish an evaluated performer’s future effectiveness; setting it too high can quell the aspirations of capable people who would otherwise render valuable performances. For example, it is important to set realistic standards for a faculty position prior to creating a job description and interviewing applicants. If the standards are set too high, no evaluated applicant will be deemed qualified for the job; if they are set too low, a person lacking necessary skills might be hired as a result of the evaluation process.

8. The time and cost required to conduct an evaluation must be considered.

The timeline established for an evaluation should be flexible, though every attempt should be made to adhere to it. When planning an evaluation process, therefore, it is important for the evaluator to carefully consider the time frame and the associated costs. Prior to initiating the evaluation, the evaluator is urged to conduct a quick cost-benefit analysis, which will allow stakeholders to determine whether the decisions that will be made as a result of the evaluation, and any anticipated action steps, are worth the expected costs of the endeavor.
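
A back-of-the-envelope version of such a cost-benefit check might look like the following sketch. Every figure in it is invented; the point is only that estimated costs (evaluator time, instruments, reporting) are tallied against a rough estimate of the value of the decisions the evaluation will support.

```python
# Back-of-the-envelope cost-benefit check for a planned evaluation (figures invented).

costs = {
    "evaluator hours (40 h at $50/h)": 40 * 50,
    "instrument licensing": 300,
    "data collection and reporting": 450,
}

# Rough estimate of what the resulting decisions are worth to the institution,
# e.g. avoided rework, better allocation of resources, or satisfied mandates.
expected_benefit = 4000

total_cost = sum(costs.values())
print(f"Estimated cost: ${total_cost}, expected benefit: ${expected_benefit}")
print("Proceed" if expected_benefit > total_cost else "Reconsider scope or timeline")
```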

Concluding Thoughts

Evaluation, when done well, can be a very powerful tool for determining quality and comparing it to a desired standard. The results of a strong evaluation process can inform an organization’s decision-making and help make institutional stakeholders accountable for their performances (Joint Committee on Standards for Educational Evaluation, 1994). A by-product of good evaluation practice is that performance expectations are explicit and tough decisions are made fairly and consistently. Other possible benefits include more efficient use of resources and easier reporting to external agencies. Regular evaluation also allows performers to self-assess and to make improvements toward long-term targets that are in the best interest of both the individual and the organization.

References

Chatterji, M. (2004). Evidence on ‘what works’: An argument for extended-term mixed-method (ETMM) evaluation designs. Educational Researcher, 33(9), 3-13.

Jacobs, L. C., & Chase, C. I. (1992). Developing and using tests effectively: A guide for faculty. San Francisco: Jossey-Bass.

Joint Committee on Standards for Educational Evaluation. (1994). The program evaluation standards. Thousand Oaks, CA: Sage.

Rossi, P. H., Freeman, H. E., & Lipsey, M. W. (1999). Evaluation: A systematic approach. Thousand Oaks, CA: Sage.