Chapter 8

Student Evaluation

Learning Objectives :
  • Enumerate the functions of evaluation in the teaching-learning process.
  • State the types of evaluation and their uses.
  • Prepare a table of specifications for your subject area.

 

Terminology

Have you ever wondered what would happen if the human body did not secrete adrenaline and other hormones in response to hypoglycemia ? The results would be simply catastrophic. Let us see what the body does : it 'measures' the level of glucose, evaluates it in the context of the body's needs and then sends corrective signals through counter-regulatory mechanisms. This is exactly what we do when we evaluate students. We measure their performance, compare it with previously decided standards and take corrective action in case of any deviation. One important corollary of this sequence is that evaluation is concerned not just with proving a certain point but with improving the whole educational process.

Before proceeding further, let us clarify certain terms that are commonly used in this context.

Measurement refers to the application of mathematical tools for finding the degree of achievement. Awarding marks to an MCQ is an example of measurement.

Assessment is used for those attributes which do not lend themselves to precise measurement and where some subjective decision is involved. Marking of essay type questions is an example of assessment.

Evaluation is a broad term and involves passing a value judgement based on the information obtained from measurement and assessment.

Importance

Let us now come back to the discussion on evaluation. Most of us take evaluation to be synonymous with end-of-course evaluation, carried out with the intention of classifying students as pass / fail. However, this is not so. Evaluation is considered a major curricular component, on a par with educational objectives and learning experiences. For a minute, refer back to the educational spiral and you will notice that evaluation is influenced by, and in turn influences, the other two curricular components. In other words, apart from its pass / fail function, evaluation also serves to modify the objectives as well as the learning experiences.

[Figure : importance.gif]

Functions

You may now be wondering about those other 'functions' of evaluation. Let us have a look at them :

  • Diagnosis : The results obtained from evaluation serve to diagnose areas which have not been properly learnt and which require remedial measures.

  • Prediction : Most aptitude tests rely on the predictive utility of evaluation, with the underlying assumption that a candidate scoring high on these tests will also do well in real-life situations.

  • Selection : Entrance tests to MBBS (and other professional courses) make use of this function.

  • Grading : We use evaluation to rank order the students of any given class for prizes, scholarships etc.

  • Programme evaluation : As already stated, evaluation helps us to modify a programme and make it more cost effective.

Types of Evaluation

Let us now look at certain other terms used in the context of the purpose of evaluation. These include :
  • Formative evaluation is used to help the learner and the teacher know the progress of the student in an informal way and take remedial action in case of any difficulty. Questions asked during the course of teaching, class tests and quiz programmes are all examples of formative evaluation. Since the basic purpose of formative evaluation is to help the learner know about his progress, the results of formative evaluation should never be used for a final pass / fail decision. If this is done, then learners may try to hide their weaknesses and the very purpose of formative evaluation may be lost.

  • Internal evaluation is the term used when evaluation is carried out by the same teacher who has taught the subject. To be meaningful, this evaluation has to be of a continuous nature. You will find a more detailed discussion on internal assessment in a later chapter of this book.

  • Summative evaluation refers to the end-of-term or end-of-course evaluation. University professional examinations are examples of summative evaluation.

You will notice that, working on this principle, the final pass / fail decision has to be made after taking into consideration the performance on internal assessment as well as summative evaluation. For the sake of simplicity, the inter-relationship of the various types of evaluation can be represented by the following diagram :

[Figure : formative.gif]

Abilities to be evaluated

Having discussed some of the points regarding evaluation, the next question that we face is how to evaluate. Should it be a paper consisting of either MCQ or essay type questions ; should it concentrate more on practical aspects ; should it have a varying proportion of the two ? The answer to this question is provided by the objectives that we have set for a given course. For a medical student, simply knowing about a disease is not enough - he should be adept not only at performing practical procedures but also at relating to the patient and his family members. You will recall the discussion we had in the chapter, Educational Objectives. It is thus imperative that a medical student is evaluated on all the 3 domains of learning viz. knowledge, practical and communication skills. Here we would like you to recall one more term i.e. table of specifications. Essentially, a table of specifications is a grid which lists subject matter on one axis and the weightage given to various domains on the other axis. This makes it very easy for the teacher to decide the percentage of total marks which has to be allotted to knowledge, practical and communication skills.

Weightage to abilities

You will appreciate that the percentage allotted to various components will vary with the subject area under consideration. For example, while evaluating the student on antibiotics, knowledge and practical skills are important, while for evaluating him on history taking, practical and communication skills need to be given more weightage. You should not go away with the idea that this table of specifications is arbitrarily prepared - rather, it requires a lot of effort and discussion amongst subject experts to arrive at a consensus ; however, the advantages it offers outweigh the effort involved. You can also initiate discussions in your own department to reach this consensus.

Evaluation is not the end - rather it is the means to further the effectiveness of an educational programme. To make full use of the intended purpose of evaluation, it should be undertaken after careful planning, especially keeping the educational objectives in mind. The evaluation tools should be appropriate for the learning outcomes to be evaluated. An appropriate evaluation tool has the following characteristics :

Characteristics

(a) Validity : A tool is valid if it measures what it purports to measure. Thus, a weighing scale for taking weight and a ruler for measuring length are examples of valid tools.

Types of Validity

In terms of educational evaluation, we are concerned with the following types of validity :

(i) Content Validity : This is the most important criterion for the usefulness of a test. It indicates how closely the contents of a test match the content of teaching. For obvious reasons, you cannot include all that is taught in a question paper. Sampling of questions is the key to building content validity into a test - the more representative the sample, the more content validity a test has. The easiest and most efficient way to build in content validity is to prepare a table of specifications and then choose questions accordingly. Take a look at the following example :

Topic      Weightage
A          15%
B          10%
C          5%
D          20%
E          30%
F          20%
Total      100%

If we intend to give a test paper of, say, 100 marks, then 15 marks should be allotted to questions from topic A, 10 to topic B and so on.
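The allocation described above is simple proportional arithmetic, and a short Python sketch may make it easier to reuse for papers that do not carry exactly 100 marks. The function name `allocate_marks` and the dictionary layout are our own illustrative choices, not part of any standard tool ; the weightages are the ones from the example table.

```python
def allocate_marks(weightage_percent, total_marks):
    """Split total_marks across topics in proportion to their weightage.

    weightage_percent maps each topic to its percentage weightage.
    The function name and layout are illustrative assumptions.
    """
    if sum(weightage_percent.values()) != 100:
        raise ValueError("weightages must add up to 100%")
    return {
        topic: total_marks * percent / 100
        for topic, percent in weightage_percent.items()
    }


# Weightages taken from the example table of specifications.
weightage = {"A": 15, "B": 10, "C": 5, "D": 20, "E": 30, "F": 20}

marks_for_100 = allocate_marks(weightage, 100)
marks_for_50 = allocate_marks(weightage, 50)

for topic in weightage:
    print(topic, marks_for_100[topic], marks_for_50[topic])
```

For a 50-mark paper some allocations come out as fractions (topic A gets 7.5 marks), so in practice the paper setter would round these off while keeping the total unchanged.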

(ii) Criterion related validity refers to validity in relation to an external criterion. This criterion may be a set of concurrent data or a future performance. Let us take an example. A group of students has been rated "excellent" by the class teacher. If we administer a test to them and they score well on this also, then the test has a high concurrent validity. On the other hand, if we use a test to select house doctors and those performing well on it also turn out to be efficient house doctors, then the test has a high predictive validity. You should be aware that, unlike content validity, criterion validity can be calculated and numerically expressed.
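The chapter does not say how criterion validity is calculated. One common approach, which we assume here, is to compute the correlation coefficient between the test scores and the criterion measure. The sketch below does this in Python for the concurrent validity example ; the scores and teacher ratings are made-up numbers for illustration only, and `pearson_r` is our own helper name.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when all values are identical")
    return covariation / sqrt(spread_x * spread_y)


# Hypothetical data : test scores of six students and the class teacher's
# rating of the same students on a 1 to 5 scale (the external criterion).
test_scores = [55, 62, 70, 48, 81, 66]
teacher_ratings = [3, 3, 4, 2, 5, 4]

validity_coefficient = pearson_r(test_scores, teacher_ratings)
print(round(validity_coefficient, 2))
```

A coefficient close to +1 would suggest that the test ranks students in much the same way as the criterion does. For predictive validity, the second list would hold a measure of later performance instead of a concurrent rating.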

We shall now proceed to look at another equally important attribute of a test i.e. reliability.

(b) Reliability refers to consistency of measurement. The degree of reproducibility determines the reliability of an evaluation tool. Unlike validity, where some subjective judgement may be involved, reliability is strictly a mathematical concept and is numerically expressed.

Types of reliability

There are various measures of reliability, some of which include :
  • Test-retest reliability : This is the degree of consistency in the results of a test which is administered twice to the same group of students, provided no additional learning has taken place. You would appreciate that in practice, this is a difficult condition to meet. Moreover, the practice effect may distort the results.

  • Equivalent-forms reliability means consistency of results when two tests of the same content and difficulty level are administered to the same group of students.

  • Split halves reliability is a measure of internal consistency or stability of a test. The entire test is divided into two parts (first half / second half or odd / even items) and the correlation between the scores obtained on the two parts is calculated.

  • Marker reliability is the degree of consistency when a test paper is independently marked by two different examiners.

(c) Feasibility is the third important characteristic of a test. Take the example of a practical test. The ideal situation would be to actually observe a student doing a lumbar puncture or setting up an IV drip, but that may not be feasible in actual practice.
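To make the split halves procedure described under the types of reliability more concrete, here is a minimal Python sketch using the odd / even item split. The item scores are invented for illustration, and the helper names are our own. The final step, the Spearman-Brown correction, is not mentioned in this chapter ; we include it as an assumption because the correlation between two half-tests is usually stepped up to estimate the reliability of the full-length test.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(spread_x * spread_y)


def split_half_reliability(item_scores):
    """Return (half-test correlation, Spearman-Brown corrected estimate).

    item_scores has one row per student and one column per item.
    """
    # Items 1, 3, 5, ... form one half ; items 2, 4, 6, ... form the other.
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(odd_totals, even_totals)
    # Spearman-Brown step-up (our assumption, not described in the chapter).
    r_full = 2 * r_half / (1 + r_half)
    return r_half, r_full


# Hypothetical results of 5 students on a 6-item MCQ test (1 = correct).
item_scores = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
]

r_half, r_full = split_half_reliability(item_scores)
print(round(r_half, 2), round(r_full, 2))
```

The same `pearson_r` helper could be applied to the marks awarded by two examiners to the same set of answer papers to get a rough index of marker reliability. With real data, a much larger group of students and more items would be needed for the estimate to be meaningful.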

Sounds confusing ? Well, we must admit it does. However, after you have read and understood the chapter on Test and item analysis and done some of the practical exercises yourself, things will become clearer and more manageable. All the best !