System, Vol. 23, No. 3, pp. 337-346, 1995
Copyright © 1995 Elsevier Science Ltd
Printed in Great Britain. All rights reserved
0346-251X/95 $9.50 + 0.00
Pergamon
0346-251X(95)00021-6
EVALUATING COURSE MATERIALS: A CONTRASTIVE STUDY IN TEXT BOOK TRIALLING
ROGER BARNARD and MICK RANDALL
Chichester Institute of Higher Education, W. Sussex, U.K.
This article analyses two similar trials of ELT text books which took place in the
Sultanate of Oman within a few years of each other. The article compares the different
approaches used in the two trials and draws some conclusions concerning the most
effective methods to undertake trials of new curriculum material.
INTRODUCTION
Throughout much of the developing world, the first response by governments and institutions to
demands for curriculum renewal is often to produce new curriculum materials. The Overseas
Development Administration (ODA), as an agency providing technical assistance in ELT, has been
involved in many such projects--by employing specialists either to write new course books for
Ministries of Education (e.g. Cameroon, Somalia, Yemen), or to participate with others, such as
publishers or in-country writing teams, in the introduction of new materials (e.g. Sudan, Ethiopia).
In any such project, there is clearly a need to ascertain the strengths and weaknesses of the new
materials. This evaluation is undertaken either by the project team, by the receiving institution,
or by an outside agency such as the ODA. In some circumstances, this may lead to a summative
evaluation of a course which has already been written and published; in most cases, however, there
is an attempt to provide formative evaluation of materials as they are being produced. In nearly
all these situations, draft materials are 'trialled' before being produced and used in their final form.
Given that trialling is such a common need in ELT, it is perhaps surprising that it has received
relatively little attention in the literature over the last 10 years. There have been descriptive studies
of the process of trialling (cf. Wilson and Harrison, 1983 and Clarke, 1983), but no attempt to
evaluate the process itself. It is the purpose of this paper to review two different approaches to
trialling new course materials in which the writers were directly involved. Both attempts took place
in the Sultanate of Oman within a few years of each other and differed considerably in both scale
and style, yet both were essentially serving the same ends: the production of curriculum materials
for schools.
The first part of this paper will describe and contrast the two approaches which will be analysed
by reference to the following questions, based on the framework suggested by Lee & Sampson
(1990):
1. What materials were being evaluated?
2. Why were the materials being evaluated?
3. How were the participants prepared for the evaluations?
4. What were the major issues and questions which the evaluation dealt with?
5. Who did what?
6. What were the resources for the evaluation?
7. What data was collected?
8. How was the data analysed?
9. What was the reporting procedure?
10. How was the evaluation data utilized?
It is hoped that, from this retrospective review of the two evaluations, points will be raised
which will be applicable to the evaluation of other materials-writing projects by shedding light
on such questions as:
• how can formative evaluation be made effective?
• what sort of trialling process is most suitable for providing formative feedback to a writing team?
• what is the most effective size for a trialling process?
• what are the relative merits of quantitative versus qualitative feedback in a trialling process?
• what are the effects of different evaluation styles on the attitudes of those involved in implementing the materials?
BACKGROUND
In Oman, English is taught in nine of the 12 school grades--for the last 3 years of the elementary
school (4th, 5th and 6th classes) and then through three levels each of Preparatory and Secondary
Schools.
In the late 1970s, the Omani government asked Longman Group U.K. Ltd. to produce an English
Language course for use in all nine classes. The plan was to produce a series of books, English
for Oman (EFO), starting from the 4th Elementary class at a rate of one set of books a year. The
process involved the production of 1000 copies of a trial edition for each of the years of the English
curriculum. Following the trial year, a Trial Edition was then produced in a black-and-white version
to be used throughout the Sultanate before a colour edition was printed.
The trial period for EFO examined within this paper is from 1981 to 1984, when the three sets
of Preparatory level books were produced and trialled. During this time, Longman had an author
in residence in the country for at least 4 months each year. He worked closely with staff at the
English Language Teaching Unit (ELTU) which included a team of ODA-funded KELT regional
inspectors.
Trial schools and teachers were selected throughout the Sultanate by ELTU to sample the range
of circumstances in which the book would be used. Thus, there was a selection of different types
of school: girls/boys; urban/rural, mountain/coast/plain; and a range of teachers: expatriate/Omani,
good/average/weak.
For a number of reasons, the Ministry of Education became dissatisfied with the agreement with
the publishers, and in 1987 a joint Omani-British Curriculum Renewal Project was established
whose purpose was to renew the existing English curriculum by means of writing a new set of
teaching/learning materials; this was eventually called Our World Through English (OWTE).
Trial editions of OWTE materials were written in Muscat by a team of authors and curriculum
officers working at the English Language Teaching Department (ELTD), as the Unit was then
called. The team was headed by a Curriculum Renewal Adviser and supported by ODA-funded
Curriculum Renewal Officers. In contrast to the previous (EFO) project, it was decided that the
OWTE materials would be trialled in full colour versions on a nationwide basis: that is, the trial
materials would be used in all English classes, rather than sample schools or classes. A three-year
schedule was then introduced to write trial materials for each of the nine levels at which English
is taught. At the end of this writing phase, the materials were to be revised and re-written in the
light of feedback received during the trial period.
THE TWO APPROACHES TO EVALUATING THE TRIAL MATERIALS
1. What materials were being evaluated?
EFO and OWTE. Both projects involved the production of several different sets of material for
each school year. Typically, a curriculum package for 1 year would consist of: a Pupil's Book,
a Workbook, a Teacher's book, a cassette, and accompanying wallcharts and/or flashcards.
2. Why were the materials being evaluated?
In both cases the primary aim of the evaluation was to provide feedback to the writing process
so as to produce an effective, workable set of materials for teaching English in the schools.
3. How were the participants prepared for the evaluation?
EFO. Each year a one-week training course was held in the capital to introduce the trial teachers
and inspectors to the trial materials. These courses concentrated on the format and content of the
new materials and did not focus explicit attention on the evaluation process. As far as possible,
the trial teachers were the same each year and thus a 'cadre' of experienced evaluators was built
up who had a commitment to the project and were able to provide feedback in the context of the
developing course.
OWTE. At the start of the relevant school year, teachers attended seminars (8 hours) led by ELTD
staff to orientate them to the rationale, procedures and sample materials of the trial editions. During
these meetings, one hour was given over to explanation of the rationale, instruments and
procedures of the evaluation. The inspectors, key agents in the evaluation process, were informed
of the curriculum change at various times before and during the trialling process, but made no
substantive contribution to the design of instruments and procedures. Their role in evaluation was
explained by means of printed instructions as to how they should perform their tasks.
4. What were the major issues and questions which the evaluation dealt with?
EFO. The trial was not prescriptive in its approach. It posed the simple question: "how effective
are the materials as a means of effectively teaching English in this class?" Although general
guidelines were given about points to think about (e.g. timing of lessons, cultural problems, etc.)
there were no detailed formats to guide the teachers or the inspectors. They were encouraged to
"write as much as possible" in diaries or logs, ranging from the mundane comments concerning
typographical errors through to major issues of methodology.
OWTE. The evaluation of the trial edition sought to deal with a great number of discrete points,
and the delineation of the issues led to the design of a six-page questionnaire for teachers; shorter
ones were produced for pupils and their parents. Open-ended observation reports were designed
for inspectors, based on blank pages interleaved with the detailed Teaching Notes for each
didactic unit of the course. However, the inspectors were given clear instructions that they were
to act primarily as 'impartial' observers and that their principal role in the evaluation was to note
down the degree to which the outlined procedures were being implemented by the teachers.
5. Who did what?
EFO. The evaluation was managed by the author in collaboration with ELTD. From the author's
point of view, the main client for the evaluation was "the pupil in the school--to make sure that
the pupils got as good a book as possible." The author visited all schools involved in the trial twice
a year, observed the teachers using the materials, and talked to them about their perceptions of
the materials. Each teacher involved in the trial was asked to keep a diary of points noticed while
teaching the trial materials, and to return these notes to the author either in person or through the
local inspectors. One inspector in each region was also asked to observe the trial material being
taught once a week and make notes of the observations. All information from the trial was fed
to the author, who had the final decision as to the relevance of the comments made on the
material and the inclusion of any modifications to the Trial Editions.
OWTE. The Curriculum Renewal Adviser was responsible for the design of instruments and
procedures of the evaluation process, and also data analysis, reporting and dissemination. Sample
populations among teachers, learners and their parents completed questionnaires (learners and
parents under guidance). Area meetings of teachers involved in providing the feedback were held
by the evaluation inspectors and attended by authors and other ELTD staff following the trial periods.
Specific work in these areas was detailed to various members of staff at ELTD.
6. What were the resources for the evaluation?
EFO & OWTE. Neither evaluation had any significant problems with resourcing the trials,
although there was no specific budget set aside for the trialling by either of the projects. The largest
single cost--that of manpower--was provided by the Ministry of Education in terms of directing
inspectors and teachers to take part in the trials.
7. What data was collected?
EFO.
from whom        what                        how
(a) author       lesson observation          own notes
                 discussions with teachers   own notes
(b) teachers     lesson comments             diaries
                                             oral feedback to author
(c) inspectors   lesson observation          notes on lessons
OWTE.
from whom        what                        how
(a) parents      opinions/impressions        mediated questionnaires (Arabic)
(b) pupils       opinions/impressions        mediated questionnaires (Arabic)
(c) teachers     recollections/opinions      (a) questionnaires (Eng.)
                                             (b) meetings
(d) inspectors   classroom observation       notes and reports
(e) authors      classroom observations      impressions
                                             notes
NB. Because of pressure from production deadlines, the authors rarely observed lessons, and those
they did observe were rarely in schools outside the capital area.
8. How was the data analysed?
EFO. All the diary material, the lesson observation notes and the notes made during the visits to
the schools and discussions with the trial teachers and the inspectors were analysed by the
author.
Annual meetings were arranged between the inspectors and the author on top of the individual
discussions which the author had with the inspectors and teachers as he visited the schools.
OWTE. A database was devised post hoc using Apple Mac software, and all the quantitative data
derived from ratings on the 5-point scales on the questionnaires was entered and collated.
Qualitative data, such as respondents' comments, were added verbatim to the quantitative
summaries for each piece of material. This procedure turned out to be very time-consuming.
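The collation step described above can be sketched in a few lines of Python. This is only an illustration of the general technique (tallying 5-point ratings for each piece of material and attaching respondents' comments verbatim); it is not the database actually used in the OWTE project, and the function name, the field names and the sample responses are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def collate(responses):
    """Summarise questionnaire responses for each piece of material.

    Each response is an (item, rating, comment) tuple, where rating is an
    integer on a 5-point scale and comment may be an empty string.
    """
    ratings = defaultdict(list)
    comments = defaultdict(list)
    for item, rating, comment in responses:
        if not 1 <= rating <= 5:
            raise ValueError(f"rating out of range: {rating}")
        ratings[item].append(rating)
        if comment.strip():
            # Qualitative comments are kept verbatim, not coded or counted.
            comments[item].append(comment)
    summary = {}
    for item, scores in ratings.items():
        summary[item] = {
            "n": len(scores),
            "mean": round(mean(scores), 2),
            # Number of responses at each scale point, from 1 to 5.
            "counts": [scores.count(point) for point in range(1, 6)],
            "comments": comments[item],
        }
    return summary


# Hypothetical responses, invented purely for illustration.
sample = [
    ("Unit 1 Pupil's Book", 4, ""),
    ("Unit 1 Pupil's Book", 3, "Lesson 2 took longer than one period."),
    ("Unit 1 Workbook", 5, ""),
]
summary = collate(sample)
```

A summary of this kind shows the distribution of ratings at a glance, but, as with the original procedure, the comments still have to be read and interpreted by hand.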
9. What was the reporting procedure?
EFO. The writer discussed proposed changes in the trial materials with ELTU staff. The writer
directly incorporated sensible changes into the materials and these changes were presented to the
ELTU. No specific report was produced.
OWTE. No prior thought had been given to this matter--other than that the eventual audience (and
users) of the evaluation would be the authorial team. In the event, the analyses of quantitative and
qualitative data were considered by the Curriculum Renewal Adviser and summary reports
drafted. Digests of these were disseminated orally by him within ELTD, largely via meetings with
the authorial teams. Summaries of decisions reached at these meetings were presented orally to
inspectors at their regular meetings in Muscat.
10. How was the data from the evaluation utilized?
EFO. The data was used directly to provide guidelines for the re-writing of the materials by the
writer.
OWTE. The authorial teams planned and wrote the First Editions based on a series of meetings
within ELTD at which the implications of the quantitative and qualitative data were discussed.
It should be noted that the quality of the data received from the evaluation could only provide the
crudest of information about the strengths and weaknesses of the materials.
DISCUSSION
Both trialling processes were intended to be pragmatic: that is, they set out to obtain "maximally
useful information" (Cronbach, 1982:4) from a given situation in order to effect certain specific
outcomes. Their manifest concern was not to discover or describe some aspect of universal truth
(Fishman's (1991) "experimental paradigm"), nor to add to the professional kit bag of academic
evaluators (Beretta's (1990) 'researcher without portfolio') but to facilitate practical decision-making
based on the interpretation of data about the effectiveness of materials in specific educational
contexts.
1. Size and type of data
There is an obvious correlation between the size of an evaluation and the type of data it seeks.
EFO was a small-scale evaluation: only a limited number of teachers and inspectors collaborated
with the single author, who, acting also as evaluator, collected and collated the data and made the
practical decisions. The data was entirely qualitative; no attempt was made to count or collate the
comments made by the various informants. Although the revised material incorporated suggestions
from the collaborators, there was a lack of systematic evidence of the reliability of the feedback.
No attempt was made, for example, to assess the typicality of the information obtained from the
teachers, schools and learners involved.
As a sample population increases, it becomes more difficult to process and assimilate qualitative
information, and hence, in larger scale evaluations, more emphasis is given to instruments which
can convert values into quantitative terms. In the OWTE evaluation, the main source of data was
the set of questionnaires distributed to hundreds of teachers, whose numerated responses were
to be processed via computer technology. In the event, there were considerable problems in the
processing of the data, not least because there had been insufficient thought given to the matter
when the questionnaires were designed. Consequently, much of the information was not used in
the re-drafting of the OWTE materials. It was found that the serial scale and multiple choice
questions did not provide any really interesting information and few respondents put any
significant comments in the open sections. Indeed, the relative 'blandness' of the numerical
information gained from the questionnaires made much of the information unusable in the re-write.
It may be argued that the strength of qualitative information is in its potential validity, whereas
that of quantitative data lies in the direction of reliability. The respective benefits and constraints
of each need to be carefully weighed before decisions are made about how to proceed with any
proposed evaluation, so that appropriate instruments can be devised to ensure maximum utility.
It would seem that there is a definite need for balanced, critical comment in a formative text book
trialling process and that this sort of information is not readily supplied by the questionnaire approach
to evaluation. Thus, it may be argued that the loss of reliability involved in a restricted, smaller-scale
trial may be offset by the considerable gains to be made in quality of feedback received by
such a process.
2. Openness and targeting
There is also a correlation between the type of data sought and the degree to which the required
information is targeted. If the views of only a few people are canvassed, questions can be open-ended
and their responses free-ranging in terms both of content and format; re-formulation,
clarification and extension of oral and written response can be sought. However, when large numbers
of questionnaires have to be processed, it is necessary to limit the options open to the respondents
in order to obtain information that can be quantitatively processed; thus, serial scales and multiple
choice formats are more widely used.
It is also more difficult for a large-scale evaluation to undertake what has been called an
"illuminative evaluation" (Parlett, 1981). The questionnaire formats for the OWTE evaluation
remained the same for each of the trial years. Thus, despite early intentions to revise the
questionnaires each year, because the data for the evaluation was not, in fact, analysed until after
the trial period had finished, the evaluation was in a sense 'static'; it did not focus on different
issues as they arose from the writing process. In the EFO evaluation, on the other hand, the author
was free to ask questions and gather data which was both relevant to the writing process and related
to the data already gathered. The process was thus 'dynamic', allowing for the writer/evaluator
to investigate and probe the areas which he considered important.
There are a number of points which arise from the experience with the evaluation using
questionnaires. Firstly, in order to ensure that the questionnaires will obtain the required information,
it is essential to pre-test the instruments on a sample population to check that the questions are
not ambiguous and that the choices offered are indeed the most appropriate ones. Secondly, it is
important to examine the data early and, if necessary, alter the type of questions asked in order
to take into account the information already gathered. Finally, and perhaps most importantly, a
battery of closed questions may lend an authoritarian air to the instrument, as it directs the
thought-processes of the respondents even where, as in the case of OWTE, some space was given
after each section for the respondent to add comments. Such an impression may be highlighted
by undue length; the OWTE questionnaire comprised six pages.
3. Accountability and utility
Both processes served essentially the same decision-maker, the publishing authority. In the case
of EFO, this was a commercial publisher, under whose auspices the piloting process was designed
and implemented. There was a low degree of overt accountability built into the process; not least
because so few people were involved. Consequently, the process was streamlined and efficient
in several ways; for example, in the assimilation of feedback into draft materials. It may be assumed
that the publishers were satisfied with the process, but little attempt was made to submit formal
reports to their client, the Ministry, regarding the rationale for the changes that were made.
Decisions of the author were largely accepted by the Ministry as there was an atmosphere of trust
between the client and the author.
With regard to the evaluation of the OWTE trial materials, the publishing authority was the Ministry
itself, and the piloting process was designed and implemented by the Project Renewal Team. Thus
the evaluation of these materials appeared much more concerned with public accountability
than the EFO evaluation. One consequence of this greater public accountability was that the OWTE
evaluators sought to legitimise eventual decisions about revising the draft materials by reference
to statistically reliable information. However, when it was realised the quantitative data could not
satisfactorily be processed, the re-writing team searched the questionnaires for critical comments
which might indicate general problems which the teachers were having with the materials. Thus
the evaluation had become 'grossly bureaucratic' (Mackay, 1994: 143) in that the process, while
seemingly serving the requirements of accountability, was virtually useless to those responsible
for revising (and improving) the materials.
4. Investment and commitment
The effect of the two evaluations in terms of commitment to the respective project was also quite
different. Clearly, the senior echelons in the Ministry did not feel fully committed to the EFO project,
as the materials were rejected even before the first editions had run their course. However, it seems
that the change of course was motivated by political and economic decisions not directly related
to the evaluation process itself. Such issues as the ownership of copyright and Gulf Cooperation
Council decisions regarding a common syllabus for English in the region were probably more
important than the quality of the materials or the evaluation process.
There was, however, a considerable difference between the effects of the two evaluation styles on
the attitudes of those involved in implementing the innovations.
The involvement of many more personnel in the OWTE evaluation should, theoretically, have
led to a greater acceptance of the new materials by those working at the chalk face than was achieved
by the relatively few (and perceivably "elitist") teachers working on the EFO evaluation. In
fact the opposite seems to have occurred, and it is important to discuss possible reasons for this
reaction.
Partly, the negative reaction to the OWTE evaluation could have been due to the extra work that
the OWTE evaluation imposed on teachers and inspectors. It took time for the teachers to respond
even cursorily to the six-page questionnaires, and no inducement of any sort was offered for more
considered completion. Evaluation feedback meetings frequently involved long and arduous
journeys, and took the teachers away from what they saw as their prime task of completing the
syllabus. During the evaluation process, the English inspectorate was heavily pressed at a time
when it was officially acknowledged to be under-staffed. Their various responsibilities for
evaluation were seen as an additional chore with which they had to cope with no obvious reward
or any evidence that their work was being incorporated into the materials.
Although the EFO evaluation equally imposed extra workloads, those involved tended to view
their work in positive terms. As Cronbach noted (1982:8), one of the roles of any evaluator is that
of teacher, and an important element in an evaluation process is the professional development of
those involved. Although no formal training in evaluation procedures was involved, or accreditation
given, those working with the EFO author felt themselves to be a select cadre of teachers and
inspectors in Oman. This clearly led to a 'Hawthorne effect' amongst the teachers and inspectors
taking part. Such an effect casts some doubt on the 'scientific' validity of the data produced by
the evaluation, but is extremely important in gaining acceptance for the innovation.
In terms of the personal relationships between the participants in the evaluation, it was noted that
the EFO evaluator established strong personal relationships with those among whom he worked.
The collaborators felt that they "were not being 'used' as mere data points but had a significant
part in the evaluation process" (Parlett, 1981: 225). They accepted that the evaluator was fair and
honest. This view was supported by the immediacy of the feedback from the process: changes
appeared in the published materials in the following year.
In the OWTE evaluation, the relationships were very different. In order to assess the
materials reliably, the participants were asked to 'monitor' the degree to which the teaching deviated from
the procedures laid down in the book. Thus, the inspectors and teachers felt that their function
was primarily to fill in the questionnaires and to return data to the project. Their role was largely
depersonalised by the evaluation process used. The added problem of the delay between the
evaluation and the results in terms of seeing changes to the materials (a delay of up to three years)
only added to a sense of alienation from the process of evaluation.
Finally, the rapidity with which EFO was to be replaced by OWTE may have been a negative factor
in the teachers' attitudes. While still accustoming themselves to the EFO books, teachers were
asked to use and comment upon a new set of materials and techniques based upon different
assumptions about language learning and teaching. Thus there was a problem of timing. Moreover,
their responsiveness to requests (or demands) for information may have been blunted by the fact
that it was known that their colleagues' opinions had been sought in the EFO trial, incorporated
in the revised materials and then--apparently--ignored when the EFO materials were rejected.
CONCLUSION
From the experience with these two trialling processes it would seem that the smaller and more
immediate of the trialling processes provided much more usable formative information for the
rewriting of the trial materials. By putting the writer in close contact with the teacher and
inspectors it allowed for the information to be passed directly between those involved with
teaching the materials and the person in charge of writing the materials. This had an important
effect on the morale of those involved in the project.
Clearly the gradualist approach of the EFO project--the production of material for only one level
each year as against the three levels per year of OWTE--placed much less pressure on the
trialling process. In this and many other ways the OWTE trial was working under a much greater
number of restraints than the EFO trial, and the differences between the reactions to the two trials
can, to some extent, be attributed to the differences in the circumstances under which each worked.
However, there are other important factors which contributed to the differences between the trials.
Perhaps the single most important difference between the two trial processes is the relationship
between the participants in the process and the degree of trust which existed between them. It has
already been noted that the EFO process was ultimately successful because there was an
atmosphere of trust built up between the writer, the teachers and the Ministry of Education. Probably,
in the end, the OWTE trial was less successful because there was not such an atmosphere of trust
between the participants. In a trialling situation it is essential that trust exists. The writers need
to trust the evaluators. If they do not, then criticism is difficult to accept and conflict results. The
teachers need to trust the writers. It is the teachers in the end who are going to have to use the
materials and they need to be convinced that the effort that they are going to put in will be useful.
Finally, the inspectors need to believe that what they say will be taken into account as they are
the principal change agents in any project.
Partly the issue of trust is one of the personalities involved, but this is only one aspect of the situation.
More importantly, it was the processes and approaches of the two trials which facilitated or restrained
the development of an atmosphere of trust. The key factor in the EFO trial was its openness and
flexibility. This led to the empowerment of all participants in the process, which in turn led to a feeling
of ownership of the project by the participants. The writer was empowered to gather data and
write/re-write the materials as he saw fit. The inspectors and the teachers felt empowered to make
any comment on the materials which they saw fit and there was indication in the rewritten
materials that these suggestions were being acted upon. Such an approach is much more easily
carried out in a small trial, but such factors need to be addressed in larger trials to avoid feelings
of alienation building up which will invalidate even the most rigorous of scientific enquiries.
REFERENCES
BERETTA, A. (1990) The program evaluator: the ESL researcher without portfolio, Applied Linguistics, 12(1), pp. 1-15.
BRUMFIT, C. J. (ed.) (1983) Language Teaching Projects for the Third World, ELT Documents 116. Oxford: The British
Council/Pergamon Press.
CLARKE, D. J. (1983) Evaluation of English for Somalia. In Brumfit, C. J. (ed.), Language Teaching Projects for the
Third World, ELT Documents 116. Oxford: The British Council/Pergamon Press.
CRONBACH, L. (1982) Issues in planning evaluation. In Murphy, R. and Torrance, H. (eds), Evaluating Education: Issues
and Methods.
FISHMAN, D. B. (1991) An introduction to the experimental versus the pragmatic paradigm in evaluation, Evaluation
and Program Planning, 14, pp. 353-363.
LEE, L. J. and SAMPSON, J. F. (1990) A practical approach to program evaluation, Evaluation and Program Planning,
13, pp. 157-164.
MACKAY, R. (1994) Undertaking ESL/EFL programme review for accountability and improvement, ELT Journal,
48(2), pp. 142-149.
MURPHY, R. and TORRANCE, H. (eds) (1987) Evaluating Education: Issues and Methods. London: Open University.
PARLETT, M. (1981) Illuminative evaluation. In Reason, P. and Rowan, J. (eds), Human Inquiry.
REASON, P. and ROWAN, J. (eds) (1981) Human Inquiry. London: John Wiley.
WILSON, P. and HARRISON, I. (1983) Materials design in Africa. In Brumfit, C. J. (ed.) Language Teaching Projects
for the Third World, ELT Documents 116. Oxford: The British Council/Pergamon Press.