System, Vol. 23, No. 3, pp. 337-346, 1995
Copyright © 1995 Elsevier Science Ltd. Printed in Great Britain. All rights reserved.
0346-251X/95 $9.50 + 0.00
0346-251X(95)00021-6

EVALUATING COURSE MATERIALS: A CONTRASTIVE STUDY IN TEXT BOOK TRIALLING

ROGER BARNARD and MICK RANDALL
Chichester Institute of Higher Education, W. Sussex, U.K.

This article analyses two similar trials of ELT text books which took place in the Sultanate of Oman within a few years of each other. The article compares the different approaches used in the two trials and draws some conclusions concerning the most effective methods of undertaking trials of new curriculum material.

INTRODUCTION

Throughout much of the developing world, the first response by governments and institutions to demands for curriculum renewal is often to produce new curriculum materials. The Overseas Development Administration (ODA), as an agency providing technical assistance in ELT, has been involved in many such projects, either by employing specialists to write new course books for Ministries of Education (e.g. Cameroon, Somalia, Yemen), or by participating with others, such as publishers or in-country writing teams, in the introduction of new materials (e.g. Sudan, Ethiopia).

In any such project, there is clearly a need to ascertain the strengths and weaknesses of the new materials. This evaluation is undertaken either by the project team, by the receiving institution, or by an outside agency such as the ODA. In some circumstances, this may lead to a summative evaluation of a course which has already been written and published; in most cases, however, there is an attempt to provide formative evaluation of materials as they are being produced. In nearly all these situations, draft materials are 'trialled' before being produced and used in their final form.

Given that trialling is such a common need in ELT, it is perhaps surprising that it has received relatively little attention in the literature over the last 10 years. There have been descriptive studies of the process of trialling (cf. Wilson and Harrison, 1983; Clarke, 1983), but no attempt to evaluate the process itself. It is the purpose of this paper to review two different approaches to trialling new course materials in which the writers were directly involved. Both attempts took place in the Sultanate of Oman within a few years of each other and differed considerably in both scale and style, yet both were essentially serving the same ends: the production of curriculum materials for schools.

The first part of this paper will describe and contrast the two approaches, which will be analysed by reference to the following questions, based on the framework suggested by Lee and Sampson (1990):

1. What materials were being evaluated?
2. Why were the materials being evaluated?
3. How were the participants prepared for the evaluations?
4. What were the major issues and questions which the evaluation dealt with?
5. Who did what?
6. What were the resources for the evaluation?
7. What data was collected?
8. How was the data analysed?
9. What was the reporting procedure?
10. How was the evaluation data utilized?

It is hoped that, from this retrospective review of the two evaluations, points will be raised which will be applicable to the evaluation of other materials-writing projects by shedding light on such questions as:

• how can formative evaluation be made effective?
• what sort of trialling process is most suitable for providing formative feedback to a writing team?
• what is the most effective size for a trialling process?
• what are the relative merits of quantitative versus qualitative feedback in a trialling process?
• what are the effects of different evaluation styles on the attitudes of those involved in implementing the materials?

BACKGROUND

In Oman, English is taught in nine of the 12 school grades: in the last three years of elementary school (4th, 5th and 6th classes) and then through three levels each of Preparatory and Secondary School. In the late 1970s, the Omani government asked Longman Group U.K. Ltd. to produce an English language course for use in all nine classes. The plan was to produce a series of books, English for Oman (EFO), starting from the 4th Elementary class at a rate of one set of books a year. The process involved the production of 1000 copies of a trial edition for each year of the English curriculum. Following the trial year, a black-and-white edition was then produced to be used throughout the Sultanate before a colour edition was printed. The trial period for EFO examined within this paper is from 1981 to 1984, when the three sets of Preparatory level books were produced and trialled. During this time, Longman had an author in residence in the country for at least 4 months each year. He worked closely with staff at the English Language Teaching Unit (ELTU), which included a team of ODA-funded KELT regional inspectors. Trial schools and teachers were selected throughout the Sultanate by ELTU to sample the range of circumstances in which the books would be used. Thus, there was a selection of different types of school (girls/boys; urban/rural; mountain/coast/plain) and a range of teachers (expatriate/Omani; good/average/weak).

For a number of reasons, the Ministry of Education became dissatisfied with the agreement with the publishers, and in 1987 a joint Omani-British Curriculum Renewal Project was established whose purpose was to renew the existing English curriculum by writing a new set of teaching/learning materials, eventually called Our World Through English (OWTE). Trial editions of OWTE materials were written in Muscat by a team of authors and curriculum officers working at the English Language Teaching Department (ELTD), as the Unit was then called. The team was headed by a Curriculum Renewal Adviser and supported by ODA-funded Curriculum Renewal Officers. In contrast to the previous (EFO) project, it was decided that the OWTE materials would be trialled in full-colour versions on a nationwide basis: that is, the trial materials would be used in all English classes, rather than in sample schools or classes. A three-year schedule was then introduced to write trial materials for each of the nine levels at which English is taught. At the end of this writing phase, the materials were to be revised and re-written in the light of feedback received during the trial period.

THE TWO APPROACHES TO EVALUATING THE TRIAL MATERIALS

1. What materials were being evaluated?

EFO and OWTE. Both projects involved the production of several different sets of material for each school year. Typically, a curriculum package for one year would consist of a Pupil's Book, a Workbook, a Teacher's Book, a cassette, and accompanying wallcharts and/or flashcards.

2. Why were the materials being evaluated?
In both cases the primary aim of the evaluation was to feed back into the writing process so as to produce an effective, workable set of materials for teaching English in the schools.

3. How were the participants prepared for the evaluation?

EFO. Each year a one-week training course was held in the capital to introduce the trial teachers and inspectors to the trial materials. These courses concentrated on the format and content of the new materials and did not focus explicit attention on the evaluation process. As far as possible, the trial teachers were the same each year, and thus a 'cadre' of experienced evaluators was built up who had a commitment to the project and were able to provide feedback in the context of the developing course.

OWTE. At the start of the relevant school year, teachers attended seminars (8 hours) led by ELTD staff to orientate them to the rationale, procedures and sample materials of the trial editions. During these meetings, one hour was given over to explanation of the rationale, instruments and procedures of the evaluation. The inspectors, key agents in the evaluation process, were informed of the curriculum change at various times before and during the trialling process, but made no substantive contribution to the design of instruments and procedures. Their role in evaluation was explained by means of printed instructions as to how they should perform their tasks.

4. What were the major issues and questions which the evaluation dealt with?

EFO. The trial was not prescriptive in its approach. It posed the simple question: "how effective are the materials as a means of effectively teaching English in this class?" Although general guidelines were given about points to think about (e.g. timing of lessons, cultural problems, etc.), there were no detailed formats to guide the teachers or the inspectors. They were encouraged to "write as much as possible" in diaries or logs, ranging from mundane comments concerning typographical errors through to major issues of methodology.

OWTE. The evaluation of the trial edition sought to deal with a great number of discrete points, and the delineation of the issues led to the design of a six-page questionnaire for teachers; shorter ones were produced for pupils and their parents. Open-ended observation reports were designed for inspectors, based on blank pages interleaved with the detailed Teaching Notes for each didactic unit of the course. However, the inspectors were given clear instructions that they were to act primarily as 'impartial' observers and that their principal role in the evaluation was to note down the degree to which the outlined procedures were being implemented by the teachers.

5. Who did what?

EFO. The evaluation was managed by the author in collaboration with ELTU. From the author's point of view, the main client for the evaluation was "the pupil in the school: to make sure that the pupils got as good a book as possible." The author visited all schools involved in the trial twice a year, observed the teachers using the materials, and talked to them about their perceptions of the materials. Each teacher involved in the trial was asked to keep a diary of points noticed while teaching the trial materials, and to return these notes to the author either in person or through the local inspectors. One inspector in each region was also asked to observe the trial material being taught once a week and make notes of the observations.
All information from the trial was fed to the author, who had the final decision as to the relevance of the comments made on the material and the inclusion of any modifications in the Trial Editions.

OWTE. The Curriculum Renewal Adviser was responsible for the design of the instruments and procedures of the evaluation process, and also for data analysis, reporting and dissemination. Sample populations of teachers, learners and their parents completed questionnaires (learners and parents under guidance). Following the trial periods, area meetings of the teachers involved in providing the feedback were held by the evaluation inspectors and attended by authors and other ELTD staff. Specific work in these areas was assigned to various members of staff at ELTD.

6. What were the resources for the evaluation?

EFO and OWTE. Neither evaluation had any significant problems with resourcing the trials, although no specific budget was set aside for the trialling by either of the projects. The largest single cost, that of manpower, was provided by the Ministry of Education in terms of directing inspectors and teachers to take part in the trials.

7. What data was collected?

EFO.

From whom        What                          How
(a) author       lesson observation;           own notes
                 discussions with teachers
(b) teachers     lesson comments               diaries; oral feedback to author
(c) inspectors   lesson observation            notes on lessons

OWTE.

From whom        What                          How
(a) parents      opinions/impressions          mediated questionnaires (Arabic)
(b) pupils       opinions/impressions          mediated questionnaires (Arabic)
(c) teachers     recollections/opinions        (a) questionnaires (Eng.); (b) meetings
(d) inspectors   classroom observation         notes and reports
(e) authors      classroom observations        impressions, notes

NB: because of pressure from production deadlines, the authors rarely observed lessons, and those they did observe were rarely in schools outside the capital area.

8. How was the data analysed?

EFO. All the diary material, the lesson observation notes and the notes made during the visits to the schools and discussions with the trial teachers and the inspectors were analysed by the author. Annual meetings were arranged between the inspectors and the author, in addition to the individual discussions which the author had with the inspectors and teachers as he visited the schools.

OWTE. A database was devised post hoc using Apple Mac software, and all the quantitative data derived from ratings on the 5-point scales on the questionnaires was entered and collated. Qualitative data, such as respondents' comments, were added verbatim to the quantitative summaries for each piece of material. This procedure turned out to be very time-consuming.
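The article does not describe how this collation was actually implemented; the following is a minimal sketch of the kind of summary apparently produced, i.e. a per-item quantitative digest of 5-point ratings with respondents' comments appended verbatim. The field names and sample records are hypothetical, supplied purely for illustration.

```python
from collections import defaultdict

# Hypothetical questionnaire records: (material item, rating on a 5-point
# scale, optional free comment). The real OWTE data fields are not documented.
responses = [
    ("Pupil's Book Unit 3", 4, ""),
    ("Pupil's Book Unit 3", 2, "Listening task too long for one period."),
    ("Workbook Unit 3",     3, ""),
    ("Workbook Unit 3",     5, "Pupils enjoyed the puzzle page."),
]

ratings = defaultdict(list)   # item -> list of numeric ratings
comments = defaultdict(list)  # item -> verbatim respondent comments

for item, rating, comment in responses:
    ratings[item].append(rating)
    if comment:
        comments[item].append(comment)

# One quantitative summary per piece of material, with the qualitative
# comments added verbatim, as the article describes.
for item in sorted(ratings):
    scores = ratings[item]
    print(f"{item}: mean rating {sum(scores) / len(scores):.1f} (n={len(scores)})")
    for c in comments[item]:
        print(f"  - {c}")
```

Even such a simple aggregation suggests why the procedure proved time-consuming: every free-text comment had to be transcribed by hand before it could be attached to the relevant summary.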
9. What was the reporting procedure?

EFO. The writer discussed proposed changes in the trial materials with ELTU staff. The writer directly incorporated sensible changes into the materials, and these changes were presented to the ELTU. No specific report was produced.

OWTE. No prior thought had been given to this matter, other than that the eventual audience (and users) of the evaluation would be the authorial team. In the event, the analyses of quantitative and qualitative data were considered by the Curriculum Renewal Adviser and summary reports drafted. Digests of these were disseminated orally by him within ELTD, largely via meetings with the authorial teams. Summaries of decisions reached at these meetings were presented orally to inspectors at their regular meetings in Muscat.

10. How was the data from the evaluation utilized?

EFO. The data was used directly to provide guidelines for the re-writing of the materials by the writer.

OWTE. The authorial teams planned and wrote the First Editions based on a series of meetings within ELTD at which the implications of the quantitative and qualitative data were discussed. It should be noted that the quality of the data received from the evaluation could provide only the crudest of information about the strengths and weaknesses of the materials.

DISCUSSION

Both trialling processes were intended to be pragmatic: that is, they set out to obtain "maximally useful information" (Cronbach, 1982: 4) from a given situation in order to effect certain specific outcomes. Their manifest concern was not to discover or describe some aspect of universal truth (Fishman's (1991) "experimental paradigm"), nor to add to the professional kit bag of academic evaluators (Beretta's (1990) 'researcher without portfolio'), but to facilitate practical decision-making based on the interpretation of data about the effectiveness of materials in specific educational contexts.

1. Size and type of data

There is no obvious correlation between the size of an evaluation and the type of data it seeks. EFO was a small-scale evaluation: only a limited number of teachers and inspectors collaborated with the single author, who, acting also as evaluator, collected and collated the data and made the practical decisions. The data was entirely qualitative; no attempt was made to count or collate the comments made by the various informants. Although the revised material incorporated suggestions from the collaborators, there was a lack of systematic evidence of the reliability of the feedback. No attempt was made, for example, to assess the typicality of the information obtained from the teachers, schools and learners involved.

As a sample population increases, it becomes more difficult to process and assimilate qualitative information, and hence, in larger-scale evaluations, more emphasis is given to instruments which can convert values into quantitative terms. In the OWTE evaluation, the main source of data was the set of questionnaires distributed to hundreds of teachers, whose numerated responses were to be processed via computer technology. In the event, there were considerable problems in the processing of the data, not least because insufficient thought had been given to the matter when the questionnaires were designed. Consequently, much of the information was not used in the re-drafting of the OWTE materials. It was found that the serial-scale and multiple-choice questions did not provide any really interesting information, and few respondents put any significant comments in the open sections. Indeed, the relative 'blandness' of the numerical information gained from the questionnaires made much of the information unusable in the re-write.

It may be argued that the strength of qualitative information lies in its potential validity, whereas that of quantitative data lies in the direction of reliability. The respective benefits and constraints of each need to be carefully weighed before decisions are made about how to proceed with any proposed evaluation, so that appropriate instruments can be devised to ensure maximum utility. It would seem that there is a definite need for balanced, critical comment in a formative text book trialling process, and that this sort of information is not readily supplied by the questionnaire approach to evaluation.
Thus, it may be argued that the loss of reliability involved in a restricted, smaller-scale trial may be offset by the considerable gains to be made in the quality of feedback received by such a process.

2. Openness and targeting

There is also a correlation between the type of data sought and the degree to which the required information is targeted. If the views of only a few people are canvassed, questions can be open-ended and responses free-ranging in terms both of content and format; re-formulation, clarification and extension of oral and written responses can be sought. However, when large numbers of questionnaires have to be processed, it is necessary to limit the options open to the respondents in order to obtain information that can be quantitatively processed; thus, serial scales and multiple-choice formats are more widely used. It is also more difficult for a large-scale evaluation to undertake what has been called an "illuminative evaluation" (Parlett, 1981).

The questionnaire formats for the OWTE evaluation remained the same for each of the trial years. Despite early intentions to revise the questionnaires each year, the data was not in fact analysed until after the trial period had finished, and so the evaluation was in a sense 'static': it did not focus on different issues as they arose from the writing process. In the EFO evaluation, on the other hand, the author was free to ask questions and gather data which was both relevant to the writing process and related to the data already gathered. The process was thus 'dynamic', allowing the writer/evaluator to investigate and probe the areas which he considered important.

A number of points arise from the experience with the evaluation using questionnaires. Firstly, in order to ensure that the questionnaires will obtain the required information, it is essential to pre-test the instruments on a sample population to check that the questions are not ambiguous and that the choices offered are indeed the most appropriate ones. Secondly, it is important to examine the data early and, if necessary, alter the type of questions asked in order to take into account the information already gathered. Finally, and perhaps most importantly, a battery of closed questions may lend an authoritarian air to the instrument, as it directs the thought-processes of the respondents, even where, as in the case of OWTE, some space was given after each section for the respondent to add comments. Such an impression may be heightened by undue length; the OWTE questionnaire comprised six pages.

3. Accountability and utility

Both processes served essentially the same decision-maker, the publishing authority. In the case of EFO, this was a commercial publisher, under whose auspices the piloting process was designed and implemented. There was a low degree of overt accountability built into the process, not least because so few people were involved. Consequently, the process was streamlined and efficient in several ways; for example, in the assimilation of feedback into draft materials. It may be assumed that the publishers were satisfied with the process, but little attempt was made to submit formal reports to their client, the Ministry, regarding the rationale for the changes that were made. Decisions of the author were largely accepted by the Ministry, as there was an atmosphere of trust between the client and the author.
With regard to the evaluation of the OWTE trial materials, the publishing authority was the Ministry itself, and the piloting process was designed and implemented by the Curriculum Renewal team. Thus the evaluation of these materials appeared much more concerned with public accountability than the EFO evaluation. One consequence of this greater public accountability was that the OWTE evaluators sought to legitimise eventual decisions about revising the draft materials by reference to statistically reliable information. However, when it was realised that the quantitative data could not be satisfactorily processed, the re-writing team searched the questionnaires for critical comments which might indicate general problems which the teachers were having with the materials. Thus the evaluation had become 'grossly bureaucratic' (Mackay, 1994: 143) in that the process, while seemingly serving the requirements of accountability, was virtually useless to those responsible for revising (and improving) the materials.

4. Investment and commitment

The effect of the two evaluations in terms of commitment to the respective projects was also quite different. Clearly, the senior echelons in the Ministry did not feel fully committed to the EFO project, as the materials were rejected even before the first editions had run their course. However, it seems that the change of course was motivated by political and economic decisions not directly related to the evaluation process itself. Such issues as the ownership of copyright and Gulf Cooperation Council decisions regarding a common syllabus for English in the region were probably more important than the quality of the materials or the evaluation process.

There was, however, a considerable difference in the effects of the two evaluation styles on the attitudes of those involved in implementing the innovations. The involvement of many more personnel in the OWTE evaluation should, theoretically, have led to a greater acceptance of the new materials by those working at the chalk face than was achieved by the relatively few (and perceived as "elitist") teachers working on the EFO evaluation. In fact the opposite seems to have occurred, and it is important to discuss possible reasons for this reaction.

In part, the negative reaction to the OWTE evaluation could have been due to the extra work that it imposed on teachers and inspectors. It took time for the teachers to respond even cursorily to the six-page questionnaires, and no inducement of any sort was offered for more considered completion. Evaluation feedback meetings frequently involved long and arduous journeys, and took the teachers away from what they saw as their prime task of completing the syllabus. During the evaluation process, the English inspectorate was hard-pressed at a time when it was officially acknowledged to be under-staffed. The inspectors' various responsibilities for evaluation were seen as an additional chore to be coped with, offering no obvious reward and no evidence that their work was being incorporated into the materials.

Although the EFO evaluation equally imposed extra workloads, those involved tended to view their work in positive terms. As Cronbach noted (1982: 8), one of the roles of any evaluator is that of teacher, and an important element in an evaluation process is the professional development of those involved.
Although no formal training in evaluation procedures was involved, and no accreditation given, those working with the EFO author felt themselves to be a select cadre of teachers and inspectors in Oman. This clearly led to a 'Hawthorne effect' amongst the teachers and inspectors taking part. Such an effect casts some doubt on the 'scientific' validity of the data produced by the evaluation, but it is extremely important in gaining acceptance for the innovation.

In terms of the personal relationships between the participants in the evaluation, it was noted that the EFO evaluator established strong personal relationships with those among whom he worked. The collaborators felt that they "were not being 'used' as mere data points but had a significant part in the evaluation process" (Parlett, 1981: 225). They accepted that the evaluator was fair and honest. This view was reinforced by the immediacy of the feedback from the process: changes appeared in the published materials in the following year.

In the OWTE evaluation, the relationships were very different. In order to assess the materials reliably, the participants were asked to 'monitor' the degree to which the teaching deviated from the procedures laid down in the book. Thus, the inspectors and teachers felt that their function was primarily to fill in the questionnaires and to return data to the project. Their role was largely depersonalised by the evaluation process used. The delay of up to three years between the evaluation and visible changes to the materials only added to a sense of alienation from the process of evaluation.

Finally, the rapidity with which EFO was to be replaced by OWTE may have been a negative factor in the teachers' attitudes. While still accustoming themselves to the EFO books, teachers were asked to use and comment upon a new set of materials and techniques based upon different assumptions about language learning and teaching. Thus there was a problem of timing. Moreover, their responsiveness to requests (or demands) for information may have been blunted by the fact that it was known that their colleagues' opinions had been sought in the EFO trial, incorporated in the revised materials, and then, apparently, ignored when the EFO materials were rejected.

CONCLUSION

From the experience with these two trialling processes, it would seem that the smaller and more immediate of the two provided much more usable formative information for the re-writing of the trial materials. By putting the writer in close contact with the teachers and inspectors, it allowed information to be passed directly between those involved with teaching the materials and the person in charge of writing them. This had an important effect on the morale of those involved in the project.

Clearly the gradualist approach of the EFO project (the production of material for only one level each year, as against the three levels per year of OWTE) placed much less pressure on the trialling process. In this and many other ways the OWTE trial was working under a much greater number of constraints than the EFO trial, and the differences between the reactions to the two trials can, to some extent, be attributed to the differences in the circumstances under which each worked. However, there are other important factors which contributed to the differences between the trials.
Perhaps the single most important difference between the two trial processes is the relationship between the participants in the process and the degree of trust which existed between them. It has already been noted that the EFO process was ultimately successful because an atmosphere of trust was built up between the writer, the teachers and the Ministry of Education. The OWTE trial was probably less successful in the end because no such atmosphere of trust existed between the participants.

In a trialling situation it is essential that trust exists. The writers need to trust the evaluators: if they do not, then criticism is difficult to accept and conflict results. The teachers need to trust the writers: it is the teachers who, in the end, are going to have to use the materials, and they need to be convinced that the effort they put in will be useful. Finally, the inspectors need to believe that what they say will be taken into account, as they are the principal change agents in any project.

The issue of trust is partly one of the personalities involved, but this is only one aspect of the situation. More importantly, it was the processes and approaches of the two trials which facilitated or restrained the development of an atmosphere of trust. The key factor in the EFO trial was its openness and flexibility. This empowered all participants in the process and fostered a feeling of ownership of the project. The writer was empowered to gather data and write or re-write the materials as he saw fit. The inspectors and the teachers felt empowered to make any comment on the materials which they saw fit, and there was evidence in the re-written materials that these suggestions were being acted upon. Such an approach is much more easily carried out in a small trial, but such factors need to be addressed in larger trials to avoid feelings of alienation building up which will invalidate even the most rigorous of scientific enquiries.

REFERENCES

BERETTA, A. (1990) The program evaluator: the ESL researcher without portfolio. Applied Linguistics, 12(1), 1-15.

BRUMFIT, C. J. (ed.) (1983) Language Teaching Projects for the Third World, ELT Documents 116. Oxford: The British Council/Pergamon Press.

CLARKE, D. J. (1983) Evaluation of English for Somalia. In Brumfit, C. J. (ed.), Language Teaching Projects for the Third World, ELT Documents 116. Oxford: The British Council/Pergamon Press.

CRONBACH, L. (1982) Issues in planning evaluation. In Murphy, R. and Torrance, H. (eds), Evaluating Education: Issues and Methods. London: Open University.

FISHMAN, D. B. (1991) An introduction to the experimental versus the pragmatic paradigm in evaluation. Evaluation and Program Planning, 14, 353-363.

LEE, L. J. and SAMPSON, J. F. (1990) A practical approach to program evaluation. Evaluation and Program Planning, 13, 157-164.

MACKAY, R. (1994) Undertaking ESL/EFL programme review for accountability and improvement. ELT Journal, 48(2), 142-149.

MURPHY, R. and TORRANCE, H. (eds) (1987) Evaluating Education: Issues and Methods. London: Open University.

PARLETT, M. (1981) Illuminative evaluation. In Reason, P. and Rowan, J. (eds), Human Inquiry. London: John Wiley.

REASON, P. and ROWAN, J. (eds) (1981) Human Inquiry. London: John Wiley.

WILSON, P. and HARRISON, I. (1983) Materials design in Africa. In Brumfit, C. J. (ed.), Language Teaching Projects for the Third World, ELT Documents 116. Oxford: The British Council/Pergamon Press.