ELT Journal
Evaluating teaching practice
Shosh Leshem and Rivka Bar-Hama
ELT J 62:257-265, 2008. First published 13 Apr 2007;
doi:10.1093/elt/ccm020
Evaluating teaching practice
Shosh Leshem and Rivka Bar-Hama
The evaluation of observed lessons has been the subject of much debate in the field
of teacher training. Teacher trainers have tried to define quality in relation to
teaching and to find ways to measure it in a reliable way. Can we evaluate the
quality of teaching by observable behaviour and measurable components, in
which case, can the lesson be assessed analytically by the use of discrete criteria?
Or, does a lesson constitute an entity, which cannot be broken into discrete
components so that it has to be assessed impressionistically? We believe that in
order to construct a more comprehensive view of the issue, it is pertinent to
collaborate with our trainees and provide some space for their voices. Evidence
from a small-scale practitioner-based research project reveals that trainees need
explicit criteria for effective teaching in order to identify their strengths and
weaknesses and use them as guidelines for improvement.
Introduction
This paper presents a three-year practitioner-based research project that
emerged from our ‘reflection in action’ and ‘reflection on action’ (Schön 1983) as teacher
trainers and lecturers in EFL pre-service training programmes in a teacher
education college. In the framework of the training programme, one of the
core requirements is the practicum, which is the application of the practical
pedagogical knowledge acquired during the didactic lessons and
workshops.
In the literature, the practicum has been viewed as critical to the
development of trainees. It is their first hands-on experience with their
chosen career. It creates opportunities for trainees to develop their
pedagogical skills and ‘it is the best way to acquire professional knowledge
and competences as a teacher’ (Hascher, Cocard, and Moser 2004: 626).
During the practicum trainees can put into practice their beliefs based on
language learning theories they acquired in the course of their studies. It
also serves as a ‘protected field for experimentation’ and ‘socialization
within the profession’ (Hascher et al. ibid.) and it allows for evaluation of
teachers. Thus it sets the stage for success or failure in student teaching and
a trainee’s future in education may be determined by what happens during
their training period.
These ideas have been mainly expressed by those who design the
programmes and are in charge of pre-service teacher training. Trainees, as
well, consider the practicum experience as the most significant element in
their teacher training (Zeichner 1990). Quite often trainees claim that they
benefit more from spending time in the field watching others teach, than
from attending sessions at the university or colleges. This assertion is
supported by Tsui (2003) in her discussion on teachers’ personal values and
beliefs. She claims that teachers consider classroom experience the most
important source of knowledge about teaching.
We found that there is a plethora of literature dealing with multiple aspects
of the practicum but a dearth in the field of practicum assessment. This is
surprising, given that assessing a trainee’s practicum is a complex activity
that entails multiple sources of
assessment. Each one of these sources provides information about
a different aspect of teaching. Furthermore, assessment of the trainees’
performances in their practicum has far-reaching implications for their
entry into our profession. In order to achieve a comprehensive profile of
a trainee we, in our programme, use different sources of assessment such
as: reflective journals, portfolios, observation lessons, tests, self-assessment,
peer assessment, cooperating teacher assessment, and pedagogical
counsellor assessment. However, the final grade for the practicum is based
primarily on the grades that trainees receive for their observation lessons.
For the purpose of this study, lesson observation is viewed as a lesson taught
by a trainee and observed by a pedagogical counsellor.
The observation lesson is a critical component of the practicum. How it is
assessed reflects an equally critical issue for both evaluators and evaluees.
This issue is the focus of our paper.
The venue
There are two teacher-training EFL programmes at our college: one is a four-year
programme, which awards the students both a BEd and a teaching
certificate, and the other is a two-year certificate programme for people
holding a BA in English. A significant part of both programmes is the
practicum. The practicum entails weekly observations of trainees in schools
by teacher trainers.
At the beginning of the academic year, a trainee is placed in a host school
with an experienced English teacher, who is appointed as a cooperating
teacher. The main requirement of the trainees in the practicum is to observe
their cooperating teachers teach in their classrooms and gradually to start
teaching on their own. This usually commences after a short period of
getting acquainted with the school. The trainees are assessed informally by
their cooperating teachers who serve more as mentors than as assessors.
The formal assessment is carried out at least twice a semester by
pedagogical counsellors who are usually their methodology teachers.
In our programme, observation has two main purposes: trainees’
development and accountability. Here, development means improvement
of trainees’ performance in class by identifying their strengths and
weaknesses and by raising their awareness through providing feedback and
recommendations. This process can be regarded as formative assessment,
since the focus is more on development and progress than on the final
product itself. The second purpose, which pertains to accountability, is to
determine the trainee’s suitability for entry to the educational system.
This in itself creates conflicting perspectives concerning observation and
role identity. The message that is conveyed to trainees during the practicum
is that it represents a trial and error phase which is integral to their learning
and professional development. This is intended to foster an element of
trust and openness in the trainee–observer relationship. However, this trust
can be impeded by the observer having to act as an inspector and final
assessor. Trainees may put on an act in order to satisfy the observer’s
expectations and gain a higher grade for their conduct. If this happens,
then they may sacrifice their own development and rapport with their
observer. These contradicting roles of the observer constitute potential
problems not only for the trainee but for the observer as well. The latter may
feel forced into a situation of assessor due to institutional policy or, at
times, national demands, when their preferred tendency is to function as
a coach rather than as an assessor.
Pedagogical counsellors use different observational tools to record data of
the lessons that they observe. The most common tools are:
1 observation forms;
2 detailed written notes on the lesson;
3 audio-recordings for reinforcement of written notes; and
4 video-recordings for use collaboratively by the trainer and the trainee
during the feedback session. They are sometimes used by the trainee at
a later stage for further reflection.
Our main tool of assessment is the observation form that consists of
several components. Examples for each component are provided to show
a model of what the forms entail:
- instructional components: clarity of instructions, sequence of activities,
and classroom management;
- affective components: giving feedback and reinforcement, awareness of
students’ needs;
- language components: use of L1, oral, and written proficiency;
- cognitive components: lesson planning, stating clear objectives, and
designing activities to achieve lesson objectives; and
- metacognitive components: ability to analyse the lesson and to reflect
upon their professional development.
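As a rough illustration of how such an analytic form aggregates item scores into a single lesson grade, the sketch below computes a weighted mean over the components listed above. The 0–100 scale, the component weights, and the sample scores are hypothetical, invented for illustration; the article does not specify the actual weighting used on the college’s forms.

```python
# Hypothetical sketch of analytic scoring on an observation form:
# each component receives a 0-100 score and a weight, and the lesson
# grade is the weighted mean of the scores. Weights below are invented.

def analytic_grade(scores, weights):
    """Weighted mean of per-component scores (dicts keyed by component)."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same components")
    total_weight = sum(weights.values())
    return sum(scores[c] * weights[c] for c in scores) / total_weight

# Component names follow the observation form described in the text.
weights = {
    "instructional": 0.30,
    "affective": 0.15,
    "language": 0.20,
    "cognitive": 0.25,
    "metacognitive": 0.10,
}

scores = {
    "instructional": 85,
    "affective": 90,
    "language": 75,
    "cognitive": 80,
    "metacognitive": 70,
}

print(round(analytic_grade(scores, weights), 1))  # prints 81.0
```

A gap of the kind the observers describe below arises when this computed figure differs from the impressionistic grade the observer formed while watching the lesson.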
We are both veteran teacher trainers, department coordinators, and have
been counsellors in a wide range of contexts. From our professional
experience we realized that the observation forms that were used for
assessment were changed from year to year both by us and by our
colleagues. Analysing minutes from three years of departmental meetings,
we noticed that the issue of the assessment forms appeared regularly on the
agenda as a theme requiring modification. Some items were changed due to
different approaches, beliefs, worldviews, or experiences of the teachers
teaching a particular group that year. However, the changes were not
significant and the essence of the evaluation forms has remained the same.
We then analysed our personal diaries where we had recorded comments
from trainees and our own queries and impressions. Common comments
from trainees expressed operational constraints due to a particular school
culture, methodological obligations to the cooperating teacher’s style of
teaching, and dissatisfaction with grades. This evidence made us ponder
upon the issue with our colleagues. We discovered that they shared our
discontent about the way that trainees’ performance was assessed during
the observation lesson. The feeling that prevailed among us was that, as
experienced observers and assessors, we were able to provide an
impressionistic value judgement of the trainee’s performance. However,
when we assessed the lesson according to the benchmarks on the
assessment form, we realized that quite often there was a gap between the
two results. Three of our colleagues who shared the same professional
experience expressed the ‘gap’ as follows:
Observer 1
While observing I already formulate a grade in my mind. I know that this
lesson does not deserve more than 80 percent, for example. At the end of the
lesson I go over the assessment form and grade each item according to
the weight allocated. If there are incongruities with my grading, I try
to narrow the gap.
Observer 2
I have enough experience to know immediately after the lesson what the
grade is going to be. I personally don’t really need the criteria and would
have preferred to ignore them. However, as I am required to provide
a detailed assessment record, I use it and I often get annoyed with the fact
that I can’t find the criteria that I would like to grade the student on, or I find
some of the criteria irrelevant to the context and to my frame of reference.
Observer 3
I have to admit that initially I determine the grade during observation or
immediately after that. When I use the assessment sheet, I find that the
grade is usually higher. I feel that I cannot take off all the points for a certain
criterion and this leads to an accumulated higher grade.
These views reinforced our problem in accepting the reliability of
assessment in the observation lesson. Taking into consideration the critical
role of the observation lesson in the practicum and in students’ professional
careers, we felt that it was our responsibility to try and assess our trainees in
a way that reflected their performance accurately, reliably, and transparently.
In addition, we realized that the voices of the trainees concerning this issue
were not considered and decisions on assessment were top-down. We
believed that in order to construct a more comprehensive view of the issue, it
was pertinent to collaborate with our trainees and provide some space for
their voices (Nunan and Bailey 1996). Moreover, new trends in current
assessment demand active student participation in their assessment. This is
reinforced by Shohamy (1996), who, in a discussion of ethical testing and
assessment, sees a need for students to participate actively in the construction
and use of tests and assessment systems.
Another problem is that despite each assessor having similar criteria against
which to assess the lesson, their interpretation of those criteria is not always
identical. Each lesson is assessed by three people: the cooperating teacher,
the pedagogical counsellor, and the trainees themselves. However, the
weight and the importance allotted by the college to the various assessors are
not evenly distributed. Each of the three assessors makes significant
contributions to the developmental process of the individual teacher.
In terms of the teacher’s assessment for the purpose of accountability, the
pedagogical counsellor undertakes most of the responsibility and has the
final say in grading the trainee while the others can only slightly affect the
grade. The observation lesson is considered a high-stakes test by the trainees
and at times puts them under the tremendous pressure of a major test. It
also entails conflicting decisions concerning whose theories to implement,
their pedagogical counsellor’s, their cooperating teacher’s, or their own.
This led us to investigate the following issues:
1 To what extent are we actually assessing quality of teaching through
observation?
2 What are the perceptions of our trainees regarding the way of
assessment?
Exploring the literature
While surveying the literature we found unsettled perspectives on issues
that underpin our questions. There is a general consensus about the
importance of observation in the development and assessment of a teacher.
This notion is also supported by O’Leary (2004: 14) who claims that
‘Traditionally, classroom observation has occupied a prominent role in
terms of its use as a tool by which to judge and subsequently promote good
practice’. He also advocates a holistic way of assessing. He contends that
‘although it would be naïve to discount classroom observation per se as
a useful learning tool for teacher development . . . the existing assessment
approach contains a number of inadequacies that directly conflict with the
fundamental aims of genuine teacher development’. One of his objections
is to the assessment of a teacher’s ability by using a checklist of subjective
criteria. He supports his contention by claiming that:
1 a lesson is a complete entity and cannot be dissected into separate parts;
2 criteria for effective teaching differ for every instructional situation;
3 checklists measure low inference skills and these are limited because
they tell us very little about teacher behaviour and the learning process
itself;
4 effective teaching manifests itself in high inference skills, which are
fundamentally qualitative;
5 adopting a quantitative approach is discouraging and undermining to
teachers.
Voices contradicting this approach maintain that observations tend to be
subjective, based on the observer’s own teaching approach. To attain
objectivity it is argued that we have to develop systematic observation tools.
Acheson and Gall (1997) note that students feel threatened when
they are unaware of the criteria by which they will be judged; clearly defined
criteria should therefore be provided to lower students’ anxiety.
In the same vein, Brooker, Muller, Mylonas, and Hansford (1998) claim that
an increased demand for quality and accountability in teacher education
programmes requires a criterion-based standard reference framework
for assessment.
Leung and Lewkowicz (2006: 27) highlight the point of subjective
interpretation and contend that due to the fact that ‘teachers can interpret
assessment criteria differently, the idea that teachers should observe what
learners say and do, interpret their work, and then provide guidance for
improvement is an uncertain business’. Moreover, they claim that ‘teachers’
judgements are influenced by wider social and community practices and
values’ and therefore might lead to different perspectives. As we consider
the observation lesson to be a performance test, we found McNamara’s
(1995) point relevant to our argument even though he does not refer to
observation lessons. His assertion is that performance tests that strive to be
highly authentic are often extremely complex due to the extraneous social
influences on the grade awarded.
We also realized that there is much concern about the reliability of
examination scores as determinants of teaching qualifications. Alderson
(1991: 12) refers to the fact that ‘we know little about how to measure the
progress of learners . . . and that we lack sensitive measures’. Broadfoot
(2005: 127) goes even further, claiming that ‘we use
what are a very blunt set of instruments to intervene in the highly sensitive
and complex business of learning’.
As a result of these diverse views, going to the literature was a journey of
mixed blessings. It supported our sense of discomfort and it became
apparent to us that our problem warranted attention.
The study
Data were retrieved from questionnaires, interviews, personal diaries, and
documents that included minutes from meetings and assessment forms.
A questionnaire was designed to explore the preferences that students had
towards how they might be assessed. We drew upon our involvement with
the assessment process to draft a ‘simple survey’ with two closed questions
and one open-ended question. To aid completion, the choices that were
provided reflected the issues that trainees had mentioned to us regularly.
The three questions were:
1 How would you like your pedagogical counsellor to assess your
observation lesson? By giving you a fail/pass or a numerical grade?
2 If you chose a numerical grade, how would you like to be assessed:
analytically or holistically?
3 Which items on the observation form would you omit and which would
you like to add?
We explained to each group that the term ‘holistically’ implied assessing
impressionistically by looking at the lesson as a whole, and that ‘analytically’
implied using set criteria to assess numerically each aspect of the lesson.
The questionnaire was distributed to trainees of two TEFL courses at
a teacher training college. The timing of this corresponded with the end of
the academic year when trainees had already finished their practice teaching
duties.
The interviews with twenty trainees were conducted after the
questionnaires were read and analysed. We concluded from this analysis
that it was important to gain a wider set of trainee perspectives and achieve
a richer picture of the trainees’ reasoning. Thus, we discussed the general
responses that had been provided to the questionnaire with twenty
randomly chosen trainees.
Population
The study was undertaken with 58 trainees studying on two different
programmes:
1 A four-year Bachelor of Education programme in an English department
of an academic teacher training college in Israel. The subjects were
trainees from the second and third years. Trainees of this group pursue
a study programme, which certifies them to teach both general subjects
in the trainees’ mother tongue (Hebrew or Arabic) and English as a
second/foreign language.
2 Second year trainees on a two-year retraining programme. Trainees in
this group hold a BA degree and study for a teaching certificate in English.
These trainees are usually older than those on the BEd programme.
The subjects constituted three groups:
1 second year trainees on the BEd programme,
2 third year trainees on the BEd programme, and
3 second year trainees on the retraining programme.
Findings
In these findings, none of the groups wanted a verbal grade of ‘pass’ or ‘fail’.
All three groups preferred a numerical grade. Two groups (1 and 3) favoured
holistic assessment for different reasons. Group 3 preferred this form of
assessment as they felt they did not need the criteria to analyse the lesson.
They claimed to be competent enough to analyse their lesson and reflect
upon it independently without specific criteria. By that time in their training
they were much more confident in their teaching and assessment.
Group 1 chose the holistic approach for the opposite reasons. They justified
their choice by lack of confidence and fear. They felt intimidated by the use
of clear-cut criteria to analyse their lesson. They actually preferred the
unspecified nature of the holistic approach to a lesson being dissected by
specific teacher behaviours.
Group 2 chose the analytical approach. They explained that they saw the
function of the criteria as guidelines to help them focus and construct better
lessons. They claimed that the criteria helped them identify weaknesses and
strengths and thus contributed to their pedagogical knowledge and their
professional development. In terms of assessment, they felt that this
approach was more reliable since assessing according to set criteria is more
objective.
Evidence from the interview showed how trainees’ voices reflected their
choices:
In favour of specified criteria on the observation forms
The items on the form helped me remember what was discussed when
I had to write a reflective journal on my lesson. I find them very useful.
They were like post signs for me. The whole form is like an outline for a
lesson plan.
It gives me a clear picture of what was good and what needs to be worked on.
It really gives you a picture where you are and what to focus on next time.
The criteria help you see the process. I can compare the form of my first
observation and the second one and know exactly where I improved.
It gives you a fairer picture of the evaluation. I do not like vagueness. I have
to see how many points have been taken off or given for each item.
Not in favour of specified criteria on the observation forms
There are too many details to process. I can’t focus on all the items. It
confuses me. I would rather focus on one or two features of the lesson.
The criteria should be more general and not so detailed. It is too technical
and robot like. I feel as if my lesson has been put under a microscopic lens
and it does not really depict the dynamics of the lesson.
The following were some of the suggestions from the open-ended question:
1 Acknowledgement within the items of originality and risk taking.
2 Credit for preparing extra time activities in their lesson plan.
3 Evidence of improvement from previous observations.
4 Awareness of the teacher’s action zone.

Insights and conclusions
Teaching is a web of interrelated dimensions. Some are clearly observable
and others are not. As a consequence, the assessment of teaching quality
through observation entails an internal paradox. This paradox encapsulates
our initial urge to re-examine our own practice. Our research questions
related to the extent to which quality of teaching is assessed through
criteria-based observation and we found that our students felt that it was
a valid method of assessment.
Although all trainees voted for numerical assessment, there were
differences between trainees in the choice between holistic or analytical
approaches, with the majority choosing the holistic approach. The fact that
none of our subjects chose the ‘fail’ or ‘pass’ as evaluation criteria accords
with Kennedy’s assertion that trainees prefer to receive a numerical
grade for the observed lesson (Kennedy 1993). This may be a result of
conditioning, of trainees’ upbringing, and the constraints of social demands
and norms. However, a numerical grade on its own did not seem to be
satisfactory, as it did not provide explicit feedback on their performance. The
trainees who were in favour of the holistic approach needed the stated
criteria on the assessment form to aid discussion during feedback sessions
and to provide signposts for further reflection. Yet, they did not want to be
assessed analytically where each criterion was allotted numerical points, in
spite of this approach enhancing reliability and transparency.
Our small-scale investigation demonstrated that trainees at their initial
stages of teaching perceive the lesson as separate parts and not as a whole
entity. The sum of the parts represents quality of teaching. Trainees need
explicit criteria for effective teaching in order to identify the quality of their
teaching. Their preferences for assessment show that they regard the
observation lesson as both a test and a means for reflection and professional
development.
These conclusions are situated in the limited context of just one practicum
experience, thus they cannot have wide implications. However, as teachers
researching our own field of practice, we gained deeper understanding
and insights into a troublesome issue. Our findings represent insights of
an exploratory nature and they support the claim that quality and
accountability should be achieved through explicit and objective criteria.
Final version received October 2006
References
Acheson, K. A. and M. D. Gall. 1997. ‘Techniques in
clinical supervision of teacher’ in E. Pajak.
Approaches to Clinical Supervision: Alternatives for
Improving Instruction. Norwood: Christopher-Gordon Publishers, Inc.
Alderson, C. 1991. ‘Language testing in the 1990’s:
how far have we come? How much further have we
to go?’ in S. Anivan (ed.). Current Developments in
Language Testing. Singapore: SEAMEO Regional
Language Center.
Broadfoot, P. 2005. ‘Dark alleys and blind bends:
testing the language of learning’. Language Testing
22: 123–41.
Brooker, R., R. Muller, A. Mylonas, and B. Hansford.
1998. ‘Improving the assessment of practice
teaching: a criteria and standards framework’.
Assessment and Evaluation in Higher Education 23/1:
5–25.
Hascher, T., Y. Cocard, and P. Moser. 2004. ‘Forget
about theory—practice is all? Student teachers’
learning in practicum’. Teachers and Teaching: Theory
and Practice 10/6: 623–37.
Kennedy, J. 1993. ‘Meeting the needs of the teacher
trainees on teaching practice’. ELT Journal 47/2:
157–65.
Leung, C. and J. Lewkowicz. 2006. ‘Expanding
horizons and unresolved conundrums: language
testing and assessment’. TESOL Quarterly 40/1:
211–34.
McNamara, T. 1995. ‘Modelling performance:
opening Pandora’s box’. Applied Linguistics 16/2:
150–79.
Nunan, D. and K. M. Bailey (eds.). 1996. Voices from
the Language Classroom. Cambridge: Cambridge
University Press.
O’Leary, M. 2004. ‘Inspecting the observation
process: classroom observations under the
spotlight’. IATEFL Teacher Development SIG 1/4:
14–16.
Schön, D. 1983. The Reflective Practitioner.
San Francisco: Jossey-Bass.
Shohamy, E. 1996. ‘Language testing: matching
assessment procedures with language knowledge’
in M. Birenbaum and F. J. R. C. Dopchy (eds.).
Alternatives in Assessment of Achievements, Learning
Processes, and Prior Knowledge. Dordrecht,
Netherlands: Kluwer Academic.
Tsui, B. M. 2003. Understanding Expertise in Teaching:
Case Studies of Second Language Teachers. Cambridge:
Cambridge University Press.
Zeichner, K. M. 1990. ‘Changing directions in the
practicum: looking ahead to the 1990’s’. Journal of
Education for Teaching 16/2: 105–32.
The authors
Shosh Leshem is involved in teaching and teacher
education in Israel. Her publications are in the area
of teacher training and language teaching
methodology. She is currently teaching at Haifa
University and Oranim, Academic School of
Education. She is also a visiting lecturer at Anglia
Ruskin University in the UK, focusing on doctoral
processes from an ethnographic perspective.
Email: shosh-l@zahav.net.il
Rivka Bar-Hama is involved in teaching and teacher
education in Israel. Her publications are in the area
of teaching English as a foreign language and
teacher training and focus on testing and
assessment. She taught at Haifa University and is
currently head of the English Department at Gordon
Academic College of Education.
Email: rivkab@macam.ac.il