In a recent article, I described why integrative testing is a better way of testing language competence than discrete-point testing. An integrative test draws on a variety of sources. Syntax, vocabulary, "schema," cultural awareness, reading skills, pronunciation and grammar are all factors the test-maker and test-taker need to keep in mind. The integrative test is generally considered to be a more reliable instrument for measuring language competence.

In a follow-up article, I offered a variety of possibilities for testing. Let's examine these tests even further.

Special considerations

In schools where ESL students are mainstreamed, allowing the students to take the test orally may be helpful so the teacher can find out their proficiency in the subject matter and not just language. Certain keywords in a word problem may be new to the L2 students. In that case one can substitute synonyms without compromising the test.

For example, a science test may use "vital" and the instructor can substitute "important" or "necessary." The most successful tests combine "global" testing with discrete-point testing in order to achieve a balance between ease of administration and reliability. The main point is for the test-maker to provide interesting material that tests the students' cognitive ability rather than just their skills. When writing multiple-choice items, keep the following in mind:

  • The options should not overlap.
  • The distractors should have some link to the text but should not overlap with the key or with one another. If the distractors had no link to the text, the test-takers could simply eliminate them. The distractors must be plausible.
  • Sometimes a test can take precedence over language learning.

However, the mandatory nature of the test leads to an unhealthy obsession with the TOEFL among international students, and the TOEFL is of limited use. It tests how fast you can read some grammar or some vocabulary, but it doesn't help you with the real problems you have to face once you've passed it.

Student criticism

One student describes standardized testing as a hindrance to education and a reversal of educational development. He feels such tests stifle creativity: "These tests kill incentive to think outside the box due to the fact that we are forced to think in a “standard” rational manner."

Standardized testing may also shift curriculum goals toward higher test scores: "It seems that we have shifted the overall focus of free thinking, and individual development, to a more narrow-minded form of teaching."

Types of tests

1. Language aptitude tests measure the potential for second-language learning. They require the potential language-learner to solve problems in an artificial language.

2. Achievement tests are diagnostic tests that measure the learners' progress at any given time. Normally, they are given weekly over a clearly defined set of material. Most are used mainly in academic settings for grades and to provide motivation.

3. A pure diagnostic test is one not used for grading but rather for finding out specific points of weakness in the language learners.

4. Proficiency tests are based on performance and measure ability to use language in a realistic situation. Rather than just testing knowledge of grammar or ability to understand spoken language, a proficiency test will test skill by requiring the test-taker to perform functions (making a hotel reservation, for example) in the target language. The Oral Proficiency Interview developed by the Foreign Service Institute and adapted by ACTFL (the American Council on the Teaching of Foreign Languages) is an example of a proficiency test.

5. A placement test is used to assign students to specific levels of language instruction. The placement test is a norm-referenced test, meaning it tests overall English proficiency, academic listening ability, reading comprehension and structure, and each score is interpreted relative to the scores of all other students who took the test.

6. A criterion-referenced test, on the other hand, measures well-defined and specific instructional objectives that are part of the curriculum (the past perfect, for example).
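
The difference between the last two types comes down to how a raw score is read. The short Python sketch below is only an illustration: the function names, the 80 percent mastery cutoff and the group scores are all invented for the example and do not come from any particular placement test.

```python
def percentile_rank(score, all_scores):
    """Norm-referenced reading: percentage of test-takers who scored below `score`."""
    below = sum(1 for s in all_scores if s < score)
    return 100.0 * below / len(all_scores)


def meets_criterion(items_correct, items_total, cutoff=0.8):
    """Criterion-referenced reading: has the learner mastered the objective?

    The 0.8 cutoff is an assumed value; a real course would set its own.
    """
    return items_correct / items_total >= cutoff


# Invented scores for a group of eight students.
group_scores = [42, 55, 61, 61, 70, 78, 83, 90]

# Norm-referenced: a 70 only has meaning relative to the rest of the group.
print(percentile_rank(70, group_scores))

# Criterion-referenced: 7 of 10 past-perfect items correct, judged against the cutoff.
print(meets_criterion(7, 10))
```

With these made-up numbers, the same student could sit in the middle of the group on the placement test and still fall short of the mastery cutoff on the past-perfect items. The two readings answer different questions.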

Validity and reliability

A test is more than a set of exercises or problems. Any test needs to be valid; that is, it should test what it is supposed to. In other words, it must be relevant to the goals of the course and measure what it claims to measure. A test on writing would not be given to a conversation class, for example.

The test can measure linguistic ability, communicative competence or both factors. A test also must be reliable in that it should be precise, and it should always give the same results. A test that can be passed by good guessers would not be reliable because the results will not be consistent and will not be related to ability in the target language.
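
To see how much room a test leaves for good guessers, one can work out the chance of reaching the pass mark by blind guessing alone. The following Python sketch assumes a test made up entirely of multiple-choice items with equally likely options; the function name and the sample numbers (10 items, a pass mark of 6) are hypothetical.

```python
from math import comb


def chance_of_passing(n_items, n_options, pass_mark):
    """Probability of getting at least `pass_mark` items right by pure guessing.

    Assumes every item has `n_options` equally likely choices and that
    guesses are independent (a simple binomial model).
    """
    p = 1 / n_options
    return sum(
        comb(n_items, k) * p**k * (1 - p) ** (n_items - k)
        for k in range(pass_mark, n_items + 1)
    )


# Ten true/false items versus ten four-option items, pass mark of 6.
print(chance_of_passing(10, 2, 6))
print(chance_of_passing(10, 4, 6))
```

Under these assumptions, a guesser passes the true/false version roughly 38 percent of the time but the four-option version only about 2 percent of the time, which is one reason plausible distractors matter for reliability.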

Tests can also be realistic in that they approach a measure of authentic language use, or they may be completely abstract in that the task at hand has no apparent function. An abstract test question asks for a mechanical task: Change "It is cold." to the past tense, for example.

Abstract test questions treat language not as a tool for communication but as an abstract entity that can be changed by certain rules.

Whatever the type of question, overlapping options weaken an item. Here is an example of an overlap from a reading test where the passage was about a drought:

  • a. The region suffered severe drought. (Key)
  • b. The climatic conditions were harsh.

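This is a logical overlap because the key is encompassed by the distractor: a severe drought is itself a harsh climatic condition, so a test-taker who chooses option b is not strictly wrong.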

Better examples

We usually _____ dinner at that restaurant.

  • a. have eaten (wrong)
  • b. drink (wrong meaning)
  • c. eat (right)
  • d. take (not the best, wrong idiom for English)

Our building has 10 stories; my apartment is on the _____.

  • a. ten (wrong form)
  • b. last (not the best)
  • c. tenth (right)
  • d. twelfth (right form, semantically wrong)

Finally, when using discrete-point tests, the test-giver needs to avoid a few common pitfalls. An item should not run over onto the next page, lest the learners lose their train of thought. Blanks for answers should be long enough so students will not have to write all over the paper or on the back.

Copies should be clear, especially if the learners' native language has a different alphabet. The numbering of the items and options should be checked to be sure that none are missing and that no numbers are duplicated (two No. 8's, for example). It should be made clear to the students that if two answers appear correct, then they must choose the better answer.