Friday, May 8, 2020

SUMMARY ASSESSING READING AND WRITING

ASSESSING READING

In foreign language learning, reading is likewise a skill that teachers simply expect learners to acquire. Basic, beginning-level textbooks in a foreign language presuppose a student's reading ability, if only because a book is the medium. Most formal tests use the written word as a stimulus for test-taker response; even oral interviews may require reading performance for certain tasks. For learners of English, two primary hurdles must be cleared in order to become efficient readers. First, they need to be able to master fundamental bottom-up strategies for processing separate letters, words, and phrases, as well as top-down, conceptually driven strategies for comprehension. Second, as part of that top-down approach, second language readers must develop appropriate content and formal schemata (background information and cultural experience) to carry out those interpretations effectively.
The assessment of reading ability does not end with the measurement of comprehension. Strategic pathways to full understanding are often important factors to include in assessing learners, especially in the case of most classroom assessments, which are formative in nature. As we consider a number of different types or genres of written texts, the components of reading ability, and specific tasks that are commonly used in the assessment of reading, let's not forget the unobservable nature of reading. As with listening, one cannot see the process of reading, nor can one observe a specific product of reading.

TYPES (GENRES) OF READING

Each type or genre of written text has its own set of governing rules and conventions. A reader must be able to anticipate those conventions in order to process meaning efficiently. Efficient readers also have to know what their purpose is in reading a text, the strategies for accomplishing that purpose, and how to retain the information. The content validity of an assessment procedure is largely established through the genre of a text. For example, if learners in a program of English for tourism have been learning how to deal with customers needing to arrange bus tours, then assessments of their ability should include guidebooks, maps, transportation schedules, calendars, and other relevant texts.

MICROSKILLS, MACROSKILLS, AND STRATEGIES FOR READING

The micro- and macroskills below represent the spectrum of possibilities for objectives in the assessment of reading comprehension.

Micro- and macroskills for reading comprehension

Microskills

  • Discriminate among the distinctive graphemes and orthographic patterns of English.
  • Retain chunks of language of different lengths in short-term memory.
  • Process writing at an efficient rate of speed to suit the purpose.
  • Recognize a core of words, and interpret word order patterns and their significance.
  • Recognize grammatical word classes (nouns, verbs, etc.), systems (e.g., tense, agreement, pluralization), patterns, rules, and elliptical forms.
  • Recognize that a particular meaning may be expressed in different grammatical forms.
  • Recognize cohesive devices in written discourse and their role in signaling the relationship between and among clauses.

Macroskills
  • Recognize the rhetorical forms of written discourse and their significance for interpretation.
  • Recognize the communicative functions of written texts, according to form and purpose.
  • Infer context that is not explicit by using background knowledge.
  • From described events, ideas, etc., infer links and connections between events, deduce causes and effects, and detect such relations as main idea, supporting idea, new information, given information, generalization, and exemplification.
  • Distinguish between literal and implied meanings.
  • Detect culturally specific references and interpret them in a context of the appropriate cultural schemata.
  • Develop and use a battery of reading strategies, such as scanning and skimming, detecting discourse markers, guessing the meaning of words from context, and activating schemata for the interpretation of texts.



TYPES OF READING

For purposes of assessment, several types of reading performance are typically identified, and these will serve as organizers of various assessment tasks.

  • Perceptive. In keeping with the set of categories specified for listening comprehension, similar specifications are offered here, except with some differing terminology to capture the uniqueness of reading.
  • Selective. This category is largely an artifact of assessment formats. In order to ascertain one's reading recognition of lexical, grammatical, or discourse features of language within a very short stretch of language, certain typical tasks are used, such as picture-cued tasks, matching, true/false, and multiple-choice formats.
  •  Interactive. Included among interactive reading types are stretches of language of several paragraphs to one page or more in which the reader must, in a psycholinguistic sense, interact with the text.
  • Extensive. Extensive reading, as discussed in this book, applies to texts of more than a page, up to and including professional articles, essays, technical reports, short stories, and books. (It should be noted that reading research commonly refers to "extensive reading" as longer stretches of discourse, such as long articles and books that are usually read outside a classroom hour.)

DESIGNING ASSESSMENT TASKS: PERCEPTIVE READING

Reading Aloud

The test-taker sees separate letters, words, and/or short sentences and reads them aloud, one by one, in the presence of an administrator. Since the assessment is of reading comprehension, any recognizable oral approximation of the target response is considered correct.

Written Response

The same stimuli are presented, and the test-taker's task is to reproduce the probe in writing. Because of the transfer across different skills here, evaluation of the test-taker's response must be treated carefully. If an error occurs, make sure you determine its source: what might be assumed to be a writing error, for example, may actually be a reading error, and vice versa.

Multiple-Choice

Multiple-choice responses are not only a matter of choosing one of four or five possible answers. Other formats, some of which are especially useful at the low levels of reading, include same/different, circle the answer, true/false, choose the letter, and matching.

Picture-Cued Items

Test-takers are shown a picture, such as the one on the next page, along with a written text and are given one of a number of possible tasks to perform.

DESIGNING ASSESSMENT TASKS: SELECTIVE READING

Multiple-Choice (for Form-Focused Criteria)
By far the most popular method of testing a reading knowledge of vocabulary and grammar is the multiple-choice format, mainly for reasons of practicality: it is easy to administer and can be scored quickly.

Matching Tasks

The most frequently appearing criterion in matching procedures is vocabulary. Matching tasks have the advantage of offering an alternative to traditional multiple-choice or fill-in-the-blank formats and are sometimes easier to construct than multiple-choice items, as long as the test designer has chosen the matches carefully. Some disadvantages do come with this format, however. Matching tasks can become more of a puzzle-solving process than a genuine test of comprehension as test-takers struggle with the search for a match, possibly among 10 or 20 different items. Like other tasks in this section, they are also contrived exercises, endemic to academia, that are seldom found in the real world.

Editing Tasks

Editing for grammatical or rhetorical errors is a widely used test method for assessing linguistic competence in reading. The TOEFL® and many other tests employ this technique with the argument that it not only focuses on grammar but also introduces a simulation of the authentic task of editing, or discerning errors in written passages. Its authenticity may be supported if you consider proofreading as a real-world skill that is being tested.

Picture-Cued Tasks
Several types of picture-cued methods are commonly used.
  • Test-takers read a sentence or passage and choose one of four pictures that is being described. The sentence (or sentences) at this level is more complex.
  • Test-takers read a series of sentences or definitions, each describing a labeled part of a picture or diagram. Their task is to identify each labeled item.


Gap-Filling Tasks
 
The obvious disadvantage of this type of task is its questionable assessment of reading ability. The task requires both reading and writing performance, thereby rendering it of low validity in isolating reading as the sole criterion. Another drawback is scoring the variety of creative responses that are likely to appear. You will have to make a number of judgment calls on what constitutes a correct response.

DESIGNING ASSESSMENT TASKS: INTERACTIVE READING

Cloze Tasks

One of the most popular types of reading assessment task is the cloze procedure.
The word cloze was coined by educational psychologists to capture the Gestalt psychological concept of "closure," that is, the ability to fill in gaps in an incomplete image (visual, auditory, or cognitive) and supply, from background schemata, omitted details. Cloze tests are usually a minimum of two paragraphs in length in order to account for discourse expectancies. They can be constructed relatively easily as long as the specifications for choosing deletions and for scoring are clearly defined. Typically every seventh word (plus or minus two) is deleted (known as fixed-ratio deletion), but many cloze test designers instead use a rational deletion procedure, choosing deletions according to the grammatical or discourse functions of the words. Rational deletion also allows the designer to avoid deleting words that would be difficult to predict from the context.
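As a rough illustration of fixed-ratio deletion, the short Python sketch below (not part of Brown's text) blanks out every seventh word of a passage and keeps an answer key; the passage and function name are invented for the example, and a real cloze test would also leave the opening sentence intact and handle punctuation more carefully.

```python
# A minimal sketch of fixed-ratio cloze deletion (assumed ratio: every 7th word).
def make_cloze(text, n=7):
    """Replace every nth word with a numbered blank; return gapped text and key."""
    words = text.split()
    key = []
    for i in range(n - 1, len(words), n):
        key.append(words[i])
        words[i] = f"({len(key)}) ______"
    return " ".join(words), key

passage = ("The sunset over the bay was gorgeous last night, and everyone "
           "stood quietly on the shore to watch the colors slowly fade.")
gapped, answers = make_cloze(passage)
print(gapped)   # the text with numbered blanks
print(answers)  # the deleted words, i.e., the exact-word answer key
```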
Two approaches to the scoring of cloze tests are commonly used. The exact word method gives credit to test-takers only if they insert the exact word that was originally deleted. The second method, appropriate word scoring, credits the test-taker for supplying any word that is grammatically correct and that makes good sense in the context. In a sentence describing a "gorgeous sunset," for example, test-takers would get credit for supplying beautiful, amazing, or spectacular. The choice between the two methods of scoring is one of practicality/reliability vs. face validity.
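The contrast between the two scoring methods can also be sketched in code; the answer key and the sets of acceptable alternatives below are hypothetical examples that a test designer or rater would supply.

```python
# Sketch of exact-word vs. appropriate-word cloze scoring (illustrative data only).
def exact_word_score(responses, key):
    """One point only when the response matches the originally deleted word."""
    return sum(r == k for r, k in zip(responses, key))

def appropriate_word_score(responses, acceptable):
    """One point for any response the rater has judged grammatical and sensible."""
    return sum(r in ok for r, ok in zip(responses, acceptable))

key = ["gorgeous", "on", "slowly"]
acceptable = [{"gorgeous", "beautiful", "amazing", "spectacular"},
              {"on", "along", "near"},
              {"slowly", "gradually"}]
responses = ["spectacular", "on", "gradually"]

print(exact_word_score(responses, key))               # 1
print(appropriate_word_score(responses, acceptable))  # 3
```

Exact-word scoring is easier to apply reliably; appropriate-word scoring demands rater judgment but looks more like genuine comprehension, which is the practicality/reliability vs. face validity trade-off noted above.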

Impromptu Reading Plus Comprehension Questions

Notice that this set of questions, based on a 250-word passage, covers the comprehension of these features:

  • main idea (topic)
  • expressions/idioms/phrases in context
  • inference (implied detail)
  • grammatical features
  • detail (scanning for a specifically stated detail)
  • excluding facts not written (unstated details)
  • supporting idea(s)
  • vocabulary in context

To construct your own assessments that involve short reading passages followed by questions, you can begin with TOEFL-like specifications as a basis. Your focus in your own classroom will determine which of these (and possibly other) specifications you will include in your assessment procedure, how you will frame questions, and how much weight you will give each item in scoring.

Short-Answer Tasks

A reading passage is presented, and the test-taker reads questions that must be answered in a sentence or two. Questions might cover the same specifications indicated above for the TOEFL reading, but be worded in question form. For example, in a passage on the future of airline travel, the following questions might appear:

Open-ended reading comprehension questions
  • What do you think the main idea of this passage is?
  • What would you infer from the passage about the future of air travel?
  • In line 6 the word sensation is used. From the context, what do you think this word means?
  • What two ideas did the writer suggest for increasing airline business?
  • Why do you think the airlines have recently experienced a decline?

Editing (Longer Texts)

Several advantages are gained in the longer format. First, authenticity is increased. The likelihood that students in English classrooms will read connected prose of a page or two is greater than the likelihood of their encountering the contrived format of unconnected sentences. Second, the task simulates proofreading one's own essay, where it is imperative to find and correct errors. And third, if the test is connected to a specific curriculum (such as placement into one of several writing courses), the test designer can draw up specifications for a number of grammatical and rhetorical categories that match the content of the courses. Content validity is thereby supported, and along with it the face validity of a task in which students are willing to invest. Imao (2001, p. 185) was able to offer teachers a computer-generated breakdown of performance in the following categories:
  • Sentence structure
  • Verb tense
  • Noun/article features
  • Modal auxiliaries
  • Verb complements
  • Noun clauses
  • Adverb clauses
  • Conditionals
  • Logical connectors
  • Adjective clauses (including relative clauses)
  • Passives

Scanning

Scanning is a strategy used by all readers to find relevant information in a text. Assessment of scanning is carried out by presenting test-takers with a text (prose or something in a chart or graph format) and requiring rapid identification of relevant bits of information. Possible stimuli include

  • a one- to two-page news article,
  • an essay,
  • a chapter in a textbook,
  • a technical report,
  • a table or chart depicting some research findings,
  • a menu, and
  • an application form.

Among the variety of scanning objectives (for each of the genres named above), the test-taker must locate

  • a date, name, or place in an article;
  • the setting for a narrative or story;
  • the principal divisions of a chapter;
  • the principal research finding in a technical report;
  • a result reported in a specified cell in a table;
  • the cost of an item on a menu; and
  • specified data needed to fill out an application.

Ordering Tasks

Students always enjoy the activity of receiving little strips of paper, each with a sentence on it, and assembling them into a story, sometimes called the "strip story" technique. Variations on this can serve as an assessment of overall global understanding of a story and of the cohesive devices that signal the order of events or ideas. Alderson et al. (1995, p. 53) warn, however, against assuming that there is only one logical order.

Information Transfer: Reading Charts, Maps, Graphs, Diagrams

Every educated person must be able to comprehend charts, maps, graphs, calendars, diagrams, and the like. Converting such nonverbal input into comprehensible intake requires not only an understanding of the graphic and verbal conventions of the medium but also a linguistic ability to interpret that information for someone else. Reading a map implies understanding the conventions of map graphics, but it is often accompanied by telling someone where to turn, how far to go, etc. Scanning a menu requires an ability to understand the structure of most menus as well as the capacity to give an order when the time comes. Interpreting the numbers on a stock market report involves the interaction of understanding the numbers and of conveying that understanding to others. To comprehend information in this medium (hereafter referred to simply as "graphics"), learners must be able to

  • comprehend specific conventions of the various types of graphics;
  • comprehend labels, headings, numbers, and symbols;
  • comprehend the possible relationships among elements of the graphic; and
  • make inferences that are not presented overtly.

DESIGNING ASSESSMENT TASKS: EXTENSIVE READING

Extensive reading involves somewhat longer texts than we have been dealing with up to this point. Journal articles, technical reports, longer essays, short stories, and books fall into this category. The reason for placing such reading into a separate category is that reading of this type of discourse almost always involves a focus on meaning using mostly top-down processing, with only occasional use of a targeted bottom-up strategy. Before examining a few tasks that have proved to be useful in assessing extensive reading, it is essential to note that a number of the tasks described in previous categories can apply here. Among them are

  • impromptu reading plus comprehension questions,
  • short-answer tasks,
  • editing,
  • scanning,
  • ordering,
  • information transfer, and
  • interpretation (discussed under graphics).


Skimming Tasks

Skimming is the process of rapid coverage of reading matter to determine its gist or main idea. It is a prediction strategy used to give a reader a sense of the topic and purpose of a text, the organization of the text, the perspective or point of view of the writer, its ease or difficulty, and/or its usefulness to the reader.

Summarizing and Responding

Evaluating summaries is difficult. Imao (2001) used four criteria for the evaluation of a summary:
Criteria for assessing a summary (Imao, 2001, p. 184)
  • Expresses accurately the main idea and supporting ideas.
  • Is written in the student's own words; occasional vocabulary from the original text is acceptable.
  • Is logically organized.
  • Displays facility in the use of language to clearly express ideas in the text.

As you can readily see, a strict adherence to the criterion of assessing reading, and reading only, implies consideration of only the first factor; the other three pertain to writing performance. The first criterion is nevertheless a crucial factor; otherwise the reader-writer could pass all three of the other criteria with virtually no understanding of the text itself.

Note-Taking and Outlining

Finally, a reader's comprehension of extensive texts may be assessed through an evaluation of a process of note-taking and/or outlining. Because of the difficulty of controlling the conditions and time frame for both these techniques, they rest firmly in the category of informal assessment. Their utility is in the strategic training that learners gain in retaining information through marginal notes that highlight key information or organizational outlines that put supporting ideas into a visually manageable framework. A teacher, perhaps in one-on-one conferences with students, can use student notes/outlines as indicators of the presence or absence of effective reading strategies, and thereby point the learners in positive directions.

In his introduction to Alderson's (2000, p. xx) book on assessing reading, Lyle Bachman observed: "Reading, through which we can access worlds of ideas and feelings, as well as the knowledge of the ages and visions of the future, is at once the most extensively researched and the most enigmatic of the so-called language skills." It's the almost mysterious "psycholinguistic guessing game" (Goodman, 1970) of reading that poses the enigma. We still have much to learn about how people learn to read, and especially about how the brain accesses, stores, and recalls visually represented language. This chapter has illustrated a number of possibilities for assessment of reading across the continuum of skills, from basic letter/word recognition to the retention of meaning extracted from vast quantities of linguistic symbols. I hope it will spur you to go beyond the confines of these suggestions and create your own methods of assessing reading.

ASSESSING WRITING

In the field of second language teaching, only a half-century ago experts were saying that writing was primarily a convention for recording speech and for reinforcing grammatical and lexical features of language. Now we understand the uniqueness of writing as a skill with its own features and conventions. We also fully understand the difficulty of learning to write well in any language, even in our own native language. Every educated child in developed countries learns the rudiments of writing in his or her native language, but very few learn to express themselves clearly with logical, well-developed organization that accomplishes an intended purpose. And yet we expect second language learners to write coherent essays with artfully chosen rhetorical and discourse devices.

TYPES  OF WRITING PERFORMANCE

Four categories of written performance that capture the range of written production are considered here. Each category resembles the categories defined for the other three skills, but these categories, as always, reflect the uniqueness of the skill area.
  • Imitative. To produce written language, the learner must attain skills in the fundamental, basic tasks of writing letters, words, punctuation, and very brief sentences.
  • Intensive (controlled). Beyond the fundamentals of imitative writing are skills in producing appropriate vocabulary within a context, collocations and idioms, and correct grammatical features up to the length of a sentence.
  • Responsive. Here, assessment tasks require learners to perform at a limited discourse level, connecting sentences into a paragraph and creating a logically connected sequence of two or three paragraphs.
  • Extensive. Extensive writing implies successful management of all the processes and strategies of writing for all purposes, up to the length of an essay, a term paper, a major research project report, or even a thesis.

MICRO- AND MACROSKILLS OF WRITING

Micro- and macroskills of writing
Microskills
  • Produce graphemes and orthographic patterns of English.
  • Produce writing at an efficient rate of speed to suit the purpose.
  • Produce an acceptable core of words and use appropriate word order patterns.
  • Use acceptable grammatical systems (e.g., tense, agreement, pluralization), patterns, and rules.
  • Express a particular meaning in different grammatical forms.
  • Use cohesive devices in written discourse.

Macroskills
  • Use the rhetorical forms and conventions of written discourse.
  • Appropriately accomplish the communicative functions of written texts according to form and purpose.
  • Convey links and connections between events, and communicate such relations as main idea, supporting idea, new information, given information, generalization, and exemplification.
  • Distinguish between literal and implied meanings when writing.
  • Correctly convey culturally specific references in the context of the written text.
  • Develop and use a battery of writing strategies, such as accurately assessing the audience's interpretation, using prewriting devices, writing with fluency in the first drafts, using paraphrases and synonyms, soliciting peer and instructor feedback, and using feedback for revising and editing.


DESIGNING ASSESSMENT TASKS: IMITATIVE WRITING

Tasks in [Hand] Writing Letters, Words, and Punctuation

First, a comment should be made on the increasing use of personal and laptop computers and handheld devices for creating written symbols. A limited variety of task types are commonly used to assess a person's ability to produce written letters and symbols. A few of the more common types are described here:

  • Copying. There is nothing innovative or modern about directing a test-taker to copy letters or words.
  • Listening cloze selection tasks. These tasks combine dictation with a written script that has a relatively frequent deletion ratio (every fourth or fifth word, perhaps).
  •  Picture-cued tasks. Familiar pictures are displayed, and test-takers are told to write the word that the picture represents. Assuming no ambiguity in identifying the picture (cat, hat, chair, table, etc.), no reliance is made on aural comprehension for successful completion of the task.
  • Form completion tasks. A variation on pictures is the use of a simple form (registration, application, etc.) that asks for name, address, phone number, and other data. Assuming, of course, that prior classroom instruction has focused on filling out such forms, this task becomes an appropriate assessment of simple tasks such as writing one's name and address.
  • Converting numbers and abbreviations to words. Some tests have a section in which numbers are written (for example, hours of the day, dates, or schedules) and test-takers are directed to write out the numbers. This task can serve as a reasonably reliable method to stimulate handwritten English.


Spelling Tasks and Detecting Phoneme-Grapheme Correspondences

A number of task types are in popular use to assess the ability to spell words correctly and to process phoneme-grapheme correspondences:
  • Spelling tests
  • Picture-cued tasks.
  • Multiple-choice techniques.
  • Matching phonetic symbols.



DESIGNING ASSESSMENT TASKS: INTENSIVE (CONTROLLED) WRITING

Dictation and Dicto-Comp

Dictation was described earlier as an assessment of the integration of listening and writing, but it was clear that the primary skill being assessed is listening. Because of its response mode, however, it deserves a second mention in this chapter. Dictation is simply the rendition in writing of what one hears aurally, so it could be classified as an imitative type of writing, especially since a proportion of the test-taker's performance centers on correct spelling. A form of controlled writing related to dictation is a dicto-comp.

Grammatical Transformation Tasks
In the heyday of structural paradigms of language teaching, with slot-filler techniques and slot substitution drills, the practice of making grammatical transformations, orally or in writing, was very popular. To this day, language teachers have also used this technique as an assessment task, ostensibly to measure grammatical competence. Numerous versions of the task are possible:
  • Change the tenses in a paragraph.
  • Change full forms of verbs to reduced forms (contractions).
  • Change statements to yes/no or wh-questions.
  • Change questions into statements.
  • Combine two sentences into one using a relative pronoun.
  • Change direct speech to indirect speech.
  • Change from active to passive voice.

Picture-Cued Tasks
The main advantage of this technique is that it detaches the almost ubiquitous reading-writing connection and offers instead a nonverbal means to stimulate written responses.
  • Short sentences
  • Picture description.
  • Picture sequence description.


Vocabulary Assessment Tasks

The major techniques used to assess vocabulary are (a) defining and (b) using a word in a sentence. The latter is the more authentic, but even that task is constrained by a contrived situation in which the test-taker, usually in a matter of seconds, has to come up with an appropriate sentence, which may or may not indicate that the test-taker "knows" the word. Vocabulary assessment is clearly form-focused in the above tasks, but the procedures are creatively linked by means of the target word, its collocations, and its morphological variants.

Ordering Tasks
One task at the sentence level may appeal to those who are fond of word games and puzzles: ordering (or reordering) a scrambled set of words into a correct sentence. Here is the way the item format appears.

Test-takers read:
Put the words below into the correct order to make a sentence:
1. cold / winter / is / weather / the / in / the
2. studying / what / you / are
3. next / clock / the / the / is / picture / to

Short-Answer and Sentence Completion Tasks

Some types of short-answer tasks were discussed in Chapter 8 because of the heavy participation of reading performance in their completion. Such items range from
 simple and predictable to somewhat more elaborate responses.

ISSUES IN ASSESSING RESPONSIVE AND EXTENSIVE WRITING

Responsive writing creates the opportunity for test-takers to offer an array of possible creative responses within a pedagogical or assessment framework: test-takers are "responding" to a prompt or assignment. Freed from the strict control of intensive writing, learners can exercise a number of options in choosing vocabulary, grammar, and discourse, but with some constraints and conditions. The learner is responsible for accomplishing a purpose in writing, for developing a sequence of connected ideas, and for empathizing with an audience. The genres of text that are typically addressed here are

  • short reports (with structured formats and conventions);
  • responses to the reading of an article or story;
  • summaries of articles or stories;
  • brief narratives or descriptions; and
  • interpretations of graphs, tables, and charts.


Both responsive and extensive writing tasks are the subject of some classic, widely debated assessment issues that take on a distinctly different flavor from those at the lower-end production of writing.

  • Authenticity. Authenticity is a trait that is given special attention: if test-takers are being asked to perform a task, its face and content validity need to be assured in order to bring out the best in the writer.
  • Scoring. Scoring is the thorniest issue at these final two stages of writing.
  • Time. Yet another assessment issue surrounds the unique nature of writing: it is the only skill in which the language producer is not necessarily constrained by time, which implies the freedom to process multiple drafts before the text becomes a finished product.

DESIGNING ASSESSMENT TASKS: RESPONSIVE AND EXTENSIVE WRITING

Paraphrasing

One of the more difficult concepts for second language learners to grasp is paraphrasing. The initial step in teaching paraphrasing is to ensure that learners understand its importance: to say something in one's own words, to avoid plagiarizing, and to offer some variety in expression. With those possible motivations and purposes in mind, the test designer needs to elicit a paraphrase of a sentence or a paragraph, usually not more.

Guided Question and Answer
A variation on using guided questions is to prompt the test-taker to write from an outline. The outline may be self-created from earlier reading and/or discussion or, less desirably, be provided by the teacher or test administrator. The outline helps to guide the learner through a presumably logical development of ideas that have been given some forethought.

Paragraph Construction Tasks

The participation of reading performance is inevitable in writing effective paragraphs. To a great extent, writing is the art of emulating what one reads. You read an effective paragraph; you analyze the ingredients of its success; you emulate it. Assessment of paragraph development takes on a number of different forms:

1. Topic sentence writing. Assessment thereof consists of
  •  specifying the writing of a topic sentence,
  •  scoring points for its presence or absence, and
  •  scoring and/or commenting on its effectiveness in stating the topic.

2. Topic development within a paragraph
Four criteria are commonly applied to assess the quality of a paragraph:
  •  the clarity of expression of ideas
  •  the logic of the sequence and connections
  •  the cohesiveness or unity of the paragraph
  •  the overall effectiveness or impact of the paragraph as a whole

3. Development of main and supporting ideas across paragraphs
These elements can be considered in evaluating a multi-paragraph essay:
  •  addressing the topic, main idea, or principal purpose
  •  organizing and developing supporting ideas
  •  using appropriate details to undergird supporting ideas
  •  showing facility and fluency in the use of language
  •  demonstrating syntactic variety

Strategic Options

Developing main and supporting ideas is the goal for the writer attempting to create an effective text, whether a short one- to two-paragraph piece or an extensive one of several pages. A number of strategies are commonly taught to second language writers to accomplish their purposes. Aside from strategies of free writing, outlining, drafting, and revising, writers need to be aware of the task that has been demanded and to focus on the genre of writing and the expectations of that genre.

Attending to task. In responsive writing, the context is seldom completely open-ended: a task has been defined by the teacher or test administrator, and the writer must fulfill the criterion of the task. Depending on the genre of the text, one or more of these task types will be needed to achieve the writer's purpose. If students are asked, for example, to "agree or disagree with the author's statement," a likely strategy would be to cite pros and cons and then take a stand.

Attending to genre. The genres of writing that were listed at the beginning of this chapter provide some sense of the many varieties of text that may be produced by a second language learner in a writing curriculum. Assessment of the more common genres may include the following criteria, along with chosen factors from the list in item #3 (main and supporting ideas) above:
  • Reports (Lab Reports, Project Summaries, Article/Book Reports, etc.)
  • Summaries of Readings/Lectures/Videos
  • Responses to Readings/Lectures/Videos

  • Narration, Description, Persuasion/Argument, and Exposition
  • Interpreting Statistical, Graphic, or Tabular Data
  • Library Research Paper

TEST OF WRITTEN ENGLISH (TWE®)

The Test of Written English (TWE) was established in 1986 and has gained a reputation as a well-respected measure of written English, and a number of research articles support its validity (Frase et al., 1999; Hale et al., 1996; Longford, 1996; Myford et al., 1996). In 1998, a computer-delivered version of the TWE was incorporated into the standard computer-based TOEFL and simply labeled as the "writing" section of the TOEFL. The TWE is still offered as a separate test, especially where only the paper-based TOEFL is available.

Test preparation manuals such as Deborah Phillips's Longman Introductory Course for the TOEFL Test (2001) advise TWE test-takers to follow six steps to maximize success on the test:
  • Carefully identify the topic.
  • Plan your supporting ideas.
  • In the introductory paragraph, restate the topic and state the organizational plan of the essay.
  • Write effective supporting paragraphs (show transitions, include a topic sentence, specify details).
  • Restate your position and summarize in the concluding paragraph.
  • Edit sentence structure and rhetorical expression.
It is important to put tests like the TWE in perspective. Timed impromptu tests have obvious limitations if you are looking for an authentic sample of performance in a real-world context.

How does the Educational Testing Service justify the TWE as such an indicator?

Research by Hale et al. (1996) showed that the prompts used in the TWE approximate writing tasks assigned in 162 graduate and undergraduate courses across several disciplines in eight universities. Another study (Golub-Smith et al., 1993) ascertained the reliabilities across several types of prompts (e.g., compare/contrast vs. chart-graph interpretation). Both Myford et al. (1996) and Longford (1996) studied the reliabilities of judges' ratings. The question of whether a mere 30-minute time period is sufficient to elicit an adequate sample of a test-taker's writing was addressed by Hale (1992). Henning and Cascallar (1992) conducted a large-scale study to assess the extent to which TWE performance taps into the communicative competence of the test-taker. The upshot of this research, which is updated regularly, is that the TWE (which adheres to a high standard of excellence in standardized testing) is, within acceptable standard error ranges, a remarkably accurate indicator of writing ability.
The convenience of the TWE should not lull administrators into believing that the TWE, the TOEFL, and the like are the only measures that should be applied to students. It behooves admissions and placement officers worldwide to offer secondary measures of writing ability to those test-takers who

  • are on the threshold of a minimum score,
  • may be disabled by highly time-constrained or anxiety-producing situations,
  • could be culturally disadvantaged by a topic or situation, and/or
  • (in the case of computer-based writing) have had few opportunities to compose on a computer.

SCORING METHODS FOR RESPONSIVE AND EXTENSIVE WRITING

Holistic Scoring
The TWE scoring scale above is a prime example of holistic scoring. In Chapter 7, a rubric for scoring oral production holistically was presented. Advantages of holistic scoring include
  • fast evaluation,
  • relatively high inter-rater reliability,
  • the fact that scores represent "standards" that are easily interpreted by lay persons,
  • the fact that scores tend to emphasize the writer's strengths (Cohen, 1994,p. 315), and
  • applicability to writing across many different disciplines.

Its disadvantages must also be weighed in a decision on whether to use holistic scoring:
  • One score masks differences across the subskills within each score.
  • No diagnostic information is available (no washback potential).
  • The scale may not apply equally well to all genres of writing.
  • Raters need to be extensively trained to use the scale accurately.

Primary Trait Scoring

A second method of scoring, primary trait scoring, focuses on "how well students can write within a narrowly defined range of discourse" (Weigle, 2002, p. 110). This type of scoring emphasizes the task at hand and assigns a score based on the effectiveness of the text in achieving that one goal. For example, if the purpose or function of an essay is to persuade the reader to do something, the score for the writing would rise or fall on the accomplishment of that function. If a learner is asked to exploit the imaginative function of language by expressing personal feelings, then the response would be evaluated on that feature alone. The advantage of this method is that it allows both writer and evaluator to focus on function. In summary, a primary trait score would assess
  • the accuracy of the account of the original (summary),
  • the clarity of the steps of the procedure and the final result (lab report),
  • the description of the main features of the graph (graph description), and
  • the expression of the writer's opinion (response to an article).

Analytic Scoring

For classroom instruction, holistic scoring provides little washback into the writer's further stages of learning. Analytic scoring may be more appropriately called analytic assessment in order to capture its closer association with classroom language instruction than with formal testing. The order in which the five categories (organization, logical development of ideas, grammar, punctuation/spelling/mechanics, and style and quality of expression) are listed may bias the evaluator toward the greater importance of organization and logical development as opposed to punctuation and style. But the mathematical assignment of the 100-point scale gives equal weight (a maximum of 20 points) to each of the five major categories.
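As a simple arithmetic illustration of that equal weighting, the sketch below totals five hypothetical category scores, each capped at 20 points, into a score out of 100; the individual numbers are invented, not taken from the text.

```python
# Hypothetical analytic scores: five categories, a maximum of 20 points each.
categories = {
    "organization": 17,
    "logical development of ideas": 16,
    "grammar": 14,
    "punctuation/spelling/mechanics": 18,
    "style and quality of expression": 15,
}

assert all(0 <= score <= 20 for score in categories.values())
total = sum(categories.values())
print(f"Analytic total: {total}/100")  # Analytic total: 80/100
```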
Analytic scoring of compositions offers writers a little more washback than a single holistic or primary trait score. Scores in five or six major elements will help to call the writers' attention to areas of needed improvement. Practicality is lowered in that more time is required for teachers to attend to details within each of the categories in order to render a final score or grade, but ultimately students receive more information about their writing. Numerical scores alone, however, are still not sufficient for enabling students to become proficient writers, as we shall see in the next section.

BEYOND SCORING: RESPONDING TO EXTENSIVE WRITING

Formal testing carries with it the burden of designing a practical and reliable instrument that assesses its intended criterion accurately. To accomplish that mission, designers of writing tests are charged with the task of providing as "objective" a scoring procedure as possible, and one that in many cases can be easily interpreted by agents beyond the learner. Most writing specialists agree that the best way to teach writing is a hands-on approach that stimulates student output and then generates a series of self-assessments, peer editing and revision, and teacher response and conferencing (Raimes, 1991, 1998; Reid, 1993; Seow, 2002).

Assessment of initial stages in composing
  • Focus your efforts primarily on meaning, main idea, and organization.
  • Comment on the introductory paragraph.
  • Make general comments about the clarity of the main idea and logic or appropriateness of the organization.
  • As a rule of thumb, ignore minor (local) grammatical and lexical errors.
  • Indicate what appear to be major (global) errors (e.g., by underlining the text in question), but allow the writer to make corrections.
  • Do not rewrite questionable, ungrammatical, or awkward sentences; rather, probe with a question about meaning.
  • Comment on features that appear to be irrelevant to the topic.

Assessing Later Stages of the Process of Composing

Once the writer has determined and clarified his or her purpose and plan, and has completed at least one or perhaps two drafts, the focus shifts toward "fine tuning" the expression with a view toward a final revision. Editing and responding assume an appropriately different character now, with these guidelines:

Assessment of later stages in composing
  • Comment on the specific clarity and strength of all main ideas and supporting ideas, and on argument and logic.
  • Call attention to minor ("local") grammatical and mechanical (spelling, punctuation) errors, but direct the writer to self-correct.
  • Comment on any further word choices and expressions that may not be awkward but are not as clear or direct as they could be.
  • Point out any problems with cohesive devices within and across paragraphs.
  • If appropriate, comment on documentation, citation of sources, evidence, and other support.
  • Comment on the adequacy and strength of the conclusion.

Through all these stages it is assumed that peers and teacher are both responding to the writer through conferencing in person, electronic communication, or, at the very least, an exchange of papers. The impromptu timed tests and the methods of scoring discussed earlier may appear to be only distantly related to such an individualized process of creating a written text, but are they, in reality? All those developmental stages may be the preparation that learners need both to function in creative real-world writing tasks and to successfully demonstrate their competence on a timed impromptu test. And those holistic scores are, after all, generalizations of the various components of effective writing. If the hard work of successfully progressing through a semester or two of a challenging course in academic writing ultimately means that writers are ready to function in their real-world contexts, and to get a 5 or 6 on the TWE, then all the effort was worthwhile. This chapter completes the cycle of considering the assessment of all four skills: listening, speaking, reading, and writing. As you contemplate using some of the assessment techniques that have been suggested, I think you can now fully appreciate two significant overarching guidelines for designing an effective assessment procedure:
  • It is virtually impossible to isolate any one of the four skills without the involvement of at least one other mode of performance. Don't underestimate the power of the integration of skills in assessments designed to target a single skill area.
  • The variety of assessment techniques, item types, and tasks is virtually infinite in that there is always some possibility for creating a unique variation. Explore those alternatives, but with some caution lest your overzealous urge to be innovative distract you from a central focus on achieving the intended purpose and rendering an appropriate evaluation of performance.

Source:
Brown, H. D. (2004). Language Assessment: Principles and Classroom Practices. New York: Longman.

