Technology Assisted Reading Assessment Projects
Annual Technical Advisory Committee Meeting

Friday, August 24, 2007
Chauncey Conference Center, Educational Testing Service (ETS)
Princeton, NJ

Attendees

Cara Laitusis, Teresa King, Kyndra Middleton, Elizabeth Stone, Klaus Zechner (ETS)
Martha Thurlow, Christopher J. Johnstone (NCEO)
Karen E. Barton (CTB/McGraw Hill)
Tracey Hall (CAST)
Brian Touchette (DE Department of Education)
Paula Bress (Boston Public Schools)
Richard M. Jackson (Boston College)
Dorothy S. Strickland (Rutgers University)
Dave Edyburn (University of Wisconsin)

Joining by telephone:
Kay Alicyn Ferrell (American Foundation for the Blind)
C. Scott Trimble, Barbara Thomas, Dave Malouf (US Department of Education)

Overview

The second meeting of the Technical Advisory Committee (TAC) of the Technology Assisted Reading Assessment (TARA) project was held on August 24, 2007 in Princeton, NJ. The purpose of the meeting was to update TAC members on TARA’s progress since the first TAC meeting on November 17, 2006 in Washington, DC.

TAC members received a TARA update including the outcomes of completed psychometric and survey research. Specific questions on the upcoming research elicited feedback and rich discussion from TAC members. The meeting concluded with a summary of the next steps for TARA.


Welcome and Greetings

Martha Thurlow, National Center on Educational Outcomes, NCEO; PARA; TARA

Martha Thurlow welcomed the TAC members and thanked them for attending. She reviewed the day’s agenda and drew special attention to the new brochure and the projects’ websites. All newly completed papers and reports can be found in the meeting’s binder. Reports and past presentations can also be found on the National Accessible Reading Assessment Projects (NARAP) website, www.narap.info. A basic overview of the three NARAP projects (Designing Accessible Reading Assessments (DARA), Partnership for Accessible Reading Assessment (PARA), and Technology Assisted Reading Assessment (TARA)) was provided. It was also noted that NARAP has requested a one-year extension to allow for more research to be conducted.


Differential Item Functioning Results

Elizabeth Stone, Educational Testing Service, ETS; DARA; TARA

Elizabeth Stone reviewed the background, method, results, and limitations of the differential item functioning (DIF) study that was completed earlier in the year. The study used 4th and 8th grade data from an English/Language Arts (ELA) component of a large-scale state standards test. The test consisted of multiple-choice reading and writing items, as well as an essay that was not included in the analyses. Students without disabilities who took the standard version served as the reference group. There were two focal groups of interest: students who are blind or visually impaired (VI) who took 1) the large print form, or 2) the large print or braille form. The Mantel-Haenszel DIF method was used to compare the performance of the reference and focal groups on each item, using the overall ELA score as the matching criterion. DIF analyses were run on the Grade 4 and Grade 8 data separately, and several items on each test were found to have at least moderate DIF. Overall, the 4th grade reading items showing DIF seemed to favor the focal group, and the 4th grade writing items showing DIF appeared to favor the reference group. The reverse was true in 8th grade. The TAC suggested that this may be due to classroom instruction, which usually focuses more on reading in earlier grades. By 8th grade, the focus tends to be on learning English/Language Arts, and instruction has moved away from reading.
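For readers unfamiliar with the Mantel-Haenszel procedure mentioned above, the following is a minimal sketch of how a common odds ratio can be computed for a single item, with examinees stratified on a matching score. The function name, the record layout, and the group labels are hypothetical and were chosen only for illustration. The conversion to the delta scale (multiplying the log odds ratio by -2.35) is a commonly used convention and is assumed here; the meeting summary does not state which effect-size scale the study reported.

```python
import math
from collections import defaultdict


def mantel_haenszel_dif(records):
    """Return (common odds ratio, delta-scale DIF) for one item.

    records: iterable of (group, matching_score, correct), where group is
    "ref" or "focal" and correct is 1 or 0. This layout is an assumption
    made for the sketch, not the study's actual data format.
    """
    # One 2x2 table per matching-score stratum:
    # [ref correct, ref incorrect, focal correct, focal incorrect]
    strata = defaultdict(lambda: [0, 0, 0, 0])
    for group, score, correct in records:
        idx = (0 if group == "ref" else 2) + (0 if correct else 1)
        strata[score][idx] += 1

    num = 0.0  # sum over strata of (ref correct * focal incorrect) / n
    den = 0.0  # sum over strata of (ref incorrect * focal correct) / n
    for a, b, c, d in strata.values():
        n = a + b + c + d
        num += a * d / n
        den += b * c / n

    if num == 0 or den == 0:
        raise ValueError("odds ratio is undefined for these data")

    alpha = num / den
    # Assumed delta-scale convention: negative values indicate the item
    # favors the reference group, positive values the focal group.
    return alpha, -2.35 * math.log(alpha)
```

In this sketch an odds ratio above 1 means that, among examinees with the same matching score, the reference group had higher odds of answering the item correctly. A real analysis would also need a significance test and rules for sparse strata, which are omitted here.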

Examples of test items in each grade that were found to have DIF were presented to the TAC to elicit discussion of why the items might have displayed DIF. The discussion produced many useful and informative explanations. The primary reasons for the DIF could be generalized as being related to test format and access to the curriculum. These reasons, and TAC suggestions for further investigation, are expanded upon below.

Large print and braille versions of the test will not necessarily have the same issues associated with them. That is, the same item may display DIF on only one of the forms. Large print issues include the potential complications from an optical overlay (for some students), the consequences of how the physical form is created, and the problems with using various fonts. Using an overlay creates the issue of seeing the part versus the whole. Sighted students using the regular format will be able to see the whole more easily than students using the large print version. The effect of this on whether an item is easier or harder would depend on the item. Whether a test is a large print version or a simple enlargement also makes a difference. Typically, enlargements do not allow information in tables, charts, and footnotes to be of adequate size for a student who has a visual impairment. Finally, the use of fonts with serifs is not appropriate for an assessment that will be administered to VI students. Students who use the large print version of a test may not see small punctuation symbols such as apostrophes and commas, or may confuse serifs with punctuation.

Braille issues include the physical layout, the way figures are displayed, and the differences in the type of braille used. The layout can be an issue, for example, when side-by-side comparison of documents is required. This type of comparison might be easier to do in standard or large print form. Also, researchers need to take into account how the figures displayed in the standard and large print forms are represented in the braille version. The uncontracted/contracted braille difference can have consequences, especially on an ELA assessment. Most students begin to learn braille using the uncontracted (alphabetic) form and then progress to contracted braille. In the latter, common words or parts of words are represented by contractions rather than spelled out fully. Students who are taught different braille types have different reading and spelling strengths, and the use of one braille type or the other on the test can have an impact on performance. TARA researchers should consult the results of the ABC Braille Study when they become available from the American Printing House for the Blind (APH) in October. There is also the potential for confusion when certain symbols in literary braille are used rather than their Nemeth Code counterparts.

Students may be taught different reading strategies that contribute to how item-related information is sought out and obtained from a passage. For example, blind or VI students may be taught to read tests from the beginning of the page down or to read the items first, but this is not necessarily the case. Reading speed is also an issue for blind and VI students, and these students may feel rushed on even an untimed test. Classroom instruction and access to information are also important issues. VI and blind students in a mainstream classroom may not be fully engaged during certain lessons that are only presented on a chalkboard, such as those involving diagramming components of a sentence or the breakdown of the stanzas of a poem. These lessons on a chalkboard are not accessible to blind or VI students, and thus the students do not have the time on task necessary for the same level of performance as their peers on the related test items. A think-aloud study may help researchers discover differences in how the blind and VI students experience the test that may have been too subtle to discover in the analyses completed to date.

The TAC suggested obtaining summary statistics on the reading and writing subscores to see how they would compare for the different groups, as well as looking at the overall and group score distributions more carefully. In addition, ideas of matching on testlets and performing a cluster analysis were submitted.


Next Steps for Analysis of Test Data

Elizabeth Stone, Educational Testing Service, ETS; DARA; TARA

Elizabeth Stone described the studies planned or currently in progress that will use the same test data as was used in the DIF analyses. More DIF research will be conducted to explore the use of other analysis methods and to more closely examine individual items as well as item types. A reader of braille will be hired to assist in comparing the braille, large print, and standard forms to aid in the interpretation of the DIF and to add to the test development recommendations that are being prepared as a result of this study. A differential distractor functioning (DDF) study is in the pipeline and will be presented at the Northeastern Educational Research Association (NERA) conference in October. The TAC agrees that this informative extension of the DIF analyses will provide evidence of whether and why groups are being drawn differentially to specific distractors. A trend analysis examining performance differences across grades will be conducted. One possible method will be to look at the differences between groups in the percent of students considered to be proficient to see if those gaps change across grades.
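The gap comparison mentioned at the end of the paragraph above can be sketched in a few lines. This is only an illustration under assumed inputs: the function name, the group labels, and the nested-dictionary layout are hypothetical, and no real counts from the study are used.

```python
def proficiency_gaps(counts):
    """Return {grade: gap in percentage points} between two groups.

    counts: {grade: {"ref": (n_proficient, n_tested),
                     "focal": (n_proficient, n_tested)}}
    The gap is the reference group's percent proficient minus the focal
    group's percent proficient. The layout is assumed for this sketch.
    """
    gaps = {}
    for grade, groups in counts.items():
        ref_proficient, ref_tested = groups["ref"]
        focal_proficient, focal_tested = groups["focal"]
        ref_pct = 100.0 * ref_proficient / ref_tested
        focal_pct = 100.0 * focal_proficient / focal_tested
        gaps[grade] = ref_pct - focal_pct
    return gaps
```

Comparing the returned values across grades shows whether the gap widens or narrows. Because proficiency cut scores differ by grade and by state, such a comparison is descriptive only, and small focal group sizes would make the percentages unstable.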

The TAC discussed the focuses of the teaching, learning, and instruction of reading and writing across grade levels. They agreed that by 4th grade, braille students should have the braille fundamentals in place to learn how to read on pace with their sighted peers if each started the learning process at the same time. One important caution to keep in mind during this project is that states identify blind and VI students differently. Sometimes students have a 504 plan rather than an IEP, which may lead to different legislative implications involving instruction and testing accommodations and can have an impact on access to the curriculum.

The TAC suggested conducting a longitudinal study to compare test formats, different item types and trends across several grade levels. TARA researchers acknowledge the importance of this type of study. Due to limited funding, a longitudinal study will not be possible but a smaller scale think aloud study to examine these issues should be feasible.


Overview of Technology Assisted Reading Assessment

Cara Laitusis, Educational Testing Service, ETS; DARA; TARA

Lois Frankel, Educational Testing Service, ETS; TARA

Cara Laitusis explained that the TARA is being developed using the evidence centered design (ECD) methodology. This process entails describing the purpose of the test and the high level claim, and then defining the population and all of the important terminology used in these descriptions. Future efforts will include creating models of test items as well as conducting a data collection. Test item models will then be adjusted according to the results of the data collection.

The TAC had some questions about the usefulness and appropriateness of testing grades 7-9 using only grade 8 test items. Adjacent grade levels tend to have much overlap, but there is a concern that assistive technology (AT) skills may be confounded with reading comprehension skills. TARA researchers explained that one reason for including three grade levels is to enlarge the potential sample size. One goal of the project is to ensure that the test is compliant with NCLB regulations, which include testing on grade level. Also, it would be difficult from the perspective of test development to correlate the data using different grade levels. One possibility is to include items from a variety of grade levels to allow for a more comprehensive score report. The report would then have the potential to indicate whether a student’s performance is at or near grade level. One important point brought up by the TAC is that proficiency level and “on grade level” definitions vary by state. One way to avoid this problem is to use a measure that is not a state assessment. It will be important to get the underlying scales of the test when collecting a national sample.

The issue of the grade level and number of comprehension items that are included in this test was raised by the TAC. TARA staff explained that the number of items on and level of comprehension necessary for the test will be low enough that a student’s reading ability should not impact the results. The purpose of the TARA test as it stands is to measure AT skills. Should the results of the current work be successful, one possibility is to add a third part to the assessment that would measure comprehension.

The TAC was presented with the purpose, high level claim, and definition of terms and was asked to provide their opinions and suggestions for improvement. Below is a summary of the TAC’s responses.

The TAC advised that issues including the specific content of the material, student background knowledge, and the specific structure of the text are important to keep in mind for this test. That the concept of “access” has various meanings is important to consider as well. A conceptual model of accessing text will help with the development of an assessment.

AT that is considered to be low tech should be included in the assessment if it is commonly used. Distinguishing AT by whether it requires instruction to use is not a reasonable method, as even low-technology AT requires instruction. AT varies in its functions and abilities. It will be useful to run a feature match on the included AT products to get a baseline for examining skill level, minutes in the environment, the specific AT used, and the content of the test item.

Currently there is no progress monitoring by teachers that leads to AT selection. Much of the AT that is used is directed by what teachers know how to use. Further, district use of AT is vendor driven and many teachers will be taught one or more products from one vendor. It will be very important to collect actual AT use in and outside of the classroom, as it was suggested that use in the home may also correlate well with performance.

There is an assessment that is supported by legislation related to this project. The learning media assessment (LMA) is implied by IDEA and it presumes that a student on an IEP is a braille reader until the child can demonstrate equal or better performance using large print or other tools. TARA researchers should investigate the relation of this assessment and the project.

Distinguishing between the levels of proficient and advanced proficient will be difficult to do objectively and completely. Time is not a reasonable measure because students’ disabilities will affect the time to use AT and complete tasks. AT use can be thought of as a skill set. Some scoring methods mentioned were to look at the number of skills a student possesses, or to use a path analysis approach. A potential drawback to this idea is that advanced AT users will vary tremendously in their methods of using AT, as well as the specific skills and AT used. The scoring rubric must be carefully developed to ensure that certain behaviors are not unintentionally left out. It will be important as the project develops to stay focused on curriculum useful tasks. Some high-level tasks are unrelated to the curriculum and, therefore, should not be included.

The TAC is encouraged by the work of the project. Currently, teachers may not be equipped to make appropriate decisions about AT use. If successful, this project can compel states to require proper teacher preparation in AT and to promote collaboration between general education teachers, VI teachers, and technology staff in schools. The information can also be used for research studies to show which AT is more effective, and this will be helpful for state funding purposes. The TAC is hopeful that one outcome of the research findings will be test development guidelines. It is suggested that the TAC play a role in the development of such guidelines if necessary.


Survey of the Teachers of the Blind and Visually Impaired

Martha Thurlow, National Center on Educational Outcomes, NCEO; PARA; TARA

Christopher Johnstone, National Center on Educational Outcomes, NCEO; PARA; TARA

Martha Thurlow reviewed the purpose of the survey that was administered to teachers of the blind and visually impaired (TVIs). The sample and source of data were described. There were some concerns about which listserv was used to distribute the survey request. In addition, some potentially useful personal information about the teachers, such as teacher disability status and age, could not be obtained due to institutional review board (IRB) regulations. This information can be important to know. For example, one suggestion from the TAC was that younger teachers are more apt to know about and teach technology. This kind of data should be collected during the telephone interviews that are to be conducted as a follow-up to the survey. Some of this information can be determined indirectly. One example given was that a portion of the surveys were downloaded into MS Word. Without knowing personal information about the participants, it can be inferred that these teachers are comfortable enough with technology to know that this capability exists. Another factor that affects teacher knowledge and the availability of technology is geographic location. Some states have better resources than others. This information should also be collected in the interviews.

The TAC mentioned that braille notetaking tools are absent from the categories and should be included in future surveys and questions. They are a very popular tool for students. It is debatable whether they are an appropriate accommodation for students, but given their popularity in schools, it’s important to track this AT. One other concern about the questions asked of TVIs is that the teachers may not know the specifics of state standards.

A suggestion for parsing out proficiency levels is to create a cognitive web. Advanced students should be able to make determinations about what technology to use when given a specific task. This could be analyzed using a path analysis. In order to do this, researchers must first determine if all the intermediate steps are necessary and important in considering the performance on the task a success or if it is just achievement of the goal that matters. TARA researchers responded that this type of path scoring has been attempted and one difficulty with this method is that the paths have points that vary for students with disabilities. Paths taken can be affected by a student’s disability and that should not affect the student’s score. The TAC stressed that it’s important to determine and define the steps that will be measured.

Martha Thurlow reviewed the basic survey results and then described the next steps. TAC members suggested being careful about interpreting the inverse relationship of the caseloads of TVIs and the accommodations their students use. A student could have a desperate need for AT but not receive it because his or her teacher’s caseload is too large.


TVI Interview Plans

Christopher Johnstone, National Center on Educational Outcomes, NCEO; PARA; TARA

Christopher Johnstone explained that in-depth follow-up interviews will be conducted with a sample of the TVIs who completed the TVI surveys. He presented the planned interview questions and asked the TAC for their opinions and suggestions for improvement. Feedback was provided from the TAC on each question. The topics discussed included wording issues, the integration of AT and universal design (UD), and VI and general education teacher interaction.

The TAC also brought up specific concerns about, and additions to, the questions to ask VI teachers. One caution was not to structure questions in a way that focuses teachers on reading instruction or assumes that AT is computer based. By 8th grade, students are not directly taught reading; they are taught English/language arts. CCTVs are often used by students and are not technically computer-based. The way accommodations are implemented in the classroom is important to know from a state’s perspective. It may be a question better asked at the end of the interview because VI teachers may not fully know their state’s accommodations policies, and there are other questions to ask that will be more important for test development purposes. There is no AT category that includes tools that combine text and speech, and that should be remedied. Speech tools are helpful with reading proficiency and speed.

Some questions not included but worth considering ask teachers to answer questions from two perspectives—one of a teacher who is familiar with AT, and one of a teacher who is not. Another suggestion is to ask general questions first and then ask teachers to answer questions while keeping in mind individual students with specific needs. Interviews could ask teachers to create their own rubrics for proficiency levels; however, a decision will need to be made about how the skills are categorized (in general, by basic type of AT, etc.). Interviewers should ask teachers what constructs and specific tasks are likely to cause difficulty for their students, even if such a task measures a skill that is not directly taught in the curriculum. It may also be informative to ask an open-ended question about what the teacher’s advice would be to the AT industry and to state departments about making AT more useful and appropriate.


Next Steps

Martha Thurlow, National Center on Educational Outcomes, NCEO; PARA; TARA

Martha Thurlow concluded the meeting by reviewing TARA’s endeavors and summarizing follow-up items identified during the meeting. Before closing, the TAC was asked to suggest appropriate journals in which to publish research, and conferences in which to participate. The journals recommended included Re:view, Journal of Visual Impairment and Blindness (JVIB), JTLE, and the journal from Assessment for Effective Instruction. Conferences recommended were Association for Education and Rehabilitation of the Blind and Visually Impaired (AER) and the low incidence research summit. TAC members were also asked to provide recommendations for an appropriate individual to recruit for the NARAP Goal 3 Principles Committee. The potential member should have blind and low vision expertise as well as knowledge about state testing. Suggestions included Cay Holbrook, Kay Ferrell, Don Leu, Alan Farstrup, Richard Jackson, and Tracey Hall.



