Contextual Aspects of Text Organization

Awarded: NOK 2.5 mill.

Much effort has been invested in automatic correction of writing mistakes, but less in investigating the workflow of the writing process. CATO has investigated the workflow of writers in two different groups: ordinary writers with no known reading or writing disabilities, and writers with known dyslexia. The project has also collected comparable process data for writing in English by students learning English. This material is still being analyzed. Additionally, we have tested priming effects across languages (in a normal population of students) in a series of experiments. These have resulted in conference contributions showing that meaning, but not form, activates words across languages. We have also shown that code-switching within a sentence does not add an extra burden to the processing of garden-path sentences. This lays the groundwork for many new investigations into multilingual phenomena and how they affect multilingual reading and writing.

Our main results, concerning dyslexic writing, are now available in the doctoral thesis of Vibeke Rønneberg. We have so far tested three specific hypotheses using upper secondary dyslexic and non-dyslexic students, matched for math ability and age, and formally confirmed by a quick word split task. We did not find significant differences in either the length of the texts or the time spent on the writing task (active typing). There was one highly significant difference in the proportion of misspelled words: the proportion in the dyslexia group was a little more than twice that of the control group. This confirmed that the dyslexia group had spelling problems. There were no differences between the groups in the use of edit operations, and the proportion of edit-free words, as well as the proportion of correctly spelled edit-free words, was approximately the same.

The Word Level Focus hypothesis states that problems arise from more focus and attention towards word encoding and decoding.
In the dyslexia group, we observed significantly longer times between key presses within words, and especially at the start of words. There were no observed differences in how long it took to start a sentence. There was also increased variance at the start of words, meaning that some words were written fluently while others took longer. Looking at longer pauses before words, the dyslexia group had significantly more pauses longer than 1 second, and the difference is more noticeable for the longest pauses (over 3 seconds). A slightly surprising result is that, in the dyslexia group specifically, correctly spelled words are written significantly more slowly (i.e. with longer times between key presses) than misspelled words, both at the start of and within words. This is partly explained by faster typing in edit operations. The Word Level Focus hypothesis was thus confirmed to play a role.

The second hypothesis is the Resource Sharing Hypothesis. Huey (1908) states this as: "... it takes so long to get the word that the thought is lost". However, we did not detect any significant differences in any of the measures we used (use of open or closed class words, high and low frequency words, long or short words, and ratings of style, organization, argumentation, vocabulary or overall quality). At first this might seem remarkable, but several articles show similar findings, with only very slight differences detected. The Resource Sharing Hypothesis is thus not supported by our observations: the dyslexia group writes texts that are as readable as those of their matched controls once misspellings are accounted for.

The third hypothesis is the Monitoring Hypothesis. According to this hypothesis, problems in dyslexia stem from difficulty recognizing which words are correctly written. To test this hypothesis, we asked subjects to write in a blind condition where letters were replaced by x, so that the words were literally crossed out as they were written.
We assured the subjects that everything they wrote was saved and that the task was meaningful. We observed an increase in productivity in the control group, but not in the dyslexia group. The control group spent significantly more time at the start of a sentence in the x-condition than in normal typing, whereas there was no significant difference for the dyslexia group. There was a non-significant reduction in text quality for the dyslexia group. This tendency, paired with the increase in time spent at the start of a sentence for the control group, indicates that the previously written text is used to trigger the next sentence. The Monitoring Hypothesis for dyslexia is thus not confirmed, and must be counted as rejected for our population sample. However, a new link emerges, as decoding skills predict written quality.

References

Huey, E. B. (1908). The psychology and pedagogy of reading. Boston: MIT Press.
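
The pause measures reported for the Word Level Focus hypothesis (times between key presses within words and at the start of words, and counts of pauses over 1 and 3 seconds) can be illustrated with a small sketch. This is not the project's analysis code: the log format (a list of character and timestamp pairs), the function name `pause_measures`, and the rule that a word starts at the first non-space key after a space are all assumptions made for illustration.

```python
from statistics import mean

def pause_measures(log, long_ms=1000, very_long_ms=3000):
    """Summarise inter-key intervals from a keystroke log.

    `log` is a hypothetical format: a list of (char, time_ms) pairs in
    typing order, where time_ms is the key-press time in milliseconds.
    """
    within_word = []   # intervals between two keys inside the same word
    word_start = []    # intervals between a space and the first key of a word
    for (prev_char, prev_t), (char, t) in zip(log, log[1:]):
        if char.isspace():
            continue  # the interval leading up to a space is not counted here
        interval = t - prev_t
        if prev_char.isspace():
            word_start.append(interval)
        else:
            within_word.append(interval)
    return {
        "mean_within_word_ms": mean(within_word) if within_word else None,
        "mean_word_start_ms": mean(word_start) if word_start else None,
        # Thresholds follow the 1 second and 3 second cut-offs in the text.
        "pauses_over_1s": sum(1 for i in word_start if i > long_ms),
        "pauses_over_3s": sum(1 for i in word_start if i > very_long_ms),
    }

# Invented example log: "ab cd e" with a long and a very long pause before words.
log = [("a", 0), ("b", 100), (" ", 250), ("c", 1500),
       ("d", 1600), (" ", 1700), ("e", 5000)]
measures = pause_measures(log)
```

Comparing such summaries between groups, and between correctly spelled and misspelled words, is the kind of analysis the results above describe; the statistical testing itself is not shown here.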

Much effort has been invested in automatic correction of writing mistakes, but less in investigating the workflow of the writing process. CATO aims at investigating the workflow of writers in three different groups: ordinary writers with no known reading or writing disabilities, writers with known dyslexia, and second language users. We need information about the writing process for use in automatic correction. We suggest that automatic monitoring of the writing habits of users will prove very useful. This will first be done in a controlled experimental setting. We will investigate factors that can be used to adapt a correction mechanism to the users' writing habits, and investigate the conditions for non-intrusiveness of a given tool. Our explicit aim is to find factors that aid, support and stimulate text production in disadvantaged groups. It is also important that we know more about normal text production. The ultimate aim is to provide information through careful investigation so that word processors can adapt to the needs of the users. We think that users, especially in the disadvantaged groups, will benefit from both positive and negative feedback. The main question in this project is when to present feedback so that the writing process is not disturbed.

