We implemented a computer simulation of a serial model of syntactic parsing based on the garden-path theory (Frazier, 1978; Frazier & Clifton, 1996). Parsing is essentially incremental and input-driven (bottom-up). Attachment of an incoming word is first (Step 1) guided by whether a construction in the current partial parse tree (CPPT) needs or expects a constituent (e.g., a verb waiting for an object, or a postulated CP waiting for its TP complement). Next (Step 2), it is guided by specific properties of the incoming word and the CPPT, and by preference principles such as Minimal Attachment and Late Closure. If attachment fails, reanalysis (Step 3) is invoked. Following Fodor & Inoue's (1994) Diagnosis model, reanalysis is implemented according to the formula:
Reanalysis = Symptoms + Guesses + Verification + Repair.
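The ordering of Steps 1 to 3 can be sketched as a simple fall-through. This is a toy illustration only, not the simulation itself: the `Slot` record, the function names, and the stand-in rule for Step 2 (adjunct-like categories attach freely) are all assumptions made for the example, and Step 3 is reduced to a label rather than the full Symptoms + Guesses + Verification + Repair procedure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Slot:
    """Hypothetical stand-in for a CPPT constituent that may await a complement."""
    head: str                # word that opened the slot, e.g. a verb
    expects: Optional[str]   # category still awaited, e.g. "DP"; None once filled


def step1_expected(cppt: List[Slot], cat: str) -> bool:
    # Step 1: attach to the most recent constituent that expects this category.
    for slot in reversed(cppt):
        if slot.expects == cat:
            slot.expects = None
            return True
    return False


def step2_preferences(cppt: List[Slot], cat: str) -> bool:
    # Step 2 (assumed stand-in): adjunct-like categories attach to an existing
    # tree; real preference principles (Minimal Attachment, Late Closure)
    # are not modelled here.
    return bool(cppt) and cat in {"PP", "AdvP"}


def attach_incoming(cppt: List[Slot], cat: str) -> str:
    # Try the steps in order and report which one handled the incoming word.
    if step1_expected(cppt, cat):
        return "step 1: expected constituent"
    if step2_preferences(cppt, cat):
        return "step 2: preference-guided attachment"
    return "step 3: reanalysis"


cppt = [Slot(head="saw", expects="DP")]
for category in ["DP", "PP", "TP"]:
    print(category, "->", attach_incoming(cppt, category))
```

In this toy run the DP fills the verb's open object slot, the PP attaches by the stand-in preference rule, and the TP, which nothing expects and no preference licenses, falls through to reanalysis.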
However, reanalysis is not invoked only when attachment fails. It is also invoked when attachment is still possible but less probable than a reanalysis option. We found at least two such cases: verbs with multiple subcategorization frames and object traces. E.g.,
1. I want him to come. (The parser does not attach 'to come' to 'want' as a purposive adjunct, but reanalyzes [VP want [DP him]] into [VP want [CP [TP [DP him] to [VP come]]]].)
2. The pictures(i) you have e(i) seen. (The parser does not attach 'seen' as a modifier of 'the pictures' (cf. The pictures you took seen from this perspective are very surrealist), but combines it with 'have' to form the perfect tense, pushing the trace e(i) to the object position of 'seen'.)
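The re-bracketing in example 1 can be written out as a small tree transformation. The sketch below assumes a nested-list encoding of trees and hypothetical helper names (`bracket`, `reanalyze_object_as_subject`); it shows only the structural change, not how the parser decides to make it.

```python
def bracket(node):
    """Render a nested-list tree [label, child, ...] in bracket notation."""
    if isinstance(node, str):
        return node
    label, *children = node
    return "[" + " ".join([label] + [bracket(child) for child in children]) + "]"


def reanalyze_object_as_subject(vp, infinitive_vp):
    """Rebuild [VP V DP] as [VP V [CP [TP DP to VP]]].

    The DP first attached as the verb's object becomes the subject of an
    embedded infinitival clause (assumed encoding, for illustration only).
    """
    label, verb, obj = vp
    return [label, verb, ["CP", ["TP", obj, "to", infinitive_vp]]]


vp = ["VP", "want", ["DP", "him"]]
new_vp = reanalyze_object_as_subject(vp, ["VP", "come"])
print(bracket(vp))      # [VP want [DP him]]
print(bracket(new_vp))  # [VP want [CP [TP [DP him] to [VP come]]]]
```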
These cases show the need to add a step between Step 1 and Step 2 of the algorithm (say, Step 1.5) that checks whether a reanalysis should be performed instead of an "easy" attachment. To speed up this check, the parser memorizes, for the first case, the other potential subcategorizations of a verb whenever one subcategorization configuration is satisfied (cf. Stevenson's (1998) decay thesis), and, for the second case, the position of a trace whenever one is located. When a trace is pushed out of its current position by reanalysis, it can be moved into a new position, as in:
or left pending until a new position becomes available, as in:
The parser also has some top-down mechanisms, in line with the general agreement in recent psycholinguistic research that parsing is not purely bottom-up but also has top-down features (Frazier, 1998). Besides the well-known structure postulation mechanism, we propose forward parsing, in which the parser delays attachment and continues to read the input stream until it has gathered enough information for a good attachment. E.g., forward parsing is used to find the head of a DP, avoiding repeated reanalysis in cases such as:
The current prototype covers many basic and advanced structures of English, including noun, verb, adjective, and preposition phrases, adjunct structures, relatives, yes-no and wh-questions, and trace chains. Tests on a set of 300 sentences of up to 25 words (at least 10 sentences for each length) yielded execution time linear in input length, with much longer times for reanalysis cases than for straightforward cases. This suggests that the implementation is consistent with its objective of simulating a human parsing model.
References
Fodor, J.D. & Ferreira, F. (eds.), 1998. Reanalysis in Sentence Processing. Kluwer Academic.
Fodor, J.D. & Inoue, A., 1994. The Diagnosis and Cure of Garden Paths. Journal of Psycholinguistic Research, Vol. 23, No. 5, 1994.
Frazier, L., 1978. On Comprehending Sentences: Syntactic Parsing Strategies. Doctoral dissertation, University of Connecticut.
Frazier, L., 1998. Getting There (Slowly). Journal of Psycholinguistic Research, Vol. 27, No. 2, 1998.
Frazier, L. & Clifton, C., 1996. Construal. Cambridge, MA: MIT Press.
Stevenson, S., 1998. Parsing as Incremental Restructuring. In Fodor & Ferreira (1998), 327-363.