Simultaneous interpreting (SI) is the key mode of interpreting at high-level international institutions such as the United Nations. However, because SI data are difficult to obtain and process, empirical, corpus-based investigations of SI remain scarce.
This methodological problem motivated a project to create a representative corpus of SI based on political discourse. The corpus has two components, Russian-English and English-Russian, and comprises about 500,000 words, or 60 hours of speech data. The annotation combines POS tagging with descriptive annotation of interpreters’ disfluencies and enables a wide variety of fully automated linguistic searches.
The main purpose of the project is to conduct a macroanalysis of linguistic variation between interpreted and non-interpreted texts. The analysis will follow the multidimensional approach inspired by Biber’s (1995) work on register variation, though it will rely on a different set of variables chosen for their relevance to SI.
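As an illustration only, the scoring step of a multidimensional analysis might be sketched as follows: feature counts are normalised per 1,000 words, standardised across texts, and combined into a dimension score. The feature names, counts, and the grouping of features into a dimension are hypothetical and are not taken from the corpus; in Biber's approach the groupings are derived by factor analysis, which this sketch omits.

```python
from statistics import mean, pstdev


def normalise(counts, n_words, per=1000):
    """Convert raw feature counts into rates per `per` words."""
    return {feature: count * per / n_words for feature, count in counts.items()}


def standardise(rates_by_text):
    """Turn per-text rates into z-scores, feature by feature.

    Assumes every text reports the same set of features.
    """
    features = next(iter(rates_by_text.values())).keys()
    z = {text: {} for text in rates_by_text}
    for feature in features:
        values = [rates[feature] for rates in rates_by_text.values()]
        mu, sigma = mean(values), pstdev(values)
        for text, rates in rates_by_text.items():
            # A feature that never varies carries no information: score it 0.
            z[text][feature] = (rates[feature] - mu) / sigma if sigma else 0.0
    return z


def dimension_score(z_text, positive, negative):
    """Sum z-scores of positively loading features minus negatively loading ones."""
    return sum(z_text[f] for f in positive) - sum(z_text[f] for f in negative)


# Hypothetical toy data: two texts with invented counts for two features.
raw = {
    "interpreted": ({"filled_pause": 30, "noun": 200}, 1000),
    "original": ({"filled_pause": 10, "noun": 300}, 1000),
}
rates = {text: normalise(counts, n_words) for text, (counts, n_words) in raw.items()}
z = standardise(rates)

# Hypothetical dimension: the loadings are assumed, not estimated by factor analysis.
scores = {
    text: dimension_score(z[text], positive=["filled_pause"], negative=["noun"])
    for text in z
}
print(scores)
```

In a real study the standardisation would run over all texts in both corpus components, and the positive and negative feature sets would come from the estimated factor loadings rather than being fixed by hand.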
Biber, Douglas. 1995. Dimensions of Register Variation: A Cross-Linguistic Comparison. Cambridge: Cambridge UP.