Re-exploring O. Henry’s Short Stories: A Corpus-Based Pilot Study
Author: Anonymous. Type: short essay. Posted: 2009-04-15. Editor: 黄丽樱. Length: 2,646 words. Language: English. Region: China.
Keywords: corpus; O. Henry; setting; theme
Abstract: This article applies a corpus-based method to a stylistic interpretation of O. Henry’s short story collection The Four Million. The overall statistics computed by corpus software provide a more detailed descriptive basis for widely accepted literary interpretations of his stories. With respect to story settings and general themes, the collocation and frequency information of recurrent sequences identifies valuable linguistic features that literary critics seem not to have noticed.
1. Introduction
O. Henry has been called the American Guy de Maupassant. Both authors wrote stories with twist endings, but O. Henry’s are much more playful and optimistic. Previous studies of O. Henry’s short stories agree that his works are marked by surprise endings, the use of coincidence or chance to create humor, ingenious and carefully constructed plots, smile-in-tears irony, and so forth. Despite this detailed literary discussion, little work has been done on the linguistic style of the stories, and none supports its claims with quantitative data. Given how well established the description of his style is, it may seem unlikely that a corpus-based method can find anything original. Nevertheless, the stylistic analysis in the present paper aims to illustrate the value of empirical corpus methods in exploring literary style. On the one hand, statistical data help to confirm the canonical view of O. Henry’s short stories; on the other hand, they tie stylistic interpretation to the linguistic features of his works.
2. Data and Methodology
This paper investigates the linguistic style of O. Henry’s works empirically, applying both quantitative and qualitative methods. The study uses two corpora. The first is O. Henry’s collection The Four Million, published in 1906, whose short stories are set in New York City in the early years of the 20th century. A machine-readable version available on the internet (https://www.literaturepage.com/read/thefourmillion.html) is used to build a small working corpus for investigation. The second is the Brown Corpus, which serves as the reference corpus.
The concordance software used in this study is WordSmith Tools, which supports detailed frequency analysis of concordance items and extracts collocational information. Using this software, words with significant keyness in The Four Million are identified first, and then concordance lines containing each keyword and its collocates are extracted. The corpus data are then processed with statistical tools.
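The keyness step can be illustrated with a minimal sketch. It assumes keyness is scored with the two-term log-likelihood statistic that is widely used for keyword extraction; the paper does not state which statistic or threshold was applied, so this is an illustration and not the study's actual procedure. The function names and the toy token lists are hypothetical.

```python
import math
from collections import Counter


def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Two-term log-likelihood (G2) for one word across two corpora.

    Assumption: this is a common keyness measure, used here for
    illustration; it is not confirmed as the setting used in the study.
    """
    combined = freq_study + freq_ref
    total = size_study + size_ref
    # Expected frequencies if the word were equally common in both corpora.
    expected_study = size_study * combined / total
    expected_ref = size_ref * combined / total
    g2 = 0.0
    if freq_study > 0:
        g2 += freq_study * math.log(freq_study / expected_study)
    if freq_ref > 0:
        g2 += freq_ref * math.log(freq_ref / expected_ref)
    return 2.0 * g2


def keywords(study_tokens, ref_tokens, top_n=10):
    """Rank words that are relatively more frequent in the study corpus."""
    study, ref = Counter(study_tokens), Counter(ref_tokens)
    n_study, n_ref = len(study_tokens), len(ref_tokens)
    scored = []
    for word, freq in study.items():
        # Keep positive keywords only (overused relative to the reference).
        if freq / n_study > ref[word] / n_ref:
            scored.append((word, log_likelihood(freq, n_study, ref[word], n_ref)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy data standing in for the two corpora (hypothetical).
study = ["dollar", "the", "dollar", "the"]
reference = ["the", "the", "the", "a"]
top = keywords(study, reference)
```

In practice the two token lists would come from the full text of The Four Million and from the reference corpus, and a significance cut-off on the score would decide which words count as keywords.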
3. Overall Statistics
Overall statistics are an essential starting point for a systematic corpus-based textual analysis. WordSmith Tools is used to compute the overall statistics of the two corpora, which are compared in Table 3.1.
Table 3.1 Comparison of Overall Statistics between the Two Corpora
text file                           Overall of Mini    Overall of Brown
tokens (running words) in text      52,770             1,390,505
types (distinct words)              8,251              47,146
type/token ratio (TTR)              16                 4
standardised TTR                    46.97              39.07
standardised TTR basis              1,000.00           1,000.00
mean sentence length (in words)     15                 23
mean word length (in characters)    4                  5
word length std.dev.                2.24               2
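The type/token figures in Table 3.1 can be made concrete with a short sketch. The raw TTR is the number of types divided by the number of tokens, expressed as a percentage; for the mini corpus, 8,251 types over 52,770 tokens gives roughly 15.6, which the table reports as 16. The standardised TTR is usually understood as the mean of the TTRs computed over consecutive chunks of a fixed size (the "basis", 1,000 words here), which makes texts of very different lengths comparable. The tokenizer below is a simplifying assumption and will not reproduce WordSmith's own word counts exactly; all function names are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase word tokenizer (an assumption; WordSmith's rules differ)."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)?", text.lower())


def ttr(tokens):
    """Type/token ratio as a percentage."""
    return 100.0 * len(set(tokens)) / len(tokens) if tokens else 0.0


def standardised_ttr(tokens, basis=1000):
    """Mean TTR over consecutive full chunks of `basis` tokens.

    A trailing chunk shorter than `basis` is ignored. Returns None when
    the text is shorter than one full chunk.
    """
    chunks = [
        tokens[start:start + basis]
        for start in range(0, len(tokens) - basis + 1, basis)
    ]
    if not chunks:
        return None
    return sum(ttr(chunk) for chunk in chunks) / len(chunks)


# Toy illustration: the raw TTR falls as a text repeats itself,
# while the chunk-averaged figure stays stable.
sample = ["a", "b"] * 4
raw = ttr(sample)                            # 2 types over 8 tokens
standardised = standardised_ttr(sample, 2)   # every 2-token chunk has 2 types
```

This also explains why the raw TTR of the much larger Brown Corpus is so low compared with the mini corpus, while the standardised values (39.07 against 46.97) are far closer: the raw ratio shrinks with text length, so the standardised figure is the fairer basis for comparing lexical variety.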