Case-sensitive letter and bigram frequency counts from large-scale English corpora. Jones, M. N. & Mewhort, D. J. K. Behav Res Methods Instrum Comput, 36(3):388-396, 2004.
We tabulated upper- and lowercase letter frequency using several large-scale English corpora (approximately 183 million words in total). The results indicate that the relative frequencies for upper- and lowercase letters are not equivalent. We report a letter-naming experiment in which uppercase frequency predicted response time to uppercase letters better than did lowercase frequency. Tables of case-sensitive letter and bigram frequency are provided, including common nonalphabetic characters. Because subjects are sensitive to frequency relationships among letters, we recommend that experimenters use case-sensitive counts when constructing stimuli from letters.
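The tabulation described in the abstract can be illustrated with a minimal sketch. It assumes whitespace tokenization and counts bigrams only within tokens. The entry does not specify the procedure actually used for the published tables, so these choices and the function name `case_sensitive_counts` are hypothetical.

```python
from collections import Counter


def case_sensitive_counts(text):
    """Count single characters and within-token bigrams without folding case.

    Assumption: tokens are split on whitespace, and nonalphabetic
    characters inside a token (e.g. punctuation) are counted like letters.
    """
    letters = Counter()
    bigrams = Counter()
    for token in text.split():
        # Each character is counted as-is, so "T" and "t" stay separate.
        letters.update(token)
        # Adjacent character pairs within the token form the bigrams.
        bigrams.update(a + b for a, b in zip(token, token[1:]))
    return letters, bigrams


letters, bigrams = case_sensitive_counts("The cat sat. The End!")
print(letters["T"], letters["t"])    # uppercase and lowercase tallied separately
print(bigrams["Th"], bigrams["th"])  # bigrams are case-sensitive as well
```

Keeping "T" and "t" in separate bins is what allows uppercase and lowercase relative frequencies to be compared, which is the distinction the abstract argues matters when constructing letter stimuli.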
@Article{Jones2004,
  author   = {Michael N. Jones and D. J. K. Mewhort},
  journal  = {Behav Res Methods Instrum Comput},
  title    = {Case-sensitive letter and bigram frequency counts from large-scale {E}nglish corpora},
  year     = {2004},
  number   = {3},
  pages    = {388--396},
  volume   = {36},
  abstract = {We tabulated upper- and lowercase letter frequency using several large-scale
	English corpora (approximately 183 million words in total). The results
	indicate that the relative frequencies for upper- and lowercase letters
	are not equivalent. We report a letter-naming experiment in which
	uppercase frequency predicted response time to uppercase letters
	better than did lowercase frequency. Tables of case-sensitive letter
	and bigram frequency are provided, including common nonalphabetic
	characters. Because subjects are sensitive to frequency relationships
	among letters, we recommend that experimenters use case-sensitive
	counts when constructing stimuli from letters.},
  keywords = {Cues, Fixation, Humans, Linguistics, Ocular, Periodicity, Visual Perception, Vocabulary, 15641428},
}
