The universe is conspiring in my little word-obsessed direction today. After reading Litlove’s wonderful post about caveblogem’s enlightening dissection of her prose, I spent some time reading about caveblogem’s project (very interesting indeed!) and eventually found my way to Wordcount.
That this project existed and I didn’t know about it until now is most definitely one of the travesties of my adult life. A catalogue, in descending order, of the most frequently used 86,800 words in the English language – what more could I possibly need to entertain me for the next few hours?
I pored over the site and here are some rather amusing discoveries…
I was rather sad to see that “I” ranked #11 but had to temper my sudden flare of indignation and contempt for our hyper-individualized society upon seeing that “you” was just three words away.
And I couldn’t help doing the gender comparison. As expected, we use the words “he” and “his” more often than “she” and “her”. And man vs. woman was particularly disappointing: 142 vs. 393. But aren’t there more women in the world than men?
We use the word “but” quite a lot; it ranked #25. Does this make Anglophones conditional by nature?
Apparently, procrastination is not an epidemic disease – as illustrated by the fact that we use “now” (74) much more often than “later” (212).
English speakers do, however, prefer the negative and say “no” (51) more than they say “yes” (146).
We’re not all talk: do (39) vs. say (134)
We are fairly sure of ourselves: know (83) vs. think (102)
I kept looking for the first nouns to appear and it took a while. The entire list begins with “the” and continues for quite some time with prepositions, pronouns and some existential verbs until I finally came upon the most often used noun in the English language: TIME. Interesting…
The following were the nouns that followed – people (81), government (140), world (149), life (154) and home (161).
We use the word “old” more than the word “young” but we prefer “life” (154) to “death” (454) by a long shot!
I was exquisitely happy and my faith in humanity restored to see that we use the word “book” (357) significantly more than “television” (1022) or “TV” (1577). Also that we prefer the word “love” (384) to “hate” (3107) – hooray!
Then I went hunting for some verbs about day-to-day actions and created this probably biased (my choice of verbs) but nevertheless informative list:
- 406 – read
- 416 – study
- 418 – run
- 569 – talk
- 930 – write
- 977 – walk
- 1244 – learn
- 1280 – drink
- 1358 – sleep
- 1367 – eat
- 1484 – fight
- 1781 – listen
- 2610 – laugh
- 2955 – cry
- 3497 – kiss
- 4065 – sing
And finally, I looked at the most rarely used words. Here are the final ten: savills, homemakers, golgotha, lauro, multilingualism, tangency, carniola, workless, recrossed, conquistador. (I can’t find savills, lauro or carniola in my dictionary – anyone?)
By the way – bunkum ranked 80,008
Addendum: Carniola is a region in Slovenia and Lauro is an Italian first name. Still can’t find savills…
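For anyone curious how a ranked list like this could be put together, here is a minimal sketch. It is only an illustration: the sample sentence and the `rank_words` function are made up for this example, and nothing here is taken from how Wordcount itself computes its rankings.

```python
import re
from collections import Counter


def rank_words(text):
    """Return a dict mapping each word to its rank (1 = most frequent)."""
    # Lowercase the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Sort by descending count; break ties alphabetically so results are stable.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {word: rank for rank, (word, _) in enumerate(ordered, start=1)}


# Hypothetical sample text, invented for illustration only.
sample = "the time is now and the time is later but the book is now"
ranks = rank_words(sample)
print(ranks["now"], ranks["later"])
```

In this toy sample “now” comes out ahead of “later”, but a real ranking would of course depend entirely on the body of text being counted.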
7 responses so far
Dew // July 17, 2007 at 6:22 pm
I’m very suspicious of statistics in general and might quibble with some of these. For example, maybe we say no more than yes because we like to say yeah, uh-huh, mm-hmm and so forth. There’s also nope and nuh-uh, but I’m pretty sure that yeah is used way more often. If we combined yeah and yes, what rank would that be?
As for “I”, maybe we say that so much because we like to temper our opinions. We don’t so much like saying, “This book is awful!” We prefer to say, “I think this book…” or “I would say this book…” and so on.
But what a fun post! I’m heading over right now to check out wordcount.
Brian Hadd // July 17, 2007 at 6:48 pm
Time conquistador, bunkum. I he he!
verbivore // July 17, 2007 at 9:14 pm
I completely agree and forgot to put anything about my own skepticism into the post. I realize that most of these rankings are probably not hard and fast at all. But it was so much fun to browse and search!
caveblogem // July 17, 2007 at 10:04 pm
Thanks for your kind comments on my post and such. I had forgotten all about wordcount and you reminded me that I shouldn’t. It is a very interesting site. I had been coding my own information for parts of speech, but I see that they make this information available through the CLAWS word-tagging system. My tagging system differs a little, of course. When I provide input to the Haiku-writing algorithm, for example, I have to treat certain word groups together, otherwise it will not know whether to use an article in front of certain nouns. But it will still save me a great deal of time. And it is really cool.
Your blog is very interesting, too. And I plan on coming back to visit often.
litlove // July 18, 2007 at 10:14 am
Isn’t it fun to have a look, though, caveats about statistics notwithstanding! I thought this was a wonderful post, Verbivore, and so very enjoyable!
verbivore // July 18, 2007 at 11:50 am
Bhadd – gave me a laugh as well!
Caveblogem – I had a lot of fun looking at the Wordcount site yesterday. Just a fascinating project. And I’ll certainly be looking in at your site from now on!
Litlove – I certainly wouldn’t want to base a dissertation on the rankings provided on wordcount but it was fun to cull some ‘trivia’ from their list of words and just get a general sense of what our writing says about our culture. And thank you, I’m glad you liked it!
Stefanie // July 18, 2007 at 3:12 pm
Very interesting analysis. Even if stats are unreliable, it still provides a fascinating snapshot.