Difference between female and male usage [closed]
Just out of curiosity I have done some quick statistics.
I downloaded the following books from Project Gutenberg:
Men writers
- Alice's Adventures in Wonderland by Lewis Carroll
- Adventures of Huckleberry Finn by Mark Twain
- Moby Dick, or, the whale by Herman Melville
- The Adventures of Sherlock Holmes by Sir Arthur Conan Doyle
- The Picture of Dorian Gray by Oscar Wilde
- Paradise Lost by John Milton
- The Works of Edgar Allan Poe — Volume 1 by Edgar Allan Poe
- War and Peace by graf Leo Tolstoy
- Dracula by Bram Stoker
- Treasure Island by Robert Louis Stevenson
Women writers
- Secret Adversary by Agatha Christie
- Jane Eyre by Charlotte Brontë
- Frankenstein by Mary Wollstonecraft Shelley
- Pride and Prejudice by Jane Austen
- Sarah Orne Jewett
- Ramona by Helen Hunt Jackson
- Home Influence by Grace Aguilar
- Middlemarch by George Eliot
- A Season at Harrogate by Mrs. Hofland
- Wuthering Heights by Emily Brontë
After removing the common Project Gutenberg header, I read the files into R, split them into characters, and counted the vowels and consonants.
I had a total of 8725700 characters for men and 11468186 for women.
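For anyone who wants to try the counting step themselves, here is a minimal sketch in Python (not the R code used for the numbers above). The function names are made up for illustration. The `*** START OF` / `*** END OF` marker lines are an assumption about how the Project Gutenberg boilerplate is delimited, so check your own files. Only the ASCII letters a-z are counted, which means accented letters such as ë are skipped.

```python
import string

# Vowel set from the footnote; pass set("aeiouy") to treat y as a vowel too.
VOWELS = set("aeiou")


def strip_gutenberg_boilerplate(text):
    """Keep only the lines between the START and END marker lines.

    Assumption: the file uses lines beginning with '*** START OF' and
    '*** END OF'. If a marker is missing, that side is left untrimmed.
    """
    lines = text.splitlines()
    start, end = 0, len(lines)
    for i, line in enumerate(lines):
        if line.startswith("*** START OF"):
            start = i + 1
        elif line.startswith("*** END OF"):
            end = i
            break
    return "\n".join(lines[start:end])


def consonant_vowel_ratio(text, vowels=VOWELS):
    """Return (number of consonants) / (number of vowels) for the letters a-z."""
    letters = [c for c in text.lower() if c in string.ascii_lowercase]
    n_vowels = sum(1 for c in letters if c in vowels)
    if n_vowels == 0:
        raise ValueError("text contains no vowels")
    n_consonants = len(letters) - n_vowels
    return n_consonants / n_vowels
```

Running `consonant_vowel_ratio(strip_gutenberg_boilerplate(raw_text))` once per book gives one ratio per book, which is what the per-group mean and standard deviation are computed from.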
Here's a graph of the consonant/vowel ratio [1], calculated per book (showing mean +/- standard deviation).
There is no statistically significant difference between the two groups (p=0.89, t-test).
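The post only says "t-test", so which variant was used is not known. As a rough sketch, assuming the unequal-variance (Welch) form, the test statistic and degrees of freedom can be computed from the two lists of per-book ratios like this. `welch_t` is a name made up for this example.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's two-sample t-test.

    a and b are the per-book ratios of the two groups. Each group needs
    at least two values with some spread.
    """
    # Squared standard error of each group's mean (sample variance / n).
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

Turning `t` and `df` into a p-value needs the t distribution, which is not in the Python standard library. If SciPy is installed, `scipy.stats.ttest_ind(a, b, equal_var=False)` returns the statistic and the two-sided p-value directly.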
EDIT
I played some more with the data and got this bar graph of the usage of individual letters.
Again, you can see no major differences between men and women writers
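The per-letter numbers behind a bar graph like this can be produced with a simple frequency count. Here is a small Python sketch, again not the original R code. `letter_frequencies` is a hypothetical helper, and it only looks at the ASCII letters a-z.

```python
from collections import Counter
import string


def letter_frequencies(text):
    """Return a dict mapping each letter a-z to its share of all letters in text."""
    counts = Counter(c for c in text.lower() if c in string.ascii_lowercase)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("text contains no ASCII letters")
    # Letters that never occur get a frequency of 0.0 so every book has 26 entries.
    return {letter: counts[letter] / total for letter in string.ascii_lowercase}
```

Using relative frequencies rather than raw counts keeps books of very different lengths comparable.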
EDIT2: I repeated the analysis with 10 books per group. I would say that there is definitely no difference.
[1] I considered a, e, i, o and u as vowels. The result does not change substantially if y is included.