Difference between female and male usage [closed]

Just out of curiosity I have done some quick statistics.

I downloaded the following books from Project Gutenberg:

Men writers

  • Alice's Adventures in Wonderland by Lewis Carroll
  • Adventures of Huckleberry Finn by Mark Twain
  • Moby Dick, or, the whale by Herman Melville
  • The Adventures of Sherlock Holmes by Sir Arthur Conan Doyle
  • The Picture of Dorian Gray by Oscar Wilde
  • Paradise Lost by John Milton
  • The Works of Edgar Allan Poe — Volume 1 by Edgar Allan Poe
  • War and Peace by graf Leo Tolstoy
  • Dracula by Bram Stoker
  • Treasure Island by Robert Louis Stevenson

Women writers

  • Secret Adversary by Agatha Christie
  • Jane Eyre by Charlotte Brontë
  • Frankenstein by Mary Wollstonecraft Shelley
  • Pride and Prejudice by Jane Austen
  • Sarah Orne Jewett
  • Ramona by Helen Hunt Jackson
  • Home Influence by Grace Aguilar
  • Middlemarch by George Eliot
  • A Season at Harrogate by Mrs. Hofland
  • Wuthering Heights by Emily Brontë

After removing the common Project Gutenberg header, I read the files into R, split them into single characters, and counted the vowels and consonants.
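For anyone who wants to try something similar, here is a minimal sketch of the counting step. It is written in Python rather than R and is not the code I actually ran. The function name `consonant_vowel_ratio` is made up for illustration, and it assumes that only the unaccented letters a to z are counted (digits, punctuation, whitespace and accented letters are skipped), which is a choice the analysis above does not spell out.

```python
# Vowels as in footnote 1: a, e, i, o, u (y is counted as a consonant).
VOWELS = set("aeiou")


def consonant_vowel_ratio(text):
    """Return (consonants, vowels, consonants/vowels) for the letters a-z in text.

    Assumption: anything that is not an unaccented letter is ignored.
    """
    vowels = 0
    consonants = 0
    for ch in text.lower():
        if "a" <= ch <= "z":
            if ch in VOWELS:
                vowels += 1
            else:
                consonants += 1
    ratio = consonants / vowels if vowels else float("nan")
    return consonants, vowels, ratio


# Tiny example on a single sentence instead of a whole book.
print(consonant_vowel_ratio("Call me Ishmael."))
```

To include y as a vowel, as mentioned in footnote 1, it should be enough to change `VOWELS` to `set("aeiouy")`.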

I had a total of 8725700 characters for men and 11468186 for women.

Here's a graph of the consonant/vowel ratio (see footnote 1), calculated per book and shown as mean +/- standard deviation:

[Figure: consonants/vowels ratio]

There is no statistically significant difference between the two groups (p = 0.89, t-test).
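As a rough illustration of the comparison, here is a Python sketch of Welch's t statistic using only the standard library. Welch's variant is an assumption on my part here: it is what R's `t.test` does by default, but the test above is only described as a "t-test". The function name `welch_t` is hypothetical, and the numbers in the example are placeholders, not the ratios measured from the books.

```python
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    # Squared standard error of each group mean (sample variance / n).
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Placeholder per-book ratios, NOT the values from this analysis.
men = [1.60, 1.62, 1.58, 1.61, 1.59]
women = [1.61, 1.60, 1.59, 1.62, 1.58]
t, df = welch_t(men, women)
print(t, df)
```

Turning t and the degrees of freedom into a p-value needs the CDF of the t distribution, which Python's standard library does not provide; with SciPy installed, `scipy.stats.ttest_ind(men, women, equal_var=False)` runs the same test and reports the p-value directly.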

EDIT

I played some more with the data and got this bar graph of the usage of individual letters.

[Figure: usage per letter]

Again, you can see no major differences between men and women writers.
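The per-letter usage can be sketched in the same spirit. Again this is Python, not the R code behind the graph; `letter_frequencies` is a made-up name, and the sketch assumes that usage means each letter's share of all unaccented letters a to z in the text.

```python
from collections import Counter
import string


def letter_frequencies(text):
    """Return a dict mapping each letter a-z to its share of all letters in text."""
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    total = sum(counts.values())
    return {
        letter: (counts[letter] / total if total else 0.0)
        for letter in string.ascii_lowercase
    }


# Tiny example on one sentence; a real run would use the full text of a book.
freqs = letter_frequencies("Call me Ishmael.")
for letter, share in freqs.items():
    if share > 0:
        print(letter, round(share, 3))
```

Computing these shares per book and then averaging within each group would give one bar per letter and group, comparable to the graph above.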

EDIT2: I repeated the analysis with 10 books per group. I would say that there is definitely no difference.


1 I considered a, e, i, o and u as vowels; the result does not change substantially if y is included.