Stylometry and Gender

Communication between men and women can sometimes be a little… difficult. And that’s understandable. In conversations, the two genders tend to use different nonverbal cues [1].
Men smile less and use fewer facial expressions than women. In conversation, women use paralanguage (vocal elements that are not words, for example sounds like ‘mhm’ or ‘oh’) more frequently than men. Women use it mainly to accommodate the other person and show understanding, whereas men use it mainly to confirm a statement. The genders also differ in posture: men usually stand wider, with their arms and legs further apart, whereas women usually cross their legs and keep their arms close.

It is clear there are some differences in communication styles when it comes to women and men. But is this consistent when it comes to written communication? Do women generally write differently from men, and if so, in what regard?

Unfortunately, I don’t have time to check every word ever written and compare the two. Luckily, there are several tools to do this for me. Although style might intuitively seem unmeasurable, it can be measured.

To comprehend this process, it’s necessary to understand the word ‘stylometry’. Stylometry is the quantitative study of literary style through computational distant reading methods [2]. Distant reading aspires to generate textual and numerical evidence from texts at large scale. A simpler explanation would be that distant reading is “understanding literature not by studying particular texts, but by aggregating and analyzing massive amounts of data” [3]. By using formal languages to learn about large sets of data, conclusions can be drawn in a technical and precise way. Distant reading can be seen as a quantitative approach to literary studies [4].

But how could you capture something as arbitrary as ‘style’ in such a technical manner? One example is vocabulary. Everyone has a unique vocabulary, large or small. One way to measure this is to count the number of unique words in a text. Mean sentence length and the use of punctuation can also be indicators of style.
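To make this a bit more concrete, here is a minimal Python sketch of how those three measures could be computed. It is only an illustration and not the method of any study cited here: the function name `style_features`, the sample sentence, and the very crude rules for splitting words and sentences are all my own assumptions. Real stylometry tools use far more careful tokenization.

```python
import re


def style_features(text):
    """Compute a few simple style measures (hypothetical helper, for illustration only)."""
    # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Assumption: a word is a run of ASCII letters or straight apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    punctuation = re.findall(r"[.,;:!?]", text)

    unique_words = set(words)
    return {
        # Vocabulary size: number of distinct words in the text.
        "unique_words": len(unique_words),
        # Distinct words divided by total words, so longer texts can be compared.
        "type_token_ratio": len(unique_words) / len(words) if words else 0.0,
        # Average number of words per sentence.
        "mean_sentence_length": len(words) / len(sentences) if sentences else 0.0,
        # Punctuation marks per word.
        "punctuation_per_word": len(punctuation) / len(words) if words else 0.0,
    }


sample = "The cat sat. The cat ran away, quickly!"
features = style_features(sample)
print(features)
```

On their own these numbers say little. They only become interesting when you compute them for many texts and compare groups of authors.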

Back to our research question: do texts written by men and by women show significant stylometric differences? Yes, states George K. Mikros in Systematic stylometric differences in men and women authors: a corpus-based study [5]. His research is based on a corpus of newspaper articles and shows that men and women use most stylometric features differently. One of his conclusions is that women’s texts feature rare vocabulary, while men’s texts show less lexical repetition and avoid standardized lexical patterns. He also found support for the ‘report vs. rapport’ distinction [6]: female authors tend to concentrate on interaction with readers, while male authors focus on transmitting information.

Another study, The limits of distinctive words: re-evaluating literature’s gender mark debate by Sean Weidman and James O’Sullivan [7], uses not only contemporary data but also literary works from other periods. The results are, again, quite stereotypical. Word counts show women tend to describe places in a micro sense: ‘home’, ‘kitchen’, ‘hallway’. Men, on the other hand, focus on larger spaces when discussing location: ‘country’, ‘town’, ‘space’. This micro- versus macro-diction extends to positions, travel and time.
Descriptions are not the only place where significant differences show up. Males present information with much more confidence: ‘declared’, ‘absolutely’, ‘exactly’. Females tend to be more uncertain in their choice of words: ‘believed’, ‘seemed’, ‘perhaps’.
These stereotypical outcomes might be expected for historical periods, before the rise of emancipation and changes in lifestyle. Interesting fact: this research shows the stylistic gap between genders is greater than ever.
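
The kind of word counting described above can be sketched in a few lines of Python. To be clear, this is not how Weidman and O’Sullivan ran their analysis. The word lists below contain only the examples quoted in this post, and the names `MARKERS` and `marker_counts` as well as the sample text are made up for illustration.

```python
import re
from collections import Counter

# Assumption: these tiny lists are just the example words quoted in this post,
# not the full vocabularies used in the study.
MARKERS = {
    "micro_place": {"home", "kitchen", "hallway"},
    "macro_place": {"country", "town", "space"},
    "confident": {"declared", "absolutely", "exactly"},
    "tentative": {"believed", "seemed", "perhaps"},
}


def marker_counts(text):
    """Count how often words from each marker list occur in a text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    return {
        name: sum(counts[word] for word in vocabulary)
        for name, vocabulary in MARKERS.items()
    }


sample = "She believed the kitchen seemed small. Perhaps the town was bigger."
result = marker_counts(sample)
print(result)
```

A single sentence proves nothing, of course. A comparison only becomes meaningful when such counts are collected over a large corpus and normalized by text length.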

I only mentioned two research projects directly, but both articles refer to plenty of other articles that report similar results. I never knew stylometric differences in the written word, whether it’s a newspaper article or a book, were this stereotypical and significant.

Remember this next time you write your boyfriend a shopping list: do not go into too much detail about the place where you wrote it and the feelings it prompted. And men: remember to leave out your infallible opinion about space and the universe next time you text your sister.
And everyone will get along just fine.

References:

1 https://online.pointpark.edu/public-relations-and-advertising/gender-differences-communication-styles/

2 https://programminghistorian.org/en/lessons/introduction-to-stylometry-with-python

3 https://hollythehistorian.wordpress.com/2015/09/21/blog-4-distant-reading-text-mining-and-topic-modeling/

4 https://www.youtube.com/watch?v=UNsXL-0Svb4

5 https://www.researchgate.net/publication/236583624_Systematic_stylometric_differences_in_men_and_women_authors_a_corpus-based_study

6 Tannen, Deborah (1991). You just don’t understand: Women and men in conversation. London: Virago Press

7 https://academic.oup.com/dsh/article/33/2/374/3111279