Introduction

Parenting requires making complex, subjective decisions about difficult topics. From before a child is even born, parents are asked to make decisions on behalf of their babies’ health, education, development, and wellbeing. While parenting might not be an inherently stigmatizing topic to discuss, some issues related to parenting can indeed be stigma-inducing. For example, postpartum depression (PPD) might be associated with the belief that parents are not wholly loving of, or able to care for, their children . Parents of children with special needs have to construct narratives around their interaction with their children that not only include their roles as parents, but also their roles as caretakers and advocates . Parents can experience stigma associated with lingering societal perceptions about identity (e.g., as a lesbian, gay, bisexual, or transgender (LGBT) parent ) or social status (e.g., parents who are divorced or non-custodial ). Stigmatization can also be experienced by association, based on close interactions with another stigmatized person. For example, having a child who exhibits violent behavior or having a child who identifies as LGBT requires parents to assess and navigate appropriate disclosures on behalf of the child.

A significant challenge that parents face is the perceived stigma and judgment from family members and other parents . Parents can feel pressure around sharing the details of a child’s experiences and health with grandparents and other family members who might not support the parents (e.g., parents of children with special needs ). Similarly, divorced parents have to manage the stigma associated with their new roles as divorced or estranged, which may be partially due to role transitions, such as custody battles between parents . Prior research shows that self-disclosure provides an important therapeutic outlet that in turn has positive psychological and physical health implications .

Social media sites provide a potential platform for parents to access information and social support for parenting questions; however, studies suggest that parents can be judgmental towards each other online (e.g., ). On sites like Facebook or Instagram, parents may feel a social expectation to perform an idealized version of parenting , making it difficult for them to disclose sensitive topics . Parents at times cannot gauge the propriety of sharing their experiences with other parents who might perceive such self-disclosure as one-upmanship or unfair comparison with their own children . Many popular parenting sites like BabyCenter support the use of pseudonyms, while others like YouBeMom support complete anonymity, allowing parents to disclose sensitive content more freely. However, it has been difficult to evaluate disclosure differences across different levels of anonymity given the wide variance in community norms across sites. On Reddit, however, users can easily alternate between posting under their username and creating a temporary "throwaway" account that allows them to post anonymously. This provides a natural context in which to investigate differences in parents’ posting behaviors between username (pseudonymous) and throwaway accounts.

Throwaway accounts have been used by Reddit users to discuss sensitive issues relating to relationships, gender identity, sex, and confessions , and stigmatized experiences including sexual assault and mental illness . Reddit has also been used to discuss stigmatized parenting content. A recent news story highlighted how mothers on Reddit — many using throwaway accounts — discussed how “motherhood was a bad idea” , an assertion that would be received critically in many contexts.

This research investigates what parents disclose when they use throwaway accounts on Reddit and whether their disclosure behaviors differ from non-throwaway posts. In Study 1, we explore what factors predict parents posting to Reddit using throwaway accounts rather than with their usernames. In Study 2, we examine the main themes discussed on those throwaway accounts. In Study 3, we investigate whether and how responses to throwaway accounts differ from responses to non-throwaway accounts.

Study 1 uses topic models and lexico-syntactic categories as features in a logistic regression classifier to determine the topics/lexical categories that predict if a throwaway account is used. Study 2 uses log likelihood ratios, coupled with qualitative methods, to produce eleven themes discussed by throwaway accounts on parenting subreddits. Finally, in Study 3 we use propensity score matching (PSM) and find that throwaway comments received more responses that were longer and received higher average karma scores.

Our results offer two overarching contributions: first, throwaway accounts allow parents to overcome societal expectations that they will be “good” parents; and second, anonymity enables increased disclosure and support for parents. We discuss how temporary accounts allow parents to discuss potentially stigmatizing topics, thus gaining information and social support from parents with similar experiences. We argue that throwaways provide parents with shared norms and expectations for sharing potentially stigmatizing experiences while still being embedded within their existing online community. Throwaways can also allow parents to make better sense of the boundaries and norms of the subreddit, after which they can “graduate” to pseudonymous accounts. We propose design opportunities for hybrid identified and anonymous social media sites that can provide more supportive online experiences for parents and other users.

Related work

In this section, we focus on two areas of literature. The first section summarizes literature on parenting and online self-presentation especially when considering potentially stigmatizing issues. In the second section, we focus on the use of anonymous and pseudonymous social media sites to discuss stigmatizing issues.

Parenting and online self-presentation

Parenting has been a common topic in online communities since the days of “the WELL1” — an early online community that shaped scholarship about how people interact online . On the WELL’s Parenting Conference, or board, parents shared intimate descriptions of their experiences, ranging from the mundane (e.g., diaper changing) to discussions about LGBT teenagers and children with special needs. Mothers and fathers provided “emotional support on a deeper level, parent to parent, within the boundaries of Parenting [Conference], a small but warmly human corner of cyberspace.” Engaging with other parents on social media sites provides parents with social support . For example, mothers are empowered by and find a sense of community through blogging and fathers look for other fathers facing similar experiences and challenges to engage with online . This allows parents to seek information about their parenting experiences, but also to make sense of their experiences .

When parents present themselves to those in their networks, they do so in ways that are considered to be socially acceptable. For example, when sharing pictures about children online, parents share pictures that show the family in a good light . Mothers mostly share pictures that show a happy child in “cute” settings, especially if they are achieving child milestones . Parents usually refrain from posting pictures of children crying, naked children, or any other pictures that might not be socially acceptable . Parents may not want to share their views online about specific topics where other parents might have strong and different views. For example, parents avoided discussing sleep training, vaccinations and discipline on Facebook .

This adherence to normative standards could be threatened if parents have a child with behavioral challenges or if the normative family unit is dissolved (e.g., due to divorce or through abuse allegations). Parents experiencing postpartum depression (PPD) also feel stigmatized by their condition . PPD is a mental health diagnosis associated with stigma that can inhibit both mothers and fathers from seeking help. Parents “fear the disclosure of mental illness and stigmatization and, in turn, often forgo treatment to avoid label attachment. Additionally, stigma causes withdrawal and social exclusion” which adds to the negative effects associated with PPD. Moreover, fathers may not have as much support when they face PPD as mothers .

Finally, while the stigma faced by divorcees has declined , divorced parents still have to face social stigma and ambivalence about their status . Gerstel argues that the stigma associated with divorce is related not specifically to the act of divorce, but to all the associated transitions that occur at the time of divorce. One such transition, the legal custody battle over the children, can be particularly stigmatizing, especially when allegations of abuse are made by one or both of the parties involved in the divorce .

Early research on parenting and online communities by Madge and O’Connor suggests that mothers use anonymous sites to examine alternative perspectives on motherhood that might not be normative. Similarly, YouBeMom, an anonymous online community, allows mothers to discuss topics that they might not want to share with friends on Facebook or in face-to-face interactions, including negative discussions of their spouses and/or their children. Ammari and Schoenebeck describe how newly-divorced fathers prefer using Reddit over “real-name” sites like Facebook for parenting advice because they perceive some of the responses to their parenting posts on Facebook to be judgmental. These fathers also engage in self-censorship by refraining from discussing parenting topics that might be deemed problematic, like custody battles with their partners .

Discussing stigmatising topics on Reddit

Anonymity online is thought to have negative effects due to what Suler has called the “online disinhibition effect,” in which users engage in antisocial behaviors like trolling and flaming . However, Bernstein et al.’s study of 4chan argues that anonymity can also be advantageous in “advice and discussion threads [where] anonymity may provide a cover for more intimate and open conversations.”

Reddit is a social news site with pseudonymous identities where users accrue karma points if their posts are up-voted. As opposed to 4chan, Reddit requires a username and persistent identity. Leavitt demonstrates that when sharing personal information on Reddit, users regularly post to the site via “throwaway accounts.” Throwaway accounts are “temporary” Reddit accounts that users can create in addition to their primary account. Throwaway accounts provide relative anonymity by disaggregating throwaway account posts from the user’s primary account , thus acting as proxies for anonymity on Reddit . Throwaway accounts allow users to “navigate boundaries” on Reddit especially when posting about personal issues such as “relationships, sex, gender, confessions [etc.]” , “identity-work associated with sexual identities that are not exclusively heterosexual” , and seeking support for stigmatized experiences (e.g., sexual abuse and mental health) .

The use of social media sites is framed by platform affordances and norms on the site . Specifically, Reddit’s design features facilitate easy setup and use of throwaway accounts. It is also supported by norms on Reddit which accept the use of throwaway accounts when discussing stigmatizing issues (e.g., ).

Previous research describes Redditors engaging in “identity-work associated with sexual identities that are not exclusively heterosexual” . These reflections on one’s sexuality often included not-safe-for-work (NSFW) content such as pornography or potentially stigmatizing content. Robards, echoing earlier work, suggests that most of these posters used throwaway accounts to separate NSFW content from their other, safe-for-work (SFW) contributions on the site .

Reddit forums provide a space where users can address themes in which they are most interested. The network of users within any subreddit is more reflective of their interest in a particular topic than of their existing social networks or geography . Some of these interests might revolve around sharing news (e.g., ), others explicitly provide social support in specific contexts like suicide watch (e.g., ), while others engage in identity work (e.g., ).

Reddit users can seek social support when facing particularly stigmatizing issues like sexual abuse, mental health issues or eating disorders. Andalibi et al. argue that seeking support when experiencing sexual harassment can be helpful, but only if the discloser is supported by those who respond to their comments. They argue that moderators and other subreddit members pay more attention to throwaway accounts that are usually employed by users discussing their sexual abuse. Their study found that throwaway users seek support, provide support to other users in similar situations, and engage in sense-making as well as in asking explicit questions about their experience. Similarly, Reddit users who have experienced domestic abuse discuss their abuse in detail using throwaway accounts .

De Choudhury & De describe how throwaway accounts empower users to engage in mental health discourse without affecting their reputation (i.e., karma points). Indeed, throwaway accounts were six times more prevalent than pseudonymous accounts on mental health subreddits when compared to other subreddits . Using a text categorization scheme proposed by Altman and Taylor and weighting n-grams of throwaway comments, Pavalanathan and De Choudhury found that throwaway users shared more detailed information about themselves focusing on their “personal beliefs, needs, fears, and values.” In their study, pseudonymous accounts on the same subreddits shared considerably less personal information about their experience and focused on the help they were seeking from the site .

Study 1: Predictors of throwaway posting on parenting subreddits

Prior work suggests that pseudonymous social media sites like Reddit allow parents to discuss topics that might be problematic to share on Facebook such as vaccinations, circumcision, divorce and custody . We test that theory here with our first research question:

RQ1: What are the predictors of parents posting to Reddit as throwaways?

In Study 1, we use a logistic regression classifier in a prediction task where we will find the features predicting throwaway accounts. Below, we identify each of the feature vectors used in our model.

Dataset and Methods

We used a publicly available Reddit dataset2. This dataset was collected by Baumgartner using the Reddit API. The dataset included all public comments and submissions on Reddit3. The dataset includes comments, user names (pseudonyms), as well as comment timestamps and karma scores. No other identifying information, such as gender or age, is given.

The data we use in our analysis were drawn from public subreddits between March 31, 2008 and October 31, 2018. While there are posts about parenting on Reddit before March 2008, the first post on any parenting-centered subreddit was on March 31, 2008, when r/Parenting was created. We focused our analyses on three subreddits: r/Parenting, r/Daddit, and r/Mommit. Table [subredditnum] shows the number of throwaway comments as well as the unique throwaway users under each subreddit. The total number of unique throwaway accounts across all three subreddits is 1,459. This is lower than the sum of the per-subreddit counts (1,493) because some of the throwaway accounts commented in more than one of the subreddits.

In their analysis of posts about mental health on Reddit, De Choudhury and De note that despite mental illness being a stigmatized topic, “a rather small percentage of users in our dataset used throwaway accounts (1,209 users; 4.46%).” While there might be particular topics that parents find socially stigmatizing, parenting, as a general topic, is not understood to be socially stigmatized in the same way that mental health is. That might explain why the percentage of throwaway accounts in our study is almost one fifth the percentage of throwaway users in their study.

We selected the largest three parenting subreddits. There are 1.2 M registered users on r/Parenting, 117 K registered users on Daddit, and 73.2 K registered users on r/Mommit. We chose not to analyze other related subreddits that focused on closely related, but distinct, topics, like pregnancy (r/Babybumps; 119k members) and expecting fathers (r/predaddit; 29.7k members). We also excluded r/beyondthebump (92.8k members) which perhaps could have been included but is an extension of the pregnancy experience and continues to focus on that part of the parenting experience. Finally, we made the decision to focus on parenting communities with broadly similar (i.e., supportive) norms and not to focus on communities that are designed to be sarcastic and harsh (r/BreakingMom; 44.4k members).

Table [subredditnum]: Number of comments, threads, and unique users per subreddit.

                          r/Parenting    r/Daddit      r/Mommit
First comment             03/31/2008     09/04/2010    07/17/2010
Comments                  2,112,028      440,728       232,919
Threads                   100,373        44,730        17,399
Unique users              128,527        53,059        22,683
Throwaway comments        9,838          416           378
Unique throwaway users    1,275          139           79

Finding throwaway accounts

Basing our method on earlier work in , we identify throwaway accounts by first looking for the term “throwaway” or a variant of it in the account names. We use the list suggested by Andalibi et al. , specifically “[*thrw*, *throwaway*, *throw*, *thrw*, *thraway*]”. In addition, we added any users who used statements like, “this is a throwaway account,” or “I’m using a throwaway account.” Once we identified a set of users, the first author randomly selected 50 users to manually verify that they were indeed throwaway accounts. All but one of the users explicitly stated that they were using throwaway accounts. Many parents explained their use of a throwaway account by saying that they were using one “for obvious reasons”; others provided explanations including: (1) other family members being Reddit users; and (2) feeling more at ease using a throwaway account for what they perceived to be a stigmatizing narrative: “I am ashamed of what I am divulging here.”
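The identification step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `is_throwaway`, the case-insensitive substring matching, and the exact phrasing matched in the comment body are all assumptions made for the example.

```python
import re

# Username fragments quoted in the text (the repeated "thrw" is listed once).
# Case-insensitive substring matching is an assumption of this sketch.
NAME_PATTERN = re.compile(r"thrw|throwaway|throw|thraway", re.IGNORECASE)

# Self-declarations in the comment body, modeled on the two example
# statements in the text. The precise phrasing matched is hypothetical.
BODY_PATTERN = re.compile(
    r"\b(?:this is|i[’']?m using|i am using)\s+a\s+throw[\s-]?away(?:\s+account)?\b",
    re.IGNORECASE,
)


def is_throwaway(username: str, comment_body: str = "") -> bool:
    """Return True if the username or the comment text signals a throwaway."""
    return bool(NAME_PATTERN.search(username) or BODY_PATTERN.search(comment_body))
```

A broad fragment such as "throw" can match unrelated usernames, which is one reason a manual check of a random sample, as described above, remains useful.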

Logistic regression classifier

In a logistic regression classifier, a best-fit set of parameters (weights) is estimated from the training data. Fitting is done using a sigmoid function, which resembles a smoothed step function. Each feature value is multiplied by its weight and the results are added up; this sum, $z$, is the input to the sigmoid function, $\phi(z) = \frac{1}{1 + e^{-z}}$, which yields a result between 0 and 1. Any value above 0.5 is classified as class 1, and any value under 0.5 is classified as class 0 . We built a logistic regression model on our data. The model had two classes, class 1: Throwaway, and class 0: Not Throwaway.
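As a rough illustration of the decision rule described above, the following Python sketch computes the weighted sum of the features, passes it through the sigmoid, and thresholds the result at 0.5. The function names, the optional bias term, and the example weights are assumptions for illustration only and are not taken from the fitted model.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(features, weights, bias=0.0, threshold=0.5):
    """Return (label, probability) for one observation.

    Label 1 stands for "Throwaway" and 0 for "Not Throwaway".
    """
    z = bias + sum(w * x for w, x in zip(weights, features))
    p = sigmoid(z)
    return (1 if p > threshold else 0), p
```

For example, `predict([2.0, 1.0], [1.0, 0.5])` gives z = 2.5, a probability above 0.5, and therefore class 1.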

A note on balancing the data set

The class we analyzed in our logistic regression classifier, throwaway comments, is the minority class in the dataset. On such imbalanced data, a classifier can achieve high accuracy simply by always predicting the majority class (not a throwaway comment). In order to balance the dataset, we under-sampled the majority class. Undersampling balances the dataset by randomly removing observations from the majority class (non-throwaway comments). This generated a 50:50 class ratio for the classifier with a baseline accuracy of 0.5.
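Random undersampling of the majority class can be sketched as follows. This is a minimal example under stated assumptions: the function name `undersample`, the `label_of` accessor, and the fixed seed are hypothetical, and class 1 is assumed to be the minority class.

```python
import random


def undersample(rows, label_of, seed=0):
    """Randomly drop majority-class rows until both classes are equal in size.

    Assumes label 1 is the minority class and label 0 the majority class.
    """
    rng = random.Random(seed)
    minority = [r for r in rows if label_of(r) == 1]
    majority = [r for r in rows if label_of(r) == 0]
    kept = rng.sample(majority, k=len(minority))  # sample without replacement
    balanced = minority + kept
    rng.shuffle(balanced)
    return balanced
```

With a 50:50 class ratio, always guessing one class yields an accuracy of 0.5, which is the baseline mentioned above.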

One of the disadvantages of undersampling is that there is some data loss, since we are removing some of the data from the majority class. Another method that is employed to balance datasets is oversampling. Oversampling works by randomly creating synthetic data similar to the data in the minority class. Oversampling, however, increases the possibility of overfitting, which would affect the predictive capacity of the model . We have chosen to use random undersampling in our analysis in order to reduce the chances of overfitting. However, we have also trained the same models on oversampled minority classes, using the SMOTE implementation from the scikit-learn-compatible imbalanced-learn package, to compare the results with those of the models presented in this paper. We found that the performance of the models was comparable in both cases, as were the features and their importance to the prediction.

We split our data by time: the training set comprises comments posted between 2012/01/01 and 2018/01/01, and the test set comprises the remaining comments, posted between 2018/01/01 and 2018/10/31. We opted to start the training data in 2012 in order to reduce any effects of changes to the subreddits too close to their creation dates (see Table [subredditnum]). Splitting the dataset this way also avoids data leakage, since the test data did not yet exist when the training data was generated .
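A time-based split of this kind might look like the sketch below. The field name `created` and the handling of the boundary day (comments dated 2018/01/01 are placed in the test set) are assumptions, since the text does not specify them.

```python
from datetime import date

TRAIN_START = date(2012, 1, 1)
TEST_START = date(2018, 1, 1)   # boundary day assigned to the test set (assumption)
TEST_END = date(2018, 10, 31)


def time_split(comments):
    """Split comments into (train, test) by creation date.

    Each comment is a dict with a 'created' date; comments outside
    both windows are dropped.
    """
    train, test = [], []
    for c in comments:
        d = c["created"]
        if TRAIN_START <= d < TEST_START:
            train.append(c)
        elif TEST_START <= d <= TEST_END:
            test.append(c)
    return train, test
```

Because every training comment predates every test comment, nothing written after the training window can influence the fitted model.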

To check against the time-split model, we also trained a model using a random split, with 80% of the accounts as training data and 20% as test data, and 5-fold cross-validation. This classifier showed only marginally better results than our model. However, given our interest in reducing temporal leakage between earlier and later comments, we decided to keep the time-split model.

We built a logistic regression classifier with L2 regularization, which penalizes large coefficient values and thereby constrains the complexity of the model, to get robust coefficients 4.
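To make the effect of L2 regularization concrete, the sketch below fits a logistic regression by plain gradient descent and adds the L2 penalty to the weight gradient. It is an illustrative toy, not the authors' training code: the function name `fit_logreg_l2`, the learning rate, the number of epochs, and the choice to leave the bias unpenalized are all assumptions.

```python
import math


def _sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logreg_l2(X, y, l2=1.0, lr=0.1, epochs=500):
    """Minimize mean log-loss + (l2 / 2) * ||w||^2 by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = _sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        # The L2 term contributes l2 * w_j to each weight gradient,
        # pulling weights toward zero; the bias is not penalized here.
        w = [wj - lr * (gj + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b
```

On a small separable dataset, the unregularized weights keep growing with more training, while the penalized weights settle at a smaller magnitude, which is what makes the coefficients more stable to interpret.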

Features for logistic regression classifier

We used 135 features in our logistic regression classifier to predict if the Reddit account is a throwaway account. Sixty of the features are LDA topics. Seventy two features are sentiment analysis values (LIWC linguistic measures). Finally, three features represent control features including karma scores, comment length and user tenure on parenting subreddits. While these three control features are not directly related to the comment text (LDA topics and sentiments), they describe the behavior of users on parenting subreddits.

LDA Topic Modeling for Topic Detection [60 features]

We introduced the LDA topic modeling features to measure how different topics that make the content of the parenting Reddit comments are associated with Throwaway accounts.

First, we represented our corpus of Reddit threads as a bag of words (BoW) where each document (Reddit thread) in the corpus is represented by a list of words disregarding grammar and word order . Using the BoW representation directly as features would create too many features relative to the number of observations. Besides, we are interested in interpreting the results of the logistic regression classifier. We therefore used the Latent Dirichlet Allocation (LDA) model to discover topics in our corpus . A document $D$ is a sequence of $N$ words, $D = (w_1, w_2, \ldots, w_N)$. A corpus $C$ is in turn a collection of $M$ documents, $C = (D_1, D_2, \ldots, D_M)$. The output of an LDA model is a set of abstract topics found throughout the corpus $C$. In this analysis, the corpus $C$ represents the text of comments throughout the three subreddits. We assume that each thread contains a related set of topics. Therefore, we take every single thread to be a document $D$. Each of the LDA topics is represented by a number of key terms, which we refer to as a key term group (KTG). We trained an LDA model using the Python Gensim package on the corpus of the aggregated subreddits, r/Parenting, r/Daddit and r/Mommit. We list the significant LDA topics and their KTGs in Table [tab:LDA].
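The bag-of-words representation with one document per thread can be illustrated with a short Python sketch. The tokenizer, the field names `thread_id` and `body`, and the function names are assumptions for this example; the actual preprocessing pipeline is not described in the text.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase word tokens, ignoring grammar and word order."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def thread_documents(comments):
    """Group comments by thread and build one bag of words per thread.

    Each comment is a dict with hypothetical keys 'thread_id' and 'body'.
    """
    bodies_by_thread = {}
    for c in comments:
        bodies_by_thread.setdefault(c["thread_id"], []).append(c["body"])
    return {tid: bag_of_words(" ".join(bodies))
            for tid, bodies in bodies_by_thread.items()}
```

Two texts containing the same words in a different order produce identical counts, which is the property that lets a topic model treat each thread as an unordered collection of terms.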

In order to find the optimum number of topics for the LDA model, we trained 9 LDA models starting with a number of topics k=10, with a step of 10 topics until a limit of 90 topics. For each of these k iterations, we calculated the coherence of the LDA models using the gensim CoherenceModel feature5. This feature measures the coherence score of the topics in the LDA model. Coherence values have been found to be better at approximating human rating of LDA model “understandability” than other measures like perplexity . Figure [fig:coherence] shows the coherence values for each of the LDA models. We used these scores as a guide to analyze a subset of the LDA models. We verified three LDA models which represent local maxima in the graph at 40, 70, and 80 topics.

Figure [fig:coherence]: Coherence values for LDA models between 10 and 100 topics

To verify the LDA topics, we randomly sampled 100 comments with high topic scores for each LDA topic (the process of finding the scores is discussed below). For each topic, we read as many of the sampled comments as needed to reach saturation, at which point we could describe and verify the topic, or deem it incoherent or irrelevant to the analysis (e.g., spam). We found the 70-topic LDA model to be the most coherent. Of its 70 topics, we verified 60 and found 10 to be incoherent.

Topic score

In order to calculate the score per topic for each comment, we used the inference module in Gensim6, based on work by Hoffman et al. , in order to find the topic distribution over the corpus. A stochastic model estimates the values of topic distributions over a corpus through converging values of estimators populated from the LDA model. This allowed us to show the topic distribution across the corpus . We use the average topic score as a proxy to the user’s interest in particular parenting topics. These features are used to answer RQ1. Due to space limitations, we only present significant LDA topics from the logistic regression.
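Averaging topic scores can be sketched as below, assuming that each comment already has a topic distribution inferred by the LDA model. Grouping by user, the input format, and the function name `average_topic_scores` are assumptions made for illustration.

```python
from collections import defaultdict


def average_topic_scores(comment_topics, num_topics):
    """Average per-comment topic scores for each user.

    comment_topics: iterable of (username, {topic_id: score}) pairs,
    where topics missing from a comment count as a score of 0.
    """
    sums = defaultdict(lambda: [0.0] * num_topics)
    counts = defaultdict(int)
    for user, dist in comment_topics:
        vec = sums[user]  # creates the zero vector on first sight of a user
        for topic_id, score in dist.items():
            vec[topic_id] += score
        counts[user] += 1
    return {user: [s / counts[user] for s in vec] for user, vec in sums.items()}
```

The resulting vector has one entry per topic, so it can be used directly as a block of features for a classifier.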

LIWC linguistic measures [72 features]

We used the Linguistic Inquiry and Word Count (LIWC) text analysis program, a lexicon of linguistic categories that has been psycho-metrically validated and performs well on social media data sets (e.g. De Choudhury et al.) to extract lexico-syntactic features. We applied LIWC 2015 processor on each of the comments in our dataset. While there are other tools to extract lexical categories like Empath , whose categories are highly correlated to LIWC categories, using LIWC would allow us to compare our results to earlier work analyzing lexical categories on social media. For example, LIWC has been used to analyze the differences between discussants on separate sides of the abortion debate on Reddit while Gilbert and Karahalios used LIWC categories to predict tie strengths between Facebook users.

There are 72 LIWC categories divided as follows: (1) standard linguistic measures (e.g., pronouns, articles etc.); (2) 41 term categories measuring psychological constructs (e.g., affect, cognition, and biological processes); (3) personal concern measures relating to work, home, money, religion, and leisure activities; (4) categories covering informal language (e.g., fillers, netspeak, swear words etc.).

Control features [3 features]

In addition to the 60 LDA topic features and 72 LIWC linguistic categories, we also used 3 control features: (1) average user tenure; (2) average Karma score per comment; and (3) average comment length. These three features provide controls in our logistic regression classifiers because they describe user contributions but are not dependent on the LDA topic scores or LIWC lexical category scores.

User tenure provides a measure of engagement in the parenting subreddits. Longer tenure indicates longer engagement in the community. Tenure is calculated by finding the number of days between the user's first comment and their latest one in our dataset. If all comments were made on the same day, the value of the tenure would be zero. Since throwaway accounts are usually used for discussions related to specific topics that might be stigmatizing, the tenure of the user could be used as a proxy to predict whether a user is a throwaway account or not.

Similarly, the average Karma score per comment measures the acceptability of a user's comments within parenting subreddits, while the number of comments they have posted on parenting subreddits reflects their engagement. Each Reddit comment has an associated Karma score, which is the difference between up-votes and down-votes. The more up-votes a comment gets, the higher the Karma score, and vice versa. We divided each user's summed karma score by their total number of comments, which normalizes for user activity on the subreddit. Both values can be considered platform signals that provide proxies for the acceptability of the topics discussed by the user and their activity levels on the site.

Finally, average comment length is a proxy to user engagement in a particular discussion. The longer the comment, the more engaged the user.
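The three control features can be computed per user roughly as follows. This is a sketch under stated assumptions: the field names `created`, `score`, and `body` are hypothetical, and comment length is measured in whitespace-separated words because the text does not state the unit.

```python
from datetime import datetime


def control_features(comments):
    """Compute tenure, average karma, and average length for one user.

    comments: non-empty list of dicts with 'created' (datetime),
    'score' (int karma), and 'body' (str).
    """
    days = [c["created"].date() for c in comments]
    # Tenure: days between first and latest comment; 0 if all on one day.
    tenure_days = (max(days) - min(days)).days
    avg_karma = sum(c["score"] for c in comments) / len(comments)
    # Length measured in words (an assumption of this sketch).
    avg_length = sum(len(c["body"].split()) for c in comments) / len(comments)
    return {"tenure_days": tenure_days,
            "avg_karma": avg_karma,
            "avg_length": avg_length}
```

A user who posts only once, as many throwaway accounts do, gets a tenure of zero under this definition.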

All 135 features are used to answer RQ1.

Understanding context using doc2vec

While LDA provides us with a list of topics from the document, it does not account for the semantics of the document because the order of words is not preserved as we discuss in section 3.4.1. Word2vec creates word embeddings using continuous bag of words (CBOW) where the algorithm predicts a target word relying on the surrounding terms. Word embeddings allow us to understand semantic context and the distance between words. For example, the word “powerful” is semantically closer to the word “strong” than it is to “Paris,” and word embeddings maintain these distances . Doc2vec builds on this by accounting for the context of words within “documents” . By doing so, doc2vec allows us to determine the difference between the same term in different documents. Documents in the doc2vec model can be defined as a sentence, paragraph etc. We define each Reddit comment as a document for our doc2vec model.

We extended the doc2vec model by tagging each document with an LDA topic tag if the comment has a high topic score (see subsection 3.4.1.1) for any of the significant topics (as identified in the classifier). By extending the doc2vec model using the LDA topic tags, we analyzed how each document and its associated tags “share high semantic similarity which allows us to learn the embeddings of [the top LDA comment tags] along with the documents.” We used Gensim’s implementation of doc2vec7 to train our doc2vec model. We presented the context for each of the LDA topics by finding the closest terms to each LDA topic tag. The semantic context for each of the significant LDA topics is listed in Table [tab:LDA]. The doc2vec terms are not used as features in the logistic regression classifier since they are not easily interpreted. In other words, we would not be able to make sense of the embeddings if they were associated with predicting throwaway accounts. However, they do provide context for each of the LDA topics, especially those terms that might be stigmatizing. For example, the Gender and parenting expectations LDA topic terms might not show any stigmatizing words, but doc2vec includes words like “sexist”, “estranged,” and “child-molest.”

Results: What topics do throwaway users discuss

Descriptive statistics

There are 1,459 throwaway accounts that posted 10,632 comments. The average score for throwaway accounts is 5.53 while the average tenure is 39.26 days. Most of the throwaway accounts are used only within the same day (tenure = 0) or for a few days. However, some users maintained their throwaway accounts for a while longer. This might be because parents kept using the throwaway account for specific parenting discussions.

Figure [throwaway_stats]: Tenure for pseudonymous vs. throwaway accounts.

Figure [throwaway_stats] shows the tenure profile for throwaway users (top) vs. pseudonymous users (bottom). While most throwaway accounts were used only within the same day, thus having a tenure of zero, some users maintained the use of their throwaway accounts for longer periods of time.

Classifier for throwaway accounts

The model has an accuracy of 0.699, precision of 0.690, recall of 0.683, and F1 score of 0.712. We also calculated the Area Under the Curve (AUC) metric for the model in order to analyze its fit. The AUC is a common metric used to evaluate classifiers such as logistic regression models. After plotting the true positive rate (i.e., predicting a throwaway when the user is actually a throwaway) on the y-axis against the false positive rate (i.e., predicting a throwaway when the user is not a throwaway) on the x-axis, the AUC for our model is 0.777, which suggests that our model fits reasonably well.
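As an illustration of how these evaluation metrics are defined, the following is a minimal sketch in plain Python. The labels and scores are toy values, not data from this study, and the function names are hypothetical. The AUC is computed through its rank interpretation: the probability that a randomly chosen throwaway account receives a higher predicted score than a randomly chosen pseudonymous account.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = throwaway)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (assumed values, for illustration only).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]           # predicted P(throwaway)
y_pred = [1 if s >= 0.5 else 0 for s in scores]   # 0.5 threshold is an assumption

print(classification_metrics(y_true, y_pred))
print(roc_auc(y_true, scores))
```

The pairwise form is quadratic in the number of accounts, so it is only meant to make the definition concrete; a library implementation would be preferable on the full dataset.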

When analyzing the logistic regression classifier, features with positive weights increase the likelihood that the user is classified as a throwaway user. Features with negative weights predict that the user belongs to Class 0, that is to say, the user is not a throwaway user. Our logistic regression classifier describes throwaway users who engage in the following topics: (a) gender & parenting expectations; (b) abuse and therapy; (c) parenting hardships; (d) work-parenting demands; (e) parenting nature; (f) financial problems; (g) family health; (h) speech and social development; (i) growing pains; (j) religious and social beliefs; (k) body image and privacy; (l) pregnancy challenges, loss, and grief; and (m) circumcision. The list of topics and associated KTG are shown in Table [tab:LDA]. All these LDA topic features are positively associated with throwaway accounts.

Conversely, the use of the LIWC categories adjectives {free, happy, long} and numbers {second, thousand} is negatively related to predicting a throwaway account. The verbs LIWC category, in contrast, is positively associated with predicting a throwaway account. Using more verbs indicates “attitude markers...which indicate the writer’s affective” response to certain propositions. Attitude towards a topic can be signaled by “attitude verbs (e.g., agree, prefer).”

While the LDA topic religious and social beliefs is positively associated with throwaway accounts, the LIWC religion category is negatively associated with throwaway accounts. We return to this in section 4.3.8.

Tenure is expected to be negatively associated with being a throwaway account since in most cases throwaway accounts are created to engage in particular topics, and therefore, they tend to have shorter tenures than pseudonymous users. In general, this model confirms what we expected: that users might want to use throwaway accounts when engaging in topics that might be stigmatizing.

This table presents the significant features from the logistic regression classifier. OR is the odds ratio, i.e., the multiplicative change in the odds of an account being a throwaway associated with a one-unit increase in the feature. Only significant features are presented in this table. **** p<0.0001; *** p<0.001; ** p<0.01; * p<0.05. The feature type indicates whether the feature is an LDA topic, a LIWC category, or a control feature.
Predictor                             Coefficient  p-value  OR      Feature Type
Tenure                                -1.045       ****     0.352   Control
Gender & parenting expectations       0.297        ****     1.346   LDA
Abuse & therapy                       0.334        ****     1.397   LDA
Parenting hardships                   0.270        ****     1.310   LDA
Work-parenting demands                0.213        ****     1.237   LDA
Parenting nature                      0.210        **       1.233   LDA
Religion                              -0.264       **       0.768   LIWC
Financial problems                    0.238        **       1.269   LDA
Family health                         0.171        **       1.186   LDA
Speech & social development           0.171        **       1.186   LDA
Adjectives                            -0.812       **       0.444   LIWC
Growing pains                         0.125        *        1.133   LDA
Religious & social beliefs            0.137        *        1.147   LDA
Numbers                               -0.219       *        0.804   LIWC
Body image & privacy                  0.131        *        1.140   LDA
Pregnancy challenges, loss, & grief   0.136        *        1.146   LDA
Verbs                                 1.906        *        6.724   LIWC
Parenting groups                      0.117        *        1.124   LDA
Circumcision                          0.112        *        1.118   LDA
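
The OR column in the table above appears to follow the usual relationship for logistic regression, in which the odds ratio is the exponential of the coefficient. A minimal sketch of that arithmetic, using a few coefficients copied from the table (the function name is hypothetical):

```python
import math


def odds_ratio(coefficient):
    """Odds ratio implied by a logistic regression coefficient: OR = exp(beta)."""
    return math.exp(coefficient)


# Coefficients copied from the feature table above.
coefficients = {
    "Tenure": -1.045,
    "Gender & parenting expectations": 0.297,
    "Religion": -0.264,
    "Adjectives": -0.812,
}

for feature, beta in coefficients.items():
    # OR > 1: higher feature values raise the odds of being a throwaway;
    # OR < 1: higher feature values lower those odds.
    print(f"{feature}: OR = {odds_ratio(beta):.3f}")
```

Read this way, an OR of 1.346 for gender & parenting expectations means that a one-unit increase in that feature multiplies the odds of the account being a throwaway by roughly 1.35, holding the other features fixed.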

Study 2: Throwaway conversations

To contextualize results from Study 1, we expand on our quantitative results using both quantitative and qualitative methods. We use Log Likelihood Ratio (LLR) to develop themes associated with the LDA topics, and use qualitative analysis to check and expand our inquiry into the nature of parents’ comments. Using our knowledge of the LDA topics predictive of throwaway accounts, we ask:

RQ2: What are the main themes discussed by throwaways?

Methods

In order to build an understanding of the differences between throwaway and pseudonymous conversations, we used Log Likelihood Ratio (LLR) analysis coupled with qualitative methods. First, we introduce the concept of Log Likelihood Ratios, and then describe the steps we used to find themes discussed by throwaways.

Log Likelihood Ratio

The Log Likelihood Ratio is the logarithm of the ratio of the probability of a word’s occurrence in throwaway comments to the probability of its occurrence in pseudonymous comments. LLR analysis requires two documents to compare. In our case, LLR is used to compare throwaway and pseudonymous conversations discussing each of the significant LDA topics (see Table [tab:LDA]). A large positive LLR value indicates that the term is more likely to appear in throwaway conversations than in pseudonymous conversations, while a large negative value indicates the opposite. A value closer to zero means that the term is equally likely to occur in both throwaway and pseudonymous comments.

LLR has been used to determine if terms can be treated as “topic signatures,” and Gupta et al. found that LLR “defines the aboutness” of a list of words in a topic. In the area of topic discovery, Chancellor et al. used LLR as a measure of the linguistic content when determining the differences between two subreddits focusing on weight loss. Even though the documents in each pair we compared differed in size (there are significantly more pseudonymous comments), LLR takes into account the “size of the two corpora.”

However, LLR can be unreliable with rare word occurrences. To address this problem, we read the occurrences in the text and discounted any LLR values attributed to rare tokens. For example, if a token only occurred once, or was repeated several times in the same comment or a short thread, we discounted it.
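
The definition above can be sketched as follows. This is a minimal illustration, not the exact statistic used in the cited work: it takes the log of the ratio of smoothed relative frequencies, and it replaces the manual reading of rare tokens with a simple threshold on the number of distinct comments a token appears in. The whitespace tokenizer, the add-one smoothing, the `min_comments` parameter, and the example comments are all assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (an assumption, for illustration only).
    return text.lower().split()


def log_ratio_scores(throwaway_comments, pseudonymous_comments,
                     min_comments=2, smoothing=1.0):
    """Log ratio of a token's probability in throwaway vs. pseudonymous text.

    Positive scores: more likely in throwaway comments.
    Negative scores: more likely in pseudonymous comments.
    Tokens seen in fewer than `min_comments` distinct comments are dropped,
    as a crude stand-in for manually discounting rare tokens.
    """
    t_counts = Counter(tok for c in throwaway_comments for tok in tokenize(c))
    p_counts = Counter(tok for c in pseudonymous_comments for tok in tokenize(c))

    # Number of distinct comments each token appears in (both corpora).
    comment_freq = Counter()
    for comment in list(throwaway_comments) + list(pseudonymous_comments):
        comment_freq.update(set(tokenize(comment)))

    vocab = set(t_counts) | set(p_counts)
    # Normalizing by corpus size accounts for the two corpora differing in size.
    t_total = sum(t_counts.values()) + smoothing * len(vocab)
    p_total = sum(p_counts.values()) + smoothing * len(vocab)

    scores = {}
    for tok in vocab:
        if comment_freq[tok] < min_comments:
            continue
        p_throwaway = (t_counts[tok] + smoothing) / t_total
        p_pseudonymous = (p_counts[tok] + smoothing) / p_total
        scores[tok] = math.log(p_throwaway / p_pseudonymous)
    return scores


# Toy example (invented comments, not from the dataset).
throwaway = ["thanks for the help", "need help with therapy", "thanks so much"]
pseudonymous = ["my dad is a stepdad", "dad and mom set rules", "rules for the kid"]

for token, score in sorted(log_ratio_scores(throwaway, pseudonymous).items(),
                           key=lambda item: item[1], reverse=True):
    print(f"{token}: {score:+.2f}")
```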

Steps for finding throwaway conversation themes

In this section, we report on a qualitative interpretation of the themes used by throwaway accounts when discussing the predictive topics. In doing so, we describe meaningful themes that are distinct from those discussed by pseudonymous accounts that might be engaging in the same predictive LDA topics.

We used the following steps to identify main throwaway themes (RQ2):

  1. Selecting top comments for each of the significant LDA topics: We selected comments with high topic score (>0.9) (see subsection 3.4.1.1) for each of the significant LDA topics (see Table [tab:LDA]). This step is repeated twice for each LDA topic to find the throwaway and pseudonymous comments with high topic score.

  2. Creating throwaway and pseudonymous documents for each significant LDA topic: Now that we have the top comments for each of the significant LDA topics, we appended the responses to each of the comments to create separate throwaway and pseudonymous conversation documents. In each document, we have the comments, and responses to them, related to a particular significant LDA topic. For example, we had (1) an abuse and therapy - throwaway document; and (2) an abuse and therapy - pseudonymous document to compare. We created a total of 28 documents, 2 per significant LDA topic.

  3. Finding comments for qualitative analysis: In this step, we selected comments from our documents (described in step 2) that included terms more likely to appear in throwaway (high LLR) or pseudonymous (low LLR) documents. For example, if “therapy” had a high LLR value for the Gender & parenting expectations LDA topic, we selected comments that used this term in the gender & parenting expectations throwaway document. We also checked the LLR of LDA topic words and doc2vec words (from Table [tab:LDA]). Again, we selected comments that included LDA topic words and doc2vec words with high LLR values from the throwaway document. For example, in the Family health LDA topic, one of the doc2vec terms, “Strattera,” had a high LLR value, indicating that it is more likely to be used by throwaways. We selected throwaway comments referencing “Strattera” from the family health throwaway document. If any of the LDA topic words and doc2vec words had LLR values close to 0 (i.e., equally likely to occur), we randomly sampled comments that used these terms from both throwaway and pseudonymous comments. This allowed us to understand how these terms were used in throwaway documents, and how their discussions differed from discussions in pseudonymous documents. Next, we used qualitative analysis methods to study comments identified in this step.

  4. Qualitative analysis: For each of the significant LDA topics, we read the comments identified in the earlier step until we reached saturation, at which point, we identified a number of themes under each of the LDA topics. We iteratively read through these threads in order to identify emergent themes discussed by throwaway users across the significant LDA topics. We read a total of 1,993 comments. Of this total, there were 630 pseudonymous comments and 148 responses to these comments. We also read 840 throwaway comments and 375 responses to them. Bruckman recommends levels of user disguise when quoting users in a research study. More recently, Fiesler and Proferes show that social media users do not expect to be quoted verbatim in academic research. Due to the sensitive nature of throwaway comments and their responses, we describe the contents of comments without direct quotations from the dataset. When we do quote from our dataset, the quote is edited to protect the privacy of the authors.
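
Steps 1 and 2 above can be sketched as a small grouping routine. This is an illustration under stated assumptions rather than the pipeline used in the study: the comment record layout (`text`, `is_throwaway`, `topic_scores`, `responses`) and the function name are hypothetical, and only the 0.9 threshold comes from the text.

```python
from collections import defaultdict

TOPIC_SCORE_THRESHOLD = 0.9  # "high topic score" threshold from step 1


def build_documents(comments, significant_topics,
                    threshold=TOPIC_SCORE_THRESHOLD):
    """Group high-scoring comments and their responses by (topic, account type).

    Each comment is assumed to be a dict with the keys:
      'text'         - the comment body
      'is_throwaway' - True for throwaway accounts, False for pseudonymous ones
      'topic_scores' - mapping from LDA topic name to topic score
      'responses'    - list of response texts (optional)
    """
    documents = defaultdict(list)
    for comment in comments:
        group = "throwaway" if comment["is_throwaway"] else "pseudonymous"
        for topic in significant_topics:
            if comment["topic_scores"].get(topic, 0.0) > threshold:
                # Step 1: keep the comment; step 2: append its responses.
                documents[(topic, group)].append(comment["text"])
                documents[(topic, group)].extend(comment.get("responses", []))
    return dict(documents)


# Toy example (invented records, not from the dataset).
comments = [
    {"text": "comment A", "is_throwaway": True,
     "topic_scores": {"abuse and therapy": 0.95}, "responses": ["reply A1"]},
    {"text": "comment B", "is_throwaway": False,
     "topic_scores": {"abuse and therapy": 0.92}, "responses": []},
    {"text": "comment C", "is_throwaway": True,
     "topic_scores": {"abuse and therapy": 0.40}},
]

documents = build_documents(comments, ["abuse and therapy"])
for key, texts in documents.items():
    print(key, texts)
```

With all significant LDA topics passed in, this yields up to two documents per topic, one throwaway and one pseudonymous, which are the pairs compared with LLR in step 3.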

To demonstrate our process, we share an example in subsection 4.3.1 that shows how we differentiated between throwaway and pseudonymous comments, and then found common threads across different LDA topics.

Results

We present our findings from analyzing throwaway and pseudonymous conversations under each of the significant LDA topics identified in the logistic regression classifier from Study 1. We found 11 emergent themes from throwaway conversations: (1) dealing with abuse; (2) financial problems; (3) postpartum depression; (4) men in parenting; (5) transition to adolescence; (6) pregnancy complications, loss and grief; (7) family health; (8) parenting and social beliefs; (9) divorce and custody; (10) gratitude in throwaways; and (11) the reasons for using throwaway accounts. Some of these themes followed closely with the LDA topics identified in Study 1, while others incorporate throwaway comments using similar threads from different LDA topics.

Dealing with abuse

In this section, we present the differences between throwaway and pseudonymous conversations under the Gender & parenting expectations LDA topic using the steps outlined in the Methods section. After that, we carry out the same process to analyze the Abuse and therapy topic. The LLR values for both topics are shown in Table [table:LLR1].

Gender & parenting expectations.

When discussing gender and parenting norms in the family context, pseudonymous accounts were more likely to discuss the role of the mother and father as adoptive (LLR=-8,141) parents and contrasting these roles to those of biological (LLR=-2,392) parents. For example, one parent described himself as both a “biodad” (LLR=-480) and a “stepdad” (LLR=-417) and compared both parenting roles through his personal experience.

Throwaway discussions were more likely to talk about the need (LLR=4,981) for help (LLR=4,882) and discuss abuse (LLR=4,234) in the context of gender stereotypes. Many of the discussions revolve around how some fathers withheld affection from their children and often treated their partners with disrespect. The fathers discussed were often themselves the product of unhealthy relationships with their own parents. Indeed, throwaway comments suggested that the parents’ attitudes are in many cases related to childhood mental (LLR=1,591) trauma, with many insisting that parents who inherit such psychological issues should invest in therapy (LLR=3,731).

A different thread we identified in throwaway conversations under the Gender & parenting expectations LDA topic focused on the negative views of fathers in public spaces. This thread is presented in the Views about men and parenting section. We also found discussions about abuse to be more likely in throwaway conversations under the Abuse and therapy LDA topic, specifically discussions about the definition of abuse, especially when it relates to men interacting with children. We describe this in detail below.

Abuse and therapy

Throwaway accounts under the abuse and therapy topic were more likely to discuss how abuse is not only physical. Some parents described a partner, usually a father, who is distant and degrading when interacting with both the partner and the children. They were attempting to determine if this behavior amounted to abuse. Most answers suggested that when a partner continued to engage in what they identified as emotional abuse, the partner should be given an ultimatum (LLR=12) to engage in counseling (LLR=2.21). Throwaway conversations were also more likely to discuss how abused parents engage in cycles (LLR=1.58) of abuse because of their lack of empathy (LLR=2.06). They also suggested therapy resources for both parents and children in families that experienced abuse. Parents updated others on the progress of their children after receiving therapy. They thanked (LLR=83.20) those who responded for their concern and for sharing detailed responses with relevant material information (e.g., resources like books, contact information for organizations providing support for abuse victims etc.).

Pseudonymous comments under the abuse topic were more likely to discuss how parents (LLR=-33.18) are responsible for setting boundaries (LLR=-6.12) and rules (LLR=-9.90), especially when dealing with teenagers (LLR=-1.68). Maintaining these lines brings punishment (LLR=-8.25) into focus. For example, pseudonymous conversations focused on the question about whether corporal punishment constitutes abuse.

After finding threads discussing abuse in throwaway conversations under these two LDA topics, we created the dealing with abuse theme. We continued a similar process for the rest of the LDA topics. Going forward in this section, due to space limitations, and since our focus is on the use of throwaway accounts, we will focus on the emergent themes that are more likely to be discussed by throwaway accounts. Except for a few cases, we will not reference the LLR values or the equivalent pseudonymous themes throughout the rest of this section.

Financial problems

Throwaways discussed the cost or budget that a new child might need. Such questions came from those who are not yet parents but considering becoming parents, or those considering having another child. For those considering having children, a number of responses suggested that “if you think a child will save your marriage or complete it, you are deeply mistaken,” intimating that the decision to have a child should be more than just a decision to compromise with their significant other who wants one.

Parents using throwaways also discussed problems in financing their children’s college degrees, especially if their children are failing college courses. Further, throwaway conversations discussed the propriety/need for parents to help finance their children’s graduate education. Parents discussed the effects of their own college debts on their ability to provide for their children and on their credit by discussing issues like debt consolidation.

Throwaway comments were also more likely to discuss resources for needy parents. For example, responses provided links and phone numbers for single parents to find resources they might need.

Postpartum depression

Throwaways discussed regretting having a child because of challenges in the first few months after birth, mostly related to PPD, but also touching on other topics like their partner’s low sex drive. Discussants in throwaway conversations were more likely to offer others a chance to private message (pm) them to continue the conversation in a more private manner. Many of these messages were from fathers who wanted to know more about how their partners were feeling in the early stages of parenthood, especially with reference to PPD. When fathers asked if their partners might be dealing with PPD, mothers replied with their own experiences with PPD and how fathers can support their partners. For example, one father had recently gone back to work after a short paternity leave and thought that his wife was suffering from PPD. When he asked how he could help, parents suggested that he “push your wife to take a break even if she does not want to. Just take the kiddo yourself for awhile to a play-date or something...give her some time off.” Throwaway father accounts were also more likely to discuss their own experiences with what they diagnosed as PPD for men, especially when raising a challenging child, and the effects the experience has had on them and their partners.

Views about men and parenting

Throwaway accounts discussed other issues at the intersection of parenting and masculinity. For example, a single father lamented the lack of resources for single fathers in the rural area of the US where he resides. Other throwaway conversations focused on how difficult it is for fathers to find public spaces or parenting support groups that cater to their needs as parents. Some comments suggested that mothers could be paranoid around fathers. Men are usually judged as being inappropriate when they interact with children in the same way women do. When men tickle, hug, or otherwise touch children, it is more likely to be seen as inappropriate. Some responses argued that since the statistics show that men are responsible for the vast majority of sexual abuse against children, these prejudices have a rationale to them.

Transition to adolescence

Throwaway conversations were more likely to mention scenarios where they found either teenagers or adults being “pervy”. For example, as the child grows older (tweens), what forms of physical contact (e.g., hugging or sitting in lap) with parents/relatives is acceptable? Discussions ranged between this being a form of physical affection and such physical contact being inappropriate. In related threads, throwaways discussed setting appropriate boundaries at home as the children were growing older, especially if they were of the opposite sex or if the adult is a step-parent. For example, throwaway accounts discussed the appropriate dynamics of interactions with children in gendered locker rooms as they grow older.

Throwaways were also more likely to comment on their children’s sexual experimentation as they transition into adolescence, specifically discussing whether such experimentation was appropriate. Responses from other parents focused on how they navigated similar circumstances with their own children, or shared stories of when they themselves were teenagers. These discussions included issues related to sexting, sleepovers, and sexual relations between teenagers. Some throwaway conversations also extended to discussions of Romeo and Juliet laws in different states. These are the laws that govern sexual relations between teenagers, especially when one is older than the other (for example, a 14-year-old and a 16-year-old).

Pseudonymous and throwaway LLR values for the gender & parenting expectations, financial problems, and abuse and therapy LDA topics. Throwaway users were usually more likely to thank other users for their responses. They were also more likely to discuss the use of throwaway accounts.
Gender & parenting expectations        Financial problems                   Abuse and therapy
Throwaway            Pseudonymous      Throwaway            Pseudonymous    Throwaway          Pseudonymous
Term        LLR      Term     LLR      Term         LLR     Term   LLR      Term      LLR      Term    LLR
need        4,951    dad      -8,141   degree       28.71   per    -15.38             146.68   sound   -21.05
help        4,882    adopt    -4,543                25.55   can    -12.37   try       23.91    parent  -13.63
abuse       4,234    mom      -4,208   consolidate  24.48   week   -11.88   we’ve     21.20    kid     -13.10
therapy     3,731    girl     -3,968   big-law      23.53   tax    -10.40   spoken    18.80    may     -11.63
counseling  2,574    father   -3,745   flagship     21.32   watch  -9.87              16.66    unless  -11.31

Pregnancy complications, loss and grief

When discussing issues around difficult pregnancies, miscarriages, and infertility, throwaway accounts were more likely to talk about the stigma associated with abortions and miscarriages. They also talked about tests for genetic and other medical screenings (e.g., spina bifida) that might have made them consider ending a pregnancy. Throwaway accounts were also more likely to share details about their challenges with conceiving children. For example, a number of throwaway comments shared that they used anonymous egg donations through medical tourism in Spain8. They also provided details about how many times they had to go through the IVF9 process in order to console those whose first IVF round was not successful.

Family health

Throwaway conversations were more likely to discuss specific medications used by children with ADHD, like Strattera.10 Throwaways discussed the effectiveness of the medication and its side effects. Throwaway conversations were also more likely to discuss autism especially if their child exhibited mimicry and adaptation behavior or if they failed to follow behavioral norms. This is because many thought these might be signs that the child might have a disorder (e.g., be on the spectrum). Specifically, parents asked for advice if their child was in need of psychological consultations, and sought recommendations for resources like therapists to help their children in case their children are diagnosed with autism or other disorders.

When discussing circumcision, throwaways talked about medical conditions like Urinary Tract Infections (UTI), phimosis,11 and other infections that might be considered sensitive. Throwaway accounts were also more likely to discuss medical consent to the circumcision procedure. Some parents compared European and American healthcare systems suggesting that parents have to be more active and adamant about not consenting to the circumcision procedure in the US where circumcision is a more culturally and medically dominant procedure.

Parenting and social beliefs

Throwaway comments also involved questions about LGBT teenagers, as some parents did not know or were not sure how to support the teenager. This is especially true as teenagers come to grips with their own sexuality and sexual preferences. Responses were mostly supportive and included personal narratives of coming out as adolescents or personal experiences among parents whose children identify as LGBT. Many parents brought into the discussion their own conservative and/or religious upbringing and how they were coming to terms with an LGBT child. Other parents commented on the tension between their LGBT children and more conservative relatives. This was related to discussions about differences between more conservative parents and family living in rural areas of the country and younger children who are more liberal and at times reside in more liberal parts of the country. Some parents suggested that they did not want their children to interact with family members whom they thought were prejudiced, which in turn made throwaway discussions more likely to talk about their perceptions of family members who exhibit transphobia, Islamophobia, and homophobia.

LGBT parents also discussed their personal experiences using throwaways. For example, one parent identified as a “single, gay parent who adopted a child from [foreign country].” The parent further commented that they “can relate to the challenges faced by other adoptive LGBT parents.”

Throwaway accounts were more likely to discuss their search for parenting groups that suit their parenting philosophy and social/religious background. Parenting groups organized in churches were seen by many parents as safe environments to form communities with other parents. Some parents discussed how they would like to see their partners join the parenting groups at their Churches even though they were not members of the Church or followers of the same belief. Responses to such discussions were divisive. While some were supportive by offering their own positive experiences, others had reservations. For example, some responses noted that if they are looking for a way to build a parenting community, a church might not be the best place if parents are not believers in the tenets of the religion.

The focus of throwaway conversations was on the cultural side of religious affiliation. This might explain why the “religion” LIWC category was negatively related to throwaway accounts (see Table [table:features]). Sharma et al. indicated that the LIWC religion lexical category was a predictor for pro-choice activists online with the top words “Jesus, religion, bible, God and faith.” Most of the discussions in both these topics were related to cultural discussions as opposed to those related to the religious terms identified by Sharma et al. Even when the discussion referenced Church groups, the focus was on the social groups and how they relate to one’s upbringing, as opposed to discussions of religious foundations of the Church.

Divorce and custody

Throwaway accounts discussed serious challenges to the relationship between parents, which in some cases might lead to family court. When fathers described the deteriorating relationship between themselves and their partners, a number of responses from other fathers suggested that they contact a lawyer as soon as possible in order to protect their paternal role in court as fathers.

Throwaway comments about divorce and custody can be sub-categorized into (1) instrumental posts: asking specific legal questions; and (2) venting posts: venting about the challenges they are facing. Parents discussed their own experiences in family court. Parents also discussed their interactions with their ex-partners and their families in supervised visitations and similar interactions.

Throwaway discussions also focused on how parents can mitigate the effects of the separation process on children. A number of unmarried parents asked about custody issues if their child is born to unmarried parents and what kinds of responsibilities/rights they have in relation to the child.

Thanks mate!

We found that the term “thanks” is more likely to appear in throwaway conversations in a number of the significant LDA topics. Thanks was the top throwaway term (highest LLR value) under the abuse and therapy, parenting challenges, and speech and social development topics. It was the second most-likely term to appear in throwaway accounts for the financial problems (see Table [table:LLR1]) topic and was in the top ten terms for three other topics. We found that throwaways thanked other parents for their contributions in three main ways:

  1. They thanked others for providing different perspectives - “from the other side.” For example, a father who wants to understand his wife’s PPD experience would thank other mothers on the subreddit who gave him their insights.

  2. They also thanked others on the subreddit for being supportive. For example, Reddit users responding to throwaways disclosing a stressful parenting experience would tell them that “things get better” with time, or that they also felt exhausted as parents of children with special needs. Throwaways said that supportive responses made them feel less lonely, and that these responses were “exactly what they needed to hear.”

  3. Finally, they thanked other parents for providing specific and practical suggestions from their experiences. For example, responses to throwaways shared ways to access social services and resources for families with low income, as well as resources for families of children with special needs or therapy/counseling resources for parents and children. Responses also gave suggestions to fathers facing custody battles, or mothers facing domestic abuse.

Why I’m using a throwaway

Throwaway was the top term (highest LLR) for throwaway conversations discussing religious and social beliefs. It was the fifth most likely term for throwaway accounts discussing abuse and therapy (see Table [table:LLR1]) and generally more likely in throwaway conversations for multiple LDA topics.

When explaining their use of throwaway accounts, parents gave three main reasons:

  1. Some users explained that they used a throwaway account because they are ashamed of discussing some experiences in their past, especially if they talked about incidents of sexual assault/domestic abuse and its repercussions on themselves and their families.

  2. Others were afraid of friends and family who might know their Reddit screen ID (pseudonym).

  3. Yet others wanted to ask questions that might be “risky” to ask with their main account. One parent explained that his decision to use a throwaway account was vindicated by the fact that he received a number of threatening messages from other users on the subreddit while using a throwaway. He wondered how much worse it would have been had those users known his main Reddit ID, since there might be more identifying information on that account.

Some users recommended that new users to the subreddit or to discussions around sensitive topics use throwaway accounts until they “graduate to” pseudonymous accounts when they are more acquainted with the norms of the subreddit and/or the boundaries of the topic debate. Throughout their use of the throwaway account, the argument goes, they would get answers to their most burning questions about the sensitive topic and get used to the subreddit/topic discussion. Parents could also graduate to throwaways should they decide to share particularly stigmatizing details about different parenting topics.

Using throwaways could cause other users to question the credibility of the user since they might be trolling others on the subreddit. For example, throwaway users who asked for financial assistance (e.g., for medical costs to save a child) were considered trolls or scammers. A number of moderators explained that they would delete any posts/comments from throwaways if they recognized that they are indeed trolls. However, they are also cognizant of the difficulties that throwaways might be facing. Indeed, as one mod pointed out, discussing sensitive topics is a good reason for the use of throwaway accounts. Therefore, moderators suggested that they gave throwaways a wide berth before they consider deleting throwaway posts.

Study 3: Responses to throwaway comments

Pavalanathan and De Choudhury studied the use of throwaway accounts in mental health-related subreddits. They found that while throwaway users received fewer responses than the control group, they received longer responses and received their first response earlier than other users on mental health subreddits. They also received responses at a higher rate than the control group. The authors argue that this might be because the “Reddit audience tends to sympathize more with the throwaway [mental health] posters, and provide more helpful and contributory feedback and opinions because of their honest confessions.” Still in the area of mental health on Reddit, De Choudhury et al. used propensity score matching (PSM) to differentiate the Reddit posts of users who might later engage in suicidal ideation from those of other Reddit users. Given that throwaway comments in other contexts like mental health and sexual harassment receive more responses that are longer, we ask:

RQ3: How do the responses to throwaway comments differ from responses to other comments?

Methods

When studying causal effects, randomized controlled trials are the gold standard. Experiment designers can randomly assign users to a treatment group (e.g., new medication) or a control group (no medication). In observational studies, on the other hand, researchers cannot assign users to control and treatment groups. We draw on methods from causal analysis to estimate the effect of the treatment (using a throwaway account) on the outcome (change in number of posts, score, etc.) while controlling for the effects of LDA topics and LIWC categories to reduce bias from the confounding variables (determined in Study 1).

The propensity score is the “probability of treatment assignment conditional on observed baseline characteristics”. Using the propensity score, we can analyze observational, non-randomized data in much the same way as we would a randomized controlled trial. Specifically, the propensity score acts as a balancing score since “the distribution of observed baseline covariates will be similar between treated and untreated subjects”. In our case, we consider treated subjects to be throwaway accounts and untreated subjects to be pseudonymous accounts.
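
As a notational sketch of the quoted definition (the symbols $T_i$, $X_i$, and $e$ are introduced here for illustration and do not come from the cited source), let $T_i = 1$ if account $i$ is a throwaway and $T_i = 0$ if it is pseudonymous, and let $X_i$ be its vector of observed covariates. The propensity score can then be written as:

$$e(x_i) = \Pr(T_i = 1 \mid X_i = x_i)$$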

In PSM, our goal is to match throwaway accounts and pseudonymous accounts based on features capturing the mechanisms that predict that the user is in the treatment class - a throwaway user. That is why we only included those features that are significant according to the logistic regression classifier (see Table [table:features]) as our covariates.

We used logistic regression on the covariates to calculate propensity scores for our PSM. We then matched the throwaway and pseudonymous groups using 1:1 nearest neighbor matching (matching 1,459 accounts). Specifically, we matched on the logit of the propensity score, using a caliper of width equal to 0.05 of the standard deviation of the logit of the propensity score.
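
The matching step can be illustrated with a minimal sketch. It assumes that propensity scores have already been estimated, that matching is greedy and without replacement, and that the caliper uses the standard deviation of the pooled logit scores; none of these details are specified above, and the function name `caliper_match` and the account ids are hypothetical.

```python
import math
from statistics import stdev


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def caliper_match(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    `treated` and `control` map a (hypothetical) account id to its
    propensity score. Returns a list of (treated_id, control_id) pairs.
    """
    t_logit = {k: logit(p) for k, p in treated.items()}
    c_logit = {k: logit(p) for k, p in control.items()}
    # Assumption: the caliper width is based on the SD of the pooled logits.
    width = caliper * stdev(list(t_logit.values()) + list(c_logit.values()))

    available = dict(c_logit)
    pairs = []
    for t_id, t_val in sorted(t_logit.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control account on the logit scale.
        c_id = min(available, key=lambda k: abs(available[k] - t_val))
        if abs(available[c_id] - t_val) <= width:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control account is used at most once
    return pairs


# Made-up propensity scores for illustration only.
scores_throwaway = {"t1": 0.50, "t2": 0.90}
scores_pseudonymous = {"c1": 0.501, "c2": 0.20, "c3": 0.10}
print(caliper_match(scores_throwaway, scores_pseudonymous))
```

In this toy example only `t1` finds a partner: `t2` has no pseudonymous account within the caliper and is left unmatched, which is how a caliper trades sample size for closer matches.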

The covariates for this model are the significant features from the logistic regression classifier described in Table [table:features]. Table [table:regressThrowaway] shows the results of the logistic regression of throwaway user status on these covariates.

Logistic regression of throwaway status over covariates: features determined from the logistic regression classifier. * p<0.005

Covariate        Coefficient  p-value  OR    CI
Tenure           -1.83        0.00*    0.16  [-1.828, -1.826]
Role specific    2.27         0.33     9.72  [2.21, 2.33]
Score            -0.89        0.00*    0.41  [-0.898, -0.891]
Psych state      0.38         0.94     1.46  [0.27, 0.48]
Fam. interac.    0.37         0.94     1.45  [0.26, 0.48]
Judgement        -0.15        0.96     0.86  [-0.22, -0.08]
Child custody    -0.19        0.97     0.83  [-0.29, -0.08]
Parent. hard.    -0.13        0.98     0.88  [-0.23, -0.02]
Sex talk         0.01         0.99     1.01  [-0.12, 0.15]
Comment len.     1.01         0.00*    2.74  [1.008, 1.009]
Support          -1.84        0.25     0.16  [-1.87, -1.81]
Preg. & birth    -0.05        0.99     0.95  [-0.19, 0.09]
Money discuss.   -0.29        0.96     0.75  [-0.40, -0.17]
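
The OR column can be read as the exponentiated coefficient. As a quick arithmetic check using the rounded table values (an illustration, not an additional result):

$$\mathrm{OR} = e^{\beta}, \qquad e^{-1.83} \approx 0.16, \qquad e^{-0.89} \approx 0.41, \qquad e^{0.38} \approx 1.46$$

Small mismatches elsewhere in the column (for example, $e^{1.01} \approx 2.75$ against the reported 2.74) are presumably due to rounding of the coefficients.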

Table [table:summary_before_after_match] shows the standardized difference for each of the covariates before and after matching. We use the standardized difference because, unlike significance testing, it is not confounded by sample size, and thus can be used to compare matched samples containing different numbers of pairs. Austin defines the standardized difference, d, as


$$d = (\overline{x_{treat}} - \overline{x_{cont}})/\sqrt{(s^2_{treat}+s^2_{cont})/2}$$
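
The formula can be computed directly, as in the following minimal sketch. It assumes that $s^2$ denotes the sample variance (denominator $n-1$); the function name and the toy values are hypothetical.

```python
import math
from statistics import mean, variance


def standardized_difference(treated, control):
    """d = (mean_t - mean_c) / sqrt((var_t + var_c) / 2)."""
    # Assumption: s^2 is the sample variance (n - 1 in the denominator).
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd


# Made-up covariate values (arbitrary units) for two small groups.
tenure_throwaway = [2.0, 4.0, 6.0]
tenure_pseudonymous = [1.0, 3.0, 5.0]
print(standardized_difference(tenure_throwaway, tenure_pseudonymous))  # 0.5
```

A value of d close to zero after matching indicates that the covariate is well balanced between the two groups.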

Now that we have matched throwaway and pseudonymous accounts, we can compare responses to throwaway comments with responses to the matched comments in parenting subreddits. We compare average values of (1) the chance of receiving a response; (2) the number of responses; (3) comment length (in words); (4) karma score; (5) two LIWC lexical categories that are psychologically correlated with social support; (6) one LIWC category measuring affect; and (7) one LIWC category measuring cognitive processes. We applied Bonferroni corrections to the multiple hypotheses we tested in section 5.2.
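
For readers unfamiliar with the Bonferroni correction, a minimal sketch follows. The significance level of 0.05, the function name, and the p-values are assumptions chosen for illustration; they are not the values used or reported in this study.

```python
def bonferroni(p_values, alpha=0.05):
    """Return {name: (adjusted_p, significant)} under a Bonferroni correction.

    With m tests, each raw p-value is compared against alpha / m, which is
    equivalent to multiplying it by m (capped at 1) and comparing to alpha.
    """
    m = len(p_values)
    return {
        name: (min(1.0, p * m), p <= alpha / m)
        for name, p in p_values.items()
    }


# Hypothetical raw p-values, not the values reported in this paper.
raw = {"test_a": 0.001, "test_b": 0.02, "test_c": 0.04}
for name, (adjusted, significant) in bonferroni(raw).items():
    print(name, round(adjusted, 3), significant)
```

With three tests, only `test_a` remains significant at the assumed level, since the per-test threshold drops from 0.05 to roughly 0.0167.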

Summary statistics showing the standardized difference values for each of the covariates before and after matching

Covariate                             Std. difference before matching   Std. difference after matching
Tenure                                0.089                             0.003
Gender & parenting expectations       0.001                             0.001
Abuse & therapy                       0.001                             0.001
Parenting hardship                    0.000                             0.001
Work-parenting demands                0.000                             0.001
Parenting nature                      0.000                             0.000
Financial problems                    0.000                             0.003
Family health                         0.000                             0.002
Speech & social development           0.000                             0.003
Adjective                             0.026                             0.021
Growing pains                         0.000                             0.002
Religious & social beliefs            0.001                             0.003
Numbers                               0.005                             0.007
Body image & privacy                  0.000                             0.001
Pregnancy challenges, loss, & grief   0.001                             0.006
Verbs                                 0.122                             0.042
Parenting groups                      0.000                             0.002
Circumcision                          0.000                             0.002

[table:summary_before_after_match]

Log-likelihood ratios

A number of statistical measures have been used alongside TF-IDF. One such measure is the log-likelihood ratio (LLR). LLR allows us to highlight word collocations that are common within a particular domain (for us, throwaway responses) as opposed to those more probable in other parts of the corpus (control responses in this case). In our calculations, we compute the LLR from the probability of a word appearing in throwaway responses relative to the probability of it appearing in control responses. If the value is larger than 0, the word is more likely to appear in throwaway responses; if the value is negative, the word is more likely to appear in control responses.
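
The sign convention described above can be illustrated with a simplified sketch that computes a smoothed log ratio of word probabilities. This is one plausible reading of the description, not necessarily the exact statistic used in the study; the add-one smoothing, the function name, and the toy token lists are all assumptions.

```python
import math
from collections import Counter


def log_probability_ratio(throwaway_tokens, control_tokens, smoothing=1.0):
    """Map each word to log(P(word | throwaway) / P(word | control)).

    Positive values: the word is more probable in throwaway responses.
    Negative values: the word is more probable in control responses.
    """
    t_counts = Counter(throwaway_tokens)
    c_counts = Counter(control_tokens)
    vocab = set(t_counts) | set(c_counts)
    # Assumption: add-`smoothing` smoothing so unseen words get non-zero mass.
    t_total = sum(t_counts.values()) + smoothing * len(vocab)
    c_total = sum(c_counts.values()) + smoothing * len(vocab)
    return {
        w: math.log(((t_counts[w] + smoothing) / t_total)
                    / ((c_counts[w] + smoothing) / c_total))
        for w in vocab
    }


# Toy token lists for illustration only.
throwaway = ["therapy", "therapy", "lawyer", "thank"]
control = ["congrats", "thank", "thank", "haha"]
scores = log_probability_ratio(throwaway, control)
for word in sorted(scores, key=scores.get, reverse=True):
    print(word, round(scores[word], 2))
```

In this toy corpus, "therapy" receives a positive score and "congrats" a negative one, mirroring the interpretation of positive and negative values given above.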

Qualitative analysis

To further analyze throwaway comments and responses, we sampled comments with high scores on the LDA topics found to be significant in our classifier (see Table [table:features]). Using these samples, we constructed themes of topics discussed in throwaway comments. In section 4, we introduce the themes along with the topics with high LDA scores within each.

Results: How do responses to Throwaway comments differ from other responses on parenting subreddits?

We found 917 responses to the control accounts by 679 unique responders and 3,993 responses to throwaway accounts by 2,249 unique responders. Below, we run t-tests to investigate the difference between average values in responses to throwaway groups and responses to matched comments.

We represented the chance for a response by a boolean variable get_response that would have a value of 1 if the comment got a response, and a value of zero otherwise. The difference in average value for getting a response between a throwaway comment and a matched comment is 0.18 (p = 0.0). Throwaway accounts also received 3.1 (p = 0.0) more responses per comment.
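
A comparison of this kind can be sketched as follows. The text does not state which t-test variant was used, so the sketch assumes Welch's unequal-variance test and, because the Python standard library has no t distribution, approximates the two-sided p-value with a normal distribution (reasonable only for large samples). The function name and the indicator values are made up.

```python
import math
from statistics import NormalDist, mean, variance


def welch_t(a, b):
    """Return (difference in means, Welch t statistic, approximate p-value)."""
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    t = diff / se
    # Assumption: normal approximation to the t distribution (large samples).
    p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return diff, t, p


# Made-up get_response indicators (1 = comment received a response).
throwaway_got_response = [1, 1, 1, 0]
matched_got_response = [0, 0, 1, 0]
diff, t, p = welch_t(throwaway_got_response, matched_got_response)
print(round(diff, 2), round(t, 3), round(p, 3))
```

Here the difference in means is the difference in response rates between the two groups, which is how a value such as 0.18 above should be read.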

We found that responses to throwaway accounts had, on average, a score 2.11 points higher (p = 1.53e-4) than matched responses. Additionally, responses to throwaway accounts were, on average, 12.90 words longer (p = 1.233e-2) than matched responses. In summary, responses to throwaway accounts were longer than baseline responses and received higher karma scores. The difference in average time before first response for throwaway and pseudonymous comments was not significant.

We also measured LIWC categories with psychological correlates to social support, specifically the third person singular category and the social processes category. Examples from the LIWC third person singular category include {she,her,him}, and examples from the social processes category include {mate,talk,they}. We found that responses to throwaway accounts, on average, have a higher value for the third person singular category than the matched comments (difference of 0.84, p = 3.62e-2). We also found that responses to throwaway accounts, on average, have a higher value for the social processes category (difference of 3.07, p = 2.82e-6). In other words, responses to throwaway accounts had higher average values for language categories that have been shown to be psychologically correlated with social support.

We also found that responses to throwaway accounts show more affect, a LIWC category which includes the words {happy,cried,abandon}. The use of these terms is associated with “emotionality,” which involves sharing one’s emotions with others.

Responses to throwaway accounts were also more engaged in cognitive processes, with a difference of 2.4 (p = 2.53e-2). The cognitive process LIWC category, which includes {cause,know,ought}, is related to successful interactions in online communities and associated with positive change in quality of life for users in health-support groups.

Throwaway accounts were more likely to receive responses from other users. Throwaway accounts also received more responses than control users. On average, those responses were longer, and had a higher Karma score. Additionally, these responses were more affective, expressed more emotionality, and exhibited more social support.

We also found that the average value for sadness sentiment is 0.001 higher for throwaway responses than for matched responses (p=0.001), and that fear sentiment is 0.001 higher (p=0.032). Throwaway responses evinced 0.001 more negative emotion on average (p=0.037) than did matched responses. Finally, the discourse in throwaway responses was 0.001 more violent (p=0.033) than in the matched responses. Representative terms reported by Fast et al. include, for violence, {hurt, break, bleed, broken, scar, hurting, injury}; for pain, {hurt, pounding, sobbing, gasp, torment, groan, sting}; and for fear, {horror, paralyze, dread, scared, tremor, despair, panic}. Words associated with sadness include {crying, grief, sad}, and words associated with negative emotion include {hate, worthless, enemy}.

Throwaway responses                      Control responses
token    tf-idf   token      LLR         token        tf-idf   token        LLR
thank    0.0047   lawyer     3.20        thank        0.0010   homeschool   5.25
child    0.0022   therapy    2.91        month        0.0010   thank        4.44
good     0.0021   counsel    2.82        baby         0.0029   straw        3.47
lawyer   0.0021   therapist  2.79        yes          0.0029   haha         2.51
kid      0.0021   cps        2.31        sleep        0.0023   yike         2.45
son      0.0022   abuse      2.29        dog          0.0023   congrats     2.13
parent   0.0022   emotions   2.16        homeschool   0.0023   poop         2.11
luck     0.0022   depress    2.12        kid          0.0023   bedtime      1.99
please   0.0019   please     2.02        mom          0.0022   ha           1.71
baby     0.0019   animal     1.88        right        0.0021   seat         1.62

[tfidf_table]

The LLR values show that conversations initiated by throwaway comments are more likely to include words like ‘therapy’, ‘counselor’, and ‘abuse’. The tf-idf weighted words include the bigram ‘good luck’, indicating that responders are wishing posters the best with the challenge they are facing. At the same time, the tf-idf weighted words also include swear words like ‘fuck’. This might relate to the negative issues discussed in these conversations. The LLR values for the control conversations contain more positive words like ‘congratulations’, and the tf-idf weighted words include ‘awesome’, suggesting that the control conversations are less intense than the throwaway conversations.

Discussion

Our results demonstrate how parents use throwaway accounts in unique and important ways that are distinct from their use of pseudonymous accounts. We first discuss how throwaway accounts allow parents to discuss topics that might be too stigmatizing to discuss using their main Reddit accounts. We then discuss how responses to throwaway comments provide parents with emotional and informational support that they might not find in other contexts. We argue that Reddit provides such advantages because it affords the flexibility of moving between throwaway and pseudonymous accounts, and because the use of throwaways is closely aligned with norms within Reddit communities. Building on these findings, we propose a hybrid platform that supports navigating between identified and anonymous accounts to support the discussion of stigmatized topics.

Throwaway discussions of stigmatizing parenting topics

While parenting might not be inherently socially stigmatized, hegemonic discourses around parenthood that emphasize “intensive” parenting raise the expectations for what counts as a normatively acceptable parenting experience. This puts pressure on parents as they grapple with life issues that might not fit the hegemonic normative view of parenting, and it often leads to unreasonable expectations and extensive judgment when parents appear to “fail”. Results from Study 1 show an association between posting from throwaway accounts and parenting topics that range from child growing pains and financial problems to work-parenting demands and abuse. Below, we draw on five qualitative themes from Study 2 to show how parents discussed issues that earlier literature suggests might be stigmatizing: (1) divorce and custody; (2) transition to adolescence; (3) LGBT transitions (under the parenting and social beliefs theme); (4) postpartum depression; and (5) pregnancy complications, loss and grief.

Earlier work suggests that stigma associated with divorce is more related to custody battles between parents. In Study 2 (see 4.3.9), we found that most of the discussions around divorce and custody relate to managing one’s relationship with an ex-partner and asking questions about others’ experiences in relation to custody. Throwaway accounts provide a space for parents to vent and ask about other parents’ experiences. For example, some parents discussed strategies to stay connected to their children after divorce.

When discussing the transition to adolescence (see 4.3.5), throwaway comments described the parents’ experiences with the social changes associated with adolescence. Parents discussed sexual experimentation and the methods they might have followed to inform or manage their child’s transition. Such topics could be stigmatizing as they relate to one’s sense of self and religious or societal beliefs. Throwaway conversations provide a window for parents to see different views of the transition to adolescence, especially in relation to understanding sexual experimentation at this age.

Similarly, issues related to LGBT adolescents coming out to their parents have been found to be stigmatizing both for the children and their parents which might cause parents to reject their children . The discussions we identified in Study 2 (see 4.3.8) provide a wider window on personal experiences of coming out to family and in managing relationships with extended family members who might reject the child.

Postpartum depression (PPD) is socially stigmatized both as a parenting and as a mental health issue. In 4.3.3, we described how parents used throwaway accounts to ask others about their experiences with PPD in order to better understand their own experiences or those of their partner. This might be an important outlet for mothers, who experience PPD at relatively high rates, and for fathers, who may receive less support for their PPD, to discuss their experiences.

Pregnancy complications and associated struggles were also a theme in Study 2 (see section 4.3.6). Throwaway accounts discuss abortion and how it is still stigmatized. They also discuss their experiences of abortion following a prenatal diagnosis of special needs, which might have different connotations than other forms of abortion. Throwaways also discussed pregnancy loss and challenges associated with infertility. Such experiences are considered stigmatizing for both mothers and fathers as they make sense of their identities after the loss of a child or while undergoing IVF.

Stigmatized narratives and supportive responses

Using pseudonymous social media sites like Reddit allows parents to engage in online discussions while avoiding constraints introduced by “context collapse”, or multiple disparate audiences , on identified social media sites like Facebook. Even parenting topics that may not be stigmatizing may still violate normative or privacy expectations on sites like Facebook (e.g., concerns about teenagers). Therefore, when doing profile work —or the work to manage their self-presentation online—parents might choose to engage in self-censorship by not posting about these issues . Results from Study 1 and Study 2 demonstrate that parents use throwaway accounts — which provide greater anonymity — to discuss potentially stigmatizing issues relating to divorce and custody, raising neuroatypical children, adolescent transitions, domestic abuse, and financial challenges. Using anonymous throwaway accounts, parents may feel less constrained in their ability to openly disclose psychological or other kinds of challenges the parents and/or children are experiencing. This echoes findings from Andalibi et al. who found that more support-seeking behavior was detected for users of throwaway accounts, and in other contexts described above including sexual identity , sexual abuse , domestic abuse or mental health .

When people share intimate personal experiences, responses to these disclosures tend to be equally intimate. Jourard refers to this as the “reciprocity effect” of disclosures. This “mutual disclosure is often defined as an index of positive mental health...and an influential factor in the development of relationships”. Our results from Study 2 show how responses to throwaway comments contain “equally intimate” personal experiences. Replies to throwaway comments were supportive, sharing users’ personal experiences, wishing the throwaway users good luck, and inviting users to consider therapy and other forms of support, echoing findings about other online support groups where responses “show similarity, empathy, and understanding” of the original disclosure. Our results show how throwaway posters thanked those who responded to them both for their emotional (e.g., “this is exactly what I needed to hear”) and informational (e.g., “thanks for all the suggestions”) support. Throwaway posters felt that responses to their comments were supportive because they demonstrated that they were “not alone.” Additionally, in Study 3, we found that, on average, throwaway responses have higher values of lexical categories psychologically correlated with social support.

Study 3 showed that responses to throwaway accounts are longer on average than non-throwaway account responses. These findings echo those of Pavalanathan and De Choudhury who found that throwaway comments received responses that were longer on average and findings from Pan et al. who found that more intimate disclosures elicited “higher levels of reciprocal self-disclosure in response message[s] .” Using LIWC categories, we also found that responses show more social support, affect and emotion than control responses. In Study 3, we found that replies to throwaway comments had higher karma scores on average than matched responses, indicating that these responses are endorsed and appreciated in parenting subreddits.

One open question is whether it is the content of the post, or the throwaway label, that induces support from commenters. A follow-up study might seek to compare similar posts that are not throwaways but that are similar in length and discuss similarly stigmatized content to see if comments are equally supportive. If the throwaway label is important for encouraging supportive behavior, other sites might benefit from allowing posts to have labels that indicate stigmatized disclosures; we discuss this further below.

Supporting disclosures with throwaway accounts

Prior work suggests that when observing a stigmatizing disclosure on social media, users might be unwilling to disclose their personal experiences in support of the original disclosure because of their own privacy concerns . However, people might share stigmatizing experiences like miscarriages on identified social networking sites (e.g., Facebook) after sharing them on pseudonymous social media sites like Reddit . In subsection 4.3.11, we showed that Reddit users “graduate” or move between their throwaway and main accounts depending on their experience posting on a particular subreddit or discussing a particularly stigmatizing issue. Allowing users on identified social media sites to move seamlessly between real-name and throwaway accounts could provide users with an opportunity to engage in disclosure of stigmatizing topics while providing an environment to receive supporting messages with equally intimate levels of disclosure from other users. We propose three design ideas for supporting sensitive disclosures in other sites outside of Reddit, such as Facebook, Instagram, or BabyCenter. Our proposed ideas are developed based on our results in tandem with online communities principles .

First, we argue that the use of temporary accounts can be productively adopted by real-name and other pseudonymous sites. The current design of identified social media sites and real-name norms on these sites inhibit users from sharing potentially stigmatizing issues; however, boyd argues that privacy is about users having agency to reveal “appropriate information in a given context”. The use of hybrid accounts could be particularly useful in closed or secret Facebook groups, where group norms currently often require that members email moderators a question which the moderators then post as “anonymous”. Our hybrid design proposal would encourage identified social media sites to incorporate throwaway account options into their designs. For example, a site like Facebook or Instagram might add a specific tag that signals the use of a throwaway account rather than relying on users having to state “this is a throwaway account.” They might also make it easier for users to navigate between their real-name or pseudonymous identity and their throwaway account(s) for ongoing use.

We also argue that sites with communities and groups (e.g., Facebook Groups, BabyCenter groups) should rely on moderators to enable one or more “throwaway” identities for each user. For example, on Reddit, a private subreddit currently requires new fathers to fill out an “application” (see footnote 12) to join, in which a new member provides “a link to a post you’ve made on reddit indicating you have children,” and a picture of the username next to items only new fathers would have (e.g., a stroller or diapers). On Reddit, as well as in other communities, evidence of membership could be checked and, if accepted by the group moderators, the user could be provided with a throwaway token for a set period of time. This could be a one-click option for group moderators so that the additional burden on them is minimal, while allowing group members to seek support for particularly stigmatizing experiences within their own communities. These practices already take place in Facebook groups, as described above, but they are not yet supported by Facebook’s design.

Finally, newcomers could use throwaway accounts to learn about and gauge the norms of the community. In subsection 4.3.11, we described how Reddit users could use a throwaway to make sense of the boundaries around appropriateness of topics. For example, in a group for parents of children with special needs, a parent might use their real-name account to ask questions about resources for their child in their local geographical area, but if they wanted to discuss an experimental medical operation, a topic that has been found to be sensitive for parents of children with special needs, they could gauge how the community will respond via a throwaway. This design would need to navigate the challenges of allowing parents to ask sensitive questions without overburdening the group with topics that might be negative or harmful (e.g., allowing questions about anti-vaccination principles via throwaways might negatively impact a parenting community). In this case, the proposal above might be extended to allow admins to govern users’ throwaway accounts (as is currently done when users email a question to a moderator in secret groups to post anonymously) so that they could permit sensitive but appropriate topics while removing inappropriate ones. This raises the question of how much power moderators should have, which is out of scope for this work but is an important question to address.

Limitations and future work

Gaffney and Matias discussed the limitations of the Baumgartner dataset and noted that, while some of the comments were missing from the dataset, they found little risk associated with building machine learning models that conduct linguistic analysis of the dataset. They examined work by Saleem et al., which trained machine learning models on the comments of subreddits that were subsequently quarantined. Their re-analysis of the data did not find any substantial differences from the original findings. While the Baumgartner dataset has limitations, we believe the results of our analysis using the dataset are robust.

The findings in our study are related to affordances and platform politics that are specific to Reddit. Future work could analyze the use of other parenting social media sites to provide a more complete view of context collapse, anonymity, and discussion of stigmatizing topics across different social media sites.

Our method currently only accounts for those users who self-identify as throwaway accounts. While there are limitations to how we can identify those users who do not self-identify as throwaways, we can gain more insight into the decision to use a throwaway account through interviewing parents using Reddit and asking about their decision process when creating an alt account.

Conclusion

We analyzed the use of temporary anonymous (throwaway) accounts by parents on Reddit. We found that parents are more likely to discuss potentially stigmatizing issues like divorce and postpartum depression when using throwaway accounts. Throwaway comments received more responses that were more detailed, thus providing more support for throwaway users. We propose design opportunities for identified social media sites to allow users to navigate between identified and anonymous accounts to support disclosure and support goals.

This material is partially based upon work supported by the Air Force Office of Scientific Research under award number FA9550-19-1-0029 and by the National Science Foundation under award CHS-1552503.


  1. https://www.well.com/about-2/

  2. https://files.pushshift.io/reddit/comments/

  3. https://www.reddit.com/r/datasets/comments/3bxlg7/i_have_every_publicly_available_reddit_comment/

  4. We used the logistic regression classifier as applied in Scikit-learn https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html

  5. https://radimrehurek.com/gensim/models/coherencemodel.html

  6. https://github.com/RaRe-Technologies/gensim/blob/develop/gensim/models/ldamodel.py

  7. https://radimrehurek.com/gensim/models/doc2vec.html

  8. Spain is a leading country in fertility medical tourism https://www.theguardian.com/lifeandstyle/2010/aug/22/spain-fertility-tourism

  9. In vitro fertilisation

  10. This term appears among the doc2vec terms in Table 2.

  11. Phimosis is a medical condition where the foreskin of the penis cannot be pulled back past the tip of the penis. This condition might result in pain and other medical complications

  12. https://www.reddit.com/r/BrDaPublic/comments/48i4t1/how_to_join_rbreakingdad_an_idiots_guide/