Improvements to the moderation process – changes to Terms of Service

2022-09-13 | @andrea

You know what? Moderation is hard. It seems easy: just ban the “bad people” and move on, right? But think about it… How do you know whether a person uses queerphobic slurs because they’re queerphobic or because they’re actually queer and want to reclaim them? Did that queer 14-year-old really mean they don’t care for this or that minoritised group, or are they misinformed, inexperienced, or maybe not fluent enough in a given language? Did someone put in a “weird-looking”, “self-contradictory” set of labels because they’re trolling us and mocking queer identities, or is this actually how they describe their identity and the moderator simply wasn’t aware that it’s a thing? We’ve received reports like “I know this card looks okay, but in real life this person is abusive” – should we really be expected to police someone’s actions IRL on the other side of the world and restrict access to the page as a punishment for things that allegedly happened outside of that page? Can we really handle hundreds of reports and automated triggers a week while staying up to date with all the potential discourses, transphobic dogwhistles and all the other potential complications?

Moderation is hard. We’re doing our best, but it’s always gonna be hard, and whatever we decide, there’s always gonna be people angry about it. Sometimes it’s literally thousands of people angrily tweeting without knowing the full context, sending us messages filled with slurs and even telling us to kill ourselves. Creating Pronouns.page is hard but very rewarding work. But as rewarding as it might be, situations like that make it really hard not to have a mental breakdown and want to just shut the project down and finally catch a break – even if you believe wholeheartedly that you’re right. And even if you’re wrong, it still hurts to see people who have never contributed a second of their time to the development of this collaborative, public-source project start bossing you around and demanding that you “ban them immediately!” or whatnot. It hurts to see people not respecting your time, demanding action here and now, tagging us in threads we’re already tagged in just to rush us – without giving us time to discuss anything, to think, to vote, to rest… None of us do this full time: we have jobs, we have school, we live in different timezones. But how dare we not reply immediately?

One other thing that we need to take into account when moderating is that while it’s our project, and we can decide who is allowed to use it, we’re also being transparent about the rules that are in place – our Terms of Service. In my opinion they’re worded pretty well – it’s a relatively short document, written in plain English, without legalese, easy to understand. It forbids posting harmful content and gives a non-exhaustive list of examples while leaving us some space for interpretation. In cases where users aren’t openly hateful towards some group but imply some bias against it or state a lack of support (which technically might not be hatred, but in practice it is), we believe the content is also covered by the ToS prohibitions, but we wish it had been mentioned more explicitly so that our decisions would be better grounded in the actual text of the ToS. With those cases being a no-brainer for us from the moral standpoint, but a grey area from the standpoint of a literal interpretation of the ToS, we would usually ban such content but restore it with a warning if the user appealed the decision, while keeping an eye on their card to see if the wording escalated.

We would like to eliminate that grey area from the Terms of Service, so we’re updating it. It will now explicitly mention that covert, vague, concealed hate speech will also be considered hate speech. The new wording is as follows (changes marked in bold):

It is forbidden to post on the Service any Content that might break the law or violate social norms, including but not limited to: propagation of totalitarian regimes, hate speech, racism, xenophobia, homophobia, transphobia, enbyphobia, queerphobia, queer exclusionism, sexism, misogyny, harassment, impersonation, encouraging self-harm and/or suicide, child pornography, unlawful conduct, misinformation, sharing of someone else's personal data, spam, trolling, advertisement, copyright or trademark violations. We reserve the right to interpret as hate-speech all instances of non-obvious, concealed hate speech (eg. telling some people to “DNI” because of their queer identity, transphobic dogwhistles like “superstraight” or “semibisexual”, etc.).

Two offences previously covered under more general terms, like “hate speech” or “unlawful conduct”, are now forbidden explicitly: sexism and encouraging self-harm.

Since this is just a clarification of existing, albeit vague, policies, and a matter of putting them into writing, we don’t consider it a material change to the ToS, which means the changes take effect right away, without an extra notice period.

On top of the Terms of Service we have our internal moderation guidelines that clarify how to handle some specific examples. We’ve updated them as well – they now explicitly mention some of the recently challenged examples of misconduct and make some rules stricter. For example, the rule “if you’re not certain, give the person the benefit of the doubt” should still apply to cases of suspected trolling, but not to cases of concealed hate speech.

We’re also changing the wording of the message informing of a ban, so that it doesn’t sound like we’re judging a person or invalidating their identity (“you’re banned!”), but simply verifying whether specific content they have posted violates our ToS or not (“your card is no longer public”). We’re also informing users that bans are subject to appeal and can be reverted once the offensive content is removed (provided we believe that the offence stemmed from a mistake or lack of awareness rather than actual malice that would just be repeated as soon as the ban is undone).

We’ll also implement a mechanism that will require multiple moderators to look into a profile before issuing a ban. Usually we’re pretty much on the same page in the team, but having extra pairs of eyes on each case (as a technical requirement rather than the optional step it used to be) will help us, for example, to formulate the ban justification more clearly and more professionally.
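
To illustrate the idea, here is a minimal sketch of what such a requirement could look like. It is not the actual Pronouns.page implementation: every name in it (`BanProposal`, `proposeBan`, `approveBan`, `canIssueBan`) is hypothetical, and the threshold of two moderators is an assumption, since all that is decided so far is that it will be “multiple”.

```typescript
// Hypothetical sketch: none of these names come from the Pronouns.page codebase.
interface BanProposal {
    profileId: string;
    justification: string;
    approvedBy: Set<string>; // usernames of moderators who approved the ban
}

// Assumed threshold; the plan only says "multiple moderators".
const REQUIRED_APPROVALS = 2;

function proposeBan(profileId: string, justification: string, moderator: string): BanProposal {
    // The moderator who proposes the ban counts as the first approval.
    return { profileId, justification, approvedBy: new Set([moderator]) };
}

function approveBan(proposal: BanProposal, moderator: string): BanProposal {
    // Copy the set so the original proposal is left untouched;
    // a Set also ignores repeated approvals from the same moderator.
    const approvedBy = new Set(proposal.approvedBy).add(moderator);
    return { ...proposal, approvedBy };
}

function canIssueBan(proposal: BanProposal): boolean {
    // The ban can only be issued once enough distinct moderators agree.
    return proposal.approvedBy.size >= REQUIRED_APPROVALS;
}
```

In this sketch a single moderator approving the same proposal twice still counts once, so a ban cannot go through until a second person has actually looked at the case.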

We hope that the more explicit wording in the ToS and Guidelines will remove some ambiguity and allow us to follow our conscience without worrying about not sticking strictly enough to the letter of the rules. We support our queer siblings, and we want to be able to put that into action more easily.

Moderation is hard work. It’s complex and, by definition, controversial and thankless – someone is always gonna be unhappy regardless of what decisions we make. So we’re just gonna follow our hearts and adjust the ToS to match them, rather than trying to do it the other way around. And we hope that all the time and effort our moderators are putting into keeping Pronouns.page a safe and inclusive space for queer people will not go unappreciated.
