The system in the system
Some thoughts on the Twitter files, internet platforms and content moderation
How algorithms come to matter in contemporary society is not about trying to define what they are or at what points they act, but rather about questioning the ways in which they are enacted, and come together to make versions of reality. - Taina Bucher, If...Then: Algorithmic Power and Politics
This post explores some problems with current content moderation practices by social media platforms and how we should think about these problems. To summarise the points made:
The recent Twitter files reveal how the company came to the decision to ban Trump from the platform for tweets he posted after the Capitol Hill riots on January 6 2021.
This decision by Twitter reveals two problems with current content moderation practices. These are (i) internet platforms are 'conflicted arbiters of truth', and (ii) this flaw is systematic and not just limited to individual decisions.
Internet platforms could be regarded as 'conflicted arbiters of truth' as they are often required to navigate nuanced and complex decisions about what should and should not be allowed on their service (including determining whether content is legal or not) despite the fact that they are ordinarily ill-equipped to make such decisions. This is on the premise that these platforms, in moderating their content, are required to act as legislators, judges and administrative agencies, and that these roles can conflict with the main purpose of such companies, which is to generate revenue via the mass collection and processing of user data.
This connects to the second problem: the 'conflicted arbiter of truth' flaw is not limited to individual decisions that might be controversial. Platforms operate a wider system to make these decisions, and so each decision can be representative of a broader systematic approach to content moderation. This makes content moderation a systems problem, and not an issue limited to merely one controversial decision, such as banning the President of the United States. Using this lens of systems thinking, one may recognise a number of realities about current content moderation that tend to be overlooked, including that: (1) some decisions often involve personnel not always considered to be part of the traditional 'content moderation teams', (2) platforms often cooperate with each other and governments to make content moderation decisions, (3) platforms increasingly delegate decision-making to external entities (like fact-checkers), (4) platforms make certain ex ante design choices to build their content moderation systems, and (5) every content moderation decision involves trade-offs.
It is probably the case that platforms would prefer not to be required to make controversial content moderation decisions and instead outsource this work to other legitimate entities. This would allow platforms to simply take those decisions and build them into their content moderation systems without any real need for a human-in-the-loop to review more difficult moderation decisions. This type of 'plug-and-play' content moderation approach has been stipulated by certain judgments of the Court of Justice of the European Union, though it is doubtful that this would be deemed acceptable for all types of illegal content since these judgments were specifically concerned with defamation and copyright infringements.
Ultimately, the debates around content moderation ought to move away from focusing on individual decisions that may be controversial, and recognise that content moderation is a systems problem that requires a broader analysis.
The inspiration for this post comes from a paper by Evelyn Douek, Content Moderation as Systems Thinking, which was recently published in the Harvard Law Review. I highly recommend this paper to those with an interest in content moderation by internet platforms, its problems and how solutions based on systems thinking may be required. This post also includes some ideas from Jamie Susskind's book The Digital Republic: on Freedom and Democracy in the 21st Century.
The books used as sources for various quotes are listed at the end of the post. Feel free to leave a comment below and subscribe if you like the content.
The Twitter files and Jan 6 2021
Elon Musk has released a series of documents and internal communications from within Twitter regarding some high-profile content moderation decisions that took place before his takeover, which have been dubbed 'the Twitter files'.
There have been multiple episodes of the Twitter files covered by a select array of journalists. Part 5 focuses on the events inside Twitter leading to the banning of then President Trump from the platform after the Capitol Hill riots on January 6 (J6), and this is the focus of this blog post (other episodes of the Twitter files have addressed other content moderation decisions by Twitter before Musk took over, but these are not covered here).
Part 5 of the Twitter files was released in a Twitter thread by Bari Weiss, founder and editor of The Free Press. It is a long thread consisting of 46 tweets, but the following is a summary of the events leading up to Trump's ban post-J6:
Trump posted two tweets on January 8 2021, a few days after the riots on Capitol Hill.
The first tweet read: "The 75,000,000 great American Patriots who voted for me, AMERICA FIRST, and MAKE AMERICA GREAT AGAIN, will have a GIANT VOICE long into the future. They will not be disrespected or treated unfairly in any way, shape or form!!!"
The second tweet read: "To all of those who have asked, I will not be going to the Inauguration on January 20th."
Up to that point in time, Twitter had resisted banning Trump from the platform. The rationale for this was explained in a blog post from October 2019. In that post, the company explained that part of its mission "is to provide a forum that enables people to be informed and to engage their leaders directly" and thus believes in protecting "the public’s right to hear from their leaders and to hold them to account." As such, given Trump was still President of the US at the time, there was a reluctance to ban him, even after J6.
However, this reluctance did not last, as the pressure to ban Trump grew both internally and externally. Plenty of employees within the company had questioned why Trump had not been banned earlier, and support for his ban post-J6 had grown considerably.
The main question for Twitter regarding Trump's tweets on January 8 was whether they constituted an incitement to violence. The applicable rules here come from Twitter's glorification of violence policy. This policy states that Twitter users cannot (a) threaten violence against others nor (b) glorify violence. This means that a user cannot glorify, praise, condone or celebrate:
violent acts committed by civilians that resulted in death or serious physical injury, e.g., murders, mass shootings;
attacks carried out by terrorist organizations or violent extremist groups (as defined by our terrorism and violent extremism policy); and
violent events that targeted protected groups, e.g., the Holocaust, Rwandan genocide.
The initial decision by the Trust and Safety team, consisting of Twitter staff responsible for reviewing tweets for potential infringements of its policies, was that Trump's tweets did not constitute an incitement to violence. One staffer outlined in exchanges via Slack the apparent rationale for this decision: “It's pretty clear he's saying the ‘American Patriots’ are the ones who voted for him and not the terrorists (we can call them that, right?) from [J6].” Anika Navaroli, a policy official, also stated: “I also am not seeing clear or coded incitement in the DJT tweet. I’ll respond in the elections channel and say that our team has assessed and found no [violations] for the DJT one.”
Less than 90 minutes after this initial decision, Vijaya Gadde, Head of Legal, Policy, and Trust at the time, asked whether the tweets could actually be "coded incitement to further violence." She further stated: "This is an interesting question. I am going to speak with my team ASAP to see if we can run a quick survey to get reactions to the language contained in the tweet and get back to you."
As such, the 'scaled enforcement team' was surveyed and conveyed its belief that Trump may have violated the glorification of violence policy after all, based on an interpretation of "American patriots" as referring to the J6 rioters. Accordingly, that team thought of Trump as "the leader of a terrorist group responsible for violence/deaths comparable to Christchurch shooter or Hitler and on that basis and on the totality of his Tweets, he should be de-platformed."
This then led to a 30-minute all-staff meeting involving Dorsey and Gadde. At this point, many inside Twitter expressed their dissatisfaction with Trump not being banned for his tweets. After some deliberation, Twitter announced Trump's permanent suspension from Twitter and published a blog post giving the rationale for the decision:
Due to the ongoing tensions in the United States, and an uptick in the global conversation in regards to the people who violently stormed the Capitol on January 6, 2021, these two Tweets must be read in the context of broader events in the country and the ways in which the President’s statements can be mobilized by different audiences, including to incite violence, as well as in the context of the pattern of behavior from this account in recent weeks. After assessing the language in these Tweets against our Glorification of Violence policy, we have determined that these Tweets are in violation of the Glorification of Violence Policy and the user @realDonaldTrump should be immediately permanently suspended from the service. (Emphasis added)
What is content moderation?
Content moderation is where an internet platform reviews content or communications on its service to identify that which is either illegal or breaches its terms of service (ToS). In the social media context, user-generated content (UGC) is the type of content to which these practices are applied.
Internet platforms will have their own ways of carrying out content moderation. However, in general across most platforms, it will usually involve the following steps:
Determining what content/activity is not allowed. This could be determined by the platform itself, which will in turn be informed by the principles it chooses to follow. These principles could take the form of (i) an underlying belief in free speech norms (e.g., protecting "the public’s right to hear from their leaders and to hold them to account"), (ii) a sense of corporate responsibility, and/or (iii) the necessity of meeting user expectations. (Andrew and Bernard 2021, 96) But deciding what is not allowed could also be pre-determined by applicable laws and regulations. For example, it is illegal to share child sexual abuse material (CSAM) online, and platforms would thus need to reflect this in their ToS.
Implementing measures to detect the illicit content. These measures can vary. They could include algorithms that perform simple keyword matching. Or they could be more sophisticated, like machine learning classifiers. A process for human review could also form part of a platform's content moderation system, especially for illicit content that is more serious and may therefore result in more stringent reprimands. A database of illicit content is often used to power the automated tools in terms of how they decide if the content is illicit or not (i.e., a keyword matching tool will compare content against a database of prohibited keywords to detect a match, or an ML classifier will be trained on a dataset of illicit content, like terrorist material, to identify content that is sufficiently similar).
Addressing the illicit content. Platforms will have different ways of dealing with illicit content once it has been detected. If the content constitutes a crime, then the user that generated it may be reported to the relevant law enforcement agency. If the content is not necessarily a crime but nevertheless violates the ToS, then the platform may decide to temporarily suspend or demonetise the user's account.
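To make the detection step above concrete, the simplest of the measures described, keyword matching against a database of prohibited terms, can be sketched in a few lines. This is purely illustrative: the keyword list and function name are my own assumptions, not any platform's actual rules, and real systems layer perceptual hashing and ML classifiers on top of matching this crude.

```python
# Illustrative sketch of keyword-based detection. The phrases below are
# placeholders standing in for a platform's real prohibited-terms database.
PROHIBITED_KEYWORDS = {"buy illegal item", "example banned phrase"}

def flag_for_review(post: str) -> bool:
    """Return True if the post matches any prohibited keyword and should
    be routed to human review (or blocked outright)."""
    text = post.lower()
    return any(keyword in text for keyword in PROHIBITED_KEYWORDS)

print(flag_for_review("Where can I buy illegal item cheaply?"))      # True
print(flag_for_review("A perfectly innocuous holiday photo caption"))  # False
```

The obvious weakness, and the reason platforms move beyond this approach, is that exact matching misses paraphrases and misspellings while over-flagging legitimate uses of the same words.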
The diagram below gives a general overview of the different components of a content moderation system and the general process flows involved:
With this system, there are broadly three different types of content moderation:
Ex ante proactive moderation. This is where content is moderated by the platform before it is uploaded to the service by a user. This is typically where platforms implement automated tools to detect UGC that may be illicit.
Ex post proactive moderation. This is where content is moderated by the platform after the UGC has been uploaded to the service by a user. The tools implemented at this stage are essentially designed to identify illicit UGC that may have been missed during the ex ante proactive moderation process.
Ex post reactive moderation. This is where content is moderated after it has been flagged by others as being potentially illicit. The content could be flagged by users, governments or even other platforms. This is the route through which Trump's post-J6 tweets were reviewed at Twitter.
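The three types above differ only in when the review happens and what triggers it. A minimal sketch of the taxonomy (the names and function are mine, not any platform's internal terminology):

```python
from enum import Enum

class ModerationStage(Enum):
    EX_ANTE_PROACTIVE = "automated scan before the content goes live"
    EX_POST_PROACTIVE = "automated sweep after the content is live"
    EX_POST_REACTIVE = "review triggered by a flag from users, governments or other platforms"

def stage_for(already_uploaded: bool, flagged_by_others: bool) -> ModerationStage:
    """Classify which moderation route applies, per the taxonomy above."""
    if not already_uploaded:
        return ModerationStage.EX_ANTE_PROACTIVE
    if flagged_by_others:
        return ModerationStage.EX_POST_REACTIVE
    return ModerationStage.EX_POST_PROACTIVE

# Trump's post-J6 tweets were already live and were reported, i.e., the reactive route:
print(stage_for(already_uploaded=True, flagged_by_others=True).name)  # EX_POST_REACTIVE
```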
One important point to note is that content moderation systems are part of a wider system operated by social media platforms (more on this further below). Thus, content moderation systems will operate alongside content recommender systems, content feeds and other mechanisms that form part of the overall user experience on a platform. As such, it has been argued that the essence of internet platforms is the moderation of content and that this is "the commodity they offer." (Gillespie 2018, 207) This is because, "by crafting the 'right' feed for each user, the 'right' social exchanges, and the 'right' kind of community, it allows them to 'promote engagement, increase ad revenue,[...] facilitate data collection' and ultimately shape user participation into a deliverable experience." (Andrew and Bernard 2021, 95)
The problems with current content moderation
There are two issues I want to try to articulate regarding current content moderation approaches by internet platforms that the Trump Twitter ban demonstrates:
Internet platforms are conflicted arbiters of truth
This flaw is systematic and not just limited to individual decisions
1. Conflicted arbiters of truth
This issue can be explored through the EU's Digital Services Act (DSA), which was passed in October 2022 (though it does not become fully applicable until 17 February 2024). The objective of this Regulation is set out under Article 1(1):
The aim of this Regulation is to contribute to the proper functioning of the internal market for intermediary services by setting out harmonised rules for a safe, predictable and trusted online environment that facilitates innovation and in which fundamental rights enshrined in the Charter, including the principle of consumer protection, are effectively protected. (Emphasis added)
As part of this, the DSA regulates content moderation carried out by internet platforms like Twitter (which would be included in the definition of 'intermediary services' used in the legislation). 'Content moderation' is defined under Article 3(t) as essentially the identification of either illegal content or content that infringes a service provider's ToS. This could include the demotion, demonetisation, disabling of access to, or the removal of that illicit content, as well as the termination or suspension of a user's account.
Under Article 3(h), 'illegal content' is defined as "any information that, in itself or in relation to an activity, including the sale of products or the provision of services, is not in compliance with [the law]." Recital (12) clarifies that this could include "information, irrespective of its form, that under the applicable law is either itself illegal, such as illegal hate speech or terrorist content and unlawful discriminatory content, or that the applicable rules render illegal in view of the fact that it relates to illegal activities."
The DSA provides two legal bases for service providers to carry out their content moderation of illegal content:
Voluntary own-initiative investigations and legal compliance, i.e., voluntary content moderation (Article 7)
Orders to act against illegal content, i.e., mandatory content moderation (Article 9)
With voluntary content moderation, service providers are permitted to, in good faith and a diligent manner, investigate or implement measures to identify and remove illegal content on their service, including when doing so in compliance with the DSA itself. Recital (26) further states that acting in good faith and in a diligent manner means "acting in an objective, non-discriminatory and proportionate manner, with due regard to the rights and legitimate interests of all parties involved, and providing the necessary safeguards against unjustified removal of legal content, in accordance with the objective and requirements of [the] Regulation."
Alternatively, service providers can also be mandated to moderate content by courts or public authorities. This takes place when that court or public authority imposes an order on the provider specifying the content that should be addressed by the provider and, among other things, the reasons explaining why the content is illegal under either EU or Member State law.
Back in early 2021, shortly after the DSA was first proposed by the European Commission, I wrote on CS about the potential problems with voluntary content moderation of illegal content by internet platforms:
a greater problem with the content moderation rules under the DSA is that, ultimately, private actors will be assuming State-like responsibilities in policing their platforms. In particular, intermediaries will be playing three distinct roles: they will be acting like a legislature when “defining what constitutes legitimate content on their platforms”; they will be acting like judges “who determine the legitimacy of content in particular instances”; they will be acting like administrative agencies “who act on [their own] adjudications to block illegitimate content”.
The problem with intermediaries taking on these roles, especially the judicial role, is that there exists a conflict of interest. On the one hand, internet intermediaries “are commercial players which compete in data capitalists markets for users, business partners, and data-driven innovation”. On the other hand, they are required to use their technical capabilities to moderate activity on their platforms, which for social media platforms inevitably involves the regulation of people’s speech. As such, the DSA potentially “blurs the distinction between private interests and public responsibilities”.
Furthermore, intermediaries are becoming increasingly reliant on AI-powered content filtering systems to moderate their platforms at scale. However, such systems “effectively blend norm setting, law enforcement, and adjudication powers”. In particular, content filters are not always successful at detecting the nuances of UGC that may not necessarily render it illegal. A common example of this is in relation to copyright, whereby considerations must be made as to whether certain UGC benefits from ‘fair use’ or another lawful exception under the relevant copyright law. Content filters may not always detect when these exceptions apply and thus such “errors in algorithmic content moderation may result in censoring legitimate content, and sometimes also in disproportionately censoring some groups”.
The controversies surrounding the delegation of public responsibilities to private actors are more heightened in the context of online speech. In its proposal, the Commission states that ‘harmful’ content, while not necessarily constituting illegal content, will not be defined by the DSA nor be subject to removal obligations since “this is a delicate area with severe implications for the protection of freedom of expression”. However, such regulation may nevertheless come through the backdoor due to the definition of “illegal content” provided in the DSA; it includes any information (either in itself or by reference to an activity, including the sale of goods or the provision of services) which is not in compliance with Union law or the law of a Member State.
The potential problem here is that such a definition plugs the DSA into a body of caselaw from the European Court of Human Rights (ECtHR) that has, so far, lacked clarity on the question of so-called “hate speech”. More specifically, the Court has somewhat struggled on “the demarcation line between types of harmful expression that ordinarily are entitled to protection and the most harmful types of expression that attack the values of the [European Convention on Human Rights] and therefore do not enjoy protection”.
All EU Member States are signatories to the European Convention on Human Rights (the Convention), which accordingly binds them. Furthermore, the meaning and scope of the rights contained in the EU Charter must be the same as those laid down by the Convention in so far as the rights contained in either text correspond with each other.
Article 10 of the Convention, as well as Article 11 of the Charter, states that everyone has the right to freedom of expression. This includes the freedom to hold opinions and to receive and impart information and ideas without interference by public authority and regardless of frontiers. However, such a right is not without limitations as it carries with it duties and responsibilities. Thus, free expression may be subject to such formalities, conditions, restrictions or penalties as are prescribed by law and are necessary in a democratic society on various legitimate grounds. For example, restrictions may be placed on free expression for the protection of health or morals or for the protection of the reputation or rights of others (e.g., defamation).
The ECtHR “has by and large interpreted Article 10 expansively and in a way that is faithful to the broad principles of freedom of expression”. In other words, free expression is the default rule, whereas its limitations are the exception, which must be explored on a case-by-case basis. This has been applied even to offending, shocking or disturbing ideas, for such ideas must be allowed to circulate to ensure “pluralism, tolerance and broadmindedness without which there is no democratic society”.
Hate speech, while not appearing anywhere in the text of the Convention, is a term referring to speech that is so vulgar and offensive that it cannot possibly warrant protection under Article 10. However, the regulation of hate speech by the ECtHR has not historically been carried out on the basis of Article 10. Rather, Article 17 of the Convention has been the source of the Court’s jurisprudence on such speech. That provision states that nothing in the Convention shall be interpreted as allowing anyone to engage in any activity or perform any act aimed at the destruction of any of the rights and freedoms contained in the Convention or at their limitation to a greater extent than is provided for in the Convention. Accordingly, Article 17 acts as a “safety valve that denies protection to acts that seek to undermine the Convention and go against its letter and spirit”.
However, the criteria for using Article 17 of the Convention as a basis for suppressing hate speech have been somewhat ambiguous. In Delfi AS v Estonia, the ECtHR held that an online news portal can be liable for unlawful hate speech posted on its platform. However, while the Court was clear on the question of liability, it avoided the preliminary question of what constitutes hate speech. The content in this case directly advocated for acts of violence and thus constituted hate speech which was deemed unlawful under Article 17. Even so, the lack of an analysis as to the criteria for determining hate speech under that provision leaves the question open in relation to speech that does not directly advocate violence but may be considered ‘borderline’ or otherwise offensive. The balance that should be struck between Articles 10 and 17 under the Convention therefore remains to be clarified.
Yet, the DSA proposes delegating that difficult question to internet intermediaries that are not necessarily focused on upholding the rule of law. A resulting concern is “the risk of over-censorship and the removal of content ‘to be on the safe side’ and to thereby avoid incurring liability for such content”. The actions of various tech companies in the aftermath of the Capitol Hill riots in January 2021 could be cited as an example of this. One could question whether internet platforms removed former President Trump’s accounts after the riots on the basis of the illegality, or at least the immorality, of his actions or rather on the basis that it was commercially expedient to do so given that other platforms were doing the same.
The argument presented in the above extract can be summed up as follows:
In moderating UGC on their service, platforms are essentially being required to be legislators, judges and administrative agencies with respect to UGC that is potentially illegal.
The fundamental problem with requiring platforms to play these roles is that they are private companies whose primary incentive is to operate a surveillance capitalist business model that does not ordinarily prioritise the protection of individual rights or upholding the rule of law.
This reveals a conflict of interest that makes platforms ill-equipped to answer difficult legal questions, such as what constitutes 'hate speech', which even some courts have not been able to define clearly.
This ties back to the Twitter files and the decision regarding Trump and J6; this decision reveals a seemingly arbitrary process enacted by an entity that is not ordinarily well-positioned to navigate delicate issues that even more legitimate entities are not always able to address (I expand on this point later in this post).
Some may question why this should even be a concern. Twitter is a private company, and surely private companies are free to run their platforms as they see fit. If you don't like it, you can simply leave. But this argument is flawed in two ways:
Internet platforms are in fact not free to run their platforms as they see fit. The DSA itself is proof of this; those operating in Europe have to abide by its rules. But even apart from EU legislation, there are plenty of other laws that impact how internet platforms moderate content on their service. As Graham Smith wrote in the preface to the first edition of his book, Internet Law & Regulation, all the way back in 1997:
...the suggestion that the Internet has no law is born of wishful thinking more than of cogitation. Local laws of each jurisdiction do apply to activities conducted using the Internet. While enforcing such laws presents new challenges, the pan-political nature of the Internet may in fact render it vulnerable rather than immune to the laws of jurisdictions around the world. (Smith 2020, xi)
In addition to the legal reality, there is also the moral argument. The truth is that internet platforms are not mere conduits as they used to argue (see more on this here). Instead, these platforms "have been exposed as actively managing and shaping 'the norms and boundaries for how users may form and express opinions, encounter information, debate, disagree, mobilize, and retain a sense of privacy'." (Andrew and Bernard 2021, 93-94) Accordingly, a decision by Twitter to ban the leader of a country from a platform that has become one of the central places for modern debate and deliberation does matter regardless of whether the entity making the decision is a private corporation or a public authority.
2. Content moderation as a system
The problem with internet platforms being conflicted arbiters of truth is exacerbated by the fact that, not only do these platforms need to make difficult legal decisions regarding users like Trump, but they also have to find a way to repeatedly and consistently make those decisions at scale given the large volumes of UGC disseminated on their services (in Q3 of 2022, Facebook took action on over 3 billion pieces of content!).
This therefore makes content moderation a systems problem, and not one about individual decisions like those regarding Trump and his Twitter account. This is what both Jamie Susskind and Evelyn Douek have highlighted in different ways.
In his book, The Digital Republic: on Freedom and Democracy in the 21st Century, Susskind demonstrates that the crux of the issue regarding platforms and content moderation is not that they might be biased against conservatives or any other particular group (depending on what you believe). The problem is much bigger, and this is how Susskind explains it through the lens of a republican (the ideology that is, and so not to be confused with 'Republican' as in member or supporter of the Republican Party in the United States):
Imagine a king so mighty that he can meddle in the lives of his subjects however he pleases. At his command, they will be thrown into dungeons or put to the sword. His powers of taxation and conscription are unlimited. If another man's property takes his fancy, the king need only snap his fingers. Men live or die at his mercy. His word is the law.
Just because the king has these powers, however, does not mean that he will exercise them. He might be nasty, but he might also be benign. Periods pass in which he exercises his power with restraint.
Would you say this is a free society?
The market individualist answers this question: 'It depends'. When the king behaves like a tyrant, then the people are unfree. But when he is in a good mood and lets the people get on with their lives, then yes, strictly speaking, there is freedom.
The republican offers a different answer. To be truly free is to live beyond the arbitrary power of anyone else. So long as the king has the capacity to govern as a tyrant, and the choice is his alone, then his is not a free nation.
The application of this principle to digital technology is simple enough: We should be concerned about the growing power of digital technology even when it is being used in ways that don't bother us. For as long as a technology company or government could simply change its mind and use its power to impinge on our rights and liberties, we are unfree. To borrow a metaphor from the English civil war, trusting blindly in the power of others is like putting your head in the mouth of a wolf and hoping it will not bite. That's true even if the wolf is fluffy and cute most of the time. (Susskind 2022, 26-27)
The point here is that it is not just about platforms exercising their content moderation powers unjustly against a particular group (though this is still clearly a bad thing); it is also about the fact that platforms have the ability to wield their power unjustly in the first place.
As such, the decision by Twitter regarding Trump and J6 demonstrates a systematic problem and not an isolated instance. In her Twitter thread, Bari Weiss revealed several other content moderation decisions regarding controversial tweets from other high-profile political figures. For example, in October 2020, the former Malaysian Prime Minister tweeted the following:
Twitter did delete the tweet for violating its glorification of violence policy, but it did not suspend the account as it did for Trump a few months later.
This reveals something quite important about content moderation that many people seem to overlook. While any one decision could itself be significant (for example, the decision to ban a Head of State), the way in which a platform systematically applies its rules is also highly relevant. This is something that Douek stresses in her recently published paper, Content Moderation as Systems Thinking:
Content moderation bureaucracies are a "they" not an "it": they are made up of a sprawling array of actors and institutions, each of which has different functions and goals. (Douek 2022, 539)
Accordingly, the perception of content moderation carried out by internet platforms ought to move away from an individualistic view to a broader, systematic one; content moderation is a systems problem.
Using this systems thinking approach, Douek recognises a number of realities regarding the current approach to content moderation by platforms that ought to be recognised:
Some decisions often involve personnel not always considered to be part of the traditional 'content moderation teams'. For example, decisions around behavioural moderation (which looks at how someone uses a platform rather than just the content of what they post, and this is typically the focus when identifying potential information/influence operations on a platform), "are typically handled by cybersecurity-oriented teams, like Google's "Threat Analysis Group" or Twitter's "Site Integrity" team, which are distinct from the policy teams that handle platforms' rules more generally." (Douek 2022, 541)
Platforms often cooperate with each other and governments to make content moderation decisions. This happens with the detection of CSAM, in which common databases (like those managed by the National Center for Missing & Exploited Children, or NCMEC) are used by platforms to identify the existence of such content on their services.
Platforms increasingly delegate decision-making to external entities. The most significant example is the use of fact-checkers; platforms sometimes turn to third parties to guide certain content moderation decisions, as this allows them to ostensibly maintain their neutrality while responding to concerns about false information spreading on their services. Reliance on the World Health Organization to identify misinformation regarding Covid-19 is another example, as is the Facebook Oversight Board.
Platforms make certain ex ante design choices. The nature of these choices can vary. They could, for instance, take the form of shadow banning; YouTube "touts progress in combatting violating content on its site by adjusting its recommendation algorithm to demote "borderline" content." (Douek 2022, 546) Other design choices include prompts for users before re-sharing content (e.g., 'Have you read this story?'), a measure that Frances Haugen has promoted quite a bit (you can see it in her presentation here, which I also think is the best resource for those wanting a full view of FH's thoughts on content moderation (particularly on Facebook), including its problems and potential solutions).
Every content moderation decision involves trade-offs. These trade-offs could be between factors like efficiency, accuracy, responsiveness to stakeholders, and commitment to procedural rule-of-law. As such, the "way tradeoffs are resolved reflects substantive value judgments that are embedded into content moderation systems." (Douek 2022, 548) As an example, there can be a trade-off between, on the one hand, legitimacy and procedural justice, and on the other hand, accuracy and efficiency:
Legitimacy and procedural justice: Platforms need to explain the rationale behind their content moderation decisions, despite being private companies, because their decisions do impact public speech and democratic discourse. Accordingly, "content moderation systems are expected to observe some of the foundational values of public law, including procedural regularity and non-arbitrariness." (Douek 2022, 553)
Accuracy and efficiency: "...it may be that greater emphasis on procedural justice for individual users decreases overall system accuracy and will have a complicated relationship with overall fairness by privileging certain interests over others." (Douek 2022, 554) In other words, it is difficult to moderate controversial content at speed and at scale, especially when the decision-making necessitates careful and deep analysis that is difficult to execute via an algorithm with no human intervention.
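The kind of cross-platform cooperation described in point (2) typically works through shared hash databases. The sketch below is a deliberately simplified illustration, not any platform's actual system: the database contents and function names are hypothetical, and production systems use perceptual hashes (such as PhotoDNA) that survive re-encoding and cropping, rather than the exact cryptographic hashes used here.

```python
import hashlib

# Hypothetical shared database of SHA-256 digests of known illegal files,
# standing in for an industry hash list such as the one maintained by NCMEC.
# Real systems use perceptual hashes (e.g., PhotoDNA) that tolerate
# re-encoding; exact hashes are used here only to keep the sketch simple.
SHARED_HASH_DB = {
    hashlib.sha256(b"known-bad-file-1").hexdigest(),
    hashlib.sha256(b"known-bad-file-2").hexdigest(),
}

def matches_shared_db(file_bytes: bytes) -> bool:
    """Return True if an upload matches an entry in the shared database."""
    return hashlib.sha256(file_bytes).hexdigest() in SHARED_HASH_DB

# An upload pipeline can then block matches without the platform making any
# independent judgement about the content itself.
assert matches_shared_db(b"known-bad-file-1")
assert not matches_shared_db(b"holiday-photo")
```

The design point worth noticing is that the substantive determination (that a given file is illegal) lives in the externally-maintained database; each platform's own code performs only mechanical matching at upload time.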
Bringing 1 and 2 together
The 'conflicted arbiters of truth' problem reveals a conflict of interest within platforms, meaning that it is perhaps mistaken to trust them with decisions that impact people's individual rights and involve complex legal debates.
Content moderation is not about individual decisions but about a wider system that (a) captures and codifies that conflict of interest in the form of a platform's ToS and the technical architecture designed to enforce those ToS at scale, and (b) consistently and repeatedly applies this to content moderation decisions at scale, impacting potentially large groups of users.
What might be the ideal scenario (from the POV of platforms)
I do think there is something to the idea that internet platforms do not actually want to be in the position of making difficult legal determinations about UGC and figuring out a way to apply these at scale. On its face, it seems like such a difficult task, as recognised by Mark Zuckerberg:
Due to the design and algorithms of the News Feed, fake news was essentially a product problem, under Zuckerberg's domain. But there was no appetite to apply engineering to fix the problem. In part, misinformation persisted because Facebook, from Zuckerberg on down, believed in free speech, even when people didn't tell the truth. He held a Panglossian view of the goodness of humanity, and felt that people would sort out for themselves what was true. In addition, he viewed with pure terror the idea that Facebook would be drawn into becoming the arbiter of truth. (Emphasis added) (Levy 2020, 357)
But given everything that has happened since 2016, platforms have now recognised that they do not really have a choice but to take on this difficult role. This seems somewhat evident from Weiss's thread. After the Trump ban, one employee said:
For the longest time, Twitter's stance was that we aren't the arbiter of truth which I respected but never gave me a warm fuzzy feeling.
Twitter's COO at the time, Parag Agrawal (who later succeeded Dorsey as CEO), also said this on the ban:
I think a more [important] question now is the future of public conversation. In my mind, this is the end of the road for centralized content moderation - hard to believe that this approach will be sustainable moving forward.
So there does seem to be an acknowledgement that content moderation is important and that there is a duty on internet platforms to do this in the best way possible. But even so, as Agrawal suggested, I do not believe that this perception of responsible content moderation is sustainable for most platforms, and my argument for this has a lot to do with the 'conflicted arbiters of truth' problem highlighted beforehand. To explain:
Internet platforms are systems. A system represents interrelated elements of identifiable activity. They therefore do not exist in a vacuum and can be created and curated.
If you can identify a system, and observe its behaviour over time, you may be able to identify the incentives that drive that system. In other words, you can see how the interrelated elements work together to perform a particular function, i.e., you can figure out the purpose of the system.
Systems can exist within systems, and where this is the case there is an eco-system. Being able to draw the appropriate boundaries to highlight the different systems that make up an eco-system can sometimes be difficult since these are often complex structures that are interrelated. But at least in an abstract sense, one can imagine how an eco-system can be made up of multiple systems operating in the same environment.
The services run by internet platforms are an example of this complex eco-system. Within that eco-system exists the content moderation system. As such, content moderation is merely one part of a wider eco-system. As highlighted before, internet platforms are best described as content organisers in which they not only moderate content, but also recommend and curate content through recommender systems. Platforms are therefore running a complex eco-system consisting of several algorithms and technical processes that scan, filter, remove and promote UGC at scale.
Ultimately, the identifiable behaviour of a platform's eco-system will largely stem from those systems that are more dominant. The other systems within the same environment will therefore essentially serve the needs of those more dominant systems. I would argue that the systems that dominate the eco-system operated by internet platforms are ones based on surveillance capitalism. In other words, the primary function of platforms is to generate revenue via the collection and processing of user data. These platforms are for-profit corporations after all, and this is something that Google had to realise early on in its existence; after the dot com crash in 2000, many of its investors cast doubt on the company's growth and even "their top venture capitalists, John Doerr from Kleiner Perkins and Michael Moritz from Sequoia, were frustrated." (Zuboff 2019, 72) This was because, up to that point, Google was a free service that generated little revenue. The crash thus created a sense of emergency at Google, forcing the company to start looking for a way to generate profits and make the business viable in the face of difficult times. A business model based on surveillance capitalism is what Page and Brin eventually turned to, and this model has formed the foundation of its shareholders' expectations ever since. Facebook/Meta and Twitter have followed similar paths, and so all these social media companies are welded to this model in a rather irreversible way.
In very simple terms, with a surveillance capitalist system, the more data that can be collected and processed, the more revenue the platform can generate. Thus, there is an incentive built into the system to facilitate mass data collection. Anything that frustrates this is therefore, in the end, not going to be prioritised, and this includes content moderation. This is because content moderation does not result in more content being generated; it either maintains the level of UGC on the service (even if it demotes content, like shadow banning does) or reduces it (by removing content). Thus, if content moderation measures frustrate the surveillance capitalist system that is responsible for generating revenue and making the platform profitable, the investment in and development of such measures will always be limited and will never surpass a level that causes too much friction for the revenue-generating system. This is a point that has been made by FH (again, see her presentation here) and many others.
Based on this understanding, I do think that internet platforms, for the most part, view content moderation as a chore that distracts them from what they are really built to do. They do not want to be having to make difficult decisions like the Trump ban post-J6 on a regular basis. This is the impression I get from Dorsey in a recent blog post addressing the revelations from the Twitter files and J6:
The biggest mistake I made was continuing to invest in building tools for us to manage the public conversation, versus building tools for the people using Twitter to easily manage it for themselves. This burdened the company with too much power, and opened us to significant outside pressure (such as advertising budgets). I generally think companies have become far too powerful, and that became completely clear to me with our suspension of Trump’s account. As I’ve said before, we did the right thing for the public company business at the time, but the wrong thing for the internet and society.
Assuming that this is true, then what would be the ideal scenario for internet platforms and content moderation? If it were up to the platforms, it would probably look something like the following:
Difficult decisions around speech or other potentially controversial content are definitively determined by a third party, preferably a court, a legislature or some other public authority.
The platforms take those decisions, condense them down to lines of code (at least wherever possible), and implement automated tools within content moderation systems to enforce these decisions at scale (essentially some sort of 'plug-and-play' content moderation).
With this approach, there is little or no need for human review; the decision regarding what should and should not be allowed has already been made, and all platforms need to do is implement such decisions via technical means. There is no in-between grey area requiring a human-in-the-loop, and content moderation decisions can simply be automated.
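A minimal sketch of what such 'plug-and-play' moderation might look like, under the assumption that an external authority supplies ready-made determinations. The case identifiers and banned phrases below are invented for illustration, and real determinations would rarely reduce to simple string matching:

```python
from dataclasses import dataclass

# Hypothetical determinations made externally, e.g. specific phrases a court
# has already declared unlawful. The platform does not assess legality
# itself; it only enforces decisions made elsewhere.
@dataclass(frozen=True)
class ExternalDetermination:
    identifier: str      # e.g. a case reference (invented here)
    banned_phrase: str   # the specific content declared illegal

DETERMINATIONS = [
    ExternalDetermination("case-001", "example defamatory phrase"),
]

def moderate(post: str) -> str:
    """Apply externally-made decisions mechanically: remove on a match,
    otherwise allow. No grey area requires human review."""
    text = post.lower()
    for d in DETERMINATIONS:
        if d.banned_phrase in text:
            return f"removed ({d.identifier})"
    return "allowed"
```

The point of the sketch is that all substantive judgement lives in the externally-supplied list; the platform's code is pure mechanical enforcement. This is also why, as discussed below, the CJEU insists that the information supplied to platforms must be specific enough that no 'independent assessment' is needed.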
Clearly, this ideal scenario is quite unrealistic. Yet, interestingly, this scenario has been alluded to in two judgments from the Court of Justice of the European Union (CJEU); Eva Glawischnig-Piesczeck v Facebook (2019) and Republic of Poland v European Parliament (2022).
To summarise Eva Glawischnig-Piesczeck v Facebook (2019):
In this case, Eva Glawischnig-Piesczeck, an Austrian politician, brought legal proceedings against Facebook to force the social media platform to take down a piece of content defaming her, as well as other similar defamatory content. A contested question in those proceedings was whether Facebook was obliged to take down user content similar to the original defamatory content.
On this, the CJEU held, as it had done in previous judgments, that Facebook could not be subject to an injunction requiring it to monitor all content uploaded to its platform to detect illegal content (para. 31).
However, EU law does not prevent an injunction requiring Facebook to continuously identify and remove content already declared illegal by a national court (para. 37).
In addition, the Court held that Facebook could be required to identify and remove content containing "equivalent information" to that of the original illegal content (paras. 41-45). However, it was stressed that the information provided to the platform to identify equivalent content must be of a sufficient quality such that Facebook can use automated tools for the detection of it and thus not need to resort to carrying out its own "independent assessment" of the potentially illegal content on its platform. Essentially, Facebook should be allowed to simply 'plug-and-play'.
Similar stipulations were made by the CJEU in its judgment from Republic of Poland v European Parliament (2022):
More specifically, the case concerned Article 17(4) of the Directive. This provision obligates online platforms to use their 'best efforts' to prevent the future uploading of content by their users that has previously been identified as infringing. Poland argued that such an obligation would effectively force platforms to implement upload filters that scan all content before it is uploaded in order to identify potentially illegal material, which would constitute general monitoring and unjustly infringe freedom of expression (para. 24).
The Court held that the Directive, as a whole, contained sufficient safeguards to prevent such unlawful activity and thus, if a platform follows the legislation properly, it should not be led to implement technological measures like upload filters that would infringe EU law (paras. 70, 73, 80, 84, 93 and 97).
In addition, the Court repeated what it had stated in Eva Glawischnig-Piesczeck v Facebook regarding the use of automated tools; it held that platforms cannot be required to prevent the uploading of content whose legality can only be determined through an independent assessment carried out by the platform. Instead, platforms must be given sufficient information to identify, through automated tools, content that has already been declared illegal by a court (para. 90). In other words, platforms should be able to plug-and-play.
It should be noted that the stipulations made by the CJEU in both of these cases were in the context of defamation and copyright. It is thus unclear whether the same approach would be applicable to, or even appropriate for, other forms of illegal content, such as hate speech or CSAM. But if this 'plug-and-play' approach were applicable in other contexts, platforms would probably welcome it.
This post started with a quote from Taina Bucher's book: "How algorithms come to matter in contemporary society is not about trying to define what they are or at what points they act, but rather about questioning the ways in which they are enacted, and come together to make versions of reality." (Emphasis added) (Bucher 2018, 152)
I think this sums up the main point that I am trying to convey in this post. When an internet platform makes a controversial content moderation decision, we are often quick to ask who is to blame, who (or what) is responsible, and whether an algorithm or a human was at fault. But these questions are too narrowly focused. As I have hopefully been able to illustrate, content moderation decisions are the output of a wider system. What that wider system consists of, and the way that its different components work together to produce the identifiable emergent behaviour of the system, are crucial to addressing the issues related to content moderation by platforms.
If we confine our response to individual decisions, like the Trump ban, we may overlook the system that produces such a decision in the first place and thus fail to realise that it may actually represent something much bigger. These systems, and their constituent parts (which may include ML and non-ML algorithms, processes for human review and other mechanisms), are manifestations of certain choices made by humans. It is therefore wrong to think of individual content moderation decisions as attributable to a single person or thing, and to assume that the fix is simply to reverse the decision made by that single person or thing. This is what the Trump ban post-J6 shows and therefore what platform regulation should focus on. As Douek puts it:
[We should move towards] an approach to content moderation regulation based on systems thinking, which focuses on the ex ante institutional design choices involved in creating a system of mass administration, rather than ex post individual error correction. Doing so makes it possible to imagine many more mechanisms of regulatory reform than the individual rights-focused model of content moderation suggested by a First Amendment analogy. The systems thinking approach draws instead on principles and practices of administrative law, which has long grappled with how to bring oversight and accountability to massive unelected bureaucracies at scale and in complex systems. (Emphasis added) (Douek 2022, 532)
Evelyn Douek, 'Content Moderation as Systems Thinking' (2022) 136 Harvard Law Review 526
Giancarlo Frosio (ed), The Oxford Handbook of Online Intermediary Liability (2020 OUP)
Graham Smith, Internet Law & Regulation (2020 Sweet & Maxwell)
Jamie Susskind, The Digital Republic: on Freedom and Democracy in the 21st Century (2022 Bloomsbury)
Johnathan Andrew and Frédéric Bernard, Human Rights Responsibilities in the Digital Age: States, Companies and Individuals (2021 Hart Publishing)
Shoshana Zuboff, The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power (2019 Profile Books)
Steven Levy, Facebook: The Inside Story (2020 Portfolio Penguin)
Taina Bucher, If...Then: Algorithmic Power and Politics (2018 OUP)
Tarleton Gillespie, Custodians of the Internet: platforms, content moderation, and the hidden decisions that shape social media (2018 Yale University Press)