4 Steps to Protecting Your Brand Through Content Moderation

In January of 2019, Facebook, YouTube, and Google affirmed that they were hiring thousands of human moderators to catch what artificial intelligence (AI) was routinely missing. But by March 2020, Facebook was sending home thousands of content moderators in response to the global pandemic. And it wasn’t alone, as Twitter and YouTube responded with the same responsible measures.

Social media companies were proactive, warning users that they should expect errors in flagging policy violations (although some human moderators have continued working). In the absence of a large number of their human teams, these companies shifted to depending on AI solutions to carry out content moderation, acknowledging that this could lead to mistakes. This lack of human eyes moderating content is especially concerning when we consider that social media use is on the rise due to the plethora of housebound users.

And it’s no surprise that over the past six weeks, disturbing (albeit expected) mistakes have occurred related to potentially harmful content slipping through AI filters, as well as harmless content being flagged. Take Facebook’s spam rules for instance. Last month, posts that included links to The Atlantic, USA Today, and many other reliable sources of news were removed after alleged violations of Facebook’s spam rules.

And in June, important human rights and historical documentation, such as video documenting Syrian protests made by student journalist Baraa Razzouk, were deleted by YouTube. The issue? AI-enabled tools confused his videos with terrorist material and policy violations.

Reliance exclusively on AI for moderation will continue to mean legitimate content doesn’t make it through filters while undesirable content passes through. Of course, choosing to keep live moderators safe is the right call, but these unique times have certainly drawn attention to the importance of a combined AI and human approach to moderation, and the significant effects of eliminating one of these essential elements.

From USA Today links to Razzouk’s YouTube videos, the removal of legitimate content is a stark reminder that AI only works in numbers and probabilities. So how can your brand learn from the moderation mishaps of the big players? By taking steps toward content moderation that will protect your brand. Let’s start with stakeholders!

1. Know Your Stakeholders

The first proactive step toward successful content moderation is recognizing that moderation doesn’t just protect your brand’s interests, but the consumers who view content that your brand facilitates.

The gaming space is a prime example of how content moderation can be the difference between upsetting or impressing your stakeholders. It’s common for sports platforms to incite heated debates, but in-game chats can turn from trash talk to hostile and even threatening words in a flash.

Waiting until your inbox is flooded with reports from users offended by their experiences in your in-game chat is not the ideal way to protect your brand (or the best use of your company’s resources and time). But by implementing effective moderation, your brand can better detect offensive text in everything from blog and forum images to social media apps, children’s sites, and in-game chats. Using a combination of live teams and AI will go a long way in preventing problems and distrust among stakeholders.

2. Balance Content with Branding

The challenge of balancing branding with content is real. Maybe your brand prides itself on user-generated content (UGC). Perhaps you’ve had great success with influencers who love your product or service and showcase it as part of their everyday life by sharing images on social media. You might be reluctant to use a combination of live teams and AI moderation tools to evaluate UGC, arguing that it will compromise your brand’s style by filtering out engaging photos and video.

This is where the AI/human approach is essential. While the AI can be used to eliminate any blatantly offensive content such as raw nudity or hate symbols, the human team can review the content for more nuanced brand-specific criteria. This allows you to eliminate overtly offensive, brand-damaging content, as well as to ensure that the images, videos, and text that you feature align with your brand’s mission.

Risks aren’t exclusively associated with posting content that doesn’t align with your brand or is offensive. It can be a matter of content that violates intellectual property rights as well. Just ask brands who have been linked to author and influencer Rachel Hollis through her posts.

Unfortunately, Hollis has been accused repeatedly of sharing original photos on Instagram, but captioning them with quotes which she either failed to properly cite or attributed to herself. Recently she has been accused of plagiarizing Maya Angelou. And these accusations don’t bode well for brands that she has been known to promote, including Teremana Tequila and Trader Joe’s.

The posts in question contained no profane images or text, nor anything else that moderation technology would consider flag-worthy. Live moderation, however, has the capacity to police content for copyright infringement. Having actual humans review content means that reporting such posts remains a priority, as it should, since influencers occasionally get sloppy or hasty in sharing content.

There will always be some subjectivity to human moderation, even once you have established clear brand guidelines. For example, how does one truly define “excessive cleavage”? In cases of such “borderline” content, it is typically safer to reject it, occasionally sacrificing content to protect your audience. The bottom line is this: When in doubt, leave it out (with the help of your moderation team, of course).

3. Be Safer, Not Sorry

In your brand’s push toward successful image moderation, keep in mind that some marketing campaigns may come with risks. Depending on what type of content users are uploading, your brand could expose its audience to upsetting images or offensive text.

And if you wait to review images until after they go live, you risk playing catch-up constantly in your efforts to remove user-generated images that are offensive, especially if there’s a sudden rise in submissions.

That’s where pre-moderation comes into play. This method of content moderation works by having an AI-based solution, like WebPurify’s automated intelligent moderation (AIM) service, check all image submissions immediately, ensuring that they do not contain hate symbols, nudity, weapons, and other risky content. Such a service will automatically reject any images identified as having a very high potential of containing these offenses.


Then, a live team can do a second, more nuanced review for anything the AI solution was unsure about – or was perhaps too aggressive in rejecting – and ensure that other violations like racism or self-harm do not exist. Although you err on the side of over-rejecting content, it is a fair trade for safety in applications where being conservative is key.
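The two-stage flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of WebPurify or any other service: the category names, the 0-to-1 risk scores, both threshold values, and the choice to approve low-risk images without human review are all assumptions made for the example.

```python
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    HUMAN_REVIEW = "human_review"
    REJECT = "reject"


# Hypothetical thresholds; real values would be tuned for each platform.
REJECT_THRESHOLD = 0.90  # "very high potential" of a violation
REVIEW_THRESHOLD = 0.40  # the AI is unsure, so a person should look


def route_image(risk_scores: dict[str, float]) -> Decision:
    """Route one image submission using the highest AI risk score.

    `risk_scores` maps an assumed category name (for example "nudity")
    to a score between 0 and 1 produced by some AI classifier.
    """
    if not risk_scores:
        # No AI signal at all: stay conservative and ask a person.
        return Decision.HUMAN_REVIEW
    top_score = max(risk_scores.values())
    if top_score >= REJECT_THRESHOLD:
        return Decision.REJECT
    if top_score >= REVIEW_THRESHOLD:
        return Decision.HUMAN_REVIEW
    return Decision.APPROVE
```

With these assumed thresholds, `route_image({"nudity": 0.97})` is rejected automatically, `route_image({"weapons": 0.55})` goes to the live team, and `route_image({"nudity": 0.05})` is approved. A more conservative platform could lower `REVIEW_THRESHOLD` so that more submissions reach a human before going public.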

There’s nothing wrong with wanting your users to be impressed by your brand’s platform. And it’s true that pre-moderation will delay UGC from going public instantly. Although impressive, giving users the option to post images instantly might not be worth it if you dazzle a few and offend the majority.

4. Don’t Make Moderation an Afterthought

The difference between an effective image moderation program and moderation that backfires often comes down to being proactive. Many companies get in the dangerous habit of adding moderation at the last minute.

From social media platforms to social commerce marketplaces, this reactive approach to moderation leaves the responsibility for flagging content that is posted in real time in the hands of the users. Many of these marketplaces, such as Poshmark, have a system or moderator community for flagging content (typically using a report button) that violates the community guidelines or is offensive. But posted listings are only one side of the content coin.

Marketplace users can also comment on other sellers’ listings. Most of the time comments are harmless inquiries about the item for sale, but what if they aren’t? A comment containing offensive content can be added by another buyer, and removing it can be complicated.

“If you really wish to remove a comment, you will have to delete the listing and create a new one. Comments made on another Posher's listing cannot be removed,” the fashion marketplace Poshmark explains on their platform’s support page, noting that “You can anonymously report a listing that violates our policies and guidelines at any time.”

User reports are a great resource to incorporate into your overall moderation plan, but that feature is the last resort for content that has made it through the primary AI and human defense layers. Pre-moderation upfront (by artificial intelligence and a highly trained human team) followed by post-moderation in the form of community flagging is an effective combination.

Relying solely on user reporting is not recommended, as users don’t want the full responsibility of moderating all of the unscreened content on your site, and you can quickly lose their loyalty. On the other hand, allowing user reporting does create a sense of community and is effective if used properly as part of a larger moderation approach.
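As a rough sketch of user reporting as a final layer, the snippet below counts distinct reporters per item and signals when an item should be escalated to the human team. The class name, the identifiers, and the escalation threshold of three reports are hypothetical choices for illustration, not a description of how Poshmark or any other marketplace works.

```python
from collections import defaultdict

# Hypothetical number of distinct reporters needed before escalation.
REPORTS_BEFORE_ESCALATION = 3


class ReportTracker:
    """Track user reports on content that is already live."""

    def __init__(self, threshold: int = REPORTS_BEFORE_ESCALATION) -> None:
        self.threshold = threshold
        # content_id -> set of user ids who reported it
        self._reporters: dict[str, set[str]] = defaultdict(set)

    def report(self, content_id: str, reporter_id: str) -> bool:
        """Record a report; return True if the item needs human review."""
        # A set ignores repeat reports from the same user.
        self._reporters[content_id].add(reporter_id)
        return len(self._reporters[content_id]) >= self.threshold
```

Counting distinct reporters rather than raw reports keeps a single upset user from pulling content on their own, while still giving the community a working safety net behind the AI and human layers.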

Rather than rushing to create a moderation program as an afterthought, an effective moderation plan includes multiple layers of review that require thoughtful planning well in advance of launch.

Protecting Your Brand Through Content Moderation: Conclusion

The pandemic has forced large social sites to rely solely on AI, which has revealed its limitations. Without human counterparts to fine-tune the results and apply a broader understanding of context and nuance, machines simply fall short.

That said, AI moderation is a must for real-time review and adds scalability for high volumes of incoming content. AI-based moderation technology can surmise a great deal about content (such as offensive gestures, nudity, blurriness, and embedded text) and reject these specific types of images outright. It is also regarded as the first line of defense against child sexual exploitation, spam, and other unseemly content.

But AI has its limitations, as it may flag images that are benign while failing to filter some forms of damaging content. For example, AI won’t catch nuanced content. It may be programmed to recognize a gun as a violent image, but it won’t understand that an image of a marine in uniform holding a gun could be patriotic in nature, not violent.

On the flip side, pairing a benign image with some embedded text (although innocent on its own) might create a combination that is hateful or offensive. In these cases, humans can more efficiently flag and take action on this particularly complex content.

If companies with digital platforms can learn anything from the moderation mishaps of top companies, it’s that using a combination of AI and human moderation is the most effective way to ensure that valuable content is accepted and harmful content is rejected.

By using this hybrid approach, you can avoid exposing your audience to upsetting images or offensive text and find confidence in knowing the user-generated content that is shared will build trust between your stakeholders and your brand.