How we use AI for Social Network Content Moderation

By Roberto Requena
Published on August 2, 2024

While building a new social network application, we received a client request for admin screens to manually moderate user posts and comments. Committed to providing an optimal user experience, however, we recognized the limitations of a purely manual approach. This is where Artificial Intelligence (AI) technologies come into play, offering an automated content moderation solution.

At Flywheel Studio, we have primarily focused our development on the “Low Code” world, leveraging tools like FlutterFlow and Webflow. However, some applications demand functionality beyond the front end. That functionality, often triggered by user actions or running in the background, requires a robust backend component.

We're building this application with FlutterFlow, a development tool for creating mobile and web applications, and Firebase, which serves as our backend. To leverage Firebase's capabilities, we designed a system where every post or comment creation triggers a Cloud Function responsible for content moderation, as shown in the following diagram.
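In code, the trigger side of this design looks roughly like the sketch below, using the second-generation Firebase Functions API. The "posts" collection, the text field, and the moderateContent helper (sketched later in this article) are illustrative assumptions, not the exact names from our codebase.

```typescript
import { onDocumentCreated } from "firebase-functions/v2/firestore";
// Hypothetical helper that runs the moderation pipeline described below.
import { moderateContent } from "./moderation";

// Fires for every new document in the (illustrative) "posts" collection;
// an equivalent trigger covers comments.
export const onPostCreated = onDocumentCreated("posts/{postId}", async (event) => {
  const snapshot = event.data;
  if (!snapshot) return;

  const text = snapshot.data().text as string | undefined;
  if (!text) return;

  // Store the moderation verdict back on the document.
  const status = await moderateContent(text);
  await snapshot.ref.update({ moderationStatus: status });
});
```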

Delving into the AI Solution

With the groundwork laid above, let's explore the specifics of the AI technology employed for content moderation and the valuable insights we gained during this process.

Initially, I saw this as an excellent opportunity to explore and experiment with cutting-edge Google technologies I encountered at Google I/O 2024: Firebase Genkit, Google AI Studio, and the Gemini API.

I created a Firebase Cloud Functions project and integrated the Firebase Genkit library. To my surprise, the process was remarkably straightforward, and the code examples the library generates provided an excellent starting point.

A key advantage of Firebase Genkit is its robust developer tooling: a user-friendly interface that lets developers craft prompts, experiment with different models, and test their workflows efficiently.

To achieve content moderation, I began crafting prompts to guide the Gemini AI towards the desired behavior.
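As a rough illustration, such a moderation flow might look like the sketch below. Genkit's API surface has changed between releases, so the imports and call shapes here reflect one recent version and should be checked against the docs; the flow name and prompt wording are invented for this example.

```typescript
import { genkit, z } from "genkit";
import { googleAI, gemini15Flash } from "@genkit-ai/googleai";

const ai = genkit({ plugins: [googleAI()] });

// Ask Gemini for a one-word verdict so the output is trivial to parse.
export const moderatePostFlow = ai.defineFlow(
  {
    name: "moderatePostFlow",
    inputSchema: z.string(),
    outputSchema: z.string(),
  },
  async (postText) => {
    const response = await ai.generate({
      model: gemini15Flash,
      prompt:
        "You are a content moderator for a social network. " +
        "Reply with exactly one word, SAFE or UNSAFE, for this post:\n\n" +
        postText,
    });
    return response.text.trim();
  }
);
```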

After conducting several tests, I realized that the desired output sometimes wasn't being achieved. The discrepancy seemed to be related to Gemini's safety filters, which block prompts and responses deemed potentially harmful; in a moderation scenario, the very user content you need classified can itself trip those filters.

After a thorough investigation, I explored Machine Learning (another branch of AI) as an alternative approach that sidesteps the issues with the generative model's safety filters. I chose Google’s Cloud Natural Language API for its simplicity and efficiency: it lets us streamline our code to a single endpoint call. We integrated it into our Firebase Cloud Functions, which are written in TypeScript, by adding the @google-cloud/language dependency to the project.

We developed a client class that encapsulates the API call to the Moderate Text endpoint and the processing of its result, adhering to clean code principles. This approach promotes code maintainability and readability.

The Moderate Text endpoint returns a list of ClassificationCategory objects. Each object comprises a category name and a confidence score indicating how likely the model considers the content to belong to that category.
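A sketch of that wrapper is below. The moderateText call and the moderationCategories response field come from the @google-cloud/language client; the class and method names are our own illustration.

```typescript
import { LanguageServiceClient } from "@google-cloud/language";

// Thin wrapper around the Moderate Text endpoint.
export class ModerationClient {
  private readonly client = new LanguageServiceClient();

  // Returns the category the model is most confident about,
  // or null when the response contains no categories.
  async topCategory(
    content: string
  ): Promise<{ name: string; confidence: number } | null> {
    const [response] = await this.client.moderateText({
      document: { content, type: "PLAIN_TEXT" },
    });

    const categories = response.moderationCategories ?? [];
    if (categories.length === 0) return null;

    const top = categories.reduce((best, current) =>
      (current.confidence ?? 0) > (best.confidence ?? 0) ? current : best
    );
    return { name: top.name ?? "UNKNOWN", confidence: top.confidence ?? 0 };
  }
}
```

A Cloud Function can instantiate this class once at module scope and reuse it across invocations, avoiding a new gRPC connection per request.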

Based on the endpoint results, our logic for each created document is as follows: the Cloud Function extracts the post's text content, submits it to the endpoint, and identifies the category with the highest confidence. Depending on that confidence score, the post content is either accepted or rejected.

Within this context, if the score is greater than or equal to 0.8 (score >= 0.8), the content is Rejected; if it is less than or equal to 0.3 (score <= 0.3), it is Accepted. In both cases, a new attribute recording this decision is added to the document. If the top score meets neither criterion, that is, when it falls between the two thresholds (score > 0.3 && score < 0.8), the document's status is updated to Needs Revision instead, and a push notification is sent to the application administrators so a human can review the content and take over the moderation decision.
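Condensed into code, and using illustrative constant and status names, the decision looks roughly like this:

```typescript
// Default thresholds; the settings screen described below lets these be changed.
const REJECT_THRESHOLD = 0.8;
const ACCEPT_THRESHOLD = 0.3;

type ModerationStatus = "Accepted" | "Rejected" | "NeedsRevision";

// Map the top category's confidence score to a moderation decision.
function decide(confidence: number): ModerationStatus {
  if (confidence >= REJECT_THRESHOLD) return "Rejected";
  if (confidence <= ACCEPT_THRESHOLD) return "Accepted";
  // Anything in between is escalated to a human moderator.
  return "NeedsRevision";
}
```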

To provide granular control over the AI content moderation feature, we incorporated a configuration settings screen within the application. This screen allows the feature to be enabled or disabled as needed and the score thresholds for each case to be adjusted.
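For illustration, the Cloud Function can read those settings from a Firestore document before moderating. The document path and field names below are assumptions, not the app's actual schema; the snippet also assumes firebase-admin has already been initialized elsewhere in the functions project.

```typescript
import { getFirestore } from "firebase-admin/firestore";

// Assumed shape of the settings document the configuration screen writes to.
interface ModerationSettings {
  enabled: boolean;
  rejectThreshold: number;
  acceptThreshold: number;
}

// Falls back to the defaults from the article when a field is missing.
async function loadModerationSettings(): Promise<ModerationSettings> {
  const snap = await getFirestore().doc("config/moderation").get();
  const data = snap.data();
  return {
    enabled: data?.enabled ?? true,
    rejectThreshold: data?.rejectThreshold ?? 0.8,
    acceptThreshold: data?.acceptThreshold ?? 0.3,
  };
}
```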

Our proposed solution automates content moderation using Artificial Intelligence technologies, significantly reducing administrative workload and enhancing user experience by minimizing post-approval wait times. We strongly advocate for the integration of AI technologies into your applications to drive innovation and deliver superior solutions.
