Moderating Image Content in Slack with Amazon Rekognition and Amazon AppFlow | AWS White Paper Summary

This document shows you how to use Amazon Rekognition and Amazon AppFlow to build a fully serverless content moderation pipeline for messages posted in a Slack channel.
The content moderation strategy identifies images that violate the following sample guidelines:
- Images that contain themes of tobacco or alcohol.
- Images that contain the following disallowed words:
  - medical
  - private
Amazon Rekognition content moderation is a deep learning-based service that can detect inappropriate or offensive images and videos, making it easier to find and remove such content at scale.
It provides a detailed taxonomy of moderation categories, such as Explicit Nudity, Suggestive, Violence, and Visually Disturbing.
You can now detect six new categories: Drugs, Tobacco, Alcohol, Gambling, Rude Gestures, and Hate Symbols.
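
To make the taxonomy concrete, the following minimal sketch groups the labels of a moderation response under their top-level category. The sample response is hypothetical: it only mimics the shape returned by Rekognition's DetectModerationLabels API (a ModerationLabels list whose entries carry Name, ParentName, and Confidence), and the helper name group_by_category is an assumption made for illustration.

```python
# Hypothetical response shaped like DetectModerationLabels output;
# the labels and confidence scores are made up for illustration.
sample_moderation_response = {
    "ModerationLabels": [
        {"Name": "Tobacco", "ParentName": "", "Confidence": 97.1},
        {"Name": "Smoking", "ParentName": "Tobacco", "Confidence": 97.1},
        {"Name": "Alcohol", "ParentName": "", "Confidence": 88.4},
    ]
}


def group_by_category(response):
    """Map each top-level category to the second-level labels found under it."""
    grouped = {}
    for label in response["ModerationLabels"]:
        # Top-level labels have an empty ParentName.
        parent = label["ParentName"] or label["Name"]
        children = grouped.setdefault(parent, [])
        if label["ParentName"]:
            children.append(label["Name"])
    return grouped


print(group_by_category(sample_moderation_response))
```

Matching on the top-level category name is what lets a policy such as "no Tobacco" cover every more specific label beneath it.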
Amazon AppFlow is a fully managed integration service that enables you to securely transfer data between Software-as-a-Service (SaaS) applications like Salesforce, Marketo, Slack, and ServiceNow, and AWS services like Amazon S3 and Amazon Redshift, in just a few clicks.
This solution leverages Amazon AppFlow to capture the content posted in Slack channels for analysis using Amazon Rekognition.
This document assumes that you have an AWS account and a Slack workspace, as well as access to an S3 bucket.
Client credentials for these services are also required.
This solution doesn't require any prior machine learning (ML) expertise, or development of your own custom ML models.

Architecture overview

This solution uses serverless technologies and managed services to be scalable and cost-eﬀective.
By using an event-driven architecture that incorporates AWS Lambda and SQS, you can decouple image detection and image processing without provisioning or managing any servers.
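
The decoupling described above can be pictured with a small in-memory sketch. It is not the solution's code: two deques stand in for the SQS queues (named after the queues created later in this document), and the functions find_images and check_images are hypothetical stand-ins for the two Lambda functions.

```python
from collections import deque

# In-memory stand-ins for the two SQS queues used by the solution.
new_image_findings = deque()
new_violation_findings = deque()


def find_images(messages):
    """Stage 1: enqueue every image URL found in a batch of messages."""
    for message in messages:
        for url in message.get("image_urls", []):
            new_image_findings.append(url)


def check_images(is_violation):
    """Stage 2: drain the image queue and enqueue URLs that violate the policy."""
    while new_image_findings:
        url = new_image_findings.popleft()
        if is_violation(url):
            new_violation_findings.append(url)


# Hypothetical input: one message with two images, one without any.
find_images([{"image_urls": ["a.png", "bad.png"]}, {"text": "no image"}])
check_images(lambda url: url.startswith("bad"))
print(list(new_violation_findings))
```

Because the stages only share a queue, either one can be changed, retried, or scaled without touching the other.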

Create Amazon AppFlow Integration with your Slack workspace

First, create a Slack app.
Now follow these steps to conﬁgure the Amazon AppFlow integration:
1. Navigate to the Amazon AppFlow console and choose Create ﬂow.
2. In Step 1 of the creation process, enter a Flow name, and optionally, a description.
3. For the purposes of this demo, leave the Data encryption setting as it is.
4. Optionally, enter any tags you’d like for the ﬂow.
5. Choose Next.

In Step 2 (Conﬁgure ﬂow), choose the Source name dropdown list and choose Slack from the list of options:

A Choose Slack connection dropdown list appears. From this list, choose Create new connection:
Enter your Slack workspace address (for example, testingslackdevgroup.slack.com), and the Client ID and Client Secret that were generated when you created the Slack app.
Give your connection a name on the Connect to Slack popup window.
Choose Continue.

A window pops up with a conﬁrmation prompt to allow permissions. Choose Allow.
Your new connection is conﬁgured & displayed in the Choose Slack connection dropdown list, and a new Choose Slack object dropdown list appears directly below it. Choose Conversations.
A new dropdown appears directly below Choose Slack channel. From this list, choose the Slack channel that you would like to perform content moderation on.

With the Slack workspace connected, and the channel for moderation selected, you can move on to conﬁguring the Destination details. First, choose Amazon S3 from the Destination name dropdown list. Select S3.

A new section titled Flow trigger appears with two options: Run on demand or Run ﬂow on schedule. Choose the second option, and conﬁgure the schedule to run every one (1) minute.
When you choose this option, the Incremental Transfer option is auto-selected. Enter a value for Starting at and Start date.
In Step 3 (Map data ﬁelds), you have the option to perform transformations on the data ﬁelds. Choose Manually map ﬁelds.
From the Source ﬁeld name dropdown, select Map all ﬁelds directly. This creates a mapping of all the ﬁelds without any transformations.
Choose Next.
In Step 4 (Add ﬁlters), you have the option to perform ﬁltering on the data. Do not add any ﬁlters here, simply choose Next to continue.
On the Review and Create screen, a summary of all your selections from previous steps is shown. Review these for accuracy, then scroll to the bottom of the page and choose Create flow.
After the flow has been created, on the following screen, choose the Activate flow button.

Create a Lambda function to process ﬁles in the S3 bucket that contain new Slack messages

Because the Lambda function needs to store the image URLs it ﬁnds into a new SQS queue, ﬁrst create that queue by following the steps outlined in Getting started with Amazon SQS . Name this queue new-image-findings.
Navigate to the Lambda console. Choose Create Function and choose the option to Use a blueprint, then provide a ﬁlter called hello. This displays the hello-world-python blueprint in the results at the bottom.
Choose the Configure button.
On the next screen, provide a name for your new function called process-new-messages, and create a new IAM role called process-new-messages-lambda-role using the available “Amazon S3 object read-only permissions” template. This role will need to be customized in a later step.
After the function has been created, choose the Permissions tab.
Choose the role name to open a second window where you can view the two policies applied to this role.
Expand each policy to view the permissions details. The policy named AWSLambdaBasicExecutionRole-* grants the necessary permissions for the function to log information in CloudWatch. The policy named AWSLambdaS3ExecutionRole-* provides S3 permissions and needs to be modified. To modify the policy, choose Edit Policy and switch to the JSON view to customize this policy. The final permissions statement should appear as follows:

"Statement": [{
  "Action": [
   "s3:GetObject*",
   "s3:GetBucket*",
   "s3:List*"
 ],
  "Resource": [
     "arn:aws:s3:::slack-moderation-output",
     "arn:aws:s3:::slack-moderation-output/*"
 ],
  "Effect": "Allow"
}]

The preceding statement follows the principle of least privilege, and limits the permissions of this Lambda function to only the bucket you created for this exercise. Save the change you’ve made to this policy.
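
If you use a different bucket name, a small helper can generate the same statement so that both Resource entries stay consistent. This is a sketch under assumptions: the function name s3_read_only_statement is hypothetical, and the bucket name is the one used in this document.

```python
import json


def s3_read_only_statement(bucket_name):
    """Build a read-only statement scoped to one bucket and its objects."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Effect": "Allow",
        "Action": ["s3:GetObject*", "s3:GetBucket*", "s3:List*"],
        # The bucket ARN covers bucket-level actions, the /* ARN covers objects.
        "Resource": [bucket_arn, f"{bucket_arn}/*"],
    }


policy = {
    "Version": "2012-10-17",
    "Statement": [s3_read_only_statement("slack-moderation-output")],
}
print(json.dumps(policy, indent=2))
```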

For this function to write messages to the new-image-findings SQS queue, an additional minimally scoped IAM policy needs to be added to this role.

To add the IAM policy:

Choose Add inline policy and switch to the JSON view to create the following permissions. Note that the following Resource element needs to be updated with the correct Amazon Resource Name (ARN) for the new-image-findings SQS queue which contains your actual account number.

{
"Version": "2012-10-17",
 "Statement": [{
  "Action": [
   "sqs:SendMessage",
   "sqs:GetQueueAttributes",
   "sqs:GetQueueUrl"
 ], 
  "Resource": "arn:aws:sqs:us-east-1:111111111111:new-image-findings",
  "Effect": "Allow"
 }]

}

Choose Review policy, then enter a name for this policy and choose Create policy.
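
The queue ARN used in the policy and the queue URL used later in the function code are built from the same three parts: Region, account number, and queue name. The sketch below shows that relationship. The helper names are hypothetical, and the URL uses the regional endpoint form, which is an assumption: this document's code uses the queue.amazonaws.com form, and the value shown for your queue in the SQS console (or returned by the boto3 get_queue_url call) is the authoritative one.

```python
def sqs_queue_arn(region, account_id, queue_name):
    """ARN form, as used in the Resource element of IAM policies."""
    return f"arn:aws:sqs:{region}:{account_id}:{queue_name}"


def sqs_queue_url(region, account_id, queue_name):
    """Regional endpoint form of the queue URL, as passed to send_message."""
    return f"https://sqs.{region}.amazonaws.com/{account_id}/{queue_name}"


# 111111111111 is the placeholder account number used throughout this document.
print(sqs_queue_arn("us-east-1", "111111111111", "new-image-findings"))
print(sqs_queue_url("us-east-1", "111111111111", "new-image-findings"))
```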
With the permissions properly conﬁgured, switch back to the Conﬁguration tab in the Lambda function window, and paste the following code into the Function code section:

import boto3
from urllib.parse import unquote_plus
import json

s3 = boto3.resource('s3')
sqs = boto3.client('sqs')

def sendToSqS(attributes, queueurl):
  # Send one image reference to the queue as message attributes
  sqs.send_message(
    QueueUrl=queueurl,
    MessageBody='Image to Check',
    MessageAttributes={
      "url": {"StringValue": attributes["image_url"], "DataType": 'String'},
      "slack_msg_id": {"StringValue": attributes["client_msg_id"], "DataType": 'String'}
    }
  )


def lambda_handler(event, context):

  image_processing_queueurl = "https://queue.amazonaws.com/111111111111/new-image-findings"

  for record in event['Records']:
    # Object keys in S3 events are URL-encoded, so decode before reading
    bucket = record['s3']['bucket']['name']
    key = unquote_plus(record['s3']['object']['key'])
    file_lines = s3.Object(bucket, key).get()['Body'].read().decode('utf-8').splitlines()

    attachment_list = []
    for line in file_lines:
      if line: # Check for blank lines
        jsonline = json.loads(line)
        if "attachments" in jsonline.keys(): # Check for lines with attachments
          for attachment in jsonline["attachments"]:
            if "image_url" in attachment.keys():
              if "client_msg_id" in jsonline.keys():
                thisdict = {
                  "image_url": attachment["image_url"],
                  "client_msg_id": jsonline["client_msg_id"]
                }
                attachment_list.append(thisdict.copy())
              else:
                thisdict = {
                  "image_url": attachment["image_url"],
                  "client_msg_id": "None Found"
                }
                attachment_list.append(thisdict.copy())

    # Send every image reference found in this file to the queue
    for item in attachment_list:
      sendToSqS(item, image_processing_queueurl)

After you have pasted the code, update the image_processing_queueurl variable in the function handler with the correct URL for the new-image-findings SQS queue, which contains your actual account number.
Choose Deploy to deploy the updated code.
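
The parsing logic of this function can be tried locally without any AWS resources. The sketch below assumes, as the function code does, that each file contains one JSON object per line. The sample records are hypothetical and far smaller than real Slack messages, and the helper name extract_image_findings is an assumption.

```python
import json

# Hypothetical excerpt of an output file: one JSON object per line,
# including a blank line and a message without attachments.
sample_file = "\n".join([
    json.dumps({
        "client_msg_id": "abc-123",
        "text": "look at this",
        "attachments": [{"image_url": "https://example.com/a.png"}],
    }),
    "",
    json.dumps({"text": "no attachments here"}),
])


def extract_image_findings(file_text):
    """Return one dict per image attachment found in the file."""
    findings = []
    for line in file_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        message = json.loads(line)
        for attachment in message.get("attachments", []):
            if "image_url" in attachment:
                findings.append({
                    "image_url": attachment["image_url"],
                    "client_msg_id": message.get("client_msg_id", "None Found"),
                })
    return findings


print(extract_image_findings(sample_file))
```

Keeping the parsing in a pure function like this makes it easy to test against a downloaded file before wiring it to S3 and SQS.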

Conﬁgure the Lambda function to be invoked when new objects are added to your S3 bucket

With your Lambda function (process-new-messages) created, the next step is to conﬁgure bucket notiﬁcations on your S3 bucket, and subscribe this Lambda function to the notiﬁcations.

To create the S3 / Lambda event integration:

Conﬁgure event notiﬁcations on your S3 bucket by following the steps outlined in this User Guide .

In Step 5 of the conﬁguration, choose the All object create events option.
In Step 6, choose your Lambda function named process-new-messages.
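
Once the notification is in place, each invocation receives an event that lists the created objects. The sketch below uses a trimmed, hypothetical event (real events carry many more fields) to show why the function code decodes the object key with unquote_plus: keys in S3 event notifications are URL-encoded. The helper name object_locations is an assumption.

```python
from urllib.parse import unquote_plus

# Trimmed, hypothetical S3 object-created notification.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "slack-moderation-output"},
                "object": {"key": "my+flow/2021-01-01T00%3A00%3A00/part-0"},
            }
        }
    ]
}


def object_locations(event):
    """Return (bucket, decoded key) pairs for every record in the event."""
    return [
        (r["s3"]["bucket"]["name"], unquote_plus(r["s3"]["object"]["key"]))
        for r in event["Records"]
    ]


print(object_locations(sample_event))
```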

Create a Lambda function to process messages where image references were found (via SQS queue)

Your ﬁrst Lambda function (process-new-messages) is now being invoked, and any image references found in Slack messages have been stored in the new-image-findings SQS queue.

The next step is to create and invoke another Lambda function (process-new-images) that will use Amazon Rekognition to determine if there are any policy violations in the content.

To conﬁgure the SQS / Lambda / Amazon Rekognition Integration:

Because the Lambda function you are about to create needs to store any content violations found into a new SQS queue, ﬁrst create that queue by following the steps outlined in Getting started with Amazon SQS.
Name this queue new-violation-findings.
Navigate to the Lambda console and choose Create Function.
Choose the Use a blueprint option and provide a ﬁlter called hello. This will display the hello-world-python blueprint in the results at the bottom.
Choose the Conﬁgure button. Name the new Lambda function process-new-images.
Create a new execution role with basic Lambda permissions.
After the function has been created, choose the Permissions tab.
Choose IAM Role to open a second window where you can view the policy attached to this role.
Choose Attach Policies.
Search for AmazonRekognitionReadOnlyAccess and choose Attach Policy to complete the action. This allows your Lambda function permissions to call Amazon Rekognition.
The function also needs permissions to read from the new-image-findings queue and write new messages to the new-violation-findings queue. Choose Add inline policy and switch to the JSON view to create the following permissions.

Note that the following Resource elements need to be updated with the correct ARNs for the new-image-findings and new-violation-findings SQS queues respectively, which contain your actual account number:

{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"sqs:ReceiveMessage",
"sqs:ChangeMessageVisibility",
"sqs:GetQueueUrl",
"sqs:DeleteMessage",
"sqs:GetQueueAttributes"
],
"Resource": "arn:aws:sqs:us-east-1:111111111111:new-image-findings",
"Effect": "Allow"
},
{
"Action": [
"sqs:SendMessage",
"sqs:GetQueueAttributes",
"sqs:GetQueueUrl"
],
"Resource": "arn:aws:sqs:us-east-1:111111111111:new-violation-findings",
"Effect": "Allow"


 }]

}

Choose Review policy.
Enter a name for this policy and choose Create policy.
With the permissions conﬁgured, switch back to the Conﬁguration tab in the Lambda function window, and paste the following code into the Function code section:

import urllib.request
import boto3

sqs = boto3.client('sqs')
rekognition = boto3.client('rekognition')

def analyze_themes(file, min_confidence=80):
  with open(file, 'rb') as document:
    imageBytes = bytearray(document.read())
    response = rekognition.detect_moderation_labels(Image={'Bytes': imageBytes}, MinConfidence=min_confidence)

  found_high_confidence_labels = []

  for label in response['ModerationLabels']:
    found_high_confidence_labels.append(str(label['Name']))

  return found_high_confidence_labels


def analyze_text(file):
  with open(file, 'rb') as document:
    imageBytes = bytearray(document.read())

    response = rekognition.detect_text(Image={'Bytes': imageBytes})

    textDetections = response['TextDetections']
    found_text = ""

    for text in textDetections:
      found_text += text['DetectedText']

    return found_text



def sendToSqS(words, attributes, queueurl):
   sqs.send_message(
      QueueUrl=queueurl,
      MessageBody='Image with "' + words + '" found',
      MessageAttributes={
        "url": {
          "StringValue": attributes["image_url"],
          "DataType": 'String'
        },
        "slack_msg_id": {
          "StringValue": attributes["slack_msg_id"],
          "DataType": 'String'
        }
      }
   )



def lambda_handler(event, context):
   violations = "https://queue.amazonaws.com/111111111111/new-violation-findings"
   disallowed_words = ["medical", "private"]
   disallowed_themes = ["Tobacco", "Alcohol"] # Case Sensitive 
   file_name = "/tmp/image.jpg"

   for record in event['Records']:
      print(record)
      receiptHandle = record["receiptHandle"]
      image_url = record["messageAttributes"]["url"]["stringValue"]
      slack_msg_id = record["messageAttributes"]["slack_msg_id"]["stringValue"]
      eventSourceARN = record["eventSourceARN"]
      arn_elements = eventSourceARN.split(':')

      img_queue_url = sqs.get_queue_url(
          QueueName=arn_elements[5],
          QueueOwnerAWSAccountId=arn_elements[4]
      )

      sqs.delete_message(
          QueueUrl=img_queue_url["QueueUrl"],
          ReceiptHandle=receiptHandle
      )

      urllib.request.urlretrieve(image_url, file_name)
      detected_text = analyze_text(file_name)

      print("Detected Text: " + detected_text)
      found_words = []
      for disallowed_word in disallowed_words:
        if disallowed_word.lower() in detected_text.lower():
            found_words.append(disallowed_word)
            print("WORD VIOLATION: " + disallowed_word.lower() + " found in " + detected_text.lower())

      # Report word violations once per image, after all words were checked
      violating_words = ", ".join(found_words)
      if not violating_words == "":
        attributes_json = {}
        attributes_json["slack_msg_id"] = slack_msg_id
        attributes_json["image_url"] = image_url
        sendToSqS(violating_words, attributes_json, violations)

      detected_themes = analyze_themes(file_name)
      print("Detected Themes: " + ", ".join(detected_themes))

      found_themes = []
      for disallowed_theme in disallowed_themes:
        if disallowed_theme in detected_themes:
          found_themes.append(disallowed_theme)
          print("THEME VIOLATION: " + disallowed_theme + " found in image")

      # Report theme violations once per image
      violating_themes = ", ".join(found_themes)
      if not violating_themes == "":
        attributes_json = {}
        attributes_json["slack_msg_id"] = slack_msg_id
        attributes_json["image_url"] = image_url
        sendToSqS(violating_themes, attributes_json, violations)

After you have pasted the code, update the violations variable in the function handler with the correct URL for the new-violation-findings SQS queue, which contains your actual account number.
Choose Deploy.
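
The two policy checks in this function are plain string comparisons, so they can be exercised without calling Amazon Rekognition. The sketch below isolates them. The helper names and the sample inputs are hypothetical.

```python
def find_word_violations(detected_text, disallowed_words):
    """Case-insensitive substring match of each disallowed word in the text."""
    text = detected_text.lower()
    return [word for word in disallowed_words if word.lower() in text]


def find_theme_violations(detected_themes, disallowed_themes):
    """Exact, case-sensitive match against the detected label names."""
    return [theme for theme in disallowed_themes if theme in detected_themes]


# Hypothetical detection results for one image.
print(find_word_violations("PRIVATE property", ["medical", "private"]))
print(find_theme_violations(["Tobacco", "Smoking"], ["Tobacco", "Alcohol"]))
```

Note that the word check is a substring match, so a disallowed word such as "private" is also reported for text containing "privately". Whether that is desirable depends on your guidelines.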

To ensure that your SQS queues cannot be accessed by resources outside the account, SQS permissions policies can be applied to each of the queues.

To apply permissions policies:

Navigate to the SQS console and choose the new-violation-findings queue.
Choose the Access policy tab.
Choose the Edit button and paste in the following policy.

Note that the Principal and Resource elements in the following policy need to be updated with your actual account number and the correct ARN for the new-violation-findings SQS queue.

{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "QueueOwnerOnlyAccess",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::111111111111:root"
},
"Action": [
"sqs:DeleteMessage",
"sqs:ReceiveMessage",
"sqs:SendMessage",
"sqs:GetQueueAttributes",
"sqs:RemovePermission",
"sqs:AddPermission",
"sqs:SetQueueAttributes"
],
"Resource": "arn:aws:sqs:us-east-1:111111111111:new-violation-findings"
},
{
"Sid": "HttpsOnly",
"Effect": "Deny",
"Principal": "*",
"Action": "SQS:*",
"Resource": "arn:aws:sqs:us-east-1:111111111111:new-violation-findings",
"Condition": {
"Bool": {
"aws:SecureTransport": "false"
}
}
}
]
}

Repeat the preceding steps for the new-image-findings queue. Remember to use the new-image-findings ARN in the policy.
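
Because the same policy is applied to two queues, generating it from the queue name avoids copy-and-paste mistakes in the ARNs. The sketch below is one way to do that. The function name queue_access_policy is hypothetical, and the Region and account number are the placeholders used in this document.

```python
import json


def queue_access_policy(region, account_id, queue_name):
    """Build the owner-only, HTTPS-only access policy for one queue."""
    queue_arn = f"arn:aws:sqs:{region}:{account_id}:{queue_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "QueueOwnerOnlyAccess",
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": [
                    "sqs:DeleteMessage",
                    "sqs:ReceiveMessage",
                    "sqs:SendMessage",
                    "sqs:GetQueueAttributes",
                    "sqs:RemovePermission",
                    "sqs:AddPermission",
                    "sqs:SetQueueAttributes",
                ],
                "Resource": queue_arn,
            },
            {
                # Deny any request that does not use HTTPS.
                "Sid": "HttpsOnly",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "SQS:*",
                "Resource": queue_arn,
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
        ],
    }


for name in ("new-violation-findings", "new-image-findings"):
    print(json.dumps(queue_access_policy("us-east-1", "111111111111", name), indent=2))
```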

You can now configure your SQS queue to trigger your Lambda function.

To conﬁgure your SQS queue:

In the SQS console, choose the new-image-findings queue, then choose the Lambda triggers tab.
Choose Configure Trigger for Lambda Function.
From the dropdown list, choose the function you just created (process-new-images).

Trigger the Lambda function you created.
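
With the trigger in place, the process-new-images function receives batches of SQS records. The sketch below uses a trimmed, hypothetical record to show the fields that the function code reads, including how the queue name and account number are taken from the colon-separated eventSourceARN. The helper name parse_record is an assumption.

```python
# Trimmed, hypothetical record as delivered to Lambda by an SQS trigger.
sample_record = {
    "receiptHandle": "example-receipt-handle",
    "eventSourceARN": "arn:aws:sqs:us-east-1:111111111111:new-image-findings",
    "messageAttributes": {
        "url": {"stringValue": "https://example.com/a.png", "dataType": "String"},
        "slack_msg_id": {"stringValue": "abc-123", "dataType": "String"},
    },
}


def parse_record(record):
    """Pull out the values the handler needs from one SQS record."""
    # ARN layout: arn:aws:sqs:<region>:<account>:<queue-name>
    arn_parts = record["eventSourceARN"].split(":")
    attributes = record["messageAttributes"]
    return {
        "account_id": arn_parts[4],
        "queue_name": arn_parts[5],
        "image_url": attributes["url"]["stringValue"],
        "slack_msg_id": attributes["slack_msg_id"]["stringValue"],
    }


print(parse_record(sample_record))
```

Note the lowercase stringValue key in the Lambda event, in contrast to the StringValue key used when sending the message with boto3.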

Test the solution

You can now post some messages to your moderated Slack channel for testing. You can easily change the content violation policies in the Python code by modifying the disallowed_words and disallowed_themes variables.

To test the solution:

Post sample images that will trigger violations of the currently configured policies:
- Post this image which contains the disallowed word "private": https://i.imgur.com/662ptww.png
- Post this image which contains a "Tobacco" theme: https://i.imgur.com/XgAtyWU.png
After creating those posts, wait 2-3 minutes and then navigate to the SQS Console. View the queues and choose the new-violation-findings queue.
Choose the Send and receive messages button.
At the bottom of the screen, choose the Poll for messages button.
After a few seconds you should see two messages pop up. You can choose each message to interrogate the contents.
Choose the Message ID. The body of the message contains information about what violation was triggered. The Attributes show the image URL and slack_msg_id for the offending item.
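
If you prefer to inspect the findings programmatically, the same information is available in the response of the boto3 receive_message call. The sketch below works on a hypothetical response of that shape (it assumes the message attributes were requested when receiving, and the slack_msg_id value is made up), so it runs without AWS access. The helper name summarize_violations is an assumption.

```python
# Hypothetical response shaped like the output of sqs.receive_message
# when message attributes are requested.
sample_receive_response = {
    "Messages": [
        {
            "MessageId": "id-1",
            "Body": 'Image with "private" found',
            "MessageAttributes": {
                "url": {"StringValue": "https://i.imgur.com/662ptww.png", "DataType": "String"},
                "slack_msg_id": {"StringValue": "abc-123", "DataType": "String"},
            },
        }
    ]
}


def summarize_violations(response):
    """Return (body, image URL, Slack message ID) for each received message."""
    return [
        (
            message["Body"],
            message["MessageAttributes"]["url"]["StringValue"],
            message["MessageAttributes"]["slack_msg_id"]["StringValue"],
        )
        for message in response.get("Messages", [])
    ]


for body, url, msg_id in summarize_violations(sample_receive_response):
    print(f"{body}: {url} (Slack message {msg_id})")
```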
