Content Safety Detection

    This feature is only enabled for Enterprise accounts! If you are not sure if your account is enabled for Enterprise features, please contact your account manager.

    With the Content Safety Detection feature, AssemblyAI can classify your transcription text with the following labels:

    Accidents: Any man-made incident that happens unexpectedly and results in damage, injury, or death.
    Alcohol: Content that discusses alcoholic beverages or their consumption.
    Company Financials: Content that discusses any sensitive company financial information.
    Crime Violence: Content that discusses any type of criminal activity or extreme violence that is criminal in nature.
    Drugs: Content that discusses illegal drugs or their usage. Note that this includes discussions of marijuana.
    Gambling: Includes gambling on casino-based games such as poker, slots, etc., as well as sports betting.
    Hate Speech: Content that is a direct attack against people or groups based on their sexual orientation, gender identity, race, religion, ethnicity, national origin, disability, etc.
    Health Issues: Content that discusses any medical or health-related problems.
    Manga: Manga are comics or graphic novels originating from Japan, with some of the more popular series being "Pokemon", "Naruto", "Dragon Ball Z", "One Punch Man", and "Sailor Moon".
    Natural Disasters: Phenomena that happen infrequently and result in damage, injury, or death, such as hurricanes, tornadoes, earthquakes, volcanic eruptions, and firestorms.
    Negative News: News content with negative sentiment, which typically occurs in the third person as an unbiased recapping of events.
    NSFW (Adult Content): Content considered "Not Safe for Work" that a viewer would not want heard or seen in a public environment.
    Pornography: Content that discusses any sexual content or material.
    Profanity: Any profanity or cursing.
    Terrorism: Includes terrorist acts as well as terrorist groups. Examples include bombings, mass shootings, and ISIS. Note that many texts corresponding to this topic may also be classified into the Crime Violence topic.
    Tobacco: Text that discusses tobacco and tobacco usage, including e-cigarettes, nicotine, vaping, and general discussions about smoking.
    Weapons: Text that discusses any type of weapon, including guns, ammunition, shooting, knives, missiles, torpedoes, etc.

    Enabling Content Safety Detection when submitting files for transcription

    Include the content_safety parameter in your POST request and set it to true, as shown in the cURL request below.

    curl --request POST \
      --url https://api.assemblyai.com/v2/transcript \
      --header 'authorization: YOUR-API-TOKEN' \
      --header 'content-type: application/json' \
      --data '{"audio_url": "https://app.assemblyai.com/static/media/phone_demo_clip_1.wav", "content_safety": true}'
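    For reference, the same request can be sketched in Python. The endpoint, headers, and JSON body come directly from the cURL example above; the helper function name and the use of the requests library are our own illustrative choices, not part of the API.

    ```python
    import json

    API_ENDPOINT = "https://api.assemblyai.com/v2/transcript"

    def build_transcript_request(audio_url, api_token, content_safety=True):
        """Assemble the headers and JSON body for a transcription request
        with Content Safety Detection enabled."""
        headers = {
            "authorization": api_token,
            "content-type": "application/json",
        }
        payload = {"audio_url": audio_url, "content_safety": content_safety}
        return headers, json.dumps(payload)

    # To actually submit the request, hand these to an HTTP client, e.g.:
    #   import requests
    #   headers, body = build_transcript_request(url, token)
    #   response = requests.post(API_ENDPOINT, headers=headers, data=body)
    ```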

    Interpreting the Content Safety Detection response

    Once the transcription is complete, make a GET request to /v2/transcript/<id> to receive the transcription. The response will contain an additional key, content_safety_labels:

    {
        # some keys have been hidden for readability
        ...
        "text": "foo bar...",    
        "id": "5551722-f677-48a6-9287-39c0aafd9ac1",
        "status": "completed",
        "content_safety_labels": {
            "status": "success", 
            "results": [
                {
                    "text": "...has led to a dramatic increase in fires and the disasters around the world have been increasing at an absolutely extraordinary. An unprecedented rate four times as many in the last 30 years...", 
                    "labels": [
                        {
                            "confidence": 0.9986903071403503, 
                            "label": "accidents"
                        }
                    ], 
                    "timestamp": {
                        "start": 171710, 
                        "end": 200770
                    }
                }
            ],
            "summary": {
                "accidents": 0.89,
                "nsfw": 0.08,
                ...
            }
        },
        ...
    }
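    The flagged segments in this response can be pulled out programmatically. Below is a minimal sketch in Python; the function name and the confidence cutoff are illustrative choices, not part of the API, and we assume the timestamps are offsets into the audio file in milliseconds.

    ```python
    def flagged_segments(transcript, min_confidence=0.5):
        """Return (label, confidence, start, end, text) tuples for every
        content safety label at or above the given confidence."""
        hits = []
        for result in transcript["content_safety_labels"]["results"]:
            for label in result["labels"]:
                if label["confidence"] >= min_confidence:
                    hits.append((
                        label["label"],
                        label["confidence"],
                        result["timestamp"]["start"],  # assumed milliseconds
                        result["timestamp"]["end"],
                        result["text"],
                    ))
        return hits

    # Abridged version of the response shown above:
    transcript = {
        "content_safety_labels": {
            "status": "success",
            "results": [{
                "text": "...has led to a dramatic increase in fires...",
                "labels": [{"confidence": 0.9986903071403503, "label": "accidents"}],
                "timestamp": {"start": 171710, "end": 200770},
            }],
            "summary": {"accidents": 0.89, "nsfw": 0.08},
        },
    }
    for label, conf, start, end, _ in flagged_segments(transcript):
        print(f"{label} ({conf:.2f}) at {start}-{end}")
    ```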

    Let's dig into the entire content_safety_labels response key below to get a better understanding of what each value means.

    "content_safety_labels": {
        # will be "unavailable" in the rare case that content safety results
        # could not be produced for this transcription
        "status": "success",
        # 'results' contains a list of each portion of text that was flagged
        # by the content safety model, along with the labels that were
        # assigned to the flagged paragraph, the confidence score for each
        # label, and the timestamp for where the portion of text
        # occurred in the source audio file
        "results": [
            {
                "text": "...has led to a dramatic increase in fires and the disasters around the world have been increasing at an absolutely extraordinary. An unprecedented rate four times as many in the last 30 years...", 
                "labels": [
                    {
                        "confidence": 0.9986903071403503, 
                        "label": "accidents"
                    }
                ], 
                "timestamp": {
                    "start": 171710, 
                    "end": 200770
                }
            }
        ],
        # for each content safety label detected, the 'summary' key will show
        # the relevancy for that label across the entire transcription
        # text; for example, if the "nsfw" label is detected only 1 time,
        # even with high confidence, in a 60 minute audio file, the
        # 'summary' key will show a low score, since the entire
        # transcription was not found to be "nsfw"
        "summary": {
            "accidents": 0.89,
            "nsfw": 0.08,
            ...
        }
    },
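    Because the summary scores reflect relevancy across the entire transcription, a common pattern is to act only on labels whose summary score crosses some threshold. A short sketch of that idea in Python; the function name and the 0.5 cutoff are arbitrary examples, not API defaults.

    ```python
    def relevant_labels(content_safety_labels, threshold=0.5):
        """Return (label, score) pairs whose whole-transcript relevancy
        score meets the threshold, sorted from most to least relevant."""
        summary = content_safety_labels.get("summary", {})
        return sorted(
            ((label, score) for label, score in summary.items() if score >= threshold),
            key=lambda pair: pair[1],
            reverse=True,
        )

    # With the summary shown above, only "accidents" (0.89) passes a 0.5
    # cutoff; "nsfw" (0.08) is filtered out.
    example = {"status": "success", "summary": {"accidents": 0.89, "nsfw": 0.08}}
    print(relevant_labels(example))  # [('accidents', 0.89)]
    ```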