Content Safety Detection

    This feature is available only to Enterprise accounts. If you are not sure whether Enterprise features are enabled for your account, please contact your account manager.

    With the Content Safety Detection feature, AssemblyAI can classify your transcription text with the following labels:

    Label | Description
    Accidents | Includes accidents, both man-made and natural, such as plane crashes, car wrecks, hurricanes, fires, tornadoes, etc.
    Alcohol | Text that discusses alcohol, alcohol usage, drinking, etc. Examples include beer, wine, vodka, hangovers, etc.
    Crime Violence | Text that discusses any type of crime or violence, including murder, rape, kidnapping, shootings, etc. Also includes things related to crimes, such as crime scenes, forensics, etc.
    Drugs | Text that discusses non-medical drugs and drug usage. Examples include marijuana, heroin, etc. Alcohol and tobacco should not be classified into this topic, as they have their own separate categories. If a text discusses alcohol or tobacco along with other drugs such as marijuana, it can be labelled with both.
    Gambling | Includes gambling on casino-based games such as poker, slots, etc., as well as sports betting and other gambling-related subject matter.
    Hate Speech | Any name-calling, insults, slurs, etc. directed towards an individual or group of people. Also includes racism, sexism, etc.
    Health Issues | Text that discusses medical or health-related issues and problems. Includes things such as heart disease, flu, torn ACL, COVID-19, etc.
    Manga | Manga are comics or graphic novels originating from Japan. This label can be difficult to assign: text that discusses a story arc but can’t directly be identified as Manga should not be labelled. Manga includes things such as Naruto, Pokemon, etc.
    Negative News | Any news that can be deemed negative, even if it’s factual. This is typically given in the third person as a recap of events. Text classified into this topic will often fall into others as well, such as crime violence, terrorism, etc.
    NSFW (Adult Content) | Means “Not safe for work” and includes content that the viewer may not want to be seen or heard in a public controlled environment. May include content such as profanity, intense sexuality or pornography, slurs, disturbing subject matter, intense violence, etc. Virtually all content identified as pornography or profanity should also fall under this topic.
    Pornography | Text that includes any sexual content or material.
    Profanity | Any profanity or cursing.
    Terrorism | Includes terrorist acts as well as terrorist groups. Examples include bombings, mass shootings, ISIS, etc. Many texts corresponding to this topic may also be classified into the crime violence topic.
    Tobacco | Text that discusses tobacco and tobacco usage, including e-cigarettes. Examples include nicotine, vaping, smoking, etc.
    Weapons | Text that discusses any type of weapon, including guns, ammunition, shooting, knives, missiles, torpedoes, etc.

    Enabling Content Safety Detection when submitting files for transcription

    Include the content_safety parameter in your POST request and set it to true, as shown in the cURL request below.

    curl --request POST \
      --url https://api.assemblyai.com/v2/transcript \
      --header 'authorization: YOUR-API-TOKEN' \
      --header 'content-type: application/json' \
      --data '{"audio_url": "https://app.assemblyai.com/static/media/phone_demo_clip_1.wav", "content_safety": true}'
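
    For readers who prefer Python, the same request can be sketched with the standard library alone. This is a minimal sketch, not an official client: the helper names build_transcript_request and send are hypothetical, and the endpoint, headers, and body are copied from the cURL request above.

```python
import json
import urllib.request

API_URL = "https://api.assemblyai.com/v2/transcript"


def build_transcript_request(api_token: str, audio_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request with content_safety enabled.

    Hypothetical helper: it mirrors the cURL example field for field.
    """
    payload = {"audio_url": audio_url, "content_safety": True}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "authorization": api_token,
            "content-type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body of the response.

    Hypothetical helper: this performs a real network call and needs a valid token.
    """
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

    Calling send(build_transcript_request("YOUR-API-TOKEN", audio_url)) submits the file. Keeping request construction separate from sending makes the payload easy to inspect before anything goes over the network.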

    Interpreting the Content Safety Detection response

    Once the transcription is complete, a GET request to /v2/transcript/<id> returns the transcript with an additional key, content_safety_labels:

    {
        # some keys have been hidden for readability
        ...
        "audio_duration": 12.0960090702948,
        "audio_url": "https://s3-us-west-2.amazonaws.com/blog.assemblyai.com/audio/8-7-2018-post/7510.mp3",
        "confidence": 0.956,
        "format_text": true,
        "id": "5551722-f677-48a6-9287-39c0aafd9ac1",
        "status": "completed",
        "text": "...",
        "content_safety_labels": {
            "status": "success", 
            "results": [
                {
                    "text": "...has led to a dramatic increase in fires and the disasters around the world have been increasing at an absolutely extraordinary. An unprecedented rate four times as many in the last 30 years...", 
                    "labels": [
                        {
                            "confidence": 0.9986903071403503, 
                            "label": "accidents"
                        }
                    ], 
                    "timestamp": {
                        "start": 171710, 
                        "end": 200770
                    }
                }
            ]
        },
        ...
    }
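
    The response above can be walked with a few lines of Python. The following is a minimal sketch that assumes the response has already been decoded into a dict with the shape shown. The function name flagged_segments and the min_confidence cutoff of 0.5 are hypothetical choices for illustration, not part of the API, and treating any status other than "success" as "no results" is likewise an assumption.

```python
from typing import Any


def flagged_segments(transcript: dict[str, Any], min_confidence: float = 0.5) -> list[dict[str, Any]]:
    """Flatten content_safety_labels into one entry per (segment, label) pair.

    min_confidence is an illustrative threshold, not an API parameter.
    """
    safety = transcript.get("content_safety_labels") or {}
    if safety.get("status") != "success":
        # Assumption: any other status means there is nothing to report.
        return []

    flagged = []
    for result in safety.get("results", []):
        timestamp = result.get("timestamp", {})
        for entry in result.get("labels", []):
            if entry["confidence"] >= min_confidence:
                flagged.append({
                    "label": entry["label"],
                    "confidence": entry["confidence"],
                    # Timestamps are passed through exactly as returned.
                    "start": timestamp.get("start"),
                    "end": timestamp.get("end"),
                })
    return flagged


# Trimmed copy of the sample response shown above.
sample = {
    "status": "completed",
    "content_safety_labels": {
        "status": "success",
        "results": [
            {
                "text": "...",
                "labels": [{"confidence": 0.9986903071403503, "label": "accidents"}],
                "timestamp": {"start": 171710, "end": 200770},
            }
        ],
    },
}

for segment in flagged_segments(sample):
    print(segment["label"], segment["confidence"], segment["start"], segment["end"])
```

    Each entry in results covers one span of the transcript and can carry several labels, so the sketch emits one row per label and repeats the timestamp of the span on each row.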