Filler Words (Disfluencies)

    By default, the API filters Filler Words (also known as disfluencies), such as "um" and "uh", out of transcripts.

    Including Filler Words in your transcripts

    To include Filler Words in your transcripts, set the disfluencies parameter to true in your POST request to /v2/transcript, as shown below.

    curl --request POST \
        --url https://api.assemblyai.com/v2/transcript \
        --header 'authorization: YOUR-API-TOKEN' \
        --header 'content-type: application/json' \
        --data '{"audio_url": "https://app.assemblyai.com/static/media/phone_demo_clip_1.wav", "disfluencies": true}'
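The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `build_transcript_request` and `submit_transcript` are hypothetical, while the URL, headers, and JSON body mirror the curl command above.

```python
import json
import urllib.request

API_URL = "https://api.assemblyai.com/v2/transcript"


def build_transcript_request(audio_url, api_token, disfluencies=True):
    """Build (but do not send) the POST request for a new transcript.

    Hypothetical helper: it mirrors the curl example's headers and body.
    """
    payload = {"audio_url": audio_url, "disfluencies": disfluencies}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "authorization": api_token,
            "content-type": "application/json",
        },
        method="POST",
    )


def submit_transcript(request):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    request = build_transcript_request(
        "https://app.assemblyai.com/static/media/phone_demo_clip_1.wav",
        "YOUR-API-TOKEN",
    )
    # Inspect the request locally; call submit_transcript(request) to send it.
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Building the request separately from sending it makes it easy to check that `"disfluencies": true` is actually in the body before any network call is made.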

    Supported Filler Words

    The API will transcribe the following Filler Words:

    Transcription response with Filler Words

    Once the transcription has completed, the API returns its usual response, but with Filler Words included in both the transcription text and the words array.

    {
        "confidence": 0.956,
        "id": "5551722-f677-48a6-9287-39c0aafd9ac1",
        "status": "completed",
        "text": "You know um we should...",
        "words": [
            {
                "confidence": 1.0,
                "end": 440,
                "start": 0,
                "text": "You"
            },
            ...
            # Filler Words appear in both the words array and the transcription text
            {
                "confidence": 0.96,
                "end": 10060,
                "start": 9600,
                "text": "um"
            },
            {
                "confidence": 0.97,
                "end": 10260,
                "start": 10080,
                "text": "or."
            }
        ]
    }
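Once a response like the one above has been parsed, the Filler Words can be picked out of the words array together with their timestamps. The sketch below is illustrative only: `find_filler_words` is a hypothetical helper, and the filler set contains just the two examples named earlier in this section ("um" and "uh"), so it is an assumption rather than the API's full supported list.

```python
import string

# Assumption: only "um" and "uh" are named in this section.
# Extend this set to match the API's supported list.
ASSUMED_FILLERS = frozenset({"um", "uh"})


def find_filler_words(transcript, fillers=ASSUMED_FILLERS):
    """Return the entries of transcript["words"] whose text is a filler word.

    Text is lowercased and stripped of surrounding punctuation before
    comparison, so "Um," and "um" both match.
    """
    matches = []
    for word in transcript.get("words", []):
        normalized = word["text"].strip(string.punctuation).lower()
        if normalized in fillers:
            matches.append(word)
    return matches


if __name__ == "__main__":
    # Trimmed-down version of the sample response shown above.
    sample = {
        "status": "completed",
        "words": [
            {"confidence": 1.0, "end": 440, "start": 0, "text": "You"},
            {"confidence": 0.96, "end": 10060, "start": 9600, "text": "um"},
            {"confidence": 0.97, "end": 10260, "start": 10080, "text": "or."},
        ],
    }
    for word in find_filler_words(sample):
        print(word["text"], word["start"], word["end"])  # prints: um 9600 10060
```

The start and end values are passed through unchanged from the response, so they can be used directly to locate each filler in the audio.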