/v2/transcript

    You can use this endpoint to submit audio URLs for transcription, and to retrieve the status or result of a transcription. Like all endpoints, it expects to receive JSON data with the 'content-type: application/json' header.

    POST Params

    When making a POST request to this endpoint, you can include the following parameters.

    audio_url (required)
        Example: "http://foo.bar/audio.wav"
        A URL that points to your audio file. This URL must be accessible by our servers. To upload a file instead, see the guide here.

    acoustic_model (optional)
        Example: "assemblyai_default"
        The name of the Acoustic Model you want to use. For more info, see here. By default, we'll use our default model, assemblyai_default.

    language_model (optional)
        Example: "my_custom_lm"
        The name of the custom Language Model you want to use. For more info, see here. By default, we'll use our default Language Model, assemblyai_default.

    format_text (optional)
        Example: true
        Toggles the option to automatically case proper nouns and convert numbers to digits ("seven" -> "7"). Set to false to disable this feature.

    punctuate (optional)
        Example: true
        Toggles the option to automatically add punctuation to the transcription text. Set to false to disable this feature.

    dual_channel (optional)
        Example: false
        If you are working with dual channel audio files, set this to true to transcribe each channel separately. For more info, see here.

    webhook_url (optional)
        Example: "http://myserver.com/receive"
        Instead of polling for the status of your transcription, we will make a POST request to your webhook_url when your transcription is ready. For more info, see here.

    audio_start_from (optional)
        Example: 8000
        Seek to this time in your audio file, in milliseconds, before we start transcribing. You're only charged for the duration of audio that is transcribed.

    audio_end_at (optional)
        Example: 20000
        Stop transcribing your audio file when we reach this time, in milliseconds. You're only charged for the duration of audio that is transcribed.
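    The parameters above can be combined into a single JSON body for the POST request. The sketch below builds (but does not send) such a request using only the Python standard library; the API key and audio URL are placeholders you would replace with your own values.

```python
import json
import urllib.request

API_KEY = "your-api-key"  # placeholder; use your real API key


def build_transcript_request(audio_url, **optional_params):
    """Build an (unsent) POST request for /v2/transcript.

    Any of the optional parameters from the table above (punctuate,
    format_text, dual_channel, webhook_url, audio_start_from, ...)
    can be passed as keyword arguments.
    """
    body = {"audio_url": audio_url, **optional_params}
    return urllib.request.Request(
        "https://api.assemblyai.com/v2/transcript",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "authorization": API_KEY,
            "content-type": "application/json",
        },
        method="POST",
    )


req = build_transcript_request(
    "http://foo.bar/audio.wav",
    punctuate=True,
    format_text=True,
    audio_start_from=8000,
    audio_end_at=20000,
)
# response = urllib.request.urlopen(req)  # uncomment to actually submit
```

    Note that only audio_url is required; any optional parameter you omit simply falls back to its default.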

    POST Response

    After a successful POST, the API will respond with the following JSON. The most important fields to note are "status" and "id". You'll need the "id" field to make GET requests against the API to retrieve the status/result of your transcription as it completes.

    For a complete rundown of what all the below fields mean, view the table in the GET Response section below.

    {
        "acoustic_model": "assemblyai_default",
        "audio_duration": null,
        "audio_url": "https://s3-us-west-2.amazonaws.com/blog.assemblyai.com/audio/8-7-2018-post/7510.mp3",
        "confidence": null,
        "dual_channel": null,
        "format_text": true,
        "id": "5551722-f677-48a6-9287-39c0aafd9ac1",
        "language_model": "assemblyai_default",
        "punctuate": true,
        "status": "queued",
        "text": null,
        "utterances": null,
        "webhook_status_code": null,
        "webhook_url": null,
        "words": null
    }

    GET Params

    When making GET requests against this endpoint, you need to include the "id" of your transcript in the URL. For example, this would be the full URL to GET one of your transcripts:

    https://api.assemblyai.com/v2/transcript/5551722-f677-48a6-9287-39c0aafd9ac1
    Make sure you include the authorization header in the GET request! You can read more about Authentication here.
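    A minimal GET request can be sketched as follows, again using only the Python standard library. The API key and transcript id are placeholders; the id is the one returned by your POST request.

```python
import json
import urllib.request

API_KEY = "your-api-key"  # placeholder; use your real API key
TRANSCRIPT_ID = "5551722-f677-48a6-9287-39c0aafd9ac1"  # from the POST response

req = urllib.request.Request(
    f"https://api.assemblyai.com/v2/transcript/{TRANSCRIPT_ID}",
    headers={"authorization": API_KEY},  # don't forget the auth header
)
# with urllib.request.urlopen(req) as resp:
#     transcript = json.load(resp)
#     print(transcript["status"])  # "queued", "processing", "completed", or "error"
```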

    GET Response

    When you make a GET request, the API will respond with the following JSON response. Most of the keys will only be filled out once the "status" is "completed".

    id
        Example: "5551722-f677-48a6-9287-39c0aafd9ac1"
        The unique id of your transcription.

    status
        Example: "completed"
        The status of your transcription. Will be either "queued", "processing", "completed", or "error". If "error", see the "error" field below.

    audio_url
        Example: "http://foo.bar/audio.wav"
        The URL of the audio file that was transcribed.

    acoustic_model
        Example: "assemblyai_default"
        The name of the Acoustic Model that was used.

    language_model
        Example: "my_custom_lm"
        The name of the custom Language Model that was used, if you supplied one.

    format_text
        Example: true
        Whether text formatting was applied (true or false).

    punctuate
        Example: true
        Whether punctuation was applied (true or false).

    dual_channel
        Example: false
        Whether the channels were transcribed separately (true or false).

    webhook_url
        Example: "http://myserver.com/receive"
        The URL we'll send webhooks to when your transcript is complete, if you supplied one.

    audio_duration
        Example: 12.09
        How many seconds of audio we found in your audio file.

    confidence
        Example: 0.956
        The confidence score of the entire transcription, between 0 and 1.

    text
        Example: "You know Demons on TV like..."
        The complete transcription of your audio.

    utterances
        Example: [...]
        Available when transcribing dual channel audio. For more info, see the guide here.

    webhook_status_code
        Example: 200
        The status code we received when making a POST request to your webhook_url. A code of 999 means we were unable to reach the webhook_url you supplied.

    words
        Example: [{"confidence": 1.0, "end": 440, "start": 0, "text": "You"}, ...]
        An array of objects with information for each word in the transcription text, including the start/end time (in milliseconds) and the confidence score of each word.

    error
        Example: "Unable to reach audio URL"
        Present when the "status" is "error"; contains more information about why the transcription failed. You can usually use the error message to fix whatever is causing the transcript to fail, for example, an unreachable audio URL or a URL that points to text/HTML rather than audio.