Getting Started

    Overview

    With the AssemblyAI API, you can quickly and accurately transcribe audio files and real-time audio streams. To get started, you'll need an API token, which you can get by signing up for the API.

    Submit an audio file for transcription

    Here's a quick example that shows how to transcribe an audio file that's hosted on a publicly available URL.
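
A minimal Python sketch of such a request is shown here. It assumes the v2 endpoint `https://api.assemblyai.com/v2/transcript`, an `authorization` header that carries your API token, and a JSON body with an `audio_url` field. None of these are stated on this page, so check them against the API reference. `build_transcript_request` and `submit_transcript` are hypothetical helper names used only for illustration.

```python
import json
import urllib.request

# Assumed endpoint; verify it against the API reference.
TRANSCRIPT_ENDPOINT = "https://api.assemblyai.com/v2/transcript"


def build_transcript_request(audio_url, api_token):
    """Build (but do not send) a POST request that submits audio_url."""
    body = json.dumps({"audio_url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        TRANSCRIPT_ENDPOINT,
        data=body,
        headers={
            "authorization": api_token,  # assumed header name
            "content-type": "application/json",
        },
        method="POST",
    )


def submit_transcript(audio_url, api_token):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_transcript_request(audio_url, api_token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `submit_transcript(audio_url, api_token)` with a publicly reachable audio URL and your own token should return a dictionary whose `id` and `status` fields correspond to the ones discussed in this section.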

    You'll get a response like this:

    {
        "acoustic_model": "assemblyai_default",
        "audio_duration": null,
        "audio_url": "https://s3-us-west-2.amazonaws.com/blog.assemblyai.com/audio/8-7-2018-post/7510.mp3",
        "confidence": null,
        "dual_channel": null,
        "format_text": true,
        "id": "5551722-f677-48a6-9287-39c0aafd9ac1",
        "language_model": "assemblyai_default",
        "punctuate": true,
        "status": "queued",
        "text": null,
        "utterances": null,
        "webhook_status_code": null,
        "webhook_url": null,
        "words": null
    }

    Note two fields in the response: the id ("5551722-f677-48a6-9287-39c0aafd9ac1") and the status ("queued"). When you run this API command, your id will be different. Now that the audio file has been submitted for transcription, you can poll with GET requests to check the status of the transcription and, eventually, retrieve its result.
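
The polling step can be sketched in Python as follows. This is an illustration under assumptions, not the official client: the URL pattern `https://api.assemblyai.com/v2/transcript/<id>` and the `authorization` header are assumed, and `fetch_transcript` and `poll_until_completed` are hypothetical names. Only the "queued", "processing", and "completed" statuses named on this page are treated as expected; anything else raises an error.

```python
import json
import time
import urllib.request

# Assumed endpoint; verify it against the API reference.
TRANSCRIPT_ENDPOINT = "https://api.assemblyai.com/v2/transcript"


def fetch_transcript(transcript_id, api_token):
    """GET the current state of one transcript as a dict."""
    request = urllib.request.Request(
        f"{TRANSCRIPT_ENDPOINT}/{transcript_id}",
        headers={"authorization": api_token},  # assumed header name
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def poll_until_completed(fetch, interval_seconds=3.0, max_attempts=100,
                         sleep=time.sleep):
    """Call fetch() repeatedly until the transcript status is "completed".

    fetch is any zero-argument callable returning the transcript dict,
    which keeps the loop independent of the HTTP layer.
    """
    for _ in range(max_attempts):
        transcript = fetch()
        status = transcript["status"]
        if status == "completed":
            return transcript
        if status not in ("queued", "processing"):
            raise RuntimeError(f"unexpected transcript status: {status!r}")
        sleep(interval_seconds)  # wait before asking again
    raise TimeoutError(f"transcript not completed after {max_attempts} polls")
```

A typical call would be `poll_until_completed(lambda: fetch_transcript(transcript_id, api_token))`. The polling interval and attempt limit are arbitrary defaults chosen for the sketch.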

    Get the transcription result

    As you poll, you'll see the status go from "queued" to "processing" to "completed". Once the status is "completed", you'll see a full JSON response like this:

    {
        "acoustic_model": "assemblyai_default",
        "audio_duration": 12.0960090702948,
        "audio_url": "https://s3-us-west-2.amazonaws.com/blog.assemblyai.com/audio/8-7-2018-post/7510.mp3",
        "confidence": 0.956,
        "dual_channel": null,
        "format_text": true,
        "id": "5551722-f677-48a6-9287-39c0aafd9ac1",
        "language_model": "assemblyai_default",
        "punctuate": true,
        "status": "completed",
        "text": "You know Demons on TV like that and and for people to expose themselves to being rejected on TV or humiliated by fear factor or.",
        "utterances": null,
        "webhook_status_code": null,
        "webhook_url": null,
        "words": [
            {
                "confidence": 1.0,
                "end": 440,
                "start": 0,
                "text": "You"
            },
            ...
            {
                "confidence": 0.96,
                "end": 10060,
                "start": 9600,
                "text": "factor"
            },
            {
                "confidence": 0.97,
                "end": 10260,
                "start": 10080,
                "text": "or."
            }
        ]
    }
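
Once the response is complete, the `words` array can be turned into per-word timings. The sketch below assumes that `start` and `end` are in milliseconds; this page does not state the unit, but the last word ending at 10260 with an `audio_duration` of about 12.1 is consistent with that reading. `word_timings` is a hypothetical helper name.

```python
def word_timings(transcript):
    """Return a list of (text, start_seconds, end_seconds) tuples.

    Assumes "start" and "end" are milliseconds. Returns an empty list
    while "words" is still null (for example, when status is "queued").
    """
    words = transcript.get("words") or []
    return [(w["text"], w["start"] / 1000, w["end"] / 1000) for w in words]


# Two of the words from the completed response shown in this section.
sample = {
    "words": [
        {"confidence": 1.0, "end": 440, "start": 0, "text": "You"},
        {"confidence": 0.97, "end": 10260, "start": 10080, "text": "or."},
    ]
}
for text, start, end in word_timings(sample):
    print(f"{start:6.2f}s - {end:6.2f}s  {text}")
```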

    Next Steps

    That's all there is to it for the simplest transcription task! Next up, you can learn about more advanced features like: