Transcribing dual channel/stereo recordings

    If you have a dual channel audio file, for example a phone call recording with the agent on one channel and the customer on the other, the API supports transcribing each channel separately.

    Submit a dual channel audio file for transcription

    Dual channel transcriptions take about 25% longer to complete than single channel ones, because each channel is transcribed separately, which adds some overhead.
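
    A minimal sketch of the submission request in Python, using only the standard library. The endpoint URL, the "authorization" header, and the "audio_url" and "dual_channel" request fields are assumptions inferred from the response shown later in this guide; check them against the API reference before relying on them.

```python
import json
import urllib.request

# Assumed endpoint; not stated in this guide.
API_URL = "https://api.assemblyai.com/v2/transcript"


def build_submit_request(audio_url: str, api_key: str) -> urllib.request.Request:
    """Build a POST request asking for each channel to be transcribed separately."""
    # "dual_channel" is assumed to be the request flag, mirroring the response field.
    payload = {"audio_url": audio_url, "dual_channel": True}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )


def submit(audio_url: str, api_key: str) -> dict:
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(build_submit_request(audio_url, api_key)) as resp:
        return json.load(resp)
```

    Calling submit(...) with your own audio URL and API key should return a JSON object whose "id" you can use to fetch the result later.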

    Get the transcription result

    Once your transcription is complete, you can GET the result as you would for any other transcription.
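
    As a sketch, fetching the result could look like the following. The endpoint URL, the "authorization" header, and the "error" status value are assumptions not stated in this guide; only the "completed" status appears in the response shown here. The fetch parameter is a hypothetical hook that makes the polling loop easy to test.

```python
import json
import time
import urllib.request

# Assumed endpoint; not stated in this guide.
API_URL = "https://api.assemblyai.com/v2/transcript"


def build_result_request(transcript_id: str, api_key: str) -> urllib.request.Request:
    """Build the GET request for a single transcript."""
    return urllib.request.Request(
        f"{API_URL}/{transcript_id}",
        headers={"authorization": api_key},
        method="GET",
    )


def fetch_result(transcript_id: str, api_key: str) -> dict:
    """Perform the GET request and return the parsed JSON response."""
    with urllib.request.urlopen(build_result_request(transcript_id, api_key)) as resp:
        return json.load(resp)


def wait_for_result(transcript_id, api_key, fetch=fetch_result, poll_seconds=5.0):
    """Poll until the transcript reaches a final status, then return it."""
    while True:
        result = fetch(transcript_id, api_key)
        # "error" is an assumed terminal status alongside "completed".
        if result["status"] in ("completed", "error"):
            return result
        time.sleep(poll_seconds)
```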

    You'll get a response like the JSON below. The "utterances" key contains a list of turn-by-turn utterances, as they appeared in the audio recording. Each object in the "utterances" list carries a "channel" value (either "1" or "2"), so you can tell which channel each utterance came from. Each entry in a "words" array also carries the "channel" key.

    {
        "acoustic_model": "assemblyai_default",
        "audio_duration": 150.766167800454,
        "audio_url": "https://app.assemblyai.com/static/media/phone_demo_clip_1.wav",
        "confidence": 0.922175805047867,
        "dual_channel": true,
        "format_text": true,
        "id": "5552830-d8b1-4e60-a2b4-bdfefb3130b3",
        "language_model": "assemblyai_default",
        "punctuate": true,
        "status": "completed",
        "text": "Hi, I'm joy. Hi, I'm sharon. Do you have kids in school. ...",
        "utterances": [
            {
                "channel": "1",
                "confidence": 0.97,
                "end": 1380,
                "speaker": "1",
                "start": 0,
                "text": "Hi, I'm joy.",
                "words": [
                    {
                        "channel": "1",
                        "confidence": 1.0,
                        "end": 320,
                        "speaker": "1",
                        "start": 0,
                        "text": "Hi,"
                    },
                    ...
                ]
            },
            {
                "channel": "2",
                "confidence": 0.94,
                "end": 3260,
                "speaker": "2",
                "start": 0,
                "text": "Hi, I'm sharon.",
                "words": [
                    {
                        "channel": "2",
                        "confidence": 1.0,
                        "end": 480,
                        "speaker": "2",
                        "start": 0,
                        "text": "Hi,"
                    },
                    ...
                ]
            },
            {
                "channel": "1",
                "confidence": 0.94,
                "end": 5420,
                "speaker": "1",
                "start": 2820,
                "text": "Do you have kids in school.",
                "words": [
                    {
                        "channel": "1",
                        "confidence": 1.0,
                        "end": 4300,
                        "speaker": "1",
                        "start": 2820,
                        "text": "Do"
                    },
                    ...
                ]
            },
            {
                "channel": "2",
                "confidence": 0.94,
                "end": 7380,
                "speaker": "2",
                "start": 3600,
                "text": "I have grandchildren in school.",
                "words": [
                    {
                        "channel": "2",
                        "confidence": 1.0,
                        "end": 3680,
                        "speaker": "2",
                        "start": 3600,
                        "text": "I"
                    },
                    ...
                ]
            }
        ],
        "webhook_status_code": null,
        "webhook_url": null,
        "words": [
            {
                "channel": "1",
                "confidence": 1.0,
                "end": 320,
                "speaker": "1",
                "start": 0,
                "text": "Hi,"
            },
            {
                "channel": "2",
                "confidence": 1.0,
                "end": 480,
                "speaker": "2",
                "start": 0,
                "text": "Hi,"
            },
            ...
        ]
    }
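
    To illustrate how the "channel" key can be used, here is a short Python sketch that groups utterance text by channel. The function name is hypothetical, and the sample data is a trimmed copy of the response above.

```python
from collections import defaultdict


def utterances_by_channel(transcript: dict) -> dict:
    """Map each channel ("1" or "2") to the list of utterance texts spoken on it."""
    grouped = defaultdict(list)
    for utterance in transcript.get("utterances", []):
        grouped[utterance["channel"]].append(utterance["text"])
    return dict(grouped)


# Trimmed copy of the example response, keeping only the fields used here.
sample = {
    "utterances": [
        {"channel": "1", "start": 0, "text": "Hi, I'm joy."},
        {"channel": "2", "start": 0, "text": "Hi, I'm sharon."},
        {"channel": "1", "start": 2820, "text": "Do you have kids in school."},
        {"channel": "2", "start": 3600, "text": "I have grandchildren in school."},
    ]
}

for channel, texts in utterances_by_channel(sample).items():
    print(f"Channel {channel}: {' '.join(texts)}")
```

    For a call recording, you could map channel "1" and "2" to labels such as agent and customer, depending on how your recordings are laid out.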