Auto Chapters (Summarization)

    Auto Chapters provides a "summary over time" for audio content transcribed with AssemblyAI's Speech-to-Text API. It works by first breaking audio/video files into logical "chapters" as the topic of conversation changes, and then generating a summary for each "chapter" of content.

    For more information on Auto Chapters, check out the announcement on our blog.

    Using the Auto Chapters Feature

    When requesting a transcription with the AssemblyAI API, include "auto_chapters": true in the JSON body of your POST request. For example, in cURL:

    curl --request POST \
      --url https://api.assemblyai.com/v2/transcript \
      --header 'authorization: YOUR-API-TOKEN' \
      --header 'content-type: application/json' \
      --data '{"audio_url": "https://foo.bar/7510.mp3", "auto_chapters": true}'
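
    The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper name build_transcript_request is hypothetical, and the endpoint, headers, and body are copied from the cURL example above. The function only builds the request, so it can be inspected without network access.

```python
import json
import urllib.request

# Endpoint taken from the cURL example above.
API_URL = "https://api.assemblyai.com/v2/transcript"


def build_transcript_request(api_token: str, audio_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request with Auto Chapters enabled.

    Hypothetical helper for illustration; not part of any AssemblyAI SDK.
    """
    payload = {"audio_url": audio_url, "auto_chapters": True}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "authorization": api_token,
            "content-type": "application/json",
        },
        method="POST",
    )


# Sending the request needs a valid token and network access:
# request = build_transcript_request("YOUR-API-TOKEN", "https://foo.bar/7510.mp3")
# with urllib.request.urlopen(request) as response:
#     transcript = json.load(response)
```

    Keeping request construction separate from sending makes the payload easy to check before any API call is made.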

    Interpreting the response

    When your transcription is completed, you'll see a chapters key in the JSON response, like below:

    {
        "audio_duration": 12.0960090702948,
        "audio_url": "https://s3-us-west-2.amazonaws.com/blog.assemblyai.com/audio/8-7-2018-post/7510.mp3",
        "confidence": 0.956,
        "id": "5551722-f677-48a6-9287-39c0aafd9ac1",
        "status": "completed",
        "text": "The American job plan ...",
        // auto chapter results can be found in the JSON result here
        "chapters": [
            {
                "start": 0,
                "end": 20000,
                "summary": "The American job plan is going to create millions of good paying jobs. Jobs created in an American jobs plan do not require a College degree. 75% don't require an associate's degree.",
                "headline": "The American job plan is going to create millions of good paying jobs."
            }
            ...
        ]
    }
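
    As a rough sketch of reading this response in Python, the helper below returns the chapter list only once the transcript is marked completed. The function name extract_chapters is an assumption made for illustration, and the sample dictionary is a trimmed copy of the response shown above.

```python
from typing import Any


def extract_chapters(transcript: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the chapter list, or [] if the transcript is not completed or has no chapters.

    Hypothetical helper; it relies only on the "status" and "chapters" keys shown above.
    """
    if transcript.get("status") != "completed":
        return []
    return transcript.get("chapters") or []


# Trimmed copy of the example response.
sample_transcript = {
    "status": "completed",
    "text": "The American job plan ...",
    "chapters": [
        {
            "start": 0,
            "end": 20000,
            "summary": "The American job plan is going to create millions of good paying jobs.",
            "headline": "The American job plan is going to create millions of good paying jobs.",
        }
    ],
}

chapters = extract_chapters(sample_transcript)
```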

    Isolating the chapters key for a moment, we can drill into the JSON response here:

    "chapters": [
        {
            "start": 0,
            "end": 20000,
            "summary": "The American job plan is going to create millions of good paying jobs. Jobs created in an American jobs plan do not require a College degree. 75% don't require an associate's degree.",
            "headline": "The American job plan is going to create millions of good paying jobs."
        }
        ...
    ]

    For each detected chapter, the API returns the start and end timestamps (in milliseconds), a summary of a few sentences covering the content spoken during that timeframe, and a short headline, which can be thought of as a "summary of the summary".
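
    Because the timestamps are in milliseconds, a common use is to turn the chapters into a readable outline. The sketch below is one possible way to do that; format_timestamp and chapter_outline are hypothetical names, and the mm:ss display format is an assumption rather than anything the API prescribes.

```python
def format_timestamp(ms: int) -> str:
    """Convert a millisecond offset into a zero-padded mm:ss string."""
    minutes, seconds = divmod(ms // 1000, 60)
    return f"{minutes:02d}:{seconds:02d}"


def chapter_outline(chapters: list[dict]) -> list[str]:
    """Build one outline line per chapter: 'start-end  headline'."""
    return [
        f"{format_timestamp(c['start'])}-{format_timestamp(c['end'])}  {c['headline']}"
        for c in chapters
    ]


example_chapters = [
    {
        "start": 0,
        "end": 20000,
        "headline": "The American job plan is going to create millions of good paying jobs.",
    }
]

for line in chapter_outline(example_chapters):
    print(line)
# prints: 00:00-00:20  The American job plan is going to create millions of good paying jobs.
```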

    Reviewing the response

    Key        Value
    start      Starting timestamp (in milliseconds) of the portion of audio being summarized
    end        Ending timestamp (in milliseconds) of the portion of audio being summarized
    summary    A 2-3 sentence summary of the content spoken during this timeframe
    headline   A single sentence summary of the content spoken during this timeframe
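
    The four keys above map naturally onto a small typed record. The following is a minimal sketch, assuming every chapter carries all four keys; the Chapter class and its duration_ms property are illustrative additions, not part of the API.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Chapter:
    """One chapter entry, mirroring the keys in the table above."""

    start: int      # starting timestamp, in milliseconds
    end: int        # ending timestamp, in milliseconds
    summary: str    # 2-3 sentence summary
    headline: str   # single sentence summary

    @property
    def duration_ms(self) -> int:
        """Length of the chapter in milliseconds (derived, not returned by the API)."""
        return self.end - self.start

    @classmethod
    def from_dict(cls, raw: dict[str, Any]) -> "Chapter":
        """Build a Chapter from one element of the "chapters" list.

        Raises KeyError if any of the four documented keys is missing.
        """
        return cls(
            start=int(raw["start"]),
            end=int(raw["end"]),
            summary=str(raw["summary"]),
            headline=str(raw["headline"]),
        )


first_chapter = Chapter.from_dict(
    {
        "start": 0,
        "end": 20000,
        "summary": "The American job plan is going to create millions of good paying jobs.",
        "headline": "The American job plan is going to create millions of good paying jobs.",
    }
)
```

    Failing loudly on a missing key is a deliberate choice here, so that a malformed response is caught where it is parsed rather than later in the program.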