Auto-detecting key phrases/words in the transcription text

    The API can automatically detect key phrases and words in your transcription text using the Automatic Transcript Highlights feature.

    For example, consider the following text:

    Hi I'm joy. Hi I'm Sharon. Do you have kids in school? I have grandchildren in school. Okay, well, my kids are in middle school in high school. Do you think there is anything wrong with the school system? Overcrowding, of course,...

    Automatic Transcript Highlights will detect the following key phrases/words in this text:

    "high school"
    "middle school" 
    "kids"
    ...

    This feature can be used to show a "summary" of the transcription text, to automatically provide tags for content you are transcribing, and for a number of other use cases.

    Submit an audio file and enable Automatic Transcript Highlights

    Transcripts can take up to 30 seconds longer to complete when Automatic Transcript Highlights is enabled.
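
    As a rough sketch of the submission step, the request might be built and sent as shown below. The endpoint URL, the "authorization" header, and the "auto_highlights" flag name are assumptions that this guide does not spell out, so check them against the API reference before relying on them. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint: this guide does not state the URL explicitly.
TRANSCRIPT_ENDPOINT = "https://api.assemblyai.com/v2/transcript"


def build_transcript_request(audio_url: str) -> dict:
    """Build the JSON body for a transcription request with highlights enabled."""
    return {
        "audio_url": audio_url,
        # Assumed flag name for enabling Automatic Transcript Highlights.
        "auto_highlights": True,
    }


def submit_transcript(audio_url: str, api_key: str) -> dict:
    """POST the request and return the decoded JSON response."""
    body = json.dumps(build_transcript_request(audio_url)).encode("utf-8")
    request = urllib.request.Request(
        TRANSCRIPT_ENDPOINT,
        data=body,
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

    Keeping the payload in its own function makes it easy to inspect or log the request body before anything is sent over the network.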

    Get the transcription result and highlights

    Once your transcription is complete, you can GET the result from the API.
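
    One possible way to retrieve the result is to poll the transcript until it finishes. This is a minimal sketch under assumptions: the endpoint URL, the "authorization" header, and the "error" transcript status are not stated in this guide, and the function names are hypothetical.

```python
import json
import time
import urllib.request

# Assumed endpoint: this guide does not state the URL explicitly.
TRANSCRIPT_ENDPOINT = "https://api.assemblyai.com/v2/transcript"


def fetch_transcript(transcript_id: str, api_key: str) -> dict:
    """GET the current state of a transcript as a decoded JSON object."""
    request = urllib.request.Request(
        f"{TRANSCRIPT_ENDPOINT}/{transcript_id}",
        headers={"authorization": api_key},
        method="GET",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_transcript(transcript_id, api_key, fetch=fetch_transcript,
                        poll_seconds=3.0, max_polls=100):
    """Poll until the transcript reaches a final status, then return it."""
    for _ in range(max_polls):
        transcript = fetch(transcript_id, api_key)
        # "completed" appears in the sample response; "error" is an assumption.
        if transcript.get("status") in ("completed", "error"):
            return transcript
        time.sleep(poll_seconds)
    raise TimeoutError(f"transcript {transcript_id} did not finish in time")
```

    The fetch parameter exists only so the polling loop can be exercised with a stand-in function instead of a live request.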

    Pro tip: If you haven't gone through the Quickstart yet, work through it first and then come back to this guide!

    The response will look like the JSON below.

    {
        "id": "5552830-d8b1-4e60-a2b4-bdfefb3130b3",
        "status": "completed",
        "text": "Hi, I'm joy. Hi, I'm sharon. Do you have kids in school. ...",
        "auto_highlights_result": {
            "status": "success",
            "results": [
                {
                    "count": 2, 
                    "text": "high school", 
                    "rank": 0.13, 
                    "timestamps": [
                        {
                            "start": 62340, 
                            "end": 62840
                        }
                    ]
                },
                ...              
            ]
        },
        ...
    }

    The "auto_highlights_result" key in the JSON response contains the key phrases/words the API found in your transcription text. Here is a closer look at that key's value, with each field explained (the # comments are annotations, not part of the actual response):

    ...
    "auto_highlights_result": {
        # if the highlights feature failed, this "status" will be "error"
        # and the results below will be an empty list 
        "status": "success",
    
        # a list of all the highlights found in your transcription text
        "results": [
            {
                # the phrase/word itself that was detected
                "text": "high school", 
                # how many times this phrase occurred in the text 
                "count": 2,
                # the relevancy of this phrase; a higher score means more relevant
                "rank": 0.13, 
                # a list of all the timestamps, in milliseconds, in the audio where this highlight occurs 
                "timestamps": [
                    {
                        "start": 62340, 
                        "end": 62840
                    }
                ]
            }, 
            ...
        ]
    }
    ...
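
    To illustrate how these fields might be consumed, the sketch below orders the highlights by "rank" and converts the millisecond timestamps to seconds. The function name and the output shape are hypothetical; the sample data mirrors the response shown in this guide.

```python
from typing import Any


def summarize_highlights(transcript: dict[str, Any]) -> list[dict[str, Any]]:
    """Return highlights ordered by rank (highest first), with start times in seconds."""
    highlights = transcript.get("auto_highlights_result") or {}
    # If the feature failed, "status" is "error" and there is nothing to report.
    if highlights.get("status") != "success":
        return []
    ranked = sorted(
        highlights.get("results", []),
        key=lambda item: item["rank"],
        reverse=True,
    )
    return [
        {
            "text": item["text"],
            "count": item["count"],
            "rank": item["rank"],
            # Timestamps are in milliseconds; convert each start to seconds.
            "starts_seconds": [stamp["start"] / 1000 for stamp in item["timestamps"]],
        }
        for item in ranked
    ]


sample = {
    "status": "completed",
    "auto_highlights_result": {
        "status": "success",
        "results": [
            {
                "count": 2,
                "text": "high school",
                "rank": 0.13,
                "timestamps": [{"start": 62340, "end": 62840}],
            },
        ],
    },
}

for highlight in summarize_highlights(sample):
    print(highlight["text"], highlight["starts_seconds"])
```

    Returning an empty list when the status is not "success" lets callers treat a failed highlights run the same way as a transcript with no highlights.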