Verbs are the heart of nonspeech captions, especially when paralinguistic sounds are involved (grunting, laughing, crying, etc.), because captioning nonspeech is fundamentally about representing and embodying action, which is what verbs do. Note, first, the distinction between discrete and sustained sounds. Nonspeech sounds that have a clear beginning and end are discrete, or one-off, sounds […]