Abstract:
Environmental sound synthesis is a technique for generating natural environmental sounds. Conventional work on environmental sound synthesis using sound event labels cannot finely control the synthesized sounds, for example, their pitch and timbre. We consider that onomatopoeic words can be used for environmental sound synthesis. Onomatopoeic words are effective for describing the features of sounds, and we believe that using them enables us to control the fine time-frequency structure of synthesized sounds. However, no dataset is available for environmental sound synthesis using onomatopoeic words. In this paper, we therefore present RWCP-SSD-Onomatopoeia, a dataset consisting of 155,568 onomatopoeic words paired with audio samples for environmental sound synthesis. We have also collected self-reported confidence scores and others-reported acceptance scores for the onomatopoeic words, which help in selecting words.