SONATA

Results and Publications

Definition of naturalness

The term "naturalness" is used in a number of studies to evaluate speech synthesisers, but it is rare in the music acoustics literature. It is often used without a clear definition.

In both music and speech synthesis, it is fruitful to consider naturalness as the attribute that makes the listener think that the performer or speaker is human, i.e. that the sound is humanlike. We therefore introduce the term "humanlikeness" and treat it as a perceptual scale with a human performer at one end and a computer performance towards the other end.

[Review paper in preparation: Farner and Ternström]

Speech naturalness

This part of the project currently examines how cost functions may be adapted to the perceived naturalness of different types of discontinuities at a join in a vowel, in particular artifacts due to pitch and spectrum discontinuities. Preliminary results were presented at the Interspeech conference (Bjørkan, Svendsen, and Farner, 2005).
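As a rough illustration of what a cost function at a join can look like, the sketch below combines a spectral distance and a pitch distance into one weighted number. Everything in it is an assumption made for illustration: the function name join_cost, the weights w_spectrum and w_pitch, the Euclidean spectral distance, and the pitch difference in octaves are not taken from the project, and the perceptually adapted cost studied here may look quite different.

```python
import math


def join_cost(left, right, w_spectrum=1.0, w_pitch=1.0):
    """Hypothetical join cost between the frames on either side of a join.

    `left` and `right` are dicts with a "spectrum" feature vector
    (list of floats) and a fundamental frequency "f0" in Hz.
    The weights are illustrative placeholders, not fitted values.
    """
    # Spectral discontinuity: Euclidean distance between feature vectors.
    spectral = math.dist(left["spectrum"], right["spectrum"])
    # Pitch discontinuity: absolute F0 jump measured in octaves.
    pitch = abs(math.log2(right["f0"] / left["f0"]))
    return w_spectrum * spectral + w_pitch * pitch


# Example frames (made-up numbers, not measured data).
frame_a = {"spectrum": [0.0, 0.0], "f0": 100.0}
frame_b = {"spectrum": [3.0, 4.0], "f0": 200.0}
cost_ab = join_cost(frame_a, frame_b)
cost_aa = join_cost(frame_a, frame_a)
```

Adapting such a cost to perception would then amount to choosing the distance measures and weights so that the cost tracks listeners' naturalness judgements.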

Music naturalness

In music synthesis, humanlike gestures and a good instrument model appear to be important factors for generating natural sound. As a first approach, we have assumed that a MIDI-controlled digital clarinet synthesiser based on physical models is a sufficiently good instrument model. The advantage of this approach is that the model may be played by a human musician or by a computer, so that their performances may be compared directly. A human musician was asked to play the same melody 20 times with the same expression, and the playing parameters (MIDI) were recorded. The systematic behaviour of the parameters was extracted by averaging, and the fluctuation was quantified by the standard deviation. When the averaged parameters were reinjected (the natural fluctuations thus having been removed), part of the naturalness seemed to be lost, indicating that such fluctuations contribute to naturalness (Farner, Kronland-Martinet, Voinier, and Ystad, 2005 and 2006a).
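The averaging step described above can be sketched as follows. This is a minimal illustration, not the project's code: it assumes the repeated performances have already been time-aligned and resampled to the same number of frames (the text does not say how this was done), and the name summarize_takes and the example values are hypothetical.

```python
from statistics import mean, stdev


def summarize_takes(takes):
    """Per-frame mean and sample standard deviation across performances.

    `takes` is a list of equal-length sequences, one per performance,
    each holding one playing parameter (e.g. a MIDI controller value)
    per time frame. Time alignment is assumed to have been done already.
    """
    if len(takes) < 2:
        raise ValueError("need at least two takes to estimate fluctuation")
    n_frames = len(takes[0])
    if any(len(take) != n_frames for take in takes):
        raise ValueError("all takes must have the same number of frames")
    frames = list(zip(*takes))  # regroup values frame by frame
    systematic = [mean(frame) for frame in frames]
    fluctuation = [stdev(frame) for frame in frames]
    return systematic, fluctuation


# Three short made-up takes (the study used 20 performances).
takes = [[60, 64, 70], [62, 66, 68], [61, 65, 72]]
systematic, fluctuation = summarize_takes(takes)
```

Reinjecting `systematic` into the synthesiser corresponds to playing the melody with the take-to-take fluctuations removed.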

We planned to verify and quantify this with listening tests. The playing parameters (the blowing pressure, BC) from the above performances (unmodified condition V+F+) were modified along two orthogonal axes. On the first axis, the variation of the note velocity was removed (condition V-). On the second axis, the fluctuations during each note were either fixed to a level that depends on the maximum and the mean of the BC of that note (F-) or simplified to an ADSR envelope (F*). These modifications gave 6 versions of the melody (see figure below and listen to the resulting sound). The listeners are presented with 5 different performances in each of these conditions, 30 stimuli in total [Paper in preparation: Farner, Kronland-Martinet, Behne, Voinier, and Ystad].
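The stimulus design can be sketched in a few lines. The code below is only an illustration under stated assumptions: the exact rule for the fixed level in F- is not given here, so fix_level uses a hypothetical weighted mix of the note's maximum and mean; adsr_envelope is a generic piecewise-linear ADSR shape with hypothetical parameters; and the V- modification is not modelled at all.

```python
from itertools import product
from statistics import mean


def fix_level(bc_note, weight=0.5):
    """F- sketch: replace a note's BC curve by one constant level.

    The level mixes the note's maximum and mean; `weight` is a
    hypothetical parameter, not the rule used in the study.
    """
    level = weight * max(bc_note) + (1 - weight) * mean(bc_note)
    return [level] * len(bc_note)


def adsr_envelope(n, peak, sustain, attack, decay, release):
    """F* sketch: piecewise-linear ADSR envelope over n frames.

    `attack`, `decay` and `release` are durations in frames.
    """
    if attack + decay + release > n:
        raise ValueError("segments do not fit in the note")
    env = []
    for i in range(n):
        if i < attack:                      # rise to the peak
            env.append(peak * (i + 1) / attack)
        elif i < attack + decay:            # fall to the sustain level
            k = i - attack + 1
            env.append(peak + (sustain - peak) * k / decay)
        elif i < n - release:               # hold the sustain level
            env.append(sustain)
        else:                               # fall to zero
            k = i - (n - release) + 1
            env.append(sustain * (1 - k / release))
    return env


# 5 performances x 2 velocity conditions x 3 fluctuation conditions.
velocity_conditions = ["V+", "V-"]
fluctuation_conditions = ["F+", "F*", "F-"]
stimuli = [
    (performance, v + f)
    for performance, v, f in product(
        range(1, 6), velocity_conditions, fluctuation_conditions
    )
]

flat = fix_level([60, 80, 100])
env = adsr_envelope(10, peak=100, sustain=80, attack=2, decay=2, release=2)
```

The list `stimuli` has one entry per performance and condition, which reproduces the count of 30 stimuli given above.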

Listening examples:

Cond.	V+	V-
F+	V+F+	V-F+
F*	V+F*	V-F*
F-	V+F-	V-F-

References

I. Bjørkan, T. Svendsen, and S. Farner. Comparing Spectral Distance Measures for Join Cost Optimization in Concatenative Speech Synthesis. Proceedings of Interspeech 2005, Lisboa, Portugal, September 2005, pp. 2577-2580.
S. Farner, R. Kronland-Martinet, T. Voinier, and S. Ystad. Sound fluctuation as an attribute of naturalness in clarinet play. Proceedings of the International Symposium on Computer Music Modeling and Retrieval (CMMR) 2005, Pisa, Italy, September 2005, pp. 210-218.
S. Farner, R. Kronland-Martinet, T. Voinier, and S. Ystad. Timbre variation as an attribute of naturalness in clarinet play. Submitted for publication in Computer Music Modeling and Retrieval. International Symposium CMMR 2005. Lecture Notes in Computer Science Series. Springer-Verlag, Spring 2006a.
Publications in preparation:
S. Farner and S. Ternström. On the meaning of naturalness in synthesized speech and music. Paper in preparation.
S. Farner, R. Kronland-Martinet, D. Behne, T. Voinier, and S. Ystad. Influence of variations of blowing pressure on the perception of naturalness in clarinet play. Paper in preparation.

Home: http://www.pvv.ntnu.no/~farner, E-mail: farner(a)pvv.ntnu.no
Last modified: Mon Apr 18 13:23:08 CEST 2005