University of Cambridge - phonetics /taxonomy/subjects/phonetics en Time travelling to the mother tongue /research/features/time-travelling-to-the-mother-tongue <div class="field field-name-field-news-image field-type-image field-label-hidden"><div class="field-items"><div class="field-item even"><img class="cam-scale-with-grid" src="/sites/default/files/styles/content-580x288/public/news/research/features/160630spectrogram.jpg?itok=854Lwc4i" alt="" title="Spectrogram showing the shape of the sound of a word, Credit: John Aston" /></div></div></div><div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>No matter whether you speak English or Urdu, Walloon or Waziri, Portuguese or Persian, the roots of your language are the same. Proto-Indo-European (PIE) is the mother tongue – shared by several hundred contemporary languages, as well as many now extinct, and spoken by people who lived from about 6,000 to 3,500 BC on the steppes to the north of the Caspian Sea.</p> <p>They left no written texts and although historical linguists have, since the 19th century, painstakingly reconstructed the language from daughter languages, the question of how it actually sounded was assumed to be permanently out of reach.</p> <p>Now, researchers at the Universities of Cambridge and Oxford have developed a sound-based method to move back through the family tree of languages that stem from PIE. They can simulate how certain words would have sounded when they were spoken 8,000 years ago.</p> <p>Remarkably, at the heart of the technology is the statistics of shape.</p> <p>“Sounds have shape,” explains Professor John Aston, from Cambridge’s Statistical Laboratory. “As a word is uttered it vibrates air, and the shape of this soundwave can be measured and turned into a series of numbers. Once we have these stats, and the stats of another spoken word, we can start asking how similar they are and what it would take to shift from one to another.”</p> <p>A word said in a certain language will have a different shape to the same word in another language, or an earlier language. The researchers can shift from one shape to another through a series of small changes in the statistics. “It’s more than an averaging process, it’s a continuum from one sound to the other,” adds Aston, who is funded by the Engineering and Physical Sciences Research Council (EPSRC). “At each stage, we can turn the shape back into sound to hear how the word has changed.”</p> <p>Rather than reconstructing written forms of ancient words, the researchers triangulate backwards from contemporary and archival audio recordings to regenerate audible spoken forms from earlier points in the evolutionary tree. Using a relatively new field of shape-based mathematics, the researchers take the soundwave and visualise it as a spectrogram – basically an undulating three-dimensional surface that represents the shape of that sound – and then reshape the spectrogram along a trajectory ‘signposted’ by known sounds.</p> <p>While Aston leads the team of statistician ‘shape-shifters’ in Cambridge, the acoustic-phonetic and linguistic expertise is provided by Professor John Coleman’s group in Oxford.</p>
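<p><em>As an illustration of the morphing idea described above, the hedged Python sketch below treats a spectrogram as a surface and walks along a simple continuum between two recordings of the same word. It is emphatically not the researchers’ method – their work uses functional shape statistics and linguistically ‘signposted’ trajectories rather than plain linear interpolation – and the file names and analysis settings are illustrative assumptions.</em></p>
<pre><code>
# A minimal sketch, assuming two short recordings of the same word exist on disk.
# It interpolates between their log-magnitude spectrograms and resynthesises audio
# with the Griffin-Lim algorithm; the published work uses shape statistics, not this.
import numpy as np
import librosa
import soundfile as sf

SR = 16000              # analysis sample rate (assumption)
N_FFT, HOP = 1024, 256  # spectrogram resolution (assumption)

def log_spectrogram(path):
    """Load a recording and return its log-magnitude spectrogram (its 'shape')."""
    y, _ = librosa.load(path, sr=SR)
    return np.log1p(np.abs(librosa.stft(y, n_fft=N_FFT, hop_length=HOP)))

A = log_spectrogram("word_language_a.wav")   # hypothetical recording 1
B = log_spectrogram("word_language_b.wav")   # hypothetical recording 2

# Crudely equalise duration by truncating to the shorter recording;
# a proper treatment would align (register) the two surfaces instead.
frames = min(A.shape[1], B.shape[1])
A, B = A[:, :frames], B[:, :frames]

# Step along a continuum from sound A (t = 0.0) to sound B (t = 1.0),
# turning each intermediate shape back into audible sound.
for t in np.linspace(0.0, 1.0, 5):
    S_mix = np.expm1((1.0 - t) * A + t * B)             # back to linear magnitude
    y_mix = librosa.griffinlim(S_mix, hop_length=HOP)   # estimate a waveform
    sf.write(f"morph_{t:.2f}.wav", y_mix, SR)
</code></pre>
<p><em>Listening to the intermediate files gives a rough feel for what a sound-to-sound continuum means; the actual simulations reshape the spectrogram along trajectories signposted by known sounds, as described above, rather than interpolating naively.</em></p>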
<p>The researchers are working on the words for numbers as these have the same meaning in any language. The longest path of development simulated so far goes backwards 8,000 years from <a href="http://www.phon.ox.ac.uk/jcoleman/one-from-oins.wav">English <em>one</em> to its PIE ancestor <em>oinos</em></a>, and likewise for other numerals. They have also ‘gone forwards’ from the PIE <em>penkwe</em> to the modern Greek <em>pente</em>, modern Welsh <em>pump</em> and modern English <em>five</em>, as well as simulating change from Modern English to Anglo-Saxon (or vice versa), and from Modern Romance languages back to Latin.</p> <p><em>(Other audio demonstrations are available <a href="http://www.phon.ox.ac.uk/jcoleman/ancient-sounds-audio.html">here</a>)</em></p> <p>“We’ve explicitly focused on reproducing sound changes and etymologies that the established analyses already suggest, rather than seeking to overturn them,” says Coleman, whose research was funded by the Arts and Humanities Research Council.</p> <p>They have discovered words that appear to correctly ‘fall out’ of the continuum. “It’s pleasing, not because it overturns the received wisdom, but because it encourages us that we are getting something right, some of the time at least. And along the way there have also been a few surprises!” The method sometimes follows paths that do not seem to be etymologically correct, demonstrating that the approach is scientifically testable and pointing to areas in which refinements are needed.</p> <p>Remarkably, because the statistics describe the sound of an individual saying the word, the researchers are able to keep the characteristics of pitch and delivery the same. They can effectively turn the word spoken by someone in one language into what it would sound like if they were speaking fluently in another.</p> <p><img alt="" src="/sites/www.cam.ac.uk/files/inner-images/160630_horizontal_language_figure.jpg" style="width: 100%;" /></p> <p>They can also extrapolate into the future, although with caveats, as Coleman describes: “If you just extrapolate linearly, you’ll reach a point at which the sound change hits the limit of what is a humanly reasonable sound. This has happened in some languages in the past with certain vowel sounds.
But if you asked me what English will sound like in 300 years, my educated guess is that it will be hardly any different from today!”</p> <p>For the team, the excitement of the research includes unearthing some gems of archival recordings of various languages that had been given up for dead, including an Old Prussian word last spoken by people in the early 1700s but ‘borrowed’ into Low Prussian and discovered in a German audio archive.</p> <p>Their work has applications in automatic translation and film dubbing, as well as medical imaging (see panel), but the principal aim is for the technology to be used alongside traditional methods used by historical linguists to understand the process of language change over thousands of years.</p> <p>“From my point of view, it’s amazing that we can turn exciting yet highly abstract statistical theory into something that really helps explain the roots of modern language,” says Aston.</p> <p>“Now that we’ve developed many of the necessary technical methods for realising the extraordinary ambition of hearing ancient sounds once more,” adds Coleman, “these early successes are opening up a wide range of new questions, one of the most central being: how far back in time can we really go?”</p> <p><em>Audio demonstrations are available here: <a href="http://www.phon.ox.ac.uk/jcoleman/ancient-sounds-audio.html">www.phon.ox.ac.uk/jcoleman/ancient-sounds-audio.html</a></em></p> <p><em>Inset image: Spectrograms showing how the shape of the sound of a word in one language can be morphed into the sound of the same word in another language; credit: John Aston.</em></p> </div></div></div><div class="field field-name-field-content-summary field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>The sounds of languages that died thousands of years ago have been brought to life again through technology that uses statistics in a revolutionary new way.</p> </div></div></div><div class="field field-name-field-content-quote field-type-text-long field-label-hidden"><div class="field-items"><div class="field-item even">As a word is uttered it vibrates air, and the shape of this soundwave can be measured and turned into a series of numbers</div></div></div><div class="field field-name-field-content-quote-name field-type-text field-label-hidden"><div class="field-items"><div class="field-item even">John Aston</div></div></div><div class="field field-name-field-image-credit field-type-link-field field-label-hidden"><div class="field-items"><div class="field-item even"><a href="/" target="_blank">John Aston</a></div></div></div><div class="field field-name-field-image-desctiprion field-type-text field-label-hidden"><div class="field-items"><div class="field-item even">Spectrogram showing the shape of the sound of a word</div></div></div><div class="field field-name-field-panel-title field-type-text field-label-hidden"><div class="field-items"><div class="field-item even">Medical imaging reshaped</div></div></div><div class="field field-name-field-panel-body field-type-text-long field-label-hidden"><div class="field-items"><div class="field-item even"><p><strong>The statistics of shape are not just being used to show how different languages relate to each other – they are also being used to improve the analysis of medical images.</strong></p> <p>Just as soundwaves have a shape that can be analysed using statistics, so do the patterns of neurons interacting with each other or the dimensions of the surface of a tumour.
Now a new research Centre will develop tools that use the mathematics of the shapes found in medical images to improve diagnosis, prognosis and treatment planning for patients.</p> <p>The <a href="http://www.damtp.cam.ac.uk/user/cbs31/CMiH/Welcome.html">EPSRC Centre for Mathematical and Statistical Analysis of Multimodal Clinical Imaging</a>, one of five ‘maths’ centres recently funded by £10 million from EPSRC, is co-led by Aston and Dr Carola-Bibiane Schönlieb from the Department of Applied Mathematics and Theoretical Physics in Cambridge.</p> <p>“The new methodologies will allow clinical medicine to move beyond one person reading single scans, to automated systems capable of analysing populations of images,” explains Schönlieb. “As a result, clinicians will have far greater scope to ask complex questions of the medical image.”</p> <p>It’s already possible to extract statistical information from an image of a patient’s thigh bone, turn the data into a template for comparison with those from other people in the population, and then ask whether a particular shape of bone is more prone to being broken than others in the elderly.</p> <p>Most organ scans are split into many small elements, which are then analysed voxel by voxel. “But complex structures like the heart and the brain should be analysed holistically,” explains Dr James Rudd, from the Department of Medicine, who leads the clinical interaction with the Centre. “The tools we are developing will enable the analysis of organs like the brain as single objects with millions of connections.”</p> <p>The Centre brings together researchers and clinicians from applied and pure maths, engineering, physics, biology, oncology, clinical neuroscience and cardiology, and involves industrial partners Siemens, AstraZeneca, Microsoft, GSK and Cambridge Computed Imaging.</p> </div></div></div><div class="field field-name-field-cc-attribute-text field-type-text-long field-label-hidden"><div class="field-items"><div class="field-item even"><p><a href="http://creativecommons.org/licenses/by/4.0/" rel="license"><img alt="Creative Commons License" src="https://i.creativecommons.org/l/by/4.0/88x31.png" style="border-width:0" /></a><br />The text in this work is licensed under a <a href="http://creativecommons.org/licenses/by/4.0/" rel="license">Creative Commons Attribution 4.0 International License</a>. For image use please see separate credits above.</p> </div></div></div><div class="field field-name-field-show-cc-text field-type-list-boolean field-label-hidden"><div class="field-items"><div class="field-item even">Yes</div></div></div><div class="field field-name-field-related-links field-type-link-field field-label-above"><div class="field-label">Related Links:&nbsp;</div><div class="field-items"><div class="field-item even"><a href="http://www.phon.ox.ac.uk/jcoleman/ancient-sounds-home.html">Ancient Sounds project</a></div><div class="field-item odd"><a href="http://www.damtp.cam.ac.uk/user/cbs31/CMiH/Welcome.html">EPSRC Centre for Mathematical and Statistical Analysis of Multimodal Clinical Imaging</a></div></div></div> Tue, 19 Jul 2016 08:00:51 +0000 Can a voice identify a criminal?
/research/news/can-a-voice-identify-a-criminal <div class="field field-name-field-news-image field-type-image field-label-hidden"><div class="field-items"><div class="field-item even"><img class="cam-scale-with-grid" src="/sites/default/files/styles/content-580x288/public/news/research/news/111117-ear-travis-isaacs.jpg?itok=212RzbZt" alt="Ear" title="Ear, Credit: Travis Isaacs from Flickr" /></div></div></div><div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Recognising a voice is a familiar experience for most people – identifying a friend’s voice over the telephone, recognising the voice of a well-known personality on the radio, hearing the voice of a colleague call out from behind. But why do voices sound distinctive? Given our ability to recognise individuals, it seems reasonable to assume that voices are unique, but it has not been scientifically demonstrated that all voices are measurably distinctive. In spite of the impression given by televised crime shows, as yet there is no technique available to identify a speaker with 100% reliability.</p> <div class="bodycopy"> <p>This is a serious problem for forensic speaker identification, a branch of forensic phonetics in which a phonetician is asked to identify an unknown speaker whose voice has been recorded during the commission of a crime, for example a bomb threat, ransom demand, hoax emergency call or drug deal. The phonetician compares the incriminating recording with samples of speech from a suspect with a view to identifying the perpetrator or eliminating the suspect. These cases are often controversial, and since the extent to which an individual’s voice is idiosyncratic has not yet been established, research in this area is crucial.</p> <p>A key problem in attempting to characterise a speaker is that each individual’s voice can vary greatly. We change our voices depending on who we are talking to, how formal the situation is, the emotion we wish to express and whether there is background noise. Speakers’ voices also change if they are tired, drunk or have a cold or sore throat, and of course speakers can disguise their voices. So a voice is much more complicated to capture than a fingerprint, which is a fixed, unchanging feature of an individual.</p> <p><strong>DyViS: investigating speech</strong></p> <p>A team of researchers in the Department of Linguistics – Dr Kirsty McDougall, Dr Gea de Jong, Toby Hudson and Professor Francis Nolan – is carrying out innovative research in speaker identification in the DyViS project (Dynamic Variability in Speech: A Forensic Phonetic Study of British English), funded by the Economic and Social Research Council (ESRC).</p> <p>To investigate the problem of variation within a speaker’s voice, the DyViS team have compiled a large-scale database of recordings of southern British English spoken across a range of speaking styles. Speakers participated in several tasks: a mock police interview where they were required to ‘lie’ about a particular scenario, a telephone call with a friend involving a more casual and relaxed style of speech, and a number of reading tasks. All of the speaking tasks included a particular selection of words that the participants had to utter in different contexts.
These data enable the researchers to investigate how phonetic features of these words change for a given individual across the different speaking styles, and to what extent these features can be used to distinguish individuals.</p> <p><strong>Identifying the speaker</strong></p> <p>One particular feature being examined is a phenomenon known as ‘formant frequency dynamics’. Formant frequencies are the resonances of the vocal tract during speech – the frequencies at which vibrations of air are at maximum amplitude in the vocal tract in speech sounds such as vowels. Formant frequencies appear as roughly horizontal dark bands on a spectrogram, a computer-generated representation of the acoustic speech signal. These frequencies are powerful cues to speaker identity since they are determined by both the physical dimensions of a speaker’s vocal tract and the way the speaker configures the vocal organs to produce each sound.</p> <p>Previous research on speaker differences has typically measured the formant frequencies only at the centre of the sound. The DyViS research goes beyond these ‘static’ measures to investigate the dynamics of formant frequencies, which reflect the movement of a person’s speech organs and are likely to reveal more fine-grained differences among speakers. Just as people exhibit personal styles for walking, running and other skilled motor activities, they move their vocal organs in individual ways when producing speech.</p> <p>Dr McDougall’s experiments have investigated the speaker-distinguishing potential of the formant frequency dynamics of the vowel sound in spoken words like <em>bike</em> and <em>hike</em>, of the vowel sound in <em>who’d</em>, and of sequences containing an ‘r’ sound preceded and followed by vowel sounds such as <em>a route</em> and <em>a rack</em>. The work shows that formant frequency dynamics carry considerable speaker-specific information. By taking measurements along the formant contours surrounding the centre of a speech sound, a significant improvement in speaker discrimination is achieved.</p>
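<p><em>For readers who want a concrete picture of ‘static’ versus ‘dynamic’ formant measurement, the hedged Python sketch below estimates formant frequencies by linear-prediction (LPC) analysis at several points across a vowel rather than only at its centre. This is a generic textbook approach, not the DyViS measurement protocol; the file name and analysis settings are illustrative assumptions.</em></p>
<pre><code>
# A minimal sketch: estimate candidate formants at five time points across a vowel,
# giving a formant contour (dynamic) instead of a single mid-vowel value (static).
import numpy as np
import librosa

SR = 10000    # downsampling keeps the LPC model focused on roughly F1-F4
ORDER = 12    # LPC order (rule of thumb: 2 + sample rate in kHz)

def formants(frame, sr=SR, order=ORDER):
    """Return candidate formant frequencies (Hz) for one windowed frame."""
    a = librosa.lpc(frame * np.hamming(len(frame)), order=order)
    poles = [r for r in np.roots(a) if np.imag(r) >= 0.01]   # upper half-plane only
    freqs = sorted(np.angle(poles) * sr / (2.0 * np.pi))
    return [f for f in freqs if f > 90.0 and (sr / 2.0 - 50.0) > f]

y, _ = librosa.load("vowel_token.wav", sr=SR)   # hypothetical single-vowel recording

frame_len = int(0.025 * SR)                     # 25 ms analysis window
for p in np.linspace(0.1, 0.9, 5):              # 10%, 30%, 50%, 70%, 90% of the token
    start = max(0, int(p * (len(y) - frame_len)))
    track = formants(y[start:start + frame_len])
    print(f"{int(100 * p)}% through vowel, lowest formant candidates (Hz):", track[:3])
</code></pre>
<p><em>Comparing such contours between recordings – rather than single mid-vowel values – is the intuition behind the improvement in speaker discrimination reported above; the DyViS measurements themselves rest on careful phonetic analysis rather than this automated sketch.</em></p>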
<p><strong>Forensic phonetics</strong></p> <p>Together with research into other features of speech being investigated by the DyViS team, this work offers crucial new directions for solutions to the problem of extracting a speaker’s ‘signature’ from the speech signal. Findings from the DyViS project suggest that dynamic features of speech could provide a clue in speaker identification, which has clear applications in forensic evidence – in comparing voices and speech for purposes of identification, and in analysing speech recordings.</p> <p>The research also has important implications for phonetic theory. Current models of speech production and perception do not provide a good explanation of the role of individual variation in speech communication. The analysis of dynamic features of speech being undertaken by the DyViS team will lead to important theoretical developments in these areas, contributing to our understanding of how individual speakers can communicate with the same language yet sound so different from each other.</p> </div> </div></div></div><div class="field field-name-field-content-summary field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Innovative research in the Department of Linguistics suggests that dynamic features of speech could provide a clue to forensic speaker identification.</p> </div></div></div><div class="field field-name-field-content-quote field-type-text-long field-label-hidden"><div class="field-items"><div class="field-item even">A key problem in attempting to characterise a speaker is that each individual’s voice can vary greatly. We change our voices depending on who we are talking to, how formal the situation is, the emotion we wish to express and whether there is background noise.</div></div></div><div class="field field-name-field-image-credit field-type-link-field field-label-hidden"><div class="field-items"><div class="field-item even"><a href="/" target="_blank">Travis Isaacs from Flickr</a></div></div></div><div class="field field-name-field-image-desctiprion field-type-text field-label-hidden"><div class="field-items"><div class="field-item even">Ear</div></div></div><div class="field field-name-field-cc-attribute-text field-type-text-long field-label-hidden"><div class="field-items"><div class="field-item even"><p><a href="https://creativecommons.org/licenses/by-nc-sa/3.0/"><img alt="" src="/sites/www.cam.ac.uk/files/80x15.png" style="width: 80px; height: 15px;" /></a></p> <p>This work is licensed under a <a href="https://creativecommons.org/licenses/by-nc-sa/3.0/">Creative Commons Licence</a>. If you use this content on your site please link back to this page.</p> </div></div></div><div class="field field-name-field-show-cc-text field-type-list-boolean field-label-hidden"><div class="field-items"><div class="field-item even">Yes</div></div></div> Sat, 01 Sep 2007 00:00:00 +0000