

Citation: Hambrook DA, Ilievski M, Mosadeghzad M, Tata M (2017) A Bayesian computational basis for auditory selective attention using head rotation and the interaural time-difference cue. PLoS ONE 12(10).

Editor: Blake Johnson, Australian Research Council Centre of Excellence in Cognition and its Disorders, AUSTRALIA. Published: October 5, 2017.

Copyright: © 2017 Hambrook et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper and its Supporting Information files.

Funding: This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC). Support for the authors was provided by the NSERC Collaborative Research and Training Experience program. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

In natural settings, sounds emanating from different sources mix and interfere before they reach the ears of a listener. The process of resolving mixtures of several sounds into their separate individual streams is known as auditory scene analysis, and it remains a challenging task for computational systems. It is well known that animals use binaural differences in arrival time and intensity at the two ears to find the arrival angle of sounds in the azimuthal plane, and this localization function has sometimes been considered sufficient to enable the un-mixing of complex scenes. However, the ability of such systems to resolve distinct sound sources in both space and frequency remains limited. The neural computations for detecting interaural time difference (ITD) have been well studied and have served as the inspiration for computational auditory scene analysis systems; however, a crucial limitation of ITD models is that they produce ambiguous or "phantom" images in the scene. This has been thought to limit their usefulness at frequencies above about 1 kHz in humans. We present a simple Bayesian model, and an implementation on a robot, that uses ITD information recursively. The model makes use of head rotations to show that ITD information is sufficient to unambiguously resolve sound sources in both space and frequency. Contrary to commonly held assumptions about sound localization, we show that the ITD cue used with high-frequency sound can provide accurate and unambiguous localization and resolution of competing sounds. Our findings suggest that an "active hearing" approach could be useful in robotic systems that operate in natural, noisy settings. We also suggest that neurophysiological models of sound localization in animals could benefit from revision to include the influence of top-down memory and sensorimotor integration across head rotations.

A central task for vision is to identify objects as the same persisting individuals over time and motion. The need for such processing is made especially clear in ambiguous situations such as the bouncing/streaming display: two discs move toward each other, superimpose, and then continue along their trajectories. Did the discs stream past each other, or bounce off each other? When people are likely to perceive streaming, playing a brief tone at the moment of overlap can readily cause them to see bouncing instead. Recent research has attributed this effect to decisional (rather than perceptual) processes by showing that auditory tones alter response biases but not the underlying sensitivity for detecting objective bounces. Here we explore the nature of this phenomenon using "illusory causal crescents": when people perceive bouncing (or causal "launching"), they also perceive the second disc to begin moving before being fully overlapped with the first disc. We demonstrate that merely playing a sound coincident with the moment of overlap can also reliably induce the perception of such illusory crescents. Moreover, this effect is due to the coincidence of the tone per se, since the effect disappears when the tone is embedded in a larger regular tone sequence. Because observers never have to explicitly categorize their percept (e.g., as streaming), and because the effect involves a subtle quantitative influence on another clearly visual property (i.e., the crescent's width), we conclude that this audiovisual influence on the perception of identity over time reflects perceptual processing rather than higher-level decisions.
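The recursive Bayesian use of the ITD cue across head rotations described in the auditory abstract can be sketched in a few lines. This is a minimal illustrative sketch under stated assumptions, not the authors' implementation: the grid resolution, the von Mises-style phase likelihood, the assumed head width, the 2 kHz tone, and all parameter values are choices made here for illustration. At 2 kHz the interaural phase wraps, so a single measurement is consistent with several "phantom" azimuths; because those phantoms shift with head angle while the true source does not, multiplying likelihoods across rotations leaves only the true azimuth.

```python
import numpy as np

C = 343.0   # speed of sound, m/s
D = 0.18    # assumed interaural distance, m (illustrative value)
F = 2000.0  # tone frequency, Hz -- well above the classic ~1 kHz ITD "limit"

# World-frame grid of candidate source azimuths (0.5 degree spacing).
azimuths = np.linspace(-np.pi, np.pi, 720, endpoint=False)

def itd(head_frame_az):
    """Simple far-field model: ITD = (D / C) * sin(azimuth relative to the head)."""
    return D / C * np.sin(head_frame_az)

def likelihood(measured_phase, head_angle, kappa=20.0):
    """Likelihood of a measured interaural phase difference for each candidate.

    cos() of the phase error handles wrapping, so every azimuth whose
    predicted phase differs by a whole cycle (a "phantom") is equally likely
    from this one measurement alone.
    """
    predicted = 2 * np.pi * F * itd(azimuths - head_angle)
    return np.exp(kappa * np.cos(measured_phase - predicted))

rng = np.random.default_rng(0)
true_az = 0.7                         # world-frame source azimuth, rad
posterior = np.ones_like(azimuths)    # start from a flat prior
head_angle = 0.0

for step in range(8):
    # Noisy interaural phase measurement at the current head angle.
    phase = 2 * np.pi * F * itd(true_az - head_angle) + rng.normal(0.0, 0.05)
    # Recursive Bayesian update: posterior <- prior * likelihood.
    posterior *= likelihood(phase, head_angle)
    posterior /= posterior.sum()
    # Actively rotate the head between looks; phantoms move, the source doesn't.
    head_angle += np.deg2rad(10)

estimate = azimuths[np.argmax(posterior)]
print(f"estimated azimuth: {estimate:.2f} rad (true: {true_az:.2f} rad)")
```

The key design point is that the head rotation enters the measurement model (each likelihood is evaluated at candidate-minus-head-angle), so information gathered at different head poses is fused in a common world frame; after a handful of rotations the posterior mass at the phase-wrapped and front-back mirror locations has collapsed.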
