Active Noise Control (ANC) headphones are commonly employed to create a quiet zone around the ears of users. In the conventional ANC technique, the ambient noise is picked up by the reference microphones on the earcups of the ANC headphones and then relayed to the ANC controller, which generates anti-noise to suppress it. In a wireless ANC system, the reference microphone is situated close to the noise source and transmits a high-quality primary noise signal over a wireless link, which significantly improves the noise reduction performance. In this paper, a multi-channel feedforward wireless ANC system is implemented in a headphone. Furthermore, a coherence-based weight determination algorithm is proposed to improve the noise reduction performance of headphones. To validate the efficacy of the proposed wireless ANC headphone, numerical simulations and real-time experiments are performed.
Augmented or mixed reality (AR/MR) is emerging as one of the key technologies in the future of computing. Audio cues are critical for maintaining a high degree of realism, social connection, and spatial awareness for various AR/MR applications, such as education and training, gaming, remote work, and virtual social gatherings to transport the user to an alternate world called the metaverse. Motivated by a wide variety of AR/MR listening experiences delivered over hearables, this article systematically reviews the integration of fundamental and advanced signal processing techniques for AR/MR audio to equip researchers and engineers in the signal processing community for the next wave of AR/MR.
Active noise control (ANC) headphones are commonly used to reduce annoying noise around users’ ears. Most commercial ANC headphones utilize specific filters with pre-trained coefficients due to their fast response and robustness. However, when dealing with primary noises coming from different directions, their noise reduction performance suffers significantly. Hence, we propose an adaptive gain (AG) algorithm on the fixed filters with a multi-reference method, in which the control signal is formed by adaptively weighting and summing the output signals of multiple fixed filters that have been pre-trained on primary noise with different directions of arrival. By combining the adaptive algorithm with the fixed filter approach, the proposed algorithm outperforms the conventional fixed ANC approach in noise reduction performance. This paper provides a theoretical analysis of the proposed algorithm’s step-size bound and time constant, demonstrating its robustness and fast convergence behavior. Furthermore, comparative simulations and real-time experiments with a commercial ANC headphone and other adaptive algorithms show its efficacy in dealing with multiple noise sources in an actual scenario.
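The idea of adaptively weighting and summing the outputs of several fixed filters can be illustrated with a minimal sketch. It is not the authors' AG algorithm: it assumes a unity secondary path, takes the fixed-filter outputs as given sequences, and uses a plain LMS update on the gains. The function name `adaptive_gain` and all parameters are hypothetical.

```python
import math


def adaptive_gain(filter_outputs, disturbance, mu=0.05):
    """LMS adaptation of one scalar gain per fixed filter.

    filter_outputs: list of K sequences, the outputs of K fixed filters.
    disturbance:    the noise observed at the error microphone.
    Assumption: unity secondary path, so the control signal reaches the
    error microphone unchanged.
    """
    gains = [0.0] * len(filter_outputs)
    errors = []
    for n, d in enumerate(disturbance):
        outs = [seq[n] for seq in filter_outputs]
        y = sum(g * o for g, o in zip(gains, outs))   # weighted sum of outputs
        e = d - y                                     # residual noise
        gains = [g + mu * e * o for g, o in zip(gains, outs)]
        errors.append(e)
    return gains, errors


# Toy data: two fixed-filter outputs and a disturbance that mixes them.
N = 3000
y1 = [math.sin(0.3 * n) for n in range(N)]
y2 = [math.sin(0.3 * n + 1.2) for n in range(N)]
d = [0.7 * a + 0.3 * b for a, b in zip(y1, y2)]
gains, errors = adaptive_gain([y1, y2], d)
```

In this toy setting the gains settle near the mixing weights used to build `d`; with a real secondary path, the filter outputs would first need to be filtered by a secondary path estimate.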
The selective fixed-filter active noise control (SFANC) method selecting the best pre-trained control filters for various types of noise can achieve a fast response time. However, it may lead to large steady-state errors due to inaccurate filter selection and the lack of adaptability. In comparison, the filtered-X normalized least-mean-square (FxNLMS) algorithm can obtain lower steady-state errors through adaptive optimization. Nonetheless, its slow convergence has a detrimental effect on dynamic noise attenuation. Therefore, this paper proposes a hybrid SFANC-FxNLMS approach to overcome the adaptive algorithm’s slow convergence and provide a better noise reduction level than the SFANC method. A lightweight one-dimensional convolutional neural network (1D CNN) is designed to automatically select the most suitable pre-trained control filter for each frame of the primary noise. Meanwhile, the FxNLMS algorithm continues to update the coefficients of the chosen pre-trained control filter at the sampling rate. Owing to the effective combination of the two algorithms, experimental results show that the hybrid SFANC-FxNLMS algorithm can achieve a rapid response time, a low noise reduction error, and a high degree of robustness.
The ecological validity of soundscape studies usually rests on a choice of soundscapes that are representative of the perceptual space under investigation. For example, a soundscape pleasantness study might investigate locations with soundscapes ranging from "pleasant" to "annoying". The choice of soundscapes is typically researcher-led, but a participant-led process can reduce selection bias and improve result reliability. Hence, we propose a robust participant-led method to pinpoint characteristic soundscapes possessing arbitrary perceptual attributes. We validate our method by identifying Singaporean soundscapes spanning the perceptual quadrants generated from the "Pleasantness" and "Eventfulness" axes of the ISO 12913-2 circumplex model of soundscape perception, as perceived by local experts. From memory and experience, 67 participants first selected locations corresponding to each perceptual quadrant in each major planning region of Singapore. We then performed weighted k-means clustering on the selected locations, with weights for each location derived from previous frequencies and durations spent in each location by each participant. Weights hence acted as proxies for participant confidence. In total, 62 locations were thereby identified as suitable locations with characteristic soundscapes for further research utilizing the ISO 12913-2 perceptual quadrants. Audio-visual recordings and acoustic characterization of the soundscapes will be made in a future study.
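As a rough illustration of weighted k-means clustering, the sketch below runs Lloyd-style iterations in which each centroid is the weighted mean of its assigned points. The coordinates, weights, and the function name `weighted_kmeans` are made up for illustration; the study's actual weighting scheme (derived from visit frequency and duration) and its implementation are not specified here.

```python
import math
import random


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points]


def weighted_kmeans(points, weights, k, iters=100, seed=0):
    """Lloyd iterations where centroids are weighted means of their members."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)          # initialise from the data
    dim = len(points[0])
    for _ in range(iters):
        labels = assign(points, centroids)
        updated = []
        for j in range(k):
            members = [(p, w) for p, w, l in zip(points, weights, labels) if l == j]
            total = sum(w for _, w in members)
            if total == 0:                     # empty cluster: keep old centroid
                updated.append(centroids[j])
                continue
            updated.append(tuple(sum(w * p[i] for p, w in members) / total
                                 for i in range(dim)))
        if updated == centroids:               # converged
            break
        centroids = updated
    return centroids, assign(points, centroids)


# Hypothetical locations (x, y) and confidence weights.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
weights = [1, 3, 1, 1]
centroids, labels = weighted_kmeans(points, weights, k=2)
```

A heavily weighted point pulls its cluster centroid towards itself, which is how a confident participant's choice would count for more than an uncertain one.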
Sound event localization and detection (SELD) consists of two subtasks, which are sound event detection and direction-of-arrival estimation. While sound event detection mainly relies on time-frequency patterns to distinguish different sound classes, direction-of-arrival estimation uses amplitude and/or phase differences between microphones to estimate source directions. As a result, it is often difficult to jointly optimize these two subtasks. We propose a novel feature called Spatial cue-Augmented Log-SpectrogrAm (SALSA) with exact time-frequency mapping between the signal power and the source directional cues, which is crucial for resolving overlapping sound sources. The SALSA feature consists of multichannel log-spectrograms stacked along with the normalized principal eigenvector of the spatial covariance matrix at each corresponding time-frequency bin. Depending on the microphone array format, the principal eigenvector can be normalized differently to extract amplitude and/or phase differences between the microphones. As a result, SALSA features are applicable for different microphone array formats such as first-order ambisonics (FOA) and multichannel microphone array (MIC). Experimental results on the TAU-NIGENS Spatial Sound Events 2021 dataset with directional interferences showed that SALSA features outperformed other state-of-the-art features. Specifically, the use of SALSA features in the FOA format increased the F1 score and localization recall by 6% each, compared to the multichannel log-mel spectrograms with intensity vectors. For the MIC format, using SALSA features increased F1 score and localization recall by 16% and 7%, respectively, compared to using multichannel log-mel spectrograms with generalized cross-correlation spectra.
Traditional decibel-based measures for predicting annoyance from construction activities are limited in their ability to reflect the high acoustic variability of construction machinery noise. Hence, a multidimensional approach based on perceptual attributes and psychoacoustic parameters is proposed. In-situ audio-visual recordings of 16 construction machines in operation were evaluated subjectively on both perceived annoyance and a 12-item semantic differential perceptual attribute scale. The 16 machinery noises formed three clusters based on four perceptual components (Incisiveness, Strength, Intermittency, and Periodicity) derived via principal component analysis of the perceptual attributes. Notably, individual perceptual components strongly correlate with mean values of psychoacoustic parameters (loudness, sharpness, roughness, and fluctuation strength) over time, which we use to develop an annoyance model for construction noise. Both loudness and fluctuation strength were critical parameters to discriminate between clusters. The model can be used to automatically categorize construction noises by cluster and manage them based on known cluster characteristics.
Active noise control (ANC) technology is increasingly ubiquitous in wearable audio devices, or hearables. Owing to its low computational complexity, high robustness, and exemplary performance in dealing with dynamic noise, the fixed-coefficient control filter strategy plays a central role in portable ANC implementation. Unlike its traditional adaptive counterpart, the fixed-filter strategy is unable to attain optimal noise reduction for different types of noise. Hence, we propose a selective fixed-filter ANC method based on a simplified two-dimensional convolutional neural network (2D CNN), which is implemented on a co-processor (e.g., in a mobile phone), to derive the most suitable control filter for different noise types. To further reduce classification complexity, we designed a lightweight one-dimensional CNN (1D CNN), which can directly classify noise types in the time domain. A numerical simulation based on measured paths in headphones demonstrates the proposed algorithm’s efficacy in attenuating real-world non-stationary noise over conventional adaptive algorithms.
Active noise control (ANC) is gaining credence as an effective approach in mitigating low-frequency urban noise. Current ANC algorithms that attenuate noise across the full audio frequency band often exert excessive control effort at higher frequencies. Moreover, such unrestrained control unavoidably attenuates some critical sounds, such as alarms and warning signals. In practice, excessive control effort in ANC usually results in output-saturation distortion, which affects the adaptive system stability and degrades the residual audio quality. To provide greater flexibility of control in frequency bands of interest without incurring output saturation, this paper proposes two variants of the filtered reference comb-partitioned frequency-domain adaptive filter (FxCFDAF) algorithm, namely the leaky FxCFDAF and drop-out FxCFDAF algorithms, which exert an output-effort constraint at the control filter in the frequency domain. In addition, unlike the conventional frequency-domain adaptive algorithm, the comb-partitioning approach in the proposed FxCFDAF algorithms is delayless and computationally efficient, both of which are essential for practical implementation. Experimental simulations carried out on measured primary and secondary paths validate the proposed algorithms’ advantage over the conventional FxLMS algorithm in mitigating large-amplitude noise without output saturation.
Urban noise pollution is an omnipresent but often neglected threat to public health that must be addressed urgently. Passive noise control measures are less effective at reducing low-frequency noise, are often bulky, and may impede airflow. As evidenced in automobiles, active control of cabin noise has resulted in lighter cars due to reduced passive insulation. Despite its long history and recent popularisation by consumer headphones, the implementation of active noise control in the built environment is still rare. To date, active noise control (ANC) has been demonstrated, at source, in construction machines and, in the transmission path, in noise barriers. Recent demand for naturally-ventilated buildings has also spurred the development of active control solutions at the receiving end, such as on windows. The ten questions aim to demystify the principles of ANC and highlight areas in which environmental noise can be actively mitigated. Since the implementation of active control in the built environment usually involves multiple stakeholders, operational concerns are addressed. To conclude, research gaps are identified that would enable increased adoption of ANC in the built environment. There is also renewed interest in applying intelligent ANC to tackle environmentally complex applications, such as varying noise levels in the earcup of ANC headphones, particularly with the advent of low-cost, low-power, highly efficient embedded electronics; advancing speaker technology; and new impetus from digital signal processing and artificial intelligence algorithms.
With rapid urbanization comes the increase of community, construction, and transportation noise in residential areas. The conventional approach of solely relying on sound pressure level (SPL) information to decide on the noise environment and to plan out noise control and mitigation strategies is inadequate. This paper presents an end-to-end IoT system that extracts real-time urban sound metadata using edge devices, providing information on the sound type, location and duration, rate of occurrence, loudness, and azimuth of a dominant noise in nine residential areas. The collected metadata on environmental sound is transmitted to and aggregated in a cloud-based platform to produce detailed descriptive analytics and visualization. Our approach in integrating different building blocks, namely, hardware, software, cloud technologies, and signal processing algorithms to form our real-time IoT system is outlined. We demonstrate how some of the sound metadata extracted by our system are used to provide insights into the noise in residential areas. A scalable workflow to collect and prepare audio recordings from nine residential areas to construct our urban sound dataset for training and evaluating a location-agnostic model is discussed. Some practical challenges of managing and maintaining a sensor network deployed at numerous locations are also addressed.
With the advent of efficient low-cost processors and electroacoustic components, there is renewed interest in the practical implementation of active noise control (ANC). However, the slow convergence of conventional adaptive algorithms deployed in ANC restricts its handling of typical amplitude-varying noise. Hence, we propose a modified model-agnostic meta-learning (MAML) strategy to obtain an initial control filter, which accelerates an adaptive algorithm's convergence when dealing with different types of amplitude-varying low-frequency noise. Numerical simulations with measured paths and real noise sources demonstrate its convergence acceleration efficacy in practical scenarios.
To investigate the effect of augmenting natural sounds in noisy environments, an in-situ experiment was conducted using a mixed-reality head-mounted display (MR HMD). Two outdoor locations close to an expressway were selected for the experiment. A natural sound (birdsong or stream) along with a hologram (sparrow/fountain or loudspeaker) was projected through the MR HMD. Participants were asked to adjust the natural sound levels to their preferred level under ambient traffic noise conditions at each location. Participants also assessed the perceived loudness of traffic (PLN) and overall soundscape quality (OSQ) in conditions with and without the augmented natural sounds. The results showed that both natural sounds significantly reduced the PLN and enhanced the OSQ. No significant differences in subjective responses were found between the loudspeaker and visual representations of the natural sound source as holograms. Analysis of the preferred signal-to-noise ratio (SNR), i.e. the ratio of natural sound to traffic levels, indicated a strong negative correlation between the preferred SNRs and ambient traffic noise levels. Overall, the preferred SNR of the birdsong was significantly higher than that of the water sound. Among the acoustic parameters tested, the A-weighted traffic noise level was the strongest predictor for the preferred SNR of both the birdsong and water sound. However, the correlation for the water sound was higher than that for the birdsong. This was due to the larger variance in the subjective evaluation for the birdsong.
Before introducing natural sounds to potentially improve the soundscape quality, it is important to understand how key contextual factors (i.e. expected activities and audio-visual congruency) affect the soundscape in a given location. In this study, the perception of eight natural sounds (i.e. 4 birdsongs, 4 water sounds) at five urban recreational areas under the constant influence of road traffic was explored subjectively under three laboratory settings: visual-only, audio-only, and audio-visual. Firstly, expected socio-recreational activities of each location were determined in the visual-only setting. Subsequently, participants assessed the pleasantness and appropriateness of the soundscape at each site, for each of the eight natural sounds augmented to the same road traffic noise, in both audio-only and audio-visual settings. Interestingly, it was found that the expected activities in each location did not significantly affect natural sound perception, whereas audio-visual congruency of the locations significantly affected the pleasantness and appropriateness of the natural sounds. Particularly, the pleasantness and appropriateness decreased for water sounds when water features were not visually present. In contrast, perception with birdsongs was unaffected by their visibility likely due to the presence of vegetation. Hence, audio-visual coherence is central to the perception of natural sounds in outdoor spaces.
Multichannel active noise control (MCANC) is widely regarded as an effective solution to achieve a significantly large noise-cancellation area in a complicated acoustic field. However, the computational complexity of MCANC algorithms, such as the multichannel filtered-x least mean square (McFxLMS) algorithm, grows exponentially with an increased channel count. Many modified algorithms have been proposed to alleviate the complexity but at the expense of noise reduction performance. Until now, the trade-off between computational complexity and noise reduction performance has limited the practical implementation of MCANC. The block coordinate descent McFxLMS (BCD McFxLMS) algorithm proposed in this paper substantially reduces the computation cost of an MCANC system, while maintaining the same noise reduction performance as the conventional McFxLMS algorithm. Furthermore, a momentum mechanism is integrated into the BCD McFxLMS in practice to improve the convergence speed. The simulation and experimental results validate the effectiveness of the proposed algorithms when dealing with noise in a realistic environment.
Many signal processing-based methods for sound source direction-of-arrival estimation produce a spatial pseudo-spectrum of which the local maxima strongly indicate the source directions. Due to different levels of noise, reverberation and different numbers of overlapping sources, the spatial pseudo-spectra are noisy even after smoothing. In addition, the number of sources is often unknown. As a result, selecting the peaks from these spectra is susceptible to error. Convolutional neural networks have been successfully applied to many image processing problems in general and direction-of-arrival estimation in particular. In addition, deep learning-based methods for direction-of-arrival estimation show good generalization to different environments. We propose to use a 2D convolutional neural network with multi-task learning to robustly estimate the number of sources and the directions-of-arrival from short-time spatial pseudo-spectra, which carry useful directional information from audio input signals. This approach reduces the tendency of the neural network to learn unwanted associations between sound classes and directional information, and helps the network generalize to unseen sound classes. The simulation and experimental results show that the proposed methods outperform other direction-of-arrival estimation methods under different levels of noise and reverberation, and different numbers of sources.
Shutting the window is usually the last resort in mitigating environmental noise, at the expense of natural ventilation. We describe an active sound control system fitted onto the opening of the domestic window that attenuates the incident sound, achieving a global reduction in the room interior while maintaining natural ventilation. The incident sound is actively attenuated by an array of control modules (each a small loudspeaker) distributed optimally across the aperture. A single reference microphone provides advance information for the controller to compute the anti-noise signal input to the loudspeakers in real-time. A numerical analysis revealed that the maximum active attenuation potential outperforms the perfect acoustic insulation provided by a fully shut single-glazed window in ideal conditions. To determine the real-world performance of such an active control system, an experimental system is realized in the aperture of a full-sized window installed on a mockup room. Up to 10-dB reduction in energy-averaged sound pressure level was achieved by the active control system in the presence of a recorded real-world broadband noise. However, attenuation in the low-frequency range and the maximum power output are limited by the size of the loudspeakers.
The feedforward active noise control (FF ANC) technique has been widely used to cancel broadband noise in many practical applications. However, it fails to cope with an uncorrelated narrow-band disturbance picked up by the error sensor, which is independent of the reference signal picked up by the reference sensor. Hence, an alternative switching hybrid ANC is proposed in this paper to attenuate this kind of uncorrelated narrow-band disturbance effectively. Furthermore, compared with conventional hybrid ANC algorithms, the proposed algorithm alternately switches the update between the feedforward (FF) and feedback (FB) ANC. The algorithm thus saves computation and achieves a faster convergence speed. The numerical simulations and real-time experiments in the paper validate the effectiveness of the proposed algorithm, which exhibits better convergence and noise reduction performance than other hybrid active control algorithms.
The push for greater urban sustainability has increased the urgency of the search for noise mitigation solutions that allow for natural ventilation into buildings. Although a viable active noise control (ANC) solution with up to 10 dB of global attenuation between 100 Hz and 1000 Hz was previously developed for an open window, it had limited low-frequency performance below 300 Hz, owing to the small loudspeakers used. To improve the low-frequency attenuation, four passive radiator-based speakers were affixed around the opening of a top-hung ventilation window. The active control performance between 100 Hz and 700 Hz on a single top-hung window in a full-sized mock-up apartment room was examined. Active attenuation came close to the performance of the passive insulation provided by fully closing the window for expressway traffic and motorbike passing noise types. For a jet aircraft flyby, the performance of active attenuation with the window fully opened was similar to that of passive insulation with fully closed windows. In the case of low-frequency compressor noise, active attenuation’s performance was significantly better than the passive insulation. Overall, between 8 dB and 12 dB of active attenuation was achieved directly in front of the window opening, and up to 10.5 dB of attenuation was achieved across the entire room.
Active noise control (ANC) generally employs a secondary source to generate an ‘anti-noise’ wave that destructively suppresses the unwanted noise in the listening environment. The underlying signal processing mechanism behind ANC is the filtered-x least mean square (FxLMS) algorithm. FxLMS is an adaptive algorithm that updates the coefficients of a control filter in real-time, intending to create an anti-noise signal that matches the primary noise in space and time at the desired location [1]–[3]. Although ANC is commonly used in noise-cancelling headsets [4], [5] and modern automobiles [6]–[9], its application in larger three-dimensional spaces has been limited [10], [11]. To effectively attenuate broadband noise in a large space, multichannel ANC with long control filters and a high sample rate is usually required [12]–[14]. In these multichannel implementations, however, high-performance processors, such as multi-core DSP processors, FPGAs [15], [16], or GPUs [17] are required, which undoubtedly increases implementation costs and complicates the programming effort. These processing limitations, thus, undermine its potential applications (e.g., in facade openings [11], [18], large dimension ducts or room interiors, active noise barriers [19]).
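For readers unfamiliar with FxLMS, a minimal single-channel sketch is given below. It is a textbook-style illustration, not the multichannel implementation discussed above. The path coefficients, the step size, and the function name `fxlms` are assumptions chosen for the toy simulation, and the secondary path estimate is assumed to be perfect.

```python
import math


def fir(coeffs, history):
    """Dot product of FIR coefficients with a most-recent-first history."""
    return sum(c * h for c, h in zip(coeffs, history))


def fxlms(x, primary, secondary, n_taps=8, mu=0.05):
    """Single-channel FxLMS on a simulated primary and secondary path."""
    w = [0.0] * n_taps                          # control filter coefficients
    x_hist = [0.0] * max(len(primary), len(secondary), n_taps)
    fx_hist = [0.0] * n_taps                    # filtered-reference history
    y_hist = [0.0] * len(secondary)             # control-signal history
    errors = []
    for sample in x:
        x_hist = [sample] + x_hist[:-1]
        d = fir(primary, x_hist)                # disturbance at error mic
        y = fir(w, x_hist)                      # control filter output
        y_hist = [y] + y_hist[:-1]
        e = d - fir(secondary, y_hist)          # residual after anti-noise
        fx = fir(secondary, x_hist)             # reference filtered by S_hat = S
        fx_hist = [fx] + fx_hist[:-1]
        w = [wi + mu * e * fi for wi, fi in zip(w, fx_hist)]
        errors.append(e)
    return w, errors


# Toy run: two-tone reference, made-up primary and secondary paths.
x = [math.sin(0.1 * math.pi * n) + 0.5 * math.sin(0.24 * math.pi * n + 1.0)
     for n in range(4000)]
w, errors = fxlms(x, primary=[0.0, 0.0, 0.8, 0.4], secondary=[0.0, 0.6, 0.3])
```

The per-sample cost of this loop grows with the filter length, and in a multichannel system it is repeated for every reference, source, and error-sensor combination, which is the processing burden the passage refers to.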
The gradual adaptation and possibility of divergence hinder the active noise control system from being applied to a wider range of applications. Selective active noise control has been proposed to rapidly reduce noise by selecting a pre-trained control filter for different primary noise detected without an error microphone. For stationary noise, considerable noise reduction performance with a short selection period is obtained. For non-stationary noise, more restrictive requirements are imposed on instant convergence, as it leads to faster tracking and better noise reduction performance. To speed up a selective filtered active noise control system, empirical wavelet transform is introduced here to accurately and instantaneously extract the frequency information of primary noise. The boundary of the first intrinsic mode function of random noises is extracted as the instant signal feature. Primary noise is attenuated immediately by picking the optimal pre-trained control filter labeled by the nearest boundary. The storage requirement for a pre-trained control filter library is reduced. Instant control is obtained, and the instability caused by output saturation is overcome. With more concentrated energy distribution, better noise reduction performance is achieved by the proposed algorithm compared to conventional and selective active noise control algorithms. Simulation results validate these advantages of the proposed algorithm.
Recently, the grid-free compressive beamforming approach was proposed to perform direction-of-arrival (DOA) estimation of sources using observations collected from an array of sensors; however, it is limited to sources with a single frequency. This paper extends it by considering a wideband signal impinging on an array and introduces a continuous formulation of the discretized multiband representation of the impinging signal. Next, a sparsity measure is proposed to promote the continuous variable’s sparsity. In turn, the DOA estimation problem with infinite unknowns is derived and solved over a few optimization variables with semidefinite programming. Further, it is demonstrated that the derived problem possesses properties that can be exploited. Accordingly, when the absolute frequency differences phrased in wavelengths are larger than twice the array spacing, an alias-free DOA estimation can be achieved. Results from simulations and an experimental example confirm that, under aliasing conditions and with sufficient sensors, the proposed approach provides improved DOA estimation using a small number of continuous multiband frequencies compared to other methods using a discrete signal model formulation.
Commercial noise-cancelation headphones are based on active noise control (ANC) algorithms. However, all existing ANC headphones are based on the bilateral ANC approach, where two independent monaural ANC systems are used for the left and the right ear cups, respectively. The performance of the bilateral ANC approach strictly depends on the number of noise sources, the direction of the noise sources and the noise types. Its performance decreases when the number of noise sources increases. The human hearing system is binaural in nature, and noise control can take advantage of binaural processing to further enhance noise reduction compared to conventional bilateral ANC headphones. In this paper, we first propose a binaural ANC algorithm and evaluate its performance against the bilateral ANC algorithm. We subsequently modify it into a combined bilateral-binaural ANC (CBBANC) algorithm for headphones in order to improve noise reduction performance when there is more than one noise source, when the noise sources are situated at different locations, and for diffuse field noise. Experimental results show that the combined bilateral-binaural ANC has better performance in all our tests compared to the conventional bilateral ANC headphones.
The leaky filtered-input least mean square (LFxLMS) algorithm is widely used in active noise control applications to minimize the degradation of attenuation performance due to output saturation distortion. However, the leak factor, which is critical in determining the steady-state error and robustness of the algorithm, is usually selected through trial and error. This letter proposes a leak factor selection approach, which ensures the LFxLMS algorithm converges to its optimal solution under the average-output-power constraint and can be readily derived in practice. Both broadband and narrowband cases are considered in the derivation without the independence assumption, and the simulations are conducted based on real primary and secondary paths to verify its effectiveness.
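The effect of the leak factor on the steady-state solution can be seen in a scalar toy example. The sketch below uses the common leaky update w ← (1 − μγ)w + μ·e·x′ with a constant filtered reference x′ = 1 and a unity secondary path. These simplifications, and the name `leaky_lms_scalar`, are assumptions for illustration; the sketch does not implement the leak factor selection approach proposed in the letter.

```python
def leaky_lms_scalar(d, gamma, mu=0.1, steps=500):
    """Scalar leaky LMS with constant filtered reference x' = 1.

    d:     constant disturbance to cancel.
    gamma: leak factor; gamma = 0 gives the ordinary (non-leaky) update.
    """
    w = 0.0
    for _ in range(steps):
        e = d - w                                # residual, since x' = 1
        w = (1.0 - mu * gamma) * w + mu * e      # leaky update
    return w


w_plain = leaky_lms_scalar(d=2.0, gamma=0.0)     # converges towards d
w_leaky = leaky_lms_scalar(d=2.0, gamma=0.25)    # converges towards d / (1 + gamma)
```

In this toy case the fixed point is w = d / (1 + γ), so a larger leak factor lowers the output level at the cost of a larger residual error, which is why its selection matters.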
With the advent of virtual reality (VR) technology, spatial audio has been increasingly adopted to evaluate the acoustic environment in soundscape research. It is therefore imperative to assess the quality of commonly used spatial audio reproduction methods to determine their ecological validity. In subjective evaluations with 30 participants, each participant evaluated four outdoor locations in situ and, on a separate day, their corresponding audio-visual recordings in VR. A total of three spatial audio reproduction methods were assessed in VR, all down-mixed from the first-order ambisonics (FOA) recordings: headphone-based FOA-static binaural, FOA-tracked binaural, and an FOA 2-dimensional (2D) octagonal speaker array. The participants evaluated the acoustic environment in terms of the overall soundscape quality and perceived spatial qualities at each location. Regarding overall soundscape quality, there were no significant differences in evaluating the sound-source dominance and affective soundscape qualities between in situ and all VR methods. However, significant differences were found in the perceived spatial qualities between the three reproduction methods and in situ. Among the source-related spatial attributes, the perceived distance of the dominating sounds was farther in the virtual than in the in situ evaluations. In the localization of sound sources, both the FOA-tracked binaural and the FOA-2D speaker array exhibited higher spatial acoustic fidelity than FOA-static binaural. Regarding the environment-related spatial quality attributes, the 2D speaker array reproduction was perceived as more immersive and realistic than other reproduction methods. Overall, the FOA-tracked binaural appears to exhibit sufficient fidelity for cinematic VR evaluation of soundscapes.
Many studies have investigated the effects of water sound on soundscape with an assumption that target noise coincides with the masker (co-location), while no attention has been paid to spatial separations between target noise and water sound sources. This study aims to explore the effects of spatial separations between target noise and water sound on perceived loudness of target noise (PLN) and overall soundscape quality (OSQ) through laboratory experiments. Traffic noise (target) and a water sound (masker) were recorded as acoustic stimuli and a spherical panoramic video recording of a water fountain was also used as visual stimuli. The audio-visual stimuli were reproduced through a virtual reality head-mounted display and a multichannel ambisonic loudspeaker setup. The traffic noise and water sound were played simultaneously at various azimuthal separations and were combined with a panoramic recording of a water fountain as visual stimulus. Participants assessed the audio-visual stimuli in terms of PLN and OSQ. The effect of the spatial separation between the traffic noise and water sound was significant in both PLN and OSQ. Specifically, the PLN increase at 135° separation was equivalent to an estimated target noise level increment of ~1–2 dB. Similarly, the OSQ decrease at 135° and 180° separation was equivalent to an estimated target noise level increase of ~2–5 dB. Since the typical field of view of users in space is less than 135°, the results suggest that placing water features within a user's field of view could achieve better soundscape.
This research focuses on the use of an open window with an active noise control (ANC) system and a splitter silencer to improve the attenuation level of broadband noise. We previously introduced a similar idea with a splitter silencer and a loudspeaker inside a duct. However, it is challenging to achieve noise reduction with a splitter silencer in a short duct. In our previous research, we achieved noise reduction at an open window with an array of ANC units, in each of which a reference microphone is collocated with a loudspeaker. However, this system had limited high-frequency noise reduction due to the spacing between loudspeakers. Here, we improved noise reduction by integrating a collocated ANC system and a splitter silencer, which has an open ratio of 50%. Each unit has a reference microphone at the front and a loudspeaker at the back, which generates anti-noise. Four units are combined in an array with a splitter silencer. In an experiment, we implemented two splitter silencers in an eight-channel ANC system at an open window (24 × 48 × 12 cm). The results show that the splitter silencers attenuated high-frequency noise above 2 kHz. Moreover, the ANC system achieved an attenuation of 2 to 10 dB for noise in the range of 200 Hz to 2 kHz. Overall, we were able to achieve a total attenuation of 2 to 17 dB above 200 Hz.
Introducing pleasant natural sounds to mask urban noises is an important soundscape design strategy to improve acoustic comfort. This study investigates the effects of the signal-to-noise ratio (SNR) between natural sounds (signal) and target noises (noise), and of their temporal characteristics, on the perceived loudness of noise (PLN) and overall soundscape quality (OSQ) through a laboratory experiment. Two types of urban noise sources (hydraulic breaker and traffic noises) were set to A-weighted equivalent sound pressure levels (SPL) of 55, 65, and 75 dB and then augmented with two types of natural sounds (birdsong and stream) across a range of SNRs. Each acoustic stimulus was a combination of noise and natural sound at SNRs from −6 to 6 dB. Averaged across all cases, the subjective assessment of PLN showed that augmenting urban noise separately with the two natural sounds reduced the PLN by 17.9%, with no significant differences found between the birdsong and stream sounds. Adding natural sounds increased the OSQ by 18.3% on average across the cases, but their effects gradually decreased as the noise level increased. The OSQ of the birdsong and stream sounds were similar for traffic noise, whereas the stream sound was rated higher than the birdsong for the breaker noise. The results suggest that increasing the dissimilarity in temporal structure between the target noise and natural sounds could enhance the soundscape quality. Appropriate SNRs were explored considering both PLN and OSQ. The results showed that the SNR of −6 dB was desirable when the A-weighted SPL of the noise rose to 75 dB.
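The level arithmetic behind such stimuli can be sketched as follows. This is a minimal illustration, assuming that the SNR is the natural-sound level minus the noise level in dB and that the two sounds add incoherently (on a power basis); the function names are hypothetical and the study's actual calibration procedure is not described in the abstract.

```python
import math

def natural_sound_level(noise_level_db, snr_db):
    """Level of the natural sound (signal) for a given noise level and SNR, in dB."""
    return noise_level_db + snr_db

def combined_level(level_a_db, level_b_db):
    """Overall level of two incoherent sounds, summed on a power basis, in dB."""
    power_sum = 10.0 ** (level_a_db / 10.0) + 10.0 ** (level_b_db / 10.0)
    return 10.0 * math.log10(power_sum)

def gain_for_level_change(delta_db):
    """Linear amplitude gain that changes a signal's level by delta_db."""
    return 10.0 ** (delta_db / 20.0)

if __name__ == "__main__":
    # Noise levels and SNR endpoints are taken from the abstract; intermediate
    # SNR steps are not stated there, so only the endpoints are shown.
    for noise_db in (55.0, 65.0, 75.0):
        for snr_db in (-6.0, 6.0):
            sound_db = natural_sound_level(noise_db, snr_db)
            total_db = combined_level(noise_db, sound_db)
            print(f"noise {noise_db:.0f} dB, SNR {snr_db:+.0f} dB: "
                  f"natural sound {sound_db:.0f} dB, total {total_db:.1f} dB")
```

Under these assumptions, adding a natural sound at an SNR of −6 dB raises the overall level by only about 1 dB, whereas an SNR of 6 dB raises it by about 7 dB.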
Multichannel active noise control (MCANC) is widely recognized as an effective and efficient solution for acoustic noise and vibration cancellation, such as in high-dimensional ventilation ducts, open windows, and mechanical structures. The feedforward multichannel filtered-x least mean square (FFMCFxLMS) algorithm is commonly used to dynamically adjust the transfer function of the multichannel controllers for different noise environments. The computational load incurred by the FFMCFxLMS algorithm, however, increases exponentially with increasing channel count, thus requiring high-end field-programmable gate array (FPGA) processors. Nevertheless, such processors still need specific configurations to cope with soaring computing loads as the channel count increases. To achieve a high-efficiency implementation of the FFMCFxLMS algorithm with floating-point arithmetic, a novel architecture based on the multiple-parallel-branch with folding (MPBF) technique is proposed. This architecture parallelizes the branches and reuses the multiplier and adder in each folded branch so that the tradeoff between throughput and the usage of hardware resources is balanced. The proposed architecture is validated in an experimental setup that implements the FFMCFxLMS algorithm for an MCANC system with 24 reference sensors, 24 secondary sources, and 24 error sensors, at sampling and throughput rates of 25 kHz and 260 Mb/s, respectively.
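For readers unfamiliar with the underlying update, the following is a minimal single-channel, time-domain FxLMS sketch in NumPy. It is a textbook form of the algorithm under stated assumptions, not the multichannel MPBF hardware architecture of the paper: the function name, the sign convention (error = disturbance minus anti-noise at the error sensor) and all arguments are illustrative choices.

```python
import numpy as np

def fxlms(x, d, s, s_hat, n_taps, mu):
    """Single-channel FxLMS sketch (illustrative, not the paper's architecture).

    x      reference signal
    d      disturbance (primary noise) at the error sensor
    s      true secondary-path impulse response (used only to simulate the plant)
    s_hat  estimate of the secondary path used by the controller
    n_taps length of the adaptive control filter
    mu     step size
    Returns the error signal and the final control-filter weights.
    """
    w = np.zeros(n_taps)            # adaptive control filter
    x_buf = np.zeros(n_taps)        # recent reference samples
    fx_buf = np.zeros(n_taps)       # recent filtered-reference samples
    xs_buf = np.zeros(len(s_hat))   # reference history for filtering by s_hat
    y_buf = np.zeros(len(s))        # control-output history for the true path
    e = np.zeros(len(x))

    for n in range(len(x)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = x[n]
        y = w @ x_buf                       # control signal sent to the loudspeaker

        y_buf = np.roll(y_buf, 1)
        y_buf[0] = y
        e[n] = d[n] - s @ y_buf             # residual at the error sensor

        xs_buf = np.roll(xs_buf, 1)
        xs_buf[0] = x[n]
        fx_buf = np.roll(fx_buf, 1)
        fx_buf[0] = s_hat @ xs_buf          # reference filtered by the path estimate

        w = w + mu * e[n] * fx_buf          # LMS update using the filtered reference
    return e, w
```

In a multichannel system the filtered-reference and update steps are repeated for every combination of reference sensor, secondary source and error sensor, which is where the computational load discussed above comes from.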
The multichannel implementation of the auxiliary-filter-based virtual-sensing (AF-VS) technique for active noise control applications is revisited and realized in this paper. Frequency-domain analysis based on random primary noise proves that the multichannel virtual-sensing active noise control (MVANC) technique can achieve optimal control at the desired virtual locations even if the signals at the physical and virtual microphones are not causally related. Further analysis of a number of sensor-actuator configurations shows that the MVANC technique achieves optimal control at the desired locations as long as the number of secondary sources does not exceed that of the physical error microphones. Furthermore, simulations with measured transfer functions and real-time experiments conducted on a four-channel system validate the frequency-domain analyses.
Active noise control (ANC) is a re-emerging technique to mitigate noise pollution. To reduce the noise power in large spaces, multiple channels are usually required, which complicates the implementation of ANC systems. In this paper, we separate the multichannel ANC problem into two subproblems, where the subproblem of computing the control filter is usually an underdetermined problem. Therefore, we could leverage the underdetermined system to simplify the ANC system without degrading the noise reduction performance. For a single incidence, we compare the conventional fully-coupled (pseudoinverse) multichannel control with the colocated (diagonal) control method and find that they can achieve equivalent performance, but the colocated control method is less computationally intensive. Furthermore, the underdetermined system presents an opportunity to control noise from multiple incidences with one common fixed filter. Both the full-rank and the overdetermined optimal control filters are realized. The performance of these control methods was analyzed numerically with the Finite Element Method (FEM) and the results validate the feasibility of the full-rank and overdetermined optimal control methods, where the latter could even offer more robust performance in more complex noise scenarios.
Active noise control (ANC) algorithms are widely applied in noise-canceling headphones. However, conventional ANC algorithms suffer from the high computational cost of high-order finite-impulse-response filters, which also leads to slower convergence. Subband ANC algorithms, which are based on the subband adaptive filter, are used to reduce computational complexity and to increase the convergence rate. Moreover, noise cancellation performance decreases at higher frequencies due to the shrinking control zone. In headphone-based applications, the perceptual quality of the noise attenuation and of the target audio signal are important factors, and the ANC algorithm must be designed to account for psychoacoustic parameters. In this paper, we propose a new noise reduction approach to address some problems of existing ANC headphones, especially high-frequency noise reduction and the perceptual quality of the playback sound. The proposed algorithm integrates psychoacoustic subband ANC with psychoacoustic masking. Psychoacoustic subband ANC achieves a lower computational cost than conventional subband ANC by using the psychoacoustic model to attenuate only audible noise components during music playback in the headphones. The proposed integrated system is efficient for broadband noise attenuation. Psychoacoustic ANC reduces noise in the most sensitive frequency range of the human hearing system, 0–4000 Hz, and masking is used to mask out residual and high-frequency noise. In addition, the proposed noise reduction system reduces only audible noise components. The integrated ANC-masking system applies psychoacoustic analysis without introducing artificial distortions to the music signal. Objective and subjective perceptual tests are conducted to compare the proposed ANC-masking system with other subband ANC and masking algorithms. Results show the advantage of the proposed integrated system in terms of perceptual sound quality and high-frequency noise reduction.
The head-related transfer function (HRTF) is an essential component of systems that create an immersive listening experience over headphones in multimedia, virtual reality, and augmented reality applications. A critical requirement is to measure the HRTFs of each individual to accommodate ear idiosyncrasies. Conventional static stop-and-go HRTF measurement methods are tedious and time-consuming. Recently proposed continuous HRTF acquisition methods could improve the acquisition efficiency, but they still restrict the movements of the subjects and must be conducted in