Xingjian Forum Academic Lecture
Time: 9:30 AM, Friday, June 8, 2018
Venue: Room T516, Xiangying Building, East Area, Main Campus
Title: Recent Advances in Deep Learning-Based Single-Channel Audio Separation Systems
Speaker: Yi Luo, Ph.D. student, Columbia University
Speaker Bio:
Yi Luo is currently a Ph.D. student at Columbia University. His research focuses on machine learning and deep learning systems for audio signal processing; his main work covers audio separation, speech enhancement, and speech dereverberation. He also works on building BMIs (Brain-Machine Interfaces) to establish links between machines and biological systems.
Abstract:
Single-channel audio separation has been an active research area for decades. Recently, deep learning-based systems have greatly advanced the state of the art on this problem. In most deep learning systems, separation is performed in the time-frequency (T-F) domain, where a T-F mask is estimated for each of the target sources. However, the use of T-F masks upper-bounds the system's performance and introduces difficulties in end-to-end separation. In this talk, I will first give an overview of several deep learning approaches in the T-F domain, address their main disadvantages, and then introduce the recently proposed Time-domain Audio Separation Network (TasNet), which separates the mixture directly in the time domain.
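As a rough illustration of the two paradigms mentioned in the abstract, the PyTorch sketch below contrasts T-F mask-based separation with a TasNet-style time-domain front-end. The layer choices, shapes, and the mask_net stand-in are assumptions made for exposition only; they show where the masking happens, not the speaker's actual models.

    import torch

    # Minimal illustrative sketch; hyper-parameters and modules are hypothetical.

    def tf_mask_separation(mixture, mask_net, n_fft=512, hop=128):
        """T-F masking: estimate one mask per target source on the mixture's spectrogram."""
        window = torch.hann_window(n_fft)
        spec = torch.stft(mixture, n_fft, hop_length=hop, window=window, return_complex=True)
        masks = mask_net(spec.abs())              # (n_sources, freq, frames), values in [0, 1]
        est_specs = masks * spec.unsqueeze(0)     # apply each mask to the complex mixture
        return torch.istft(est_specs, n_fft, hop_length=hop, window=window)

    class TimeDomainSeparator(torch.nn.Module):
        """TasNet-style idea: a learned 1-D conv encoder/decoder replaces the STFT,
        and masking is applied to the learned representation instead of a spectrogram."""

        def __init__(self, n_filters=256, kernel=20, stride=10, n_sources=2):
            super().__init__()
            self.n_sources = n_sources
            self.encoder = torch.nn.Conv1d(1, n_filters, kernel, stride=stride)
            self.mask_net = torch.nn.Sequential(  # stand-in for the actual separation module
                torch.nn.Conv1d(n_filters, n_filters * n_sources, 1),
                torch.nn.Sigmoid(),
            )
            self.decoder = torch.nn.ConvTranspose1d(n_filters, 1, kernel, stride=stride)

        def forward(self, mixture):               # mixture: (batch, 1, time)
            feats = torch.relu(self.encoder(mixture))             # (batch, n_filters, frames)
            masks = self.mask_net(feats).chunk(self.n_sources, dim=1)
            return [self.decoder(m * feats) for m in masks]       # one waveform per source

Because the time-domain model maps waveform in to waveform out, it can be trained end-to-end on a waveform-level objective without the mask-related upper bound discussed above.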
Host: Associate Professor Mengyao Zhu, School of Communication and Information Engineering
Faculty and students are all welcome to attend!