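Recently, the paper "SonicVisionLM: Playing Sound with Vision Language Models", by the team of Xie Zhifeng and Li Mengtian in the Department of Film and Television Engineering at the Shanghai Film Academy, was accepted to the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024 (CVPR), a top international conference in computer vision. It is the first time the Shanghai Film Academy has published a high-level academic paper at a top international computer vision conference as the first-affiliated institution, and it is the latest research result of the university's "Art and Technology" highland initiative.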
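CVPR is a top-tier conference in computer vision (CCF Class A). Every year it attracts submissions from many leading researchers worldwide; its accepted papers represent the latest research results in computer vision and point to the field's future directions. According to the latest Google Scholar citation statistics, CVPR has an h5-index of 389, ranking fourth among all publications worldwide (Nature ranks first), first among engineering and computer science publications, and first in the broad field of artificial intelligence.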
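The paper is the first to use AIGC technology to generate sound effects for film automatically, which greatly reduces the time and labor cost of film sound production and shortens the film production cycle. Specifically, the paper uses a vision-language model to build SonicVisionLM, a controllable sound-effect generation framework that automatically recognizes and generates a film's on-screen sound effects. It also provides a user interaction module with which sound designers can create and edit a film's off-screen sound effects and find creative inspiration. Technically, the work addresses two difficulties: synchronizing generated sound effects in time with the action on screen, and keeping generated sound effects highly consistent with the film's content. The result is a logical fusion of film content with on-screen sound effects, together with convenient editing of off-screen sound effects. The proposed method achieves state-of-the-art experimental results on both unconditional and conditional generation tasks.
The paper also contributes CondPromptBank, a public, high-quality sound-effect dataset, to the academic community. It covers 23 common sound-effect categories and 10,276 individual entries; each entry contains a high-quality sound-effect file of at most 10 seconds, the corresponding text, and timestamps. The paper, code, and dataset are available on the project page: https://yusiissy.github.io/SonicVisionLM.github.io/ (the project page shows sound-effect generation examples for the classic films "Titanic" and "Léon: The Professional").
The student first author is Yu Shengye, a second-year master's student in Digital Media Creative Engineering; the student second author is He Qile, a first-year master's student in Digital Media Creative Engineering.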
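To make the dataset description above concrete, here is a minimal Python sketch of how one CondPromptBank-style entry could be represented and sanity-checked. The class name, the field names, and the choice to store timestamps as (onset, offset) pairs in seconds are assumptions made for illustration; only the 10-second limit, the text, and the presence of timestamps come from the article, and the released dataset may be organized differently.

```python
from dataclasses import dataclass
from typing import List, Tuple

MAX_DURATION_S = 10.0  # from the article: each clip is at most 10 seconds long


@dataclass(frozen=True)
class SoundEntry:
    """One hypothetical dataset entry; all field names are illustrative assumptions."""
    audio_path: str                        # path to the sound-effect file
    category: str                          # one of the sound-effect categories
    text: str                              # text paired with the audio
    duration_s: float                      # clip length in seconds
    timestamps: List[Tuple[float, float]]  # assumed (onset, offset) pairs, in seconds


def validate(entry: SoundEntry) -> List[str]:
    """Return a list of problems; an empty list means the entry looks consistent."""
    problems = []
    if not 0.0 < entry.duration_s <= MAX_DURATION_S:
        problems.append("duration must be in (0, 10] seconds")
    for onset, offset in entry.timestamps:
        # Each event must start before it ends and lie inside the clip.
        if not 0.0 <= onset < offset <= entry.duration_s:
            problems.append(f"bad timestamp ({onset}, {offset})")
    return problems
```

A check of this kind is useful when loading any timestamped audio collection, because a timestamp that falls outside the clip would silently corrupt a time-conditioned training signal.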

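SonicVisionLM overview: the blue part of the figure shows the on-screen sound generation pipeline. First, a silent video is fed into a vision-language model to obtain a text description of the sound. Next, a visual network processes the video to capture the timestamps of sound events. Finally, these two conditions are fed into a diffusion model, which generates on-screen sound effects that match the content on screen. The purple part shows how users create and edit off-screen sound effects.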
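The three-stage flow in the caption can be sketched as plain function composition. This is only an illustration of the data flow, not the authors' implementation: every name below is hypothetical, and the three stages are passed in as callables with toy stand-ins taking the place of the vision-language model, the visual network, and the diffusion model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Video = List[str]                       # placeholder: a silent video as per-frame labels
Timestamps = List[Tuple[float, float]]  # assumed (onset, offset) pairs, in seconds


@dataclass
class Conditions:
    """The two conditions the caption says are passed to the diffusion model."""
    text: str
    timestamps: Timestamps


def generate_onscreen_sound(
    video: Video,
    describe: Callable[[Video], str],              # stage 1: vision-language model
    detect_events: Callable[[Video], Timestamps],  # stage 2: visual network
    synthesize: Callable[[Conditions], list],      # stage 3: conditional generator
) -> list:
    text = describe(video)
    timestamps = detect_events(video)
    return synthesize(Conditions(text, timestamps))


# Toy stand-ins so the sketch runs end to end (assumptions, not real models).
def toy_describe(video: Video) -> str:
    return "dog barking"


def toy_detect(video: Video, fps: float = 1.0) -> Timestamps:
    # Mark every frame labelled "bark" as one sound event lasting one frame.
    return [(i / fps, (i + 1) / fps) for i, frame in enumerate(video) if frame == "bark"]


def toy_synthesize(cond: Conditions) -> list:
    # Stand-in "audio track": one (onset, offset, text) tuple per event.
    return [(onset, offset, cond.text) for onset, offset in cond.timestamps]


track = generate_onscreen_sound(
    ["idle", "bark", "idle", "bark"], toy_describe, toy_detect, toy_synthesize
)
print(track)
```

Keeping the text and timestamp conditions in one explicit object mirrors the caption's point that the generator receives both what the sound is and when it happens.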
About the faculty team:
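Xie Zhifeng, Ph.D. in Engineering, is an associate professor and doctoral supervisor in the Department of Film and Television Engineering at the Shanghai Film Academy and at the Shanghai Engineering Research Center of Motion Picture Special Effects, and a member of the Film High-Tech Professional Committee of the China Society of Motion Picture and Television Engineers. Xie's research focuses on computer graphics, computer vision, and advanced film technology. Xie has led more than 10 projects at various levels, funded by the National Natural Science Foundation of China (NSFC), the Science and Technology Commission of Shanghai Municipality (science and technology innovation), the Shanghai Municipal Education Commission (research innovation), and industry partners, and has participated in several national-level projects, including 973 Program, 863 Program, and NSFC Key and General projects. Xie has published more than 40 high-level papers, over 30 of them indexed by SCI/EI (including 10 papers in top international journals and conferences), has published one monograph, and has applied for 17 patents and software copyrights. Awards include the Second Prize of the 2014 Shanghai Science and Technology Progress Award, the university's 2017 Choi Koon-shum Outstanding Young Teacher Award, the Best Paper Award at the 2022 China Computer Graphics Conference, and the Best Paper Award at the CAD/Graphics 2023 international conference. Xie was previously a visiting scholar in the Department of Computer Science at City University of Hong Kong.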
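Li Mengtian, Ph.D. in Engineering with postdoctoral experience, is a lecturer and master's supervisor at the Shanghai Film Academy. Li serves as an executive member of the Technical Committee on Computer-Aided Design and Graphics of the China Computer Federation, a member of the technical committees on Digital Entertainment and Intelligent Generation and on Digital Entertainment and Simulation of the China Society of Image and Graphics, and an executive member of GAMES, the online platform for computer graphics and mixed reality. Li's main research areas are computer vision and computer graphics. Li has participated in research projects including NSFC Major and General projects, a major social science project, and major projects of the Science and Technology Commission of Shanghai Municipality and the Shanghai Municipal Commission of Economy and Informatization. Li received the Best Paper Award at the CAD/Graphics 2023 international conference, has published multiple papers as first or corresponding author in top international computer science journals and conferences, including CVPR, ECCV, and PR, and serves as a reviewer for top computer vision conferences and journals, including CVPR, ICCV, ECCV, ICLR, ICML, NeurIPS, AAAI, TIP, TCSVT, and PR.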