| Item type |
Conference Paper(1) |
| Date made public |
2025-08-08 |
| Title |
|
|
Title |
Towards Artwork Explanation in Large-scale Vision Language Models |
| Language |
|
|
Language |
eng |
| Resource type |
|
|
Resource type |
conference paper |
| Access rights |
|
|
Access rights |
open access |
| Authors |
Hayashi, Kazuki
Sakai, Yusuke
上垣外, 英剛
Hayashi, Katsuhiko
渡辺, 太郎
|
| Abstract |
|
|
Description type |
Abstract |
|
Description |
Large-scale Vision-Language Models (LVLMs) output text from images and instructions, demonstrating advanced capabilities in text generation and comprehension. However, it has not been clarified to what extent LVLMs understand the knowledge necessary for explaining images, the complex relationships between various pieces of knowledge, and how they integrate these understandings into their explanations. To address this issue, we propose a new task: the artwork explanation generation task, along with its evaluation dataset and metric for quantitatively assessing the understanding and utilization of knowledge about artworks. This task is apt for image description based on the premise that LVLMs are expected to have pre-existing knowledge of artworks, which are often subjects of wide recognition and documented information. It consists of two parts: generating explanations from both images and titles of artworks, and generating explanations using only images, thus evaluating the LVLMs’ language-based and vision-based knowledge. Alongside, we release a training dataset for LVLMs to learn explanations that incorporate knowledge about artworks. Our findings indicate that LVLMs not only struggle with integrating language and visual information but also exhibit a more pronounced limitation in acquiring knowledge from images alone. The datasets ExpArt=Explain Artworks are available at https://huggingface.co/datasets/naist-nlp/ExpArt |
| Bibliographic information |
en : Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics
p. 705-729,
Issue date 2024-08-11
|
| Conference information |
|
|
|
Conference name |
The 62nd Annual Meeting of the Association for Computational Linguistics |
|
|
Start year |
2024 |
|
|
Start month |
08 |
|
|
Start day |
11 |
|
|
End year |
2024 |
|
|
End month |
08 |
|
|
End day |
16 |
|
|
Conference dates |
2024-08-11 - 2024-08-16 |
|
|
Venue |
Bangkok, Thailand |
|
Country |
THA |
| Publisher |
|
|
Publisher |
Association for Computational Linguistics |
| Publisher version DOI |
|
|
Relation type |
isReplacedBy |
|
|
Identifier type |
DOI |
|
|
Related identifier |
https://doi.org/10.18653/v1/2024.acl-short.65 |
| Publisher version URI |
|
|
Relation type |
isReplacedBy |
|
|
Identifier type |
URI |
|
|
Related identifier |
https://aclanthology.org/2024.acl-short.65/ |
| Rights |
|
|
Rights information resource |
https://creativecommons.org/licenses/by/4.0/ |
|
Rights information |
© 2024 Association for Computational Linguistics. ACL materials are Copyright © 1963–2025 ACL; other materials are copyrighted by their respective copyright holders. Materials prior to 2016 here are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 International License. Permission is granted to make copies for the purposes of teaching and research. Materials published in or after 2016 are licensed on a Creative Commons Attribution 4.0 International License. |
| Author version flag |
|
|
Publication type |
NA |
| Funding information |
|
|
|
Funder name |
Japan Society for the Promotion of Science (JSPS) |
|
|
Grant number |
JP21K17801 |
|
|
Grant number URI |
https://kaken.nii.ac.jp/grant/KAKENHI-PROJECT-21K17801/ |
|
|
Project title |
A coreference resolution method that explicitly estimates coreference clusters and is robust to antecedent analysis errors |
| Funding information |
|
|
|
Funder name |
Japan Society for the Promotion of Science (JSPS) |
|
|
Grant number |
JP23H03458 |
|
|
Grant number URI |
https://kaken.nii.ac.jp/grant/KAKENHI-PROJECT-23K28148/ |
|
|
Project title |
Research on general-purpose natural language generation models with incremental knowledge expansion