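Speaker: Jacek Mańdziuk
Affiliation: Warsaw University of Technology
Title: Successes and Limitations of Artificial Intelligence in Abstract Visual Reasoning
Time: 09:30, Friday, September 26, 2025
Location: Room 290, Arts and Science Building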
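Abstract:
Abstract Visual Reasoning (AVR) covers a family of tasks that require discovering, through analogy-making, the common concept underlying a set of images, much as humans do when solving IQ test items. This talk outlines the main types of AVR problems and possible approaches to solving them. The second part of the talk examines Bongard Problems (BPs), a core challenge in AVR, mainly because they require visual reasoning to be combined with verbal description. Specifically, it asks a key question: are multimodal large language models (MLLMs), which are inherently designed to fuse vision and language, capable of tackling Bongard Problems? To answer it, the talk presents and analyses experimental results of current state-of-the-art MLLMs on Bongard Problems built from either synthetic or real-world images; these results reveal significant limitations of contemporary models in abstract visual reasoning.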
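Biography:
Prof. Jacek Mańdziuk holds Ph.D. and D.Sc. degrees (the D.Sc. is a senior academic degree in Poland and some other European countries, typically awarded for outstanding research achievements). He is a full professor at the Faculty of Mathematics and Information Science, Warsaw University of Technology, where he heads the Division of Artificial Intelligence and Computational Methods, and a full research professor at AGH University of Krakow. He is an IEEE (Institute of Electrical and Electronics Engineers) Senior Member, served as General Co-Chair of the 2021 IEEE Congress on Evolutionary Computation in Krakow, Poland, and chaired the annual IEEE SSCI Symposium on Computational Intelligence for Human-like Intelligence from 2013 to 2023. He received the Fulbright Senior Research Award, under which he conducted research at UC Berkeley and ICSI Berkeley in the USA, and the Robert Schuman Foundation Fellowship, under which he visited CNRS in Besançon, France. He is also the Founding Chair of the IEEE ETTC (Emerging Technologies and Trends Committee) Task Force on Toward Human-like Intelligence. He has given invited talks at conferences and universities around the world. To date, he has published 3 books and over 200 research papers, many of them in leading journals and top-tier conferences on artificial intelligence and machine learning. In 2024 he was elected a Member of the Polish Academy of Sciences. His research interests include the application of computational intelligence and artificial intelligence methods to dynamic and bilevel optimization problems, abstract visual reasoning, games (such as game AI), and human-machine cooperation; he also works on the development of human-like learning and problem-solving methods.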
Academic Report Notice of Jacek Mańdziuk: Successes and Limitations of Artificial Intelligence in Abstract Visual Reasoning
Speaker: Jacek Mańdziuk
Title: Successes and Limitations of Artificial Intelligence in Abstract Visual Reasoning
Time: 9:30 AM, September 26, 2025 (Friday)
Location: Room 290, Arts and Science Building
Abstract:
Abstract Visual Reasoning (AVR) involves a suite of tasks that require the ability to discover the common concepts underlying a set of images through an analogy-making process, similar to how humans solve IQ tests. This talk will summarize the main types of AVR problems along with possible solution approaches. The second part of the talk will delve into Bongard Problems (BPs), which pose a fundamental AVR challenge, mainly because they require combining visual reasoning with verbal description. In particular, the talk will ask whether multimodal large language models (MLLMs), inherently designed to combine vision and language, are capable of tackling BPs. To answer this question, the results of applying state-of-the-art MLLMs to BPs (composed of either synthetic or real-world images) will be presented and analysed, revealing significant AVR limitations of contemporary models.
Personal Introduction:
Prof. Jacek Mańdziuk, Ph.D., D.Sc., is a full professor at the Faculty of Mathematics and Information Science, Warsaw University of Technology, and Head of the Division of Artificial Intelligence and Computational Methods. He is also a full research professor at AGH University of Krakow. He is an IEEE Senior Member and was General Co-Chair of the 2021 IEEE Congress on Evolutionary Computation, Krakow, Poland, and Chair of the annual IEEE SSCI Symposium on Computational Intelligence for Human-like Intelligence from 2013 to 2023. He is a recipient of the Fulbright Senior Research Award (UC Berkeley and ICSI Berkeley, USA) and the Robert Schuman Foundation Fellowship (CNRS, Besançon, France). He is the Founding Chair of the IEEE ETTC Task Force on Toward Human-like Intelligence. He has presented his research in numerous invited conference and university talks worldwide. He has published 3 books and over 200 research papers, many in leading journals and top-tier conferences on artificial intelligence and machine learning. In 2024 he was elected a Member of the Polish Academy of Sciences. His research interests include the application of computational intelligence and artificial intelligence methods to dynamic and bilevel optimization problems, abstract visual reasoning, games, and human-machine cooperation. He is also interested in the development of human-like learning and problem-solving methods.