Keynote Speakers

Kang Zhang

Computational Media and Arts, Hong Kong University of Science and Technology (Guangzhou), China

Bio: Kang Zhang is Acting Head and Professor of Computational Media and Arts, Information Hub, Hong Kong University of Science and Technology (Guangzhou), Professor of the Division of Emerging Interdisciplinary Areas, HKUST, and Professor Emeritus of Computer Science, The University of Texas at Dallas. He was a Fulbright Distinguished Chair and an ACM Distinguished Speaker, and has held academic positions in China, the UK, Australia, and the USA. Zhang's current research interests include computational aesthetics, visual languages, and generative art and design; he has published 8 books and over 120 journal papers in these areas. He has delivered keynotes at art and design, computer science, and management conferences, and is on the editorial boards of Journal of Big Data, The Visual Computer, Journal of Visual Language and Computing, International Journal of Software Engineering and Knowledge Engineering, International Journal of Advanced Intelligence, Visual Computing for Industry, Biomedicine, and Art, and Chinese Journal of Software.

Speech Title: Realistic and Immersive Experiences in Massive Open Metaverse Courses (MOMC)

Abstract: Much effort has been made in using virtual reality (VR) technology to support massive open online course (MOOC) environments. This talk briefly reviews the latest research on VR/AR/XR applications in education and discusses how immersive virtual educational experiences can be achieved. We then introduce the new concept of the Massive Open Metaverse Course (MOMC), which combines MOOC and Metaverse and utilizes the latest volumetric video technology. We offer our vision of dual-campus online education for HKUST 2.0, with a real case study, the President’s First Lecture, under development at the Guangzhou campus. This is the world’s first true MOMC environment, providing immersive and realistic virtual and augmented reality experiences to both teachers and learners.

Song Wang

University of South Carolina, USA

Bio: Song Wang received the PhD degree in electrical and computer engineering from the University of Illinois at Urbana-Champaign in 2002. He is a professor and the director of the Computer Vision Lab at the Department of Computer Science and Engineering, University of South Carolina. His research interests include computer vision, image processing, and machine learning. He has published over 200 papers in related journals and conferences, including TPAMI, IJCV, TIP, CVPR, ICCV, ECCV, AAAI, etc., with ~7800 Google Scholar citations. He is serving as an associate editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), IEEE Transactions on Multimedia (TMM), and Pattern Recognition Letters.

Speech Title: Collaborative Video Analysis of Multiple Moving Cameras

Abstract: Video collection and analysis is a primary way to accurately capture and understand human actions and activities, which plays an important role in many vision and graphics tasks. Compared with a single fixed camera, multiple moving cameras, such as wearable cameras and cameras mounted on flying drones, can significantly expand the covered field of view and provide more information for video analysis, especially in outdoor crowd scenes with multiple people. In this talk, I will introduce several new challenges, as well as our recent efforts, in exploring videos collected by multiple moving cameras, including calibration and synchronization, cross-video human association and tracking, and multi-granularity activity analysis and relation understanding.
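As an illustration of the synchronization challenge mentioned above, a common heuristic (not necessarily the speaker's method) is to cross-correlate per-frame motion-magnitude signals extracted from two unsynchronized cameras and take the peak of the correlation as the time offset; a minimal sketch:

```python
import numpy as np

def estimate_time_offset(sig_a, sig_b):
    """Estimate the frame offset between two cameras by cross-correlating
    per-frame motion-magnitude signals. A positive result means sig_b
    leads sig_a by that many frames (sig_b[n] aligns with sig_a[n + lag])."""
    a = (sig_a - sig_a.mean()) / (sig_a.std() + 1e-8)
    b = (sig_b - sig_b.mean()) / (sig_b.std() + 1e-8)
    corr = np.correlate(a, b, mode="full")
    # Re-center the peak index so that 0 means "already aligned".
    return int(np.argmax(corr)) - (len(b) - 1)
```

In practice the per-frame signals might come from optical-flow magnitude or audio energy; the same peak-finding idea applies.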

Bin Sheng

Shanghai Jiao Tong University, China

Bio: Prof. Bin Sheng received the Ph.D. degree in computer science from The Chinese University of Hong Kong. His current research interests include image-based rendering, machine learning, virtual reality, and computer graphics. He is an associate editor of IEEE Trans. CSVT, The Visual Computer, and other SCI journals. He has also served as an Associate Editor of Virtual Reality and Intelligent Hardware. He was invited to be a Program Committee co-chair of the CGI 2021 and CGI 2022 conferences. He has published 98 papers in important SCI journals in this field, such as Nature Communications, IEEE Transactions on Medical Imaging, and Medical Image Analysis. He has published more than 60 papers in important conferences such as MICCAI, ICCV, ACM Multimedia, and IEEE VR. He has written 3 textbooks and 1 academic monograph.

Speech Title: Deep Learning-based Parallel Rendering for Metaverse

Abstract: The metaverse is a shared virtual environment that people access via the Internet. The metaverse features large-scale scenes and requires substantial computation to offer an infinite universe and an immersive experience, which makes parallel rendering important for user interaction. Current parallel rendering methods mainly target ultra-high-resolution image output and ignore interactivity. To close the gap between timing performance and rendering quality, we propose a multi-modal view-frustum prediction method that predicts the future orientation and location of the view-frustum, which can be used to load data in advance. We combine frame, user-input, and object features to obtain a robust view-frustum movement prediction that can handle complicated metaverse situations. In experiments, compared with baselines, the proposed method enhances the interactivity of parallel rendering.
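The abstract does not specify the predictor's architecture, but the fuse-then-predict idea can be sketched as follows; the feature dimensions, the linear predictor, and the pose parameterization here are illustrative assumptions, not the proposed method:

```python
import numpy as np

# Hypothetical feature dimensions; the actual method's inputs are not
# specified in the abstract.
FRAME_DIM, INPUT_DIM, OBJECT_DIM = 64, 8, 32
POSE_DIM = 7  # xyz position + quaternion orientation of the view-frustum

rng = np.random.default_rng(42)
# Stand-in for learned weights of a pose-delta regressor.
W = rng.standard_normal((POSE_DIM, FRAME_DIM + INPUT_DIM + OBJECT_DIM)) * 0.01

def predict_frustum(frame_feat, input_feat, object_feat, current_pose):
    """Predict the next view-frustum pose from fused multi-modal features.

    A minimal linear sketch: concatenate the three feature streams, regress
    a pose delta, and apply it to the current pose. A real system would use
    a learned sequence model; the prediction lets the renderer load scene
    data for the future frustum in advance.
    """
    fused = np.concatenate([frame_feat, input_feat, object_feat])
    delta = W @ fused
    next_pose = current_pose + delta
    # Re-normalize the quaternion part so it stays a valid rotation.
    q = next_pose[3:]
    next_pose[3:] = q / np.linalg.norm(q)
    return next_pose
```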

Jing Qin

The Hong Kong Polytechnic University, China

Bio: QIN, Jing (Harry) is currently an associate professor in the Centre for Smart Health, School of Nursing, The Hong Kong Polytechnic University. His research focuses on creatively leveraging advanced virtual/augmented reality (VR/AR) and artificial intelligence (AI) techniques in healthcare and medicine applications, and his achievements in relevant areas have been well recognized by the academic community. He won the Hong Kong Medical and Health Device Industries Association Student Research Award for his PhD study on VR-based simulation systems for surgical training and planning. He has won 5 best paper awards for his research on AI-driven medical image analysis and computer-assisted surgery. He has served as a local organization chair for MICCAI 2019, a program committee member for AAAI, IJCAI, MICCAI, etc., a speaker at many conferences, seminars, and forums, and a referee for many prestigious journals in relevant fields.

Speech Title: Enhancing Representation Capability of Deep Learning Models for Medical Image Analysis under Limited Training Data

Abstract: Deep learning has achieved remarkable success in various medical image analysis tasks. Whether in the past, the present, or the foreseeable future, one of the main obstacles that prevents deep learning models from being successfully developed and deployed in clinical settings is the scarcity of training data. In this talk, we shall review, as well as rethink, our long experience in investigating how to enhance the representation capability of deep learning models to achieve satisfactory performance under limited training data. Based on our experience, we attempt to identify and sort out the evolution trajectory of applying deep learning to medical image analysis, which in some ways reflects the development path of deep learning itself beyond the context of our specific applications. The models we developed, at least in our experience, are both effects and causes: effects of the clinical challenges we faced and the technical frontiers at the time; causes, if they are really useful and inspiring, of subsequent more advanced models capable of addressing their limitations. In the end, by rethinking this evolution, we can identify some future directions that deserve further study.

Johan F. Hoorn

The Hong Kong Polytechnic University, China

Bio: Prof. dr. dr. Johan F. Hoorn (D.Litt., D.Sc.) is an interfaculty full professor of Social Robotics in the Department of Computing and the School of Design at The Hong Kong Polytechnic University. Prof. Hoorn holds two PhD degrees: one in Computer Science and one in General Literature. Apart from Design and Computing, he has worked in Life Sciences, Social Science, and the Humanities. From 2007 to 2012, he was Managing Director of the Center for Advanced Media Research Amsterdam (CAMeRA) at VU University, and from 2010 to 2011, he was Director of Research and Education, Co-founder, and a member of the Executive Board of THNK, the Amsterdam School of Creative Leadership.

Prof. Hoorn’s work became globally renowned through his “Alice,” a small child-like robot placed in the homes of three older adults to ease their loneliness, with unexpectedly positive results. This field experiment was recorded in the international prize-winning documentary Alice Cares and made it to The Lancet. In assessing Extended Reality media, his current ventures into quantum computing, modeling the vagueness and ambiguity of affect, reflection, virtuality, and reality, were recently published in Scientific Reports.

Speech Title: Coexistence and Consistence of the Virtual and the Physical World: Avatars and Robots

Abstract: Extended reality (XR) is a catch-all term for Virtual Reality (VR), Augmented Reality (AR), and Mixed Reality (MR) environments and creatures. Design effort is invested in user interaction with the virtual world as a subset and representation of all elements that people regard as the real world. Algorithms address geometry, the lighting of a face or scene, vague or sharp shadows, and rough or soft materials, under the assumption that perceived realism will be improved and ‘thus’ the overall user experience. Although reality, realism, and realistic rendering are highly abstruse conceptually, researchers and developers delve into a Blinn-Phong model to enhance facial skin reflectance, or represent lighting as the sum of the sun’s angle and an ambient environment map to determine the incoming light direction that decides which environment-map pixel illuminates a particular 3D point, yet with little concern for what it is they are enhancing from a perceptual viewpoint other than their own. For one thing, human perception of 3D is but a heuristic for depth in a 2D projection on the retina. In this keynote address, I go deeply into the quest of taking the virtual for real, my case in point being avatars (VR and AR) and robots (MR) as exemplifications of XR. The area of missed signals and false alarms is where the virtual and the physical coexist and coincide, depending on the observer’s belief system. We explore a knowledge theory of virtuality to comprehend how we know that we are in the virtual or the real or mixes thereof (MR). Instead of using crude cut-off points like a Neyman-Pearson lemma, likelihood-ratio test, or Bayesian t, I propose to model the ambiguity and vagueness of being in the virtual and the real (V⊆R) as quantum probabilities of reality as partially imagined (Im) in the complex coordinate space ℂ3.
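For readers unfamiliar with the Blinn-Phong model the abstract refers to, a minimal sketch of its shading equation for one surface point under a single directional light (the coefficient values are illustrative, not tied to any particular renderer):

```python
import numpy as np

def normalize(v):
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

def blinn_phong(normal, light_dir, view_dir,
                kd=0.8, ks=0.4, shininess=32.0, ambient=0.1):
    """Scalar Blinn-Phong shading at one surface point: an ambient term plus
    diffuse and specular contributions from one directional light (e.g. the
    sun). The specular term uses the half-vector between light and view
    directions, which is what distinguishes Blinn-Phong from plain Phong."""
    n = normalize(normal)
    l = normalize(light_dir)
    v = normalize(view_dir)
    h = normalize(l + v)  # half-vector
    diffuse = kd * max(float(n @ l), 0.0)
    specular = ks * max(float(n @ h), 0.0) ** shininess
    return ambient + diffuse + specular
```

With the light and viewer both along the normal, all three terms contribute fully (0.1 + 0.8 + 0.4 = 1.3); as the light moves toward grazing, the diffuse and specular terms fall away and only the ambient floor remains.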

Copyright © ACM SIGGRAPH VRCAI 2022. All rights reserved.