Prof. Zhenzhong Wei
Beihang University, China
Wei Zhenzhong is a professor and PhD supervisor in the
School of Instrumentation Science and Opto-electronics
Engineering of Beihang University. He is a distinguished
professor of the Chang Jiang Scholars Program and an
awardee of the National Science Fund for Distinguished
Young Scholars.
Prof. Wei received his Ph.D. from Beihang
University in 2003. His research focuses on computer
vision, particularly in image processing and pattern
recognition, position and orientation measurement and
object tracking. Among his contributions to the field, he proposed new methods for flexible on-site calibration and extended the field-calibration system for visual measurement. He has led
more than ten national projects and published 36 papers
in journals indexed by the Science Citation Index. He
also holds 32 patents. He has won two second prizes of
the State Technological Invention Award and four
provincial or ministry-level prizes.
Prof. Limei Song
Tiangong University, China
Prof. Limei Song received her bachelor's, master's, and doctoral degrees from Tianjin University in 1999, 2001, and 2004, respectively. She spent a year as a visiting scholar at
Tsinghua University from 2016 to 2017. She was selected
as a first-level talent of Tianjin's "131" talent program. She is an executive member of the Tianjin Artificial Intelligence Council and a member of the Tianjin Robotics Council. She
has won two second prizes and two third prizes of the Tianjin Science and Technology Progress Award. She won the first prize
of the second Tianjin "Haihe Talents" Post-doctoral
Innovation Group and the National Bronze Prize of the
first National Post-doctoral Innovation and
Entrepreneurship Competition. She was awarded the
Tianjin May 1 Labor Medal, the Tianjin "March 8 Red-Banner Pacesetter" title, and other honors. Her
research interests are image processing, pattern
recognition, detection technology and automation
devices, and artificial intelligence. She has led two National Natural Science Foundation of China projects and more than 10 provincial research projects.
Speech Title: 3D Vision-Guided Robot Intelligent
Processing Technology and Applications
Abstract: Robots have been widely used in various
industries in the national economy. 3D vision inspection
systems are the high-precision three-dimensional "eyes"
of robots, which can quickly obtain accurate 3D data of
target scenes and guide robots to carry out intelligent
processing and manufacturing. This report will introduce
the application of the team's self-developed monocular
and binocular 3D precision vision inspection systems in
industries such as end-of-life vehicle dismantling, 3D
intelligent polishing of shoe lasts, and casting
polishing. The 3D vision guides the heavy-duty robotic
arm to automatically complete the positioning and
grasping of end-of-life vehicles, improving the accuracy
and efficiency of grasping. The data from 3D vision imaging enable automatic path and trajectory planning that guides the robot in intelligent polishing of shoe lasts, which not only eliminates manual robot programming but also automatically adapts to shoe lasts of different sizes and shapes. In casting polishing, 3D vision guides the robot while an intelligent separation algorithm separates the target area to be polished from the main casting, enabling precise and efficient polishing. 3D
vision-guided robotic intelligent processing technology
can avoid the impact of highly dangerous, noisy, and
polluting work on human life and health, and improve
processing efficiency and processing quality in the
manufacturing industry.
Prof. Haiyan Li
Yunnan University, China
Haiyan Li, Ph.D., is a professor and doctoral supervisor at the School of Information Science and Engineering, Yunnan University, China. She was selected for the Yunnan Ten Thousand Talents Plan as a "Famous Yunling Teacher". She has led five NSFC and provincial projects and published more than 70 papers, of which more than 60 are SCI- or EI-indexed, as well as five textbooks and monographs. She holds more than 10 patents and software copyrights and has received more than 120 international, national, and provincial teaching awards.
Speech Title: Resampling-based Cost Loss Attention
Network for Explainable Imbalanced Diabetic Retinopathy
Grading
Abstract: Diabetic retinopathy (DR) is considered to
be one of the most common diseases that cause blindness
currently. However, DR grading methods are still
challenged by the presence of imbalanced class
distributions, small lesions, low accuracy on classes with few samples, and poor explainability. To address these
issues, a resampling-based cost loss attention network
for explainable imbalanced diabetic retinopathy grading
is proposed. Firstly, the progressively-balanced
resampling strategy is put forward to create balanced training data by mixing two sets of samples obtained
from instance-based sampling and class-based sampling.
Subsequently, a neuron and normalized channel-spatial
attention module (Neu-NCSAM) is designed to learn the
global features with 3-D weights and apply a weight
sparsity penalty to the attention module to suppress
irrelevant channels or pixels, thereby capturing
detailed small lesion information. Thereafter, a
weighted loss function of the Cost-Sensitive (CS)
regularization and Gaussian label smoothing loss, called
cost loss, is proposed to intelligently penalize the
incorrect predictions and thus improve the grading accuracy for classes with few samples. Finally, the
Gradient-weighted Class Activation Mapping (Grad-CAM) is
performed to acquire the localization map of the
questionable lesions in order to visually interpret and
understand the effect of our model. Comprehensive
experiments are carried out on two public datasets, and
the subjective and objective results demonstrate that
the proposed network outperforms the state-of-the-art
methods, achieving the best DR grading results with
83.46%, 60.44%, 65.18%, 63.69% and 92.26% for Kappa,
BACC, MCC, F1 and mAUC, respectively.
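The progressively-balanced resampling idea above can be sketched as an interpolation between instance-based sampling (probabilities proportional to class frequency) and class-based sampling (uniform over classes); the function name, the mixing schedule, and the toy class counts below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def progressively_balanced_weights(class_counts, t):
    """Per-class sampling probabilities that interpolate between
    instance-based sampling (t=0, proportional to class frequency)
    and class-based sampling (t=1, uniform over classes).
    In practice t would ramp from 0 to 1 over training epochs."""
    counts = np.asarray(class_counts, dtype=float)
    instance_p = counts / counts.sum()                  # frequency-proportional
    class_p = np.full_like(counts, 1.0 / len(counts))   # uniform over classes
    return (1.0 - t) * instance_p + t * class_p

# Hypothetical imbalanced five-grade DR label distribution.
p0 = progressively_balanced_weights([700, 150, 100, 30, 20], t=0.0)
p1 = progressively_balanced_weights([700, 150, 100, 30, 20], t=1.0)
```

At t=0 the majority grade dominates the sampler exactly as in the raw data, while at t=1 every grade is drawn with equal probability 1/5.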
Assoc. Prof. Zhen Ye
Chang'an University, China
She received the B.S. degree in electronic and information engineering, and the M.S. and Ph.D. degrees in information and communication engineering, from Northwestern Polytechnical University, China, in 2007, 2010, and 2015, respectively. She also spent one year, from September 2011 to October 2012, as a joint-training Ph.D. student at Mississippi State University, USA. She
is currently an Associate Professor with the School of
Electronics and Control Engineering, Chang’an
University, Xi’an. Her research interests include remote
sensing, pattern recognition and machine learning.
Speech Title: Multi-Scale Spatial-Spectral Feature
Extraction Based on Dilated Convolution for
Hyperspectral Image Classification
Abstract: Convolutional neural networks have
garnered increasing interest for the supervised
classification of hyperspectral imagery. However, images
with a wide variety of spatial land-cover sizes can
hinder the feature-extraction ability of traditional
convolutional networks. Consequently, many approaches
intended to extract multiscale features have emerged;
these techniques typically extract features in multiple
parallel branches using convolutions of differing kernel
sizes with concatenation or addition employed to fuse
the features resulting from the various branches. In
contrast, the present work explores a multiscale
spatial-spectral feature-extraction network that
operates in a more granular manner. Specifically, in the
proposed network, a dual-branch structure expands the
convolutional receptive fields, applying dense connection
and cascaded strategy for spectral and spatial
multi-scale feature extraction, respectively. The
experimental results show that the classification performance of our method surpasses that of several state-of-the-art methods, even under small-sample-size (SSS) situations.
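As a rough intuition for why dilation widens the receptive field without extra parameters, the span covered by a single dilated convolution can be computed directly; the helper below is an illustrative sketch, not the network described in the talk.

```python
def dilated_receptive_field(kernel_size, dilation):
    """Span covered by one dilated convolution along one axis:
    (dilation - 1) gaps are inserted between kernel taps, so the
    kernel reaches further while its parameter count stays fixed."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

# A 3-tap kernel at dilation rates 1, 2, 3 spans 3, 5, and 7 pixels,
# which is how parallel branches with different rates capture
# multi-scale spatial context at constant cost.
spans = [dilated_receptive_field(3, d) for d in (1, 2, 3)]
```

Cascading such branches, as in the dual-branch structure above, compounds these spans while dense connections reuse the intermediate features.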
Assoc. Prof. Ioannis Ivrissimtzis
Durham University, UK
Ioannis Ivrissimtzis is an Associate Professor at the
Department of Computer Science at Durham University. His
research has contributed to the areas of subdivision
surfaces, surface reconstruction, and digital 3D
watermarking and steganalysis. His recent research
contributions include work in the area of applied
machine learning, tackling problems such as wind turbine
early fault diagnostics, blind source separation in GNSS
time series, and face anti-spoofing.
Speech Title: Race Bias Analysis in Face
Anti-spoofing
Abstract: In recent years, the study of bias in
Machine Learning has received considerable research
attention. In this talk, we propose the use of a set of
statistical methods for the systematic study of race
bias and present a case study based on a VQ-VAE face
anti-spoofing algorithm. The main characteristics of the
case study are: the focus is on analysing bias in bona
fide errors, where significant ethical issues lie; the
analysis is not restricted to the final binary
classification outcomes, but also covers the
classifier's scalar responses and the latent space; the
threshold determining the classifier’s operating point
is considered variable. The results show that race bias
does not always come from differences in the mean
responses of the various populations. Instead, it can be
better understood as the combined effect of several
possible statistical characteristics of their
distributions: different means; different variances;
bimodal behaviour; existence of outliers. Joint work
with Latifah Abduh.
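To give a flavour of the threshold-dependent analysis, bona fide error rates per demographic group can be computed at a variable operating point; the function and the score values below are hypothetical illustrations, not the case study's actual data.

```python
import numpy as np

def bona_fide_error_per_group(scores_by_group, threshold):
    """Fraction of genuine (bona fide) samples in each group whose
    score falls below the acceptance threshold, i.e. is wrongly
    rejected, at a given operating point."""
    return {group: float(np.mean(np.asarray(s) < threshold))
            for group, s in scores_by_group.items()}

# Two hypothetical groups with equal mean scores but different
# variances: at the same threshold, the higher-variance group
# suffers more bona fide rejections, illustrating how bias can
# arise without any difference in means.
scores = {"A": [0.70, 0.75, 0.80, 0.85], "B": [0.50, 0.70, 0.90, 1.00]}
errors = bona_fide_error_per_group(scores, threshold=0.55)
```

Sweeping the threshold and repeating this computation traces how the gap between groups varies with the operating point, which is the kind of question the talk's statistical analysis addresses.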
Assoc. Prof. Md Baharul Islam
American University of Malta, Malta
Dr. Md Baharul Islam is an Associate Professor in
Computer Science at the American University of Malta
(AUM), Malta, and an Adjunct Professor of Computer
Engineering at Bahcesehir University, Istanbul, Turkey.
Before joining the AUM, he was a Postdoctoral Research
Fellow at AI and Augmented Vision Lab of the Miller
School of Medicine, University of Miami, United States.
He completed his Ph.D. in Computer Science at Multimedia University in Malaysia and his M.Sc. in Digital Media at Nanyang Technological University in Singapore. Dr. Islam has more than 15 years of working
experience in teaching and cutting-edge research in image processing and computer vision. His current
research interests lie in 3D stereoscopic media
processing, computer vision, and AR/VR-based vision
rehabilitation. Dr. Islam has secured four gold medals from international scientific and technological competitions and received three best paper awards from international conferences, workshops, and symposiums. He received the IEEE SPS Research Excellence Award in 2018. He
authored/co-authored more than 60 international
peer-reviewed research papers, including journal
articles, conference proceedings, books, and book
chapters. Dr. Islam received the TUBITAK 2232 Outstanding Researchers Award and Grant, which funds up to five postgraduate students under his supervision. He has been an IEEE Senior Member since 2018.
Assoc. Prof. Peixian Zhuang
University of Science and Technology Beijing, China
Peixian Zhuang is currently an Associate Professor in
the Key Laboratory of Knowledge Automation for
Industrial Processes, Ministry of Education, the School
of Automation and Electrical Engineering, University of
Science and Technology Beijing, Beijing, China. From
2020 to 2022, he was a Postdoctoral Fellow and an
Assistant Research Fellow with the Department of
Automation, Tsinghua University, Beijing, China. He
received the Ph.D. degree from Xiamen University,
Xiamen, China, in 2016. From 2017 to 2020, he was a
Lecturer and the Master Supervisor with Nanjing
University of Information Science and Technology,
Nanjing, China.
His research interests involve sparse representation,
Bayesian modeling, deep learning, and calcium signal
processing. He has published more than 30 research
papers (IEEE TIP, IEEE TRGS, IEEE TCVST, IEEE ICIP, IEEE
JOE, EAAI, etc.) in Image Processing and Computer
Vision, with more than 800 Google Scholar citations and two ESI highly cited papers. He received the Outstanding Doctoral Dissertation Award of Fujian Province in 2017. He served as a Guest Editor of
Journal of Electronics and Information Technology in
2021, a session chair of the IEEE International Conference on Signal and Image Processing in 2019, a Top 25% reviewer for the Association for the Advancement of Artificial Intelligence (AAAI) in 2021, and an invited speaker at the International Conference on Optics and Image Processing in 2022, among other roles.
Speech Title: Underwater Image Enhancement With
Hyper-Laplacian Reflectance Priors
Abstract: We develop a retinex variational model inspired by hyper-Laplacian reflectance priors for enhancing single underwater images. The hyper-Laplacian reflectance priors impose an L1/2-norm penalty on multi-order gradients of the reflectance, which promotes sparsity and yields a comprehensive reflectance that both boosts salient and fine-scale structures and recovers authentic color naturalness. In addition, the L2 norm is found to be
suitable for accurately estimating the illumination. As
a result, we transform a complex underwater image
enhancement issue into simple sub-problems, where their
optimal solutions can be theoretically analyzed and
proved. To solve the proposed model, we present an alternating minimization algorithm that relies on efficient element-wise operations and requires no additional underwater prior knowledge. Experiments demonstrate the superiority of our method in both subjective results and objective assessments over several conventional and state-of-the-art methods.
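One plausible form of such a retinex variational model, with R the reflectance, L the illumination, I the observed image, and alpha, beta trade-off weights, is the following; the exact objective in the talk may differ, so this is a sketch of the general structure only.

```latex
\min_{R,\,L}\ \|R \circ L - I\|_2^2
  \;+\; \alpha \sum_{k} \big\|\nabla^{k} R\big\|_{1/2}
  \;+\; \beta \,\|\nabla L\|_2^2
```

Here the L1/2 penalty on the multi-order gradients of R plays the role of the hyper-Laplacian reflectance prior, the L2 penalty on the gradient of L encodes smooth illumination, and alternating minimization updates R and L in turn, each sub-problem being solvable with element-wise operations.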
Assoc. Prof. Kangjian He
Yunnan University, China
Kangjian He is an Associate Professor at the School of Information Science and Engineering, Yunnan University.
He was selected as the “Donglu Talent Young Scholar” of
Yunnan University. He received his Ph.D. degree from Yunnan University, Kunming, China, in 2019. From 2020 to 2022, he was a Postdoctoral Research Fellow with the School of Information Science and Engineering, Yunnan University. He has led two NSFC and provincial
projects. He has authored and co-authored over 50 papers
in refereed international journals and conferences. His
current research interests include multimodal image
processing, neural network theory and applications.
Speech Title: Research and Application of Task-driven
Multi-modality Information Fusion
Abstract: Multimodal data can provide more detailed and sufficient information than a single-modal data source. Multi-modality information fusion can achieve a
more accurate and comprehensive description of the
target. How to extract and fuse effective information
from multi-modality data is crucial for computer vision
tasks under the background of big data. Most of the
existing fusion schemes are driven by data or models,
which pursue high evaluation indicators but ignore the
user perception and the support for subsequent
high-level tasks. In this talk, we take the application
of multi-modality medical imaging in computer-aided
diagnosis as an example to introduce our research on
task-driven multi-modality information fusion.
Assoc. Prof. Janaka Rajapakse
Tainan National University of the Arts, Taiwan, China
Janaka Rajapakse is an Associate Professor at the Graduate Institute of Animation and Film Art, Tainan National University of the Arts, Taiwan. He is also a visiting scholar at the Department of Media Engineering, Graduate School of Engineering, Tokyo Polytechnic University, Japan. He was an Assistant Professor in the CG Application Laboratory, Graduate School of Engineering, and a special researcher in the Center for Hyper Media Research, Tokyo Polytechnic University, Japan. He received his B.Sc. degree in computer science from the University of Colombo, Sri Lanka, and won a Japanese Government Scholarship for his higher studies in Japan. He received his M.Sc. and Ph.D. degrees from the Japan Advanced Institute of Science and Technology in 2005 and 2008, respectively. His research interests include computer animation, virtual reality, augmented reality, haptic interfaces, artificial intelligence, motion capture techniques, computer graphics, 3D printing, interactive media, and Kansei engineering. Prof. Rajapakse is the author or co-author of over eighty academic publications. He is a member of the Society for Art and Science, the Motion Capture Society (MCS), the Association of AISIA NETWORK BEYOND DESIGN, the ASIAGRAPH Association, IEEE, and SIG-Design Creativity.
Speech Title: Developments of Passive Interactions in
Virtual and Mixed Realities
Abstract: Recent VR/MR/XR applications have already
provided interactions via devices and controllers. To
create realistic communication between virtual and physical worlds, we need to develop passive interaction methods and consider what it means to interact naturally in a virtual environment. This speech will focus on the differences between "active interaction" and "passive interaction." It will also explore how passive interactions can be realized through gesture recognition methods and machine learning in extended realities, as well as the devices, platforms, and development engines used to build and implement them.
Furthermore, this talk also explores how to evaluate
passive interactions in immersive environments and the
role of user feedback.
Asst. Prof. Arren Matthew Antioquia
De La Salle University, Philippines
Arren Matthew C. Antioquia received his Master of
Science in Computer Science from both De La Salle
University (DLSU) and the National Taiwan University of
Science and Technology (NTUST), as part of the
ladderized BS-MS program and the Dual Masters Program
between DLSU and NTUST. He is an Assistant Professor at
the Department of Software Technology of De La Salle
University in the Philippines. His most recent major
publication introduced a computationally efficient way
of fusing features from different layers of
convolutional neural networks, which resulted in a
faster and more accurate general object detection
performance compared to state-of-the-art techniques. His research interests include applying deep learning
techniques in computer vision applications, specifically
in problems involving image classification and object
detection.
Speech Title: Towards Effective Multi-Scale Object
Detection
Abstract: Despite recent improvements, the arbitrary
sizes of objects still impede the predictive ability of
object detectors. Recent solutions combine feature maps
of different receptive fields to detect multi-scale
objects. However, these methods have large computational
costs resulting in slower inference time, which is not
practical for real-time applications. Moreover, fusion methods that depend on large networks with many skip connections have larger memory requirements, prohibiting their use on devices with limited memory. In
this talk, we will discuss recent works on improving the
performance on multi-scale object detection, together
with their advantages, issues, and suggested
improvements.
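As a minimal illustration of the fusion trade-off discussed above, one lightweight way to combine feature maps from two scales is to upsample the coarse map and add it to the fine one; the sketch below assumes single-channel NumPy arrays and is not any particular detector's implementation.

```python
import numpy as np

def fuse_levels(coarse, fine):
    """Combine a low-resolution, semantically strong feature map with
    a high-resolution one: nearest-neighbour upsample the coarse map
    by 2x along each spatial axis, then fuse by element-wise addition,
    which adds no learnable parameters and little memory overhead."""
    upsampled = coarse.repeat(2, axis=0).repeat(2, axis=1)
    return fine + upsampled

# A 2x2 coarse map fused into a 4x4 fine map yields a 4x4 result.
fused = fuse_levels(np.ones((2, 2)), np.zeros((4, 4)))
```

Additive fusion of this kind keeps inference fast; the heavier alternatives criticized in the abstract instead concatenate many branches or route features through large skip-connected networks.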