P1 |
Image/Video
Enhancement I |
Time |
|
Chair |
Licheng
Liu (Hunan University) |
ID |
Title |
Author |
39 |
Deep
Convolutional Sparse Coding Network for Pansharpening with Guidance of Side
Information |
Shuang
Xu (Xi'an Jiaotong University)*; Jiangshe Zhang (Xi'an Jiaotong University);
Kai Sun (Xi'an Jiaotong University); Zixiang Zhao (Xi’an Jiaotong
University); Lu Huang (Xi’an Jiaotong University); Junmin Liu (Xi'an Jiaotong
University); Chunxia Zhang (Xi'an Jiaotong University) |
131 |
Pyramid
Orthogonal Attention Network Based on Dual Self-Similarity for Accurate MR
Image Super-Resolution |
Xiaowan
Hu (Tsinghua Shenzhen International Graduate School)*; Haoqian Wang
(Tsinghua Shenzhen International Graduate School, Tsinghua University);
Yuanhao Cai (Tsinghua University, Tsinghua Shenzhen International Graduate
School); Xiaole Zhao (University of Electronic Science and Technology of
China); Yulun Zhang (Northeastern University) |
184 |
LEARNING
A TREE-STRUCTURED CHANNEL-WISE REFINEMENT NETWORK FOR EFFICIENT IMAGE
DERAINING |
Di
Wang (Nanjing University of Science and Technology)*; Hao Tang (Nanjing
University of Science and Technology); Jinshan Pan (Nanjing University of
Science and Technology); Jinhui Tang (Nanjing University of Science and
Technology) |
254 |
Video
Deraining via Temporal Aggregation-and-Guidance |
Long
Ma (Dalian University of Technology); Risheng Liu (Dalian University of
Technology); Xuefeng Zhang (Dalian University of Technology); Wei Zhong
(Dalian University of Technology)*; Xin Fan (Dalian University of Technology) |
256 |
STAR-Net:
Spatial-Temporal Attention Residual Network for Video Deraining |
Wei
Zhong (Dalian University of Technology); Xuefeng Zhang (Dalian University of
Technology); Long Ma (Dalian University of Technology); Risheng Liu (Dalian
University of Technology)*; Xin Fan (Dalian University of Technology);
Zhongxuan Luo (DALIAN UNIVERSITY OF TECHNOLOGY) |
260 |
Spatial-Temporal
Integration Network with Self-Guidance for Robust Video Deraining |
Xiaokun
Liu (Dalian University of Technology); Risheng Liu (Dalian University of
Technology)*; Long Ma (Dalian University of Technology); Xin Fan (Dalian
University of Technology); Zhongxuan Luo (DALIAN UNIVERSITY OF TECHNOLOGY) |
|
|
|
P2 |
Object/Person
detection, Tracking and Recognition I |
Time |
|
Chair |
Nikolaos
Passalis (Aristotle University of Thessaloniki) |
ID |
Title |
Author |
10 |
Rectifying pseudo label by mutual disagreement
learning for unsupervised domain adaptation person re-identification |
Xu
Xu (Nanjing University of Aeronautics and Astronautics)*; Liyan Zhang
(Nanjing University of Aeronautics and Astronautics) |
84 |
DEEPWORD:
A GCN-BASED APPROACH FOR OWNER-MEMBER RELATIONSHIP DETECTION IN AUTONOMOUS
DRIVING |
zizhang
wu (zongmu tech.com)*; man wang (zongmu tech.com); jason wang (zongmutech);
wenkai zhang (zongmu tech.com); Muqing Fang (Politecnico di Torino); tianhao
xu (University of Braunschweig - Institute of Technology) |
148 |
Modulating
Localization and Classification for Harmonized Object Detection |
Taiheng
Zhang (Zhejiang University); Qiaoyong Zhong (Hikvision Research Institute)*;
Shiliang Pu (Hikvision Research Institute); Di Xie (Hikvision Research
Institute) |
221 |
Learning
Factorized Cross-View Fusion for Multi-View Crowd Counting |
Liangfeng
Zheng (Peking University); Yongzhi Li (Peking University); Yadong Mu (Peking
University)* |
328 |
Attention-guided
Knowledge Distillation for Efficient Single-stage Detector |
Tong
Wang (CASIA)*; yousong zhu (casia); Chaoyang Zhao (National Laboratory of
Pattern Recognition, CASIA); Xu Zhao (Chinese Academy of Sciences); Jinqiao
Wang (Institute of Automation, Chinese Academy of Sciences); Ming Tang
(Institute of Automation, Chinese Academy of Sciences) |
400 |
Graph
Convolutional Hourglass Networks for Skeleton-based Action Recognition |
Yiran
Zhu (University of Electronic Science and Technology of China); Xing Xu
(University of Electronic Science and Technology of China)*; Yanli Ji
(UESTC); Fumin Shen (UESTC); Heng Tao Shen (University of Electronic Science
and Technology of China (UESTC)); Huimin Lu (Kyushu Institute of Technology) |
|
|
|
P3 |
Emerging
applications of artificial intelligence I |
Time |
|
Chair |
Xuehe
Wang (Nanyang Technological University) |
ID |
Title |
Author |
69 |
EXPLORING
EXPLICIT AND IMPLICIT VISUAL RELATIONSHIPS FOR IMAGE CAPTIONING |
Zeliang
Song (Institute of Information Engineering, Chinese Academy of Sciences &
University of Chinese Academy of Sciences, School of Cyber Security);
Xiaofei Zhou (Institute of Information Engineering, Chinese Academy of
Sciences & University of Chinese Academy of Sciences, School of Cyber
Security)* |
192 |
Unifying
Dynamic Optimizer Search and Network Architecture Search |
Binbin
Yang (Sun Yat-sen University)*; Xiaodan Liang (Sun Yat-sen University);
Junhao Zhong (Sun Yat-sen University); Jiefeng Peng (Sun Yat-sen University);
Guangrun Wang (University of Oxford); Liang Lin (Sun Yat-sen University) |
249 |
Reference-Aided
Part-Aligned Feature Disentangling for Video Person Re-Identification |
Guoqing
Zhang (Nanjing University of Information Science & Technology)*; Yuhao
Chen (Nanjing University of Information Science & Technology); Yang Dai
(Nanjing University of Information Science & Technology); Yuhui Zheng
(Nanjing University of Information Science & Technology); Yi Wu (Wormpex
AI Research) |
255 |
Selective,
Structural, Subtle: Trilinear Spatial-Awareness for Few-Shot Fine-Grained
Visual Recognition |
Heng
Wu (Beihang University); Yifan Zhao (Beihang University); Jia Li (Beihang
University)* |
304 |
Non-Local
Attention Learning for Medical Image Classification |
Yang
Wen (School of Computer Science and Engineering, University of Electronic
Science and Technology of China); Leiting Chen (School of Computer Science
and Engineering, University of Electronic Science and Technology of China);
Haisheng Chen (University of Electronic Science and Technology of China);
Ximan Tang (School of Computer Science and Engineering, University of
Electronic Science and Technology of China); Yu Deng (King's College London);
Yongbiao Chen (Shanghai Jiao Tong University); Chuan Zhou (School of Computer
Science and Engineering, University of Electronic Science and Technology of
China)* |
349 |
Towards
Efficient Medical Image Segmentation via Boundary-Guided Knowledge
Distillation |
Yang
Wen (School of Computer Science and Engineering, University of Electronic
Science and Technology of China); Leiting Chen (School of Computer Science
and Engineering, University of Electronic Science and Technology of China);
Shuo Xi (School of Computer Science and Engineering, University of Electronic
Science and Technology of China); Yu Deng (King's College London); Ximan Tang
(School of Computer Science and Engineering, University of Electronic Science
and Technology of China); Chuan Zhou (School of Computer Science and
Engineering, University of Electronic Science and Technology of China)* |
|
|
|
P4 |
Multimedia
databases and data mining |
Time |
|
Chair |
Austin
Shuai (National Chiao Tung University) |
ID |
Title |
Author |
930 |
Multi-view
Learning via Low-rank Tensor Optimization |
Lele
Fu (Fuzhou University); Zhaoliang Chen (College of Mathematics and Computer
Science, Fuzhou University, Fuzhou 350116, China); Sujia Huang (Jiangxi
Normal University); Sheng Huang (Fuzhou University); Shiping Wang (Fuzhou
University)* |
93 |
Deep
Unsupervised Hashing by Global and Local Consistency |
Xiao
Luo (Peking University); Daqing Wu (Peking University); Chong Chen (Alibaba);
Jinwen Ma (Peking University); Minghua Deng (Peking University)* |
321 |
Attention-based
Relation Reasoning Network for Video-Text Retrieval |
Ni
Wang (University of Electronic Science and Technology of China); Zheng Wang
(UESTC); Xing Xu (University of Electronic Science and Technology of China)*;
Fumin Shen (UESTC); Yang Yang (University of Electronic Science and
Technology of China); Heng Tao Shen (University of Electronic Science and
Technology of China (UESTC)) |
764 |
FAST
BINARY EMBEDDING OF DEEP LEARNING Image FEATURES using Golay-Hadamard
matrices |
Chanattra
Ammatmanee (Brunel University); Lu Gan (Brunel University London)*; Hongqing
Liu (CQUPT) |
855 |
Disturbance
Consistent Self-Ensembling For Semi-Supervised Hashing |
Shuai
Cheng (Institute of Information Engineering, School of Cyber Security,
University of Chinese Academy of Sciences); Yucan Zhou (Chinese Academy of
Sciences)*; Dayan Wu (Institute of Information Engineering, Chinese Academy
of Sciences); Haisu Zhang (NUDT); Bo Li (Institute of Information
Engineering, Chinese Academy of Sciences); Weiping Wang (Institute of
Information Engineering, CAS, China) |
1355 |
USER-PREFERENCE
BASED KNOWLEDGE GRAPH FEATURE AND STRUCTURE LEARNING FOR RECOMMENDATION |
Hang
Shu (Shanghai Advanced Research Institute, Chinese Academy of Sciences,
Shanghai, China)*; Jun Huang (Shanghai Advanced Research Institute, Chinese
Academy of Sciences) |
|
|
|
P5 |
Special
Session: Advanced Video Coding and Deep Active Learning |
Time |
|
Chair |
Xin
Zhao (Tencent) |
ID |
Title |
Author |
492 |
An
Efficient and Open Source Encoder uavs3e for Video Compression |
Yangang
Cai (Peking University Shenzhen Graduate School)*; Ronggang Wang (Peking
University); Zhenyu Wang (Shenzhen Graduate School, Peking University);
Bingjie Han (Shenzhen Graduate School, Peking University); Xufeng Li
(Shenzhen Graduate School, Peking University) |
866 |
Visual
Analysis Motivated Rate-Distortion Model for Image Coding |
Zhimeng
Huang (Peking University)*; Chuanmin Jia (Peking University); Shanshe Wang
(Peking University); Siwei Ma (Peking University, China) |
969 |
STUDY
ON CODING TOOLS BEYOND AV1 |
Xin
Zhao (Tencent America)*; Liang Zhao (Tencent); Madhu Krishnan (Tencent
America); Yixin Du (Tencent America); Shan Liu (Tencent America); Debargha Mukherjee (Google Inc); Yaowu
Xu (Google Inc.); Adrian Grange (Google Inc) |
1188 |
Dynamic
Computational Resource Allocation for Fast Inter Frame Coding in Video
Conferencing Applications |
Hang
Yuan (Peking University; Peng Cheng Laboratory); Wei Gao (Peking University
& Peng Cheng Laboratory)*; Junle Wang (Tencent) |
78 |
VISIONNET:
A COARSE-TO-FINE MOTION PREDICTION ALGORITHM BASED ON ACTIVE
INTERACTION-AWARE DRIVABLE SPACE LEARNING |
Dongchun
Ren (Meituan); Yanliang Zhu (Sankuai Online Corporation); Mingyu Fan
(Wenzhou University)*; Deheng Qian (Meituan); Huaxia Xia (Meituan); Zhuang Fu
(Meituan) |
840 |
Real-World
Image Super-Resolution via Spatio-temporal Correlation Network |
Hongyang
Zhou (University Of Science and Technology Beijing); Xiaobin Zhu (University
Of Science and Technology Beijing)*; Zheng Han (University of Science and
Technology Beijing); Xu-Cheng Yin (University of Science and Technology
Beijing) |
|
|
|
P6 |
Image/Video
Enhancement II |
Time |
|
Chair |
Cunjian
Chen (Michigan State University) |
ID |
Title |
Author |
313 |
Multiple
task-oriented encoders for unified image fusion |
Zhuoxiao
Li (Dalian University of Technology); Jinyuan Liu (Dalian University of
Technology); Risheng Liu (Dalian University of Technology); Xin Fan (Dalian
University of Technology)*; Zhongxuan Luo (DALIAN UNIVERSITY OF TECHNOLOGY);
Wen Gao (PKU) |
323 |
Hardware-aware
low-light image enhancement via one-shot neural architecture search with
shrinkage sampling |
Yuansheng
Yao (DLUT); Risheng Liu (Dalian University of Technology)*; Jiaao Zhang
(Dalian University of Technology); Zhong Wei (Dalian University of
Technology); Xin Fan (Dalian University of Technology); Zhongxuan Luo (DALIAN
UNIVERSITY OF TECHNOLOGY) |
378 |
SEARCHING
FRAME-RECURRENT ATTENTIVE DEFORMABLE NETWORK FOR REAL-TIME VIDEO DERAINING |
Xinwei
Xue (Dalian University of Technology)*; Xiangyu Meng (Dalian University of
Technology); Long Ma (Dalian University of Technology); Yi Wang (Dalian
University of Technology); Risheng Liu (Dalian University of Technology); Xin
Fan (Dalian University of Technology) |
424 |
Invertible
color-to-grayscale Conversion By Using Clustering and Reversible Watermarking |
Qiaoyi
Liang (Jinan University); Runwen Hu (Jinan University); Shijun Xiang (Jinan
University)* |
653 |
Restoration
of HDR Images for SVE-based HDRI via a Novel DCNN |
Yilun
Xu (Beihang University)*; Ziyang Liu (Beihang University); Xingming Wu
(Beihang University); Weihai Chen (Beihang University); Zhengguo Li (A*STAR) |
657 |
HEATMAP-AWARE
PYRAMID FACE HALLUCINATION |
Chenyang
Wang (Harbin Institute of Technology)*; Junjun Jiang (Harbin Institute of
Technology); Xianming Liu (Harbin Institute of Technology) |
|
|
|
P7 |
Object/Person
detection, Tracking and Recognition II |
Time |
|
Chair |
Raouf
Hamzaoui (De Montfort University) |
ID |
Title |
Author |
414 |
ASOC:
Adaptive Self-aware Object Co-localization |
Koteswar
Rao Jerripothula (IIIT Delhi)*; Prerana Mukherjee (Jawaharlal Nehru
University) |
641 |
Retrospective
Class Incremental Learning |
Qingyi
Tao (Nanyang Technological University)*; Chen Change Loy (Nanyang Technological
University); Jianfei Cai (Monash University); Zongyuan Ge (Monash); Simon See
(NVIDIA AI Tech Centre) |
774 |
SALIENT
OBJECT DETECTION VIA ATTENTION-AWARE CASCADED BOTTOM-UP FEATURE AGGREGATION |
Fengming
Sun (Nanjing University of Science and Technology); Lufei Huang (Nanjing
University of Science and Technology); Xia Yuan (Nanjing University of
Science and Technology)*; ChunXia Zhao (Nanjing university of science and
technology) |
781 |
Feature
Aggregation Network with Tri-hybrid Loss for Instance Segmentation |
Zeping
Zhou (University of Shanghai for Science and Technology); Yongxiong Wang
(University of Shanghai for Science and Technology)*; Jin Peng (University of
Shanghai for Science and Technology) |
818 |
Robust
Video Text Detection through Parametric Shape Regression, Propagation and
Fusion |
Long
Chen (Nanjing University); Jiahao Shi (Nanjing University); Feng Su (Nanjing
University)* |
819 |
Static
Image Action Recognition with Hallucinated Fine-grained Motion Information |
Shengyuan
Huang (Shanghai Jiao Tong University)*; Xing Zhao (Shanghai Jiao Tong
University); Li Niu (Shanghai Jiao Tong University); Liqing Zhang (Shanghai
Jiao Tong University) |
|
|
|
P8 |
Emerging
multimedia applications of deep learning |
Time |
|
Chair |
Jong-Seok
LEE (Yonsei University) |
ID |
Title |
Author |
7 |
MULTI-PRETEXT
ATTENTION NETWORK FOR FEW-SHOT LEARNING WITH SELF-SUPERVISION |
Hainan
Li (Beihang University); Renshuai Tao (Beihang University); Jun Li (Capital
Normal University); Haotong Qin (Beihang University); Yifu Ding (Beihang
University); Shuo Wang (Beihang University); Xianglong Liu (Beihang
University)* |
31 |
More-Similar-Less-Important:
Filter Pruning via KMeans Clustering |
Zili
Liu (China University of Mining and Technology, Beijing)*; Peisong Wang (Institute of Automation, Chinese
Academy of Sciences); Zaixing Li (China University of Mining and Technology,
Beijing) |
83 |
Interpret
the Predictions of Deep Networks via Re-Label Distillation |
Yingying
Hua (Chinese Academy of Sciences); Shiming Ge (Chinese Academy of Sciences)*;
Daichi Zhang (Chinese Academy of Sciences) |
152 |
Teacher-supervised
Generative Adversarial Networks |
Yan
Gan (Chongqing University); Tao Xiang (Chongqing University)*; Hangcheng Liu
(Chongqing University); Mao Ye (University of Electronic Science and
Technology of China); Dan Liu (University of Shanghai for Science and
Technology) |
270 |
Non-Adversarial
Novelty Detection with Generative Latent Nearest Neighbors |
Chengwei
Chen (East China Normal University); Zhizhong Zhang (East China Normal
University); Yuan Xie (East China Normal University)*; Haichuan Song (East
China Normal University); Lizhuang Ma (East China Normal University) |
706 |
Towards
Effective Adversarial Attack Against 3D Point Cloud Classification |
Chengcheng
Ma (Institute of Automation, Chinese Academy of Sciences); Weiliang Meng
(Institute of Automation, Chinese Academy of Sciences)*; Baoyuan Wu (The
Chinese University of Hong Kong, Shenzhen; Shenzhen Research Institute of Big
Data); Shibiao Xu (Institute of Automation, Chinese Academy of Sciences);
Xiaopeng Zhang (Institute of Automation, Chinese Academy of Sciences) |
|
|
|
P9 |
Multimedia
for society and health |
Time |
|
Chair |
Carl
James Debono (University of Malta) |
ID |
Title |
Author |
617 |
SEEING
HEALTH WITH EYES: FEATURE COMBINATION FOR IMAGE-BASED HUMAN BMI ESTIMATION |
Junjia
Huang (Sun Yat-Sen University); Chenming Shang (School of Intelligent Systems
Engineering, Sun Yat-sen University); Aolin Xiong (Sun Yat-Sen University);
Yuxian Pang (School of Intelligent Systems Engineering, Sun Yat-sen
University); Zhi Jin (Sun Yat-sen University)* |
1366 |
LOWER
BODY REHABILITATION DATASET AND MODEL OPTIMIZATION |
Chenxi
Wang (University of Massachusetts Lowell)*; Ying Li (University of
Massachusetts Lowell); Zinan Xiong (University of Massachusetts Lowell); Yan
Luo (The University of Massachusetts Lowell); Yu Cao (University of
Massachusetts Lowell) |
32 |
Constrained
Contrastive Representation: Classification on Chest X-rays with Limited Data |
Weiqi
Zhang (Academy for Engineering and Technology, Fudan University); Hongbo Wang
(Academy for Engineering and Technology, Fudan University)*; Zhiping Lai
(Fudan University); Chao Hou (Academy for Engineering and Technology, Fudan
University) |
317 |
A
Keypoint Transformer to Discover Spine Structure for Cobb Angle Estimation |
Yue
Guo (Institute of Automation, Chinese Academy of Sciences)*; Yanmei Li
(Beijing College of Finance and Commerce); Xiaowei Zhou (Institute of
Automation, Chinese Academy of Sciences); Wenhao He (Institute of Automation,
Chinese Academy of Sciences) |
659 |
A
ZERO-SHOT METHOD FOR 3D MEDICAL IMAGE SEGMENTATION |
Shiqiang
Ma (College of Intelligence and Computing, Tianjin University)*; Xuejian Li
(College of Intelligence and Computing, Tianjin University); Jijun Tang
(Tianjin University); Fei Guo (Tianjin University) |
1043 |
QAU-Net:
Quartet Attention U-Net for Liver and Liver-tumor Segmentation |
Luminzi
Hong (Shaanxi University of Science and Technology); Risheng Wang (Shaanxi
University of Science and Technology); Tao Lei (Shaanxi University of Science
and Technology)*; Xiaogang Du (Shaanxi University of Science and Technology);
Yong Wan (the First Affiliated Hospital of Xi'an Jiaotong University) |
|
|
|
P10 |
Special
Session: Advanced Representation Learning and Depth-Related Processing |
Time |
|
Chair |
Hui
Yuan (Shandong University) |
ID |
Title |
Author |
95 |
ANIME
STYLE TRANSFER WITH SPATIALLY-ADAPTIVE NORMALIZATION |
Jinrong
Cui (South China Agricultural University)* |
138 |
TOWARDS
RICH-DETAIL 3D FACE RECONSTRUCTION AND DENSE ALIGNMENT VIA MULTI-SCALE DETAIL
AUGMENTATION |
Jianjun
Zhang (Ningxia University)*; Suping Wu
(Ningxia University); Lei Li (Ningxia University); Kui Lin (Ningxia
University); Xing Zheng (Ningxia University); hu cao (Ningxia University) |
183 |
Dual
Prototype Relaxation for Generalized Zero Shot Learning |
Jie
Zhang (Nanjing University of Science and Technology); Haofeng Zhang (Nanjing
University of Science and Technology)*; BingZhang Hu (Newcastle University) |
841 |
Variation-net:
Interpretable Variation-inspired Deep Network for Pansharpening |
Kun
Li (Wuhan University); Wei Zhang (Wuhan University); Xin Tian (Wuhan
University)*; Jiayi Ma (Wuhan University); Huabing Zhou (Wuhan Institute of
Technology); Zhongyuan Wang (Wuhan University) |
615 |
BTS-NET:
BI-DIRECTIONAL TRANSFER-AND-SELECTION NETWORK FOR RGB-D SALIENT OBJECT
DETECTION |
Wenbo
Zhang (Sichuan University); Yao Jiang (Sichuan University); Keren Fu (Sichuan
University)*; Qijun Zhao (Sichuan University) |
989 |
Distortion-Tolerant
Monocular Depth Estimation On Omnidirectional Images Using Dual-Cubemap |
Zhijie
Shen (Beijing Jiaotong University); Chunyu Lin (Beijing Jiaotong
University)*; Lang Nie (Beijing Jiaotong University); Kang Liao (Beijing
Jiaotong University); Yao Zhao (Beijing Jiaotong University) |
|
|
|
P11 |
Image/Video
Enhancement III |
Time |
|
Chair |
Zhiyong
Wang (The University of Sydney) |
ID |
Title |
Author |
751 |
Unsupervised
Remote Sensing Super-Resolution Via Migration Image Prior |
Jiaming
Wang (Wuhan University)*; Zhenfeng Shao (Wuhan University); Tao Lu (Wuhan
Institute of Technology); Xiao Huang (University of Arkansas); Ruiqian Zhang
(Wuhan University); Yu Wang (Wuhan Institute of technology) |
758 |
Research
on super-resolution enhancement algorithm based on skip residual dense
network |
Mu
Shaoshuo (communication university of zhejiang); Zhang Yanhua (communication
university of zhejiang); Qian Xiaolan (communication university of zhejiang);
Jiang Yanbing (communication university of zhejiang)* |
808 |
SDAN:
Squared Deformable Alignment Network for Learning Misaligned Optical Zoom |
Kangfu
Mei (The Chinese University of Hong Kong, Shenzhen)*; Shenglong Ye (The
Chinese University of Hong Kong, Shenzhen); Rui Huang (The Chinese University
of Hong Kong, Shenzhen) |
816 |
A
novel attention enhanced residual-in-residual dense network for text image
super-resolution |
Minglong
Xue (National Key Lab for Novel Software Technology, Nanjing University,
Nanjing, China)*; Zhiheng Huang (National Key Lab for Novel Software
Technology, Nanjing University); Ruo-Ze Liu (Nanjing University); Tong Lu
(Nanjing University) |
933 |
Graph
Attention Neural Network for Image Restoration |
Chong
Mou (Peking University Shenzhen Graduate School); Jian Zhang (Peking
University Shenzhen Graduate School)* |
938 |
IMPROVING
CONVOLUTIONAL NETWORKS WITH BOOSTING ATTENTION CONVOLUTIONS |
Chao
li (Shenzhen University)*; Yongsheng Liang (Harbin Institute of Technology
Shenzhen Graduate School); Huoxiang Yang (Shenzhen University); Fanyang Meng
(Peng Cheng Laboratory); Wei Liu (Shenzhen Institute of Information
Technology); Handong Wang (Harbin Institute of Technology Shenzhen Graduate
School) |
|
|
|
P12 |
Multimedia
analysis and understanding I |
Time |
|
Chair |
Wei
Qi Yan (Auckland University of Technology) |
ID |
Title |
Author |
1299 |
Efficient
Fine-Grained Visual-Text Search Using Adversarially-Learned Hash Codes |
Yongzhi
Li (Peking University); Yadong Mu (Peking University)*; Nan Zhuang (Peking
University); Xianglong Liu (Beihang University) |
1329 |
Depth-Guided
AdaIN and Shift Attention Network for Vision-and-Language Navigation |
Qiang
Sun (Fudan University); Yifeng Zhuang (Fudan University); Zhengqing Chen
(Fudan University); Yanwei Fu (Fudan University)*; Xiangyang Xue (Fudan
University) |
1502 |
Learning
Outfit Compatibility with Graph Attention Network and Visual-Semantic
Embedding |
Jianfeng
Wang (Sun Yat-sen University); Xiaochun Cheng (Middlesex University)*; Ruomei
Wang (Sun Yat-sen University); Shaohui Liu (Harbin Institute of Technology) |
1042 |
Key
Facial Components Guided Micro-expression Recognition based on First &
Second-order Motion |
Yu-ting
Su (Tianjin University); Jiaqi Zhang (Tianjin University); Jing Liu (Tianjin
University)*; Guangtao Zhai (Shanghai Jiao Tong University) |
1380 |
FFNET-M:
FEATURE FUSION NETWORK WITH MASKS FOR MULTIMODAL FACIAL EXPRESSION
RECOGNITION |
Mingzhe
Sui (University of Science and Technology of China); Zhaoqing Zhu (University
of Science and Technology of China); Feng Zhao (University of Science and
Technology of China)*; Feng Wu (University of Science and Technology of
China) |
1578 |
Dual-waveform
emotion recognition model for conversations |
Jiayi
Zhang (Beijing University of Posts and Telecommunication)*; Zihe Liu (Beijing
University of Posts and Telecommunications); Peihang Liu (Beijing University
of Posts and Telecommunications); Bin Wu (Beijing University of Posts and
Telecommunications) |
|
|
|
P13 |
Emerging
applications of artificial intelligence II |
Time |
|
Chair |
Chong-Yang
Zhang (Shanghai Jiao Tong University) |
ID |
Title |
Author |
699 |
CRANet:
Cascade Residual Attention Network for Crowd Counting |
Zhongyuan
Wu (Chongqing University); Jun Sang (Chongqing University)*; Ying Shi
(Chongqing University); Qi Liu (Chongqing University); Nong Sang (Huazhong
University of Science and Technology); Xinyue Liu (Chongqing University) |
922 |
FUSING
TEMPORALLY DISTRIBUTED MULTI-MODAL SEMANTIC CLUES FOR VIDEO QUESTION
ANSWERING |
Fuwei
Zhang (Sun Yat-sen University); Ruomei Wang (Sun Yat-sen University); Songhua
Xu (Xi'an Jiaotong University)*; Fan Zhou (Sun Yat-sen university) |
981 |
Driving
Video Fixation Prediction Model via Spatio-Temporal Networks and Attention
Gates |
Tao
Deng (Southwest Jiaotong University)*; Fei Yan (Southwest Jiaotong
University); Hongmei Yan () |
1424 |
Incorporating
Multimodal Cues for Advertorial Discovery |
Lu
Zhang (University of Technology Sydney)*; Jian Zhang (UTS); Jialie Shen
(Queen's University Belfast); JingSong Xu (University of Technology Sydney);
Zhibin Li (University of Technology Sydney); Litao Yu (UTS) |
1464 |
Dynamic
Cross Fusion Network for Building-Based Damage Assessment |
Huaxin
Xiao (NUDT)*; Yang Peng (National University of Defense Technology); Hanlin
Tan (National University of Defense Technology); Ping Li (Hangzhou Dianzi
University) |
457 |
Toward
Personalized Human Activity Recognition Model with Auto-Supervised Learning
Framework |
Ala
Mhalla Mhalla (Blaise Pascal University); Ala Mhalla (Université Clermont
Auvergne (UCA))* |
|
|
|
P14 |
Multimedia
security, privacy and forensics I |
Time |
|
Chair |
Cunjian
Chen (Michigan State University) |
ID |
Title |
Author |
554 |
DeepFake
videos detection using self-supervised decoupling network |
Jian
Zhang (School of Computer Science and Engineering, Sun Yat-sen University)*;
Jiangqun Ni (Sun Yat-sen Univ.); HAO XIE (Sun Yat-sen University) |
747 |
Object-based
Video Forgery Detection via Dual-Stream Networks |
Xiao
Jin (Nankai University)*; he zhen (nankai university); Jing Xu (Nankai
University); Yongwei Wang (University of British Columbia); Yu-ting Su
(Tianjin University) |
917 |
DLFMNet:
End-to-End Detection and Localization of Face Manipulation using Multi-domain
Features |
Peng
Chen (1. Institute of Information Engineering, Chinese Academy of Sciences. 2.
School of Cyber Security, University of Chinese Academy of Sciences); Jin Liu
(1. Institute of Information Engineering, Chinese Academy of Sciences. 2.
School of Cyber Security, University of Chinese Academy of Sciences); Tao
Liang (1. Institute of Information Engineering, Chinese Academy of Sciences.
2. School of Cyber Security, University of Chinese Academy of Sciences); Cai
Yu (1. Institute of Information Engineering, Chinese Academy of Sciences. 2.
School of Cyber Security, University of Chinese Academy of Sciences); Shuqiao
Zou (1. Institute of Information Engineering, Chinese Academy of Sciences. 2.
School of Cyber Security, University of Chinese Academy of Sciences); Jiao
Dai (Institute of Information Engineering, Chinese Academy of Sciences)*;
Jizhong Han (Institute of Information Engineering, Chinese Academy of
Sciences) |
951 |
Zero
Knowledge Adversarial Defense via Iterative Translation Cycle |
Fan
Jia (Tianjin University); Yucheng Shi (Tianjin University); Yahong Han
(Tianjin University)* |
1086 |
Detection
of Deep Video Frame Interpolation via Learning Dual-stream Fusion CNN in the
Compression domain |
xiangling
ding (Hunan University of Science and Technology); Pan Yifeng (Hunan
University of Science and Technology); Gu Qing (Hunan University of Science
and Technology)*; you ji chen (chenjiyou); Gaobo Yang (Hunan University of
Science and Technology); Xiong Yimao (Hunan University of Science and
Technology) |
1302 |
RPATTACK:
REFINED PATCH ATTACK ON GENERAL OBJECT DETECTORS |
Hao
Huang (Peking University); Yongtao Wang (Peking University)*; Zhaoyu Chen
(Fudan University); Zhi Tang (Peking University); Wenqiang Zhang (Fudan
University); Kai-Kuang Ma (Nanyang Technological University, Singapore) |
|
|
|
P15 |
Special
Session: Multimedia Processing |
Time |
|
Chair |
Yueqi
Duan (Stanford University) |
ID |
Title |
Author |
613 |
DENSE
FUSION NETWORK WITH MULTIMODAL RESIDUAL FOR SENTIMENT CLASSIFICATION |
Huan
Deng (Guangdong University of Technology)*; peipei kang (Guangdong University
of Technology); Zhenguo Yang (Guangdong University of Technology); Tianyong
Hao (South China Normal University); Qing Li (The Hong Kong Polytechnic
University); Wenyin Liu (Guangdong University of Technology) |
90 |
Learning
Multiple Semantic Knowledge for Cross-domain Unsupervised Vehicle
Re-identification |
Huibing
Wang (Dalian Maritime University)*; Jinjia Peng (Dalian Maritime University);
Guangqi Jiang (Dalian Maritime University); Xianping Fu (Dalian Maritime
University) |
1203 |
Auxiliary
Bi-Level Graph Representation for Cross-Modal Image-Text Retrieval |
Xian
Zhong (Wuhan University of Technology); Zhengwei Yang (Wuhan University of
Technology); Mang YE (Wuhan University)*; Wenxin Huang (Hubei University);
Jingling Yuan (Wuhan University of Technology); Chia-Wen Lin (National Tsing
Hua University) |
1413 |
Are
GAN generated images easy to detect?
A critical analysis of the state-of-the-art |
Diego
Gragnaniello (University Federico II of Naples); Davide Cozzolino (University
Federico II of Naples); Francesco Marra (University Federico II of Naples);
Giovanni Poggi (University Federico II of Naples); Luisa Verdoliva
(University Federico II of Naples)* |
380 |
Lightweight
Image Super-Resolution with Multi-scale Feature Interaction Network |
Zhengxue
Wang (Nanjing University of Posts and Telecommunications); Guangwei Gao
(NII)*; Juncheng Li (East China Normal University); Yi Yu (NII); Huimin Lu
(Kyushu Institute of Technology) |
504 |
Self-supervised
Mutual Learning for Video Representation Learning |
Jinpeng
Wang (Sun Yat-sen University); Yutong Li (Sun Yat-sen University); Jianguo Hu
(Sun Yat-sen University)*; Xuebin Yang (Sun Yat-sen University); Yanyu Ding
(Dongguan University Of Technology) |
|
|
|
P16 |
Multimedia
analysis and processing |
Time |
|
Chair |
Nikos
Nikolaidis (Aristotle University of Thessaloniki) |
ID |
Title |
Author |
479 |
Visually
Maintained Image Disturbance Against DeepFake Face Swapping |
Junhao
Dong (Sun Yat-sen University); Xiaohua Xie (Sun Yat-sen University)* |
582 |
Boosting
few-shot classification with view-learnable contrastive learning |
Xu
Luo (University of Electronic Science and Technology of China); Yuxuan Chen
(University of Electronic Science and Technology of China); liangjian Wen
(University of Electronic Science and Technology of China); Lili Pan
(University of Electronic Science and Technology of China); Zenglin Xu
(Harbin Institute of Technology, Shenzhen)* |
1393 |
Few-Shot
Semantic Segmentation via Prototype Augmentation with Image-level Annotations |
Shuo
Lei (Virginia Tech)*; Xuchao Zhang (Virginia Tech); Jianfeng He (Virginia
Tech); Fanglan Chen (Virginia Tech); Chang-Tien Lu (Virginia Tech, USA) |
708 |
Enhancing
Viewing Experience of Generated Visual Storylines for Promotional Videos |
Chang
Liu (Nanyang Technological University)*; Han Yu (Nanyang Technological
University (NTU)); Zhiqi Shen (NTU); Ian Dixon (Nanyang Technological
University); Yingxue Yu (Nanyang Technological University); Zhanning Gao
(Alibaba Group); Pan Wang (Alibaba Group); Peiran Ren (Alibaba Group);
Xuansong Xie (Alibaba); Lizhen Cui (ShanDong University); Chunyan Miao (NTU) |
15 |
SHOW,
RETHINK, AND TELL: IMAGE CAPTION GENERATION WITH HIERARCHICAL TOPIC CUES |
Feng
Chen (National University of Defense Technology)*; Songxian Xie (National
University of Defense Technology); Xinyi Li (National University of Defense
Technology); Jintao Tang (National University of Defense Technology); Kunyuan
Pang (National University of Defense Technology); shasha li (National
University of Defense Technology); Ting Wang (National University of Defense
Technology) |
415 |
Pyramid
Feature Attention Network for Monocular Depth Prediction |
Yifang
Xu (NanJing University)*; Chenglei Peng (Nanjing University); Ming Li
(NanJing University); Yang Li (NanJing University); Sidan Du (Nanjing
University) |
|
|
|
P17 |
Image/Video
Enhancement IV |
Time |
|
Chair |
Raouf
Hamzaoui (De Montfort University) |
ID |
Title |
Author |
949 |
Spatial-Temporal
Correlation Learning for Real-Time Video Deinterlacing |
Yuqing
Liu (Dalian University of Technology); Xinfeng Zhang (University of Chinese
Academy of Sciences); Shanshe Wang (Peking University); Siwei Ma (Peking
University, China)*; Wen Gao (PKU) |
1028 |
Rethinking
Noise Modeling in Extreme Low-light Environments |
Jing
Wang (Sony R&D Center China)*; Yitong Yu (Central University of Finance
and Economics); Songtao Wu (Sony R&D Center China); Chang Lei (Columbia
University); Kuanhong Xu (Sony R&D Center China) |
1078 |
UNDEREXPOSED
IMAGE ENHANCEMENT VIA UNSUPERVISED FEATURE ATTENTION NETWORK |
Fengji
Ma (Beihang University)*; Haitao Li (Beihang University) |
1126 |
Self-Attention
Recurrent Summarization Network with Reinforcement Learning for Video
Summarization Task |
Aniwat
Phaphuangwittayakul (East China University of Science and Technology); Yi Guo
(ECUST)*; Fangli Ying (East China University of Science and Technology);
Wentian Xu (East China University of Science and Technology); Zheng Zheng
(East China University of Science and Technology) |
1264 |
Image
inpainting using edge structure aware hierarchical guidance |
Yashi
Su (South China University of Technology); Lihong Ma (South China University
of Technology); Jing Tian (National University of Singapore)* |
1284 |
L2
NORM IS ALL YOUR NEED: INFRARED-VISIBLE IMAGE FUSION VIA GUIDED
TRANSFORMATION MINIMIZATION |
Huibin
Yan (Shenzhen University); Shuoyao Wang (Shenzhen University)* |
|
|
|
P18 |
Object/Person
detection, Tracking and Recognition III |
Time |
|
Chair |
Nikolaos
Passalis (Aristotle University of Thessaloniki) |
ID |
Title |
Author |
820 |
DUAL
CONTRASTIVE UNIVERSAL ADAPTATION NETWORK |
Ziyun
Cai (Nanjing University of Posts and Telecommunications)*; Jie Song (Nanjing
University of Posts and Telecommunications); Tengfei Zhang (Nanjing
University of Posts and Telecommunications); Jing Xiao-Yuan (School of
Computer, Wuhan University); Ling Shao (Inception Institute of Artificial
Intelligence) |
865 |
Knowledge
Transfer Based Fine-Grained Visual Classification |
Siqing
Zhang (Beijing University of Posts and Telecommunications); Ruoyi Du (Beijing
University of Posts and Telecommunications); Dongliang Chang (Beijing
University of Posts and Telecommunications); Zhanyu Ma (Beijing University of
Posts and Telecommunications)*; Jun Guo (Beijing University of Posts and
Telecommunications) |
935 |
Inharmonious
Region Localization |
Jing
Liang (Shanghai Jiao Tong University); Li Niu (Shanghai Jiao Tong
University)*; Liqing Zhang (Shanghai Jiao Tong University) |
939 |
A
Change-aware Approach for Relative Motion Segmentation |
Zhuojun
Zou (Institute of Automation, Chinese Academy of Sciences; School of
Artificial Intelligence, University of Chinese Academy of Sciences)*; Zhaoteng
Meng (Institute of Automation, Chinese Academy of Sciences; University of
Chinese Academy of Sciences); Lin Shu (Institute of Automation, Chinese
Academy of Sciences); Jie Hao (Institute of Automation, Chinese Academy of
Sciences) |
987 |
A
METHOD OF STABLE LONG-TERM SINGLE OBJECT TRACKING |
Zitong
Yi (BUPT)*; Zhihang Tong (BUPT); Yanyun
Zhao (Beijing University of Posts and Telecommunications); Zhicheng
Zhao (BUPT); Fei Su (Beijing University of Posts and Telecommunications) |
1219 |
Multi-camera
Logical Topology Inference via Conditional Probability Graph Convolution
Network |
Keyang
Cheng (Jiangsu University); Qing Liu (Jiangsu University)*; Rabia Tahir
(Jiangsu University); Lubamba E
Kasangu Eric (Jiangsu University); Ligang He (The University of Warwick) |
|
|
|
P19 |
Multimedia
analysis and understanding II |
Time |
|
Chair |
Chong-Yang
Zhang (Shanghai Jiao Tong University) |
ID |
Title |
Author |
1316 |
REAL-TIME
OBJECT DETECTION BY FEATURE MAP FORECAST FOR LIVE STREAMING VIDEO |
Masato
Fujitake (The Graduate University for Advanced Studies, SOKENDAI)*; Akihiro
Sugimoto (NII) |
1446 |
SVRAT:
A SKELETON-BASED INTELLIGENT MONITORING SYSTEM FOR VIOLENCE RECOGNITION AND
ABUSER TRACKING |
Haiqiang
Liu (Jilin University)*; Meibao Yao (Jilin University); Linlin Wang (Jilin
University) |
1048 |
Graph-in-graph
contrastive learning for semi-supervised adaptation |
Liang
Li (Tianjin University); Aming Wu (Tianjin University); Yahong Han (Tianjin
University)* |
1087 |
FINED:
FAST INFERENCE NETWORK FOR EDGE DETECTION |
Jan
Kristanto Wibisono (National Chiao Tung University)*; Hsueh-Ming Hang
(National Chiao Tung University) |
1170 |
Global-to-Local
Dynamic Feature Aggregation for Unsupervised Person Re-identification |
Wei
Li (Fudan University); Jiayuan Fan (Fudan University)*; Yanwei Fu (Fudan
University) |
1341 |
RELATIVE
POSITION REPRESENTATION OVER INTERACTION SPACE FOR NATURAL LANGUAGE INFERENCE |
Huiyan
Wu (Shanghai Advanced Research Institute, Chinese Academy of Sciences,
Shanghai, China)*; Jun Huang (Shanghai Advanced Research Institute, Chinese
Academy of Sciences) |
|
|
|
P20 |
Immersive
media |
Time |
|
Chair |
Carl
James Debono (University of Malta) |
ID |
Title |
Author |
62 |
Rotation
Transformation Network: Learning View-Invariant Point Cloud for
Classification and Segmentation |
Shuang
Deng (NLPR-IA-CAS)*; Bo Liu (NLPR-IA-CAS); Qiulei Dong (NLPR-IA-CAS); Zhanyi
Hu (National Laboratory of Pattern Recognition, Institute of Automation,
Chinese Academy of Sciences) |
172 |
HIGH-RESOLUTION
MULTI-VIEW STEREO WITH DYNAMIC DEPTH EDGE FLOW |
Kui
Lin (Ningxia University)*; Lei Li (Ningxia University); Jianjun Zhang (Ningxia
University); Xing Zheng (Ningxia University); Suping Wu (Ningxia University) |
1363 |
OlaNet:
Self-supervised 360° Depth Estimation with Effective Distortion-Aware View
Synthesis and L1 Smooth Regularization |
Ziye
Lai (Fuzhou University); Dan Chen (Fuzhou University); Kaixiong Su (Fuzhou
University)* |
988 |
THE
IMPACT OF BLACK EDGE ARTIFACT ON USER EXPERIENCE FOR THE INTERACTIVE CLOUD VR
SERVICES |
Jiarun
Song (Xidian University)*; Jianquan Zhou (Xidian University); Xionghui Mao
(Xidian University); FuZheng Yang (Xidian University) |
1313 |
GRAPH
ATTENTION-BASED DEEP NEURAL NETWORK FOR 3D POINT CLOUD PROCESSING |
Xun
Li (Hefei University of Technology); Feng Xue (Hefei University of
Technology); Chao Chen (Hefei University of Technology); Xiaohui Yuan
(University of North Texas); Qiang Lu (Hefei University of Technology)* |
|
|
|
|
|
|
P21 |
Multimedia
security, privacy and forensics II |
Time |
|
Chair |
Liang
He (Tsinghua University) |
ID |
Title |
Author |
734 |
REVERSIBLE
PRIVACY-PRESERVING RECOGNITION |
Zhengxin
You (Fudan University); Sheng Li (Fudan University); Zhenxing Qian (School of
Computer Science, Fudan University)*; Xinpeng Zhang (School of Computer
Science, Fudan University) |
958 |
IMPROVED
LIGHTCNN WITH ATTENTION MODULES FOR ASV SPOOFING DETECTION |
Xinyue
Ma (Tsinghua University)*; Tianyu Liang (Tsinghua University); Shanshan Zhang
(Tencent Research); Shen Huang (Tencent Research); Liang He (Tsinghua
University) |
972 |
Undetectable
Adversarial Examples based on Microscopical Regularization |
Nan
Zhong (Fudan University); Zhenxing Qian (School of Computer Science, Fudan
University); Xinpeng Zhang (School of Computer Science, Fudan University)* |
1358 |
A
Multi-factor Combinations Enhanced Reversible Privacy Protection System for
Facial Images |
Yi-Lun
Pan (National Taiwan University)*; Jun-Cheng Chen (Academia Sinica); Ja-Ling
Wu (NTU) |
842 |
A
Semantic-enhanced Method Based on Deep SVDD for Pixel-wise Anomaly Detection |
Chuanfei
Hu (University of Shanghai for Science and Technology)*; Kai Chen (University
of Shanghai for Science and Technology); Hang Shao (University of Shanghai
for Science and Technology) |
1609 |
Robust
Cross-Scene Foreground Segmentation in Surveillance Video |
Dong
Liang (Nanjing University of Aeronautics and Astronautics)*; Zongqi Wei
(Nanjing University of Aeronautics and Astronautics); Han Sun (NUAA); Huiyu
Zhou (University of Leicester) |
|
|
|
P22 |
Multimedia
Applications I |
Time |
|
Chair |
Wenhan
Yang (Peking University) |
ID |
Title |
Author |
472 |
A
Generative Model for Partial Label Learning |
Yan
Yan (Northwestern Polytechnical University)*; Shining Li (Northwestern
Polytechnical University) |
1494 |
Asymmetric
Loss for Positive-unlabeled Learning |
Cong
Wang (East China Normal University); Jian Pu (Fudan University)*; Zhi Xu
(East China Normal University); Junping Zhang (Fudan University) |
700 |
ADAPTIVE
FLEXIBLE 3D HISTOGRAM WATERMARKING |
Chen
Hui (Harbin Institute of Technology); Shaohui Liu (Harbin Institute of
Technology)*; Wenxue Cui (Harbin Institute of Technology); Jinghua Zeng
(Harbin Institute of Technology); Feng Jiang (Harbin Institute of Technology,
Harbin); Debin Zhao (Harbin Institute of Technology) |
344 |
OVERSAMPLING
BY A CONSTRAINT-BASED CAUSAL NETWORK IN MEDICAL IMBALANCED DATA
CLASSIFICATION |
Hao
Luo (Chongqing University); Jun Liao (Chongqing University); Xuewen Yan
(Chongqing University); Li Liu (Chongqing University)* |
1430 |
Depth
Super-Resolution by Texture-Depth Transformer |
Chao
Yao (University of Science and Technology, Beijing); Shuaiyong Zhang (Beijing
Jiaotong University); Mengyao Yang (China Aerospace Academy of Systems
Science and Engineering); Meiqin Liu (Beijing Jiaotong University)*; Junpeng
Qi (China Aerospace Academy of Systems Science and Engineering) |
|
|
|
|
|
|
P23 |
Image/video
synthesis and creation |
Time |
|
Chair |
Chau-Wai
Wong (North Carolina State University) |
ID |
Title |
Author |
537 |
SAFIN:
ARBITRARY STYLE TRANSFER WITH SELF-ATTENTIVE FACTORIZED INSTANCE
NORMALIZATION |
Aaditya
Singh (Indian Institute of Technology Kanpur); Shreeshail Suhas Hingane
(Indian Institute of Technology Kanpur)*; Xinyu Gong (University of Texas at Austin); Zhangyang
Wang (University of Texas at Austin) |
830 |
Label-free
Regional Consistency for Image-to-Image Translation |
Shaohua
Guo (Shanghai Jiao Tong University)*; Qianyu Zhou (Shanghai Jiao Tong
University); Zhou Ye (CLS Fintech); Qiqi Gu (Shanghai Jiao Tong University);
Junshu Tang (Shanghai Jiao Tong
University); Zhengyang Feng (Shanghai Jiao Tong University); Lizhuang Ma
(Shanghai Jiao Tong University) |
864 |
Reinforcement
Learning Based Automatic Personal Mashup Generation |
Panwen
Hu (The Chinese University of Hong Kong, Shenzhen)*; Liu Jiazhen (The Chinese
University of Hong Kong, Shenzhen); Tianyu Cao (The Chinese University of
Hong Kong, Shenzhen); Rui Huang (The Chinese University of Hong Kong,
Shenzhen) |
1144 |
DEEP
SUPERVISED IMAGE RETARGETING |
Yijing
Mei (Tianjin University); Xiaojie Guo (Tianjin University); Di Sun (Tianjin
University of Science and Technology); Gang Pan (Tianjin University)*; Jiawan
Zhang (Tianjin University) |
1243 |
Watermark
Faker: Towards Forgery of Digital Image Watermarking |
Ruowei
Wang (Sichuan University)*; Chenguo Lin (Sichuan University); Qijun Zhao
(Sichuan University); Feiyu Zhu (Sichuan University) |
1551 |
High-quality
Face Sketch Synthesis via Geometric Normalization and Regularization |
Xiang
Li (Hangzhou Dianzi University); Fei Gao (Hangzhou Dianzi University)*; Fei
Huang (Hangzhou Dianzi University) |
|
|
|
P24 |
Cross-modal
and multi-modal media analysis I |
Time |
|
Chair |
Jun
Wan (NLPR, CASIA) |
ID |
Title |
Author |
182 |
Efficient
Online Label Consistent Hashing for Large-Scale Cross-Modal Retrieval |
Jinhan
Yi (Huaqiao University)*; Xin Liu (Huaqiao University); Yiu-ming Cheung (Hong
Kong Baptist University); Xing Xu (University of Electronic Science and
Technology of China); Wentao Fan (Huaqiao University); Yi He (Huaqiao
University) |
197 |
Adversarial
Disentanglement and Correlation Network for RGB-Infrared Person
Re-identification |
Bingyu
Hu (University of Science and Technology of China)*; Jiawei Liu (University
of Science and Technology of China); Zheng-Jun Zha (University of Science and
Technology of China) |
285 |
LAVS:
A Lightweight Audio-visual Saliency Prediction Model |
Dandan
Zhu (Shanghai Jiao Tong University)*; Defang Zhao (Tongji University);
Xiongkuo Min (Shanghai Jiao Tong University); Tian Han (Stevens Institute of
Technology); Qiangqiang Zhou (Jiangxi Normal University); Shaobo Yu (East
China Normal University); Yongqing Chen (Hainan Air Traffic Management
Sub-Bureau); Guangtao Zhai (Shanghai Jiao Tong University); Xiaokang Yang
(Shanghai Jiao Tong University of China) |
458 |
Distance
Restricted Transformer Encoder for Multi-label Classification |
Xiaomei
Wang (Fudan University); Yaqian Li (OPPO Research Institute)*; Tong Luo (OPPO
Research Institute); Yandong Guo (OPPO Research Institute); Yanwei Fu (Fudan
University); Xiangyang Xue (Fudan University) |
494 |
Learning
Controlled Semantic Embedding for Cross-Modal Retrieval |
Yong
Yang (Guangdong University of Technology); Min Meng (Guangdong University of
Technology)*; Jun Yu (HDU); Jigang Wu (Guangdong University of Technology) |
570 |
A
Language Prior Based Focal Loss for Visual Question Answering |
Mingrui
Lao (Leiden University); Yanming Guo (National University of Defense
Technology)*; Yu Liu (Dalian University of Technology); Michael S Lew (Leiden
University) |
|
|
|
P25 |
Multimedia
activity analysis and understanding |
Time |
|
Chair |
Tsung-Wei
Huang (Dolby Labs) |
ID |
Title |
Author |
139 |
Complex
Action Segmentation in Compressed Videos |
Hongfeng
Han (Renmin University of China); Guoxing Yang (Renmin University of China);
Yuqi Huo (Renmin University of China); Zhiwu Lu (Renmin University of
China)*; Ji-Rong Wen (Renmin University of China) |
478 |
Temporal
Label Aggregation for Unintentional Action Localization |
Nuoxing
Zhou (Tsinghua University); Guangyi Chen (Tsinghua University); Jinglin Xu
(Tsinghua University); Wei-Shi Zheng (Sun Yat-sen University, China); Jiwen
Lu (Tsinghua University)* |
1211 |
CONTEXT
DRIVEN NETWORK WITH BAYES FOR WEAKLY SUPERVISED TEMPORAL ACTION LOCALIZATION |
Jiaruo
Yu (Chongqing University)*; Yongxin Ge (Chongqing University); Ziqiang Li
(Chongqing University); Zhongming Chen (Chongqing University); Xiaolei Qin
(Chongqing University) |
1239 |
HIERARCHICAL
TRANSFORMER: UNSUPERVISED REPRESENTATION LEARNING FOR SKELETON-BASED HUMAN
ACTION RECOGNITION |
Yi-Bin
Cheng (Sun Yat-sen University)*; Xipeng Chen (Sun Yat-sen University);
Junhong Chen (Sun Yat-sen University); Pengxu Wei (Sun Yat-sen University);
Dongyu Zhang (Sun Yat-sen University); Liang Lin (Sun Yat-sen University) |
1487 |
Action
Category and Phase Consistency Regularization for High-quality Temporal
Action Proposal Generation |
Yushu
Liu (Harbin Institute of Technology, Weihai)*; Weigang Zhang (Harbin
Institute of Technology, Weihai); Guorong Li (University of Chinese Academy
of Sciences); Qingming Huang (University of Chinese Academy of Sciences) |
1500 |
Spatial-Temporal
Human-Object Interaction Detection |
Xu
Sun (Nanjing University); Yunqing He (Nanjing University); Tongwei Ren
(Nanjing University)*; Gangshan Wu (Nanjing University) |
|
|
|
P26 |
Emerging
multimedia applications and technologies |
Time |
|
Chair |
Austin
Shuai (National Chiao Tung University) |
ID |
Title |
Author |
1127 |
MSFC:
DEEP FEATURE COMPRESSION IN MULTI-TASK NETWORK |
Zhicong
Zhang (Huawei & Harbin Institute of Technology); Mengyang Wang (Huawei
& Harbin Institute of Technology); Mengyao Ma (Huawei); Jiahui Li
(Huawei); Xiaopeng Fan (Harbin Institute of Technology)* |
1250 |
Towards
GANs' Approximation Ability |
Xuejiao
Liu (Qian Xuesen Laboratory of Space Technology); Yao Xu (Qian Xuesen
Laboratory of Space Technology); Xueshuang Xiang (Qian Xuesen Laboratory of
Space Technology)* |
1485 |
FedNS:
Improving Federated Learning for collaborative image classification on mobile
clients |
Yaoxin
Zhuo (Arizona State University)*; Baoxin Li (Arizona State University) |
1555 |
Universal
Adversarial Training with Class-Wise Perturbations |
Philipp
Benz (KAIST)*; Chaoning Zhang (KAIST); Adil Karjauv (KAIST); In So Kweon
(KAIST) |
1573 |
Heterogeneous
Federated Learning through Multi-branch Network |
Ching-Hao
Wang (NCTU); Kang-Yang Huang (National Chiao Tung University)*; Jun-Cheng
Chen (Academia Sinica); Hong-Han Shuai (National Chiao Tung University);
Wen-Huang Cheng (National Chiao Tung University) |
1159 |
MotionSnap:
A Motion Sensor-Based Approach for Automatic Capture and Editing of Photos
and Videos on Smartphones |
Adil
Karjauv (KAIST); Sanzhar
Bakhtiyarov (KAIST); Chaoning Zhang (KAIST)*; Jean-Charles Bazin
(KAIST); In So Kweon (KAIST) |
|
|
|
P27 |
Multimedia
Applications II |
Time |
|
Chair |
Xinggong
Zhang (Peking University) |
ID |
Title |
Author |
175 |
SHORT
VIDEO STREAMING WITH DATA WASTAGE AWARENESS |
Guanghui
Zhang (The Hong Kong Polytechnic University (PolyU)); Ke Liu (Chinese Academy
of Sciences); Haibo Hu (Hong Kong Polytechnic University)*; Jing Guo (Bank of
Communications) |
1202 |
PSTR:
Per-title encoding using Spatio-Temporal Resolutions |
Hadi
Amirpour (Alpen-Adria-Universität Klagenfurt)*; Christian Timmerer
(Alpen-Adria-Universität Klagenfurt); Mohammad Ghanbari (University of Essex) |
872 |
LEARNING
CONNECTED ATTENTIONS FOR CONVOLUTIONAL NEURAL NETWORKS |
Xu
Ma (University of North Texas)*; Jingda Guo (University of North Texas);
Sihai Tang (University of North Texas); Zhinan Qiao (University of North
Texas); Qi Chen (University of North Texas); Qing Yang (University of North
Texas); Song Fu (University of North Texas); Paparao Palacharla (Fujitsu
Network Communications); Nannan Wang (Fujitsu Network Communications); Xi
Wang (Fujitsu Network Communications) |
1026 |
Learned
Image Coding for Machines: A Content-Adaptive Approach |
Nam
H Le (Tampere University)*; Honglei Zhang (Nokia Technologies); Francesco
Cricri (Nokia Technologies); Ramin Ghaznavi Youvalari (Nokia Technologies);
Hamed Rezazadegan Tavakoli (Nokia Technologies); Esa Rahtu (Tampere
University) |
1095 |
Toward
Effective Automated Content Analysis via Crowdsourcing |
Jiele
Wu (BIT); Chau-Wai Wong (NC State University)*; Xinyan Zhao (UNC-CH);
Xianpeng Liu (NC State University) |
|
|
|
|
|
|
P28 |
Special
Session: Deep Learning for Multimedia Applications with Limited Supervision |
Time |
|
Chair |
Mang
Ye (Wuhan University) |
ID |
Title |
Author |
164 |
Both
Comparison and Induction are Indispensable for Cross-Domain Few-shot Learning |
Wang
Yuan (East China Normal University); TianXue Ma (East China Normal
University); Haichuan Song (East China Normal University); Yuan Xie (East
China Normal University); Zhizhong Zhang (East China Normal University);
Lizhuang Ma (East China Normal University)* |
166 |
CROSS-MODALITY
GRAPH NEURAL NETWORK FOR FEW-SHOT LEARNING |
ShuBao
Liu (East China Normal University)*; Yuan Xie (East China Normal University);
Wang Yuan (East China Normal University); Lizhuang Ma (East China Normal
University) |
185 |
Improving
weakly supervised object localization by uncertainty estimation of pseudo
supervision |
Xi
Chen (Sun Yat-sen University); Andy J Ma (Sun Yat-sen University)*; Nanxi Guo
(Sun Yat-sen University); Jiajia Chen (Sun Yat-sen University) |
203 |
Weakly-Supervised
Image Semantic Segmentation Using Graph Convolutional Networks |
Shun-Yi
Pan (National Chiao Tung University); Cheng-You Lu (National Yang Ming Chiao
Tung University); Shih-Po Lee (National Chiao Tung University); Wen-Hsiao
Peng (National Chiao Tung University)* |
216 |
Attention-Guided
Semantic Hashing for Unsupervised Cross-Modal Retrieval |
Xiao
Shen (Nanjing University of Science and Technology); Haofeng Zhang (Nanjing
University of Science and Technology)*; Lunbo Li (Nanjing University of
Science and Technology); Li Liu (Inception Institute of Artificial Intelligence) |
233 |
Semi-DerainGAN:
A New Semi-supervised Single Image Deraining Network |
Yanyan
Wei (Hefei University of Technology); Zhao Zhang (Hefei University of
Technology)*; Yang Wang (Hefei University of Technology); Haijun Zhang
(Harbin Institute of Technology (Shenzhen)); Mingbo Zhao (Donghua
University); Mingliang Xu (Zhengzhou University); Meng Wang (Hefei University
of Technology) |
|
|
|
P29 |
Multimedia
representation learning |
Time |
|
Chair |
Hongxing
Wang (Chongqing University) |
ID |
Title |
Author |
12 |
Intra-Class
Uncertainty Loss Function for Classification |
He
Zhu (Brainnetome Center and NLPR; School of Future Technology, UCAS;
University of Chinese Academy of Sciences; Institute of Automation, Chinese
Academy of Sciences)*; Shan Yu (Brainnetome Center and NLPR; University of
Chinese Academy of Sciences; CAS Center for Excellence in Brain Science and
Intelligence Technology, Chinese Academy of Sciences) |
198 |
PROGRESSIVE
DISTRIBUTION ALIGNMENT BASED ON LABEL CORRECTION FOR UNSUPERVISED DOMAIN
ADAPTATION |
Yong
Li (Shenzhen University); Desheng Li (Shenzhen University)*; Yuwu Lu
(Shenzhen University); Can Gao (Shenzhen University); Wenjing Wang (Shenzhen
University); Jianglin Lu (Shenzhen University) |
240 |
LPCC-Net:
RGB Guided Local Point Cloud Completion for Outdoor 3D Object Detection |
Yufei
Wei (Shanghai Em-Data Technology Co., Ltd.); Yao Xiao (Shanghai Em-Data
Technology Co., Ltd.); Yibo Guo (Shanghai Em-Data Technology Co., Ltd.);
Shichao Liu (Shanghai Em-Data Technology Co., Ltd.); Lin Xu (Shanghai Em-Data
Technology Co., Ltd.)* |
505 |
VISUAL
AND SEMANTIC FEATURE COORDINATED BI-LSTM MODEL FOR UNSUPERVISED VIDEO
SUMMARIZATION |
Zhiqiang
Hong (Central China Normal University); Rui Zhong (Central China Normal
University)* |
634 |
Semi-Supervised
Few-Shot Learning with Pseudo Label Refinement |
Pan
Li (Queen Mary University of London)*; Guile Wu (Queen Mary University of
London); Shaogang Gong (Queen Mary University of London); Xu Lan (Queen Mary
University of London) |
788 |
MGARL:
Multiple Graph Adversarial Regularized Learning |
Ziyan
Zhang (Anhui University); Bo Jiang (Anhui University)*; Bin Luo (Anhui
University) |
|
|
|
P30 |
Speech/audio
synthesis and coding |
Time |
|
Chair |
Kong
Aik Lee (Institute for Infocomm Research, A*STAR) |
ID |
Title |
Author |
565 |
MULTI-SCALE
GATED ATTENTION FOR WEAKLY LABELLED SOUND EVENT DETECTION |
Hou
Zhenwei (CQU)*; Yang Liping (CQU) |
1483 |
Cross-Language
Transfer Learning and Domain Adaptation for End-to-End Automatic Speech
Recognition |
Jian
Luo (Ping An Technology (Shenzhen) Co., Ltd.)*; Jianzong Wang (Ping An
Technology (Shenzhen) Co., Ltd.); Ning Cheng (Ping An Technology (Shenzhen)
Co., Ltd.); Edward Xiao (Ping An Technology (Shenzhen) Co., Ltd.); Jing Xiao
(Ping An Insurance (Group) Company of China); Georg Kucsko (Kensho); Patrick
O'Neill (Kensho); Jagadeesh Balam (NVIDIA); Slyne Deng (NVIDIA); Adriana
Flores (NVIDIA); Boris Ginsburg (NVIDIA); Jocelyn Huang (NVIDIA); Oleksii
Kuchaiev (NVIDIA); Vitaly Lavrukhin (NVIDIA); Jason Li (NVIDIA) |
1601 |
SPEECH
SYNTHESIS OF CHINESE BRAILLE WITH LIMITED TRAINING DATA |
Jianguo
Mao (Institute of Computing Technology, Chinese Academy of Sciences)*;
Jingwen Zhu (Institute of Computing Technology, Chinese Academy of Sciences);
Xiangdong Wang (Institute of Computing Technology, Chinese Academy of
Sciences); Hong Liu (Institute of Computing Technology, Chinese Academy of
Sciences); Yueliang Qian (Institute of Computing Technology, Chinese Academy
of Sciences) |
636 |
A
Result based Portable Framework for Spoken Language Understanding |
Lizhi
Cheng (Shanghai Jiao Tong University)*; Wenmian Yang (National University of
Singapore); Weijia Jia (Institute of AI and Future Networks, Beijing Normal
University (Zhuhai); BNU-HKBU United International College, Zhuhai, PR China;
Shanghai Jiao Tong University) |
957 |
Spiker-Converter:
A Semi-supervised Framework for Low-resource Speech Recognition with Stable
Adversarial Training |
Cheng
Yi (University of Chinese Academy of Sciences)*; Bo Xu (Institute of
Automation, Chinese Academy of Sciences) |
|
|
|
|
|
|
P31 |
Image/video
acquisition, compression, and processing |
Time |
|
Chair |
Wenhan
Yang (Peking University) |
ID |
Title |
Author |
1512 |
Unsupervised
HDR Image Reconstruction Based on Over/Under-Exposed LDR Image Pair |
Hao
Wang (Rochester Institute of Technology); Zhang Tao (Tianjin University);
Guoyu Lu (Rochester Institute of Technology)* |
875 |
Machine
Learning-based Rate Distortion Modeling for VVC/H.266 Intra-Frame |
Miaohui
Wang (Shenzhen University); Jialin Zhang (Shenzhen University)*; Lirong Huang
(Shenzhen University); Jian Xiong (University of Posts and
Telecommunications) |
1066 |
SEMI-SUPERVISED
LEARNING BY EXPLOITING UNLABELED DATA CORRELATIONS IN A DUAL-BRANCH NETWORK |
Jie
Ling (Sun Yat-sen University); Meng Yang (Sun Yat-sen University)* |
1507 |
Rate-Distortion
Optimized Hierarchical Deep Feature Compression |
Ademola
Ikusan (University of Cincinnati)*; Rui Dai (University of Cincinnati) |
258 |
Consistent
Representation Learning across Modalities for Zero-Shot Image Recognition |
Yu
Wang (Tongji University)*; S-J Zhao (HaiBa Technology) |
355 |
Geometric
Transformation-based Network Ensemble
for Open-set Recognition |
Pramuditha
Perera (Amazon); Vishal Patel (Johns Hopkins University)* |
|
|
|
P32 |
Cross-modal
and multi-modal media analysis II |
Time |
|
Chair |
Raouf
Hamzaoui (De Montfort University) |
ID |
Title |
Author |
580 |
Detecting
Highlighted Video Clips via Emotion-enhanced Audio-Visual Cues |
Linkang
Hu (University of Science and Technology of China)*; Weidong He (University
of Science and Technology of China); Le Zhang (University of Science and
Technology of China); Tong Xu (University of Science and Technology of
China); Hui Xiong (the State
University of New Jersey); Enhong Chen (University of Science and Technology
of China) |
783 |
DEEP
FEATURE SELECTION-AND-FUSION FOR RGB-D SEMANTIC SEGMENTATION |
Yuejiao
Su (Northwestern Polytechnical University); Yuan Yuan (Northwestern
Polytechnical University); Zhiyu Jiang (Northwestern Polytechnical
University)* |
834 |
CI-GAN:
Co-clustering by Information Maximizing Generative Adversarial Networks |
Jaejun
Lee (University of Waterloo)*; Chul Lee (Amazon); Tomasz Palczewski (Samsung
Research America) |
854 |
What
Matters: Attentive and Relational Feature Aggregation Network for Video-Text
Retrieval |
Xiaoshuai
Hao (Institute of Information Technology, Chinese Academy of Sciences); Yucan
Zhou (Chinese Academy of Sciences)*; Dayan Wu (Institute of Information
Engineering, Chinese Academy of Sciences); Wanqian Zhang (Institute of
Information Engineering, Chinese Academy of Sciences); Bo Li (Institute of
Information Engineering, Chinese Academy of Sciences); Weiping Wang
(Institute of Information Engineering, CAS, China); Dan Meng (Institute of
Information Engineering, CAS) |
1220 |
Cognition-driven
Real-time Personality Detection via Language-guided Contrastive Visual
Attention |
Xiaoya
Gao (Soochow University); Jingjing Wang (Soochow University)*; Shoushan Li
(Soochow University); Zhou Guodong (Soochow University) |
1280 |
Multi-Knowledge
Fusion Network for Generalized Zero-Shot Learning |
Hongxin
Xiang (Yunnan University); Cheng Xie (Yunnan University); Ting Zeng (Yunnan
University); Yun Yang (Yunnan University)* |
|
|
|
P33 |
Multimedia
semantic segmentation |
Time |
|
Chair |
Nikos
Nikolaidis (Aristotle University of Thessaloniki) |
ID |
Title |
Author |
238 |
GSVNet:
Guided Spatially-Varying Convolution for Fast Semantic Segmentation on Video |
Shih-Po
Lee (National Chiao Tung University); Si-Cun Chen (National Chiao Tung
University); Wen-Hsiao Peng (National Chiao Tung University)* |
292 |
Cross-Modal
Guidance for Hyperfluorescence Segmentation in Fundus Fluorescein Angiography |
Chuan
Zhou (School of Computer Science and Engineering, University of Electronic
Science and Technology of China); Tian Zhang (School of Computer Science and
Engineering, University of Electronic Science and Technology of China); Yang
Wen (School of Computer Science and Engineering, University of Electronic
Science and Technology of China); Leiting Chen (School of Computer Science
and Engineering, University of Electronic Science and Technology of China)*;
Lei Zhang (School of Computer Science and Engineering, University of
Electronic Science and Technology of China); Junjing Chen (School of Computer
Science and Engineering, University of Electronic Science and Technology of
China) |
572 |
DOBNet:
Dynamic Object Boundary-refinement Network for Real-time Instance
Segmentation |
Boxiang
Zhang (Jilin University); Yuanyuan Guan (Jilin University); Hongru Liu (Jilin
University); Wenhui Li (Jilin University)*; Ying Wang (Jilin University) |
610 |
Confident
Semantic Ranking Loss for Part Parsing |
Xin
Tan (Shanghai Jiao Tong University)*; Jiachen Xu (Shanghai Jiao Tong
University); Zhou Ye (CLS Fintech); Jinkun Hao (East China University of
Science and Technology); Lizhuang Ma (Shanghai Jiao Tong University) |
1014 |
Input-Output
Balanced Framework for Long-tailed LiDAR Semantic Segmentation |
Peishan
Cong (ShanghaiTech); Xinge Zhu (The Chinese University of Hong Kong); Yuexin
Ma (ShanghaiTech University)* |
1319 |
ARNet:
Active-Reference Network for Few-shot Image Semantic Segmentation |
Guangchen
Shi (Hohai University); Wu Yirui (Hohai University)*; Shivakumara
Palaiahnakote (University of Malaya); Umapada Pal (Indian Statistical Institute,
Kolkata); Tong Lu (Nanjing University) |
|
|
|
P34 |
Multimedia
interaction & Multimedia quality assessment |
Time |
|
Chair |
Arijit
Biswas (Dolby Labs) |
ID |
Title |
Author |
1530 |
A
JENSEN-SHANNON DIVERGENCE DRIVEN METRIC OF VISUAL SCANNING EFFICIENCY
INDICATES PERFORMANCE OF VIRTUAL DRIVING |
Zezhong
Lv (College of Intelligence and Computing, Tianjin University); Qing Xu
(College of Intelligence and Computing, Tianjin University)*; Klaus
Schoeffmann (Klagenfurt University); Simon Parkinson (University of
Huddersfield) |
1151 |
Multimodal
Disentangled Representation for Recommendation |
Xin
Wang (Tsinghua University)*; Hong Chen (Tsinghua University); Wenwu Zhu
(Tsinghua University) |
1149 |
Multi-view
Clustering Based on Self-Weighted High-order Similarity Fusion |
Hong
Peng (South China University of Technology); Hongmin Cai (South China
University of Technology)* |
234 |
Spatial
Attention-based Non-reference Perceptual Quality Prediction Network for
Omnidirectional Images |
Li
Yang (Beihang University)*; Mai Xu (BUAA); Xin Deng (Beihang University); Bo
Feng (Columbia University) |
396 |
Neurophysiological
Assessment of Image Quality from EEG Using Persistent Homology of Brain
Network |
Chang
Liu (Communication University of Zhejiang); Xiaoyu Ma (Communication
University of Zhejiang); Jiaojiao Wang (Communication University of
Zhejiang); Jiefang Zhang (Communication University of Zhejiang); Honggang
Zhang (Zhejiang University); Songyun Xie (Northwestern Polytechnical
University); Dingguo Yu (Communication University of Zhejiang)* |
122 |
Blind
Quality Assessment of Night-Time Images via Weak Illumination Analysis |
Miaohui
Wang (Shenzhen University)*; Yijing Huang (Shenzhen University); Jialin Zhang
(Shenzhen University) |
|
|
|