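Journal of the China Society for Industrial and Applied Mathematics
Supervised by: Ministry of Education of the People's Republic of China
Sponsored by: Xi'an Jiaotong University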
ISSN 1005-3085  CN 61-1269/O1

Chinese Journal of Engineering Mathematics, 2023, Vol. 40, Issue (4): 511-522. doi: 10.3969/j.issn.1005-3085.2023.04.001


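Source-Free Domain Adaptation Method Based on Feature Confidence

WANG Shipeng, SUN Jian, XU Zongben

  1. School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an 710049
  • Received: 2023-05-09 Accepted: 2023-05-31 Online: 2023-08-15 Published: 2023-10-15
  • Supported by:
    National Natural Science Foundation of China (12125104).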

Source Free Domain Adaptation Based on Feature Structure

WANG Shipeng,  SUN Jian,  XU Zongben   

  1. School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an 710049
  • Received: 2023-05-09 Accepted: 2023-05-31 Online: 2023-08-15 Published: 2023-10-15
  • Supported by:
    The National Natural Science Foundation of China (12125104).

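Abstract (translated from Chinese):

Traditional domain adaptation typically assumes that source-domain data remain accessible while a neural network is transferred to the target domain. Because of privacy protection and data security concerns, this assumption cannot always be satisfied. This paper therefore proposes a source-free domain adaptation method that transfers a neural network from the source domain to the target domain without accessing source-domain data. The method divides the target-domain data into two parts according to confidence and designs pseudo-labels with a divide-and-conquer strategy. For high-confidence data, the network prediction is used directly as the pseudo-label. For low-confidence data, the pseudo-label is determined jointly by the network prediction and the labels of nearby high-confidence data; this process is modeled as an optimization problem whose analytical solution gives the pseudo-label. To better estimate the pseudo-labels of low-confidence data, an information maximization loss on the low-confidence data encourages their features to form a good cluster structure. At the same time, a self-supervised loss on the high-confidence data spreads them as uniformly as possible across the feature space, so that every low-confidence sample has high-confidence data nearby. Experimental results show that the proposed method not only outperforms the latest source-free domain adaptation methods but also performs better than traditional domain adaptation methods.

Key words: source-free domain adaptation, pseudo-label, confidence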
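
The divide-and-conquer pseudo-labeling described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual optimization problem, so everything specific here is an assumption made for illustration: confidence is taken to be the maximum class probability, `tau`, `k`, and `alpha` are hypothetical parameters, and the low-confidence rule is the closed-form minimizer of a hypothetical quadratic objective, alpha*||q - p||^2 + (1 - alpha) * mean_j ||q - y_j||^2, where p is the network prediction and y_j are the one-hot labels of the k nearest high-confidence samples.

```python
import math


def split_by_confidence(probs, tau=0.9):
    """Split sample indices into (high, low) confidence lists.

    Assumption: confidence is the maximum predicted class probability.
    """
    high = [i for i, p in enumerate(probs) if max(p) >= tau]
    low = [i for i, p in enumerate(probs) if max(p) < tau]
    return high, low


def pseudo_labels(features, probs, tau=0.9, k=2, alpha=0.5):
    """Assign a pseudo-label to every sample (hypothetical rule, not the paper's)."""
    high, low = split_by_confidence(probs, tau)
    n_classes = len(probs[0])
    labels = {}

    # High-confidence samples: use the network prediction directly.
    for i in high:
        labels[i] = max(range(n_classes), key=lambda c: probs[i][c])

    # Low-confidence samples: blend the prediction with the labels of the
    # k nearest high-confidence samples in feature space.
    for i in low:
        nearest = sorted(high, key=lambda j: math.dist(features[i], features[j]))[:k]
        votes = [0.0] * n_classes
        for j in nearest:
            votes[labels[j]] += 1.0 / len(nearest)
        # Closed-form minimizer of the assumed quadratic objective.
        mixed = [alpha * probs[i][c] + (1 - alpha) * votes[c] for c in range(n_classes)]
        labels[i] = max(range(n_classes), key=lambda c: mixed[c])

    return labels, high, low


# Toy data: two clusters, plus one uncertain sample near the second cluster.
features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [4.8, 5.2]]
probs = [[0.95, 0.05], [0.97, 0.03], [0.02, 0.98], [0.05, 0.95], [0.55, 0.45]]
labels, high, low = pseudo_labels(features, probs)
```

In this toy example the last sample is predicted as class 0 with low confidence, but its two nearest high-confidence neighbours belong to class 1, so the blended rule assigns it class 1.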

Abstract:

Domain adaptation aims to transfer knowledge learned from a labeled source domain to an unlabeled target domain whose data distribution differs from that of the source. Prior domain adaptation methods typically assume that source-domain data are available during adaptation to the target domain, which may not be possible because of privacy and data-security concerns. To transfer knowledge to the target domain without accessing source-domain data, this paper develops a new source-free domain adaptation method called FCDC. FCDC splits the target-domain data into two parts according to their confidence and estimates pseudo-labels with a different strategy for each part. For high-confidence data, the pseudo-labels are the predictions of the neural network. For low-confidence data, the pseudo-labels are guided by the high-confidence data: the estimation is modeled as an optimization problem whose solution gives the pseudo-label. To make the pseudo-labels of low-confidence data more reliable, FCDC applies an information maximization loss to these data to produce well-behaved clusters. Meanwhile, FCDC applies a self-supervised loss to the high-confidence data so that their features become more diverse and surround the low-confidence data. Experimental results show that the proposed FCDC is an effective method for source-free domain adaptation.

Key words: source-free domain adaptation, pseudo-label, confidence
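
The abstract names an information maximization loss without stating its form. The sketch below shows one formulation that is common in the source-free adaptation literature, assumed here for illustration only: the mean entropy of the individual predictions (pushing each prediction to be confident) minus the entropy of the average prediction (pushing the predicted classes to stay balanced). The function names and the `eps` constant are hypothetical, and the loss actually used by FCDC may differ.

```python
import math


def entropy(p, eps=1e-12):
    """Shannon entropy of a probability vector (natural log)."""
    return -sum(x * math.log(x + eps) for x in p)


def information_maximization_loss(probs):
    """Mean per-sample entropy minus entropy of the mean prediction.

    Lower is better: confident individual predictions and a balanced
    class marginal both reduce the value.
    """
    n = len(probs)
    n_classes = len(probs[0])
    mean_conditional = sum(entropy(p) for p in probs) / n
    marginal = [sum(p[c] for p in probs) / n for c in range(n_classes)]
    return mean_conditional - entropy(marginal)


# Confident and balanced predictions give the lowest loss;
# collapsed or uniformly uncertain predictions give a loss near zero.
balanced = information_maximization_loss([[1.0, 0.0], [0.0, 1.0]])
collapsed = information_maximization_loss([[1.0, 0.0], [1.0, 0.0]])
uncertain = information_maximization_loss([[0.5, 0.5], [0.5, 0.5]])
```

Under this formulation, minimizing the loss favors predictions that fall into distinct, evenly populated clusters, which matches the abstract's stated aim of producing well-behaved clusters on the low-confidence data.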
