Skip to content

Positive-Unlabeled Learning with Adversarial Data Augmentation for Knowledge Graph Completion

License

Notifications You must be signed in to change notification settings

HELL-TO-HEAVEN/PUDA

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

20 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

PUDA

This is a demenstrative implementation of our IJCAI 2022 paper Positive-Unlabeled Learning with Adversarial Data Augmentation for Knowledge Graph Completion (PUDA).

Abstract

Most real-world knowledge graphs (KG) are far from complete and comprehensive. This problem has motivated efforts in predicting the most plausible missing facts to complete a given KG, i.e., knowledge graph completion (KGC). However, existing KGC methods suffer from two main issues, 1) the false negative issue, i.e., the sampled negative training instances may include potential true facts; and 2) the data sparsity issue, i.e., true facts account for only a tiny part of all possible facts. To this end, we propose positive-unlabeled learning with adversarial data augmentation (PUDA) for KGC. In particular, PUDA tailors positive-unlabeled risk estimator for the KGC task to deal with the false negative issue. Furthermore, to address the data sparsity issue, PUDA achieves a data augmentation strategy by unifying adversarial training and positive-unlabeled learning under the positive-unlabeled minimax game. Extensive experimental results on real-world benchmark datasets demonstrate the effectiveness and compatibility of our proposed method.

Requirements

  • python == 3.8.5
  • torch == 1.8.1
  • numpy == 1.19.2
  • pandas == 1.0.1
  • tqdm == 4.61.0

Run

Tunable hyperparameters:

  • num_ng: number of corrupted unlabeled samples for each positive sample
  • num_ng_gen: number of generated unlabeled samples for each positive sample
  • bs: batchsize
  • emb_dim: embedding dimension
  • lrd: learning rate of the discriminator
  • lrg: learning rate of the generator
  • prior: positive prior
  • reg: l2 regularization ratio
  • gen_drop: dropout ratio of the generator
  • gen_std: standard deviation of the input noise of the generator

Auxilliary configurations:

  • data_root: your preferred directory to store data
  • save_path: your prefered directory to save data
  • seed: random seed
  • gpu: an available gpu id
  • verbose: to show the progress bar or not
  • max_epochs: maximum training epochs
  • tolerance: number of validations until executing early stop
  • valid_interval: number of epochs between validations
  • early_stop_metric: h10 or mrr

Please run the code via:

nohup python PUDA.py --data_root your_data_root --config_name config_value >your_log_file 2>&1 &

Reference

@inproceedings{ijcai2022p312,
  title     = {Positive-Unlabeled Learning with Adversarial Data Augmentation for Knowledge Graph Completion},
  author    = {Tang, Zhenwei and Pei, Shichao and Zhang, Zhao and Zhu, Yongchun and Zhuang, Fuzhen and Hoehndorf, Robert and Zhang, Xiangliang},
  booktitle = {Proceedings of the Thirty-First International Joint Conference on
               Artificial Intelligence, {IJCAI-22}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},
  editor    = {Lud De Raedt},
  pages     = {2248--2254},
  year      = {2022},
  month     = {7},
  note      = {Main Track},
  doi       = {10.24963/ijcai.2022/312},
  url       = {https://doi.org/10.24963/ijcai.2022/312},
}
@article{tang2022positive,
  title={Positive-Unlabeled Learning with Adversarial Data Augmentation for Knowledge Graph Completion},
  author={Tang, Zhenwei and Pei, Shichao and Zhang, Zhao and Zhu, Yongchun and Zhuang, Fuzhen and Hoehndorf, Robert and Zhang, Xiangliang},
  journal={arXiv preprint arXiv:2205.00904},
  year={2022}
}

About

Positive-Unlabeled Learning with Adversarial Data Augmentation for Knowledge Graph Completion

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Languages