Applying Hybrid Clustering in Pulsar Candidate Sifting with Multi-modality for FAST Survey

Zi-Yi You, Yun-Rong Pan, Zhi Ma, Li Zhang, Shuo Xiao, Dan-Dan Zhang, Shi-Jun Dang, Ru-Shuang Zhao, Pei Wang, Ai-Jun Dong et al.


Pulsar searching underpins pulsar navigation, gravitational wave detection and other research topics. The volume of pulsar candidates collected by the Five-hundred-meter Aperture Spherical radio Telescope (FAST) is growing explosively, which challenges its pulsar candidate filtering system. In particular, multi-view heterogeneous data and the class imbalance between true pulsars and non-pulsar candidates degrade the performance of traditional single-modal supervised classification methods. In this study, we present a pulsar candidate sifting algorithm based on multi-modal, semi-supervised learning. It adopts a hybrid ensemble clustering scheme that combines density-based and partition-based methods, together with a feature-level fusion strategy for the input data and a data partition strategy for parallelization. Experiments on both the High Time Resolution Universe Survey II (HTRU2) data and actual FAST observation data demonstrate that the proposed algorithm identifies pulsars effectively. On HTRU2, the precision and recall of its parallel mode reach 0.981 and 0.988, respectively. On FAST data, the precision and recall of its parallel mode reach 0.891 and 0.961, and the running time decreases significantly as the number of parallel nodes increases, up to a limit. We therefore conclude that the algorithm is a feasible approach to large-scale pulsar candidate sifting for FAST drift-scan observations.
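
To make the ingredients named above concrete, the following is a minimal sketch of how feature-level fusion, a density-based step and a partition-based step might be chained, under several assumptions of our own. The function names (`standardize`, `fuse`, `core_points`, `kmeans`, `sift`, `precision_recall`), the parameters `eps` and `min_pts`, the farthest-first initialization, the rule that a cluster containing a labelled pulsar is treated as pulsar-like, and the toy data are all hypothetical illustrations; they are not the algorithm, settings or data evaluated in this study, and the parallel data partition is omitted.

```python
import math


def standardize(view):
    """Scale every column of one view to zero mean and unit variance."""
    cols = list(zip(*view))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in view]


def fuse(*views):
    """Feature-level fusion: concatenate the standardized views per sample."""
    scaled = [standardize(v) for v in views]
    return [sum(rows, []) for rows in zip(*scaled)]


def core_points(points, eps, min_pts):
    """Density step: indices of points with >= min_pts neighbours within eps
    (the point itself is counted, as in DBSCAN-style core tests)."""
    return [i for i, p in enumerate(points)
            if sum(math.dist(p, q) <= eps for q in points) >= min_pts]


def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))


def kmeans(points, k, iters=20):
    """Partition step: Lloyd's k-means with farthest-first initialization
    (an assumed, deterministic choice made for this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(
            points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Keep the old centroid if a group happens to be empty.
        centroids = [[sum(c) / len(g) for c in zip(*g)] if g else centroids[j]
                     for j, g in enumerate(groups)]
    return centroids


def sift(points, known_pulsars, k=2, eps=0.5, min_pts=3):
    """Fit centroids on dense (core) points only, assign every candidate to
    its nearest centroid, and keep clusters that hold a labelled pulsar."""
    core = [points[i] for i in core_points(points, eps, min_pts)]
    centroids = kmeans(core, k)
    assign = [nearest(p, centroids) for p in points]
    pulsar_clusters = {assign[i] for i in known_pulsars}
    return [i for i, a in enumerate(assign) if a in pulsar_clusters]


def precision_recall(predicted, actual):
    """Precision and recall of predicted indices against true indices."""
    tp = len(set(predicted) & set(actual))
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(actual) if actual else 0.0
    return precision, recall


# Toy, imbalanced data (invented for illustration): 20 non-pulsars,
# 5 pulsars and 1 isolated outlier, each described by two views.
noise_a = [[0.1 * (i % 5), 0.1 * (i // 5)] for i in range(20)]
noise_b = [[0.05 * i] for i in range(20)]
pulsar_a = [[5 + 0.1 * i, 5 + 0.1 * i] for i in range(5)]
pulsar_b = [[10 + 0.1 * i] for i in range(5)]
outlier_a, outlier_b = [[2.5, 2.5]], [[5.0]]

points = fuse(noise_a + pulsar_a + outlier_a, noise_b + pulsar_b + outlier_b)
found = sift(points, known_pulsars=[20], k=2, eps=0.5, min_pts=3)
precision, recall = precision_recall(found, actual=range(20, 25))
print(found, precision, recall)
```

In this sketch only one candidate (index 20) carries a label, which mimics the semi-supervised setting, and the isolated outlier is excluded from centroid estimation because it fails the density test. A parallel variant could apply `sift` to separate partitions of the candidates, each including the labelled samples, but that design is likewise an assumption rather than the scheme used here.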


Key words: methods: data analysis – surveys – methods: numerical

