In neutral hydrogen (H I) galaxy surveys, a significant challenge is to identify and extract the H I galaxy signal from observational data contaminated by radio frequency interference (RFI). In a drift-scan survey, or more generally in any survey of a spatially continuous region, both H I galaxies and RFI appear in the time-ordered spectral data as extended regions in the time-frequency waterfall plot. Extracting H I galaxies and RFI from such data can therefore be regarded as an image segmentation problem, to which machine-learning methods can be applied. In this study, we develop a method to detect and extract H I galaxy signals based on a Mask R-CNN network combined with the PointRend method. By simulating galaxy signals as observed by FAST, together with the potential impact of RFI, we created a realistic data set for training and testing our neural network. We compared five different architectures and selected the best-performing one. This architecture successfully performs instance segmentation of H I galaxy signals in the RFI-contaminated time-ordered data, achieving a precision of 98.64% and a recall of 93.59%.
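To make the quoted precision and recall concrete, the following minimal sketch shows one common way such figures can be computed for instance segmentation: predicted masks are matched to ground-truth masks by intersection over union (IoU), and matches are counted as true positives. The matching rule, the IoU threshold of 0.5, and the names `iou`, `precision_recall`, and `box` are illustrative assumptions; this section does not specify the criterion actually used in the study.

```python
from typing import List, Set, Tuple

# A mask is a set of (time index, frequency channel) pixels in the waterfall plot.
Pixel = Tuple[int, int]
Mask = Set[Pixel]


def iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two pixel masks."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def precision_recall(pred: List[Mask], truth: List[Mask],
                     iou_threshold: float = 0.5) -> Tuple[float, float]:
    """Greedy one-to-one matching of predicted to true instances.

    The threshold of 0.5 is an assumed, conventional value, not one
    taken from the study.
    """
    matched_truth = set()
    tp = 0
    for p in pred:
        best_j, best_iou = -1, 0.0
        for j, t in enumerate(truth):
            if j in matched_truth:
                continue  # each true instance may be matched only once
            score = iou(p, t)
            if score > best_iou:
                best_j, best_iou = j, score
        if best_j >= 0 and best_iou >= iou_threshold:
            matched_truth.add(best_j)
            tp += 1
    fp = len(pred) - tp   # predictions with no matching true instance
    fn = len(truth) - tp  # true instances that were missed
    precision = tp / (tp + fp) if pred else 0.0
    recall = tp / (tp + fn) if truth else 0.0
    return precision, recall


def box(t0: int, t1: int, f0: int, f1: int) -> Mask:
    """Hypothetical helper: a rectangular mask covering [t0, t1) x [f0, f1)."""
    return {(t, f) for t in range(t0, t1) for f in range(f0, f1)}


# Toy example: three true galaxies; two are recovered, one prediction is spurious.
truth = [box(0, 4, 0, 4), box(10, 14, 10, 14), box(20, 24, 20, 24)]
pred = [box(0, 4, 0, 4), box(10, 14, 11, 15), box(30, 34, 30, 34)]
precision, recall = precision_recall(pred, truth)
```

In this toy case two of the three predictions match a true instance and two of the three true instances are recovered, so precision and recall are both 2/3. In practice the masks would come from the network output and from the simulated ground truth rather than from hand-built rectangles.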