Author(s):
Si, Hai-Ping ; Zhao, Wen-Rui ; Li, Ting-Ting ; Li, Fei-Tao ; Bação, Fernando ; Sun, Chang-Xia ; Li, Yan-Ling
Date: 2025
Persistent ID: http://hdl.handle.net/10362/180776
Origin: Repositório Institucional da UNL
Subject(s): infrared image; visible image; image fusion; encoder-decoder; multi-scale features; Atomic and Molecular Physics, and Optics
Description
Si, H-P., Zhao, W-R., Li, T-T., Li, F-T., Bação, F., Sun, C-X., & Li, Y-L. (2025). BDMFuse: Multi-scale network fusion for infrared and visible images based on base and detail features. Journal of Infrared and Millimeter Waves (Hongwai Yu Haomibo Xuebao), 44(2), 275-284. https://doi.org/10.11972/j.issn.1001-9014.2025.02.016 --- Supported by the Henan Province Key Research and Development Project (231111211300), the Central Government of Henan Province Guides Local Science and Technology Development Funds (Z20231811005), the Henan Province Key Research and Development Project (231111110100), the Henan Provincial Outstanding Foreign Scientist Studio (GZS2024006), and the Henan Provincial Joint Fund for Scientific and Technological Research and Development Plan (Application and Overcoming Technical Barriers) (242103810028)
The fusion of infrared and visible images should emphasize the salient targets of the infrared image while preserving the textural details of the visible image. To meet these requirements, an autoencoder-based method for infrared and visible image fusion is proposed. The encoder, designed according to the optimization objective, consists of a base encoder and a detail encoder, which extract low-frequency and high-frequency information from the image, respectively. Because this decomposition may leave some information uncaptured, a compensation encoder is proposed to supplement the missing information. Multi-scale decomposition is also employed to extract image features more comprehensively. The decoder combines the low-frequency, high-frequency, and supplementary information to obtain multi-scale features. An attention strategy and a fusion module are then introduced to perform multi-scale fusion for image reconstruction. Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while conforming more closely to human visual perception.
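The base/detail split described above generalizes a classical two-scale decomposition: a low-frequency "base" layer obtained by smoothing, and a high-frequency "detail" layer taken as the residual. The NumPy sketch below illustrates that idea only; the box filter, the averaging rule for base layers, and the max-absolute rule for detail layers are illustrative assumptions, not the paper's learned encoders or fusion module.

```python
import numpy as np

def decompose(img, k=5):
    """Split an image into a low-frequency base layer (box filter of
    size k x k) and a high-frequency detail layer (the residual).
    A hand-crafted stand-in for the learned base/detail encoders."""
    pad = k // 2
    p = np.pad(img.astype(float), pad, mode="edge")
    base = np.empty(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            base[i, j] = p[i:i + k, j:j + k].mean()
    return base, img - base

def fuse(ir, vis, k=5):
    """Fuse an infrared and a visible image by averaging their base
    layers and keeping the stronger (max-absolute) detail response."""
    b_ir, d_ir = decompose(ir, k)
    b_vis, d_vis = decompose(vis, k)
    fused_base = 0.5 * (b_ir + b_vis)
    fused_detail = np.where(np.abs(d_ir) > np.abs(d_vis), d_ir, d_vis)
    return fused_base + fused_detail
```

By construction, `base + detail` reconstructs each input exactly, which mirrors the requirement that the decoder recover the image from its decomposed parts.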