Non-rigid multi-modal image registration remains challenging because of the complex intensity relationship between the two images to be registered. Conventional methods fail in particular when spatially varying intensity distortion is present. The proposed framework consists of two steps. First, a structural representation of each image is computed; then a conventional similarity metric, such as mutual information (MI) or the sum of squared differences (SSD), is used to register the two structural images. The structural representation is obtained from a modified second-order entropy image combined with gradient information. The framework is evaluated on both simulated and real data. Quantitative and qualitative results demonstrate that applying the MI similarity metric to the proposed representation achieves high registration accuracy.
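
The idea of comparing structural images rather than raw intensities can be illustrated with a minimal sketch. It is not the authors' implementation: the modified second-order entropy is not specified here, so a plain local Shannon entropy over small patches stands in for it, and the rule for combining entropy with gradient magnitude (a sum of the two maps, each rescaled to [0, 1]) is an assumption. The names `local_entropy`, `structural_image`, and `ssd`, and the parameters `radius` and `bins`, are hypothetical.

```python
import numpy as np


def _unit_range(x):
    """Rescale an array to [0, 1]; a constant array maps to all zeros."""
    span = x.max() - x.min()
    return (x - x.min()) / span if span > 0 else np.zeros_like(x)


def local_entropy(img, radius=2, bins=16):
    """Shannon entropy of the quantized intensities in each square patch.

    Stand-in for the paper's modified second-order entropy (assumption).
    """
    img = np.asarray(img, dtype=float)
    lo, hi = img.min(), img.max()
    scale = (hi - lo) if hi > lo else 1.0
    # Quantize intensities into `bins` levels.
    q = np.minimum((bins * (img - lo) / scale).astype(int), bins - 1)
    padded = np.pad(q, radius, mode="reflect")
    size = 2 * radius + 1
    # One (size x size) window per pixel: shape (H, W, size, size).
    windows = np.lib.stride_tricks.sliding_window_view(padded, (size, size))
    ent = np.zeros(img.shape)
    for level in range(bins):
        p = (windows == level).mean(axis=(2, 3))  # per-patch probability
        nz = p > 0
        ent[nz] -= p[nz] * np.log2(p[nz])
    return ent


def structural_image(img, radius=2, bins=16):
    """Combine local entropy and gradient magnitude (combination rule assumed)."""
    img = np.asarray(img, dtype=float)
    gy, gx = np.gradient(img)
    grad_mag = np.hypot(gx, gy)
    return _unit_range(local_entropy(img, radius, bins)) + _unit_range(grad_mag)


def ssd(a, b):
    """Sum of squared differences between two images of equal shape."""
    return float(np.sum((np.asarray(a, float) - np.asarray(b, float)) ** 2))


if __name__ == "__main__":
    # Toy "two modalities": a bright square and its contrast-inverted copy.
    fixed = np.zeros((32, 32))
    fixed[8:24, 8:24] = 1.0
    moving = 1.0 - fixed
    print("SSD on raw intensities:  ", ssd(fixed, moving))
    print("SSD on structural images:",
          ssd(structural_image(fixed), structural_image(moving)))
```

In this toy case the two aligned images have opposite contrast, so SSD on raw intensities is large, while their structural images agree. A full registration method would additionally optimize a non-rigid transformation of the moving image against such a metric, which this sketch omits.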