Al-Malla, Muhammad Abdelhadie, et al. “Pre-Trained CNNs As Feature-Extraction Modules for Image Captioning: An Experimental Study.” ELCVIA Electronic Letters on Computer Vision and Image Analysis, vol. 21, no. 1, May 2022, pp. 1-16, doi:10.5565/rev/elcvia.1436.