    Transparent Peer Review By Scholar9

    IMAGE CAPTIONING OF AN ENVIRONMENT USING MACHINE LEARNING ALGORITHMS

    Abstract

    This paper investigates the application of machine learning algorithms for automatic image captioning, focusing on a case study of Gwarzo Road in Kano, Nigeria. The research aims to design a robust VGG16/LSTM-based model that generates accurate and contextually relevant descriptions for images captured along the Kabuga to Bayero University Kano new site route. The methodology involves collecting images at three distinct times of the day (morning, afternoon, and evening) over 60 days, resizing and labelling them with relevant captions to build a comprehensive dataset. The VGG16 model, known for its efficiency in image processing, was employed for feature extraction, while the LSTM network was used to generate captions by interpreting the contextual and semantic details of the images. This study addresses key challenges in image captioning, such as localized object detection and generating meaningful textual descriptions, improving on existing datasets and models that often lack contextual relevance in specific environments. The expected outcomes of this research include the development of a precise caption generation model with high accuracy and efficiency. The resulting model achieved a BLEU score of 0.051, representing baseline performance in caption generation with partial alignment to human-generated references. Additionally, the model's highest accuracy based on the loss function reached 55%, while the lowest accuracy was 50%, with an average accuracy of 53%. The creation of a localized image database further enhances the significance of this research for future applications and studies in image captioning.

    Murali Mohana Krishna Dandu Reviewer

    Review Request Accepted
    28 Sep 2024 11:07 AM

    Approved

    Relevance and Originality

    The research is highly relevant as it addresses the growing demand for automated image captioning, particularly in the context of localized environments. The originality lies in its focus on a specific case study in Kano, Nigeria, which contributes unique data and insights to the field. This localized approach can enhance the applicability of image captioning technologies in similar regions, making it a significant addition to existing literature.


    Methodology

    The methodology is well-structured, involving a comprehensive approach to image collection at different times of day to capture varied lighting and contextual conditions. The use of the VGG16 model for feature extraction, paired with an LSTM network for caption generation, is appropriate for the task. However, a more detailed description of the image labeling process, including the criteria for caption relevance, would enhance the understanding of how the dataset was prepared.
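
    To make the encoder-decoder pairing concrete, the following is a minimal sketch of how a decoder can emit a caption one word at a time from extracted image features. It assumes greedy decoding, which the paper does not confirm. The `<start>` and `<end>` tokens, the function names, and the toy lookup table that stands in for a trained LSTM are all hypothetical and for illustration only.

```python
def greedy_caption(next_word_probs, image_features, max_len=20):
    """Build a caption by repeatedly taking the most probable next word.

    next_word_probs is a stand-in for the trained decoder: given the image
    features and the words generated so far, it returns a dict that maps
    each candidate next word to a probability.
    """
    words = ["<start>"]
    for _ in range(max_len):
        probs = next_word_probs(image_features, words)
        best = max(probs, key=probs.get)  # greedy choice, no beam search
        if best == "<end>":
            break
        words.append(best)
    return " ".join(words[1:])  # drop the start token


def toy_next_word_probs(image_features, words_so_far):
    """Hypothetical decoder: ignores the image and looks up the last word."""
    table = {
        "<start>": {"a": 0.7, "the": 0.3},
        "a": {"busy": 0.6, "road": 0.4},
        "busy": {"road": 0.9, "<end>": 0.1},
        "road": {"<end>": 0.8, "a": 0.2},
    }
    return table[words_so_far[-1]]


caption = greedy_caption(toy_next_word_probs, image_features=None)
print(caption)
```

    In a real pipeline the lookup table would be replaced by the trained network, and `image_features` would hold the feature vector produced by the image encoder.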


    Validity & Reliability

    The validity of the model is suggested through the reported BLEU score and accuracy metrics. However, a BLEU score of 0.051 indicates that the generated captions overlap very little with the human-written reference captions, so the model may not be performing effectively. The research would benefit from additional validation methods, such as qualitative assessments of caption relevance by human judges or comparisons with other state-of-the-art models, to better assess reliability.
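
    For context on the reported figure, BLEU combines clipped n-gram precisions (typically for n = 1 to 4) in a geometric mean and multiplies the result by a brevity penalty, giving a score between 0 and 1. The following is a minimal sketch of an unsmoothed sentence-level BLEU. It is not the authors' evaluation code, and the example captions are invented for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform n-gram weights."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if clipped == 0 or total == 0:
            return 0.0  # no smoothing: a single zero precision zeroes the score
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty uses the reference length closest to the candidate length.
    c = len(candidate)
    r = min((len(ref) for ref in references), key=lambda length: (abs(length - c), length))
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


# Hypothetical captions, invented for this example.
reference = "a car is parked beside the road".split()
candidate = "a car is parked on the road".split()
score = sentence_bleu(candidate, [reference])
print(round(score, 3))
```

    Here a single wrong word in a seven-word caption already lowers the score to roughly 0.49, which gives a sense of how limited the overlap is at 0.051. Sentence-level and corpus-level BLEU are computed differently, so the comparison is indicative only.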


    Clarity and Structure

    The writing is generally clear, but the organization could be improved. Introducing distinct sections for methodology, results, and discussions would help streamline the narrative. Including headings and subheadings can enhance readability, allowing readers to navigate through the research more easily. Visual aids, such as flowcharts or examples of input images with generated captions, could further clarify the process and outcomes.


    Results and Analysis

    While the paper provides some performance metrics, the analysis lacks depth. Discussing the implications of the BLEU score and accuracy results in the context of existing models would strengthen the evaluation. Additionally, identifying potential limitations of the current model, such as biases in the training dataset or challenges in understanding complex scenes, would provide a more comprehensive result analysis. Recommendations for future improvements or directions for further research would also be beneficial.

    IJ Publication Publisher

    Thank You Sir

    Paper Category: Computer Engineering

    Journal Name: IJNRD - INTERNATIONAL JOURNAL OF NOVEL RESEARCH AND DEVELOPMENT

    e-ISSN: 2456-4184
