Transparent Peer Review By Scholar9
ANDROID MALWARE DETECTION MODEL USING LOGISTIC REGRESSION
Abstract
The proliferation of malicious Android applications seriously harms users' information, property, and privacy. To address the inadequate features used in dynamic malware analysis and detection, as well as insufficient detection efficiency and classifier performance, this paper proposes a multi-dimensional feature fusion method for detecting malicious applications based on Logistic Regression. The method non-invasively extracts the framework-layer Application Programming Interface (API) call information of the Android application, applies Logistic Regression to train on the N-gram-modelled API call sequence, and fuses the resulting probability feature with basic statistical features. The experimental results show that the method effectively improves the accuracy of Android malicious application detection and reduces its time cost.
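The pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the bigram setting (N = 2), the hand-written gradient-descent Logistic Regression, the API call names, the toy traces, and the two statistical features (trace length and number of distinct calls) are all assumptions made for illustration, since the abstract specifies none of them.

```python
import math
from collections import Counter

def ngrams(calls, n=2):
    """Return the n-grams (as tuples) of an API call sequence."""
    return [tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)]

def vectorize(calls, vocab, n=2):
    """Relative n-gram frequencies over a fixed vocabulary."""
    counts = Counter(ngrams(calls, n))
    total = sum(counts.values()) or 1
    return [counts[g] / total for g in vocab]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by batch gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict_proba(x, w, b):
    """Probability that a feature vector belongs to the malicious class."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical API call traces (names invented for this example).
benign = [
    ["openFile", "readFile", "closeFile", "openFile", "readFile", "closeFile"],
    ["getLocation", "showMap", "openFile", "readFile", "closeFile"],
]
malicious = [
    ["getDeviceId", "openConnection", "sendData",
     "getDeviceId", "openConnection", "sendData"],
    ["readContacts", "openConnection", "sendData",
     "getDeviceId", "openConnection"],
]
traces = benign + malicious
labels = [0, 0, 1, 1]  # 1 = malicious

vocab = sorted({g for t in traces for g in ngrams(t)})
X = [vectorize(t, vocab) for t in traces]
w, b = train_logreg(X, labels)

def fused_features(calls):
    """Probability feature followed by placeholder statistical features."""
    p = predict_proba(vectorize(calls, vocab), w, b)
    stats = [len(calls), len(set(calls))]  # assumed, not from the paper
    return [p] + stats
```

Calling `fused_features(trace)` returns one vector per application: the Logistic Regression probability followed by the statistical features. The abstract does not say how the fused vector is classified afterwards, so that step is left out here.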
Shyamakrishna Siddharth Chamarthy Reviewer
10 Oct 2024 06:29 PM
Approved
Relevance and Originality
The research addresses a significant and timely issue in cybersecurity, particularly concerning the increasing prevalence of Android malicious applications that threaten user privacy and data integrity. The originality of the study lies in its proposed multi-dimensional feature fusion method, which combines various data characteristics for more effective malware detection. By focusing on the fusion of API call information and statistical features, the study offers a fresh perspective on enhancing detection methodologies, which is crucial in an era where traditional detection techniques are becoming less effective against sophisticated malware.
Methodology
The methodology outlined in the study is sound and well-defined. The approach of non-invasively extracting API call information from Android applications is particularly noteworthy, as it minimizes the impact on user experience while maintaining effective detection capabilities. Utilizing Logistic Regression in conjunction with an N-gram model to analyze API call sequences is an innovative strategy that leverages statistical learning methods for classification. However, further elaboration on the feature selection process and criteria for determining which features are included in the analysis would enhance the robustness of the methodology. Additionally, discussing any limitations or assumptions made during the implementation could provide a more comprehensive understanding of the approach.
Validity & Reliability
The validity of the proposed method is reinforced by the experimental results demonstrating improved accuracy in detecting malicious applications. The use of Logistic Regression is appropriate given its simplicity and efficiency, particularly in handling large datasets. However, to further establish reliability, the study could benefit from including a comparative analysis with other existing detection methods. This would not only highlight the advantages of the proposed method but also address any potential weaknesses. Moreover, it would be beneficial to discuss the datasets used for training and testing to ensure that the findings are applicable across diverse scenarios and types of malicious applications.
Clarity and Structure
The article is generally well-structured, with a clear flow from the introduction of the problem to the presentation of the proposed solution and results. Key concepts are explained sufficiently, making the research accessible to readers with varying levels of expertise in cybersecurity. To enhance clarity, the inclusion of visual representations, such as diagrams depicting the feature fusion process or flowcharts illustrating the detection workflow, would be advantageous. Additionally, summarizing the main findings in a separate section could reinforce the key contributions of the research for the reader.
Result Analysis
The results of the study indicate a notable improvement in the accuracy of Android malicious application detection, which is a significant contribution to the field of cybersecurity. The discussion around the reduction of time expense in detection is also relevant, as it highlights the practical implications of the research for real-time application in security systems. However, the result analysis could be strengthened by providing more detailed metrics or statistical analyses that quantify the improvements achieved through the proposed method. Including case studies or specific examples of detected malicious applications could further illustrate the effectiveness of the method in practical scenarios. Additionally, addressing potential challenges in implementing this method in real-world applications would provide a more balanced perspective on its applicability.
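As an illustration of the more detailed metrics suggested above, the following Python sketch computes accuracy, precision, recall, F1, and false positive rate from predicted and true labels. The label vectors are invented for the example and are not results from the paper; the function names are likewise hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn), treating label 1 as malicious."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def detection_metrics(y_true, y_pred):
    """Compute common binary-classification metrics for a detector."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)

    def ratio(num, den):
        # Guard against empty denominators (e.g. no positive predictions).
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "false_positive_rate": ratio(fp, fp + tn),
    }

# Made-up labels: 4 malicious and 6 benign applications.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = detection_metrics(y_true, y_pred)
```

Reporting precision, recall, and false positive rate alongside accuracy matters for malware detection because benign and malicious samples are rarely balanced, so accuracy alone can hide a high miss rate.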
IJ Publication Publisher
Done Sir