A Fuzzy Approach for Feature Evaluation and Dimensionality Reduction to Improve the Quality of Web Usage Mining Results

09/01/2015
by   Zahid Ansari, et al.

Web Usage Mining is the application of data mining techniques to web usage log repositories in order to discover usage patterns that can be used to analyze users' navigational behavior. During the preprocessing stage, raw web log data is transformed into a set of user profiles. Each user profile captures a set of URLs representing a user session. Clustering can be applied to this sessionized data in order to capture similar interests and trends among users' navigational patterns. Since the sessionized data may contain thousands of user sessions, and each user session may consist of hundreds of URL accesses, dimensionality reduction is conventionally achieved by eliminating low-support URLs. Very small sessions are also removed in order to filter noise out of the data. However, direct elimination of low-support URLs and small sessions may result in the loss of a significant amount of information, especially when the number of low-support URLs and small sessions is large. We propose a fuzzy solution to this problem that assigns weights to URLs and user sessions based on a fuzzy membership function. After assigning the weights, we apply a Fuzzy c-Means clustering algorithm to discover the clusters of user profiles. In this paper, we describe our fuzzy set theoretic approach to feature selection (or dimensionality reduction) and session weight assignment. Finally, we compare our soft-computing-based approach to dimensionality reduction with the traditional approach of directly eliminating small sessions and low-support URLs. Our results show that fuzzy feature evaluation and dimensionality reduction yield better performance and validity indices for the discovered clusters.
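The weighting idea can be illustrated with a minimal Python sketch. The abstract does not specify the membership function, so the linear ramp used here, its `low` and `high` thresholds, the helper names, and the toy sessions are all assumptions made for illustration; they are not the authors' formulation. The last function is the standard Fuzzy c-Means membership update for a single point, shown only to indicate where the weighted session vectors would be used.

```python
from collections import Counter

def ramp_membership(x, low, high):
    """Linear ramp (assumed shape): 0 at or below `low`, 1 at or above `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)

def url_weights(sessions, low, high):
    """Weight each URL by the fuzzy membership of its support count."""
    # Support = number of sessions in which the URL appears at least once.
    support = Counter(url for s in sessions for url in set(s))
    return {url: ramp_membership(n, low, high) for url, n in support.items()}

def session_weights(sessions, low, high):
    """Weight each session by the fuzzy membership of its distinct-URL count."""
    return [ramp_membership(len(set(s)), low, high) for s in sessions]

def weighted_vector(session, urls, w_url):
    """Replace the binary visit indicator with the URL's fuzzy weight."""
    visited = set(session)
    return [w_url[u] if u in visited else 0.0 for u in urls]

def fcm_memberships(point, centers, m=2.0):
    """Standard FCM membership of one point in each cluster (fuzzifier m > 1)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, center)) ** 0.5
             for center in centers]
    if any(d == 0.0 for d in dists):
        # The point coincides with one or more centers: share membership among them.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]

# Toy sessionized data (hypothetical URLs).
sessions = [
    ["/home", "/products", "/cart"],
    ["/home", "/products"],
    ["/home", "/about"],
    ["/rare-page"],
]
w_url = url_weights(sessions, low=0, high=3)
w_sess = session_weights(sessions, low=0, high=3)
urls = sorted(w_url)
vectors = [weighted_vector(s, urls, w_url) for s in sessions]
```

In this sketch a rarely visited URL such as `/rare-page` keeps a small nonzero weight instead of being deleted, which is the contrast with direct elimination that the abstract draws. The session weights in `w_sess` could then scale each session's contribution when cluster centers are updated; that step is omitted here because the abstract does not describe it.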


