A method for mining top-rank-k frequent closed itemsets
Mining frequent closed itemsets (FCIs) is important in mining
non-redundant (minimal) association rules. Therefore, many algorithms
have been developed for mining FCIs with reduced mining time and memory
usage. For mining FCIs, algorithms use the minimum support threshold,
minSup, to prune itemsets. However, using a fixed minSup is not suitable
for mining top-rank-k FCIs. A large threshold will lead to a small
number of generated FCIs, leading to insufficient FCIs to query when k
is large. On the other hand, a small minSup will generate a huge number
of generated FCIs, leading to large runtimes and high memory usage. In
this paper, we propose a method for mining top-rank-k FCIs without using
a fixed minimum support threshold. A strategy is first used to
eliminate 1-items that cannot generate FCIs belonging to top-rank-k
FCIs. Next, based on the set of candidate 1-items, we propose TRK-FCI, a
DCI-Plus-based algorithm, for mining top-rank-k FCIs. In the process of
mining top-rank-k FCIs, TRK-FCI automatically increases minSup
according to the mined FCIs, efficiently pruning itemsets that cannot
belong to top-rank-k FCIs. We also modify the dynamic bit vector (DBV)
structure and apply it to reduce memory usage and runtime in the
TRK-FCI-DBV algorithm. Experimental results show that TRK-FCI-DBV is
more efficient than TRK-FCI for various databases.
Title: | A method for mining top-rank-k frequent closed itemsets |
Authors: | Nguyen, Loan T. T. Trinh, Truc Nguyen Ngoc Thanh |
Keywords: | DCI-Plus;dynamic bit vectors,frequent closed itemsets;top-rank-k frequent closed itemsets |
Issue Date: | 2017 |
Publisher: | IOS PRESS, NIEUWE HEMWEG 6B, 1013 BG AMSTERDAM, NETHERLANDS |
Citation: | ISIKNOWLEDGE |
Abstract: | Mining frequent closed itemsets (FCIs) is important in mining non-redundant (minimal) association rules. Therefore, many algorithms have been developed for mining FCIs with reduced mining time and memory usage. For mining FCIs, algorithms use the minimum support threshold, minSup, to prune itemsets. However, using a fixed minSup is not suitable for mining top-rank-k FCIs. A large threshold will lead to a small number of generated FCIs, leading to insufficient FCIs to query when k is large. On the other hand, a small minSup will generate a huge number of generated FCIs, leading to large runtimes and high memory usage. In this paper, we propose a method for mining top-rank-k FCIs without using a fixed minimum support threshold. A strategy is first used to eliminate 1-items that cannot generate FCIs belonging to top-rank-k FCIs. Next, based on the set of candidate 1-items, we propose TRK-FCI, a DCI-Plus-based algorithm, for mining top-rank-k FCIs. In the process of mining top-rank-k FCIs, TRK-FCI automatically increases minSup according to the mined FCIs, efficiently pruning itemsets that cannot belong to top-rank-k FCIs. We also modify the dynamic bit vector (DBV) structure and apply it to reduce memory usage and runtime in the TRK-FCI-DBV algorithm. Experimental results show that TRK-FCI-DBV is more efficient than TRK-FCI for various databases. |
Description: | TNS07012 ; JOURNAL OF INTELLIGENT & FUZZY SYSTEMS Volume: 32 Issue: 2 Pages: 1297-1305 Published: 2017 |
URI: | http://repository.vnu.edu.vn/handle/VNU_123/28816 |
ISSN: | 1064-1246 1875-8967 |
Appears in Collections: | Bài báo của ĐHQGHN trong Web of Science |
Nhận xét
Đăng nhận xét