FastFDs: A Heuristic-Driven, Depth-First Algorithm for Mining Functional Dependencies from Relation Instances

Loading...
Thumbnail Image
Can’t use the file because of accessibility barriers? Contact us

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

Discovering functional dependencies (FDs) from an existing relation instance is an important technique in data mining and database design. To date, even the most efficient solutions are exponential in the number of attributes of the relation (n), even when the size of the output is not exponential in n. Lopes et al. developed an algorithm, Dep-Miner, that works well for large n on randomly-generated integer-valued relation instances. Dep-Miner first reduces the FD discovery problem to that of finding minimal covers for hypergraphs, then employs a levelwise search strategy to determine these minimal covers. Our algorithm, FastFDs, instead employs a depth-first, heuristic driven search strategy for generating minimal covers of hypergraphs. This type of search is commonly used to solve search problems in Artificial Intelligence (AI). Our experimental results indicate that the levelwise strategy that is the hallmark of many successful data mining algorithms is in fact significantly surpassed by the depth-first, heuristic driven strategy FastFDs employs, due to the inherent space efficiency of the search. Furthermore, we revisit the comparison between Dep-Miner and Tane, including FastFDs. We report several tests on distinct benchmark relation instances, comparing the Dep-Miner and FastFDs hypergraph approaches to Tane's partitioning approach for mining FDs from a relation instance. At the end of the paper we provide experimental data comparing FastFDs with a third algorithm, fdep of Flach and Savnik.

Table of Contents

Description

Keywords

Citation

Journal

DOI

Link(s) to data and video for this item

Relation

Rights

This work is protected by copyright unless stated otherwise.

Type