On an Information Theoretic Approximation Measure for Functional Dependencies

Loading...
Thumbnail Image
Can’t use the file because of accessibility barriers? Contact us

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

We investigate the problem of defining an approximation measure for functional dependencies (FDs). For fixed sets of attributes, X and Y, an approximation measure is a function which maps relation instances to real numbers. The number to which an instance is mapped, intuitively, describes the strength of the dependency, X --> Y, in that instance. We define an approximation measure for FDs based on a connection between Shannon's information theory and relational database theory. Our measure is normalized to lie between zero and one (inclusive), and maps a relation instance to zero if and only if X --> Y holds in the instance. Hence, the smaller the number to which an instance is mapped, the ``closer'' X --> Y is to being an FD in the instance. To put our measure in context, we compare it to a slight variation of a measure previously defined by Kivinen and Mannila, g_3. We denote the variation as \hat{g_3}, although, our results, essentially, apply unchanged to g_3. The purpose of comparing our measure with \hat{g_3} is to develop a deeper understanding of not only our measure, but also, \hat{g_3}. Moreover, we gain a deeper understanding of the natural intuitive notion of an approximate FD. We observe that our measure and \hat{g_3} agree at their extremes but are quite different in-between. As a result, we conclude that our measure and \hat{g_3} are significantly different. An interesting question emerges from this conclusion: is there a rigorous way to determine when one measure better captures the meaning of the degree to which an FD is approximate?

Table of Contents

Description

Keywords

Citation

Journal

DOI

Link(s) to data and video for this item

Relation

Rights

This work is protected by copyright unless stated otherwise.

Type