Authors
Dik Lun Lee, Chun-Wu Leng
Title
Partitioned signature files: design issues and performance evaluation
Pages
158-180
Date
April 1989
Abstract
A signature file acts as a filtering mechanism to reduce the amount of text that needs to be searched for a query. Unfortunately, the signature file itself must be exhaustively searched, resulting in degraded performance for a large file size. The authors use a deterministic algorithm to divide a signature file into partitions, each of which contains signatures with the same 'key'. The signature keys in a partition can be extracted and represented as the partition's key. The search can then be confined to the subset of partitions whose keys match the query key. The main concern here is to study methods for obtaining the keys and their performance in terms of their ability to reduce the search space. Owing to the reduction of search space, partitioning a signature file has a direct benefit in a sequential search (single-processor) environment. In a parallel environment search can be conducted in parallel effectively by allocating one or more partitions to a processor. Partitioning the signature file with a deterministic method (as opposed to a random partitioning scheme) provides intraquery parallelism as well as interquery parallelism. They outline the criteria for evaluating partitioning schemes. Three algorithms are described and studied. An analytical study of the performance of the algorithms is provided and the results are verified with simulation.
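Comment
Searching the signatures takes time, so the search is done by divide and conquer. Part of each signature is kept separately as a key; a query signature is first matched against the keys and then against the signatures themselves. Several ways of choosing the key are compared.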
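The two-stage match described above (key first, then the full signature) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the signature width `F`, the bits per word `M`, the hash-based `word_bits` helper, and the choice of a fixed-length prefix (`KEY_BITS`) as the key are all hypothetical; the abstract does not say how the three studied algorithms derive their keys.

```python
import hashlib
from collections import defaultdict

F = 64        # signature width in bits (assumed)
M = 3         # bits set per word (assumed)
KEY_BITS = 4  # length of the prefix used as the partition key (assumed)


def word_bits(word):
    """Return a mask with up to M bits set, derived from hashes of the word."""
    mask = 0
    for i in range(M):
        digest = hashlib.sha256(f"{i}:{word}".encode()).digest()
        mask |= 1 << (int.from_bytes(digest[:4], "big") % F)
    return mask


def signature(words):
    """Superimposed coding: OR together the bit masks of all words."""
    sig = 0
    for w in words:
        sig |= word_bits(w)
    return sig


def key_of(sig):
    """Hypothetical key: the KEY_BITS most significant bits of the signature."""
    return sig >> (F - KEY_BITS)


def build_partitions(blocks):
    """Group (block_id, signature) pairs by their key."""
    partitions = defaultdict(list)
    for block_id, words in blocks.items():
        sig = signature(words)
        partitions[key_of(sig)].append((block_id, sig))
    return partitions


def search(partitions, query_words):
    """Return (candidate block ids, number of signatures actually compared)."""
    q = signature(query_words)
    qkey = key_of(q)
    candidates = []
    scanned = 0
    for pkey, entries in partitions.items():
        # Stage 1: skip the partition if its key cannot cover the query key.
        if pkey & qkey != qkey:
            continue
        # Stage 2: compare the query with each full signature in the partition.
        for block_id, sig in entries:
            scanned += 1
            if sig & q == q:
                candidates.append(block_id)
    return sorted(candidates), scanned
```

Calling `search(build_partitions(blocks), ["parallel"])` on a dictionary that maps block ids to word lists returns the candidate blocks together with the number of signatures compared. The candidates are the same ones an exhaustive scan would find, because a signature that covers the query also has a prefix that covers the query's prefix. As with any signature file, candidates may include false drops and still need checking against the text.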
Comment 1
When information is added, it seems that not only the signatures but also the keys (a key file?) would have to be updated?
Category
Signature
Category: Signature
Journal: ACM Transactions on Information Systems
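Comment: Searching the signatures takes time, so the search is done by
        divide and conquer. Part of each signature is kept separately
        as a key; a query signature is first matched against the keys
        and then against the signatures themselves. Several ways of
        choosing the key are compared.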
Abstract: A signature file acts as a filtering mechanism to
        reduce the amount of text that needs to be searched
        for a query. Unfortunately, the signature file
        itself must be exhaustively searched, resulting in
        degraded performance for a large file size. The
        authors use a deterministic algorithm to divide a
        signature file into partitions, each of which
        contains signatures with the same 'key'. The
        signature keys in a partition can be extracted and
        represented as the partition's key. The search can
        then be confined to the subset of partitions whose
        keys match the query key. The main concern here is
        to study methods for obtaining the keys and their
        performance in terms of their ability to reduce the
        search space. Owing to the reduction of search
        space, partitioning a signature file has a direct
        benefit in a sequential search (single-processor)
        environment. In a parallel environment search can be
        conducted in parallel effectively by allocating one
        or more partitions to a processor. Partitioning the
        signature file with a deterministic method (as
        opposed to a random partitioning scheme) provides
        intraquery parallelism as well as interquery
        parallelism. They outline the criteria for
        evaluating partitioning schemes. Three algorithms
        are described and studied. An analytical study of
        the performance of the algorithms is provided and
        the results are verified with simulation.
Number: 2
Bibtype: Article
Author: Dik Lun Lee
        Chun-Wu Leng
Pages: 158-180
Month: apr
Title: Partitioned signature files: design issues and
        performance evaluation
Comment1: When information is added, it seems that not only the
        signatures but also the keys (a key file?) would have to be
        updated?
Year: 1989
Volume: 7
Keyword: database management systems, file organisation,
        information retrieval, performance evaluation,
        signature file, filtering mechanism, degraded
        performance, large file size, deterministic
        algorithm, search space, sequential search, parallel
        environment, intraquery parallelism, interquery
        parallelism