DATABASE STRUCTURE:
Note: I'm intentially avoiding commitments to linear or extended hash at
this point. The comments here are independent of that...
The cache information is stored in two sections:
Data file:
This is an append only file. It is readable by any program that has
file access permissions, but writable only by the server.
The data that is appended is verbatim data. For compressed data,
it is up to the write client to compress the data in any manner it
sees fit.
Index File:
This file is a hash file, and stores a constant length management structure.
It is directly accessible only by the server. The management component
contains constant length fields and is 32 bytes in length:
{
time_t entry_time; // in seconds.
uint32_t hash_value; // the hash value for entry.
uint64_t hash_id; // a key in bucket identifier.
loff_t begin_blob; // starting byte offset in data file.
loff_t end_blob; // ending byte + 1 in data file.
}
Some explanation is in order here as there is some novelty here.
In literature, hash tables contain the key, and our table does not.
The problem with storing the key is they can be variable and long.
This creates an number of inefficiencies. The proposal for this hash
table is to:
1) Store a hash_id. One of the main reasons to store the key is to serve
as an identifier for selection of the correct object from the hash bucket
or hash overflows. If we use an "orthogonal" hash function to the indexing
hash function to create a hash_id, and the size of hash_id is 64 bits,
then the odds that a pair of entries in a hash bucket are the same is
1/18,446,744,073,709,551,616. We see that it is more likely that a
meteor will land on the computer! So by introducing the tiniest amount
of error, we gain tremendous space saving as well as having the
algorithmic benifits of a constant field size structure. Also note that
comparing hash_id's is MUCH faster than string comparisons.
REMARK: In reality, it is _more_ likely that a transient error will be made
in a key string comparison in the computer hardware than comparing two 64 bit
numbers. So an argument can be made that this hash_id style of identifying
a key is actually less error prone then string comparing! That is, while
on a perfect (theoretical) computer there is the tiniest amount that it
is more error prone, on real computers with electrical spikes and what not,
the hash_id should be more secure.
2) Store the hash value itself. As we no longer store the key, we can't
recompute the hash_value when we split buckets. Thus we need to store the
hash_value itself. This costs a bit of space, but as the hash values
don't need to be recomputed, it speeds up the splitting process.
So using this alternate data structure, we
1) use 12 bytes instead of the MUCH larger storage space required for
key storage, saving storage.
2) have the benifits of a constant field structure,
3) have numerical comparisons which are much faster,
4) don't need to recompute hash values when splitting.
If our hash identifier is orthogonal to the hash value function, then
on a real machine we loose no accuracy.
Limitations:
1) As far as the database structure is concerned, there is no limit on the
size of an entry's key.
2) The size of the blob file is limited to 18,446,744,073,709,551,616 bytes.
3) The size of any blob is limited to 18,446,744,073,709,551,616 bytes.
4) The number of reasonable entries in the file system is 4,294,967,296,
which is more than the number of web pages estimated on the Internet.
This can easily be enlarged.
In a very real sense, as far as the database structure is concerned,
the (hash_value,hash_id) pair are the key.
To prepare access to the database, there is a library function
struct bcdb_key bcdb_make_key (
const char *key_string,
const size_t key_length
);
where we have
struct bcdb_key {
uint32_t hash_value; // the hash value for entry.
uint64_t hash_id; // a key in bucket identifier.
};
The key can be of any length and contain embedded zeros, but it must
be less in size than key_length.
If key_length == 0, then make_bcdb_key() assumes that you have passed
a zero terminated "c-string" string.
Note: A c++ version of make_bcdb_key() that accepts string objects will
also be provided.
Multiple instances of a key are not allowed in the database.
To be more precise, if key_string_1 yields bcdb_key_1 and is in the hash
table, and key_string_2 yields bcdb_key_2 == bcdb_key_1 is submitted,
then the record for key_string_1 is updated for the data submitted for
key_string_2, provided the submission for key_2 completed it's write
phase and the hash table gets updated.
|