DELETED www/html_node/Back_002dends.html Index: www/html_node/Back_002dends.html ================================================================== --- www/html_node/Back_002dends.html +++ www/html_node/Back_002dends.html @@ -1,106 +0,0 @@ - - - - -
--Previous: Tutorial, Up: Introduction [Contents][Index]
-By default, zeptodb uses the GNU dbm (GDBM) library to create and manipulate the DBM databases. -Alternatively, you may choose to use the -Kyoto Cabinet library -instead. This is specified by passing the ---with-kyotocabinet option to the configure script -before compiling zeptodb. -
-Note that databases created with these two different back-ends are -not compatible, thus databases created with Kyoto Cabinet can -only be accessed by zeptodb if it has been compiled with support for -the library. -
-Databases created with Kyoto Cabinet are required to have the -.kch file extension. By convention, databases created with -GDBM should have the .db file extension. -
-For most purposes, databases created with GDBM should be sufficient. -For particularly large data sets, however, Kyoto Cabinet is -preferred, since it can add values more quickly and has a much larger -upper limit on the database size. On the other hand, Kyoto Cabinet is -not as widely available in GNU/Linux distributions as GDBM so it often -must be installed manually. -
- - - - - Index: www/html_node/Commands.html ================================================================== --- www/html_node/Commands.html +++ www/html_node/Commands.html @@ -1,10 +1,10 @@ - - +Next: Copying This Manual, Previous: Introduction, Up: Top [Contents][Index]
Three commands are provided with zeptodb: zdbc
, for creating
+
Five commands are provided with zeptodb: zdbc
, for creating
databases, zdbs
for storing records in them, zdbf
,
-for fetching records, and zdbr
, for removing records.
+for fetching records, zdbr
, for removing records, and
+zdbi
for displaying information about a database.
• zdbc: | ||
• zdbs: | ||
• zdbf: | ||
• zdbr: | + | |
• zdbi: |
Previous: Copying This Manual, Up: Copying This Manual [Contents][Index]
Previous: Copying This Manual, Up: Top [Contents][Index]
zeptodb is a small collection of relatively tiny command-line tools for -interacting with DBM databases. For the uninitiated, DBM -databases are flat (non-relational) a databases; in other words, they -are persistent key-value hash tables. Typically they are created via a -library for C, Python, Perl, etc. These tools fill in a gap by providing -useful command-line tools. Some DBM libraries come with really basic -binaries for manipulating the databases, but they are not designed to be -very flexible or useful in the real world. +
zeptodb is a small collection of relatively tiny command-line tools +for interacting with DBM databases. DBM databases are flat +(non-relational) databases; in other words, they are persistent +key-value hash tables. Typically they are created via a library for C, +Python, Perl, etc. These tools fill in a gap by providing useful +command-line tools. Some DBM libraries come with really basic binaries +for manipulating the databases, but they are not designed to be very +flexible or useful in the real world.
-These tools may be helpful in scripts, for example, when persistant data -storage is needed but when a full database would be overkill. DBM -databases offer a constant look-up time for any record in them, as +
These tools may be helpful in scripts, for example, when persistent +data storage is needed but when a full database would be overkill. +DBM databases offer a constant look-up time for any record in them, as opposed to, say, searching through a text file, which scales linearly -with the number of lines in the file. Thus, scripts requiring fast data -look-up would benefit greatly from them. These commands may also be -useful if, for whatever reason, one would like to manipulate, via the -command-line or scripts, DBM databases created by other programs. +with the number of lines in the file. Thus, scripts requiring fast +data look-up would benefit greatly from them (but note that, of +course, disk access is slower than memory access, so if you really +need the performance and you can fit your table in memory, these are +not the appropriate tools). These commands may also be useful if, for +whatever reason, one would like to manipulate, via the command-line or +scripts, DBM databases created by other programs.
• Tutorial: | ||
• Back-ends: | + | |
• Common Options: |
-Next: Back-ends, Previous: Introduction, Up: Introduction [Contents][Index]
+Next: Common Options, Previous: Introduction, Up: Introduction [Contents][Index]The zeptodb tools are used to create small databases that are stored to -disk and then to store, fetch and remove records from those databases. -Note that these databases are much simpler than, say, SQL databases. -The databases follow the DBM format as created by the GDBM library -(see Back-ends). Each record in a DBM database consists of a key and -a value. All keys and values are stored as plain text, regardless of -their formats. +
The zeptodb tools are used to create small databases that are stored +to disk and then to store, fetch and remove records from those +databases. These databases are much simpler than, say, SQL databases, +so no queries need to be constructed. The databases follow the DBM +format as created by the GDBM library. Each record in a DBM database +consists of a key and a value. All keys and values are stored as +plain text, regardless of their formats.
First, you create a new database with zdbc
:
$ zdbc foo.db
Note: the following two paragraphs contain technical information that is -only necessary if you will be creating large databases with many -records. If that is not the case, you may safely skip them. -
-You can customize the creation of a database in two ways. The first is -by specifying the number of buckets that comprise the database, -specified via the -b/--num-buckets option. A DBM -database can be imagined as a series of buckets. When a new item is -added, an algorithm determines which bucket it belongs in based on its -key. Likewise, the same algorithm will be used in determining the -bucket from which to fetch an item. If each bucket only contains a -maximum of one item, then you are guaranteed to be able to find any item -in the same amount of time as any other item. On the other hand, if the -number of buckets is smaller than the number of items, then when you go -to fetch an item from a bucket, you might then have to search through -all the items in that bucket to find the one that you want. This might -slow you down. On the other hand, if the number of buckets is far -greater than the maximum number of items that will be added, the -algorithm will be wasteful. Thus it’s best to use a number of buckets -that will be slightly greater than the expected maximum number of items. -As a rule of thumb, use about four times more buckets. -
-The second option is the size (in bytes) of the memory mapped region to -use, via the -m/--mmap-size option. While the -database is stored on the disk as a file, when it is opened by zeptodb, -some or all of that file is mapped in a one-to-one manner with a region -of virtual memory. Thus, when the program reads from some address in -that region of memory, it reads directly from the corresponding address -in the file. This will generally speed up reading and writing compared -to traditional file access. If the memory-mapped region is smaller than -the size of the database, only portions of the file can be mapped at a -time, thus slowing down performance. Therefore, it is recommended to -use a sufficiently larger value than the size of the database (taking -into account the expected number of records and the size of the data -that is expected to fill the record values). -
-Thus, for a big database, you might do: -
-$ zdbc --num-buckets=10000 --mmap-size=512000000 big.db -
With the database created, you may now store values to it using
zdbs
. zdbs
normally takes its input from
stdin. It expects one record per line and for each key/value
pair to be separated by a delimiter character (’|’ by default). Note
that records are unique: an attempt to store a record with a
@@ -231,12 +189,12 @@
within your scripts.
-Next: Back-ends, Previous: Introduction, Up: Introduction [Contents][Index]
+Next: Common Options, Previous: Introduction, Up: Introduction [Contents][Index]-Next: Introduction, Previous: (dir), Up: (dir) [Contents][Index]
+Next: Introduction, Previous: (dir), Up: (dir) [Contents][Index]This manual is for zeptodb (version 2.0.2b, updated 17 November 2013). +
This manual is for zeptodb (version 3.0, updated 12 June 2016).
-Copyright © 2013 Brandon Invergo +
Copyright © 2013, 2016 Brandon Invergo
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; @@ -136,19 +137,21 @@
• Copying This Manual: • Index: + — The Detailed Node Listing — Introduction- • Tutorial: • Back-ends: + • Common Options: + Commands• zdbc: • zdbs: @@ -155,23 +158,27 @@ • zdbf: + • zdbr: • zdbi: + + Copying This Manual• GNU Free Documentation License: License for copying this manual. +
Index: www/html_node/zdbc.html ================================================================== --- www/html_node/zdbc.html +++ www/html_node/zdbc.html @@ -1,10 +1,10 @@ - - +-Next: Introduction, Previous: (dir), Up: (dir) [Contents][Index]
+Next: Introduction, Previous: (dir), Up: (dir) [Contents][Index]zeptodb: zdbc @@ -35,90 +35,62 @@ + - +
2.1 zdbc
-
zdbc
is used to create a new database file. It accepts two -options, one to choose the number of buckets for the database and the -other to choose the size of the memory-mapped region. These options -may only be set upon database creation and may not be altered later. +-
zdbc
is used to create a new database file. It accepts all +of the common options. Running the command on an existing database +will overwrite the existing contents!As a general rule of thumb, you should have around one to four times -as many buckets as entries in the database. So, if your database will -have 200 entries, you should specify 200 to 800 buckets. A greater -number of buckets lowers the probability of collisions (two entries -mapping to the same location). -
-If possible, you should set the size of the memory-mapped region (in -bytes) to be larger than the expected size of the database or -otherwise as large as possible. +
In addition to the database file to be used and the common options, +the
zdbc
command accepts the following options:-
Index: www/html_node/zdbf.html ================================================================== --- www/html_node/zdbf.html +++ www/html_node/zdbf.html @@ -1,10 +1,10 @@ - - +- -b, --num-buckets=NUM
-- -
The number of buckets to use -
-- -m, --mmap-size=NUM
-- -
The size (in bytes) of the memory-mapped region to use -
-- -v, --verbose
-- -
Print more run-time information -
-- -?, --help
-- -
Show helpful information -
-- --usage
-- -
Show shorter helpful information -
-- -V, --version
-Print the program version +
- -s, --sync
+Automatically synchronize all database operations to the disk.
zeptodb: zdbf @@ -35,40 +35,40 @@ + - + @@ -82,44 +82,28 @@ stdout. By default, only the corresponding values will be printed. However, if a delimiter character is provided, both keys and values will be printed. Finally, an option is available to simply print all records in the database. -In addition to the database file to be used, the
zdbf
-command accepts the following options: +In addition to the database file to be used and the common options, +the
zdbf
command accepts the following options:Index: www/html_node/zdbr.html ================================================================== --- www/html_node/zdbr.html +++ www/html_node/zdbr.html @@ -1,10 +1,10 @@ - - +
- -a, --all
-Fetch all the records in the database +
Fetch all the records in the database.
- -d, --delim=CHAR
Delimiter character to separate printed keys from values (default -none; only values will be printed) +none; only values will be printed).
- -i, --input=FILE
-- -
Read queries from a file instead of from stdin -
-- -v, --verbose
-- -
Print more run-time information -
-- -?, --help
-- -
Show helpful information -
-- --usage
-- -
Show shorter helpful information -
-- -V, --version
-Print the program version +
Read queries from a file instead of from stdin.
zeptodb: zdbr @@ -30,49 +30,49 @@ - + + - +-Previous: zdbf, Up: Commands [Contents][Index]
+Next: zdbi, Previous: zdbf, Up: Commands [Contents][Index]
2.4 zdbr
@@ -81,39 +81,27 @@ stdin or, optionally, they are read from a text file. If many records are removed from the database, some fragmentation can occur. In this case, it is advisable to reorganize the database, which is possible via the --reorganize option. -In addition to the database file to be used, the
zdbf
-command accepts the following options: +In addition to the database file to be used and the common options, +the
zdbr
command accepts the following options:Index: www/html_node/zdbs.html ================================================================== --- www/html_node/zdbs.html +++ www/html_node/zdbs.html @@ -1,10 +1,10 @@ - - +
- -i, --input=FILE
-Read queries from a file instead of from stdin +
Read queries from a file instead of from stdin.
- -r, --reorganize
-Reorganize the database +
- -
Reorganize the database.
- -v, --verbose
-- -
Print more run-time information -
-- -?, --help
-- -
Show helpful information -
-- --usage
-- -
Show shorter helpful information -
-- -V, --version
-Print the program version +
- -s, --sync
+Automatically synchronize all database operations to the disk.
zeptodb: zdbs @@ -35,40 +35,40 @@ + - + @@ -80,39 +80,27 @@ are entered via stdin or, optionally, they are read from an input file, with one record per line. Each record should consist of one key-value pair. The values should be separated from the keys by a common delimiter (’|’ by default), for example “key|value”. -In addition to the database file to be used, the
zdbs
-command accepts the following options: +In addition to the database file to be used and the common options, +the
zdbs
command accepts the following options:Index: www/zeptodb.dvi.gz ================================================================== --- www/zeptodb.dvi.gz +++ www/zeptodb.dvi.gz cannot compute difference between binary files Index: www/zeptodb.html ================================================================== --- www/zeptodb.html +++ www/zeptodb.html @@ -1,10 +1,10 @@ - - +
- -d, --delim=CHAR
-Delimiter character separating keys from values (default ’|’) +
Delimiter character separating keys from values (default ’|’).
- -i, --input=FILE
-Read new records from a file instead of from stdin +
- -
Read new records from a file instead of from stdin.
- -v, --verbose
-- -
Print more run-time information -
-- -?, --help
-- -
Show helpful information -
-- --usage
-- -
Show shorter helpful information -
-- -V, --version
-Print the program version +
- -s, --sync
+Automatically synchronize all database operations to the disk.
zeptodb @@ -29,44 +29,44 @@ - + + - +zeptodb
@@ -76,18 +76,19 @@
- 1 Introduction
- 2 Commands
- Appendix A Copying This Manual
@@ -97,18 +98,18 @@-Next: Introduction, Previous: (dir), Up: (dir) [Contents][Index]
+Next: Introduction, Previous: (dir), Up: (dir) [Contents][Index]zeptodb
-This manual is for zeptodb (version 2.0.2b, updated 17 November 2013). +
This manual is for zeptodb (version 3.0, updated 12 June 2016).
-Copyright © 2013 Brandon Invergo +
Copyright © 2013, 2016 Brandon Invergo
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; @@ -134,19 +135,21 @@
• Copying This Manual: • Index: + — The Detailed Node Listing — Introduction- • Tutorial: • Back-ends: + • Common Options: + Commands• zdbc: • zdbs: @@ -153,16 +156,20 @@ • zdbf: + • zdbr: • zdbi: + + Copying This Manual• GNU Free Documentation License: License for copying this manual. +
1 Introduction
-zeptodb is a small collection of relatively tiny command-line tools for -interacting with DBM databases. For the uninitiated, DBM -databases are flat (non-relational) a databases; in other words, they -are persistent key-value hash tables. Typically they are created via a -library for C, Python, Perl, etc. These tools fill in a gap by providing -useful command-line tools. Some DBM libraries come with really basic -binaries for manipulating the databases, but they are not designed to be -very flexible or useful in the real world. +
zeptodb is a small collection of relatively tiny command-line tools +for interacting with DBM databases. DBM databases are flat +(non-relational) databases; in other words, they are persistent +key-value hash tables. Typically they are created via a library for C, +Python, Perl, etc. These tools fill in a gap by providing useful +command-line tools. Some DBM libraries come with really basic binaries +for manipulating the databases, but they are not designed to be very +flexible or useful in the real world.
-These tools may be helpful in scripts, for example, when persistant data -storage is needed but when a full database would be overkill. DBM -databases offer a constant look-up time for any record in them, as +
These tools may be helpful in scripts, for example, when persistent +data storage is needed but when a full database would be overkill. +DBM databases offer a constant look-up time for any record in them, as opposed to, say, searching through a text file, which scales linearly -with the number of lines in the file. Thus, scripts requiring fast data -look-up would benefit greatly from them. These commands may also be -useful if, for whatever reason, one would like to manipulate, via the -command-line or scripts, DBM databases created by other programs. +with the number of lines in the file. Thus, scripts requiring fast +data look-up would benefit greatly from them (but note that, of +course, disk access is slower than memory access, so if you really +need the performance and you can fit your table in memory, these are +not the appropriate tools). These commands may also be useful if, for +whatever reason, one would like to manipulate, via the command-line or +scripts, DBM databases created by other programs.
- • Tutorial: • Back-ends: + • Common Options:
-Next: Back-ends, Previous: Introduction, Up: Introduction [Contents][Index]
+Next: Common Options, Previous: Introduction, Up: Introduction [Contents][Index]1.1 Tutorial
-The zeptodb tools are used to create small databases that are stored to -disk and then to store, fetch and remove records from those databases. -Note that these databases are much simpler than, say, SQL databases. -The databases follow the DBM format as created by the GDBM library -(see Back-ends). Each record in a DBM database consists of a key and -a value. All keys and values are stored as plain text, regardless of -their formats. +
The zeptodb tools are used to create small databases that are stored +to disk and then to store, fetch and remove records from those +databases. These databases are much simpler than, say, SQL databases, +so no queries need to be constructed. The databases follow the DBM +format as created by the GDBM library. Each record in a DBM database +consists of a key and a value. All keys and values are stored as +plain text, regardless of their formats.
First, you create a new database with
zdbc
:-$ zdbc foo.dbNote: the following two paragraphs contain technical information that is -only necessary if you will be creating large databases with many -records. If that is not the case, you may safely skip them. -
-You can customize the creation of a database in two ways. The first is -by specifying the number of buckets that comprise the database, -specified via the -b/--num-buckets option. A DBM -database can be imagined as a series of buckets. When a new item is -added, an algorithm determines which bucket it belongs in based on its -key. Likewise, the same algorithm will be used in determining the -bucket from which to fetch an item. If each bucket only contains a -maximum of one item, then you are guaranteed to be able to find any item -in the same amount of time as any other item. On the other hand, if the -number of buckets is smaller than the number of items, then when you go -to fetch an item from a bucket, you might then have to search through -all the items in that bucket to find the one that you want. This might -slow you down. On the other hand, if the number of buckets is far -greater than the maximum number of items that will be added, the -algorithm will be wasteful. Thus it’s best to use a number of buckets -that will be slightly greater than the expected maximum number of items. -As a rule of thumb, use about four times more buckets. -
-The second option is the size (in bytes) of the memory mapped region to -use, via the -m/--mmap-size option. While the -database is stored on the disk as a file, when it is opened by zeptodb, -some or all of that file is mapped in a one-to-one manner with a region -of virtual memory. Thus, when the program reads from some address in -that region of memory, it reads directly from the corresponding address -in the file. This will generally speed up reading and writing compared -to traditional file access. If the memory-mapped region is smaller than -the size of the database, only portions of the file can be mapped at a -time, thus slowing down performance. Therefore, it is recommended to -use a sufficiently larger value than the size of the database (taking -into account the expected number of records and the size of the data -that is expected to fill the record values). -
-Thus, for a big database, you might do: -
---$ zdbc --num-buckets=10000 --mmap-size=512000000 big.db -With the database created, you may now store values to it using
zdbs
.zdbs
normally takes its input from stdin. It expects one record per line and for each key/value pair to be separated by a delimiter character (’|’ by default). Note that records are unique: an attempt to store a record with a @@ -359,62 +327,87 @@ between multiple databases, storing the keys of one database as values in another database, allowing quite complex, but always fast, look-ups within your scripts.
- +- -Previous: Tutorial, Up: Introduction [Contents][Index]
1.2 Back-ends
- -By default, zeptodb uses the GNU dbm (GDBM) library to create and manipulate the DBM databases. -Alternatively, you may choose to use the -Kyoto Cabinet library -instead. This is specified by passing the ---with-kyotocabinet option to the configure script -before compiling zeptodb. -
-Note that databases created with these two different back-ends are -not compatible, thus databases created with Kyoto Cabinet can -only be accessed by zeptodb if it has been compiled with support for -the library. -
-Databases created with Kyoto Cabinet are required to have the -.kch file extension. By convention, databases created with -GDBM should have the .db file extension. -
-For most purposes, databases created with GDBM should be sufficient. -For particularly large data sets, however, Kyoto Cabinet is -preferred, since it can add values more quickly and has a much larger -upper limit on the database size. On the other hand, Kyoto Cabinet is -not as widely available in GNU/Linux distributions as GDBM so it often -must be installed manually. -
+ +1.2 Common Options
+ +The following options are available for all zeptodb commands. +
++
+ +- -b, --block-size=NUM
+- +
The block size (in bytes) to be used, representing the size of a +transfer from disk to memory. The default value is 512. +
+- -m, --mmap-size=NUM
+- +
The size (in bytes) of the memory-mapped region to be used. With a +value greater than zero, a memory map of the database will be created; +thus the size specified must be large enough to fit the entire +database. +
+- -c, --cache-size=NUM
+- +
The size (in bytes) of the bucket cache to be used. +
+- -l, --no-lock
+- +
Do not perform file locking on the database. +
+- -n, --no-mmap
+- +
Do not create a memory map of the database. +
+- -v, --verbose
+- +
Print more run-time information. +
+- -?, --help
+- +
Show helpful information. +
+- --usage
+- +
Show shorter helpful information. +
+- -V, --version
+- +
Print the program version. +
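The requirement that the memory map fit the entire database can be checked from a script before a command is run. The sketch below derives a `--mmap-size` value from the current size of the database file. The helper names and the doubling used as headroom are assumptions made for illustration, not recommendations from the manual.

```shell
#!/bin/sh
# Hypothetical helpers; not part of zeptodb.

# Print the size of a file in bytes.
file_size() {
    wc -c < "$1" | tr -d ' '
}

# Print a --mmap-size option covering twice the file's current size.
# The factor of two is an arbitrary allowance for growth.
mmap_option() {
    size=$(file_size "$1")
    printf '%s%s\n' '--mmap-size=' "$((size * 2))"
}
```

Something like `zdbs "$(mmap_option foo.db)" foo.db` would then pass the computed size, assuming foo.db already exists and that options may precede the database name.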
Next: Copying This Manual, Previous: Introduction, Up: Top [Contents][Index]
2 Commands
-Three commands are provided with zeptodb:
zdbc
, for creating +Five commands are provided with zeptodb:
zdbc
, for creating databases,zdbs
for storing records in them,zdbf
, -for fetching records, andzdbr
, for removing records. +for fetching records,zdbr
, for removing records, and +zdbi
for displaying information about a database.
• zdbc: • zdbs: • zdbf: + • zdbr: + • zdbi:
@@ -424,48 +417,20 @@ Next: zdbs, Previous: Commands, Up: Commands [Contents][Index]2.1 zdbc
-
zdbc
is used to create a new database file. It accepts two -options, one to choose the number of buckets for the database and the -other to choose the size of the memory-mapped region. These options -may only be set upon database creation and may not be altered later. +-
zdbc
is used to create a new database file. It accepts all +of the common options. Running the command on an existing database +will overwrite the existing contents!As a general rule of thumb, you should have around one to four times -as many buckets as entries in the database. So, if your database will -have 200 entries, you should specify 200 to 800 buckets. A greater -number of buckets lowers the probability of collisions (two entries -mapping to the same location). -
-If possible, you should set the size of the memory-mapped region (in -bytes) to be larger than the expected size of the database or -otherwise as large as possible. +
In addition to the database file to be used and the common options, +the
zdbc
command accepts the following options:-
- -b, --num-buckets=NUM
-- -
The number of buckets to use -
-- -m, --mmap-size=NUM
-- -
The size (in bytes) of the memory-mapped region to use -
-- -v, --verbose
-- -
Print more run-time information -
-- -?, --help
-- -
Show helpful information -
-- --usage
-- -
Show shorter helpful information -
-- -V, --version
-Print the program version +
- -s, --sync
+Automatically synchronize all database operations to the disk.
@@ -480,36 +445,24 @@ are entered via stdin or, optionally, they are read from an input file, with one record per line. Each record should consist of one key-value pair. The values should be separated from the keys by a common delimiter (’|’ by default), for example “key|value”. -In addition to the database file to be used, the
zdbs
-command accepts the following options: +In addition to the database file to be used and the common options, +the
zdbs
command accepts the following options:
- -d, --delim=CHAR
-Delimiter character separating keys from values (default ’|’) +
Delimiter character separating keys from values (default ’|’).
- -i, --input=FILE
-Read new records from a file instead of from stdin +
- -
Read new records from a file instead of from stdin.
- -v, --verbose
-- -
Print more run-time information -
-- -?, --help
-- -
Show helpful information -
-- --usage
-- -
Show shorter helpful information -
-- -V, --version
-Print the program version +
- -s, --sync
+Automatically synchronize all database operations to the disk.
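As a rough illustration of the record format that zdbs expects, the following shell sketch builds two "key|value" lines and checks that a line holds exactly one delimiter. The function names, the sample data and the rule that a record contains the delimiter only once are assumptions made for this example; they are not part of zeptodb.

```shell
#!/bin/sh
# Hypothetical helpers for preparing zdbs input; not part of zeptodb.

# Print two sample records, one "key|value" pair per line.
# printf reuses its format string for each pair of arguments.
sample_records() {
    printf '%s|%s\n' alice 42 bob 17
}

# Succeed if the argument holds exactly one '|' delimiter.
# Assumption: neither keys nor values contain the delimiter themselves.
is_record() {
    case $1 in
        *'|'*'|'*) return 1 ;;  # two or more delimiters
        *'|'*)     return 0 ;;  # exactly one delimiter
        *)         return 1 ;;  # no delimiter at all
    esac
}
```

With zeptodb installed and a database already created by zdbc, the records could then be piped in with something like `sample_records | zdbs foo.db`. The position of the database argument is assumed here from the tutorial's `zdbc foo.db` call.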
@@ -526,49 +479,33 @@ stdout. By default, only the corresponding values will be printed. However, if a delimiter character is provided, both keys and values will be printed. Finally, an option is available to simply print all records in the database. -In addition to the database file to be used, the
zdbf
-command accepts the following options: +In addition to the database file to be used and the common options, +the
zdbf
command accepts the following options:
- -a, --all
-Fetch all the records in the database +
Fetch all the records in the database.
- -d, --delim=CHAR
Delimiter character to separate printed keys from values (default -none; only values will be printed) +none; only values will be printed).
- -i, --input=FILE
-- -
Read queries from a file instead of from stdin -
-- -v, --verbose
-- -
Print more run-time information -
-- -?, --help
-- -
Show helpful information -
-- --usage
-- -
Show shorter helpful information -
-- -V, --version
-Print the program version +
Read queries from a file instead of from stdin.
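When a delimiter is given, each line printed by zdbf carries both the key and the value, which a script can split again. A minimal sketch, assuming the output holds one "key|value" pair per line (the exact output layout is not shown in this section, and `print_pairs` is a hypothetical helper):

```shell
#!/bin/sh
# Hypothetical helper; not part of zeptodb.

# Read "key<delim>value" lines on stdin and print "key => value".
# The delimiter is the first argument and defaults to '|'.
print_pairs() {
    delim=${1:-"|"}
    while IFS="$delim" read -r key value; do
        printf '%s => %s\n' "$key" "$value"
    done
}
```

For example, `zdbf --all --delim='|' foo.db | print_pairs` would list every record in this form, assuming the options are written before the database name.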
-Previous: zdbf, Up: Commands [Contents][Index]
+Next: zdbi, Previous: zdbf, Up: Commands [Contents][Index]2.4 zdbr
-
zdbr
is used to remove records from a database. The records @@ -576,39 +513,39 @@ stdin or, optionally, they are read from a text file. If many records are removed from the database, some fragmentation can occur. In this case, it is advisable to reorganize the database, which is possible via the --reorganize option.In addition to the database file to be used, the
zdbf
-command accepts the following options: +In addition to the database file to be used and the common options, +the
zdbr
command accepts the following options:+
- -i, --input=FILE
-Read queries from a file instead of from stdin +
Read queries from a file instead of from stdin.
- -r, --reorganize
-Reorganize the database +
- -
Reorganize the database.
- -v, --verbose
-- -
Print more run-time information -
-- -?, --help
-- -
Show helpful information -
-- --usage
-- -
Show shorter helpful information -
-- -V, --version
-Print the program version +
- -s, --sync
+Automatically synchronize all database operations to the disk.
+ + + +2.5 zdbi
+ +
zdbi
prints out information on a database file. It accepts +the common options. +
Index: www/zeptodb.html.gz ================================================================== --- www/zeptodb.html.gz +++ www/zeptodb.html.gz cannot compute difference between binary files Index: www/zeptodb.html_node.tar.gz ================================================================== --- www/zeptodb.html_node.tar.gz +++ www/zeptodb.html_node.tar.gz cannot compute difference between binary files Index: www/zeptodb.info.tar.gz ================================================================== --- www/zeptodb.info.tar.gz +++ www/zeptodb.info.tar.gz cannot compute difference between binary files Index: www/zeptodb.pdf ================================================================== --- www/zeptodb.pdf +++ www/zeptodb.pdf cannot compute difference between binary files Index: www/zeptodb.texi.tar.gz ================================================================== --- www/zeptodb.texi.tar.gz +++ www/zeptodb.texi.tar.gz cannot compute difference between binary files Index: www/zeptodb.txt ================================================================== --- www/zeptodb.txt +++ www/zeptodb.txt @@ -1,23 +1,24 @@ zeptodb 1 Introduction 1.1 Tutorial - 1.2 Back-ends + 1.2 Common Options 2 Commands 2.1 zdbc 2.2 zdbs 2.3 zdbf 2.4 zdbr + 2.5 zdbi Appendix A Copying This Manual A.1 GNU Free Documentation License Index zeptodb ******* -This manual is for zeptodb (version 2.0.2b, updated 17 November 2013). +This manual is for zeptodb (version 3.0, updated 12 June 2016). - Copyright (C) 2013 Brandon Invergo + Copyright (C) 2013, 2016 Brandon Invergo Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and @@ -31,82 +32,45 @@ 1 Introduction ************** zeptodb is a small collection of relatively tiny command-line tools for -interacting with "DBM databases". 
For the uninitiated, DBM databases -are flat (non-relational) a databases; in other words, they are -persistent key-value hash tables. Typically they are created via a -library for C, Python, Perl, etc. These tools fill in a gap by -providing useful command-line tools. Some DBM libraries come with -really basic binaries for manipulating the databases, but they are not -designed to be very flexible or useful in the real world. +interacting with "DBM databases". DBM databases are flat +(non-relational) databases; in other words, they are persistent +key-value hash tables. Typically they are created via a library for C, +Python, Perl, etc. These tools fill in a gap by providing useful +command-line tools. Some DBM libraries come with really basic binaries +for manipulating the databases, but they are not designed to be very +flexible or useful in the real world. These tools may be helpful in scripts, for example, when persistent data storage is needed but when a full database would be overkill. DBM databases offer a constant look-up time for any record in them, as opposed to, say, searching through a text file, which scales linearly with the number of lines in the file. Thus, scripts requiring fast data -look-up would benefit greatly from them. These commands may also be -useful if, for whatever reason, one would like to manipulate, via the -command-line or scripts, DBM databases created by other programs. +look-up would benefit greatly from them (but note that, of course, disk +access is slower than memory access, so if you really need the +performance and you can fit your table in memory, these are not the +appropriate tools). These commands may also be useful if, for whatever +reason, one would like to manipulate, via the command-line or scripts, +DBM databases created by other programs. 1.1 Tutorial ============ The zeptodb tools are used to create small databases that are stored to disk and then to store, fetch and remove records from those databases.
-Note that these databases are much simpler than, say, SQL databases. -The databases follow the DBM format as created by the GDBM library -(*note Back-ends::). Each record in a DBM database consists of a key +These databases are much simpler than, say, SQL databases, so no queries +need to be constructed. The databases follow the DBM format as created +by the GDBM library. Each record in a DBM database consists of a key and a value. All keys and values are stored as plain text, regardless of their formats. First, you create a new database with 'zdbc': $ zdbc foo.db - Note: the following two paragraphs contain technical information that -is only necessary if you will be creating large databases with many -records. If that is not the case, you may safely skip them. - - You can customize the creation of a database in two ways. The first -is by specifying the number of "buckets" that comprise the database, -specified via the '-b'/'--num-buckets' option. A DBM database can be -imagined as a series of buckets. When a new item is added, an algorithm -determines which bucket it belongs in based on its key. Likewise, the -same algorithm will be used in determining the bucket from which to -fetch an item. If each bucket only contains a maximum of one item, then -you are guaranteed to be able to find any item in the same amount of -time as any other item. On the other hand, if the number of buckets is -smaller than the number of items, then when you go to fetch an item from -a bucket, you might then have to search through all the items in that -bucket to find the one that you want. This might slow you down. On the -other hand, if the number of buckets is far greater than the maximum -number of items that will be added, the algorithm will be wasteful. -Thus it's best to use a number of buckets that will be slightly greater -than the expected maximum number of items. As a rule of thumb, use -about four times more buckets. 
- - The second option is the size (in bytes) of the memory mapped region -to use, via the '-m'/'--mmap-size' option. While the database is stored -on the disk as a file, when it is opened by zeptodb, some or all of that -file is mapped in a one-to-one manner with a region of virtual memory. -Thus, when the program reads from some address in that region of memory, -it reads directly from the corresponding address in the file. This will -generally speed up reading and writing compared to traditional file -access. If the memory-mapped region is smaller than the size of the -database, only portions of the file can be mapped at a time, thus -slowing down performance. Therefore, it is recommended to use a -sufficiently larger value than the size of the database (taking into -account the expected number of records and the size of the data that is -expected to fill the record values). - - Thus, for a big database, you might do: - - $ zdbc --num-buckets=10000 --mmap-size=512000000 big.db - With the database created, you may now store values to it using 'zdbs'. 'zdbs' normally takes its input from 'stdin'. It expects one record per line and for each key/value pair to be separated by a delimiter character ('|' by default). Note that records are unique: an attempt to store a record with a pre-existing key will overwrite that @@ -183,78 +147,66 @@ another script reads from that data. You can even build up relations between multiple databases, storing the keys of one database as values in another database, allowing quite complex, but always fast, look-ups within your scripts. -1.2 Back-ends -============= - -By default, zeptodb uses the GNU dbm (http://www.gnu.org/software/gdb) -(GDBM) library to create and manipulate the DBM databases. -Alternatively, you may choose to use the Kyoto Cabinet -(http://fallabs.com/kyotocabinet/) library instead. This is specified -by passing the '--with-kyotocabinet' option to the 'configure' script -before compiling zeptodb. 
- - Note that databases created with these two different back-ends are -_not_ compatible, thus databases created with Kyoto Cabinet can only be -accessed by zeptodb if it has been compiled with support for the -library. - - Databases created with Kyoto Cabinet are required to have the '.kch' -file extension. By convention, databases created with GDBM should have -the '.db' file extension. - - For most purposes, databases created with GDBM should be sufficient. -For particularly large data sets, however, Kyoto Cabinet is preferred, -since it can add values more quickly and has a much larger upper limit -on the database size. On the other hand, Kyoto Cabinet is not as widely -available in GNU/Linux distributions as GDBM so it often must be -installed manually. +1.2 Common Options +================== + +The following options are available for all zeptodb commands. + +'-b, --block-size=NUM' + The block size (in bytes) to be used, representing the size of a + transfer from disk to memory. The default value is 512. + +'-m, --mmap-size=NUM' + The size (in bytes) of the memory-mapped region to be used. With a + value greater than zero, a memory map of the database will be + created; thus the size specified must be large enough to fit the + entire database. + +'-c, --cache-size=NUM' + The size (in bytes) of the bucket cache to be used. + +'-l, --no-lock' + Do not perform file locking on the database. + +'-n, --no-mmap' + Do not create a memory map of the database. + +'-v, --verbose' + Print more run-time information. + +'-?, --help' + Show helpful information. + +'--usage' + Show shorter helpful information. + +'-V, --version' + Print the program version. 2 Commands ********** -Three commands are provided with zeptodb: 'zdbc', for creating -databases, 'zdbs' for storing records in them, 'zdbf', for fetching -records, and 'zdbr', for removing records. 
+Five commands are provided with zeptodb: 'zdbc', for creating databases, +'zdbs' for storing records in them, 'zdbf', for fetching records, +'zdbr', for removing records, and 'zdbi' for displaying information +about a database. 2.1 zdbc ======== -'zdbc' is used to create a new database file. It accepts two options, -one to choose the number of buckets for the database and the other to -choose the size of the memory-mapped region. These options may only be -set upon database creation and may not be altered later. +'zdbc' is used to create a new database file. It accepts all of the +common options. Running the command on an existing database will +_overwrite_ the existing contents! - As a general rule of thumb, you should have around one to four times -as many buckets as entries in the database. So, if your database will -have 200 entries, you should specify 200 to 800 buckets. A greater -number of buckets lowers the probability of collisions (two entries -mapping to the same location). + In addition to the database file to be used and the common options, +the 'zdbc' command accepts the following options: - If possible, you should set the size of the memory-mapped region (in -bytes) to be larger than the expected size of the database or otherwise -as large as possible. - -'-b, --num-buckets=NUM' - The number of buckets to use - -'-m, --mmap-size=NUM' - The size (in bytes) of the memory-mapped region to use - -'-v, --verbose' - Print more run-time information - -'-?, --help' - Show helpful information - -'--usage' - Show shorter helpful information - -'-V, --version' - Print the program version +'-s, --sync' + Automatically synchronize all database operations to the disk. 2.2 zdbs ======== 'zdbs' is used to store records in a database file. Records are entered @@ -261,30 +213,21 @@ via 'stdin' or, optionally, they are read from an input file, with one record per line. Each record should consist of one key-value pair. 
The values should be separated from the keys by a common delimiter ('|' by default), for example "key|value". - In addition to the database file to be used, the 'zdbs' command -accepts the following options: + In addition to the database file to be used and the common options, +the 'zdbs' command accepts the following options: '-d, --delim=CHAR' - Delimiter character separating keys from values (default '|') + Delimiter character separating keys from values (default '|'). '-i, --input=FILE' - Read new records from a file instead of from 'stdin' + Read new records from a file instead of from 'stdin'. -'-v, --verbose' - Print more run-time information - -'-?, --help' - Show helpful information - -'--usage' - Show shorter helpful information - -'-V, --version' - Print the program version +'-s, --sync' + Automatically synchronize all database operations to the disk. 2.3 zdbf ======== 'zdbf' is used to fetch records from a database file. Queries are read @@ -292,34 +235,22 @@ that match the queries will be printed to 'stdout'. By default, only the corresponding values will be printed. However, if a delimiter character is provided, both keys and values will be printed. Finally, an option is available to simply print all records in the database. - In addition to the database file to be used, the 'zdbf' command -accepts the following options: + In addition to the database file to be used and the common options, +the 'zdbf' command accepts the following options: '-a, --all' - Fetch all the records in the database + Fetch all the records in the database. '-d, --delim=CHAR' Delimiter character to separate printed keys from values (default - none; only values will be printed) + none; only values will be printed). 
'-i, --input=FILE' - Read queries from a file instead of from 'stdin' - -'-v, --verbose' - Print more run-time information - -'-?, --help' - Show helpful information - -'--usage' - Show shorter helpful information - -'-V, --version' - Print the program version + Read queries from a file instead of from 'stdin'. 2.4 zdbr ======== 'zdbr' is used to remove records from a database. The records to be @@ -327,30 +258,27 @@ optionally, they are read from a text file. If many records are removed from the database, some fragmentation can occur. In this case, it is advisable to reorganize the database, which is possible via the '--reorganize' option. - In addition to the database file to be used, the 'zdbr' command -accepts the following options: + In addition to the database file to be used and the common options, +the 'zdbr' command accepts the following options: '-i, --input=FILE' - Read queries from a file instead of from 'stdin' + Read queries from a file instead of from 'stdin'. '-r, --reorganize' - Reorganize the database + Reorganize the database. -'-v, --verbose' - Print more run-time information +'-s, --sync' + Automatically synchronize all database operations to the disk. -'-?, --help' - Show helpful information +2.5 zdbi +======== -'--usage' - Show shorter helpful information - -'-V, --version' - Print the program version +'zdbi' prints out information on a database file. It accepts the common +options. Appendix A Copying This Manual ****************************** A.1 GNU Free Documentation License Index: www/zeptodb.txt.gz ================================================================== --- www/zeptodb.txt.gz +++ www/zeptodb.txt.gz cannot compute difference between binary files