- Forced Writes on Linux now works!
- Databases on raw devices
- Remote interface improvements
- API changes
- Configuration and tuning
- Other global improvements
Some global improvements and changes have been implemented in Firebird 2.1, as engine development moves towards the architectural changes planned for Firebird 3.
Note: Unless otherwise indicated, these improvements apply from v.2.1 forward.
For maximum database safety, we configure databases for synchronous writes, a.k.a.
Forced Writes ON. This mode - strongly recommended for normal production usage - makes the
write() system call return only after the physical write to disk is complete. In turn, it guarantees that, after a
COMMIT, any data modified by the transaction is physically on the hard-drive, not waiting in the operating system's cache.
Its implementation on Linux was very simple - invoke
fcntl(dbFile, F_SETFL, O_SYNC).
Yet databases on Linux were sometimes corrupted anyway.
Speed tests on Linux showed that setting
O_SYNC on a file has no effect at all on performance! A fine, fast operating system, we might think? Alas, no: it is a documented bug in the Linux kernel!
According to the Linux manual, "On Linux this command (i.e.
F_SETFL) can only change the
O_NONBLOCK flag". Though it is not documented in any place known to me, it turns out that an attempt to set any flag other than those listed in the manual (such as
O_SYNC, for example) does not work, but neither does it cause
fcntl() to return an error.
For Firebird, and for InterBase versions since Day One, it means that Forced Writes has never worked on Linux. It certainly works on Windows. It seems likely that other operating systems are not affected, although we cannot guarantee that. To make sure, you can check whether the implementation of
fcntl() on your OS is capable of setting the O_SYNC flag.
The technique used currently, introduced in the Beta 2 release of Firebird 2.1, is to re-open the file. It should guarantee correct operation on any OS, provided the
open() system call honours the O_SYNC flag correctly. No such problems have been reported so far.
The Firebird developers have no idea why such a bug would remain unfixed almost two years after getting into the Linux kernel's bug-tracker. Apparently, in Linux, a documented bug evolves into a feature...
Here's a tip if you want to do an instant fix for the problem in an older version of Firebird: use the
sync option when mounting any partition with a Firebird database on board. An example of such a line in /etc/fstab:
/dev/sda9 /usr/database ext3 noatime,sync 1 2
File system I/O can degrade performance severely when a database in Forced Writes mode grows rapidly. On Linux, which lacks the appropriate system calls to grow the database efficiently, performance with Forced Writes can be as much as three times slower than with asynchronous writes.
When such conditions prevail, performance may be greatly enhanced by bypassing the file system entirely and restoring the database directly to a raw device. A Firebird database can be recreated on any type of block device.
Moving your database to a raw device can be as simple as restoring a backup directly to an unformatted partition in the local storage system. For example,
gbak -c my.fbk /dev/sda7
will restore your database on the third logical disk in the extended partition of your first SCSI or SATA hard drive (/dev/sda).
Note: The database does not have a "database name" other than the device name itself. In the example given, the name of the database is /dev/sda7.
The physical backup utility
nbackup must be supplied with an explicit file path and name for its difference file, in order to avoid this file being written into the
/dev/ directory. You can achieve this with the following statement, using isql:
# isql /dev/sda7
SQL> alter database add difference file '/tmp/dev_sda7';
To keep the size of the
nbak copy within reasonable bounds, it is of benefit to know how much storage on the device is actually occupied. The
-s switch of
nbackup will return the size of the database in database pages:
# nbackup -s -l /dev/sda7
77173
Don't confuse the result here with the block size of the device. The figure returned —
77173 — is the number of pages occupied by the database. Calculate the physical size (in bytes) as (number of pages * page size). In this example, with a page size of 4096 bytes, the database occupies 77173 * 4096 = 316,100,608 bytes, or roughly 301 Mb. If you are unsure of the page size, you can query it from the database header using gstat:
# gstat -h /dev/sda7
Database "/dev/sda7"
Database header page information:
    Flags          0
    Checksum       12345
    Generation     43
    Page size      4096 <———
    ODS version    11.1
    . . . . . . .
nbackup usage with a raw device
1. A backup can be performed in a script, using the output from the
-s switch directly. For example,
# DbFile=/dev/sda7
# DbSize=`nbackup -L $DbFile -S` || exit 1
# dd if=$DbFile ibs=4k count=$DbSize | # compress and record to DVD
# nbackup -N $DbFile
2. A physical backup using
nbackup directly from the command line:
# nbackup -B 0 /dev/sda7 /tmp/lvl.0
Although no other specific issues are known at this point about the use of raw device storage for databases, keep in mind that
- the growth and potential growth of the database is less obvious to end-users than one that lives as a file within a file system. If control of the production system's environment is out of your direct reach, be certain to deploy adequate documentation for any monitoring that will be required!
- the very Windows-knowledgeable might want to try out the concept of raw device storage on Windows systems. It has not been a project priority to explore how it might be achieved on that platform. However, if you think you know a way to do it, please feel welcome to test the idea in your Windows lab and report your observations - good or bad or indifferent - back to the firebird-devel list.
Tip: Maintain your raw devices in
aliases.conf. That way, in the event of needing to reconfigure the storage hardware, there will be no need to alter any connection strings in your application code.
V. Khorsun, D. Yemanov
The remote protocol has been slightly improved to perform better in slow networks. In order to achieve this, more advanced packet batching is now performed, along with some buffer transmission optimizations. In a real-world test scenario, these changes showed about 50 per cent fewer API round trips, thus incurring about 40 per cent fewer TCP round trips.
In Firebird 2.1 the remote interface limits the packet size of the responses to the various
isc_XXX_info calls to the length of the data they actually contain. Previously, the full specified buffer was sent back to the client even if only 10 bytes of it were actually filled; the Firebird 2.1 remote interface sends back only those 10 bytes.
Some of our users should see a benefit from the changes, especially two-tier clients accessing databases over the Internet.
The changes can be summarised as
a. Batched packets delivery. Requires both client and server to be v.2.1 or higher; enabled upon a successful protocol handshake. Sending of packets of certain types that can be deferred is delayed, for batched transfer with the next packet. (Allocate/deallocate statement operations come into this category, for example.)
b. Pre-fetching some pieces of information about a statement or request and caching them on the client side for (probable) following API calls. Implemented on the client side only, but relies partly on the benefits of reduced round trips described in (a).
It works with any server version, even possibly providing a small benefit for badly written client applications, although best performance is not to be expected if the client is communicating with a pre-v.2.1 server.
c. Reduced information responses from the engine (no trailing zeroes). As the implementation is server-side only, it requires a v.2.1 server and any client. Even old clients will work with Firebird 2.1 and see some benefit from the reduction of round trips, although the old remote interface, unlike the new, will still send back big packets for these info requests.
d. Another round-trip saver, termed "defer execute", whereby
SELECT requests will be held at the point just before execution of the
isc_dsql_execute until the next API call on the same statement. The benefit of the saved round-trip becomes most visible where there is a bunch of
SELECT requests whose result set fits into one or two network packets.
This enhancement takes effect only if both client and server are v.2.1 or higher.
Note: A faintly possible side-effect is that, if
isc_dsql_execute should happen to fail with a certain exception, this exception is returned to the client in the response to the API call that was actually responsible; i.e., instead of being returned by
isc_dsql_execute it would be returned by
isc_dsql_info, or whichever API call actually dispatched the deferred request.
In most cases, the side-effect would be transparent: it might show up in a case where some error occurred with default values for PSQL parameters or variables and would be noticed as an exception array where the exceptions were delivered in an unusual sequence.
The changes work with either TCP/IP or NetBEUI. They are backward-compatible, so existing client code will not be broken. However, when you are using a driver layer that implements its own interpretation of the remote protocol - such as the Jaybird JDBC and the FirebirdClient .NET drivers - your existing code will not enable the enhancements unless you use drivers that are updated.
A. dos Santos Fernandes
The identifier of the connection character set or, when the connection character set is
NONE, the BLOB character set, is now passed in the
XSQLVAR::sqlscale item of text BLOBs.
An optimization was done for index scanning when more than one index is to be scanned with AND conjunctions.
Optimization was done for sparse bitmap operations (set, test and clear) when values are mostly consecutive.
- The maximum number of hash slots is raised from 2048 to 65,536. Because the actual setting should be a prime number, the exact supported maximum is 65,521 (the biggest prime number below 65,536). The minimum is 101.
- The new default number of hash slots is 1009.
- The default lock table size has been increased to 1 Mb on all platforms.
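For reference, these limits map onto the lock-table parameters in firebird.conf. A sketch, using the defaults quoted above (treat the comments as a summary of this section, not as authoritative parameter documentation):

```
# firebird.conf
# Number of hash slots in the lock table; should be a prime number
# (minimum 101, maximum 65521).
LockHashSlots = 1009

# Initial size of the lock table, in bytes (1 Mb default).
LockMemSize = 1048576
```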
Page sizes of 1K and 2K are deprecated as inefficient.
Note: The small page restriction applies to new databases only. Old ones can be attached to regardless of their page size.
Until v.2.1, Firebird had no special rules about allocating disk space for database file pages. Because of dependencies between pages that it maintains itself, to service its "careful write" strategy, it has just written to newly-allocated pages in indeterminate order.
For databases using ODS 11.1 and higher, Firebird servers from v.2.1 onward use a different algorithm for allocating disk space, to address two recognised problems associated with the existing approach:
1. Corruptions resulting from out-of-space conditions on disk
The indeterminate order of writes can give rise to a situation where the page cache contains a large number of dirty pages and, in the process of writing them out, Firebird requests space for a new page only to find that there is insufficient disk space to fulfil the request. Under such conditions it often happens that the administrator decides to shut down the database in order to make some more disk space available, causing the remaining dirty pages in the cache to be lost. This leads to serious corruptions.
2. File fragmentation
Allocating disk space in relatively small chunks can lead to significant fragmentation of the database file at file system level, impairing the performance of large scans, as during a backup, for example.
The solution is to introduce some rules and rationales to govern page writes according to the state of available disk space, as follows:
a. Each newly allocated page is written to disk immediately, before being returned to the engine. If the page cannot be written, the allocation does not happen: the PIP bit remains uncleared and the appropriate I/O error is raised. Corruption cannot arise, since it is guaranteed that all dirty pages in cache have disk space allocated and can be written safely.
Because this change adds an extra write for each newly-allocated page, some performance penalty is to be expected. To mitigate the effect, writes of newly-allocated pages are performed in batches of up to 128 Kb and Firebird keeps track of the number of these "initialized" pages in the PIP header.
Note: A page that has been allocated, released and re-allocated is already "space in hand", meaning that no further verification is required in order to "initialize" it. Hence, a newly allocated page is subjected to this double-write only if it is a block that has never been allocated before.
b. To address the issue of file fragmentation, Firebird now uses the appropriate call to the API of the file system to preallocate disk space in relatively large chunks.
Preallocation also gives room to avoid corruptions in the event of an "out of disk space" condition. Chances are that the database will have enough space preallocated to continue operating until the administrator can make some disk space available.
Windows only (for now)
Currently, only Windows file systems publish such API calls, which means that, for now, this aspect of the solution is supported only in the Windows builds of Firebird. However, similar facilities have recently been added to the Linux API, allowing the prospect that a suitable API function call will appear in such popular file systems as
ext3 in future.
For better control of disk space preallocation, the new parameter
DatabaseGrowthIncrement has been added to
firebird.conf. It represents the upper limit for the preallocation chunk size in bytes.
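For example, to cap the preallocation chunk at 128 Mb (the value shown is purely illustrative; consult the configuration chapter for the shipped default):

```
# firebird.conf
# Upper limit for the preallocation chunk size, in bytes (128 Mb here).
DatabaseGrowthIncrement = 134217728
```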
Firebird uses and maintains its own cache in memory for page buffers. The operating system, in turn, may recache Firebird's cache in its own file system cache. If Firebird is configured to use a cache that is large relative to the available RAM and Forced Writes is on, this cache duplication drains resources for little or no benefit.
Often, when the operating system tries to cache a big file, it moves the Firebird page cache to the swap, causing intensive, unnecessary paging. In practice, if the Firebird page cache size for Superserver is set to more than 80 per cent of the available RAM, resource problems will be extreme.
Note: File system caching is of some benefit on file writes, but only if Forced Writes is
OFF, which is not recommended for most conditions.
Now, Superserver on both Windows and POSIX can be configured by a new configuration parameter,
MaxFileSystemCache, to prevent or enable file system caching. It may provide the benefit of freeing more memory for other operations such as sorting and, where there are multiple databases, reduce the demands made on host resources.
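A minimal sketch of the setting in firebird.conf (the value is illustrative only; the parameter expresses a threshold in database pages, so check the configuration chapter for the exact semantics and default before relying on it):

```
# firebird.conf
# Threshold, in database pages, governing whether the operating
# system's file system cache is used for database files.
MaxFileSystemCache = 65536
```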
Note: For Classic, there is no escaping file system caching.
For details of the
MaxFileSystemCache parameter, see
The background garbage collector process was reading all back versions of records on a page, including those created by active transactions. Since back versions of active records cannot be considered for garbage collection, it was wasteful to read them.
The engine will now release external table files as soon as they are no longer in use by user requests.
A. dos Santos Fernandes
A. dos Santos Fernandes
Conversion of temporary blobs to the destination blob type now occurs when materializing.
Introduced a type flag for stored procedures, adding column
RDB$PROCEDURE_TYPE to the table
RDB$PROCEDURES. Possible values are:
|0|Legacy procedure (no validation checks are performed).|
|1|Selectable procedure (one that contains a SUSPEND statement).|
|2|Executable procedure (no SUSPEND statement).|
The configuration parameter
BugcheckAbort provides the capability to make the server stop trying to continue operation after a bugcheck and instead, to call
abort() immediately and dump a core file. Since a bugcheck usually occurs as a result of a problem the server does not recognise, continuing operation with an unresolved problem is not usually possible anyway, and the core dump can provide useful debug information.
In the more recent Linux distributions the default setups no longer dump core automatically when an application crashes. Users often have trouble trying to get them working. Differing rules for Classic and Superserver, combined with a lack of consistency between the OS setup tools from distro to distro, make it difficult to help out with any useful "general rule".
Code has been added for Classic and Superserver on Linux to bypass these problems and automate generation of a core dump file when a
BUGCHECK occurs. The Firebird server will make the required
cwd (change working directory) to an appropriate writable location (
/tmp) and set the core file size limit so that the 'soft' limit equals the 'hard' limit.
Note: In a release version, the automated core-dumping is active only when the
BugcheckAbort parameter in
firebird.conf is set to true (
1). In a debug version, it is always active.
If you need to enable the facility, don't forget that the server needs to be restarted to activate a parameter change.
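In a release build, enabling the facility is a one-line change to firebird.conf, followed by a server restart:

```
# firebird.conf
# Call abort() and dump core on a bugcheck instead of trying
# to continue operation.
BugcheckAbort = 1
```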