Improve online_delete configuration and DB tuning:
* Document delete_batch, back_off_milliseconds, age_threshold_seconds.
* Convert those time values to chrono types.
* Fix bug that ignored age_threshold_seconds.
* Add a "recovery buffer" to the config that gives the node a chance to
  recover before aborting online delete.
* Add begin/end log messages around the SQL queries.
* Add a new configuration section: [sqlite] to allow tuning the sqlite
  database operations. Ignored on full/large history servers.
* Update documentation of [node_db] and [sqlite] in the
  rippled-example.cfg file.
* Resolves #3321
ximinez committed Jun 2, 2020
1 parent 063c3b8 commit 6e9051e
Showing 15 changed files with 563 additions and 83 deletions.
149 changes: 126 additions & 23 deletions cfg/rippled-example.cfg
@@ -869,18 +869,62 @@
#
# These keys are possible for any type of backend:
#
# earliest_seq The default is 32570 to match the XRP ledger
# network's earliest allowed sequence. Alternate
# networks may set this value. Minimum value of 1.
# If a [shard_db] section is defined, and this
# value is present in either [node_db] or [shard_db],
# it must be defined with the same value in both
# sections.
#
# online_delete Minimum value of 256. Enable automatic purging
# of older ledger information. Maintain at least this
# number of ledger records online. Must be greater
# than or equal to ledger_history.
#
# advisory_delete 0 for disabled, 1 for enabled. If set, then
# require administrative RPC call "can_delete"
# to enable online deletion of ledger records.
# These keys modify the behavior of online_delete, and thus are only
# relevant if online_delete is defined and non-zero:
#
# earliest_seq The default is 32570 to match the XRP ledger
# network's earliest allowed sequence. Alternate
# networks may set this value. Minimum value of 1.
# advisory_delete 0 for disabled, 1 for enabled. If set, the
# administrative RPC call "can_delete" is required
# to enable online deletion of ledger records.
# Online deletion does not run automatically if
# non-zero and the last deletion was on a ledger
# greater than the current "can_delete" setting.
# Default is 0.
#
# delete_batch When automatically purging, SQLite database
# records are deleted in batches. This value
# controls the maximum size of each batch. Larger
# batches keep the databases locked for more time,
# which may cause other functions to fall behind,
# and thus cause the node to lose sync.
# Default is 100.
#
# back_off_milliseconds
# Number of milliseconds to wait between
# online_delete batches to allow other functions
# to catch up.
# Default is 100.
#
# age_threshold_seconds
# The online delete process will only run if the
# latest validated ledger is younger than this
# number of seconds.
# Default is 60.
#
# recovery_buffer_seconds
# The online delete process checks periodically
# that rippled is still in sync with the network,
# and that the validated ledger is less than
# *age_threshold_seconds* old. By default, if it
# is not, the online delete process aborts and
# tries again later. If *recovery_buffer_seconds*
# is set, online delete will wait this number of
# seconds for rippled to recover before it aborts.
# Set this value if the node is otherwise staying
# in sync, or recovering quickly.
# Default is unset.
#
# Notes:
# The 'node_db' entry configures the primary, persistent storage.
@@ -892,6 +936,12 @@
# [import_db] Settings for performing a one-time import (optional)
# [database_path] Path to the book-keeping databases.
#
# There are 4 or 5 bookkeeping SQLite databases that the server creates and
# maintains. If you omit this configuration setting, it will default to
# creating a directory called "db" located in the same place as your
# rippled.cfg file. Partial pathnames will be considered relative to
# the location of the rippled executable.
#
# [shard_db] Settings for the Shard Database (optional)
#
# Format (without spaces):
@@ -907,12 +957,64 @@
#
# max_size_gb Maximum disk space the database will utilize (in gigabytes)
#
# [sqlite] Tuning settings for the SQLite databases (optional)
#
# There are 4 bookkeeping SQLite database that the server creates and
# maintains. If you omit this configuration setting, it will default to
# creating a directory called "db" located in the same place as your
# rippled.cfg file. Partial pathnames will be considered relative to
# the location of the rippled executable.
# Format (without spaces):
# One or more lines of case-insensitive key / value pairs:
# <key> '=' <value>
# ...
#
# Example:
# synchronous=off
# journal_mode=off
#
# WARNING: These settings can have significant effects on data integrity,
# particularly in failure scenarios. It is strongly recommended that they
# be left at their defaults unless the server is having performance issues
# during normal operation or during automatic purging (online_delete)
# operations.
#
# Optional keys:
#
# safety_level Valid values: high, low
# The default is "high", and tunes the SQLite
# databases in the most reliable mode. "low"
# is equivalent to
# journal_mode=memory
# synchronous=off
# temp_store=memory
# These settings trade speed and reduced I/O
# for a higher risk of data loss. See the
# individual settings below for more information.
#
# journal_mode Valid values: delete, truncate, persist, memory, wal, off
# The default is "wal", which uses a write-ahead
# log to implement database transactions.
# Alternately, "memory" saves disk I/O, but if
# rippled crashes during a transaction, the
# database is likely to be corrupted.
# See https://www.sqlite.org/pragma.html#pragma_journal_mode
# for more details about the available options.
#
# synchronous Valid values: off, normal, full, extra
# The default is "normal", which works well with
# the "wal" journal mode. Alternatively, "off"
# allows rippled to continue as soon as data is
# passed to the OS, which can significantly
# increase speed, but risks data corruption if
# the host computer crashes before writing that
# data to disk.
# See https://www.sqlite.org/pragma.html#pragma_synchronous
# for more details about the available options.
#
# temp_store Valid values: default, file, memory
# The default is "file", which will use files
# for temporary database tables and indices.
# Alternatively, "memory" may save I/O, but
# rippled does not currently use many, if any,
# of these temporary objects.
# See https://www.sqlite.org/pragma.html#pragma_temp_store
# for more details about the available options.
#
#
#
@@ -1212,23 +1314,24 @@ medium

# This is the primary persistent datastore for rippled. This includes transaction
# metadata, account states, and ledger headers. Helpful information can be
# found here: https://ripple.com/wiki/NodeBackEnd
# delete old ledgers while maintaining at least 2000. Do not require an
# external administrative command to initiate deletion.
# found at https://xrpl.org/capacity-planning.html#node-db-type
# type=NuDB is recommended for non-validators with fast SSDs. Validators or
# slow / spinning disks should use RocksDB.
# online_delete=512 is recommended to delete old ledgers while maintaining at
# least 512.
# advisory_delete=0 allows the online delete process to run automatically
# when the node has approximately two times the "online_delete" value of
# ledgers. No external administrative command is required to initiate
# deletion.
[node_db]
type=RocksDB
path=/var/lib/rippled/db/rocksdb
open_files=2000
filter_bits=12
cache_mb=256
file_size_mb=8
file_size_mult=2
online_delete=2000
type=NuDB
path=/var/lib/rippled/db/nudb
online_delete=512
advisory_delete=0

# This is the persistent datastore for shards. It is important for the health
# of the ripple network that rippled operators shard as much as practical.
# NuDB requires SSD storage. Helpful information can be found here
# NuDB requires SSD storage. Helpful information can be found at
# https://ripple.com/build/history-sharding
#[shard_db]
#path=/var/lib/rippled/db/shards/nudb
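
Reviewer note: the online_delete tuning keys documented in rippled-example.cfg above interact in a way that a small sketch may make easier to follow. The code below is not taken from this commit. `OnlineDeleteSettings`, `Health`, `checkHealth`, `batchCount`, and `totalBackOff` are hypothetical names, the defaults are copied from the documented values, and the one-pause-per-batch rule is an assumption.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <optional>

// Hypothetical settings holder; fields mirror the documented config keys.
struct OnlineDeleteSettings
{
    std::size_t deleteBatch = 100;                       // delete_batch
    std::chrono::milliseconds backOff{100};              // back_off_milliseconds
    std::chrono::seconds ageThreshold{60};               // age_threshold_seconds
    std::optional<std::chrono::seconds> recoveryBuffer;  // unset by default
};

enum class Health { ok, waiting, aborted };

// Decide whether deletion may continue. `validatedLedgerAge` is the age of
// the latest validated ledger; `unhealthyFor` is how long the node has
// already failed the age check.
inline Health
checkHealth(
    OnlineDeleteSettings const& s,
    std::chrono::seconds validatedLedgerAge,
    std::chrono::seconds unhealthyFor)
{
    if (validatedLedgerAge < s.ageThreshold)
        return Health::ok;
    // With a recovery buffer configured, wait instead of aborting at once.
    if (s.recoveryBuffer && unhealthyFor < *s.recoveryBuffer)
        return Health::waiting;
    return Health::aborted;
}

// Number of batches needed to delete `rows` records (rounds up).
inline std::size_t
batchCount(OnlineDeleteSettings const& s, std::size_t rows)
{
    return (rows + s.deleteBatch - 1) / s.deleteBatch;
}

// Total pause time, assuming one back-off pause per batch.
inline std::chrono::milliseconds
totalBackOff(OnlineDeleteSettings const& s, std::size_t rows)
{
    using rep = std::chrono::milliseconds::rep;
    return s.backOff * static_cast<rep>(batchCount(s, rows));
}
```

Under these assumptions, deleting 250 rows with the default settings takes three batches and at least 300 ms of pauses. Smaller batches shorten each lock at the cost of more pauses overall.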
2 changes: 2 additions & 0 deletions src/ripple/app/consensus/RCLConsensus.cpp
@@ -1001,7 +1001,9 @@ void
RCLConsensus::Adaptor::updateOperatingMode(std::size_t const positions) const
{
if (!positions && app_.getOPs().isFull())
{
app_.getOPs().setMode(OperatingMode::CONNECTED);
}
}

void
4 changes: 2 additions & 2 deletions src/ripple/app/ledger/Ledger.cpp
@@ -228,14 +228,14 @@ Ledger::Ledger(
!txMap_->fetchRoot(SHAMapHash{info_.txHash}, nullptr))
{
loaded = false;
JLOG(j.warn()) << "Don't have TX root for ledger";
JLOG(j.warn()) << "Don't have TX root for ledger " << info_.seq;
}

if (info_.accountHash.isNonZero() &&
!stateMap_->fetchRoot(SHAMapHash{info_.accountHash}, nullptr))
{
loaded = false;
JLOG(j.warn()) << "Don't have AS root for ledger";
JLOG(j.warn()) << "Don't have AS root for ledger " << info_.seq;
}

txMap_->setImmutable();
5 changes: 3 additions & 2 deletions src/ripple/app/main/Application.cpp
@@ -1026,7 +1026,7 @@ class ApplicationImp : public Application, public RootStoppable, public BasicApp

// transaction database
mTxnDB = std::make_unique<DatabaseCon>(
setup, TxDBName, TxDBPragma, TxDBInit);
setup, TxDBName, true, TxDBPragma, TxDBInit);
mTxnDB->getSession() << boost::str(
boost::format("PRAGMA cache_size=-%d;") %
kilobytes(config_->getValueFor(SizedItem::txnDBCache)));
@@ -1065,7 +1065,7 @@ class ApplicationImp : public Application, public RootStoppable, public BasicApp

// ledger database
mLedgerDB = std::make_unique<DatabaseCon>(
setup, LgrDBName, LgrDBPragma, LgrDBInit);
setup, LgrDBName, true, LgrDBPragma, LgrDBInit);
mLedgerDB->getSession() << boost::str(
boost::format("PRAGMA cache_size=-%d;") %
kilobytes(config_->getValueFor(SizedItem::lgrDBCache)));
@@ -1075,6 +1075,7 @@ class ApplicationImp : public Application, public RootStoppable, public BasicApp
mWalletDB = std::make_unique<DatabaseCon>(
setup,
WalletDBName,
false,
std::array<char const*, 0>(),
WalletDBInit);
}
33 changes: 21 additions & 12 deletions src/ripple/app/main/DBInit.h
@@ -26,13 +26,23 @@ namespace ripple {

////////////////////////////////////////////////////////////////////////////////

// These pragmas are built at startup and applied to all database
// connections, unless otherwise noted.
inline constexpr char const* CommonDBPragmaJournal{"PRAGMA journal_mode=%s;"};
inline constexpr char const* CommonDBPragmaSync{"PRAGMA synchronous=%s;"};
inline constexpr char const* CommonDBPragmaTemp{"PRAGMA temp_store=%s;"};
// Default values will always be used for the common pragmas if
// at least this much ledger history is configured. This includes
// full history nodes. This is because such a large amount of data will
// be more difficult to recover if a rare failure occurs, and such failures
// are more likely with some of the other available tuning settings.
inline constexpr std::uint32_t SQLITE_TUNING_CUTOFF = 100'000'000;

// Ledger database holds ledgers and ledger confirmations
inline constexpr auto LgrDBName{"ledger.db"};

inline constexpr std::array<char const*, 3> LgrDBPragma{
{"PRAGMA synchronous=NORMAL;",
"PRAGMA journal_mode=WAL;",
"PRAGMA journal_size_limit=1582080;"}};
inline constexpr std::array<char const*, 1> LgrDBPragma{
{"PRAGMA journal_size_limit=1582080;"}};

inline constexpr std::array<char const*, 5> LgrDBInit{
{"BEGIN TRANSACTION;",
@@ -63,15 +73,14 @@ inline constexpr auto TxDBName{"transaction.db"};

inline constexpr
#if (ULONG_MAX > UINT_MAX) && !defined(NO_SQLITE_MMAP)
std::array<char const*, 6>
std::array<char const*, 4>
TxDBPragma
{
{
#else
std::array<char const*, 5> TxDBPragma {{
std::array<char const*, 3> TxDBPragma {{
#endif
"PRAGMA page_size=4096;", "PRAGMA synchronous=NORMAL;",
"PRAGMA journal_mode=WAL;", "PRAGMA journal_size_limit=1582080;",
"PRAGMA page_size=4096;", "PRAGMA journal_size_limit=1582080;",
"PRAGMA max_page_count=2147483646;",
#if (ULONG_MAX > UINT_MAX) && !defined(NO_SQLITE_MMAP)
"PRAGMA mmap_size=17179869184;"
@@ -115,10 +124,8 @@ inline constexpr std::array<char const*, 8> TxDBInit{
// Temporary database used with an incomplete shard that is being acquired
inline constexpr auto AcquireShardDBName{"acquire.db"};

inline constexpr std::array<char const*, 3> AcquireShardDBPragma{
{"PRAGMA synchronous=NORMAL;",
"PRAGMA journal_mode=WAL;",
"PRAGMA journal_size_limit=1582080;"}};
inline constexpr std::array<char const*, 1> AcquireShardDBPragma{
{"PRAGMA journal_size_limit=1582080;"}};

inline constexpr std::array<char const*, 1> AcquireShardDBInit{
{"CREATE TABLE IF NOT EXISTS Shard ( \
@@ -130,6 +137,7 @@ inline constexpr std::array<char const*, 1> AcquireShardDBInit{
////////////////////////////////////////////////////////////////////////////////

// Pragma for Ledger and Transaction databases with complete shards
// These override the CommonDBPragma values defined above.
inline constexpr std::array<char const*, 2> CompleteShardDBPragma{
{"PRAGMA synchronous=OFF;", "PRAGMA journal_mode=OFF;"}};

@@ -172,6 +180,7 @@ inline constexpr std::array<char const*, 6> WalletDBInit{

static constexpr auto stateDBName{"state.db"};

// These override the CommonDBPragma values defined above.
static constexpr std::array<char const*, 2> DownloaderDBPragma{
{"PRAGMA synchronous=FULL;", "PRAGMA journal_mode=DELETE;"}};

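
Reviewer note: the common pragma templates and the SQLITE_TUNING_CUTOFF constant in DBInit.h suggest how the [sqlite] settings could become PRAGMA statements at startup. The sketch below is one possible reading, not the code from this commit. `SqliteTuning`, `lowSafety`, `buildCommonPragmas`, and `tuningCutoff` are hypothetical names; the default values and the "low" preset are copied from the documented config text, and plain string concatenation stands in for the `%s` format strings.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical view of the [sqlite] section, initialized to the documented
// defaults (journal_mode=wal, synchronous=normal, temp_store=file).
struct SqliteTuning
{
    std::string journalMode = "wal";
    std::string synchronous = "normal";
    std::string tempStore = "file";
};

// Mirrors SQLITE_TUNING_CUTOFF: at or above this much configured ledger
// history, the defaults are always used.
constexpr std::uint32_t tuningCutoff = 100'000'000;

// Preset documented as equivalent to safety_level=low.
inline SqliteTuning
lowSafety()
{
    return SqliteTuning{"memory", "off", "memory"};
}

// Build the three common PRAGMA statements for a node configured with
// `ledgerHistory` ledgers of history.
inline std::array<std::string, 3>
buildCommonPragmas(SqliteTuning requested, std::uint32_t ledgerHistory)
{
    // Large and full-history servers ignore the tuning settings.
    if (ledgerHistory >= tuningCutoff)
        requested = SqliteTuning{};

    return {{
        "PRAGMA journal_mode=" + requested.journalMode + ";",
        "PRAGMA synchronous=" + requested.synchronous + ";",
        "PRAGMA temp_store=" + requested.tempStore + ";",
    }};
}
```

In this sketch, a node keeping 2000 ledgers with the "low" preset would get `PRAGMA synchronous=off;`, while the same request on a node configured at the cutoff would fall back to `PRAGMA synchronous=normal;`.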
6 changes: 4 additions & 2 deletions src/ripple/app/main/Main.cpp
@@ -542,7 +542,7 @@ run(int argc, char** argv)
}

auto txnDB = std::make_unique<DatabaseCon>(
dbSetup, TxDBName, TxDBPragma, TxDBInit);
dbSetup, TxDBName, false, TxDBPragma, TxDBInit);
auto& session = txnDB->getSession();
std::uint32_t pageSize;

@@ -555,7 +555,9 @@ run(int argc, char** argv)
session << "PRAGMA temp_store_directory=\"" << tmpPath.string()
<< "\";";
session << "VACUUM;";
session << "PRAGMA journal_mode=WAL;";
assert(dbSetup.CommonPragma);
for (auto const& p : *dbSetup.CommonPragma)
session << p;
session << "PRAGMA page_size;", soci::into(pageSize);

std::cout << "VACUUM finished. page_size: " << pageSize