Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow running multiple isolated instances of gaia_db_server #661

Merged
merged 38 commits into from
May 19, 2021

Conversation

simone-gaia
Copy link
Contributor

@simone-gaia simone-gaia commented May 10, 2021

This change introduces the possibility of running multiple instances of gaia_db_server in complete isolation. Each instance has its own name which is used to create the socket and shm names. Whoever needs to connect to the gaia_db_server (client code, gaiac, gaiat, etc..) has to specify the instance name they want to connect to. If nothing is provided a default instance name is used.

The tests now run each into their own instance of gaia_db_server. The instance name is "randomly" generated at startup.

db

  • Add session_opts_t struct to pass to begin_session() to allow specifying the instance name to connect to.
    • NOTE: this has been hidden from the public API until we ensure there are no global states that can mess across sessions.
  • Add db_client_config.hpp as internal API to help managing default session_opts_t.
  • Add server_conf_t class to pass to server_t to configure the server behavior, including the db instance name.
  • Use the instance name to create the db_server socket and shared memory segments.
  • Add server_instance_t to help spin up and tear down instances of gaia_db_server.
    This is only exposed for testing ATM. It could be used beyond that.

catalog

  • ddl_executor_t and type_metadata_t store the singleton instance into a thread_local variable since multiple sessions on different instances may be running at the same time.

common

  • Add API to generate random numbers and strings.

gaiac

  • Add --instance-name|-n parameter
  • Remove --destroy-db, it is no longer necessary with --instance-name

gaiat

  • Add -n parameter

Cmake

  • Add INSTANCE_NAME parameter to cmake functions
  • Remove "serialize parallel builds" treick from schemas compilations.
  • Disable FALTBUFFERS test compilation

Tests

  • Each test now runs into its own instance of gaia_db_server. This will help running them in parallel in the future.
  • Explicit call to db_test_base_t::SetUpTestSuite() that initializes the Gaia db instance.
  • Deprecate the tests that inject the DDL from the environment variable in favor of schema_loader.
  • test_recovery is now part of ctest.

std::vector<const char*> command = get_server_command();

// TODO this should be wrapped into a if_debug_enabled()
std::string command_str;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like Hungarian, except str is a suffix instead of a prefix.

This is very trivial code, functionally, but the naming makes it very mysterious. What exactly is being done here?

Copy link
Contributor

@LaurentiuCristofor LaurentiuCristofor May 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this meant to be something like:

get_server_command -> get_server_command_line_arguments
command -> command_line_arguments
part -> command_line_argument
command_str -> command_line or just command

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot call it get_server_command_line_arguments because it is not just the arguments, but also the command itself. As mentioned in the comment above, I could change to: create_server_stratup_command

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could call it get_server_command_and_arguments then.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a big fan of the and within method names, fine this time.

// NOLINTNEXTLINE(cppcoreguidelines-pro-type-const-cast)
if (-1 == ::execve(command[0], const_cast<char**>(command.data()), nullptr))
{
ASSERT_UNREACHABLE(fmt::format("Failed to execute '{}'", m_conf.server_exec_path.c_str()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Punctuation missing in the message.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again I think this is an abuse of asserts. You should not just turn any fatal error into an assert. There could be any number of reasons why execve() might fail, unrelated to any logical programming error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What you should be doing here is just using the existing facility throw_system_error() to automatically extract errno and its associated system description. There are many examples of this in the DB code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change to system_error.

}
}

gaia_log::sys().debug("Server instance {} started with pid:{}", instance_name(), m_server_pid);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd punctuate log messages too, but if this hasn't been done yet, it's probably best left for a separate cleanup change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to come up with guidelines for logging.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They were supposed to be the same as for messages as far as punctuation and capitalization was concerned.
Note that one value is preceded by a colon and the other not - the use of colons should be consistent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added punctuation and colon to each message


gaia::db::begin_session(session_opts);
}
catch (gaia::common::system_error& ex)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weren't we going with just e for exceptions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

// I don't know if this is by design or a bug.
// @chuan to review (if it is a bug I will file a JIRA)
std::unique_lock lock(m_gaia_parser_lock);
schema_loader_t::instance().load_schema("addr_book.ddl");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess @chuan didn't expect to have to load metadata multiple times. This is another reason not to allow clients to access different servers. Our assumption has always been that we deal with 1 server, 1 metadata, etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The real problem here, I think, is that a layered design was rejected early on, so now everything (predictably) depends on everything else. Logically, the low-level client API shouldn't depend on anything but the corresponding low-level DB feature set ("dbcore"). In that case, there would be no issue with allowing different sessions in the same process to connect to different servers. The existence of process-global data here is the problem, not the introduction of multiple server instances. Any client state which depends on server state should have session scope.

Copy link
Contributor

@LaurentiuCristofor LaurentiuCristofor May 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existence of process-global data here is the problem, not the introduction of multiple server instances.

The existence of the process-global data is due to the assumption that we only need to care about a single server instance. Obviously, changing that assumption now is a problem. We cannot come and say - "changing the assumption is not a problem, we should not have made it in the first place". We made it and nobody pushed back.

This also has nothing to do with the layered design issues. That is an orthogonal problem. Having a layered design is not incompatible with making the assumption that we work with a single server instance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point I was trying to make is that a layered design would have avoided this because the session API would be oblivious to the higher-level features (e.g. the catalog) that are now entangled with the client's process-global state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"assumption that we only need to care about a single server instance" I think this assumption was a bad assumption to begin with.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused. gaia parser which is just a wrapper around flex/bison generated parser is thread-safe. Executing DDLs in parallel is not multi-thread safe in the sense DDLs do not support txns.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the comment to:

        // TODO The gaia parser does not support running
        //  multiple DDL from different threads.

Note that this test is likely to be disabled after the meetings we will have tomorrow about DB sessions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mean DDL executor. Parser is just a string parsing library.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the parser itself that fails, not the DDL executor.

{
gaia_log::initialize({});

s_server_instance = server_instance_t{};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a difference between this line and this one:

s_server_instance = server_instance_t();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case (no arguments) there is no difference. I always use braces for consistency.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see the value of braces at declaration time, but in code it seems confusing to use them when not necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

Copy link
Contributor

@LaurentiuCristofor LaurentiuCristofor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally completed my pass through the entire PR.

My main concern is about introducing the ability for a client to connect to different server instances. I don't think we should support that.

Other than that, the comments I made are just fit&finish type of stuff.

@@ -146,6 +146,7 @@ endif()
# Rocksdb.
add_subdirectory(${ROCKSDB} rocksdb EXCLUDE_FROM_ALL)
# Flatbuffers.
set(FLATBUFFERS_BUILD_TESTS OFF)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that we don't build flatbuffers tests, saving a little bit of build time.

@@ -13,6 +13,7 @@
#include <thread>

#include <flatbuffers/flatbuffers.h>
#include <spdlog/fmt/fmt.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to use fmt, we should use an independent version instead of the embedded or bundled version in spdlog.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this is a temporary work-around. Created a follow-up task: https://gaiaplatform.atlassian.net/browse/GAIAPLAT-938

production/gaia.conf Outdated Show resolved Hide resolved
@simone-gaia
Copy link
Contributor Author

simone-gaia commented May 18, 2021

@senderista @LaurentiuCristofor @chuan It was not necessary to customize the reading of the configuration file.What we have is an internal APi that allows setting the default session_options_t:

    // This is not visible to the customer.
    gaia::db::config::session_options_t session_options;
    session_options.db_instance_name = instance_name;
    gaia::db::config::set_default_session_options(session_options);

    // This is visible to the customer.
    gaia::db::begin_session();

When the user call:

    gaia::system::initialize("./gaia.conf", "./gaia_log.conf");

initialize internally call the set_default_session_options using the instance name found in the config file.

That being said, I till intend to make the session_name configurable via Environment variable and also the configuration file modifiable at runtime, in a different PR though.

@LaurentiuCristofor LaurentiuCristofor dismissed their stale review May 18, 2021 22:33

Reverting change request after meeting.

// All rights reserved.
/////////////////////////////////////////////

#include "gaia_internal/db/db_server_instance.hpp"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this and db_server_instance.cpp be moved under test since they look like test helper utility?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right, there are other things that should be grouped under a "test" section. I created a jira task for it:

https://gaiaplatform.atlassian.net/browse/GAIAPLAT-954

Copy link
Contributor

@senderista senderista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple trivial comments, I don't have time to do a proper review and I trust the general approach.

}
server_instance_config_t server_conf = server_instance_config_t::get_default();
server_conf.instance_name = instance_name;
server_conf.server_exec_path = string(path_to_db_server) + "/" + string(c_db_server_exec_name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you normalize this with std::filesystem::path? I want to make this robust since this is user-supplied data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


if (!fs::exists(db_exec_path))
{
throw common::gaia_exception("Impossible to find the database server path");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Database server path does not exist."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@simone-gaia simone-gaia merged commit f92b0e1 into master May 19, 2021
@simone-gaia simone-gaia deleted the rondelli-db-server-sessions branch August 11, 2021 15:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants