Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adapt ES-Archiver to use savedObject types defined in the new platform #55860

Closed
pgayvallet opened this issue Jan 24, 2020 · 17 comments · Fixed by #56971
Closed

Adapt ES-Archiver to use savedObject types defined in the new platform #55860

pgayvallet opened this issue Jan 24, 2020 · 17 comments · Fixed by #56971
Assignees
Labels
blocker Feature:New Platform Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc

Comments

@pgayvallet
Copy link
Contributor

pgayvallet commented Jan 24, 2020

Currently, the es-archiver is creating a KibanaMigrator to migrate the kibana index after a load

// If we affected the Kibana index, we need to ensure it's migrated...
if (Object.keys(result).some(k => k.startsWith('.kibana'))) {
await migrateKibanaIndex({ client, log, kibanaPluginIds });
if (kibanaPluginIds.includes('spaces')) {
await createDefaultSpace({ client, index: '.kibana' });
}
}

However to do that, it manually gathers the uiExports from the legacy plugins to get the SO schemas/mappings/migrations

const getUiExports = async kibanaPluginIds => {
const xpackEnabled = kibanaPluginIds.includes('xpack_main');
const { spec$ } = await findPluginSpecs({
plugins: {
scanDirs: [path.resolve(__dirname, '../../../legacy/core_plugins')],
paths: xpackEnabled ? [path.resolve(__dirname, '../../../../x-pack')] : [],
},
});
const specs = await spec$.pipe(toArray()).toPromise();
return collectUiExports(specs);
};

export async function migrateKibanaIndex({ client, log, kibanaPluginIds }) {
const uiExports = await getUiExports(kibanaPluginIds);

savedObjectSchemas: new SavedObjectsSchema(uiExports.savedObjectSchemas),
savedObjectMappings: convertLegacyMappings(uiExports.savedObjectMappings),
savedObjectMigrations: uiExports.savedObjectMigrations,
savedObjectValidations: uiExports.savedObjectValidations,
callCluster: (path, ...args) => _.get(client, path).call(client, ...args),
};
return await new KibanaMigrator(migratorOptions).runMigrations();

After #55825 and the other SO types related issues are done, the SO types will be created from the new platform, and the archiver will not be able to be aware of them for the migration.

We need to find a way to have the archiver be aware of the NP-declared types. However, as the types registration is now done programmatically from the plugins setup phase, I'm really not seeing anything obvious to address this.

@pgayvallet pgayvallet added blocker Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc Feature:New Platform labels Jan 24, 2020
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-platform (Team:Platform)

@pgayvallet
Copy link
Contributor Author

One solution is to get back from our programmatic API to a more static declaration (as it's done for the plugin's configuration for example). This way, we would have the archiver scan for the SO definitions as it's currently doing for legacy plugins. But I think we would really like to avoid this if possible.

Other question that could be asked is why does the archiver needs to performs the migration after a load? could there be a workaround to avoid doing this?

@rudolf
Copy link
Contributor

rudolf commented Jan 24, 2020

Other question that could be asked is why does the archiver needs to performs the migration after a load? could there be a workaround to avoid doing this?

My guess is that because the functional test runner creates one kibana server for the entire test suite (not for each it() test) we only run migrations once. Most tests load an archive for the entire suite before, but some tests reload the archived data in a beforeEach. If the archive contains unmigrated data this won't be migrated before being persisted in ES which means SavedObjects.find() queries might return the wrong results.

To solve this, the esArchiver could call an internal SavedObjects API to retrieve all registered types, or the esArchiver could call an internal SavedObjects rerunMigrations API. This would also reduce the amount of duplication inside the esArchiver which is brittle to maintain because it's still in JS.

@pgayvallet
Copy link
Contributor Author

To solve this, the esArchiver could call an internal SavedObjects API to retrieve all registered types

Implementation depends on #55857, but as the esArchiver is not run from inside kibana, and as some SO-related internals are not serializable (thinking mostly of migrations), I don't think this is an option

or the esArchiver could call an internal SavedObjects rerunMigrations API. This would also reduce the amount of duplication inside the esArchiver which is brittle to maintain because it's still in JS.

Just to be sure to understand, as the esArchiver does not run from inside the kibana server during FTR, by internal SavedObjects API you mean an actual internal HTTP endpoint that esarchiver could call?

@rudolf
Copy link
Contributor

rudolf commented Jan 27, 2020

Even though the esArchiver runs in a separate process from the Kibana FTR server, it could start it's own Kibana server just for migrations which would give it access to core's setup/start API's. Similar to how we start the server in test_utils/kbn_server.ts

const root = createRootWithCorePlugins(kbnSettings);
await root.setup();
await root.start();

This internal server would have to use the same configuration as the FTR server (but listen on a different http port). The disadvantage would be that we potentially have a slower startup for test suites but since these two processes can run in parallel it might be negligible...

Another option would be to use IPC between the FTR parent process and the kibana server child process to send a "rerun migrations" message.

I don't know the internals of the FTR very well so there might be some complications I didn't consider. But I think it's worth exploring the options so that we don't design the API around our testing infrastructure.

@pgayvallet
Copy link
Contributor Author

Even though the esArchiver runs in a separate process from the Kibana FTR server, it could start it's own Kibana server just for migrations which would give it access to core's setup/start API's. Similar to how we start the server in test_utils/kbn_server.ts

If we go in this direction, I think it would be better to have a specific method on our root server (or somewhere else) that would performs only what is required for the migration (doing setup, but only performing the migration and not doing the whole server start lifecycle) to avoid booting up the whole thing (the http server port you mention is an issue, but I think there would be others).

I agree that avoiding exposing test-purposes APIs should be avoided.

I don't really like giving the capability to the es-archiver to boot a whole kibana/core Server, as it means heavy changes in the FTR archiver service instantiation (and we still need to confirm actual doability of this proposition), however I'm not sure I see a better thing to try right now.

ping @spalger @joshdover ^ your insight is welcome.

@pgayvallet
Copy link
Contributor Author

TIL: we actually already have a SO util for FTR in packages/kbn-dev-utils/src/kbn_client/kbn_client_saved_objects.ts, which is used by the archiver. It's currently only using existing APIs to perform get/create/update calls on saved objects.

However I have the feeling that adding an additional, internal API endpoint to perform a savedObject migration and using the test-util KbnClient to call the endpoint may be the quickest win if we don't find another proper solution...

@joshdover
Copy link
Contributor

However I have the feeling that adding an additional, internal API endpoint to perform a savedObject migration and using the test-util KbnClient to call the endpoint may be the quickest win if we don't find another proper solution...

This was going to be my suggestion. If we go this route, it would be great if this endpoint is only available if some CLI flag is passed to the Kibana server. While it is the "quick-win" solution, I don't think it's necessarily a bad one.

Another option would be to use the SavedObjects REST API to import the documents that EsArchiver needs to create. Since all documents that go through the import API are migrated by the Kibana server, this should solve the problem. This would only be necessary for data that is imported into the .kibana index. I don't know much about the internals of EsArchiver so I can't speak to what kind of effort this would require.

@pgayvallet
Copy link
Contributor Author

If we go this route, it would be great if this endpoint is only available if some CLI flag is passed to the Kibana server. While it is the "quick-win" solution, I don't think it's necessarily a bad one.

I know some where not big fan of 'test-only' flags, as it means different behavior during FTR (which are supposed to run in 'pure' production mode), and actual prod mode. But I agree that exposing that kind of endpoint in production should (really) be avoided.

Another option would be to use the SavedObjects REST API to import the documents that EsArchiver needs to create.

That would be great. However it seems the es archiver is actually dumping whole indexes instead of only SO documents (see test/functional/fixtures/es_archiver/kibana_sample_data_flights_index_pattern/mappings.json for example), so I'm not sure this is an option.

@elastic/kibana-operations @tylersmalley Can we get some insight / help here ?

@tylersmalley
Copy link
Contributor

My feeling is the proper solution would be to remove the need for migrations to be ran before functional tests. I have long disliked that the es_archive's were gunziped for multiple reasons - if we kept these in plain text, provided tooling to update them when new migrations are added, could we do away with needing to run migrations at all in functional tests?

@pgayvallet
Copy link
Contributor Author

@tylersmalley To be sure to understand: The only usage of executing a SO migration after es-archiver performs a load is because the actual test data are never updated / migrated and therefor needs to be migrated?

@joshdover
Copy link
Contributor

joshdover commented Jan 30, 2020

The only usage of executing a SO migration after es-archiver performs a load is because the actual test data are never updated / migrated and therefor needs to be migrated?

This sounds right from my experience.

The only other case I can think of is when maybe you want to test migrations themselves by loading old objects and then making sure features still work. I'm not even sure that is a current usecase but even if it is, those subset of tests could use the SO Import API instead.

@pgayvallet
Copy link
Contributor Author

those subset of tests could use the SO Import API instead.

The SO import API does not migrate imported objects?

@pgayvallet
Copy link
Contributor Author

I spent some time trying to see how we could adapt the savedObject's create API to be able to use it from es-archiver.

However, the data format for the archiver's documents is the 'raw' es format (RawDoc), and not our SavedObjectDoc type, which is the one expected for any SO API call.

Our SavedObjectsSerializer type can convert from one to another. However, it needs the types definition to instantiate, meaning that we are back to our initial issue: the archiver is not aware of the NP registered types. if we convert the raw objects to SO objects, this needs to be done on the kibana-server side.

Options

I think we only have two options here:

  1. Add a new /create_from_raw API (or add a 'from_raw' flag to the create endpoint) to our SO API.

We would be creating an API that would only be used for tests. If we go in this direction, I see no upside on this approach comparing to add a /perform_migration endpoint instead., which answer the same need, and is way easier to both implement and test.

  1. Migrate all the test data to convert from ES-type documents to proper SO documents, and use the /create SO API to load them

We will need proper tooling to convert all the test data. As the converter will need to be aware of registered types, the tool should be able to kickstart the server (at least to the end of the setup phase for plugin to register their types), load / convert / write the migrated datas back.
Also, as the test data contains both doc and index types, it should be able to only migrate the actual docs, and keep the other 'types' (ES types, not SO types) untouched.

This solution should provides both a one-time migration of the existing data, and adds proper tooling for plugin to be able to add/convert any new test data they would like to add later on.

Course of action

I think solution two is what we want long term. Having actual SO representation of our test data probably makes more sense than the actual low-level ES format, and we would be using proper SO api to load SOs instead of dumping them into ES.

However, these are some deep, long and sensible changes. As this issue is a blocker to implements SO registration in NP, but migrating the data to the new format is not, I think the best course of action is to implement the first solution now (the /migrate endpoint), and then later, when we are less time-stressed with NP migration, take the time we need to discuss more about the data migration and the tooling we want to provide around it.

If this is acceptable for everyone, I would start working in the migrate endpoint, and create a new issue for the long-term solution.

WDYT @joshdover @tylersmalley @rudolf

@rudolf
Copy link
Contributor

rudolf commented Feb 3, 2020

As this issue is a blocker to implements SO registration in NP, but migrating the data to the new format is not, I think the best course of action is to implement the first solution now (the /migrate endpoint), and then later, when we are less time-stressed with NP migration, take the time we need to discuss more about the data migration and the tooling we want to provide around it.

Completely agree 👍

@joshdover
Copy link
Contributor

Also agree with the workaround solution for now. Seems like technical debt that is worth the tradeoff. Let's make sure to open an issue with enough details on how to resolve this in the future.

@pgayvallet
Copy link
Contributor Author

Created #56727 for the long term solution. Will start working on the migrate endpoint asap.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
blocker Feature:New Platform Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants