Update GeoIP processor documentation (#71211)
This PR adds documentation for the GeoIPv2 auto-update feature.
It also renames the related settings from geoip.downloader.* to ingest.geoip.downloader.* so that they follow the same convention as the existing setting.

Relates to #68920

Co-authored-by: Elastic Machine <[email protected]>
Co-authored-by: James Rodewig <[email protected]>
3 people authored Apr 15, 2021
1 parent d1c3b71 commit 308aee2
Showing 14 changed files with 243 additions and 42 deletions.
2 changes: 1 addition & 1 deletion build.gradle
@@ -520,7 +520,7 @@ subprojects {
subprojects { Project subproj ->
plugins.withType(TestClustersPlugin).whenPluginAdded {
testClusters.all {
systemProperty "geoip.downloader.enabled.default", "false"
systemProperty "ingest.geoip.downloader.enabled.default", "false"
}
}
}
4 changes: 2 additions & 2 deletions distribution/docker/docker-compose.yml
@@ -16,7 +16,7 @@ services:
- cluster.routing.allocation.disk.watermark.high=1b
- cluster.routing.allocation.disk.watermark.flood_stage=1b
- node.store.allow_mmap=false
- geoip.downloader.enabled=false
- ingest.geoip.downloader.enabled=false
- xpack.security.enabled=true
- xpack.security.transport.ssl.enabled=true
- xpack.security.http.ssl.enabled=true
@@ -69,7 +69,7 @@ services:
- cluster.routing.allocation.disk.watermark.high=1b
- cluster.routing.allocation.disk.watermark.flood_stage=1b
- node.store.allow_mmap=false
- geoip.downloader.enabled=false
- ingest.geoip.downloader.enabled=false
- xpack.security.enabled=true
- xpack.security.transport.ssl.enabled=true
- xpack.security.http.ssl.enabled=true
2 changes: 2 additions & 0 deletions docs/build.gradle
@@ -62,6 +62,8 @@ testClusters.matching { it.name == "integTest"}.configureEach {
if (singleNode().testDistribution == DEFAULT) {
setting 'xpack.license.self_generated.type', 'trial'
setting 'indices.lifecycle.history_index_enabled', 'false'
setting 'ingest.geoip.downloader.enabled', 'false'
systemProperty 'es.geoip_v2_feature_flag_enabled', 'true'
systemProperty 'es.shutdown_feature_flag_enabled', 'true'
keystorePassword 'keystore-password'
}
93 changes: 93 additions & 0 deletions docs/reference/ingest/apis/geoip-stats-api.asciidoc
@@ -0,0 +1,93 @@
[[geoip-stats-api]]
=== GeoIP stats API
++++
<titleabbrev>GeoIP stats</titleabbrev>
++++

Gets download statistics for GeoIP2 databases used with the
<<geoip-processor,`geoip` processor>>.

[source,console]
----
GET _ingest/geoip/stats
----

[[geoip-stats-api-request]]
==== {api-request-title}

`GET _ingest/geoip/stats`

[[geoip-stats-api-prereqs]]
==== {api-prereq-title}

* If the {es} {security-features} are enabled, you must have the `monitor` or
`manage` <<privileges-list-cluster,cluster privilege>> to use this API.

* If <<ingest-geoip-downloader-enabled,`ingest.geoip.downloader.enabled`>> is
`false`, this API returns zero values and an empty `nodes` object.

[role="child_attributes"]
[[geoip-stats-api-response-body]]
==== {api-response-body-title}

`stats`::
(object)
Download statistics for all GeoIP2 databases.
+
.Properties of `stats`
[%collapsible%open]
====
`successful_downloads`::
(integer)
Total number of successful database downloads.
`failed_downloads`::
(integer)
Total number of failed database downloads.
`total_download_time`::
(integer)
Total milliseconds spent downloading databases.
`database_count`::
(integer)
Current number of databases available for use.
`skipped_updates`::
(integer)
Total number of database updates skipped.
====

`nodes`::
(object)
Downloaded GeoIP2 databases for each node.
+
.Properties of `nodes`
[%collapsible%open]
====
`<node_id>`::
(object)
Downloaded databases for the node. The field key is the node ID.
+
.Properties of `<node_id>`
[%collapsible%open]
=====
`databases`::
(array of objects)
Downloaded databases for the node.
+
.Properties of `databases` objects
[%collapsible%open]
======
`name`::
(string)
Name of the database.
======

`files_in_temp`::
(array of strings)
Downloaded database files, including related license files. {es} stores these
files in the node's <<es-tmpdir,temporary directory>>:
`$ES_TMPDIR/geoip-databases/<node_id>`.
=====
====
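
As a rough illustration of how the response body described above might be consumed, the following Python sketch summarizes a response that has already been parsed from JSON. The sample values, the `example-node-id` node ID, and the `summarize_geoip_stats` helper are hypothetical. Only the field names come from the response body description.

```python
import json

# Hypothetical sample response: field names follow the response body
# described above; every value and the node ID are made up.
SAMPLE_RESPONSE = json.loads("""
{
  "stats": {
    "successful_downloads": 3,
    "failed_downloads": 1,
    "total_download_time": 1200,
    "database_count": 3,
    "skipped_updates": 0
  },
  "nodes": {
    "example-node-id": {
      "databases": [
        {"name": "GeoLite2-City.mmdb"},
        {"name": "GeoLite2-Country.mmdb"},
        {"name": "GeoLite2-ASN.mmdb"}
      ],
      "files_in_temp": ["GeoLite2-City.mmdb"]
    }
  }
}
""")


def summarize_geoip_stats(response):
    """Return download totals and the database names found on each node."""
    stats = response.get("stats", {})
    successful = stats.get("successful_downloads", 0)
    failed = stats.get("failed_downloads", 0)
    attempts = successful + failed
    databases_by_node = {
        node_id: [db["name"] for db in node.get("databases", [])]
        for node_id, node in response.get("nodes", {}).items()
    }
    return {
        "attempts": attempts,
        # None when nothing was attempted, for example when the
        # downloader is disabled and the API returns zero values.
        "success_ratio": successful / attempts if attempts else None,
        "databases_by_node": databases_by_node,
    }


summary = summarize_geoip_stats(SAMPLE_RESPONSE)
print(summary["attempts"], summary["success_ratio"])
```

Returning `None` for the ratio when no downloads were attempted keeps the disabled-downloader case distinguishable from a cluster where every download failed.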
24 changes: 19 additions & 5 deletions docs/reference/ingest/apis/index.asciidoc
@@ -1,15 +1,29 @@
[[ingest-apis]]
== Ingest APIs

The following ingest APIs are available for managing pipelines:
Use ingest APIs to manage tasks and resources related to <<ingest,ingest
pipelines>> and processors.

* <<put-pipeline-api>> to add or update a pipeline
* <<get-pipeline-api>> to return a specific pipeline
[[ingest-pipeline-apis]]
=== Ingest pipeline APIs

Use the following APIs to create, manage, and test ingest pipelines:

* <<put-pipeline-api>> to create or update a pipeline
* <<get-pipeline-api>> to retrieve a pipeline configuration
* <<delete-pipeline-api>> to delete a pipeline
* <<simulate-pipeline-api>> to simulate a call to a pipeline
* <<simulate-pipeline-api>> to test a pipeline

[[ingest-stat-apis]]
=== Stat APIs

Use the following APIs to get statistics about ingest processing:

* <<geoip-stats-api>> to get download statistics for GeoIP2 databases used with
the <<geoip-processor,`geoip` processor>>.

include::put-pipeline.asciidoc[]
include::get-pipeline.asciidoc[]
include::delete-pipeline.asciidoc[]
include::get-pipeline.asciidoc[]
include::geoip-stats-api.asciidoc[]
include::simulate-pipeline.asciidoc[]
129 changes: 113 additions & 16 deletions docs/reference/ingest/processors/geoip.asciidoc
@@ -4,21 +4,20 @@
<titleabbrev>GeoIP</titleabbrev>
++++

The `geoip` processor adds information about the geographical location of IP addresses, based on data from the Maxmind databases.
This processor adds this information by default under the `geoip` field. The `geoip` processor can resolve both IPv4 and
IPv6 addresses.

The `ingest-geoip` module ships by default with the GeoLite2 City, GeoLite2 Country and GeoLite2 ASN GeoIP2 databases from Maxmind made available
under the CCA-ShareAlike 4.0 license. For more details see, http://dev.maxmind.com/geoip/geoip2/geolite2/

The `geoip` processor can run with other city, country and ASN GeoIP2 databases
from Maxmind. The database files must be copied into the `ingest-geoip` config
directory located at `$ES_CONFIG/ingest-geoip`. Custom database files must be
stored uncompressed and the extension must be `-City.mmdb`, `-Country.mmdb`, or
`-ASN.mmdb` to indicate the type of the database. These database files can not
have the same filename as any of the built-in database names. The
`database_file` processor option is used to specify the filename of the custom
database to use for the processor.
The `geoip` processor adds information about the geographical location of an
IPv4 or IPv6 address.

[[geoip-automatic-updates]]
By default, the processor uses the GeoLite2 City, GeoLite2 Country, and GeoLite2
ASN GeoIP2 databases from
http://dev.maxmind.com/geoip/geoip2/geolite2/[MaxMind], shared under the
CCA-ShareAlike 4.0 license. {es} automatically downloads updates for
these databases from the Elastic GeoIP endpoint:
https://geoip.elastic.co/v1/database. To get download statistics for these
updates, use the <<geoip-stats-api,GeoIP stats API>>.

If your cluster can't connect to the Elastic GeoIP endpoint or you want to
manage your own updates, see <<manage-geoip-database-updates>>.

[[using-ingest-geoip]]
==== Using the `geoip` Processor in a Pipeline
@@ -29,7 +28,7 @@ database to use for the processor.
|======
| Name | Required | Default | Description
| `field` | yes | - | The field to get the ip address from for the geographical lookup.
| `target_field` | no | geoip | The field that will hold the geographical information looked up from the Maxmind database.
| `target_field` | no | geoip | The field that will hold the geographical information looked up from the MaxMind database.
| `database_file` | no | GeoLite2-City.mmdb | The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the `ingest-geoip` config directory.
| `properties` | no | [`continent_name`, `country_iso_code`, `country_name`, `region_iso_code`, `region_name`, `city_name`, `location`] * | Controls what properties are added to the `target_field` based on the geoip lookup.
| `ignore_missing` | no | `false` | If `true` and `field` does not exist, the processor quietly exits without modifying the document
@@ -300,6 +299,79 @@ GET /my_ip_locations/_search
// TESTRESPONSE[s/"took" : 3/"took" : $body.took/]
////

[[manage-geoip-database-updates]]
==== Manage your own GeoIP2 database updates

If you can't <<geoip-automatic-updates,automatically update>> your GeoIP2
databases from the Elastic endpoint, you have a few other options:

* <<use-proxy-geoip-endpoint,Use a proxy endpoint>>
* <<use-custom-geoip-endpoint,Use a custom endpoint>>
* <<manually-update-geoip-databases,Manually update your GeoIP2 databases>>

[[use-proxy-geoip-endpoint]]
**Use a proxy endpoint**

If you can't connect directly to the Elastic GeoIP endpoint, consider setting up
a secure proxy. You can then specify the proxy endpoint URL in the
<<ingest-geoip-downloader-endpoint,`ingest.geoip.downloader.endpoint`>> setting
of each node’s `elasticsearch.yml` file.

[[use-custom-geoip-endpoint]]
**Use a custom endpoint**

You can create a service that mimics the Elastic GeoIP endpoint. You can then
get automatic updates from this service.

. Download your `.mmdb` database files from the
http://dev.maxmind.com/geoip/geoip2/geolite2[MaxMind site].

. Copy your database files to a single directory.

. From your {es} directory, run:
+
[source,sh]
----
./bin/elasticsearch-geoip -s my/source/dir [-t target/directory]
----

. Serve the static database files from your directory. For example, you can use
Docker to serve the files from an nginx server:
+
[source,sh]
----
docker run -v "$(pwd)/my/source/dir":/usr/share/nginx/html:ro nginx
----

. Specify the service's endpoint URL in the
<<ingest-geoip-downloader-endpoint,`ingest.geoip.downloader.endpoint`>> setting
of each node’s `elasticsearch.yml` file.
+
By default, {es} checks the endpoint for updates every three days. To use
another polling interval, use the <<cluster-update-settings,update cluster
settings API>> to set
<<ingest-geoip-downloader-poll-interval,`ingest.geoip.downloader.poll.interval`>>.

[[manually-update-geoip-databases]]
**Manually update your GeoIP2 databases**

. Use the <<cluster-update-settings,update cluster settings API>> to set
`ingest.geoip.downloader.enabled` to `false`. This disables automatic updates
that may overwrite your database changes. This also deletes all downloaded
databases.

. Download your `.mmdb` database files from the
http://dev.maxmind.com/geoip/geoip2/geolite2[MaxMind site].
+
You can also use custom city, country, and ASN `.mmdb` files. These files must
be uncompressed and use the respective `-City.mmdb`, `-Country.mmdb`, or
`-ASN.mmdb` extensions.

. Copy the database files to `$ES_CONFIG/ingest-geoip`.

. In your `geoip` processors, configure the `database_file` parameter to use a
custom database file.
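
The last step can be sketched as follows. This Python snippet builds a pipeline body with a `geoip` processor that points at a custom database file. The `build_geoip_pipeline` helper, the `ip` field name, and the `My-Custom-City.mmdb` filename are hypothetical. The `field` and `database_file` options and the filename suffixes come from this page.

```python
import json

# Suffixes that custom database filenames must use, per the steps above.
ALLOWED_SUFFIXES = ("-City.mmdb", "-Country.mmdb", "-ASN.mmdb")


def build_geoip_pipeline(field, database_file):
    """Build an ingest pipeline body with one `geoip` processor.

    Hypothetical helper: it only checks the filename suffix and does not
    verify that the file exists in $ES_CONFIG/ingest-geoip.
    """
    if not database_file.endswith(ALLOWED_SUFFIXES):
        raise ValueError(f"unexpected database filename: {database_file}")
    return {
        "description": "Add geoip info from a custom database",
        "processors": [
            {"geoip": {"field": field, "database_file": database_file}}
        ],
    }


pipeline = build_geoip_pipeline("ip", "My-Custom-City.mmdb")
print(json.dumps(pipeline, indent=2))
```

The printed JSON could then be used as the request body when creating the pipeline with the <<put-pipeline-api>>.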

[[ingest-geoip-settings]]
===== Node Settings

@@ -310,3 +382,28 @@ The `geoip` processor supports the following setting:
The maximum number of results that should be cached. Defaults to `1000`.

Note that these settings are node settings and apply to all `geoip` processors, i.e. there is one cache for all defined `geoip` processors.

[[geoip-cluster-settings]]
===== Cluster settings

[[ingest-geoip-downloader-enabled]]
`ingest.geoip.downloader.enabled`::
(<<dynamic-cluster-setting,Dynamic>>, Boolean)
If `true`, {es} automatically downloads and manages updates for GeoIP2 databases
from the `ingest.geoip.downloader.endpoint`. If `false`, {es} does not download
updates and deletes all downloaded databases. Defaults to `true`.

[[ingest-geoip-downloader-endpoint]]
`ingest.geoip.downloader.endpoint`::
(<<static-cluster-setting,Static>>, string)
Endpoint URL used to download updates for GeoIP2 databases. Defaults to
`https://geoip.elastic.co/v1/database`. {es} stores downloaded database files in
each node's <<es-tmpdir,temporary directory>> at
`$ES_TMPDIR/geoip-databases/<node_id>`.

[[ingest-geoip-downloader-poll-interval]]
`ingest.geoip.downloader.poll.interval`::
(<<dynamic-cluster-setting,Dynamic>>, <<time-units,time value>>)
How often {es} checks for GeoIP2 database updates at the
`ingest.geoip.downloader.endpoint`. Must be at least `1d` (one day). Defaults
to `3d` (three days).
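
A minimal sketch of preparing an update cluster settings request body for the poll interval follows. The `poll_interval_settings` helper is hypothetical, it only accepts values expressed in whole days (`d`) or hours (`h`), and it assumes that a value of exactly one day is accepted. Placing the setting under `persistent` rather than `transient` is likewise just one option.

```python
import json
import re

MIN_POLL_INTERVAL_HOURS = 24  # one day, the documented lower bound


def poll_interval_settings(interval):
    """Return a cluster settings body for the given poll interval.

    Simplified on purpose: only whole numbers of days ("3d") or
    hours ("36h") are recognized.
    """
    match = re.fullmatch(r"(\d+)([dh])", interval)
    if match is None:
        raise ValueError(f"unsupported time value: {interval}")
    amount, unit = int(match.group(1)), match.group(2)
    hours = amount * 24 if unit == "d" else amount
    if hours < MIN_POLL_INTERVAL_HOURS:
        raise ValueError("poll interval must be at least one day")
    return {"persistent": {"ingest.geoip.downloader.poll.interval": interval}}


print(json.dumps(poll_interval_settings("2d")))
```

Checking the value on the client side only gives earlier feedback. {es} still validates the setting when it receives the request.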
5 changes: 0 additions & 5 deletions docs/reference/redirects.asciidoc
@@ -1513,8 +1513,3 @@ See <<put-enrich-policy-api>>.
=== Rollup API

See <<rollup-apis>>.

[role="exclude",id="geoip-stats-api"]
=== GeoIP stats API

coming::[7.x]
@@ -58,9 +58,9 @@ public class GeoIpDownloader extends AllocatedPersistentTask {

private static final Logger logger = LogManager.getLogger(GeoIpDownloader.class);

public static final Setting<TimeValue> POLL_INTERVAL_SETTING = Setting.timeSetting("geoip.downloader.poll.interval",
public static final Setting<TimeValue> POLL_INTERVAL_SETTING = Setting.timeSetting("ingest.geoip.downloader.poll.interval",
TimeValue.timeValueDays(3), TimeValue.timeValueDays(1), Property.Dynamic, Property.NodeScope);
public static final Setting<String> ENDPOINT_SETTING = Setting.simpleString("geoip.downloader.endpoint",
public static final Setting<String> ENDPOINT_SETTING = Setting.simpleString("ingest.geoip.downloader.endpoint",
"https://geoip.elastic.co/v1/database", Property.NodeScope);

public static final String GEOIP_DOWNLOADER = "geoip-downloader";
@@ -33,12 +33,12 @@

/**
* Persistent task executor that is responsible for starting {@link GeoIpDownloader} after task is allocated by master node.
* Also bootstraps GeoIP download task on clean cluster and handles changes to the 'geoip.downloader.enabled' setting
* Also bootstraps GeoIP download task on clean cluster and handles changes to the 'ingest.geoip.downloader.enabled' setting
*/
public final class GeoIpDownloaderTaskExecutor extends PersistentTasksExecutor<GeoIpTaskParams> implements ClusterStateListener {

private static final boolean ENABLED_DEFAULT = "false".equals(System.getProperty("geoip.downloader.enabled.default")) == false;
public static final Setting<Boolean> ENABLED_SETTING = Setting.boolSetting("geoip.downloader.enabled", ENABLED_DEFAULT,
private static final boolean ENABLED_DEFAULT = "false".equals(System.getProperty("ingest.geoip.downloader.enabled.default")) == false;
public static final Setting<Boolean> ENABLED_SETTING = Setting.boolSetting("ingest.geoip.downloader.enabled", ENABLED_DEFAULT,
Setting.Property.Dynamic, Setting.Property.NodeScope);

private static final Logger logger = LogManager.getLogger(GeoIpDownloader.class);
@@ -86,7 +86,7 @@ public static void filterDistros() {

@Before
public void setupTest() throws IOException {
installation = runContainer(distribution(), builder().envVars(Map.of("geoip.downloader.enabled", "false")));
installation = runContainer(distribution(), builder().envVars(Map.of("ingest.geoip.downloader.enabled", "false")));
tempDir = createTempDir(DockerTests.class.getSimpleName());
}

@@ -95,7 +95,7 @@ public void test11InstallPackageDistribution() throws Exception {
public void test12InstallDockerDistribution() throws Exception {
assumeTrue(distribution().isDocker());

installation = Docker.runContainer(distribution(), builder().envVars(Map.of("geoip.downloader.enabled", "false")));
installation = Docker.runContainer(distribution(), builder().envVars(Map.of("ingest.geoip.downloader.enabled", "false")));

try {
waitForPathToExist(installation.config("elasticsearch.keystore"));
@@ -273,7 +273,7 @@ public void test60DockerEnvironmentVariablePassword() throws Exception {

// restart ES with password and mounted keystore
Map<Path, Path> volumes = Map.of(localKeystoreFile, dockerKeystore);
Map<String, String> envVars = Map.of("KEYSTORE_PASSWORD", password, "geoip.downloader.enabled", "false");
Map<String, String> envVars = Map.of("KEYSTORE_PASSWORD", password, "ingest.geoip.downloader.enabled", "false");
runContainer(distribution(), builder().volumes(volumes).envVars(envVars));
waitForElasticsearch(installation);
ServerUtils.runElasticsearchTests();
@@ -304,7 +304,7 @@ public void test61DockerEnvironmentVariablePasswordFromFile() throws Exception {
Map<String, String> envVars = Map.of(
"KEYSTORE_PASSWORD_FILE",
"/run/secrets/" + passwordFilename,
"geoip.downloader.enabled",
"ingest.geoip.downloader.enabled",
"false"
);
