diff --git a/docs/playground/images/chat-interface.png b/docs/playground/images/chat-interface.png
new file mode 100644
index 0000000000000..8f13fba59666e
Binary files /dev/null and b/docs/playground/images/chat-interface.png differ
diff --git a/docs/playground/images/edit-query.png b/docs/playground/images/edit-query.png
new file mode 100644
index 0000000000000..a1122dbf9ba6a
Binary files /dev/null and b/docs/playground/images/edit-query.png differ
diff --git a/docs/playground/images/select-indices.png b/docs/playground/images/select-indices.png
new file mode 100644
index 0000000000000..9fe455bdb0be2
Binary files /dev/null and b/docs/playground/images/select-indices.png differ
diff --git a/docs/playground/index.asciidoc b/docs/playground/index.asciidoc
new file mode 100644
index 0000000000000..fe0aaea05a305
--- /dev/null
+++ b/docs/playground/index.asciidoc
@@ -0,0 +1,203 @@
+[role="xpack"]
+[[playground]]
+= Playground
+
+preview::[]
+
+// Variable (attribute) definition
+:x: Playground
+
+Use {x} to combine your {es} data with the power of large language models (LLMs) for retrieval augmented generation (RAG).
+The chat interface translates your natural language questions into {es} queries, retrieves the most relevant results from your {es} documents, and passes those documents to the LLM to generate tailored responses.
+
+Once you start chatting, use the UI to view and modify the {es} queries that search your data.
+You can also view the underlying Python code that powers the chat interface, and download this code to integrate into your own application.
+
+Learn how to get started on this page.
+Refer to the following for more advanced topics:
+
+* <<playground-context>>
+* <<playground-query>>
+* <<playground-troubleshooting>>
+
+[float]
+[[playground-how-it-works]]
+== How {x} works
+
+Here's a simplified overview of how {x} works:
+
+* User *creates a connection* to an LLM provider
+* User *selects a model* to use for generating responses
+* User *defines the model's behavior and tone* with initial instructions
+** *Example*: "_You are a friendly assistant for question-answering tasks. Keep responses as clear and concise as possible._"
+* User *selects {es} indices* to search
+* User *enters a question* in the chat interface
+* {x} *autogenerates an {es} query* to retrieve relevant documents
+** User can *view and modify the underlying {es} query* in the UI
+* {x} *auto-selects relevant fields* from the retrieved documents to pass to the LLM
+** User can *edit which fields are targeted*
+* {x} passes the *filtered documents* to the LLM
+** The LLM generates a response based on the original query, initial instructions, chat history, and {es} context
+* User can *view the Python code* that powers the chat interface
+** User can also *download the code* to integrate into an application
+
+[float]
+[[playground-availability-prerequisites]]
+== Availability and prerequisites
+
+For Elastic Cloud and self-managed deployments, {x} is available in the *Search* space in {kib}, under *Content* > *{x}*.
+
+For Elastic Serverless, {x} is available in your {es} project UI.
+// TODO: Confirm URL path for Serverless
+
+To use {x}, you'll need the following:
+
+1. An Elastic *v8.14.0+* deployment or {es} *Serverless* project. (Start a https://cloud.elastic.co/registration[free trial].)
+2. At least one *{es} index* with documents to search.
+** See <<playground-getting-started-ingest>> if you'd like to ingest sample data.
+3. An account with a *supported LLM provider*.
+{x} supports the following:
++
+[cols="2a,2a,1a", options="header"]
+|===
+| Provider | Models | Notes
+
+| *Amazon Bedrock*
+a|
+* Anthropic: Claude 3 Sonnet
+* Anthropic: Claude 3 Haiku
+a|
+Does not currently support streaming.
+
+| *OpenAI*
+a|
+* GPT-3.5 turbo
+* GPT-4 turbo
+a|
+
+| *Azure OpenAI*
+a|
+* GPT-3.5 turbo
+* GPT-4 turbo
+a|
+
+|===
+
+[float]
+[[playground-getting-started]]
+== Getting started
+
+[float]
+[[playground-getting-started-connect]]
+=== Connect to LLM provider
+
+To get started with {x}, you need to create a <> for your LLM provider.
+Follow these steps on the {x} landing page:
+
+. Under *Connect to LLM*, click *Create connector*.
+. Select your *LLM provider*.
+. *Name* your connector.
+. Select a *URL endpoint* (or use the default).
+. Enter *access credentials* for your LLM provider.
+
+[TIP]
+====
+If you need to update a connector or add a new one, click the wrench button (🔧) under *Model settings*.
+====
+
+[float]
+[[playground-getting-started-ingest]]
+=== Ingest data (optional)
+
+_You can skip this step if you already have data in one or more {es} indices._
+
+There are many options for ingesting data into {es}, including:
+
+* The {enterprise-search-ref}/crawler.html[Elastic crawler] for web content (*NOTE*: Not yet available in _Serverless_)
+* {enterprise-search-ref}/connectors.html[Elastic connectors] for data synced from third-party sources
+* The {es} {ref}/docs-bulk.html[Bulk API] for JSON documents
++
+.*Expand* for example
+[%collapsible]
+==============
+To add a few documents to an index called `books`, run the following in Dev Tools Console:
+
+[source,console]
+----
+POST /_bulk
+{ "index" : { "_index" : "books" } }
+{"name": "Snow Crash", "author": "Neal Stephenson", "release_date": "1992-06-01", "page_count": 470}
+{ "index" : { "_index" : "books" } }
+{"name": "Revelation Space", "author": "Alastair Reynolds", "release_date": "2000-03-15", "page_count": 585}
+{ "index" : { "_index" : "books" } }
+{"name": "1984", "author": "George Orwell", "release_date": "1985-06-01", "page_count": 328}
+{ "index" : { "_index" : "books" } }
+{"name": "Fahrenheit 451", "author": "Ray Bradbury", "release_date": "1953-10-15", "page_count": 227}
+{ "index" : { "_index" : "books" } }
+{"name": "Brave New World", "author": "Aldous Huxley", "release_date": "1932-06-01", "page_count": 268}
+{ "index" : { "_index" : "books" } }
+{"name": "The Handmaids Tale", "author": "Margaret Atwood", "release_date": "1985-06-01", "page_count": 311}
+----
+==============
+
+We've also provided Jupyter notebooks that make it easy to ingest sample data into {es}.
+Find these in the https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/ingestion-and-chunking[elasticsearch-labs] repository.
+These notebooks use the official {es} Python client.
+// TODO: [The above link will be broken until https://github.com/elastic/elasticsearch-labs/pull/232 is merged]
+
+[float]
+[[playground-getting-started-index]]
+=== Select {es} indices
+
+Once you've connected to your LLM provider, it's time to choose the data you want to search.
+Follow the steps under *Select indices*:
+
+. Select one or more {es} indices under *Add index*.
+. Click *Start* to launch the chat interface.
++
+[.screenshot]
+image::select-indices.png[width=400]
+
+Learn more about the underlying {es} queries used to search your data in <<playground-query>>.
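+
+If you'd like to confirm an index is ready to use before selecting it, you can check it directly with the official {es} Python client.
+The following is a minimal sketch; the endpoint, API key, and `books` index are assumptions based on the ingest example above:
+
+[source,python]
+----
+from elasticsearch import Elasticsearch
+
+# Assumed endpoint and placeholder credentials -- adjust for your deployment.
+client = Elasticsearch("https://localhost:9200", api_key="YOUR_API_KEY")
+
+# Confirm the index exists and has documents before selecting it in Playground.
+if client.indices.exists(index="books"):
+    count = client.count(index="books")["count"]
+    print(f"'books' contains {count} documents")
+else:
+    print("'books' not found -- ingest some data first")
+----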
+
+[float]
+[[playground-getting-started-setup-chat]]
+=== Set up the chat interface
+
+You can start chatting with your data immediately, but you might want to tweak some defaults first.
+
+[.screenshot]
+image::chat-interface.png[]
+
+You can adjust the following under *Model settings*:
+
+* *Model*. The model used for generating responses.
+* *Instructions*. Also known as the _system prompt_, these initial instructions and guidelines define the behavior of the model throughout the conversation. Be *clear and specific* for best results.
+* *Include citations*. A toggle to include citations from the relevant {es} documents in responses.
+
+{x} also uses another LLM under the hood to encode all previous questions and responses, and make them available to the main model.
+This ensures the model has "conversational memory".
+
+Under *Indices*, you can edit which {es} indices will be searched.
+This affects the underlying {es} query.
+
+[TIP]
+====
+Click *✨ Regenerate* to resend the last query to the model for a fresh response.
+
+Click *⟳ Clear chat* to clear chat history and start a new conversation.
+====
+
+[float]
+[[playground-next-steps]]
+=== Next steps
+
+Once you've got {x} up and running and have tested the chat interface, you might want to explore some more advanced topics:
+
+* <<playground-context>>
+* <<playground-query>>
+* <<playground-troubleshooting>>
+
+include::playground-context.asciidoc[]
+include::playground-query.asciidoc[]
+include::playground-troubleshooting.asciidoc[]
+
diff --git a/docs/playground/playground-context.asciidoc b/docs/playground/playground-context.asciidoc
new file mode 100644
index 0000000000000..c0c4533fcb1a0
--- /dev/null
+++ b/docs/playground/playground-context.asciidoc
@@ -0,0 +1,68 @@
+[role="xpack"]
+[[playground-context]]
+== Optimize model context
+
+preview::[]
+
+// Variable (attribute) definition
+:x: Playground
+
+Context is the information you provide to the LLM to improve the relevance of its results.
+Without additional context, an LLM generates results based solely on its training data.
+In {x}, this additional context is the information contained in your {es} indices.
+
+There are a few ways to optimize this context for better results.
+Some adjustments can be made directly in the {x} UI.
+Others require refining your indexing strategy, and potentially reindexing your data.
+
+[float]
+[[playground-context-ui]]
+== Edit context in the UI
+
+Use the *Edit context* button in the {x} UI to adjust the number of documents and fields sent to the LLM.
+
+If you're hitting context length limits, try the following:
+
+* Limit the number of documents retrieved
+* Pick a field with fewer tokens, to reduce the context length
+
+[float]
+[[playground-context-index]]
+== Other context optimizations
+
+This section covers additional context optimizations that you can't make directly in the UI.
+
+[float]
+[[playground-context-index-chunking]]
+=== Chunking large documents
+
+If you're working with large fields, you may need to adjust your indexing strategy.
+Consider breaking your documents into smaller chunks, such as sentences or paragraphs.
+
+If you don't yet have a chunking strategy, start by chunking your documents into passages.
+
+Otherwise, consider updating your chunking strategy, for example, from sentence-based to paragraph-based chunking.
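+
+For a rough illustration, here's a minimal sketch of paragraph-based chunking using the official {es} Python client.
+The endpoint, credentials, index names, `body` field, and blank-line splitting rule are all assumptions for the example:
+
+[source,python]
+----
+from elasticsearch import Elasticsearch, helpers
+
+# Assumed endpoint and placeholder credentials -- adjust for your deployment.
+client = Elasticsearch("https://localhost:9200", api_key="YOUR_API_KEY")
+
+def paragraph_chunks(text):
+    # Naive paragraph splitter: blank lines delimit chunks.
+    return [p.strip() for p in text.split("\n\n") if p.strip()]
+
+def chunk_actions(source_index, target_index):
+    # Reindex each document as one chunk per paragraph,
+    # keeping a reference back to the parent document.
+    for hit in helpers.scan(client, index=source_index):
+        for i, chunk in enumerate(paragraph_chunks(hit["_source"]["body"])):
+            yield {
+                "_index": target_index,
+                "_source": {"parent_id": hit["_id"], "chunk": i, "text": chunk},
+            }
+
+helpers.bulk(client, chunk_actions("my-docs", "my-docs-chunked"))
+----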
+
+Refer to the following Python notebooks for examples of how to chunk your documents:
+
+* https://github.com/elastic/elasticsearch-labs/tree/main/notebooks/ingestion-and-chunking/json-chunking-ingest.ipynb[JSON documents]
+* https://github.com/elastic/elasticsearch-labs/tree/main/notebooks/ingestion-and-chunking/pdf-chunking-ingest.ipynb[PDF documents]
+* https://github.com/elastic/elasticsearch-labs/tree/main/notebooks/ingestion-and-chunking/website-chunking-ingest.ipynb[Website content]
+
+[float]
+[[playground-context-balance]]
+=== Balancing cost and latency
+
+Here are some general recommendations for balancing cost and latency with different context sizes:
+
+Optimize context length::
+Determine the optimal context length through empirical testing.
+Start with a baseline and adjust incrementally to find a balance that optimizes both response quality and system performance.
+Implement token pruning for the ELSER model::
+If you're using our ELSER model, consider implementing token pruning to reduce the number of tokens sent to the model.
+Refer to these relevant blog posts:
++
+* https://www.elastic.co/search-labs/blog/introducing-elser-v2-part-2[Optimizing retrieval with ELSER v2]
+* https://www.elastic.co/search-labs/blog/text-expansion-pruning[Improving text expansion performance using token pruning]
+Monitor and adjust::
+Continuously monitor the effects of context size changes on performance and adjust as necessary.
diff --git a/docs/playground/playground-query.asciidoc b/docs/playground/playground-query.asciidoc
new file mode 100644
index 0000000000000..11ed2e71b1a2d
--- /dev/null
+++ b/docs/playground/playground-query.asciidoc
@@ -0,0 +1,51 @@
+[role="xpack"]
+[[playground-query]]
+== View and modify queries
+
+:x: Playground
+
+preview::[]
+
+Once you've set up your chat interface, you can start chatting with the model.
+{x} will automatically generate {es} queries based on your questions and retrieve the most relevant documents from your {es} indices.
+The {x} UI enables you to view and modify these queries:
+
+* Click *View query* to open the visual query editor.
+* Modify the query by selecting which fields to query for each index.
+* Click *Save changes*.
+
+[TIP]
+====
+The `{query}` variable represents the user's question, rewritten as an {es} query.
+====
+
+The following screenshot shows the query editor in the {x} UI.
+In this simple example, the `books` index has two fields: `author` and `name`.
+Selecting a field adds it to the `fields` array in the query.
+
+[.screenshot]
+image::images/edit-query.png[View and modify queries]
+
+[float]
+[[playground-query-relevance]]
+=== Improving relevance
+
+The fields you select in the query editor determine the relevance of the retrieved documents.
+
+Remember that the next step in the workflow is to send the retrieved documents to the LLM to answer the question.
+Context length is an important factor in ensuring the model has enough information to generate a relevant answer.
+Refer to <<playground-context>> for more information.
+
+<<playground-troubleshooting>> provides tips on how to diagnose and fix relevance issues.
+
+[NOTE]
+====
+{x} uses the {ref}/retriever.html[`retriever`] syntax for {es} queries.
+Retrievers make it easier to compose and test different retrieval strategies in your search pipelines.
+====
+// TODO: uncomment and add to note once following page is live
+//Refer to {ref}/retrievers-overview.html[documentation] for a high level overview of retrievers.
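+
+For illustration, here's a hypothetical sketch of the kind of retriever query {x} generates for the `books` example, run with the official {es} Python client against an 8.14+ deployment.
+The endpoint, credentials, question, and exact query shape are assumptions:
+
+[source,python]
+----
+from elasticsearch import Elasticsearch
+
+# Assumed endpoint and placeholder credentials -- adjust for your deployment.
+client = Elasticsearch("https://localhost:9200", api_key="YOUR_API_KEY")
+
+# In Playground, the {query} variable is replaced by the user's question.
+user_question = "Which books are about dystopian societies?"
+
+# A standard retriever wrapping a lexical multi_match query over the
+# fields selected in the query editor (here, `name` and `author`).
+response = client.search(
+    index="books",
+    retriever={
+        "standard": {
+            "query": {
+                "multi_match": {
+                    "query": user_question,
+                    "fields": ["name", "author"],
+                }
+            }
+        }
+    },
+    size=3,
+)
+
+for hit in response["hits"]["hits"]:
+    print(hit["_source"]["name"])
+----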
\ No newline at end of file
diff --git a/docs/playground/playground-troubleshooting.asciidoc b/docs/playground/playground-troubleshooting.asciidoc
new file mode 100644
index 0000000000000..8fece498b12d5
--- /dev/null
+++ b/docs/playground/playground-troubleshooting.asciidoc
@@ -0,0 +1,26 @@
+[role="xpack"]
+[[playground-troubleshooting]]
+== Troubleshooting
+
+preview::[]
+
+:x: Playground
+
+Dense vectors are not searchable::
+Embeddings must be generated using the {ref}/inference-processor.html[inference processor] with an ML node.
+
+Context length error::
+You'll need to adjust the size of the context you're sending to the model.
+Refer to <<playground-context>>.
+
+LLM credentials not working::
+Under *Model settings*, use the wrench button (🔧) to edit your GenAI connector settings.
+
+Poor answer quality::
+Check the retrieved documents to see if they are valid.
+Adjust your {es} queries to improve the relevance of the documents retrieved. Refer to <<playground-query>>.
++
+You can update the initial instructions to be more detailed. This is called 'prompt engineering'. Refer to this https://platform.openai.com/docs/guides/prompt-engineering[OpenAI guide] for more information.
++
+You might need to click *⟳ Clear chat* to clear chat history and start a new conversation.
+If you mix topics, the model will find it harder to generate relevant responses.
\ No newline at end of file
diff --git a/docs/redirects.asciidoc b/docs/redirects.asciidoc
index 007a9d9f48cfc..be017fbd1c94e 100644
--- a/docs/redirects.asciidoc
+++ b/docs/redirects.asciidoc
@@ -432,9 +432,4 @@ This connector was renamed. Refer to <>.
 == APIs
 
 For the most up-to-date API details, refer to the
-{kib-repo}/tree/{branch}/x-pack/plugins/alerting/docs/openapi[alerting], {kib-repo}/tree/{branch}/x-pack/plugins/cases/docs/openapi[cases], {kib-repo}/tree/{branch}/x-pack/plugins/actions/docs/openapi[connectors], and {kib-repo}/tree/{branch}/x-pack/plugins/ml/common/openapi[machine learning] open API specifications.
-
-[role="exclude",id="playground"]
-== Playground
-
-Coming in 8.14.0.
\ No newline at end of file
+{kib-repo}/tree/{branch}/x-pack/plugins/alerting/docs/openapi[alerting], {kib-repo}/tree/{branch}/x-pack/plugins/cases/docs/openapi[cases], {kib-repo}/tree/{branch}/x-pack/plugins/actions/docs/openapi[connectors], and {kib-repo}/tree/{branch}/x-pack/plugins/ml/common/openapi[machine learning] open API specifications.
\ No newline at end of file
diff --git a/docs/user/index.asciidoc b/docs/user/index.asciidoc
index bf21f7b262924..419574804312c 100644
--- a/docs/user/index.asciidoc
+++ b/docs/user/index.asciidoc
@@ -28,6 +28,8 @@ include::alerting/index.asciidoc[]
 
 include::{kibana-root}/docs/observability/index.asciidoc[]
 
+include::{kibana-root}/docs/playground/index.asciidoc[]
+
 include::{kibana-root}/docs/apm/index.asciidoc[]
 
 include::{kibana-root}/docs/siem/index.asciidoc[]