Rebuilding search index takes a long time #3698

Closed
chrislam opened this issue Jan 23, 2019 · 18 comments
Labels
enhancement improvements to existing features

Comments

@chrislam

Description

Rebuilding the search index on my project is taking a very long time. At this point it seems to be about 10% done after about 3 hours.

It also seems that I need to keep the Search Indexes page open to continue rebuilding the search index and if it closes or my session ends I have to restart from the beginning.

It would be good if the rebuilding process could be more robust, not requiring the user to keep a window open and session active.
It could be a good idea to make this operation run as a job or multiple jobs working on batches of content.
Another idea could be to add a console command to rebuild the search index.

Steps to reproduce

  1. Click the Update search indexes button on the Search Indexes Utility screen.

Additional info

  • Craft version: Craft Pro 3.1.3
  • PHP version: 7.2.5
  • Database driver & version: MySQL 5.6.42
  • Plugins & versions:
    • Contact Form 2.2.3
    • Contact Form Extensions 1.0.13
    • Contact Form Honeypot 1.0.2
    • Cookies 1.1.11
    • Element API 2.5.4
    • Queue Manager 1.1.0
    • Redactor 2.2.1
    • Scout 1.1.1
    • Super Table 2.1.5.1
    • Video Embedder 1.1.1
@brandonkelly
Member

Craft 3.1.2 added an index-assets console command that you can use instead of the web-based utility. (#3595)

Sounds like that will work better for you :)

Run this to learn more about it:

./craft index-assets --help

@chrislam
Author

That command seems to be solely for asset volumes, I am talking about the search index for content.

@brandonkelly brandonkelly reopened this Jan 23, 2019
@brandonkelly
Member

Doh I’m sorry, I misread the request!

@brandonkelly brandonkelly added the enhancement improvements to existing features label Jan 23, 2019
@guyinpv

guyinpv commented Feb 28, 2019

I concur with this one.
I was just playing around and sort of accidentally clicked to rebuild the index for fun, not thinking it would be an all-day event.

Also found out the hard way I have to leave the page open.

I only have about 700 entries but this is going to take forever.

I'm fine with it taking forever, but it would be nice not to have to keep the page open, or at least to have a progress bar or some kind of warning like "leave this page open!" and a notification when it's done, etc.

@echantigny
Contributor

I'm also looking for a better way of doing this. I have more than 5000 Commerce products, 10000 users and a dozen pages. It takes hours for me too, of course. A CLI command to trigger this would be great, with a visual progress indicator if possible.

And if that makes any sense, might be worth deleting/recreating the items one by one instead of clearing the whole table out before starting. That way, any search functionality would still mostly work on the front end if anything needs to happen.

@himynameisphil

+1 here. My searchindex table is sitting at over 3 million rows and it's basically impossible to rebuild as it stands.

@EternalBlack

I pretty much have the same problem. Rebuilding the search indexes takes about 10 minutes. I'm also using a 2FA plugin which kicks me out of the session after 5 minutes, so I'm not able to run it properly just because it requires me to watch it inside the browser... It would be nice to wrap this into a task and make it available to the CLI. I also totally agree with @echantigny: as our front-end search relies on the search capability of Craft itself, it is not feasible to run a re-index at all, as this would result in downtime/the search not working...

@d--j
Contributor

d--j commented Jun 17, 2019

Until Craft adds a console version, just copy&paste this one to a plugin/module of yours.
It loads the elements in batches and does not have the overhead of a request per element like the CP version. On our Craft install this speeds up the search index rebuild process quite nicely – from several hours to 7 minutes for 50.000+ elements to re-index.

<?php
namespace bejond\rebels\console\controllers;

use Craft;
use craft\base\Element;
use craft\base\ElementInterface;
use craft\base\Field;
use craft\db\Query;
use craft\db\Table;
use craft\helpers\Console;
use yii\console\Controller;
use yii\console\ErrorHandler;

/**
 * Rebuild the search index (more efficiently than the CP version)
 *
 * ./craft rebels/search-index/re-index
 *
 * @author    Bejond
 * @package   Rebels
 * @since     1.0.0
 */
class SearchIndexController extends Controller
{

    // Public Methods
    // =========================================================================

    /**
     * Rebuilds search index
     *
     * @return mixed
     * @throws \Throwable
     */
    public function actionReIndex()
    {
        $searchService = Craft::$app->getSearch();

        echo "Dropping search index\n";
        Craft::$app->getDb()->createCommand()
            ->truncateTable(Table::SEARCHINDEX)
            ->execute();

        $elements = (new Query())
            ->select(['id', 'type'])
            ->from([Table::ELEMENTS])
            ->where([
                'dateDeleted' => null,
            ])
            //->andWhere(['not', ['type' => 'barrelstrength\sproutbasereports\elements\Report']]) // is not compatible with CLI
            ->orderBy(['type' => SORT_ASC, 'id' => SORT_ASC])
            ->all();

        printf("Found %d elements to re-index\n", count($elements));

        $batch = [];
        foreach ($elements as $element) {
            if (empty($batch[$element['type']])) {
                $batch[$element['type']] = [];
            }
            $batch[$element['type']][] = $element['id'];
        }
        unset($elements);

        $typeCount = count($batch);
        $typeIndex = 0;

        foreach ($batch as $class => $ids) {
            $idCount = count($ids);
            $typeIndex += 1;
            try {
                /** @var ElementInterface $class */
                if ($class::isLocalized()) {
                    $siteIds = Craft::$app->getSites()->getAllSiteIds();
                } else {
                    $siteIds = [Craft::$app->getSites()->getPrimarySite()->id];
                }

                foreach ($siteIds as $siteId) {
                    printf("[%02d/%02d] Re-index %d %s on site id %s\n", $typeIndex, $typeCount, $idCount, $class, $siteId);
                    $i = 0;
                    Console::startProgress($i, $idCount);
                    try {
                        foreach (array_chunk($ids, 100) as $idChunk) {
                            $query = $class::find()
                                ->id($idChunk)
                                ->siteId($siteId)
                                ->anyStatus();

                            foreach ($query->all() as $element) {
                                /** @var Element $element */
                                $searchService->indexElementAttributes($element);

                                if ($class::hasContent() && ($fieldLayout = $element->getFieldLayout()) !== null) {
                                    $keywords = [];

                                    foreach ($fieldLayout->getFields() as $field) {
                                        /** @var Field $field */
                                        if ($field->searchable) {
                                            // Set the keywords for the content's site
                                            $fieldValue = $element->getFieldValue($field->handle);
                                            $fieldSearchKeywords = $field->getSearchKeywords($fieldValue, $element);
                                            $keywords[$field->id] = $fieldSearchKeywords;
                                        }
                                    }

                                    $searchService->indexElementFields($element->id, $siteId, $keywords);
                                }
                            }
                            unset($query);
                            $i += count($idChunk);
                            Console::updateProgress($i, $idCount);
                        }
                    } finally {
                        Console::endProgress();
                    }
                }
            } catch (\Throwable $e) {
                Console::stderr("While processing $class we got the following exception: \n" . ErrorHandler::convertExceptionToVerboseString($e) . "\nSkipping further processing of $class\n");
            }
        }

        return 0;
    }

}

@d--j
Contributor

d--j commented Jun 17, 2019

[...] And if that makes any sense, might be worth deleting/recreating the items one by one instead of clearing the whole table out before starting. That way, any search functionality would still mostly work on the front end if anything needs to happen.

@echantigny the CLI commands craft resave/entries, craft resave/users etc. should update the search index without dropping it. Maybe that's a better alternative for you.

@eric-chantigny

@d--j Thanks, but there's no resave commands for Commerce products yet.

I might use your code to get the reindexing going faster. Might also modify it to delete lines one by one based on the element that is reindexing at that point. Will keep most of the search intact while it goes through it.
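
A rough sketch of what that per-element variant could look like, reusing the calls from the controller above. The `modules\helpers` namespace and the `reindexElement()` function name are hypothetical, and the `elementId`/`siteId` column names are assumed from the stock searchindex table; treat it as an untested outline rather than a drop-in replacement.

```php
<?php
namespace modules\helpers; // hypothetical namespace, adjust to your plugin/module

use Craft;
use craft\base\ElementInterface;
use craft\db\Table;

/**
 * Hypothetical helper: refreshes the search index rows of one element on its
 * own site, leaving the rest of the searchindex table untouched.
 */
function reindexElement(ElementInterface $element): void
{
    // Delete only this element's rows instead of truncating the whole table
    Craft::$app->getDb()->createCommand()
        ->delete(Table::SEARCHINDEX, [
            'elementId' => $element->id,
            'siteId' => $element->siteId,
        ])
        ->execute();

    $searchService = Craft::$app->getSearch();

    // Re-index the element's searchable attributes
    $searchService->indexElementAttributes($element);

    // Re-index the searchable custom fields, same logic as the controller above
    if ($element::hasContent() && ($fieldLayout = $element->getFieldLayout()) !== null) {
        $keywords = [];
        foreach ($fieldLayout->getFields() as $field) {
            if ($field->searchable) {
                $fieldValue = $element->getFieldValue($field->handle);
                $keywords[$field->id] = $field->getSearchKeywords($fieldValue, $element);
            }
        }
        $searchService->indexElementFields($element->id, $element->siteId, $keywords);
    }
}
```

In the controller above, this would replace the body of the inner `foreach ($query->all() as $element)` loop, and the initial `truncateTable()` call would be dropped. Rows belonging to elements that no longer exist would then stay in the table until removed separately.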

gtettelaar added a commit to gtettelaar/cms that referenced this issue Jun 17, 2019
brandonkelly added a commit that referenced this issue Jun 18, 2019
@brandonkelly
Member

brandonkelly commented Jun 18, 2019

I’ve decided to just remove the Search Indexes utility entirely in Craft 3.2, for a couple reasons:

  • Mainly because 99% of the time it’s used as a troubleshooting measure when an element query search param isn’t yielding the expected results, however 0% of those times will it actually help. (Craft doesn’t randomly forget search results.)

  • When there is a legitimate reason to rebuild indexes, re-saving elements will take just about the same amount of time, and thanks to the resave/* commands introduced in Craft 3.1.15, there’s now an easy way to trigger that, and it can even be granular (just re-save entries in the “News” section with an “Article” entry type, etc.). And as of the just-released Craft 3.2 Beta 3, non-core element types can now register their own resave/* commands as well.

@BenParizek
Contributor

BenParizek commented Jun 26, 2019

@brandonkelly What's the recommended way to address custom elements that don't show up in the search indexes for some reason? Should we now be adding our own resave commands too?

@brandonkelly
Member

@BenParizek Yes, which is possible in Craft 3.2 – see the new EVENT_DEFINE_ACTIONS:

/**
 * @event DefineConsoleActionsEvent The event that is triggered when defining custom actions for this controller.
 *
 * See [[defineActions()]] for details on what to set on `$event->actions`.
 * ---
 * ```php
 * use craft\events\DefineConsoleActionsEvent;
 * use craft\console\Controller;
 * use craft\console\controllers\ResaveController;
 * use yii\base\Event;
 *
 * Event::on(ResaveController::class,
 *     Controller::EVENT_DEFINE_ACTIONS,
 *     function(DefineConsoleActionsEvent $event) {
 *         $event->actions['products'] = [
 *             'options' => ['type'],
 *             'helpSummary' => 'Re-saves products.',
 *             'action' => function($params): int {
 *                 // @var ResaveController $controller
 *                 $controller = Craft::$app->controller;
 *                 $query = Product::find();
 *                 if ($controller->type) {
 *                     $query->type(explode(',', $controller->type));
 *                 }
 *                 return $controller->saveElements($query);
 *             }
 *         ];
 *     }
 * );
 * ```
 */
const EVENT_DEFINE_ACTIONS = 'defineActions';

@bluestormdesign

bluestormdesign commented Aug 7, 2019

@brandonkelly

I like that you are removing it from 3.2, but what about 2? I ask because it's not possible for me to update about 50 version 2 sites to 3.

I have sites running Craft 2 and 3. This is a big problem on my larger sites, because if you need to rebuild the search index it removes all previous entries and makes searching the site impossible. This can go on for hours or days depending on the size of the site. For instance, if you have an ecommerce site, it will become useless until it's finished!

@brandonkelly
Member

@bluestormdesign We are only releasing bug fixes for Craft 2 at this point (see https://craftcms.com/guides/craft-cms-2-and-craft-commerce-1-end-of-life-information). You would have to write your own console command similar to Craft 2’s resave command, if you want similar functionality there.

iparr added a commit to iparr/cms that referenced this issue Aug 22, 2019
This no longer exists, in settings or anywhere (?) according to craftcms#3698
@CreateSean

@d--j

I've run into the same issue after importing entries with Feed Me. The search index update never completes.

How do I use the code you shared above? I put it in a file named search-index-faster.php in the modules folder, but I'm not sure what to do now.

@d--j
Contributor

d--j commented Jan 3, 2020

@CreateSean
I have put the console command in a plugin (in src/console/controllers). If you use a module instead, https://www.yiiframework.com/doc/guide/2.0/en/structure-modules#console-commands-in-modules will probably tell you how to do it.
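
For the module route, a minimal sketch of the registration might look like this. It assumes a module class `modules\Module` registered in config/app.php under a hypothetical ID `my-module`, and that the controller above is saved as modules/console/controllers/SearchIndexController.php with its namespace changed to `modules\console\controllers`. None of these names come from this thread.

```php
<?php
namespace modules; // assumed module namespace

use Craft;

class Module extends \yii\base\Module
{
    public function init()
    {
        // Let the @modules alias point at this directory
        Craft::setAlias('@modules', __DIR__);

        // Console requests look for controllers in modules/console/controllers,
        // web requests in modules/controllers
        if (Craft::$app->getRequest()->getIsConsoleRequest()) {
            $this->controllerNamespace = 'modules\\console\\controllers';
        } else {
            $this->controllerNamespace = 'modules\\controllers';
        }

        parent::init();
    }
}
```

With that in place, the command should be reachable as ./craft my-module/search-index/re-index, where `my-module` is whatever ID the module was registered under.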

@jerome2710

I’ve decided to just remove the Search Indexes utility entirely in Craft 3.2, for a couple reasons:

  • Mainly because 99% of the time it’s used as a troubleshooting measure when an element query search param isn’t yielding the expected results, however 0% of those times will it actually help. (Craft doesn’t randomly forget search results.)
  • When there is a legitimate reason to rebuild indexes, re-saving elements will take just about the same amount of time, and thanks to the resave/* commands introduced in Craft 3.1.15, there’s now an easy way to trigger that, and it can even be granular (just re-save entries in the “News” section with an “Article” entry type, etc.). And as of the just-released Craft 3.2 Beta 3, non-core element types can now register their own resave/* commands as well.

For anyone trying to fix search issues, please note that resave/* contains:

--update-search-index: boolean, 0 or 1 (defaults to 0)
  Whether to update the search indexes for the resaved elements.

So you will need to explicitly specify that the search indexes should be updated when running the command:

./craft resave/entries --section=someSection --update-search-index=1

Took me a while to realize 🤓
