history not working when indexing subdirs #3101

Closed
Z4rc opened this issue Apr 1, 2020 · 22 comments

Comments

@Z4rc

Z4rc commented Apr 1, 2020

Hello,

I am using OpenGrok 1.3.11 with Tomcat 9.0.31.
I have several directories in the source root which are working as individual projects using the indexer -P option.
If I run the indexer the normal way, it indexes all projects and the history works for all of them. But if I run the indexer for a single project, like java -jar opengrok.jar [options] project1, the history stops working for every project except "project1". For the other projects I get the following error when I try to open the history page:
[image: screenshot of the error shown on the history page]
The history data is still available for all projects under [dataRoot]/historycache.
To get the history working for all projects, I currently have to index all projects, even when only a single project has changed.

@vladak vladak added the question label Apr 1, 2020
@vladak
Member

vladak commented Apr 1, 2020

How exactly do you run the indexer?

@vladak vladak added the indexer label Apr 1, 2020
@vladak
Member

vladak commented Apr 1, 2020

For inspiration on how to run a per-project reindex, see https://github.com/oracle/opengrok/wiki/Repository-synchronization#opengrok-sync

@Z4rc
Author

Z4rc commented Apr 1, 2020

For all projects:
java -jar opengrok.jar -P -H -S --depth 2 -T 48 --progress --renamedHistory on -r on --canonicalRoot project1 --canonicalRoot project2 --canonicalRoot project3 -s /grok/source -d /grok/data -c /grok/FOSS/ctags -W /grok/etc/configuration.xml -R /grok/etc/read-only.xml -U http://my.webserver.com:8080/grok -i <ignore_patterns> -A <analyzer settings>

Only for project1:
java -jar opengrok.jar -P -H -S --depth 2 -T 48 --progress --renamedHistory on -r on --canonicalRoot project1 --canonicalRoot project2 --canonicalRoot project3 -s /grok/source -d /grok/data -c /grok/FOSS/ctags -W /grok/etc/configuration.xml -R /grok/etc/read-only.xml -U http://my.webserver.com:8080/grok -i <ignore_patterns> -A <analyzer settings> project1

@vladak
Member

vladak commented Apr 1, 2020

It is definitely undesirable to use -W for a per-project reindex, or -P or -S for that matter. The per-project workflow is that projects are first added (using the opengrok-projadm Python script or via the RESTful API) and then indexed using the configuration retrieved from the web app.
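
A rough sketch of the RESTful half of that workflow, for illustration only. The class and method names are hypothetical, and the two endpoint paths (/api/v1/projects and /api/v1/configuration) are assumptions recalled from the OpenGrok web services documentation, so check them against the documentation for your version. The sketch only builds the requests; it does not send them.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class PerProjectRequests {
    /** Builds a request asking the web app to add a project (endpoint path is an assumption). */
    static HttpRequest addProject(String baseUri, String project) {
        return HttpRequest.newBuilder(URI.create(baseUri + "/api/v1/projects"))
                .header("Content-Type", "text/plain")
                .POST(HttpRequest.BodyPublishers.ofString(project))
                .build();
    }

    /** Builds a request fetching the web app's current configuration (endpoint path is an assumption). */
    static HttpRequest fetchConfiguration(String baseUri) {
        return HttpRequest.newBuilder(URI.create(baseUri + "/api/v1/configuration"))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Web app URI taken from the -U option in the commands quoted in this thread.
        String base = "http://my.webserver.com:8080/grok";
        HttpRequest add = addProject(base, "project1");
        HttpRequest config = fetchConfiguration(base);
        System.out.println(add.method() + " " + add.uri());
        System.out.println(config.method() + " " + config.uri());
    }
}
```

Sending either request would go through java.net.http.HttpClient. The body returned by the configuration request is what a per-project indexer run would then read, presumably through the -R option that appears in the commands above.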

@vladak
Member

vladak commented Apr 1, 2020

I revamped https://github.com/oracle/opengrok/wiki/Per-project-management-and-workflow so that hopefully it reads better and makes it clearer what needs to be done when indexing projects separately.

@vladak
Member

vladak commented Apr 2, 2020

Are there any SEVERE log entries in your Tomcat log? I wonder if the reader instance was null in the SearchHelper instance.

@vladak
Member

vladak commented Apr 2, 2020

I think the result of prepareExec() should be checked (i.e. see if searchHelper.errorMsg is set) here:

searchHelper.prepareExec(project);

to prevent the exception from bubbling up in full to the UI, as demonstrated above.
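
A minimal sketch of the kind of check suggested here. SearchHelperStub is a hypothetical stand-in, not OpenGrok's actual SearchHelper; only the names prepareExec and errorMsg come from this thread, and the error text is made up. The caller inspects errorMsg after prepareExec and hands back a short message instead of letting an exception reach the page.

```java
import java.util.Optional;

final class HistoryPageGuard {
    /** Returns a user-facing error if preparation failed, or empty if it is safe to continue. */
    static Optional<String> check(SearchHelperStub helper, String project) {
        helper.prepareExec(project);
        return Optional.ofNullable(helper.errorMsg);
    }

    public static void main(String[] args) {
        System.out.println(check(new SearchHelperStub(false), "project2").orElse("ok"));
        System.out.println(check(new SearchHelperStub(true), "project1").orElse("ok"));
    }
}

/** Hypothetical stand-in for SearchHelper; it records a failure instead of throwing. */
final class SearchHelperStub {
    String errorMsg; // stays null unless preparation fails
    private final boolean indexReadable;

    SearchHelperStub(boolean indexReadable) {
        this.indexReadable = indexReadable;
    }

    /** Pretends to open the index for a project and sets errorMsg when it cannot. */
    SearchHelperStub prepareExec(String project) {
        if (!indexReadable) {
            errorMsg = "Failed to open index for project " + project;
        }
        return this;
    }
}
```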

@vladak
Member

vladak commented Apr 2, 2020

Of course, the question is how the reader could become null when attempting to run the per-project reindex with the toxic options, and whether we should restrict the options in such a case.

@Z4rc
Author

Z4rc commented Apr 2, 2020

localhost_access_log.2020-04-02.txt:

10.158.19.130 - zarc [02/Apr/2020:14:42:07 +0200] "GET /grok/api/v1/suggest/config HTTP/1.1" 200 376
10.158.19.130 - zarc [02/Apr/2020:14:42:07 +0200] "GET /grok/js/jquery-ui-1.12.1-draggable.min.js HTTP/1.1" 304 -
10.158.19.130 - zarc [02/Apr/2020:14:42:10 +0200] "GET /grok/history/project2/repository1 HTTP/1.1" 500 6896
10.158.19.130 - zarc [02/Apr/2020:14:42:10 +0200] "GET /grok/default/style-1.0.0.min.css HTTP/1.1" 304 -
10.158.19.130 - zarc [02/Apr/2020:14:42:10 +0200] "GET /grok/default/mandoc-1.0.0.min.css HTTP/1.1" 304 -
10.158.19.130 - zarc [02/Apr/2020:14:42:10 +0200] "GET /grok/default/jquery-ui-1.12.1-custom.min.css HTTP/1.1" 304 -
10.158.19.130 - zarc [02/Apr/2020:14:42:10 +0200] "GET /grok/default/jquery-ui-1.12.1-custom.theme.min.css HTTP/1.1" 304 -
10.158.19.130 - zarc [02/Apr/2020:14:42:10 +0200] "GET /grok/default/jquery-ui-1.12.1-custom.structure.min.css HTTP/1.1" 304 -
10.158.19.130 - zarc [02/Apr/2020:14:42:10 +0200] "GET /grok/default/jquery.tooltip.min.css HTTP/1.1" 304 -
10.158.19.130 - zarc [02/Apr/2020:14:42:10 +0200] "GET /grok/default/jquery.tablesorter.min.css HTTP/1.1" 304 -
10.158.19.130 - zarc [02/Apr/2020:14:42:10 +0200] "GET /grok/default/searchable-option-list-2.0.3.min.css HTTP/1.1" 304 -
10.158.19.130 - zarc [02/Apr/2020:14:42:10 +0200] "GET /grok/js/jquery-3.4.1.min.js HTTP/1.1" 304 -
10.158.19.130 - zarc [02/Apr/2020:14:42:10 +0200] "GET /grok/js/jquery-ui-1.12.1-custom.min.js HTTP/1.1" 304 -
10.158.19.130 - zarc [02/Apr/2020:14:42:10 +0200] "GET /grok/js/jquery-tablesorter-2.26.6.min.js HTTP/1.1" 304 -
10.158.19.130 - zarc [02/Apr/2020:14:42:10 +0200] "GET /grok/js/tablesorter-parsers-0.0.2.min.js HTTP/1.1" 304 -
10.158.19.130 - zarc [02/Apr/2020:14:42:10 +0200] "GET /grok/js/utils-0.0.34.min.js HTTP/1.1" 304 -
10.158.19.130 - zarc [02/Apr/2020:14:42:10 +0200] "GET /grok/js/searchable-option-list-2.0.8.min.js HTTP/1.1" 304 -
10.158.19.130 - zarc [02/Apr/2020:14:42:10 +0200] "GET /grok/js/jquery.caret-1.5.2.min.js HTTP/1.1" 304 -
10.158.19.130 - zarc [02/Apr/2020:14:42:10 +0200] "GET /grok/default/print-1.0.0.min.css HTTP/1.1" 304 -

Do you mean this log file?

@Z4rc
Author

Z4rc commented Apr 2, 2020

The history is also not working if I don't use options -P, -S and -W when indexing projects separately.

@vladak
Member

vladak commented Apr 2, 2020

> Do you mean this log file?

No, I mean the Tomcat log that is normally written to the catalina.out file.

@vladak
Member

vladak commented Apr 2, 2020

> The history is also not working if I don't use options -P, -S and -W when indexing projects separately.

I think that's just fallout from the indexes being fubar. The history JSP page needs to look into the index for certain settings.

@Z4rc
Author

Z4rc commented Apr 2, 2020

No new entry is written to catalina.out when I try to open the history page.
Is there an option to increase the Tomcat logging level?

@vladak
Member

vladak commented Apr 2, 2020

I am mostly after the SEVERE entries in that log. The default log level is, I believe, INFO, so SEVERE log entries will certainly appear. Take a look at the older entries in the file.
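
One way to pull just those entries out of a large log, sketched under the assumption that each entry starts with a date field, a time field and then the level, as in the Tomcat log lines quoted in this thread. The class name and the default file name are hypothetical; pass the real log location as the first argument.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

final class SevereLogFilter {
    /** Keeps lines whose third whitespace-separated field is SEVERE; stack trace lines are dropped. */
    static List<String> severeEntries(List<String> lines) {
        return lines.stream()
                .filter(line -> {
                    String[] fields = line.split("\\s+", 4);
                    return fields.length >= 3 && fields[2].equals("SEVERE");
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        // Default path is an assumption; catalina.out usually lives in Tomcat's logs directory.
        Path log = Path.of(args.length > 0 ? args[0] : "catalina.out");
        severeEntries(Files.readAllLines(log)).forEach(System.out::println);
    }
}
```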

@Z4rc
Author

Z4rc commented Apr 2, 2020

This is the last SEVERE entry:

31-Mar-2020 14:38:27.726 SEVERE [ForkJoinPool-276-worker-11] org.opengrok.suggest.Suggester.lambda$getInitRunnable$1 Could not initialize suggester data for project3
	java.io.IOException: The file /grok/data/suggester/project3/full_search_count.db the map is serialized from has unexpected length 0, probably corrupted. Data store size is 292032512
		at net.openhft.chronicle.map.ChronicleMapBuilder.openWithExistingFile(ChronicleMapBuilder.java:1807)
		at net.openhft.chronicle.map.ChronicleMapBuilder.createWithFile(ChronicleMapBuilder.java:1647)
		at net.openhft.chronicle.map.ChronicleMapBuilder.createPersistedTo(ChronicleMapBuilder.java:1564)
		at net.openhft.chronicle.map.ChronicleMapBuilder.createOrRecoverPersistedTo(ChronicleMapBuilder.java:1586)
		at net.openhft.chronicle.map.ChronicleMapBuilder.createOrRecoverPersistedTo(ChronicleMapBuilder.java:1575)
		at net.openhft.chronicle.map.ChronicleMapBuilder.createOrRecoverPersistedTo(ChronicleMapBuilder.java:1569)
		at org.opengrok.suggest.popular.impl.chronicle.ChronicleMapAdapter.resize(ChronicleMapAdapter.java:137)
		at org.opengrok.suggest.SuggesterProjectData.initSearchCountMap(SuggesterProjectData.java:337)
		at org.opengrok.suggest.SuggesterProjectData.init(SuggesterProjectData.java:158)
		at org.opengrok.suggest.Suggester.lambda$getInitRunnable$1(Suggester.java:185)
		at java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(ForkJoinTask.java:1386)
		at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
		at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
		at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
		at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)

All other SEVERE entries are also of the type Could not initialize suggester data for <project>.
And these are the last Warnings:

02-Apr-2020 12:50:31.303 WARNING [http-nio-8080-exec-10] org.opengrok.indexer.framework.PluginFramework.reload Plugin directory not found or not readable: /grok/data/../plugins. All requests allowed.
02-Apr-2020 12:50:31.621 WARNING [ForkJoinPool-55-worker-9] org.opengrok.suggest.SuggesterProjectData.initFields Fields [hist] will be ignored because they were not found in index directory MMapDirectory@/grok/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@7615743c
02-Apr-2020 13:52:47.235 WARNING [main] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The web application [grok] appears to have started a thread named [chronicle-weak-reference-cleaner] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:
 java.lang.Object.wait(Native Method)
 java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
 net.openhft.chronicle.core.util.WeakReferenceCleaner$ReferenceProcessor.run(WeakReferenceCleaner.java:93)
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 java.lang.Thread.run(Thread.java:748)
02-Apr-2020 13:53:19.359 WARNING [main] org.opengrok.indexer.framework.PluginFramework.reload Plugin directory not found or not readable: /grok/data/../plugins. All requests allowed.
02-Apr-2020 13:53:30.073 WARNING [ForkJoinPool-1-worker-9] org.opengrok.suggest.SuggesterProjectData.initFields Fields [hist] will be ignored because they were not found in index directory MMapDirectory@/grok/data/index/project6 lockFactory=org.apache.lucene.store.NativeFSLockFactory@56af6116

@Z4rc Z4rc closed this as completed Apr 2, 2020
@Z4rc Z4rc reopened this Apr 2, 2020
@vladak
Member

vladak commented Apr 2, 2020

These do not seem to be related to this problem. Anyhow, index all projects and then change the per-project indexer options according to the above-mentioned wiki.

@Z4rc
Author

Z4rc commented Apr 21, 2020

I started over and removed the old generated data. I am now using the newest OpenGrok, 1.3.13, with Tomcat 9.0.31, but the history is still not working. This time, however, I get a different error message:

[image: screenshot of the new error message]

The historycache generation definitely worked. For each source file I have a matching .gz file in the historycache directory. But it looks like the webapp is not able to find the historycache file.

When I run strace on the Tomcat process, I do not see any attempt to open a .gz file.
In addition I have the following warning messages in my catalina.out (for all my repositories):

21-Apr-2020 09:47:49.123 WARNING [invalidate-repos-7315] org.opengrok.indexer.history.RepositoryFactory.getRepository GitRepository not working (missing binaries?): /grok/source/tools
21-Apr-2020 09:47:49.164 WARNING [invalidate-repos-7315] org.opengrok.indexer.history.RepositoryFactory.getRepository Failed to determineCurrentVersion for /grok/source/tools: java.io.IOException: fatal: unknown date format iso8601-strict

@vladak
Member

vladak commented Apr 21, 2020

It could be that the repository is marked as invalid and that somehow makes its history inaccessible.

See #2641 for a possible solution to the git failure. You might be running into the same problem as mentioned in #2292 (comment), i.e. the indexer uses a newer git that supports the iso8601-strict format and the web app uses an old git that does not.

https://github.com/oracle/opengrok/wiki/How-to-setup-OpenGrok#requirements actually states the minimum Git version requirement.

@Z4rc
Author

Z4rc commented Apr 21, 2020

Uff, that seems to be the problem. By default git points to an old git installation (version 1.8). I will try indexing again with the Java option -Dorg.opengrok.indexer.history.git=<path to the correct git>.

Does Tomcat need git too? Do I have to modify $PATH for the Tomcat process, or something like that, so that Tomcat uses the new git version?

@tulinkry
Contributor

Yes, it needs it; it checks the latest commits in the repository so you can see them on the main page.
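
Since both the indexer and the web app call git, a quick check of which git a given JVM would pick up can help. The following is a minimal sketch with several assumptions: the class name is hypothetical, falling back to plain "git" on the PATH when the org.opengrok.indexer.history.git property is unset is an assumption, and the minimum of 2.0 is only a placeholder to be replaced with the version stated on the requirements page linked above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GitVersionCheck {
    private static final Pattern VERSION = Pattern.compile("(\\d+)\\.(\\d+)");

    /** Extracts {major, minor} from output such as "git version 1.8.3.1". */
    static int[] parseMajorMinor(String versionOutput) {
        Matcher m = VERSION.matcher(versionOutput);
        if (!m.find()) {
            throw new IllegalArgumentException("No version number in: " + versionOutput);
        }
        return new int[] {Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2))};
    }

    /** True if the reported version is at least minMajor.minMinor. */
    static boolean isAtLeast(String versionOutput, int minMajor, int minMinor) {
        int[] v = parseMajorMinor(versionOutput);
        return v[0] > minMajor || (v[0] == minMajor && v[1] >= minMinor);
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Property name as used in this thread; the "git" fallback is an assumption.
        String git = System.getProperty("org.opengrok.indexer.history.git", "git");
        Process p = new ProcessBuilder(git, "--version").redirectErrorStream(true).start();
        String line;
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            line = r.readLine();
        }
        p.waitFor();
        if (line == null) {
            throw new IOException("No output from " + git);
        }
        // 2.0 is a placeholder minimum, not the documented requirement.
        System.out.println(line + " -> new enough: " + isAtLeast(line, 2, 0));
    }
}
```

To see what Tomcat would see, run it as the same user and with the same environment as the Tomcat process, for example java -Dorg.opengrok.indexer.history.git=<path to the correct git> GitVersionCheck.java on Java 11 or later.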

@Z4rc
Author

Z4rc commented Apr 21, 2020

History is now working with the Java option -Dorg.opengrok.indexer.history.git=<path to the correct git>.
History is also working for all repositories when indexing only a single project. That was the original reason why I raised the issue. Unfortunately I am not sure what exactly was the problem. However, a reinstallation helped. Thank you guys!

@vladak
Member

vladak commented Apr 21, 2020

Good to know it helped.
