Building anserini on MacOS #2445

Closed
Nebu0528 opened this issue Apr 6, 2024 · 21 comments

@Nebu0528

Nebu0528 commented Apr 6, 2024

image
I am building Anserini using JDK 21 and Maven 3.9.6, as mentioned in the updated README. However, the build fails when trying to connect to the link shown in the screenshot.

@lintool
Member

lintool commented Apr 6, 2024

Seems like a transient error... try again?

@Nebu0528
Author

Nebu0528 commented Apr 6, 2024

The issue still persists, now with an additional runtime error. Is there something you'd recommend? I can provide an error log as well if you'd like.

image

@lintool
Member

lintool commented Apr 6, 2024

Hrm... seems to work for me... can you provide a more detailed error log?

@Nebu0528
Author

Nebu0528 commented Apr 6, 2024

[ERROR] Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.013 s <<< FAILURE! - in io.anserini.index.PrebuiltIndexTest
[ERROR] io.anserini.index.PrebuiltIndexTest.testUrls Time elapsed: 0.013 s <<< ERROR!
java.lang.RuntimeException: Error connecting to https://github.com/castorini/anserini-data/raw/master/CACM/lucene-index.cacm.20221005.252b5e.tar.gz
at io.anserini.index.PrebuiltIndexTest.testUrls(PrebuiltIndexTest.java:52)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:316)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:240)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:214)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:155)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:385)
at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:162)
at org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:507)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:495)
Caused by: java.net.UnknownHostException: example.com
at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:567)
at java.base/java.net.Socket.connect(Socket.java:751)
at java.base/java.net.Socket.connect(Socket.java:686)
at java.base/sun.net.NetworkClient.doConnect(NetworkClient.java:183)
at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:531)
at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:636)
at java.base/sun.net.www.http.HttpClient.<init>(HttpClient.java:280)
at java.base/sun.net.www.http.HttpClient.New(HttpClient.java:386)
at java.base/sun.net.www.http.HttpClient.New(HttpClient.java:408)
at java.base/sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1304)
at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1237)
at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1123)
at java.base/sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:1052)
at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1675)
at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1599)
at java.base/java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:531)
at io.anserini.index.PrebuiltIndexTest.testUrls(PrebuiltIndexTest.java:50)
... 26 more

@lintool
Member

lintool commented Apr 6, 2024

This works fine for me... the URL does appear to be valid.

wget  https://github.com/castorini/anserini-data/raw/master/CACM/lucene-index.cacm.20221005.252b5e.tar.gz

Try that on your end and see what you get?

@Nebu0528
Author

Nebu0528 commented Apr 6, 2024

It works. I tried running the mvn command after that, but it gives the same error as in the screenshot. I found some resources on the RuntimeException, so I will try to fix that.

@lintool
Member

lintool commented Apr 6, 2024

Try removing ~/.cache/anserini and ~/.cache/pyserini. If that still doesn't work, try a fresh clone.

@Nebu0528
Author

Nebu0528 commented Apr 7, 2024

Update: My friend below mentioned a fix

I get the same issue after doing both (removing the cache directories and trying a fresh clone).

Attached are the text files of the error report: io.anserini.index.PrebuiltIndexTest.txt
io.anserini.index.PrebuiltIndexTest copy.txt

@Gelardinio
Contributor

Pretty sure line 48 of PrebuiltIndexTest.java needs to be changed from

final URL requestUrl = new URI("http://example.com").toURL();

to

final URL requestUrl = new URI(url).toURL();

Otherwise the build fails, since the URL being tested is never used and http://example.com/ is an invalid URL.

@lintool
Member

lintool commented Apr 7, 2024

Pretty sure http://example.com/ is a valid URL:

$ curl -I https://example.com/
HTTP/2 200 
content-encoding: gzip
accept-ranges: bytes
age: 7987
cache-control: max-age=604800
content-type: text/html; charset=UTF-8
date: Sun, 07 Apr 2024 12:14:26 GMT
etag: "3147526947"
expires: Sun, 14 Apr 2024 12:14:26 GMT
last-modified: Thu, 17 Oct 2019 07:18:26 GMT
server: ECS (dce/26D5)
x-cache: HIT
content-length: 648

... but, try changing it to something like https://uwaterloo.ca/ and see what happens?

@Gelardinio
Contributor

Yeah, it does look valid; I thought it wasn't. Also, I'm not sure why the test would request example.com for every URL under IndexInfo, since none of the URLs in IndexInfo is actually checked by this test:

  // test url validity
  @Test
  public void testUrls() {
    for (IndexInfo info : IndexInfo.values()) {
      for (String url : info.urls) {
        // check each url status code is 200
        try {
          final URL requestUrl = new URI("http://example.com").toURL();
          final HttpURLConnection con = (HttpURLConnection) requestUrl.openConnection();
          assertEquals(200, con.getResponseCode());
        } catch (IOException e) {
          throw new RuntimeException("Error connecting to " + url, e);
        } catch (Exception e) {
          throw new RuntimeException("Malformed URL: " + url, e);
        }
      }
    }
  }
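
A minimal sketch of how the loop could look once each URL is actually requested, rather than the fixed placeholder. This is an illustration only and not necessarily what any merged fix does: the class `UrlCheckSketch`, the `StatusFetcher` hook, and the plain list of strings standing in for `IndexInfo` are all assumptions made so the example is self-contained and can be exercised without network access.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

class UrlCheckSketch {
  /** Hypothetical hook: returns the HTTP status for a URL, so tests can stub out the network. */
  interface StatusFetcher {
    int status(URL url) throws IOException;
  }

  /** Real fetcher: opens a connection and returns the server's status code. */
  static int httpStatus(URL url) throws IOException {
    HttpURLConnection con = (HttpURLConnection) url.openConnection();
    try {
      return con.getResponseCode();
    } finally {
      con.disconnect();
    }
  }

  /** Returns one message per URL that did not answer 200; an empty list means all passed. */
  static List<String> failures(List<String> urls, StatusFetcher fetcher) {
    List<String> problems = new ArrayList<>();
    for (String url : urls) {
      try {
        // Build the request from the loop variable, not from a fixed placeholder address.
        URL requestUrl = new URI(url).toURL();
        int code = fetcher.status(requestUrl);
        if (code != 200) {
          problems.add(url + " returned " + code);
        }
      } catch (URISyntaxException | IllegalArgumentException | MalformedURLException e) {
        // The string could not be turned into a URL at all.
        problems.add("Malformed URL: " + url);
      } catch (IOException e) {
        // The URL was well-formed, but the request itself failed.
        problems.add("Error connecting to " + url + ": " + e);
      }
    }
    return problems;
  }
}
```

For real requests, the fetcher would be `UrlCheckSketch::httpStatus`; a test could then assert that `failures(...)` returns an empty list. Collecting all failures instead of throwing on the first one is a design choice of this sketch, so that one bad URL does not hide the others.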

@lintool
Member

lintool commented Apr 7, 2024

Yea, good point... maybe that was supposed to be a placeholder that didn't get fixed. It would make sense to check all the URLs, as intended. @Gelardinio would you be willing to send a PR to fix?

Always good to fix janky code, but this still doesn't address the original question: why are the tests failing?

@Gelardinio
Contributor

Sounds good. I don't think I have the permissions to make a PR yet, since I just started onboarding yesterday. To answer the original question: a rate limit is probably being reached with http://example.com, because we're sending a request to it for every URL in IndexInfo.
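
One way to check the rate-limit guess against the posted log: the trace ends in `Caused by: java.net.UnknownHostException: example.com`, which Java raises when a hostname cannot be resolved, whereas a server-side rate limit would more typically show up as an HTTP status such as 429. The sketch below, with hypothetical class and method names, separates the two kinds of failure so a test message can say which one occurred. It is a diagnostic aid under those assumptions, not a statement of what actually happened on the reporter's machine.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.SocketTimeoutException;
import java.net.UnknownHostException;

class ConnectFailureSketch {
  /** Maps a connection-level exception to a short description of the likely cause. */
  static String describe(IOException e) {
    if (e instanceof UnknownHostException) {
      // Name resolution failed; no HTTP request was ever sent.
      return "DNS: could not resolve host " + e.getMessage();
    }
    if (e instanceof SocketTimeoutException) {
      return "Timeout while connecting or reading";
    }
    if (e instanceof ConnectException) {
      return "Connection refused or unreachable";
    }
    return "Other I/O error: " + e;
  }

  /** Interprets an HTTP status code that the server did return. */
  static String describeStatus(int code) {
    if (code == 200) {
      return "OK";
    }
    if (code == 429) {
      // 429 Too Many Requests is the usual signal for rate limiting.
      return "HTTP 429: server is rate limiting";
    }
    return "HTTP " + code;
  }
}
```

Calling `describe(e)` inside the `catch (IOException e)` branch of a URL check would make the failure message distinguish a local DNS or network problem from a response sent by the remote server.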

@lintool
Member

lintool commented Apr 7, 2024

re: permissions - clone the repo and send PR that way?

@Gelardinio
Contributor

Didn't seem to work when I tried. Will try again once I'm back later today.

@Gelardinio
Contributor

Looks like I don't have permissions to make a pr with my branch I think.

@lintool
Member

lintool commented Apr 9, 2024

Looks like I don't have permissions to make a pr with my branch I think.

Clone the repo, then branch off your clone, then send PR?

@Gelardinio
Contributor

I believe that's what I'm doing:

Screenshot 2024-04-08 at 9 05 03 PM

@lintool
Member

lintool commented Apr 9, 2024

No, you're trying to branch off the main anserini repo.

You need to fork first: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/fork-a-repo

@Gelardinio
Contributor

My bad, this should work: #2449

@lintool
Member

lintool commented Apr 9, 2024

hi @Gelardinio thanks for your PR. merged.

@lintool closed this as completed Apr 9, 2024