java.lang.ArrayIndexOutOfBoundsException: -2 at java.util.ArrayList.elementData(ArrayList.java:418) #280

Open
ghost opened this issue Jan 23, 2017 · 3 comments

ghost commented Jan 23, 2017

Upon their advice, re-posting EsotericSoftware/kryo#490 here.


So I've hit this error, and all my efforts to fix it have been futile. It's odd: the job works, but adding any of several different lines breaks it in exactly the same way.

I have no idea whether this has to do with the cluster setup or with dependencies (tracing the dependencies in Maven shows that Kryo 2.21 is used).

Any ideas? Pointers? Things to try?

Code

    inputPipe
        .read
        .mapTo( ( 'rowkey, 'row ) -> ( 'rowkey, 'text ) ) {
            kvRow: ( ImmutableBytesWritable, Result ) => {
                val (rowkey, row ) = kvRow
                // Next line works
                //val text = Bytes.toString( row.getValue( Bytes.toBytes( 'content.name ), Bytes.toBytes( 'text.name ) ) )

                // Breaks (see error below)
                val text = HBaseHelpers.getCellValueAsString( row, textCol ) // - Doesn't work for some reason

                // Breaks (same error below)
                // val doc = proc.annotate( text )
                // val ser = new DocumentSerializer
                // val out = ser.save( doc )

                ( rowkey, text )
            }
        }
        .write( output )

Build

Relevant bits:

ext { 
    scalaVersion    = '2.11.7'
    scaldingVersion = '0.15.0'
    hadoopVersion   = '2.7.1'
    hbaseVersion    = '1.1.2'
    processorsVersion = '6.0.1'
}

dependencies {

  compile( group: 'org.scala-lang', name: 'scala-library', version: scalaVersion )
  compile( group: 'org.scala-lang', name: 'scala-compiler', version: scalaVersion )

  compile( group: 'com.twitter', name: 'scalding-core_2.11', version: scaldingVersion ) {
    exclude group: 'cascading', module: 'cascading-hadoop'
  }
  compile( group: 'com.twitter', name: 'scalding-json_2.11', version: scaldingVersion )

  provided( group: 'org.apache.hadoop', name: 'hadoop-common', version: hadoopVersion )
  provided( group: 'org.apache.hadoop', name: 'hadoop-mapreduce-client-core', version: hadoopVersion )

  compile( group: 'org.apache.hbase',  name: 'hbase-client', version: hbaseVersion )
  compile( group: 'org.apache.hbase',  name: 'hbase-server', version: hbaseVersion )

  compile 'com.github.adevuyst:SpyGlass:ab48a58'

  compile( group: 'org.clulab',  name: 'processors-corenlp_2.11', version: processorsVersion )
  compile( group: 'org.clulab',  name: 'processors-main_2.11', version: processorsVersion )
  compile( group: 'org.clulab',  name: 'processors-models_2.11', version: processorsVersion )

}

Error

java.lang.Exception: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: Error in configuring object
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
        at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
        ... 10 more
Caused by: cascading.flow.FlowException: internal error during mapper configuration
        at cascading.flow.hadoop.FlowMapper.configure(FlowMapper.java:102)
        ... 15 more    
Caused by: com.esotericsoftware.kryo.KryoException: java.lang.ArrayIndexOutOfBoundsException: -2
Serialization trace:
familyMap (org.apache.hadoop.hbase.client.Scan)
scan (FastNLP)
$outer (FastNLP$$anonfun$3)
        at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:626)
        at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
        at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
        at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
        at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
        at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
        at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
        at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
        at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
        at com.twitter.chill.SomeSerializer.read(SomeSerializer.scala:25)
        at com.twitter.chill.SomeSerializer.read(SomeSerializer.scala:19)
        at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
        at com.twitter.chill.SerDeState.readClassAndObject(SerDeState.java:61)
        at com.twitter.chill.KryoPool.fromBytes(KryoPool.java:94)
        at com.twitter.chill.Externalizer.fromBytes(Externalizer.scala:145)
        at com.twitter.chill.Externalizer.maybeReadJavaKryo(Externalizer.scala:158)
        at com.twitter.chill.Externalizer.readExternal(Externalizer.scala:148)
        at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1849)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1806)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
        at java.util.HashMap.readObject(HashMap.java:1404)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1909)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
        at cascading.flow.hadoop.util.JavaObjectSerializer.deserialize(JavaObjectSerializer.java:101)
        at cascading.flow.hadoop.util.HadoopUtil.deserializeBase64(HadoopUtil.java:312)
        at cascading.flow.hadoop.util.HadoopUtil.deserializeBase64(HadoopUtil.java:293)
        at cascading.flow.hadoop.FlowMapper.configure(FlowMapper.java:81)
        ... 15 more   
Caused by: java.lang.ArrayIndexOutOfBoundsException: -2
        at java.util.ArrayList.elementData(ArrayList.java:418)
        at java.util.ArrayList.get(ArrayList.java:431)
        at com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:42)
        at com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:773)
        at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:727)
        at com.esotericsoftware.kryo.serializers.DefaultSerializers$TreeMapSerializer.create(DefaultSerializers.java:529)
        at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:97)
        at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
        at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
        at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
        ... 69 more
17/01/20 15:36:18 WARN flow.FlowStep: [FastNLP] hadoop job job_local462571121_0001 state at FAILED
17/01/20 15:36:18 WARN flow.FlowStep: [FastNLP] failure info: NA
17/01/20 15:36:18 WARN flow.FlowStep: [FastNLP] task completion events identify failed tasks
17/01/20 15:36:18 WARN flow.FlowStep: [FastNLP] task completion events count: 0
17/01/20 15:36:18 INFO flow.Flow: [FastNLP] stopping all jobs
17/01/20 15:36:18 INFO flow.FlowStep: [FastNLP] stopping: (1/1) output1.txt
17/01/20 15:36:18 INFO flow.Flow: [FastNLP] stopped all jobs
17/01/20 15:36:18 INFO util.Hadoop18TapUtil: deleting temp path output1.txt/_temporary
Exception in thread "main" java.lang.Throwable: If you know what exactly caused this error, please consider contributing to GitHub via following link.
https://github.com/twitter/scalding/wiki/Common-Exceptions-and-possible-reasons#javalangarrayindexoutofboundsexception
        at com.twitter.scalding.Tool$.main(Tool.scala:130)
        at com.twitter.scalding.Tool.main(Tool.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: cascading.flow.FlowException: local step failed
        at cascading.flow.planner.FlowStepJob.blockOnJob(FlowStepJob.java:230)
        at cascading.flow.planner.FlowStepJob.start(FlowStepJob.java:150)
        at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:124)
        at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:43)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: com.esotericsoftware.kryo.KryoException: java.lang.ArrayIndexOutOfBoundsException: -2
Serialization trace:
familyMap (org.apache.hadoop.hbase.client.Scan)
scan (FastNLP)
$outer (FastNLP$$anonfun$3)             
        at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:626)
        at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
        at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
        at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
        at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
        at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
        at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
        at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
        at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
        at com.twitter.chill.SomeSerializer.read(SomeSerializer.scala:25)
        at com.twitter.chill.SomeSerializer.read(SomeSerializer.scala:19)
        at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
        at com.twitter.chill.SerDeState.readClassAndObject(SerDeState.java:61)
        at com.twitter.chill.KryoPool.fromBytes(KryoPool.java:94)
        at com.twitter.chill.Externalizer.fromBytes(Externalizer.scala:145)
        at com.twitter.chill.Externalizer.maybeReadJavaKryo(Externalizer.scala:158)
        at com.twitter.chill.Externalizer.readExternal(Externalizer.scala:148)
        at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1849)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1806)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
        at java.util.HashMap.readObject(HashMap.java:1404)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1909)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
        at cascading.flow.hadoop.util.JavaObjectSerializer.deserialize(JavaObjectSerializer.java:101)
        at cascading.flow.hadoop.util.HadoopUtil.deserializeBase64(HadoopUtil.java:312)
        at cascading.flow.hadoop.util.HadoopUtil.deserializeBase64(HadoopUtil.java:293)
        at cascading.flow.hadoop.FlowMapper.configure(FlowMapper.java:81)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
        at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        ... 4 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: -2
        at java.util.ArrayList.elementData(ArrayList.java:418)
        at java.util.ArrayList.get(ArrayList.java:431)
        at com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:42)
        at com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:773)
        at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:727)
        at com.esotericsoftware.kryo.serializers.DefaultSerializers$TreeMapSerializer.create(DefaultSerializers.java:529)
        at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:97)
        at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
        at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
        at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
        ... 69 more     
@johnynek
Collaborator

This looks like an error with a closure. What is going on inside HBaseHelpers?

See Externalizer in the stack? That is used to serialize closures: it tries Java serialization first, then falls back to Kryo. I'm not sure why you get this error, but understanding more about what is getting pulled into the closure can help.
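To illustrate what "pulled into the closure" can mean, here is a minimal sketch using plain Java serialization only. The serialization trace in the report lists `$outer (FastNLP$$anonfun$3)` and `scan (FastNLP)`, which suggests the closure holds a reference to the enclosing job and, through it, the `Scan`. The names `NotSer`, `DemoJob`, and `CaptureDemo` are hypothetical stand-ins, not part of Scalding, chill, or the reporter's code, and this does not reproduce the Kryo failure itself.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical non-serializable type, standing in for a field such as an HBase Scan.
class NotSer

// Hypothetical job class; it is Serializable, but one of its fields is not.
class DemoJob extends Serializable {
  val scan = new NotSer
  val textCol = "text"

  // Reads the field `textCol`, so the function captures `this` (the whole job).
  def capturing: String => String = s => s + ":" + textCol

  // Copies the field into a local val first, so only the String is captured.
  def local: String => String = {
    val col = textCol
    s => s + ":" + col
  }
}

object CaptureDemo {
  // Returns true if `a` can be written with plain Java serialization.
  def javaSerializable(a: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try { out.writeObject(a); true }
    catch { case _: NotSerializableException => false }
    finally out.close()
  }

  def main(args: Array[String]): Unit = {
    val job = new DemoJob
    println(javaSerializable(job.capturing))
    println(javaSerializable(job.local))
  }
}
```

If a helper call inside `mapTo` refers to a member of the job (for example a column definition), copying that member into a local `val` before the closure is one way to check whether the enclosing instance is what gets serialized.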

Generally seeing all the code is needed here.

A common strategy to sidestep these issues is to make things used in closures lazy where they are defined (not inside the closure).
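One possible reading of this advice, as a sketch: define the expensive value as a `lazy val` on a holder object, so the closure only refers to the holder and the value is built on first use wherever the closure runs. `Heavy`, `Resources`, and `LazyDemo` are hypothetical names for illustration; `Heavy` stands in for something like the NLP processor in the report.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

// Hypothetical stand-in for an expensive, non-serializable resource.
class Heavy {
  def annotate(s: String): String = s.toUpperCase
}

// The lazy val is defined here, outside any closure.
object Resources {
  var initCount = 0 // counts how often the resource is built, for demonstration only
  lazy val proc: Heavy = { initCount += 1; new Heavy }
}

object LazyDemo {
  // Refers to Resources statically, so no Heavy instance is captured.
  val annotate: String => String = s => Resources.proc.annotate(s)

  // Writes `a` with plain Java serialization and returns the byte count.
  def serializedSize(a: AnyRef): Int = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(a)
    out.close()
    bytes.size
  }

  def main(args: Array[String]): Unit = {
    println(serializedSize(annotate)) // the function serializes without touching Heavy
    println(Resources.initCount)      // the resource has not been built yet
    println(annotate("hello"))        // first call builds it
    println(Resources.initCount)
  }
}
```

A related idiom is to mark such a field `@transient lazy val` on the job class itself, so it is skipped during serialization and rebuilt on first access; whether either variant resolves this particular error is not confirmed in the thread.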

You might also ask on the scalding gitter channel to get more help.

@ghost
Author

ghost commented Jan 23, 2017

@johnynek Thanks for responding.

This is HBaseHelpers (it is in a file on the classpath):

package object HBaseHelpers {
    def getCellValueAsString( row: Result, col: HBaseCol ): String =
        Bytes.toString( row.getValue( Bytes.toBytes( col.family.name ), Bytes.toBytes( col.name.name ) ) )
}

Does this help with your hypothesis in any way?

@johnynek
Collaborator

johnynek commented Jan 23, 2017 via email
