Upgrade to Spark 2 on Altiscale #43

Closed
lintool opened this issue Oct 1, 2017 · 2 comments

Comments

@lintool
Member

lintool commented Oct 1, 2017

Let's upgrade to Spark 2 on Altiscale, following documentation here.

@ruebot
Member

ruebot commented Oct 1, 2017

Once #32 is merged, I'll start working on this.

@ruebot
Member

ruebot commented Oct 2, 2017

$ /opt/spark-beta/bin/alti-spark-shell --jars /mnt/ephemeral0/aut/aut-0.9.1-SNAPSHOT-fatjar.jar --conf spark.local.dir=/mnt/ephemeral0/aut/tmp --executor-cores 20 --executor-memory 10240M
/tmp/ruebot-hive-1.2.1-lib.zip: OK
ok - no need to re-generate the same /tmp/ruebot-hive-1.2.1-lib.zip, continuing
mkdir: `/user/ruebot/apps': File exists
put: `/user/ruebot/apps/hive-1.2.1-lib.zip': File exists
/opt/alti-spark-2.1.1 /opt
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
2017-10-02 17:53:21,445 WARN  org.apache.spark.SparkConf (Logging.scala:logWarning(66)) - In Spark 1.0 and later spark.local.dir will be overridden by the value set by the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone and LOCAL_DIRS in YARN).
2017-10-02 17:53:27,461 WARN  org.apache.spark.deploy.yarn.Client (Logging.scala:logWarning(66)) - Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
Hive history file=/tmp/ruebot/hive_job_log_760e6f67-a72c-4c33-ab6e-4a81b5d34c06_1831998227.txt
Spark context Web UI available at http://10.252.18.87:45100
Spark context available as 'sc' (master = yarn, app id = application_1506640654827_0336).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.1.1
      /_/
         
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_131)
Type in expressions to have them evaluated.
Type :help for more information.

scala> :paste
// Entering paste mode (ctrl-D to finish)

import io.archivesunleashed.spark.matchbox._ 
import io.archivesunleashed.spark.rdd.RecordRDD._ 

val r = RecordLoader.loadArchives("/shared/au/wahr/WAHR_womens_march", sc) 
.keepValidPages()
.map(r => r.getUrl)
.take(10)


// Exiting paste mode, now interpreting.

import io.archivesunleashed.spark.matchbox._
import io.archivesunleashed.spark.rdd.RecordRDD._
r: Array[String] = Array(http://1988-unreal.tumblr.com/post/156248858195/bangmybox-miley-at-the-womensmarch, http://1000visions.tumblr.com/post/92161694912/1000-visions-of-global-change-alex-tuai, http://1045thecat.iheart.com/articles/trending-104650/madonna-defends-fiery-speech-at-womens-15494262/, http://1027jackfm.iheart.com/articles/inauguration-2017-501436/live-stream-womens-march-on-washington-15490495/, http://1043myfm.iheart.com/onair/lisa-foxx-32262/articles/15/501436/breaking-womens-march-organizers-say-crowd-15490683/, http://100percentfedup.com/gruesome-video-muslim-mob-tears-27-year-old-woman-apart-killing-false-accusation-burning-quran/, http://1454days.com/index.php/2017/01/21/d...

Looks like we're good using the current version with Spark 2.1.1. I'll update pom.xml and test things again.
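
For re-testing after the pom.xml change, a quick check that the shell really is on a 2.x release can be sketched as below. This is only an illustration: `isSpark2` is a hypothetical helper, not part of aut or Spark, and it assumes the version string has the usual `major.minor.patch` shape shown in the banner above.

```scala
// Hypothetical helper (not part of aut or Spark): true when a Spark
// version string such as "2.1.1" belongs to the 2.x line.
def isSpark2(version: String): Boolean =
  version.startsWith("2.")

// Sample values only; "2.1.1" is the version from the shell banner above.
println(isSpark2("2.1.1"))
println(isSpark2("1.6.3"))
```

Inside spark-shell the same helper can be applied to the live session with `isSpark2(sc.version)`, since `sc.version` returns the running Spark version as a string.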
