Compile
FCCNhviana edited this page Dec 28, 2015 · 23 revisions
This code was developed and tested in a Linux environment (Red Hat Enterprise Linux 5).
Maven 2.x:
- download from http://maven.apache.org/download.html
- add bin directory to PATH
Apache Ant:
- yum install ant
- (see more at http://ant.apache.org/manual/install.html)
Subversion:
- yum install subversion
- (see more at http://subversion.apache.org/packages.html)
Java SE 6.x (at least):
- export JAVA_HOME=/usr/java/default
- (see more at http://www.oracle.com/technetwork/java/javase/downloads/index.html)
Apache Tomcat (5.5 at least):
- (see http://tomcat.apache.org/download-55.cgi)
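Before starting the build, it can help to confirm that the command-line tools listed above are reachable from the shell. The following is a minimal sketch and not part of the original instructions: the function name `missing_tools` is hypothetical, and it only checks that each command is on PATH, not which version is installed. Tomcat is not checked because it is not a PATH command.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every command that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        # "command -v" is the POSIX way to test whether a command can be found.
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Prints nothing when every tool is available.
missing_tools mvn ant svn git java

# JAVA_HOME is exported by hand in the list above, so check it separately.
[ -n "${JAVA_HOME:-}" ] || echo "JAVA_HOME is not set"
```

Any name printed by the script is a tool that still needs to be installed or added to PATH.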
Checkout Hadoop (branch-0.14):
git clone -b branch-0.14 https://github.com/arquivo/hadoop-common
Install Hadoop:
- cd hadoop-common (the directory created by the git clone above)
- create pom.xml with the following content:
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <name>hadoop</name>
  <url>http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.14</url>
  <groupId>org.apache</groupId>
  <artifactId>hadoop</artifactId>
  <version>0.14.5-dev-core</version>
  <packaging>jar</packaging>
  <build>
    <directory>build/</directory>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-antrun-plugin</artifactId>
        <version>1.8</version>
        <executions>
          <execution>
            <phase>compile</phase>
            <configuration>
              <target>
                <ant target="jar" inheritRefs="true">
                  <property name="build.compiler" value="extJavac"/>
                  <property name="build.sysclasspath" value="last"/>
                </ant>
              </target>
            </configuration>
            <goals>
              <goal>run</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
- mvn install
This version of Hadoop (http://hadoop.apache.org/) must be used for all MapReduce processing.
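After `mvn install` succeeds, Maven copies the artifact into the local repository under a path derived from the groupId, artifactId and version declared in the pom.xml above. The sketch below shows one way to locate it. It assumes the default local repository location (`$HOME/.m2/repository`, which a settings.xml can override), and the function name `m2_artifact_path` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print where "mvn install" places an artifact,
# assuming the default local repository under $HOME/.m2/repository.
m2_artifact_path() {
    group_id=$1
    artifact_id=$2
    version=$3
    packaging=$4
    # Maven lays artifacts out as
    # <groupId with dots turned into slashes>/<artifactId>/<version>/
    group_dir=$(printf '%s' "$group_id" | tr '.' '/')
    printf '%s\n' "$HOME/.m2/repository/$group_dir/$artifact_id/$version/$artifact_id-$version.$packaging"
}

# Coordinates taken from the pom.xml above.
jar=$(m2_artifact_path org.apache hadoop 0.14.5-dev-core jar)
if [ -f "$jar" ]; then
    echo "installed: $jar"
else
    echo "not found: $jar"
fi
```

If the file is reported as not found, the Hadoop build probably did not complete, and the later Maven builds that depend on this version will not resolve it.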
Checkout PwaLucene + PwaArchiveAccess:
- git clone https://github.com/arquivo/pwa-technologies.git
Install PwaLucene:
- cd pwa-technologies/PwaLucene
- mvn install
Install PwaArchiveAccess:
- cd pwa-technologies/PwaArchive-access (relative to the directory where you ran git clone)
- mvn install
- configure (only if you need to change the default configuration), then run mvn install again
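The `cd` paths in the install steps above are relative to the directory that contains the pwa-technologies checkout, so running them back to back from inside PwaLucene would fail. One way to avoid that is to run each build in a subshell, as in this minimal sketch. The function name `build_module` and the `MVN` override variable are hypothetical, not part of the project.

```shell
#!/bin/sh
# Hypothetical helper: run "mvn install" inside one module of the checkout.
# MVN may be set to another Maven executable; it defaults to "mvn".
build_module() {
    base_dir=$1
    module=$2
    # The subshell keeps the caller's working directory unchanged, so
    # consecutive calls can all use paths relative to base_dir.
    (cd "$base_dir/$module" && "${MVN:-mvn}" install)
}

# Usage, from the directory where git clone was run:
#   build_module pwa-technologies PwaLucene &&
#   build_module pwa-technologies PwaArchive-access
```

Chaining the calls with `&&` stops at the first failing build, which matters here because PwaLucene is installed before PwaArchive-access in the steps above.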
The JAR and WAR files are available in:
- pwa-technologies/PwaArchive-access/projects/nutchwax/nutchwax-job/target/nutchwax-job-0.11.0-SNAPSHOT.jar
- pwa-technologies/PwaArchive-access/projects/nutchwax/nutchwax-webapp/target/nutchwax-webapp-0.11.0-SNAPSHOT.war
- pwa-technologies/PwaArchive-access/projects/wayback/wayback-webapp/target/wayback-1.2.1.war
- pwa-technologies/PwaLucene/target/pwalucene-1.0.0-SNAPSHOT.jar
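A quick way to confirm that both builds produced the files listed above is to test for each path. This is a small sketch under the assumption that the artifact names and versions match the list exactly; the function name `report_artifacts` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report which of the expected build outputs exist.
# $1 is the directory that contains the pwa-technologies checkout
# (defaults to the current directory).
report_artifacts() {
    base_dir=${1:-.}
    for artifact in \
        pwa-technologies/PwaArchive-access/projects/nutchwax/nutchwax-job/target/nutchwax-job-0.11.0-SNAPSHOT.jar \
        pwa-technologies/PwaArchive-access/projects/nutchwax/nutchwax-webapp/target/nutchwax-webapp-0.11.0-SNAPSHOT.war \
        pwa-technologies/PwaArchive-access/projects/wayback/wayback-webapp/target/wayback-1.2.1.war \
        pwa-technologies/PwaLucene/target/pwalucene-1.0.0-SNAPSHOT.jar
    do
        if [ -f "$base_dir/$artifact" ]; then
            echo "ok       $artifact"
        else
            echo "missing  $artifact"
        fi
    done
}

# Run from the directory where git clone was executed.
report_artifacts .
```

A line starting with "missing" points at the module whose build should be rerun.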
Create a symbolic link to nutch for nutch-trec:
- cd pwa-technologies/PwaArchive-access
- ln -s ../../projects/nutchwax/nutchwax-thirdparty/nutch/ projects/nutch-trec/
This is only necessary if you plan to use the TREC datasets for tests.
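The target passed to `ln -s` above is relative, and a relative symlink target is resolved from the directory that holds the link, not from the directory where `ln` was run. That is why the target starts with `../../`: from `projects/nutch-trec/` it climbs back to `PwaArchive-access` before descending into `projects/nutchwax/`. Assuming `projects/nutch-trec/` already exists as a directory, the link is created inside it under the name `nutch`. The sketch below checks that such a link is not dangling; the function name `check_link` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if $1 is a symbolic link whose target exists.
check_link() {
    # -L tests that the path itself is a symlink; -e follows the link,
    # so it fails when the link is dangling.
    [ -L "$1" ] && [ -e "$1" ]
}

# Usage, from inside pwa-technologies/PwaArchive-access:
#   check_link projects/nutch-trec/nutch && echo "link ok"
```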