This demo project will fetch all the nodes from Alfresco to Apache Storm to process.(For demonstration purposes, nodes are being fetched and printed.) Then Apache Storm continuously checks for changed nodes using Alfresco indexer webscript. If any change is detected, project will fetch them and process them.
To run this demo, you need a running Alfresco instance with alfresco-indexer AMP. Then follow below instructions. (Instructions were extracted from storm-crawler project.)
To get started with alfresco-apache-storm-demo, it's recommended that you run the CrawlTopology in local mode.
NOTE: These instructions assume that you have Maven installed.
First, clone the project from github:
git clone https://github.com/zaizi/alfresco-apache-storm-demo.git
Then :
cd core
mvn clean compile exec:java -Dstorm.topology=com.digitalpebble.storm.crawler.CrawlTopology -Dexec.args="-conf crawler-conf.yaml -local"
to run the demo CrawlTopology.
Alternatively, generate an uberjar:
mvn clean package
and then submit the topology with storm jar
:
storm jar target/storm-crawler-core-0.5-SNAPSHOT-jar-with-dependencies.jar com.digitalpebble.storm.crawler.CrawlTopology -conf crawler-conf.yaml
to run it in distributed mode.
Alfresco/Apache Storm demo is developed using storm-crawler from @DigitalPebble and alfresco-indexer.