This is a fork of Mozilla's ActiveData.

ActiveData provides high-speed filtering and aggregation over data; see the [ActiveData Wiki Page](https://wiki.mozilla.org/Auto-tools/Projects/ActiveData) for project details.
Build and coverage status for the `master`, `dev`, and `v1.7` branches: https://app.travis-ci.com/github/klahnakoski/ActiveData
Requirements:

- Python 3.9
- Elasticsearch version 6.x
The `requirements.txt` is locked to specific versions known to work. You may update it with:

```bash
pip install pip-tools
pip-compile --upgrade --output-file requirements.txt requirements.in
```
Elasticsearch has a configuration file at `config/elasticsearch.yml`. You must modify it to handle a high number of scripts:

```yaml
script.painless.regex.enabled: true
script.max_compilations_rate: 10000/1m
```
We enable compression for faster transfer speeds:

```yaml
http.compression: true
```
And it is a good idea to give your cluster a unique name so it does not join others on your local network:

```yaml
cluster.name: lahnakoski_dev
```
Then you can run Elasticsearch:

```
c:\elasticsearch>bin\elasticsearch
```

Elasticsearch listens on port 9200. Test that it is working:

```bash
curl http://localhost:9200
```
You should expect something like this (the exact fields and version numbers will depend on your Elasticsearch release):

```json
{
    "status" : 200,
    "name" : "dev",
    "cluster_name" : "lahnakoski_dev",
    "version" : {
        "number" : "1.7.5",
        "build_hash" : "00f95f4ffca6de89d68b7ccaf80d148f1f70e4d4",
        "build_timestamp" : "2016-02-02T09:55:30Z",
        "build_snapshot" : false,
        "lucene_version" : "4.10.4"
    },
    "tagline" : "You Know, for Search"
}
```
There is no PyPI install. Please clone the `master` branch from GitHub:

```bash
git clone https://github.com/klahnakoski/ActiveData.git
cd ActiveData
git checkout master
```

and install your requirements:

```bash
pip install -r requirements.txt
```
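Optionally, you may prefer to install those requirements inside a virtual environment. This is not required by the project; it is just a standard Python setup sketch:

```bash
# create and activate an isolated environment (optional)
python -m venv .venv
source .venv/bin/activate      # on Windows: .venv\Scripts\activate
pip install -r requirements.txt
```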
The ActiveData service requires a configuration file that points to the default Elasticsearch index. You can find a few sample config files in `resources/config`. `simple_settings.json` is the simplest one:
```json
{
    "flask": {
        "host": "0.0.0.0",
        "port": 5000,
        "debug": false,
        "threaded": true,
        "processes": 1
    },
    "constants": {
        "mo_http.http.default_headers": {"From": "https://wiki.mozilla.org/Auto-tools/Projects/ActiveData"}
    },
    "elasticsearch": {
        "host": "http://localhost",
        "port": 9200,
        "index": "unittest",
        "type": "test_result",
        "debug": true
    }
    ...<snip>...
}
```
The `elasticsearch` property must be updated to point to a specific cluster, index and type. It is used as a default, and to find other indexes by name.
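For example, to point the service at another cluster and index you would edit only that block. The host and index names below are placeholders, not values shipped with the project:

```json
"elasticsearch": {
    "host": "http://my-cluster.example.com",
    "port": 9200,
    "index": "task",
    "type": "task",
    "debug": false
}
```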
Jump to your git project directory, set your `PYTHONPATH`, and run `app.py`:

```bash
cd ~/ActiveData
export PYTHONPATH=.:vendor
python active_data/app.py --settings=resources/config/simple_settings.json
```
If you have no records in your Elasticsearch cluster, then you must add some before you can query them. Make a table in Elasticsearch, with one record:

```bash
curl -XPUT "http://localhost:9200/movies/movie/1" -H "Content-Type: application/json" -d "{\"name\":\"The Parent Trap\",\"released\":\"29 July 1998\",\"imdb\":\"http://www.imdb.com/title/tt0120783/\",\"rating\":\"PG\",\"director\":{\"name\":\"Nancy Meyers\",\"dob\":\"December 8, 1949\"}}"
```
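You can confirm the record was stored by fetching it back from Elasticsearch directly:

```bash
# retrieve the document by index, type, and id
curl http://localhost:9200/movies/movie/1
```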
Assuming you used the defaults, you can verify the service is up if you can access the Query Tool at http://localhost:5000/tools/query.html. You may use it to send queries to your instance of the service. For example:

```json
{"from":"movies"}
```
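You can also send queries directly over HTTP. The sketch below assumes the service accepts JSON queries on the `/query` endpoint at the configured port, and filters on the `rating` field of the record added above:

```bash
# POST a JSON query expression to the running ActiveData service
curl -XPOST http://localhost:5000/query \
     -H "Content-Type: application/json" \
     -d "{\"from\":\"movies\",\"where\":{\"eq\":{\"rating\":\"PG\"}}}"
```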
The GitHub repo also includes the test suite, and you can run it against your service if you wish. The tests will create indexes on your cluster, which are filled, queried, and destroyed.
Linux:

```bash
cd ~/ActiveData
export PYTHONPATH=.:vendor
python -m unittest discover -v -s tests
```
Windows:

```
cd ActiveData
SET PYTHONPATH=.;vendor
python -m unittest discover -v -s tests
```
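If you only want to exercise part of the suite, unittest discovery accepts a filename pattern; the pattern below is illustrative, not a module name taken from the repo:

```bash
# run only test files matching the given pattern
python -m unittest discover -v -s tests -p "test_query*.py"
```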