- Install Docker according to the instructions on the Docker website.
- Open a terminal and enter `sudo docker -v` to check whether Docker was installed successfully; if not, retry the previous step.
- Unzip `awt-pj-ss22-learn-tech-2.zip` and change into the project directory `awt-pj-ss22-learn-tech-2`. Type the following command to build the Docker image from the Dockerfile:
  `sudo DOCKER_BUILDKIT=1 docker build -t competence-extraction:1.0 ./`
  All dependencies are downloaded automatically, which will take a while.
- Type `sudo docker images` and you will see the `competence-extraction` image you just built.
- Enter the following command to start a container from this image, with port 8888 for Jupyter Notebook and port 5000 for the RESTful API:
  `sudo docker run --user root -p 8888:8888 -p 5000:5000 competence-extraction:1.0`
  You may need to change the host port if it is already occupied, for example:
  `sudo docker run --user root -p 8888:8888 -p 5001:5000 competence-extraction:1.0`
  or
  `sudo docker run --user root -p 8889:8888 -p 5000:5000 competence-extraction:1.0`
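If you are unsure whether a host port is already occupied before starting the container, here is a minimal check in Python (standard library only; 8888 and 5000 are simply the default ports from the command above):

```python
import socket

def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        return s.connect_ex((host, port)) != 0  # non-zero: connection refused, so the port is free

for port in (8888, 5000):
    print(f"port {port}: {'free' if port_is_free(port) else 'occupied'}")
```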
- Use a browser to open the last link shown in the terminal so that you can access the source code.
  !!! Please save this link for future use !!!
  All existing datasets have already been processed, so you can skip this step and view the results directly with the RESTful API in the next step, or follow the instructions below if you want to run the source code.
- Press `Ctrl+C` to exit the Jupyter Notebook environment, then enter `sudo docker ps -a` to find the container that was just created and has already run Jupyter Notebook. Note the `<CONTAINER ID>` of this container.
- Start the container again:
  `sudo docker start <CONTAINER ID>`
- Use `sudo docker exec -it <CONTAINER ID> bash` to enter the terminal of the container.
- Enter `sudo python awt-pj-ss22-learn-tech-2/src/app.py` to run the RESTful API, and open the first link printed in the terminal in your browser.
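As a quick smoke test, you can also reach the API from the host with a few lines of Python once it is running. This sketch assumes the default port mapping from the `docker run` command above (container port 5000 exposed on host port 5000) and that the API's documentation page is served at the root path; adjust the URL if you remapped the port:

```python
# Smoke test from the host machine (not part of the project's source code).
from urllib.request import urlopen

with urlopen("http://localhost:5000/", timeout=10) as resp:
    print(resp.status)                       # 200 means the API is reachable
    print(resp.headers.get("Content-Type"))  # typically an HTML documentation page
```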
- Press `Ctrl+C` to close the RESTful API.
- You can also use the same link (the one you saved earlier) to start Jupyter Notebook again.
- Finally, enter `exit` to leave the container.
- To run the project natively instead (for example on an Apple M1 Mac), install the Anaconda environment first.
- Download the Miniforge installer, install the Conda environment for the M1, and activate it:
  `chmod +x ~/Downloads/Miniforge3-MacOSX-arm64.sh`
  `sh ~/Downloads/Miniforge3-MacOSX-arm64.sh`
  `source ~/miniforge3/bin/activate`
- Some additional installations are also required (a short verification sketch follows this list):
  - Spacy:
    `conda install -c conda-forge spacy`
    `python -m spacy download de_core_news_lg`
  - Tensorflow (download the tensorflow_text wheel first):
    `conda install -c apple tensorflow-deps`
    `python -m pip install tensorflow-macos`
    `python -m pip install tensorflow-metal`
    `python -m pip install Downloads/tensorflow_text-2.9.0-cp39-cp39-macosx_11_0_arm64.whl`
    `python -m pip install tensorflow_hub`
  - Neo4J:
    `pip install neo4j`
  - RESTful API:
    `pip install Flask~=2.1.2`
    `pip install flask-restx==0.5.1`
    `pip install werkzeug==2.1.2`
  - Jupyter Notebook:
    `conda install -c conda-forge jupyter jupyterlab -y`
  - Pandas:
    `conda install pandas`
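As referenced above, here is a minimal verification sketch, assuming all of the packages listed in this section were installed into the active conda environment. It is not part of the project; it only checks that the imports and the German spaCy model work:

```python
# Sanity check for the environment set up above.
import spacy
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text  # registers the ops required by multilingual TF-Hub models
import neo4j
import flask
import flask_restx
import pandas as pd

print("tensorflow", tf.__version__)
print("spacy", spacy.__version__)

# The German model downloaded with `python -m spacy download de_core_news_lg`
# should load and tokenize without errors.
nlp = spacy.load("de_core_news_lg")
doc = nlp("Dies ist ein kurzer Testsatz.")
print([token.text for token in doc])
```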
- Use Jupyter Notebook to run the source code:
  `cd Downloads/awt-pj-ss22-learn-tech-2/src`
  `jupyter notebook`
  All existing datasets have already been processed, so you can skip this step and view the results directly with the RESTful API in the next step.
  In case you want to test the source code or you have a new dataset, there are three different Jupyter Notebooks under the `src/` path. You can find a detailed description of them in AWT_Report_IEEE.pdf. Here are a few points to highlight:
  - Run each block of code in the notebooks in the order `Preprocessing.ipynb` -> `NLP.ipynb` -> `Neo4J.ipynb`; this lets you see all the intermediate steps.
  - When importing the libraries, a warning message may appear; it only means the GPU is not configured for acceleration and does not affect normal use.
  - By default the control course dataset is used (the input value of the `import_course` function in `Preprocessing.ipynb`; using the full course dataset takes more than an hour). If you want to test other datasets, replace the input parameters of the `import_course` function.
  - In `Neo4J.ipynb`, due to the security settings of the cloud database we use, locally computed data cannot be imported into the database directly. It must first be uploaded to a publicly accessible HTTP or HTTPS server. It is recommended to use Google Drive and create a sharing link. The `get_google_file` function in `Neo4J.ipynb` extracts the direct-access address of the data file from that link so it can be loaded into the cloud database (see the sketch after this list).
  - You can also use your own cloud database by replacing `uri`, `user`, and `password` in `Neo4J.ipynb`, or use your own network drive and upload to the cloud database in a similar way.
  - `Evaluation.ipynb` under `src/archive` is only included as a potential evaluation tool and is not actually used.
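For orientation, here is a minimal sketch of the two ideas mentioned above: turning a Google Drive sharing link into a direct-download URL, and connecting to a Neo4j cloud database with the official Python driver. The function below is only an illustration and is not the project's actual `get_google_file` implementation; the `neo4j+s://` URI, user, and password are placeholders that must be replaced with the values from `Neo4J.ipynb` or your own database:

```python
import re
from neo4j import GraphDatabase  # official driver, installed with `pip install neo4j`

def drive_direct_link(share_url: str) -> str:
    """Turn a Google Drive sharing link into a direct-download URL (illustrative only)."""
    file_id = re.search(r"/d/([\w-]+)", share_url).group(1)
    return f"https://drive.google.com/uc?export=download&id={file_id}"

print(drive_direct_link("https://drive.google.com/file/d/FILE_ID/view?usp=sharing"))

# Replace these placeholders with the uri / user / password used in Neo4J.ipynb
# (or with your own cloud database credentials).
uri, user, password = "neo4j+s://<host>", "neo4j", "<password>"
driver = GraphDatabase.driver(uri, auth=(user, password))
driver.verify_connectivity()  # raises an exception if the URI or credentials are wrong
driver.close()
```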
- Press `Ctrl+C` to exit the Jupyter Notebook environment, then enter `flask run --port=5001` to run the RESTful API, and open the first link printed in the terminal in your browser.
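If you just want to see how a flask-restx service of this kind is wired up, here is a minimal, self-contained sketch. It is not the project's `app.py`; the `/courses` endpoint is purely hypothetical, and the port simply matches the `flask run --port=5001` example above:

```python
# Minimal flask-restx example (NOT the project's app.py).
from flask import Flask
from flask_restx import Api, Resource

app = Flask(__name__)
api = Api(app, title="Competence Extraction API (example)")

@api.route("/courses")
class Courses(Resource):
    def get(self):
        # The real project would return results derived from the processed datasets.
        return {"courses": []}

if __name__ == "__main__":
    app.run(port=5001)
```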