This repository has been archived by the owner on Oct 17, 2024. It is now read-only.

Retrieve decrees and highlight differences.

tayflo/gas-scrapLegifrance-discriminate

Retrieve decrees and highlight differences

Principles

This repository is part of a wider study of French legislative processes.

Legal documents can be a headache: page after page of jargon-laden recitals. From a digital social sciences perspective, some tooling can make them easier to work with.

Therefore, this project has two main, consecutive parts:

  • Scrape data from the web
  • Visualize data correctly

A sister project of the present one can be found here.

Description

The idea of the present project is to highlight differences between very similar texts, such as decrees of a specific kind.

For each word in each text, two metrics are computed: "Banality" (at the corpus level) and "Specificity" (at the text level). Note that this is similar to a term frequency-inverse document frequency (TF-IDF) analysis.

Formulas used here are the following:

Colors are given accordingly:

[Image: text_colors_legend]

For an extract from some censorship decrees, this gives:

[Image: text_example]
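
To make the TF-IDF analogy mentioned above more concrete, here is a minimal sketch of a standard TF-IDF computation over a small corpus. It is not the project's own "Banality" and "Specificity" formulas, which are not reproduced in this text; the function names `tokenize` and `tfidf` and the sample sentences are hypothetical and chosen only for illustration.

```javascript
// Standard TF-IDF, used here only as a stand-in for the project's own
// "Banality" and "Specificity" metrics (an assumption, not the author's code).

// Hypothetical helper: lowercase a text and split it into words.
function tokenize(text) {
  return text.toLowerCase().match(/[\p{L}\p{N}]+/gu) || [];
}

// Returns one Map per text, mapping each word to its TF-IDF score.
function tfidf(texts) {
  const docs = texts.map(tokenize);

  // Document frequency: in how many texts does each word appear?
  const docFreq = new Map();
  for (const words of docs) {
    for (const word of new Set(words)) {
      docFreq.set(word, (docFreq.get(word) || 0) + 1);
    }
  }

  return docs.map((words) => {
    // Raw count of each word within this text.
    const counts = new Map();
    for (const word of words) {
      counts.set(word, (counts.get(word) || 0) + 1);
    }
    // Term frequency (text level) times inverse document frequency (corpus level).
    const scores = new Map();
    for (const [word, count] of counts) {
      const tf = count / words.length;
      const idf = Math.log(docs.length / docFreq.get(word));
      scores.set(word, tf * idf);
    }
    return scores;
  });
}

// Made-up sample texts, not taken from real decrees.
const decrees = [
  "the film is banned in paris",
  "the film is banned in lyon",
];
const scores = tfidf(decrees);
console.log(scores[0].get("film"));  // 0: the word appears in every text
console.log(scores[0].get("paris")); // positive: the word is unique to this text
```

In this sketch, words shared by every text score zero while words unique to one text score highest, which is the kind of contrast a color scale can then make visible.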

Licence and Re-use

This personal project was part of a broader, specific piece of work, so it was not designed for straightforward re-use or for further development.

  • If it can serve educational purposes as it is, great.

  • If you plan to draw inspiration from the present work and its methods, a citation in your own work is always appreciated. If needed, you can also open an issue or contact me.

  • If you are interested in analyzing lawmaking, the GitHub organization Regards Citoyens could be the right place for you.

All underlying dependencies of the project are listed in package.json, and references are given in the source code where needed.
