An analysis project that inspects citations in rulings from Illinois.
As mentioned in the project report, we seek to answer the following questions:
- Which attorney has participated in the most cases? Which among attorneys for private parties? Which among attorneys for the government?
- How often is the work in which an attorney was involved cited, i.e. how influential was that work?
- What is the page count of cases in which an attorney has participated?
These questions are answered in their respective Jupyter Notebooks, and the findings are presented in the project report.
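To give a flavor of the analysis, the first question essentially reduces to a join-and-count query over the database built in the steps below. The following Python snippet is only a minimal sketch, not the notebooks' actual query; the table and column names (attorneys, case_attorneys, attorney_id, case_id) are assumptions, since the real schema is defined by the DDL files in ./Database.

import sqlite3

# Sketch: rank attorneys by the number of cases they appear in.
# Table/column names are assumed; see ./Database/*.ddl.sql for the real schema.
conn = sqlite3.connect("hcap.sqlite")
query = """
    SELECT a.name, COUNT(ca.case_id) AS case_count
    FROM attorneys AS a
    JOIN case_attorneys AS ca ON ca.attorney_id = a.id
    GROUP BY a.id, a.name
    ORDER BY case_count DESC
    LIMIT 10
"""
for name, case_count in conn.execute(query):
    print(f"{name}: {case_count}")
conn.close()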
Bulk case data can be downloaded with the following commands:
mkdir Data && cd Data
# Save the bulk export to a file (rather than printing it to stdout) and unpack
# it here so that Illinois-20200302-text/data exists.
curl -L -o Illinois-20200302-text.zip https://api.case.law/v1/bulk/22341/download/
Citation data can be downloaded with the following commands:
cd Data/Illinois-20200302-text/data
# Save the compressed citation graph alongside the case data.
curl -L -O https://case.law/download/citation_graph/2020-04-28/citations.csv.gz
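The citation graph is a gzipped CSV. Its columns are easiest to confirm directly from the file; the snippet below is a small sketch that prints the header and the first data row using only the Python standard library.

import csv
import gzip

# Preview the citation graph: print the header row and the first data row.
path = "Data/Illinois-20200302-text/data/citations.csv.gz"
with gzip.open(path, mode="rt", newline="") as fh:
    reader = csv.reader(fh)
    print(next(reader))  # header
    print(next(reader))  # first citation record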
First, be sure to install the required tools as listed here:
After downloading the data into the Data directory, we can use the Python script ./ETL/hcapetl.py to transform, clean, and insert the data into a SQLite database that will simplify our analysis.
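For context, the bulk export stores one case per line as JSON; in the text format each record carries the case metadata plus a casebody with the attorney strings. The snippet below is a rough sketch of the kind of extraction involved, not hcapetl.py's actual implementation; the field names (id, name, first_page, last_page, casebody.data.attorneys) follow the CAP bulk format, and the real script is the authority where they differ.

import json
import lzma

# Sketch only: pull the fields relevant to the analysis out of one case record.
# Field names follow the CAP text-format bulk export; hcapetl.py may differ.
def extract_case(line):
    case = json.loads(line)
    body = case.get("casebody", {}).get("data", {})
    return {
        "case_id": case["id"],
        "name": case.get("name"),
        "first_page": case.get("first_page"),
        "last_page": case.get("last_page"),
        "attorneys": body.get("attorneys", []),  # list of attorney strings
    }

path = "Data/Illinois-20200302-text/data/data.jsonl.xz"
with lzma.open(path, mode="rt") as fh:
    print(extract_case(next(fh)))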
The data must be extracted first with these commands:
DATA=Data/Illinois-20200302-text/data
DPROC=Data/Processed
mkdir -p "$DPROC"
# Decompress the JSON-lines file, then slurp it (jq -s) into a single JSON array.
xzcat "$DATA/data.jsonl.xz" > "$DATA/data.jsonl"
jq -s '.' "$DATA/data.jsonl" > "$DPROC/data.json"
The database will be named hcap.sqlite and it can be created by the following commands:
dbpath=./hcap.sqlite
./ETL/hcapetl.py create tables "$dbpath" ./Database/*.ddl.sql
./ETL/hcapetl.py create attorneys "$dbpath" ./Data/Processed/data.json
./ETL/hcapetl.py create cases "$dbpath" ./Data/Processed/data.json
./ETL/hcapetl.py create citations "$dbpath" ./Data/Processed/data.json
Running the full ETL pipeline should take about 10 minutes (excluding data download).
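Once the pipeline finishes, a quick row count is a convenient sanity check. The table names used below (attorneys, cases, citations) are only assumed from the hcapetl.py subcommands above; the authoritative names are in ./Database/*.ddl.sql.

import sqlite3

# Sanity check after the ETL run: row counts per table.
# Table names are assumed from the subcommands above; check the DDL if they differ.
conn = sqlite3.connect("hcap.sqlite")
for table in ("attorneys", "cases", "citations"):
    count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    print(f"{table}: {count} rows")
conn.close()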
The previous commands can be found in the gendata.sh script at the root of this project.
The ETL directory has all of the Python source necessary to work with the data.
To aid with the exploration and cleanup, we have the following Jupyter Notebooks:
The Data Exploration notebook documents the commands used to gain insight into fragments of the data and to determine the SQL database schema.
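As an example of that kind of exploration, the sketch below samples the first few hundred records and reports how many list attorneys, the sort of quick probe that informs both cleanup and schema decisions. The field names follow the CAP bulk format; the notebook itself is the authoritative record of what was actually run.

import json
import lzma
from itertools import islice

# Quick data-quality probe: how many of the first 500 cases list attorneys?
path = "Data/Illinois-20200302-text/data/data.jsonl.xz"
with lzma.open(path, mode="rt") as fh:
    cases = [json.loads(line) for line in islice(fh, 500)]

with_attorneys = sum(
    1 for c in cases
    if c.get("casebody", {}).get("data", {}).get("attorneys")
)
print(f"{with_attorneys} of {len(cases)} sampled cases list attorneys")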