Repository for origin-destination datasets
- Create a directory named after your project (your-project)
- Create a your-project.json file that describes your project (see examples in other directories)
- Format your dataset as a CSV file and add a link to it in your-project.json
- Add a link to the your-project directory in the master file dataset.json (list of all datasets)
- Your dataset should then appear in https://observablehq.com/d/188f3eb2bb17b279
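The first steps above can be sketched as a small script. This is only an illustration under assumptions: the helper name scaffold_project, the placeholder values, the -data.csv file name (borrowed from random/random-data.csv) and the use of a temporary directory are hypothetical and not part of this repository. The keys mirror the complete example in this README. The structure of dataset.json is not described here, so linking the directory in that file is left as a manual step.

```python
import json
import tempfile
from pathlib import Path


def scaffold_project(root: Path, project: str) -> Path:
    """Create <root>/<project>/ with an empty CSV and a description file.

    Hypothetical helper: the keys mirror the complete example in this README,
    the values are placeholders to edit by hand.
    """
    project_dir = root / project
    project_dir.mkdir(parents=True, exist_ok=True)

    csv_path = project_dir / f"{project}-data.csv"
    csv_path.touch()  # replace with your formatted dataset

    description = {
        "file": f"{project}/{csv_path.name}",  # link to the CSV
        "name": project,
        "separator": ",",
        "meta": {},
        "attributes": [],
        "author": "",
        "description": "",
        "source": "",
    }
    json_path = project_dir / f"{project}.json"
    json_path.write_text(json.dumps(description, indent=2), encoding="utf-8")
    return json_path


# Demo in a temporary directory so nothing in the repository is touched.
with tempfile.TemporaryDirectory() as tmp:
    created = scaffold_project(Path(tmp), "your-project")
    scaffolded = json.loads(created.read_text(encoding="utf-8"))
    print(scaffolded["file"])
```

In a real checkout, root would be the repository root and the empty CSV would be replaced by the formatted dataset.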
The file dataset.json links to all those datasets:
- Each directory contains a dataset
- A .json file in each of those directories describes the dataset (attributes, ...)
- A .csv file contains the raw data
You may only need to change dataset.json once.
This file is a classic CSV file, preferably with commas (,) as separators. Each line represents one O/D trajectory. The column names are referenced in the data.json file.
Example (from random/random-data.csv):
time,group,x1,x2,y1,y2,group_x1,group_x2,group_y1,group_y2,distance,distance_category,orientation,hour,minute,second,year,month,day
Mon Jan 1 20:56:01 2018,2,752,542,899,30,3,2,0,4,894.0139819935704,long,S,20,56,1,2018,1,1
Mon Jan 1 21:41:05 2018,0,677,418,886,186,3,2,0,4,746.3785902610015,long,S,21,41,5,2018,1,1
Mon Jan 1 06:28:10 2018,2,225,380,53,562,1,1,4,2,532.0770620878145,medium,N,6,28,10,2018,1,1
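To illustrate the one-line-per-trajectory layout, the sketch below parses the three example rows with Python's csv module. It assumes that (x1, y1) is the origin and (x2, y2) the destination, and that distance is the Euclidean distance between them. Neither point is stated in this README; the second is only consistent with the sample rows.

```python
import csv
import io
import math

SAMPLE = """\
time,group,x1,x2,y1,y2,group_x1,group_x2,group_y1,group_y2,distance,distance_category,orientation,hour,minute,second,year,month,day
Mon Jan 1 20:56:01 2018,2,752,542,899,30,3,2,0,4,894.0139819935704,long,S,20,56,1,2018,1,1
Mon Jan 1 21:41:05 2018,0,677,418,886,186,3,2,0,4,746.3785902610015,long,S,21,41,5,2018,1,1
Mon Jan 1 06:28:10 2018,2,225,380,53,562,1,1,4,2,532.0770620878145,medium,N,6,28,10,2018,1,1
"""

trajectories = []
for row in csv.DictReader(io.StringIO(SAMPLE)):
    # Assumption: (x1, y1) is the origin and (x2, y2) the destination.
    origin = (float(row["x1"]), float(row["y1"]))
    destination = (float(row["x2"]), float(row["y2"]))
    trajectories.append(
        {
            "origin": origin,
            "destination": destination,
            "distance": float(row["distance"]),          # value stored in the CSV
            "computed": math.dist(origin, destination),  # Euclidean distance
        }
    )

for t in trajectories:
    print(t["origin"], "->", t["destination"], round(t["computed"], 3))
```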
This file describes the attributes (columns) of the .csv file.
Complete example:
{
  "file": "random/random-data.csv",
  "name": "Random XY Data",
  "header": 1,
  "separator": ",",
  "meta": {
    "date": "start_time",
    "group": "group",
    "timeParse": "%c",
    "cumul": "distance"
  },
  "grids": [
    {
      "title": "random",
      "tree": [
        { "group": "orientation", "gridding": "grid", "padding": 5 },
        { "group": "start_time" }
      ]
    },
    {
      "title": "random-od",
      "tree": [
        {
          "group": "cell_group_destination",
          "gridding": "grid",
          "padding": 5
        },
        {
          "group": "start_time"
        }
      ]
    },
    {
      "title": "random-group-color",
      "tree": [
        { "group": "orientation", "gridding": "grid", "padding": 5 },
        { "group": "group", "gridding": "grid", "padding": 5 },
        { "group": "group" }
      ]
    }
  ],
  "attributes": [
    {
      "name": "x1",
      "type": "quantitative"
    },
    {
      "name": "x2",
      "type": "quantitative"
    },
    {
      "name": "y1",
      "type": "quantitative"
    },
    {
      "name": "y2",
      "type": "quantitative"
    },
    {
      "name": "distance",
      "type": "quantitative"
    },
    {
      "name": "distance_category",
      "type": "categorical"
    },
    {
      "name": "orientation_4",
      "type": "categorical"
    },
    {
      "name": "start_time",
      "type": "categorical"
    },
    {
      "name": "start_year",
      "type": "categorical"
    },
    {
      "name": "start_month",
      "type": "categorical"
    },
    {
      "name": "start_day",
      "type": "categorical"
    },
    {
      "name": "start_hour",
      "type": "categorical"
    },
    {
      "name": "start_minute",
      "type": "categorical"
    },
    {
      "name": "start_second",
      "type": "categorical"
    },
    {
      "name": "end_time",
      "type": "categorical"
    },
    {
      "name": "end_year",
      "type": "categorical"
    },
    {
      "name": "end_month",
      "type": "categorical"
    },
    {
      "name": "end_day",
      "type": "categorical"
    },
    {
      "name": "end_hour",
      "type": "categorical"
    },
    {
      "name": "end_minute",
      "type": "categorical"
    },
    {
      "name": "end_second",
      "type": "categorical"
    },
    {
      "name": "duration",
      "type": "quantitative"
    },
    {
      "name": "speed",
      "type": "quantitative"
    },
    {
      "name": "speed_category",
      "type": "categorical"
    },
    {
      "name": "orientation_8",
      "type": "categorical"
    },
    {
      "name": "duration_category",
      "type": "categorical"
    },
    {
      "name": "cell_group_origin",
      "type": "categorical"
    },
    {
      "name": "cell_group_destination",
      "type": "categorical"
    },
    {
      "name": "bi_start_time",
      "type": "categorical"
    },
    {
      "name": "bi_start_year",
      "type": "categorical"
    },
    {
      "name": "bi_start_month",
      "type": "categorical"
    },
    {
      "name": "bi_start_day",
      "type": "categorical"
    },
    {
      "name": "bi_start_hour",
      "type": "categorical"
    },
    {
      "name": "bi_start_minute",
      "type": "categorical"
    },
    {
      "name": "bi_start_second",
      "type": "categorical"
    }
  ],
  "author": "Romain Vuillemot",
  "description": "Random data",
  "source": ""
}
The meta object describes the well-known data fields: origin and destination coordinates, dates, groups…
The attributes array describes the secondary fields (duration, price, age…) that will be used to color maps or for statistical analysis.
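A description file can be cross-checked against the CSV header to catch fields that point at columns the data does not contain. The sketch below is an illustration under assumptions, not part of this repository: the function name missing_columns and the abridged description are hypothetical, and treating date, group and cumul as column references (with timeParse holding a format instead) is a guess based on the example above.

```python
import json

# Abridged, hypothetical description aligned with the sample CSV header.
DESCRIPTION = json.loads("""
{
  "file": "random/random-data.csv",
  "separator": ",",
  "meta": {"date": "time", "group": "group", "timeParse": "%c", "cumul": "distance"},
  "attributes": [
    {"name": "distance", "type": "quantitative"},
    {"name": "distance_category", "type": "categorical"},
    {"name": "speed", "type": "quantitative"}
  ]
}
""")

HEADER = (
    "time,group,x1,x2,y1,y2,group_x1,group_x2,group_y1,group_y2,distance,"
    "distance_category,orientation,hour,minute,second,year,month,day"
).split(",")

# Assumption: these meta keys hold column names; timeParse holds a format.
META_COLUMN_KEYS = ("date", "group", "cumul")


def missing_columns(description: dict, header: list) -> list:
    """Return referenced column names that are absent from the CSV header."""
    referenced = [
        description["meta"][key]
        for key in META_COLUMN_KEYS
        if key in description["meta"]
    ]
    referenced += [attr["name"] for attr in description["attributes"]]
    known = set(header)
    return [name for name in referenced if name not in known]


missing = missing_columns(DESCRIPTION, HEADER)
print(missing)  # "speed" is declared as an attribute but is not a CSV column
```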
Dates must be formatted in a way that moment.js can parse. The date format can be specified with a dateformat attribute.
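The sample rows use timestamps such as Mon Jan 1 20:56:01 2018, and the example description gives %c as its timeParse value. As a rough Python analogue (the project itself relies on JavaScript libraries, so this is only an illustration), the sketch below parses such a timestamp with an explicit strptime format that corresponds to %c in the C locale. The helper name parse_time is hypothetical, and English day and month names are assumed.

```python
from datetime import datetime

# Explicit equivalent of "%c" in the C locale; English day/month names assumed.
TIME_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_time(text: str, fmt: str = TIME_FORMAT) -> datetime:
    """Parse one timestamp from the CSV using a strptime-style format."""
    return datetime.strptime(text, fmt)


parsed = parse_time("Mon Jan 1 20:56:01 2018")
print(parsed.isoformat())
```

Checking that every row parses before publishing a dataset avoids silent date errors in the visualizations.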
Separator is, by default, the comma ",". It is passed to d3.dsv.
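The role of the separator can be illustrated with a short Python sketch. The project passes the separator to d3.dsv; the csv module is used here only as an analogue. The helper name read_rows and the semicolon-separated sample are hypothetical, and the fallback mirrors the comma default described above.

```python
import csv
import io


def read_rows(text: str, description: dict) -> list:
    """Parse CSV text using the description's separator (comma by default)."""
    separator = description.get("separator", ",")
    return list(csv.DictReader(io.StringIO(text), delimiter=separator))


# The same data with two different separators yields the same rows.
comma_rows = read_rows("x1,y1\n752,899\n", {})
semicolon_rows = read_rows("x1;y1\n752;899\n", {"separator": ";"})
print(comma_rows == semicolon_rows)
```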
Header is not used yet.
Author is the author or maintainer of the dataset.
Description describes the dataset.
Source is the source of the dataset.
This Observable notebook shows how to use this set of datasets in a unified manner.