Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Similarity Module (trajectroy distances) #13

Merged
merged 54 commits into from
Nov 17, 2020
Merged

Conversation

svenruf
Copy link
Collaborator

@svenruf svenruf commented Sep 16, 2020

  • Similarity Module to calculate tajectory distances (DTW, EDR, ...)
  • Storing the ids of extracted triplegs also in the original positionfixes -> pfs can be used for further calculations

svenruf and others added 30 commits March 11, 2020 16:26
Updated the CRS extraction from the GeoDataFrame to GeoPandas 0.7
Write Tripleg_id into positionfixes and return pfs DataFrame in the extract_triplegs method. (only case 1&2)
This reverts commit 83e1e2b.
from console_progressbar import ProgressBar


def e_dist_tuples(c1,c2):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be replaced with a real euclidean distance function

Copy link
Member

@henrymartin1 henrymartin1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work but we need to add some changes so that we can integrate it.

@@ -0,0 +1,2 @@
from .measures import *
from .detection import *
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dominikbucher Is this something that we should do in every submodule?

def e_dist_tuples(c1,c2):
return np.sqrt((c1[0]-c2[0])**2 + (c1[1]-c2[1])**2)

def similarity_matrix(data, method, field='tripleg_id', trsh=None, eps=None, dist=False, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are some critical adjustments that have to be made before we can merge this.

  • I don't think that tripleg similarity should be based on positionfixes with an ID column but only on the tripleg data (and then on the query points of the tripleg). This will allow us to cover use cases where we have triplegs but no positionfixes (e.g., GreenClass).
  • I don't think that the matrix assignment should be ID based. At the moment we are calculating a full distance matrix (meaning a distance between all pairs) but store it in a sparse matrix. I can see the need to maintain the relation between IDs and rows but it should be done by either returning an ID vector or by telling the user in the documentation that the order in the distance matrix is the same as the order in the dataframes that were given to the function.
  • The function should either call or be merged with the trackintel.geogr.distances.calculate_distance_matrix function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

edit: The function actually supports sparse calculation




def ses_extract(tpl):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doc missing

@@ -0,0 +1,215 @@
"""
Algorithms e_dtw and e_edr adapted from https://github.com/bguillouet/traj-dist
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of having a collection of trajectory distance metrics as it is done in measures.py. In calculate_distance_matrix we tried to be as compatible as possible to scikit-learn. I think here it would also be a good idea to try to implement valid metrics so that we can _piggyback _ on the pairwise_distances implementation of scikit (maybe it is even possible to use their parallelization).
I think this is mostly already done here.

@henrymartin1
Copy link
Member

Update for merge plan:

  • We use a single calculate_distance_matrix function (that we might rename at some point...) that is a wrapper around the scikit pairwise function with some extra options (such as vectorized haversine).
  • We will have a seperate file for trajectory-trajectory and point-point metrics
  • calculate_distance_matrix needs to distinguish the following cases:
    • XX distance and XY distance
    • check if metric is suitable for points or if metric is suitable for trajectories
  • The default will be to calculate the trajectory distances on the query points of the trajectory however in the future it should be possible to also do it based on positionfixes to be able to use the timestamp information.

@codecov
Copy link

codecov bot commented Nov 17, 2020

Codecov Report

Merging #13 (8030076) into master (b2a3fe6) will increase coverage by 2.05%.
The diff coverage is 60.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #13      +/-   ##
==========================================
+ Coverage   48.91%   50.96%   +2.05%     
==========================================
  Files          27       29       +2     
  Lines        1147     1238      +91     
  Branches      174      197      +23     
==========================================
+ Hits          561      631      +70     
- Misses        528      543      +15     
- Partials       58       64       +6     
Impacted Files Coverage Δ
setup.py 0.00% <0.00%> (ø)
trackintel/io/postgis.py 13.59% <0.00%> (ø)
trackintel/model/triplegs.py 79.16% <ø> (+1.89%) ⬆️
trackintel/geogr/trajectory_distances.py 35.71% <35.71%> (ø)
trackintel/model/positionfixes.py 80.00% <50.00%> (-1.82%) ⬇️
trackintel/geogr/distances.py 80.00% <78.78%> (+38.33%) ⬆️
trackintel/geogr/point_distances.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b2a3fe6...8030076. Read the comment docs.

requirements.txt Outdated Show resolved Hide resolved
@henrymartin1 henrymartin1 self-requested a review November 17, 2020 10:51
@henrymartin1
Copy link
Member

@hong2223 : I'd merge it now if you have no other comments

Copy link
Member

@henrymartin1 henrymartin1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.

@henrymartin1 henrymartin1 merged commit f9b97e1 into mie-lab:master Nov 17, 2020
@svenruf svenruf deleted the exp_sr branch January 7, 2021 14:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants