-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add an API endpoint for retrieving a route's shape #201
Conversation
Codecov Report
@@ Coverage Diff @@
## master #201 +/- ##
=========================================
- Coverage 95.8% 95.8% -0.01%
=========================================
Files 115 117 +2
Lines 2504 2526 +22
Branches 275 275
=========================================
+ Hits 2399 2420 +21
- Misses 103 104 +1
Partials 2 2
Continue to review full report at Codecov.
|
lib/gtfs/shape_point.ex
Outdated
shape_id: Shape.id(), | ||
lat: float(), | ||
lon: float(), | ||
sequence: non_neg_integer() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having sequence
is needed in the csv because you don't have order guarantees, but in elixir, I think we can just put these in an ordered list and not store this field.
If we don't store this field, we might be able to have a much simpler format in shape.ex
which could let us get rid of shape_point.ex
entirely:
# shape.ex
@type t :: %__MODULE__{
id: id(),
points: [{float(), float()}]
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do we know that the points coming in are in proper order?
Even if sequence
goes away I wouldn't want to lose the naming of lat
and lon
, I think that is very valuable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could sort by sequence
during the loading process, but then not store it. Similar to how we check route_type
in routes when loading but then don't store it.
Ah, yeah I see your point about the names. Could the struct/map with lat/lon naming live in shape.ex
? It's so related to Shape
that I don't think it really stands well enough on its own to have its own separate file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that is stymied by the fact that we have to pull in one point at a time, and only later are able to assemble them into a shape.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing parse helper in csv.ex
uses a simple map, so wouldn't be able to use it, but we can do the sort if we're okay some more custom parsing code.
...
|> CSV.decode!(headers: true) # [map()]
|> Enum.group_by(fn row -> row["shape_id"] end) # %{shape_id() => map()}
|> Map.new(fn {shape_id, rows_in_shape} ->
points =
rows_in_shape
|> Enum.sort_by(fn row -> Integer.parse(row["sequence"]) end)
|> Enum.map(fn row -> %{lat: ..., lon: ...} end)
{shape_id, %Shape{
id: shape_id,
points: points
}}
end)
It is more complicated parsing than the simple one struct per row, but I think that having a cleaner data type is more important than simple parsing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree. I think the existence of that property is causing no harm and having custom parsing code would make things more complicated and less extensible. Plus I think it would be important to have a type for that intermediate step anyway so you'd still have the same description of the sequence property.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having an unnecessary complicated data model does do harm, in that it adds complexity to every place that uses shapes. We'd have to add the sequence field whenever we make test data. We have to use more bandwidth and memory to store it. There are new meaningless/impossible states, like %{points: %{sequence: 2}, %{sequence: 1}}
. It'd be better to clear out that complexity once in the parsing and have a data model that represents what we want, rather than keep it around for every other place that uses shapes.
The point sorting function is actually wrong. Will fix momentarily. |
948431d
to
588974f
Compare
588974f
to
b1c157d
Compare
b1c157d
to
52bc50a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple small non-blocking comments.
lib/gtfs/data.ex
Outdated
end | ||
|
||
@spec shape_for_trip(Trip.t(), shapes_by_id()) :: Shape.t() | nil | ||
defp shape_for_trip(trip, shapes_by_id), do: Map.get(shapes_by_id, trip.shape_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a lot of shape-specific code to have in data.ex
. Is there any of it that would make sense to move to shape.ex
so that data.ex
can stay a little smaller?
lib/gtfs/shape.ex
Outdated
@@ -0,0 +1,22 @@ | |||
defmodule Gtfs.Shape do | |||
alias Gtfs.ShapePoint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to have ShapePoint
as a submodule inside this file, so it's more closely associated with the Shape
module? I don't think we'd ever want to use a ShapePoint
outside of the context of a Shape
.
Save shape data from GTFS.
Preliminary step for Shuttle Map | Show lines (no stops) for selected shuttle routes (shapes on the map)