HexTiles does behave strangely when used with hv.Dataset #3811

Closed
anderl80 opened this issue Jul 9, 2019 · 20 comments

@anderl80

anderl80 commented Jul 9, 2019

I prepare a map using

tile_opts = dict(height=plot_height, xaxis=None, yaxis=None, show_grid=False,
                 sizing_mode="scale_both", responsive=True)
map_tiles = gts.CartoLight.opts(style=dict(alpha=1), plot=tile_opts)

Then I plot the data, with lon/lat given in Mercator coordinates:

points = (hv.HexTiles(data[["lon", "lat", "value"]], kdims=["lon", "lat"],
                     vdims=["value"])
          .options(active_tools=['wheel_zoom'],tools=[hover])
          .opts(colorbar=False, alpha=0.3, aggregator=np.mean, gridsize=250))

map_tiles * points

which works perfectly and plots the hexbins beautifully where they are located (Germany). However, when I use an hv.Dataset instead (I want a dropdown for an additional key), like this

points_ds = gv.Dataset(data[["lon", "lat", "value", "date"]])
points = (points_ds.to(hv.HexTiles, kdims=["lon", "lat"], vdims=["value"])
          .options(active_tools=['wheel_zoom'], tools=[hover])
          .opts(colorbar=False, alpha=0.3, gridsize=250))

map_tiles * points

the results are pretty strange. Most of the time the plot does not appear, and when I click through the dropdown the hexbins show up at a different place on the map each time.

image

@philippjfr
Member

philippjfr commented Jul 9, 2019

In the .to example are the coordinates in mercator coordinates or lat/lon? If the latter you should change it to:

points = (points_ds.to(gv.HexTiles, kdims=["lon", "lat"], vdims=["value"])
          .options(active_tools=['wheel_zoom'], tools=[hover])
          .opts(colorbar=False, alpha=0.3, gridsize=250))

@anderl80
Author

anderl80 commented Jul 9, 2019

All coords are already projected to Mercator, i.e. "lat" and "lon" are Mercator values too. Maybe I'm wrong, but isn't your code example the same as mine above?

@philippjfr
Member

Yes, sorry, edited, I changed it to gv.HexTiles, but that won't help if it's already Mercator.

@poplarShift
Collaborator

Just wanted to leave a pointer to holoviz/geoviews#314, unsure if this issue is related to differences between hv.Dataset and gv.Dataset.

Also, it would be easier playing with this if you provided a Minimal Working Example.

@anderl80
Author

anderl80 commented Jul 10, 2019

from cartopy import crs as ccrs
import geoviews as gv
import holoviews as hv
import datashader as ds
from colorcet import bgy
import geoviews.tile_sources as gts
from holoviews.operation.datashader import datashade
gv.extension("bokeh")
import random
import numpy as np
import pandas as pd
import datetime

# map tiles
plot_width = 800
plot_height = 400
tile_opts = dict(height=plot_height, xaxis=None, yaxis=None, show_grid=False,
                 sizing_mode="scale_both", responsive=True)
map_tiles = gts.CartoLight.opts(style=dict(alpha=1), plot=tile_opts)

honolulu_coords = {"lat":21.455832, "lon":-157.974411}

def random_date(start, end):
    """Generate a random datetime between `start` and `end`"""
    return start + datetime.timedelta(
        # Get a random amount of seconds between `start` and `end`
        seconds=random.randint(0, int((end - start).total_seconds())),
    )
	
lon = [random.randint(0,100)/10000+honolulu_coords["lon"] for _ in range(0,1000)]
lat = [random.randint(0,100)/10000+honolulu_coords["lat"] for _ in range(0,1000)]
val = [random.randint(0,100) for _ in range(0,1000)]
date = [str(random_date(datetime.datetime(2019,1,1),datetime.datetime(2019,1,3)).date()) for _ in range(0,1000)]
data_gps = pd.DataFrame(list(zip(lon, lat, val, date)), columns=['lon', 'lat', 'val', 'date'])

# datashader and bokeh work with mercator proj
from datashader.geo import lnglat_to_meters
data_mercator = data_gps.copy(deep=True)
data_mercator.loc[:, 'lon'], data_mercator.loc[:, 'lat'] = lnglat_to_meters(data_mercator["lon"], data_mercator["lat"])
				 
# plot 1
points_ds = gv.Dataset(data_mercator[["lon", "lat", "val", "date"]])
points = (points_ds.to(hv.HexTiles, kdims=["lon", "lat"], vdims=["val"])
          .opts(colorbar=False, alpha=0.3, gridsize=250))
map_tiles * points

# plot 2 - same results as first plot
points_ds = gv.Dataset(data_gps[["lon", "lat", "val", "date"]])
points = (points_ds.to(gv.HexTiles, kdims=["lon", "lat"], vdims=["val"])
          .opts(colorbar=False, alpha=0.3, gridsize=250))
map_tiles * points

# plot 3 - different result from the other plots
points_ds = gv.Dataset(data_mercator[["lon", "lat", "val", "date"]], crs=ccrs.Mercator())
points = (points_ds.to(gv.HexTiles, kdims=["lon", "lat"], vdims=["val"])
          .opts(colorbar=False, alpha=0.3, gridsize=250))
map_tiles * points

The existence of the lnglat_to_meters function suggests that it is appropriate for this, but maybe the problem is that it converts to Web Mercator and not Mercator? But then it's strange that plots 1 and 2 are the same and plot 3 is different.

@anderl80
Author

anderl80 commented Jul 10, 2019

This is my actual issue
(unfortunately I cannot share the raw data)
To make things easy I switched completely to gps (wgs84) coords in the dataframe.
Doing a hextile map works fine with the overall data, so I guess I did everything right. But now I want a dropdown menu for the "date" column, so I create a dataset and so on (see below).

# beautifully shows the map
points = (gv.HexTiles(data_gps[["geohash_lon", "geohash_lat", "change"]], kdims=["geohash_lon", "geohash_lat"],
                     vdims=["change"])
          .options(active_tools=['wheel_zoom'],tools=[hover])
          .opts(colorbar=False, alpha=0.3, aggregator=np.mean, gridsize=250))

image

but

points_ds = gv.Dataset(data_gps[["geohash_lon", "geohash_lat", "change", "date"]])
points = (points_ds.to(gv.HexTiles, kdims=["geohash_lon", "geohash_lat"], vdims=["change"])
          .options(active_tools=['wheel_zoom'], tools=[hover])
          .opts(colorbar=False, alpha=0.3, gridsize=250))

which is the same except that I want a dropdown for the date, it gets messed up

image

and it jumps around when I select another date in the dropdown

image

I presume the error to be somewhere in hextiles because when I switch to Points everything works as expected:

points = (points_ds.to(gv.Points, kdims=["geohash_lon", "geohash_lat"], vdims=["change"])
          .options(active_tools=['wheel_zoom'], tools=[hover])
          .opts(colorbar=False))

image

@poplarShift
Collaborator

poplarShift commented Jul 10, 2019

Your example wasn't very minimal, but OK :) If I understand you correctly, part of your problem is that

  1. gv.Points(data_gps, ['lon', 'lat']), which is the same as gv.Points(data_gps, ['lon', 'lat'], crs=ccrs.PlateCarree()), and
  2. gv.Points(data_mercator, ['lon', 'lat'], crs=ccrs.Mercator())

do not agree. That means that lnglat_to_meters is different from the ccrs.Mercator() projection.

Is there any reason why you use the lnglat_to_meters function? Why not use the first option, where you are sure that it's lon/lat? Also, if you have to transform the lon/lat columns, is there a reason for not using ccrs.Mercator().transform_points?

@philippjfr
Member

lnglat_to_meters is equivalent to ccrs.GOOGLE_MERCATOR I believe.

Separately though, there's no reason to manually project anything when you use GeoViews. So I'd recommend that you simply let GeoViews handle the projecting.
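
If the data has already been converted with lnglat_to_meters, one option is to invert the conversion and hand plain lon/lat back to GeoViews. The sketch below writes out the spherical Web Mercator formulas in both directions with the standard library only; it assumes lnglat_to_meters uses these formulas, and the two function names are hypothetical helpers, not library API.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Forward projection: degrees to Web Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def web_mercator_to_lonlat(x, y):
    """Inverse projection: Web Mercator metres back to degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lon, lat


# Round trip for the Honolulu point from the example above
x, y = lonlat_to_web_mercator(-157.974411, 21.455832)
lon, lat = web_mercator_to_lonlat(x, y)
print(x, y, lon, lat)
```

The round trip recovers the original degrees, so either representation can be used as long as the crs declared to GeoViews matches the one the numbers are actually in.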

@anderl80
Author

anderl80 commented Jul 11, 2019

Yes, that's one thing: the crs do not seem to match. But the problem exists with all crs, even when I use vanilla GPS (WGS84) coordinates as suggested with gv.
Everything works fine when using gv.HexTiles, gv.Points, ... directly. But when using a Dataset and the .to method, things go weird in combination with gv.HexTiles (with gv.Points, everything is fine).
To show this, I needed a bit more than a minimal example, as seen above.

@anderl80
Author

anderl80 commented Jul 11, 2019

lnglat_to_meters is equivalent to ccrs.GOOGLE_MERCATOR I believe.

Separately though, there's no reason to manually project anything when you use GeoViews. So I'd recommend that you simply let GeoViews handle the projecting.

That's what I did (please see my second example #3811 (comment)) --> data_gps are vanilla gps coordinates in lon/lat.

@poplarShift
Collaborator

Yes, that's one thing: the crs do not seem to match. But the problem exists with all crs, even when I use vanilla GPS (WGS84) coordinates as suggested with gv.
Everything works fine when using gv.HexTiles, gv.Points, ... directly. But when using a Dataset and the .to method, things go weird in combination with gv.HexTiles (with gv.Points, everything is fine).
To show this, I needed a bit more than a minimal example, as seen above.

First, your map_tiles don't render for me with the options you give. So when I comment out the options, then your three plots have the same axis ranges for plot 1 and 2, but different for plot 3, and the axis ranges for each particular plot do not depend at all on whether I use Points or HexTiles.

Second, are the map_tiles really crucial for your minimal example? I cannot see a difference in axis ranges of the hextiles with or without.

Third, could you post a screenshot of what the below code gives for you?

Screen Shot 2019-07-11 at 9 15 36 AM

from cartopy import crs as ccrs
import geoviews as gv
import holoviews as hv
import datashader as ds
from colorcet import bgy
import geoviews.tile_sources as gts
from holoviews.operation.datashader import datashade
gv.extension("bokeh")
import random
import numpy as np
import pandas as pd
import datetime

# map tiles
plot_width = 800
plot_height = 400
tile_opts = dict(height=plot_height, xaxis=None, yaxis=None, show_grid=False,
                 sizing_mode="scale_both", responsive=True)
map_tiles = gts.CartoLight#.opts(style=dict(alpha=1), plot=tile_opts)

honolulu_coords = {"lat":21.455832, "lon":-157.974411}

def random_date(start, end):
    """Generate a random datetime between `start` and `end`"""
    return start + datetime.timedelta(
        # Get a random amount of seconds between `start` and `end`
        seconds=random.randint(0, int((end - start).total_seconds())),
    )
	
lon = [random.randint(0,100)/10000+honolulu_coords["lon"] for _ in range(0,1000)]
lat = [random.randint(0,100)/10000+honolulu_coords["lat"] for _ in range(0,1000)]
val = [random.randint(0,100) for _ in range(0,1000)]
date = [str(random_date(datetime.datetime(2019,1,1),datetime.datetime(2019,1,3)).date()) for _ in range(0,1000)]
data_gps = pd.DataFrame(list(zip(lon, lat, val, date)), columns=['lon', 'lat', 'val', 'date'])

# datashader and bokeh work with mercator proj
from datashader.geo import lnglat_to_meters
data_mercator = data_gps.copy(deep=True)
data_mercator.loc[:, 'lon'], data_mercator.loc[:, 'lat'] = lnglat_to_meters(data_mercator["lon"], data_mercator["lat"])
				 
# plot 1
points_ds = gv.Dataset(data_mercator[["lon", "lat", "val", "date"]])
points = (points_ds.to(hv.HexTiles, kdims=["lon", "lat"], vdims=["val"])
          .opts(colorbar=False, alpha=0.3, gridsize=20))
p1 = map_tiles * points

# plot 2 - same results as first plot
points_ds = gv.Dataset(data_gps[["lon", "lat", "val", "date"]])
points = (points_ds.to(gv.HexTiles, kdims=["lon", "lat"], vdims=["val"])
          .opts(colorbar=False, alpha=0.3, gridsize=20))
p2 = map_tiles * points

# plot 3 - different result from the other plots
points_ds = gv.Dataset(data_mercator[["lon", "lat", "val", "date"]], crs=ccrs.Mercator())
points = (points_ds.to(gv.HexTiles, kdims=["lon", "lat"], vdims=["val"])
          .opts(colorbar=False, alpha=0.3, gridsize=20))
p3 = map_tiles * points

hv.output(p1)
hv.output(p2)
hv.output(p3)

@anderl80
Author

image

@philippjfr
Member

So as long as I use ccrs.GOOGLE_MERCATOR instead of ccrs.Mercator() I can't reproduce any issues with your example.

@anderl80
Author

anderl80 commented Jul 12, 2019

So maybe we should now try to add a map; maybe this is related to the map.

Still, my example using raw GPS coordinates with GeoViews and no declared reference crs does not work. I will redo the example later.

I also have problems with raw GPS coordinates (WGS84) and no crs given in a few other GeoViews examples. I got an error related to PlateCarree...

@poplarShift
Collaborator

poplarShift commented Jul 12, 2019

So it looks like everything is fine then, right?

(Not giving the crs argument, or a wrong one, may lead to wrong results.)
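
As a rough guard against declaring the wrong crs, the coordinate ranges can be checked before plotting. The helper below is a hypothetical heuristic, not part of GeoViews or cartopy: it only tests whether all values fit inside the valid degree ranges, so projected coordinates that happen to lie within a few hundred metres of the projection origin would fool it.

```python
def looks_like_lonlat(xs, ys):
    """Heuristic: True if every value fits inside the valid lon/lat degree ranges."""
    return all(-180 <= x <= 180 for x in xs) and all(-90 <= y <= 90 for y in ys)


# Degrees around Honolulu, as in the example above
degrees_ok = looks_like_lonlat([-157.974411, -157.9700], [21.455832, 21.4600])

# Approximate Web Mercator metres for the same area
metres_ok = looks_like_lonlat([-17585631.0, -17585140.0], [2443700.0, 2444200.0])

print(degrees_ok, metres_ok)
```

Values that pass the check are plausibly lon/lat (PlateCarree); values that fail are in some projected system and need the matching crs declared explicitly.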

@anderl80
Author

No, see the message above, the original problem still persists.

(a) The crs issue is now solved (what is the crs for WGS84?), although I thought plain GPS coordinates did not need an explicit crs. (b) The plot works as long as no background map is used. However, once the map is added, the original problem persists.

I will give an updated code example tomorrow.

@poplarShift
Collaborator

I see, sorry, I got sidetracked by the CRS issue. (CRS for plain lat/lon is indeed PlateCarree.)

@anderl80
Author

anderl80 commented Jul 18, 2019

No worries, thanks for the support and sorry for distracting from the main problem.

I was able to "isolate" the problem further (still sorry for the over-engineered example :-)). Please try to execute the following.

# generate random data points for GitHub issue
import random
import datetime
import numpy as np
import pandas as pd

muc_coords = {"lat": 48.137154, "lon": 11.576124}

# generate data
def random_date(start, end):
    """Generate a random datetime between `start` and `end`"""
    return start + datetime.timedelta(
        # Get a random amount of seconds between `start` and `end`
        seconds=random.randint(0, int((end - start).total_seconds())),
    )
lon = [random.randint(0,100)/10000+muc_coords["lon"] for _ in range(0,1000)]
lat = [random.randint(0,100)/10000+muc_coords["lat"] for _ in range(0,1000)]
val = [random.randint(0,100) for _ in range(0,1000)]
date = [str(random_date(datetime.datetime(2019,1,1),datetime.datetime(2019,1,10)).date()) for _ in range(0,1000)]
data = pd.DataFrame(list(zip(lon, lat, val, date)), columns=['lon', 'lat', 'change', 'date'])

# [x] check that there's no day without data
data.date.value_counts().sort_index()

import geoviews as gv
gv.extension("bokeh")

hexs = (gv.HexTiles(data, kdims=["lon", "lat"],
                     vdims=["change", "date"])
          .options(active_tools=['wheel_zoom'],tools=["hover"])
          .opts(colorbar=False, alpha=0.3, aggregator=np.mean))
		  
# everything works fine
hexs

# now we try to add a dropdown menu
hexs_ds = gv.Dataset(data)

hexs = (hexs_ds.to(gv.HexTiles, kdims=["lon", "lat"], vdims=["change"])
          .options(active_tools=['wheel_zoom'], tools=["hover"])
          .opts(colorbar=False, alpha=0.3, aggregator=np.mean))
		  
# click through dropdown menu and find shifted data (zoom out)
hexs

# to double-check try points instead; everything works as expected
points = (hexs_ds.to(gv.Points, kdims=["lon", "lat"], vdims=["change"])
          .options(active_tools=['wheel_zoom'], tools=["hover"])
          .opts(colorbar=False))

No 1 gives me
image

No 2 gives me (already zoomed out because the starting graph was empty)
image

Exactly the same happens if I use holoviews only instead of geoviews.
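
To check whether the shift comes from the data or from the plotting layer, the bounding box of the points can be computed per date without involving HoloViews at all. This is a minimal sketch with a hypothetical helper, bounds_by_key, and a few made-up sample rows near Munich; it says nothing about the cause of the shift, it only rules the input data in or out.

```python
from collections import defaultdict


def bounds_by_key(rows):
    """Map each key to (min_lon, min_lat, max_lon, max_lat) over its rows."""
    grouped = defaultdict(list)
    for lon, lat, key in rows:
        grouped[key].append((lon, lat))
    result = {}
    for key, pts in grouped.items():
        lons = [p[0] for p in pts]
        lats = [p[1] for p in pts]
        result[key] = (min(lons), min(lats), max(lons), max(lats))
    return result


# Illustrative rows only: (lon, lat, date)
rows = [
    (11.5761, 48.1371, "2019-01-01"),
    (11.5850, 48.1460, "2019-01-01"),
    (11.5790, 48.1400, "2019-01-02"),
    (11.5820, 48.1450, "2019-01-02"),
]
bounds = bounds_by_key(rows)
for date, box in sorted(bounds.items()):
    print(date, box)
```

With the DataFrame from the example, the rows could be taken from data[["lon", "lat", "date"]].itertuples(index=False). If every date's box sits at the expected location, the input is fine and the jumping hexbins originate in the plotting layer.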

@philippjfr
Member

Appears to have been fixed in #4182

@philippjfr philippjfr added this to the v1.13.0 milestone Jan 15, 2020

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 24, 2024