weights function enhancement #1751
Comments
add adaptive kernel weights for social weights creation
v179 |
@jkoschinsky I would leave it open, since I haven't received feedback from Luc yet, and I am not sure whether @Ashitacarl only verified the UI part or did a thorough check of all the possible combinations and outputs of social weights creation. |
sounds good, @lixun910. |
@lixun910 and @jkoschinsky Sure I will test functionalities beyond the UI. Is there a routine/list to follow to check all the possible combinations using weights (GeoDa workbook?)? |
@Ashitacarl -- great, thanks. There's no script to follow since this is new functionality, so just try to set it up so you increase the likelihood of breaking it (e.g. use grouped variables, tables with missing values, and any other data characteristics that aren't sample/textbook data). |
GeoDa 1.12.1.181 (macOS Mojave, 10.14.1). Dec. 17 build. @lixun910
No problem with grouped variables or table with missing values. No problem creating a project file. No problem with Moran scatter plots, Cluster Maps, Cluster Analysis. I tested using datasets from external sources. |
The distance metric specifies how the distance in attribute space is calculated, nothing to do
with the “social” aspect of the distance. So should stay as is.
… On Dec 19, 2018, at 4:09 PM, Ashitacarl ***@***.***> wrote:
GeoDa 1.12.1.181 (macOS Mojave, 10.14.1). Dec. 17 build.
@lixun910 <https://github.com/lixun910>
Tested social distance weights. A few observations:
When creating the weights, GeoDa requires a distance metric to be specified: either Manhattan or Euclidean distance. I guess they are just "labels" or units in this case. Do we need to create another "distance metric" name for this social-distance purpose? E.g. custom distance or social distance.
<https://user-images.githubusercontent.com/45187310/50249668-45147800-03a4-11e9-87f0-675da39b4084.png>
In the weights manager dialogue, the "distance var" is not specified. Shall we include that? "Type" is specified as "threshold". I am not sure if this name is clear enough.
Also, in the same image, the min neighbor is 0, which should not be the case since I used the default distance bandwidth/threshold (which guarantees at least one neighbor). Relatedly, when I lowered the bandwidth, I did not receive any warning message about creating isolates.
<https://user-images.githubusercontent.com/45187310/50250143-9f620880-03a5-11e9-9936-e1c864d8fc83.png>
In Multivariate Local Geary's C Cluster Map, one cannot select multiple time points from a single group. Could this be a problem? The same is also found in Multivariate Local Join Count.
<https://user-images.githubusercontent.com/45187310/50250999-0e406100-03a8-11e9-9cc9-1b17ea8a4e01.png>
No problem with grouped variables or table with missing values. No problem creating a project file. No problem with Moran scatter plots, Cluster Maps, Cluster Analysis. I tested using datasets from external sources.
|
GeoDaCenter#1751 (socio-weights creation): when lowering the bandwidth causes the minimum neighbor count to be 0, GeoDa should raise a warning message about creating isolates.
Social-weights creation: In the weights manager dialogue, the "distance var" is not displayed.
In Multivariate Local Geary's C Cluster Map, one cannot select multiple time points from a single group.
Apply the same fix from GeoDaCenter#1751 to Multivariate Local Join Count.
@Ashitacarl can you help to verify it? Thanks! |
GeoDa 1.12.1.189 (macOS Mojave, 10.14.1). Jan. 15 build. Datasets: Guerry sample and external. One problem remains: when there are outliers in the distance var, GeoDa produces isolates. Missing values are also problematic here. I don't know the underlying algorithm, but observations with missing values seem to have a lot of neighbors. See figs and explanations below. All other fixes verified. In the first figure, the selected polygons have missing values (except one observation) in the distance var. They seem to have a lot of neighbors (see column NUM_NBRS). The first selected polygon, an outlier, has 0 neighbors. I used the default bandwidth, and no message popped up warning me about potential isolates (and it actually creates isolates). The figure below is another example of creating isolates without warning (default bandwidth used). The second example uses the Guerry sample data and the 'Area' variable as the distance var. |
I need to check more carefully, but this could have something to do with the scale of the variable. Since all the distances are based on squared values and if the original values are large to begin with, there may be overflow issues. One option would be to standardize all the variables before computing the distances, similar to what is done in the cluster modules. |
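To make the standardization suggestion concrete, here is a minimal Python sketch (not GeoDa's code). The function names, the use of the sample standard deviation, and the data values are all assumptions chosen for illustration. It shows how a large-scale variable dominates the raw distance, and how z-scores put both variables on the same footing.

```python
import math

def standardize(column):
    """Return z-scores for a list of floats (sample std, n - 1; an assumption)."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / std for v in column]

def euclidean(a, b):
    """Euclidean distance between two observations in attribute space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Made-up data: one large-scale and one small-scale variable.
area = [1000.0, 2000.0, 4000.0]
rate = [0.1, 0.2, 0.4]

raw = list(zip(area, rate))
scaled = list(zip(standardize(area), standardize(rate)))

# The raw distance is driven almost entirely by `area`;
# after standardization both variables contribute equally.
raw_distance = euclidean(raw[0], raw[2])
scaled_distance = euclidean(scaled[0], scaled[2])
```

With the raw values, `rate` has practically no influence on the distance; after standardization its contribution equals that of `area`, because the two columns differ only by a scale factor.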
@Ashitacarl Thanks! I was able to replicate this and found the bug: when "Manhattan distance" is selected, the bandwidth is computed with a square root, which is incorrect; the square root should only be applied when "Euclidean distance" is selected. This will be fixed in the next build, V191. @lanselin Yes, in the current implementation, the "Transformation:" options, which are the same as in the cluster methods, have been added to allow users to scale the variables. |
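The following Python sketch illustrates why a square root wrongly applied to a Manhattan bandwidth can create isolates. It is not the GeoDa implementation: the helper names and data are hypothetical, and it assumes the default bandwidth is the largest nearest-neighbor distance, which is one way to guarantee at least one neighbor per observation.

```python
import math

def manhattan(a, b):
    """Manhattan (city-block) distance in attribute space."""
    return sum(abs(x - y) for x, y in zip(a, b))

def default_bandwidth(points, dist):
    """Largest nearest-neighbor distance (assumed default): no isolates at this threshold."""
    return max(
        min(dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    )

def neighbor_counts(points, dist, bandwidth):
    """Number of neighbors within the bandwidth for each observation."""
    return [
        sum(1 for j, q in enumerate(points) if j != i and dist(p, q) <= bandwidth)
        for i, p in enumerate(points)
    ]

# Made-up one-variable data; the last observation is an outlier.
points = [(0.0,), (1.0,), (2.0,), (18.0,)]

bandwidth = default_bandwidth(points, manhattan)   # distance from the outlier to its nearest neighbor
buggy_bandwidth = math.sqrt(bandwidth)             # square root wrongly applied to a Manhattan distance

correct_counts = neighbor_counts(points, manhattan, bandwidth)
buggy_counts = neighbor_counts(points, manhattan, buggy_bandwidth)
```

With the correct bandwidth every observation keeps at least one neighbor; with the shrunken bandwidth the outlier ends up with none, which matches the symptom reported above.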
@Ashitacarl Empty values are not handled in the current implementation. This is not a problem when creating spatial weights, since all geometries are valid, but it is a case unique to socio-weights creation. I think we should simply treat observations with empty values as "islands" when creating socio-weights (will be in V191). Let me know if I am wrong @lanselin. Thanks! |
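A minimal Python sketch of the proposed handling, under stated assumptions: `threshold_weights` is a hypothetical helper (not a GeoDa function), missing values are represented as `None` or NaN, and Euclidean distance is used. Observations with any missing value get no neighbors and are never assigned as neighbors, and a warning is raised when islands exist.

```python
import math
import warnings

def threshold_weights(rows, bandwidth):
    """Distance-band neighbors; rows with missing values become islands (hypothetical helper)."""
    def complete(row):
        return all(v is not None and not math.isnan(v) for v in row)

    valid = [i for i, row in enumerate(rows) if complete(row)]
    neighbors = {i: [] for i in range(len(rows))}
    for i in valid:
        for j in valid:
            if i != j and math.dist(rows[i], rows[j]) <= bandwidth:
                neighbors[i].append(j)

    # Islands include incomplete rows and any complete row with no neighbor in range.
    islands = [i for i, nbrs in neighbors.items() if not nbrs]
    if islands:
        warnings.warn(f"{len(islands)} observation(s) have no neighbors: {islands}")
    return neighbors, islands
```

For example, `threshold_weights([(0.0,), (1.0,), (None,), (10.0,)], 2.0)` would flag both the row with the missing value and the outlier as islands.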
GeoDaCenter#1751 (socio-weights creation): when "Manhattan distance" is selected, the bandwidth is computed with a square root, which is incorrect; the square root should only be applied when "Euclidean distance" is selected.
Treat observations with empty values as "islands" when creating socio-weights; raise a warning if islands are detected.
GeoDa 1.12.1.191 (macOS Mojave, 10.14.1). Jan. 16 build. Fix verified. @lixun910 |
Fix a bug: if there are many similar values among observations, KNN weights generate more than K neighbors for an observation.
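The thread does not say what caused this in GeoDa, so the Python sketch below only illustrates one common way ties produce more than K neighbors: selecting everything within the K-th smallest distance. Both function names and the data are hypothetical. Breaking ties by index and slicing to K keeps the count exact.

```python
def knn_with_cutoff(values, i, k):
    """Tie-prone variant: keep every observation within the k-th smallest distance."""
    dists = sorted(abs(values[i] - v) for j, v in enumerate(values) if j != i)
    cutoff = dists[k - 1]
    return [j for j, v in enumerate(values) if j != i and abs(values[i] - v) <= cutoff]

def knn_exact(values, i, k):
    """Tie-safe variant: sort by (distance, index) and keep exactly k."""
    order = sorted((abs(values[i] - v), j) for j, v in enumerate(values) if j != i)
    return [j for _, j in order[:k]]

# Made-up data with many identical values.
values = [5.0, 5.0, 5.0, 5.0, 9.0]

tied = knn_with_cutoff(values, 0, 2)   # three neighbors, although k is 2
exact = knn_exact(values, 0, 2)        # exactly two neighbors
```

Breaking ties by index is an arbitrary but deterministic choice; other tie-breaking rules would work as long as the result is truncated to K.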
GeoDa Windows 1.14 Fix verified. No errors detected in functionality. Warning appears to indicate the presence of isolates. One potential quality of usage update: when selecting multiple distance weight variables, it would be nice to have it display more than three variables at a time. When the list of variables is lengthy, it becomes tedious to scroll up and down while having to remember the variables already selected. Perhaps have it turn into a drop-down list or allow the user to increase the size of the box to display more than three variables? |
Fix verified. The UI enhancement suggested by @bsalas11 is not supported by wxWidgets. Will keep an eye on possible enhancements. |
from Luc:
1. extend the distance weights functionality to higher dimensions, beyond 2
right now, we have an x and y coordinate, which should remain the default, but
it would be nice to be able to add more variables, so that a distance in multi-attribute
space could be computed
include Manhattan distance?