[Fixes #4025] Regression with uploading a shapefile with no ascii characters #4026

Merged
merged 6 commits into from
Oct 31, 2018

afabiani
Member

No description provided.

@codecov

codecov bot commented Oct 26, 2018

Codecov Report

Merging #4026 into master will decrease coverage by 0.01%.
The diff coverage is 18.88%.

@@            Coverage Diff             @@
##           master    #4026      +/-   ##
==========================================
- Coverage   54.19%   54.17%   -0.02%     
==========================================
  Files         235      235              
  Lines       15186    15224      +38     
  Branches     2263     2270       +7     
==========================================
+ Hits         8230     8248      +18     
- Misses       6224     6241      +17     
- Partials      732      735       +3

@capooti
Member

capooti commented Oct 26, 2018

Hi @afabiani thanks for helping with this.
I have tested this with two shapefiles and unfortunately I got this error:
Error occurred creating table ERROR: column "__" specified more than once

I believe this is happening because the non-ASCII characters are replaced with an "_"?
Note that this happens with shapefiles that could be successfully uploaded to and downloaded from GeoNode 2.7.x. I am attaching one of them in case you want to test (you may even add it to the test suite):

zhejiang_yangcan_yanyu.zip
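
For readers following along, here is a minimal sketch of how such a collision can arise, assuming each non-ASCII character is replaced by a single underscore. The field names and both helper functions are hypothetical and are not taken from GeoNode, GeoServer, or the attached shapefile.

```python
import re


def naive_sanitize(name):
    """Replace every non-ASCII character with an underscore."""
    return re.sub(r"[^\x00-\x7F]", "_", name)


def make_unique(names):
    """Append a numeric suffix until each name is unique within the list."""
    used = set()
    result = []
    for name in names:
        candidate, n = name, 0
        while candidate in used:
            n += 1
            candidate = "{}_{}".format(name, n)
        used.add(candidate)
        result.append(candidate)
    return result


# Hypothetical field names: two distinct two-character names collapse to "__".
fields = ["ID", "名称", "地址", "name"]
sanitized = [naive_sanitize(f) for f in fields]
unique = make_unique(sanitized)

print(sanitized)  # the two non-ASCII names are now identical
print(unique)     # the second one gets a numeric suffix
```

Two different names of the same length become indistinguishable after the replacement, which would explain a "specified more than once" error when the table is created.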

@afabiani
Member Author

I'll try, but I'm afraid this is caused by the GeoServer Importer.

In the meantime, can you try switching to the geoserver.rest UPLOADER and testing again? I'll also run tests on my side as soon as possible.

@capooti
Member

capooti commented Oct 26, 2018

You are correct, it is caused by the importer (I have tested it without using GeoNode).
Unfortunately I cannot switch to REST as we need GeoJSON support. Any idea if this will be fixed in GS 2.15? I remember you wrote that there should be a lot of improvements there.

I tried the REST uploader to give you better information, and it is also broken. I got this error:

('GeoServer gave non-XML response for [GET http://localhost:8080/geoserver/rest/workspaces/geonode/datastores/wmdata/featuretypes/zhejiang_yangcan_yanyu_1.xml]: \n zhejiang_yangcan_yanyu_1\n zhejiang_yangcan_yanyu_1\n \n geonode\n \n \n \n \n features\n zhejiang_yangcan_yanyu_1\n \n GEOGCS["Xian 1980", \n DATUM["Xian 1980", \n SPHEROID["IAG 1975", 6378140.0, 298.257, AUTHORITY["EPSG","7049"]], \n AUTHORITY["EPSG","6610"]], \n PRIMEM["Greenwich", 0.0, AUTHORITY["EPSG","8901"]], \n UNIT["degree", 0.017453292519943295], \n AXIS["Geodetic longitude", EAST], \n AXIS["Geodetic latitude", NORTH], \n AUTHORITY["EPSG","4610"]]\n EPSG:4610\n \n 119.57\n 121.55\n 28.0\n 31.02\n EPSG:4610\n \n \n 119.57000000000001\n 121.55\n 27.999999805135044\n 31.019999792430077\n EPSG:4326\n \n FORCE_DECLARED\n true\n \n geonode:wmdata\n \n \n 0\n 0\n false\n false\n false\n \n \n the_geom\n 0\n 1\n true\n org.locationtech.jts.geom.Point\n \n \n ID\n 0\n 1\n true\n java.lang.Long\n \n \n _{+\n 0\n 1\n true\n java.lang.String\n \n \n _�\xef\xbf\xbd\n 0\n 1\n true\n java.lang.String\n \n \n name\n 0\n 1\n true\n java.lang.String\n \n \n X\n 0\n 1\n true\n java.lang.Double\n \n \n Y\n 0\n 1\n true\n java.lang.Double\n \n \n _\xef\xbf\xbd\n 0\n 1\n true\n java.lang.String\n \n \n', ParseError(ExpatError('reference to invalid character number: line 71, column 13',),))
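
As an illustration of the `ExpatError` above, here is a minimal sketch of one way this message can be triggered: a numeric character reference that points at a character XML 1.0 does not allow. The XML fragment is hypothetical and much smaller than the real GeoServer response; whether the real response contained exactly this kind of reference is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: "&#27;" refers to a control character,
# which is not a legal character in XML 1.0.
bad_xml = "<attribute><name>_&#27;</name></attribute>"

try:
    ET.fromstring(bad_xml)
    parse_error = None
except ET.ParseError as exc:
    parse_error = exc

print(parse_error)
```

A client that parses the REST response strictly will fail in this way even though the server answered, which is consistent with the "non-XML response" wording in the error.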

@afabiani
Member Author

@capooti I've just committed a few updates to this PR (branch ISSUE_4025) which should solve the issue, at least when using the REST importer.

Unfortunately, the GeoServer Importer requires changes on the GeoServer side.

Please give it a try when you can.

@afabiani afabiani force-pushed the ISSUE_4025 branch 2 times, most recently from 009a3bb to fbe526c Compare October 29, 2018 12:32
Member

@capooti capooti left a comment

thanks @afabiani
I tested this with the REST importer, and the shapefile is correctly uploaded to GeoServer.
I see a few anomalies in GeoNode, though:

  • layer attributes seem to be missing
  • when I identify a feature, only the fields with ASCII characters are shown; the other fields are discarded

It is an improvement anyway :)

@afabiani
Member Author

@capooti roger; going to take a look

Not sure I can do anything about "identify feature". That request goes directly to GeoServer.

I will try to understand what's going on with attributes, though.

@capooti
Member

capooti commented Oct 30, 2018

yes I agree with you @afabiani
I think the only way would be to have the uploader rename the fields

@afabiani
Member Author

@capooti I did more tests on this and found that the problem is in the following method:

https://github.com/GeoNode/geonode/blob/master/geonode/utils.py#L1011

Basically, the encoding fails, but the exception is hidden and no renaming is done. However, even after exposing the exception, I cannot currently find a way to decode the field names. At least none of the charsets available on the upload form actually works with the sample files you provided.

Any hint on this?
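
To make the "hidden exception" point concrete, here is a minimal sketch contrasting a decoder that swallows the failure with one that surfaces it. All names (`FieldNameDecodeError`, `decode_hiding_errors`, `decode_or_raise`) and the sample bytes are hypothetical and do not reproduce the code in `geonode/utils.py`.

```python
class FieldNameDecodeError(Exception):
    """Hypothetical error type for attribute names that cannot be decoded."""


def decode_hiding_errors(raw, charset):
    """Swallow any failure: the caller cannot tell that decoding went wrong."""
    try:
        return raw.decode(charset)
    except Exception:
        return None


def decode_or_raise(raw, charset):
    """Surface the failure with a message that names the charset used."""
    try:
        return raw.decode(charset)
    except (UnicodeDecodeError, LookupError) as exc:
        raise FieldNameDecodeError(
            "Could not decode attribute name using charset '{}'".format(charset)
        ) from exc


raw_name = b"\xc3\xfb"  # hypothetical bytes that are not valid UTF-8

hidden = decode_hiding_errors(raw_name, "utf-8")

try:
    decode_or_raise(raw_name, "utf-8")
    surfaced = None
except FieldNameDecodeError as exc:
    surfaced = str(exc)

print(hidden)
print(surfaced)
```

With the first variant the rename step is simply skipped; with the second the user at least learns which charset was tried.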

@capooti
Member

capooti commented Oct 30, 2018

I won't have time to look into this in the next two days, but thanks for having a look at it for now!

@afabiani
Member Author

@capooti with this last commit I was finally able to correctly convert the shapefile columns before ingesting them, according to the specified CHARSET.

In particular, with your shapefiles I was successful with "Windows CP1258".

image
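
A minimal sketch of probing which charsets can decode a raw field name, assuming strict decoding. The helper `charsets_that_decode` and the sample bytes are hypothetical; the bytes are not read from the attached shapefile.

```python
def charsets_that_decode(raw, candidates):
    """Return the candidate charsets that decode `raw` without raising."""
    accepted = []
    for charset in candidates:
        try:
            raw.decode(charset)  # strict error handling is the default
        except (UnicodeDecodeError, LookupError):
            continue
        accepted.append(charset)
    return accepted


# Hypothetical raw field name: bytes that are not valid UTF-8.
raw_name = b"\xc3\xfb\xb3\xc6"
accepted = charsets_that_decode(raw_name, ["utf-8", "ascii", "cp1258"])

print(accepted)
```

Decoding without an error does not by itself prove that the charset is the right one: a single-byte code page maps most byte values to some character, so the resulting names are still worth checking by eye.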

@afabiani afabiani force-pushed the ISSUE_4025 branch 2 times, most recently from 184909e to 2af5eaf Compare October 31, 2018 11:06
@afabiani
Member Author

@capooti by the way, it works with GeoServer Importer too now.

@capooti
Member

capooti commented Oct 31, 2018

@afabiani with both REST and the importer I am getting this error:

Could not decode SHAPEFILE attributes by using the specified charset 'UTF-8'.
Traceback (most recent call last):
  File "/home/ubuntu/camp-2.10.x/geonode/geonode/layers/views.py", line 211, in layer_upload
    metadata_upload_form=form.cleaned_data["metadata_upload_form"])
  File "/home/ubuntu/camp-2.10.x/geonode/geonode/layers/utils.py", line 595, in file_upload
    defaults=defaults
  File "/home/ubuntu/camp-2.10.x/env/local/lib/python2.7/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/home/ubuntu/camp-2.10.x/env/local/lib/python2.7/site-packages/modeltranslation/manager.py", line 413, in get_or_create
    return super(MultilingualQuerySet, self).get_or_create(**kwargs)
  File "/home/ubuntu/camp-2.10.x/env/local/lib/python2.7/site-packages/django/db/models/query.py", line 466, in get_or_create
    return self._create_object_from_params(lookup, params)
  File "/home/ubuntu/camp-2.10.x/env/local/lib/python2.7/site-packages/django/db/models/query.py", line 500, in _create_object_from_params
    obj = self.create(**params)
  File "/home/ubuntu/camp-2.10.x/env/local/lib/python2.7/site-packages/modeltranslation/manager.py", line 405, in create
    return super(MultilingualQuerySet, self).create(**kwargs)
  File "/home/ubuntu/camp-2.10.x/env/local/lib/python2.7/site-packages/django/db/models/query.py", line 394, in create
    obj.save(force_insert=True, using=self.db)
  File "/home/ubuntu/camp-2.10.x/env/local/lib/python2.7/site-packages/polymorphic/models.py", line 82, in save
    return super(PolymorphicModel, self).save(*args, **kwargs)
  File "/home/ubuntu/camp-2.10.x/env/local/lib/python2.7/site-packages/django/db/models/base.py", line 808, in save
    force_update=force_update, update_fields=update_fields)
  File "/home/ubuntu/camp-2.10.x/env/local/lib/python2.7/site-packages/django/db/models/base.py", line 833, in save_base
    update_fields=update_fields,
  File "/home/ubuntu/camp-2.10.x/env/local/lib/python2.7/site-packages/django/dispatch/dispatcher.py", line 193, in send
    for receiver in self._live_receivers(sender)
  File "/home/ubuntu/camp-2.10.x/geonode/geonode/layers/models.py", line 547, in pre_save_layer
    base_file, info = instance.get_base_file()
  File "/home/ubuntu/camp-2.10.x/geonode/geonode/layers/models.py", line 244, in get_base_file
    self)
  File "/home/ubuntu/camp-2.10.x/geonode/geonode/utils.py", line 974, in check_shp_columnnames
    return fixup_shp_columnnames(inShapefile, layer.charset)
  File "/home/ubuntu/camp-2.10.x/geonode/geonode/utils.py", line 1056, in fixup_shp_columnnames
    "Could not decode SHAPEFILE attributes by using the specified charset '{}'.".format(charset))
GeoNodeException: Could not decode SHAPEFILE attributes by using the specified charset 'UTF-8'.
Internal Server Error: /layers/upload
Traceback (most recent call last):
  File "/home/ubuntu/camp-2.10.x/env/local/lib/python2.7/site-packages/django/core/handlers/exception.py", line 41, in inner
    response = get_response(request)
  File "/home/ubuntu/camp-2.10.x/env/local/lib/python2.7/site-packages/django/core/handlers/base.py", line 249, in _legacy_get_response
    response = self._get_response(request)
  File "/home/ubuntu/camp-2.10.x/env/local/lib/python2.7/site-packages/django/core/handlers/base.py", line 187, in _get_response
    response = self.process_exception_by_middleware(e, request)
  File "/home/ubuntu/camp-2.10.x/env/local/lib/python2.7/site-packages/django/core/handlers/base.py", line 185, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/home/ubuntu/camp-2.10.x/env/local/lib/python2.7/site-packages/django/contrib/auth/decorators.py", line 23, in _wrapped_view
    return view_func(request, *args, **kwargs)
  File "/home/ubuntu/camp-2.10.x/geonode/geonode/layers/views.py", line 294, in layer_upload
    out[_k] = out[_k].decode(saved_layer.charset).encode("utf-8")
AttributeError: 'NoneType' object has no attribute 'charset'
"POST /layers/upload HTTP/1.1" 500 39294

do I need to change any settings? thanks
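
The second traceback above suggests that the error-handling path reads `saved_layer.charset` while `saved_layer` is still `None`, because the upload failed before a layer was saved. Here is a minimal sketch of a guard, written for Python 3 rather than the Python 2.7 shown in the traceback. The function `normalize_errors` and its fallback charset are hypothetical and are not the fix applied in this PR.

```python
def normalize_errors(out, saved_layer, default_charset="utf-8"):
    """Decode byte values in `out`, tolerating a missing layer object."""
    # Fall back to a default when no layer was saved or it has no charset.
    charset = getattr(saved_layer, "charset", None) or default_charset
    normalized = {}
    for key, value in out.items():
        if isinstance(value, bytes):
            # Replace undecodable bytes instead of raising a second error.
            value = value.decode(charset, errors="replace")
        normalized[key] = value
    return normalized


# The upload failed, so there is no saved layer to read a charset from.
result = normalize_errors({"success": False, "errors": b"Could not decode"}, None)
print(result)
```

With a guard like this the user would see the original decoding error instead of a secondary `AttributeError`.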

@afabiani
Member Author

@capooti you must select the correct charset in the upload form.

In your case, Windows CP1258.

@afabiani
Member Author

@capooti try the live instance here:

http://dev.geonode.geo-solutions.it/

@capooti
Member

capooti commented Oct 31, 2018

I didn't realize I needed to select the charset. After doing that, it works indeed! Thanks @afabiani

@capooti capooti merged commit cc94e82 into master Oct 31, 2018
@afabiani afabiani deleted the ISSUE_4025 branch November 5, 2018 09:23
frafra pushed a commit to frafra/geonode that referenced this pull request Dec 19, 2018
…cii characters (GeoNode#4026)

* [Fixes GeoNode#4025] Regression with uploading a shapefile with no ascii characters

* [Fixes GeoNode#4025] Regression with uploading a shapefile with no ascii characters

*  - Rename columns of non-UTF-8 shapefiles attributes before ingesting

 - Fix test cases