Custom types are not being recognized/ignored #564

Closed
FirefoxMetzger opened this issue Jul 23, 2021 · 7 comments · Fixed by #566
Labels
enhancement New feature or request

Comments

@FirefoxMetzger

FirefoxMetzger commented Jul 23, 2021

Let me first apologize in advance in case this issue is more of a support request than a bug. I've been trying to figure out how this works, and perhaps I'm just doing things wrong and everything is fine. In this case - depending on the solution - I might be able to turn this issue into a doc PR 🚀

I'm trying to generate bindings for SDFormat using the CLI tool. SDFormat is XML but has some quirks to it. For example, custom types based on xsd:string and a regex. In my case, they are defined in a types.xsd (found here). The problem is that, when I generate the stubs, any element that is of such a custom type does not show up in the generated bindings and I can't figure out why. I also don't get any warning about missing/dropped/ignored elements. My hunch is that this is a severe case of user error because I am a first-time user of xsdata.

I can provide an MVE if it helps, once I find a place to upload the .xsd files; they are quite interdependent. Unfortunately, they are not part of the official SDFormat repo, because it is actually a C++ project, and the .xsd files are generated from templates during preprocessing/building with CMake. Theoretically, there is an SDFormat schema server that serves .xsd files and that is referred to in the generated files, but that one is outdated and doesn't seem to be actively maintained :( Also, iirc, xsdata doesn't fetch includes from remote, so this won't make much of a difference.

In the meantime, I can share two representative files; maybe that is already enough to identify the problem. In world.xsd the fields gravity and magnetic_field (among others) are the ones that don't show up in the bindings.

types.xsd
<?xml version='1.0' encoding='UTF-8'?>
<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>
  <xsd:simpleType name="vector3">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="(\s*(-|\+)?(\d+(\.\d*)?|\.\d+|\d+\.\d+[eE][-\+]?[0-9]+)\s+){2}((-|\+)?(\d+(\.\d*)?|\.\d+|\d+\.\d+[eE][-\+]?[0-9]+))\s*"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:simpleType name="quaternion">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="(\s*(-|\+)?(\d+(\.\d*)?|\.\d+|\d+\.\d+[eE][-\+]?[0-9]+)\s+){3}((-|\+)?(\d+(\.\d*)?|\.\d+|\d+\.\d+[eE][-\+]?[0-9]+))\s*"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:simpleType name="vector2d">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="(\s*(-|\+)?(\d+(\.\d*)?|\.\d+|\d+\.\d+[eE][-\+]?[0-9]+)\s+)((-|\+)?(\d+(\.\d*)?|\.\d+|\d+\.\d+[eE][-\+]?[0-9]+))\s*"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:simpleType name="vector2i">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="\s*(-|\+)?\d+\s+(-|\+)?\d+\s*"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:simpleType name="pose">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="(\s*(-|\+)?(\d+(\.\d*)?|\.\d+|\d+\.\d+[eE][-\+]?[0-9]+)\s+){5}((-|\+)?(\d+(\.\d*)?|\.\d+|\d+\.\d+[eE][-\+]?[0-9]+))\s*"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:simpleType name="time">
    <xsd:restriction base="xsd:double">
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:simpleType name="color">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="(\s*\+?(\d+(\.\d*)?|\.\d+|\d+\.\d+[eE][-\+]?[0-9]+)\s+){3}\+?(\d+(\.\d*)?|\.\d+|\d+\.\d+[eE][-\+]?[0-9]+)\s*"/>
    </xsd:restriction>
  </xsd:simpleType>

</xsd:schema>
world.xsd
<?xml version='1.0' encoding='UTF-8'?>
<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>
  <xsd:annotation>
    <xsd:documentation xml:lang='en'>
      <![CDATA[The world element encapsulates an entire world description including: models, scene, physics, and plugins.]]>
    </xsd:documentation>
  </xsd:annotation>
  <xsd:include schemaLocation='http://sdformat.org/schemas/types.xsd'/>
  <xsd:include schemaLocation='http://sdformat.org/schemas/atmosphere.xsd'/>
  <xsd:include schemaLocation='http://sdformat.org/schemas/gui.xsd'/>
  <xsd:include schemaLocation='http://sdformat.org/schemas/physics.xsd'/>
  <xsd:include schemaLocation='http://sdformat.org/schemas/scene.xsd'/>
  <xsd:include schemaLocation='http://sdformat.org/schemas/light.xsd'/>
  <xsd:include schemaLocation='http://sdformat.org/schemas/frame.xsd'/>
  <xsd:include schemaLocation='http://sdformat.org/schemas/model.xsd'/>
  <xsd:include schemaLocation='http://sdformat.org/schemas/actor.xsd'/>
  <xsd:include schemaLocation='http://sdformat.org/schemas/plugin.xsd'/>
  <xsd:include schemaLocation='http://sdformat.org/schemas/road.xsd'/>
  <xsd:include schemaLocation='http://sdformat.org/schemas/spherical_coordinates.xsd'/>
  <xsd:include schemaLocation='http://sdformat.org/schemas/state.xsd'/>
  <xsd:include schemaLocation='http://sdformat.org/schemas/population.xsd'/>
  <xsd:element name='world'>
    <xsd:complexType>
      <xsd:choice maxOccurs='unbounded'>
        <xsd:choice  minOccurs='0' maxOccurs='1'>
        <xsd:element name='audio'>
          <xsd:annotation>
            <xsd:documentation xml:lang='en'>
              <![CDATA[Global audio properties.]]>
            </xsd:documentation>
          </xsd:annotation>
          <xsd:complexType>
            <xsd:choice maxOccurs='unbounded'>
              <xsd:choice  minOccurs='1' maxOccurs='1'>
              <xsd:element name='device' type='xsd:string'>
                <xsd:annotation>
                  <xsd:documentation xml:lang='en'>
                    <![CDATA[Device to use for audio playback. A value of "default" will use the system's default audio device. Otherwise, specify a an audio device file"]]>
                  </xsd:documentation>
                </xsd:annotation>
              </xsd:element>
              </xsd:choice>
            </xsd:choice>
          </xsd:complexType>
        </xsd:element>
        </xsd:choice>
        <xsd:choice  minOccurs='0' maxOccurs='1'>
        <xsd:element name='wind'>
          <xsd:annotation>
            <xsd:documentation xml:lang='en'>
              <![CDATA[The wind tag specifies the type and properties of the wind.]]>
            </xsd:documentation>
          </xsd:annotation>
          <xsd:complexType>
            <xsd:choice maxOccurs='unbounded'>
              <xsd:choice  minOccurs='0' maxOccurs='1'>
              <xsd:element name='linear_velocity' type='vector3'>
                <xsd:annotation>
                  <xsd:documentation xml:lang='en'>
                    <![CDATA[Linear velocity of the wind.]]>
                  </xsd:documentation>
                </xsd:annotation>
              </xsd:element>
              </xsd:choice>
            </xsd:choice>
          </xsd:complexType>
        </xsd:element>
        </xsd:choice>
        <xsd:choice  minOccurs='0' maxOccurs='unbounded'>
        <xsd:element name='include'>
          <xsd:annotation>
            <xsd:documentation xml:lang='en'>
              <![CDATA[
        Include resources from a URI. Included resources can only contain one 'model', 'light' or 'actor' element. The URI can point to a directory or a file. If the URI is a directory, it must conform to the model database structure (see /tutorials?tut=composition&cat=specification&#defining-models-in-separate-files).
    ]]>
            </xsd:documentation>
          </xsd:annotation>
          <xsd:complexType>
            <xsd:choice maxOccurs='unbounded'>
              <xsd:choice  minOccurs='1' maxOccurs='1'>
              <xsd:element name='uri' type='xsd:string'>
                <xsd:annotation>
                  <xsd:documentation xml:lang='en'>
                    <![CDATA[URI to a resource, such as a model]]>
                  </xsd:documentation>
                </xsd:annotation>
              </xsd:element>
              </xsd:choice>
              <xsd:choice  minOccurs='0' maxOccurs='1'>
              <xsd:element name='name' type='xsd:string'>
                <xsd:annotation>
                  <xsd:documentation xml:lang='en'>
                    <![CDATA[Override the name of the included entity.]]>
                  </xsd:documentation>
                </xsd:annotation>
              </xsd:element>
              </xsd:choice>
              <xsd:choice  minOccurs='0' maxOccurs='1'>
              <xsd:element name='static' type='xsd:boolean'>
                <xsd:annotation>
                  <xsd:documentation xml:lang='en'>
                    <![CDATA[Override the static value of the included entity.]]>
                  </xsd:documentation>
                </xsd:annotation>
              </xsd:element>
              </xsd:choice>
              <xsd:choice  minOccurs='0' maxOccurs='1'>
              <xsd:element name='placement_frame' type='xsd:string'>
                <xsd:annotation>
                  <xsd:documentation xml:lang='en'>
                    <![CDATA[The frame inside the included entity whose pose will be set by the specified pose element. If this element is specified, the pose must be specified.]]>
                  </xsd:documentation>
                </xsd:annotation>
              </xsd:element>
              </xsd:choice>
            </xsd:choice>
          </xsd:complexType>
        </xsd:element>
        </xsd:choice>
        <xsd:choice  minOccurs='1' maxOccurs='1'>
        <xsd:element name='gravity' type='vector3'>
          <xsd:annotation>
            <xsd:documentation xml:lang='en'>
              <![CDATA[The gravity vector in m/s^2, expressed in a coordinate frame defined by the spherical_coordinates tag.]]>
            </xsd:documentation>
          </xsd:annotation>
        </xsd:element>
        </xsd:choice>
        <xsd:choice  minOccurs='1' maxOccurs='1'>
        <xsd:element name='magnetic_field' type='vector3'>
          <xsd:annotation>
            <xsd:documentation xml:lang='en'>
              <![CDATA[The magnetic vector in Tesla, expressed in a coordinate frame defined by the spherical_coordinates tag.]]>
            </xsd:documentation>
          </xsd:annotation>
        </xsd:element>
        </xsd:choice>
        <xsd:element ref='atmosphere'/>
        <xsd:element ref='gui'/>
        <xsd:element ref='physics'/>
        <xsd:element ref='scene'/>
        <xsd:element ref='light'/>
        <xsd:element ref='frame'/>
        <xsd:element ref='model'/>
        <xsd:element ref='actor'/>
        <xsd:element ref='plugin'/>
        <xsd:element ref='road'/>
        <xsd:element ref='spherical_coordinates'/>
        <xsd:element ref='state'/>
        <xsd:element ref='population'/>
      </xsd:choice>
      <xsd:attribute name='name' type='xsd:string' use='required' >
        <xsd:annotation>
          <xsd:documentation xml:lang='en'>
            <![CDATA[Unique name of the world]]>
          </xsd:documentation>
        </xsd:annotation>
      </xsd:attribute>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

Any help or pointers are highly appreciated 💯

@tefra
Owner

tefra commented Jul 23, 2021

Hi @FirefoxMetzger, only complex types and elements are generated; everything else, with the exception of enumerations, is flattened.

<?xml version='1.0' encoding='UTF-8'?>
<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>
  <xsd:simpleType name="vector3">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="(\s*(-|\+)?(\d+(\.\d*)?|\.\d+|\d+\.\d+[eE][-\+]?[0-9]+)\s+){2}((-|\+)?(\d+(\.\d*)?|\.\d+|\d+\.\d+[eE][-\+]?[0-9]+))\s*"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="quaternion">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="(\s*(-|\+)?(\d+(\.\d*)?|\.\d+|\d+\.\d+[eE][-\+]?[0-9]+)\s+){3}((-|\+)?(\d+(\.\d*)?|\.\d+|\d+\.\d+[eE][-\+]?[0-9]+))\s*"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:element name="First" type="quaternion" />
  <xsd:complexType name="Second">
    <xsd:sequence>
      <xsd:element name="vector3" type="vector3" />
      <xsd:element name="quaternion" type="quaternion" />
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class First:
    value: Optional[str] = field(
        default=None,
        metadata={
            "required": True,
            "pattern": r"(\s*(-|\+)?(\d+(\.\d*)?|\.\d+|\d+\.\d+[eE][-\+]?[0-9]+)\s+){3}((-|\+)?(\d+(\.\d*)?|\.\d+|\d+\.\d+[eE][-\+]?[0-9]+))\s*",
        }
    )


@dataclass
class Second:
    vector3: Optional[str] = field(
        default=None,
        metadata={
            "type": "Element",
            "namespace": "",
            "required": True,
            "pattern": r"(\s*(-|\+)?(\d+(\.\d*)?|\.\d+|\d+\.\d+[eE][-\+]?[0-9]+)\s+){2}((-|\+)?(\d+(\.\d*)?|\.\d+|\d+\.\d+[eE][-\+]?[0-9]+))\s*",
        }
    )
    quaternion: Optional[str] = field(
        default=None,
        metadata={
            "type": "Element",
            "namespace": "",
            "required": True,
            "pattern": r"(\s*(-|\+)?(\d+(\.\d*)?|\.\d+|\d+\.\d+[eE][-\+]?[0-9]+)\s+){3}((-|\+)?(\d+(\.\d*)?|\.\d+|\d+\.\d+[eE][-\+]?[0-9]+))\s*",
        }
    )
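
To make the flattening concrete: the `vector3` simple type does not become a class of its own; its pattern ends up in the `metadata` of each `str` field that uses it. The following is a minimal sketch, not part of xsdata, of how such metadata could be checked by hand. The `Wind` class and the `pattern_violations` helper are hypothetical names, and it assumes Python's `re` syntax is close enough to the XSD regex dialect for this particular pattern.

```python
import re
from dataclasses import dataclass, field, fields
from typing import List, Optional

# Pattern copied verbatim from the vector3 simple type in types.xsd.
VECTOR3 = r"(\s*(-|\+)?(\d+(\.\d*)?|\.\d+|\d+\.\d+[eE][-\+]?[0-9]+)\s+){2}((-|\+)?(\d+(\.\d*)?|\.\d+|\d+\.\d+[eE][-\+]?[0-9]+))\s*"


@dataclass
class Wind:
    # Hypothetical binding: the custom type is flattened to a str with a pattern.
    linear_velocity: Optional[str] = field(
        default=None,
        metadata={"type": "Element", "pattern": VECTOR3},
    )


def pattern_violations(obj) -> List[str]:
    """Return the names of fields whose value does not match its pattern."""
    bad = []
    for f in fields(obj):
        pattern = f.metadata.get("pattern")
        value = getattr(obj, f.name)
        # XSD patterns must match the whole value, hence fullmatch.
        if pattern and value is not None and re.fullmatch(pattern, value) is None:
            bad.append(f.name)
    return bad
```

For example, `pattern_violations(Wind("0 0 -9.8"))` returns an empty list, while a two-component value such as `"0 0"` is reported.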

@FirefoxMetzger
Author

only complex types and elements are generated; everything else, with the exception of enumerations, is flattened.

Thank you for this pointer, it got me on the right track.

I tripped over the error messages stating that the element with custom type was not found, and - after I manually added it - the next element with custom type was not found, and so on. With your pointer in mind, I checked again and found that other elements (with basic types) were missing, too. After digging through the schema, I found a bug over in SDFormat that causes all of this, because the generated xsd files are faulty (types.xsd isn't present in the output directory, and their include order/namespacing inside the xsd files causes some tags to alias 🤷).

With this in mind, I have two more questions:

How are include tags handled if the resource is not found?

I have xsdata warnings of the form warning: Resource not found http://sdformat.org/schemas/pose.xsd originating from (I think) <xsd:include schemaLocation='http://sdformat.org/schemas/pose.xsd'/>. In this case, will the build ignore this include?

Second, is there a way to make xsdata warn/abort in the event of a naming conflict in the schema? Thanks to the bug in SDFormat, I get .xsd files of the form

...
<xsd:include schemaLocation='model.xsd'/>  <!-- defines <xsd:element name='model'> -->
<xsd:include schemaLocation='model_state.xsd'/> <!-- defines <xsd:element name='model'> -->
...

which is expanded into a naming conflict:

...
<xsd:element name='model'> ... </xsd:element>
<xsd:element name='model'> ... </xsd:element>
...

xsdata never creates the bindings for model.xsd's <model> and all the elements point to model_state.xsd's <model> instead. This is clearly because of the naming conflict in the source .xsd and I wonder if it is possible to warn when such things happen.
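
Until the generator warns about this, such conflicts can be spotted before running it. The sketch below is an assumption-laden illustration, independent of xsdata: `duplicate_global_elements` is a hypothetical helper that scans the direct children of each `xsd:schema` root for `xsd:element` declarations and reports names defined in more than one file. It ignores target namespaces, which is fine for the namespace-less SDFormat schemas but not in general.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, List

XSD = "{http://www.w3.org/2001/XMLSchema}"


def duplicate_global_elements(schemas: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each global element name defined more than once to its files.

    `schemas` maps a file name to the schema text.
    """
    seen: Dict[str, List[str]] = defaultdict(list)
    for filename, text in schemas.items():
        root = ET.fromstring(text)
        # Only direct children of xsd:schema are global declarations.
        for element in root.findall(XSD + "element"):
            name = element.get("name")
            if name is not None:
                seen[name].append(filename)
    return {name: files for name, files in seen.items() if len(files) > 1}
```

Called with the contents of model.xsd and model_state.xsd, this would report `model` together with both file names.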

@tefra
Owner

tefra commented Jul 23, 2021

The generator is very lenient with missing types or imports. The specification actually accepts that types can be "absent"; in these cases the generator substitutes the missing types with xs:anySimpleType or xs:string, depending on the case.

The generator emits warnings on these cases (#1, #2), but they are lost in the whole output. I am planning something like a warnings summary, similar to pytest's (#534).
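
The lenient behaviour described here (skip what cannot be loaded, record a warning, carry on) can be sketched as follows. This is not xsdata's implementation: `resolve_includes` and the `available` mapping are hypothetical stand-ins for whatever actually loads resources, and only the warning text mirrors the message quoted earlier in the thread.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

XSD = "{http://www.w3.org/2001/XMLSchema}"


def resolve_includes(
    schema_text: str, available: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Split the xsd:include locations of a schema into (found, warnings).

    `available` maps a schemaLocation to schema text that could be loaded.
    """
    found: List[str] = []
    warnings: List[str] = []
    for include in ET.fromstring(schema_text).findall(XSD + "include"):
        location = include.get("schemaLocation", "")
        if location in available:
            found.append(location)
        else:
            # Lenient: remember the problem and continue with the next include.
            warnings.append(f"Resource not found {location}")
    return found, warnings
```

Collecting the warnings in a list, rather than printing them as they occur, is what makes a summary at the end of the run possible.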

Naming conflicts are also accepted in the specification, but the handling depends on the case; there are various scenarios:

  • if they belong to different namespaces, they are treated as unique
  • complexType vs Element (the complex type is renamed)
  • simpleType vs complexType: simple types are flattened anyway
  • same element tag: that's tricky, because it's not supposed to happen except under xs:redefine or xs:override. I think the first occurrence prevails in that scenario when the analyzing process is trying to find a matching type, and you are correct, there should be a warning in that scenario

def process(self):
    """
    Remove if possible classes with the same qualified name.

    Steps:
        1. Remove invalid classes
        2. Handle duplicate types
        3. Merge dummy types
    """
    for classes in self.container.data.values():
        if len(classes) > 1:
            self.remove_invalid_classes(classes)

        if len(classes) > 1:
            self.handle_duplicate_types(classes)

        if len(classes) > 1:
            self.merge_global_types(classes)

def remove_invalid_classes(self, classes: List[Class]):
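
The first scenario in the list above (same name, different namespaces) follows from keying definitions by qualified name. A minimal sketch, assuming the `{namespace}name` notation that also appears in the generator's warning output; `clark_qname` is a hypothetical helper, not an xsdata function.

```python
def clark_qname(namespace: str, name: str) -> str:
    """Build a qualified name in {namespace}name form; no braces if unqualified."""
    return f"{{{namespace}}}{name}" if namespace else name


# Same local name, different namespaces: two distinct keys, so no conflict.
first = clark_qname("http://example.com/a", "model")
second = clark_qname("http://example.com/b", "model")
```

Schemas without a target namespace, like the SDFormat ones, all fall into the unqualified case, which is why two `model` elements collide there.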

tefra added the enhancement label Jul 23, 2021
@FirefoxMetzger
Author

FirefoxMetzger commented Jul 23, 2021

The generator emits warnings on these cases (#1, #2), but they are lost in the whole output

Nice. That explains what's happening on my end. The schema server is outdated and doesn't have certain files; xsdata ignores these includes and correctly emits warnings. At the same time, any missing elements/types are defined by local copies of what the server should have. I call xsdata generate on a folder, and hence all these files are globbed in. Since all files are read first and only then turned into dataclasses, xsdata never misses a type and correctly never warns about missing types. Interesting behavior.
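
The read-everything-first behaviour can be sketched as a two-pass check. This is an illustration under stated assumptions rather than xsdata's code: `unresolved_types` is a hypothetical helper, and it treats every unprefixed `type` attribute as a reference to a custom type (prefixed ones such as `xsd:string` are skipped).

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Set

XSD = "{http://www.w3.org/2001/XMLSchema}"


def unresolved_types(schema_texts: Iterable[str]) -> Set[str]:
    """Return custom type names that are referenced but defined in no file."""
    roots = [ET.fromstring(text) for text in schema_texts]

    # Pass 1: collect every named global type from all files.
    defined: Set[str] = set()
    for root in roots:
        for tag in ("simpleType", "complexType"):
            for node in root.findall(XSD + tag):
                name = node.get("name")
                if name is not None:
                    defined.add(name)

    # Pass 2: only now check the type references of all elements.
    missing: Set[str] = set()
    for root in roots:
        for element in root.iter(XSD + "element"):
            type_name = element.get("type")
            if type_name and ":" not in type_name and type_name not in defined:
                missing.add(type_name)
    return missing
```

With world.xsd alone, `vector3` would be reported as missing; once the local types.xsd is part of the input, nothing is, regardless of whether the remote include could be fetched.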


I think the first occurrence prevails in that scenario

Actually, the last one prevails. Without knowing the codebase, I guess existing elements are stored in a dict? If so, the solution could be as simple as adding an if new_element_tag in element_dict.keys(): emit warning before the assignment happens.

@FirefoxMetzger
Author

def add(self, item: Class):
    """Add class item to the container."""
    self.data.setdefault(item.qname, []).append(item)

Perhaps this is the spot where naming conflicts are silently resolved/eaten?
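
Note that `setdefault(...).append(...)` keeps every definition under its qualified name, so nothing is lost at this point; any dropping has to happen in a later pass over the groups. A minimal sketch of where a warning could then be emitted, using a hypothetical `Container` that stores plain strings instead of xsdata's `Class` objects:

```python
from collections import defaultdict
from typing import Dict, List


class Container:
    """Hypothetical stand-in for a container keyed by qualified name."""

    def __init__(self) -> None:
        self.data: Dict[str, List[str]] = defaultdict(list)

    def add(self, qname: str, definition: str) -> None:
        # Appending never overwrites: duplicates pile up in the same list.
        self.data[qname].append(definition)

    def resolve_duplicates(self) -> List[str]:
        """Keep only the last definition per name and return the warnings."""
        warnings: List[str] = []
        for qname, definitions in self.data.items():
            if len(definitions) > 1:
                warnings.append(
                    f"Duplicate types ({len(definitions)}) found for {qname}"
                )
                # Drop everything except the last defined entry, in place.
                del definitions[:-1]
        return warnings
```

Keeping the last entry here simply mirrors the behaviour observed above; which definition should win is a policy decision for the real generator.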

tefra added a commit that referenced this issue Jul 25, 2021
@tefra
Owner

tefra commented Jul 25, 2021

Thanks for the suggestion @FirefoxMetzger

 xsdata xsdata-w3c-tests/w3c/msData/particles/particlesJs001.xsd
Parsing schema particlesJs001.xsd
Parsing schema particlesJs001.imp
Compiling schema particlesJs001.imp
Builder: 4 main and 0 inner classes
Compiling schema particlesJs001.xsd
Builder: 3 main and 0 inner classes
Analyzer input: 7 main and 0 inner classes
warning: Duplicate types (2) found: {http://xsdtesting}B, will keep the last defined!
Analyzer output: 6 main and 0 inner classes
Generating package: init
Generating package: generated.particles_js001
Generating package: generated.particles_js001_imp

tefra added a commit that referenced this issue Jul 25, 2021
@FirefoxMetzger
Author

FirefoxMetzger commented Jul 26, 2021

Wow, so quick! Thank you for looking into this so fast.

One suggestion:

warning: Duplicate types (2) found: {http://xsdtesting}B, will keep the last defined!

Should this be rephrased to

warning: Duplicate types (2) found for {http://xsdtesting}B; will keep the last defined one!
