Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support nested associative arrays #417

Open
dantleech opened this issue Feb 10, 2014 · 46 comments
Open

Support nested associative arrays #417

dantleech opened this issue Feb 10, 2014 · 46 comments
Labels

Comments

@dantleech
Copy link
Contributor

It should be possible to have an array type which automatically creates a structure, e.g

class Site {
    /**
     * @PHPCR\AssocArray()
     */
     protected $preferences;
}

Then automatically map that to a PHPCR structure:

/site
    preferences/
         - jcr:primaryType  = phpcr:associative_array
         - key1 = value 
         key2/
             - key3: = value
             - key4: = value
             key5/
@dbu
Copy link
Member

dbu commented Feb 10, 2014

can php array keys be anything else than strings? if not, that should
actually be doable. i think we would need to find a better name, as we
already have multivalue with assoc=true which is used for flat key-value
hashmaps. something like NestedArray?

but anyways, lets not start this before the cmf 1.1 is released.

@lsmith77
Copy link
Member

php array keys can be integers and strings, nothing else

@ElectricMaxxx
Copy link
Contributor

I would suggest an handling equal to the type setting in the JMSSerialzerBundle. It should be important to keep strings for the key to set them as localnames by default.

If the phpcr-odm can handle this type setting it should be possible to persist complete document-structures in an equal way We serialize objects to XML for example. The only difference should be: the structures have to contain Nodes as Wrapper for other properties.

I think this is a little bit more than the issue started, but goes in the same way. Cause if We start to add deeper array there can be a way of setting deeper documents instead of reference them as children. This can be a Second way.

@ElectricMaxxx
Copy link
Contributor

Oh i wrote nonesense.
But can not delete it at the momentfrom my mobile phone.

Thought he wants to set the deeper arrays as real child documents.

@danrot
Copy link

danrot commented Feb 11, 2014

On which level would this feature be implemented? Only in the ODM, or would this already be placed in the PHPCR-Specification?

@dbu
Copy link
Member

dbu commented Feb 11, 2014

Only odm, the phpcr spec is linked to jcr. And its not a basic thing, its just syntactic sugar to automatically store data without objects.

----- Reply message -----
Von: "Daniel Rotter" [email protected]
An: "doctrine/phpcr-odm" [email protected]
Cc: "David Buchmann" [email protected]
Betreff: [phpcr-odm] Support nested associative arrays (#417)
Datum: Di., Feb. 11, 2014 08:05
On which level would this feature be implemented? Only in the ODM, or would this already be placed in the PHPCR-Specification?


Reply to this email directly or view it on GitHub.

@dantleech
Copy link
Contributor Author

Could we not do this as a BC break? As it would otherwise overlap with the functionality of assoc=true

@dbu
Copy link
Member

dbu commented Feb 11, 2014

we could add another thing like assoc, calling it nested and then it would persist differently, but its still an attribute field and not a children mapping in phpcr-odm terms.

@dantleech
Copy link
Contributor Author

Not sure mapping it as an attribute field would make sense .. i.e. @Integer(nested=true), @String(nested=true). Would we then cast all the elements as integer or string depending on the field type?

I think a dedicated field type would make more sense, (like Doctrines @Field(type=phparray))

We could then either unconditionally map all the property values as string or dynamically cast them.

@dbu
Copy link
Member

dbu commented Feb 11, 2014

yeah you are right, a field like @array resp @field(type=array) sounds
good. afaik we call the other thing multivalue, and multivalue is just
that, while array maps to phpcr nodes.

@ElectricMaxxx
Copy link
Contributor

Is it possible to get an @field(type=<integer,string>) to get valdation on keys and values for their types? Sounds cool for Annotation, but imposible for xml config. By this it should be easy to go one Level deeper and so on.

@ElectricMaxxx
Copy link
Contributor

Oh my signs are gone. Ment @field(type=<integer,string>)

@dbu
Copy link
Member

dbu commented Feb 14, 2014

if we don't specify it, the types will be autodetermined by phpcr and come back in the right type on loading (string / integer / float / boolean)

@uwej711
Copy link
Contributor

uwej711 commented Feb 14, 2014

Recently I had some issues with that when there were null values in an array (using assoc=true) forcing me to remove all null velues before persisting.

@dbu
Copy link
Member

dbu commented Feb 14, 2014

uh, i think indeed in phpcr setting something to null means removing it. so your problem was that the keys get shifted because values disapear when reloading the property? maybe we should fix persisting assoc arrays to clean null values before storing the keys and values. otherwise we would need to store the indexes that where null in a 3rd field and on restoring the array put them back in the right places. that sounds tricky, but would probably the least wtf. do you want to open a separate issue for this? can we address this in like the next couple of days to get it into the 1.1 release? there is potential BC break here, if people somehow worked around the bogus behaviour.

@dantleech
Copy link
Contributor Author

Does JCR store property type in the property (i.e. so something that is dropped in PHPCR) or is it the responsability of the nodeType definition?

Having metadata on the properties sounds awkward indeed.

@dbu
Copy link
Member

dbu commented Feb 14, 2014

@dantleech the type of a property is stored in the property itself (the
type is just there to enforce what type a property may have). i think
uwe was talking about something different - null values are simply not
stored in phpcr. assuming the 'test' property exists,
$node->setPropertyValue('test', null) is the same as
$node->getProperty('test')->remove() . this seems to hold true for
arrays as well (though only in jackalope-jackrabbit probably.)

@dantleech
Copy link
Contributor Author

@dbu so the type is stored with the property? Looking at a PHPCR dump I don't see any data other than the value on the property, I thought it was dynamically inferred.

If the type is stored with the property, can we not add a custom type for null?

@dbu
Copy link
Member

dbu commented Feb 14, 2014

@dantleech look at an export in the system view, you will see the type. the types of properties are a limited set, there are no custom types and there is no type for null - the doc is quite explicit on the semantics of null: http://www.day.com/specs/jcr/2.0/10_Writing.html#10.4.2.4%20No%20Null%20Values

@uwej711
Copy link
Contributor

uwej711 commented Feb 14, 2014

@dbu: I can't remember from heart but I think Jackalope or phpcr-utils throw an exception somewhere when you try to store an array with assoc=true and null values.

@dbu
Copy link
Member

dbu commented Feb 21, 2014

i can't store null values. what i get is

[PHPCR\ValueFormatException]
Can not determine type of property with value "NULL"

not an extremly helpful message, but at least it does not work to create
messed up data.

@dantleech
Copy link
Contributor Author

Shall we do this for 1.1? I might be able to look at it at the weekend if nobody else is interested.

@dbu
Copy link
Member

dbu commented Feb 21, 2014

you mean the validation? that would be nice. for the original topic of nested associative arrays, i am -1 as this is a totally new thing with quite some effort involved, and we really need to get 1.1 out and the cmf bundles 1.1 out. we are behind the roadmap by almost 2 months...

@dbu dbu added the feature label Mar 1, 2014
@uwej711
Copy link
Contributor

uwej711 commented Mar 19, 2014

This would also need some special handling for null values in the array, as properties with null values are not stored by phpcr, so you will loose the keys!

@dbu
Copy link
Member

dbu commented Mar 19, 2014

see #457 for how uwe solved this for the hashmaps case - the solution here would probably look similar.

@dantleech
Copy link
Contributor Author

Hmm. Just had a look at this and one issue is how or if we should handle translations here?

@dbu
Copy link
Member

dbu commented Oct 27, 2014

i would handle them for the whole property, as with non-nested arrays. if the property is translated, the whole thing is translated. if you use children translation strategy, there is no problem. for attribute strategy, if the idea is to create subnodes, we should use the same naming schema as for properties for the top level node of the structure.

@dantleech
Copy link
Contributor Author

ok, i thought the same but it struck me as being maybe non-trivial - would this open a box of refactorings do you think?

@dbu
Copy link
Member

dbu commented Oct 27, 2014

maybe the assumptions of the parameter to a translation strategy needs to be altered a bit - maybe not even that, not sure. apart from that i expect it to be rather straightforward.

@dantleech
Copy link
Contributor Author

I just wrote a an event listener class for SuluCMF which does this. It uses a dot-delimited property name to store multi-dimensional arrays.

Name Type Vaue
foo.bar.bar STRING Value 1
foo.boo.baz LONG 1234
foo.car STRING Value 2

Would give

array(
    'foo' => array(
        'bar' => => array(
            'bar' => 'Value 1',
            'baz' => 1234,
        ),
        'car' => 'Value 2',
    ),

Will try and port it here.

@dbu
Copy link
Member

dbu commented Jan 12, 2015

so you use a child node for the array, but only one instead of a nested node tree? i guess that works, but why not go with nested nodes while you are at it? this scheme would break if i do

array(
    'foo.bar' => 'x',
);

@dantleech
Copy link
Contributor Author

No everything is in the same node in this strategy.

Regarding the breaking of the scheme, if we rewrote the entire array every time it shouldn't be a problem I think.

@dbu
Copy link
Member

dbu commented Jan 12, 2015

if everything is in the same node, does that mean there can only be one array? and how do you know what belongs into the array or not?

with rewriting, you mean encode . in hashmap keys with something else?

@dantleech
Copy link
Contributor Author

No, say you have a property foo which is mapped as an array in PHPCR-ODM.

If you set its value to array('foo' => array('bar' => 'baz'), 'car' => array('boat', 'plane')) then the following properties would be created in the node:

foo.foo.bar STRING "baz"
foo.car.0 STRING "boat"
foo.car.1 STRING "plane"

We would know it is an array because there would be a mapping indicating that it is so.

@dantleech
Copy link
Contributor Author

with rewriting, you mean encode . in hashmap keys with something else?

Ah, yeah I see what you mean. We would need to encode/escape the delimiter.

@dbu
Copy link
Member

dbu commented Jan 12, 2015

ah ok. i think your first example was missing the property name, that got me confused ;-)

and then to rebuild the array, you would scan the node for all properties starting with "foo.". but be aware that properties do not have guaranteed order, only nodes can have that. this means we can not guarantee to rebuild the exact same php array. but this is no argument against your architecture, as the same problem holds for nested nodes for child arrays.

the order would only be guaranteed if we would use a single multivalue field with a list of keys and a second one with a list of values. but then it would become unsearchable again. we might use a separate multivalue property just for the order.

@dantleech
Copy link
Contributor Author

Hmm yeah. But is order really important with associative arrays?

In the case of non-associative arrays they would have indexes so would be orderable. I think unordered associative arrays is an acceptable compromise (although if pushed we could store the order as you say).

@dbu
Copy link
Member

dbu commented Jan 12, 2015

well, i could do a foreach on the assoc array... i think while we are at it we should make it clean, or we risk unexpected things. the separate multivalue field for the order should be ok. you could then restore the array by doing foreach on that multivalue field and get the values out of the property values array of the node.

@dantleech
Copy link
Contributor Author

maybe we could make it an option?

/**
 * @PHPCR\Array(ordered=true)
 */
protected $options;

Storing the order would increase the node size, and this strategy is already pretty inefficient in terms of size -- although perhaps this is a moot point anyway as we need to store the keys in order to have the null values.

Also .. this could replace the existing array implementation(s) what do you think?

@dantleech
Copy link
Contributor Author

(regarding the null values we could also represent them as a string e.g. __null__ and make sure that that value is escaped when the user uses it) /cc @uwej711

@dbu
Copy link
Member

dbu commented Jan 12, 2015

i would not make keeping the order optional - keep complexity manageable. if anything, we could support either expanding the way you do, or serializing an array (if you are concerned about efficiency).

and yes, this should probably replace what uwe did, no need to have two slightly different things for the same. this however is a BC break for existing data...

can you search the same way as with the uwe-solution? right now if i had a field $f, it was f = 'foo', now it would be f.* = 'foo' right?

@dantleech
Copy link
Contributor Author

I guess for searching we should make it transparent (and therefore maintain BC). I guess we need to do some research about the best way to implement the search syntax/API. But I think the basic case f = 'foo' should be the same.

@dbu
Copy link
Member

dbu commented Jan 12, 2015

in the query builder, yes (though, you could also want to search for a specific key?)

i was thinking about SQL2 queries too, there must be a representation that works.

@dantleech
Copy link
Contributor Author

well, I guess SQL2 is to PHPCR-ODM what SQL is to Doctrine ORM.. i.e. it breaks the abstraction so you shouldn't really use it to query for documents.

@dantleech
Copy link
Contributor Author

Had a look at this. Basically there is a pain point with the translation strategies because the they would need to set a property value for each element in the array instead of a single value. So we would probably have to push more responsiblity into those (although I guess it could be handled in an abstract class and we would have a "serialization" helper).

@dbu
Copy link
Member

dbu commented Mar 19, 2015

for child strategy, we need not do anything at all, as each translation has its own node. for attribute, we should probably just translate each property, if you go with the approach of splitting each array entry into its own property.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants