Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Basic implementation of the ElementWriter concept #278

Merged
merged 2 commits into from
Mar 30, 2021

Conversation

dralley
Copy link
Collaborator

@dralley dralley commented Mar 20, 2021

@tafia Here's a basic implementation of the idea. See the tests I added for examples of how using it would look.

I can add documentation and improved tests once the general idea is solidified. There's probably things to be improved :)

@dralley dralley force-pushed the simple-nested-write branch 2 times, most recently from 018b26c to a205039 Compare March 20, 2021 05:07
@dralley dralley mentioned this pull request Mar 20, 2021
src/writer.rs Outdated Show resolved Hide resolved
@dralley dralley force-pushed the simple-nested-write branch from a205039 to f11d580 Compare March 20, 2021 18:40
src/writer.rs Show resolved Hide resolved
@dralley
Copy link
Collaborator Author

dralley commented Mar 21, 2021

I was also thinking about a design that would allow arbitrary depth, but I wasn't sure if it could be done without allocations.

Copy link
Owner

@tafia tafia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!
I would suggest returning the &mut Writer<W> in write_text/write_empty for people to write more things in the same expression.

src/writer.rs Outdated

pub struct NodeWriter<'a, W: Write> {
writer: &'a mut Writer<W>,
tag_name: &'a [u8],
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to keep a separate field for tag_name as you can get it from the BytesStart::name.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry didn't read your comment on lifetimes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup. It's possible there's a way around it that I don't know about though.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not only was there a way, it was actually documented perfectly.

/// Converts the event into a borrowed event. Most useful when paired with [`to_end`].
///
/// # Example
///
/// ```
/// # use quick_xml::{Error, Writer};
/// use quick_xml::events::{BytesStart, Event};
///
/// struct SomeStruct<'a> {
/// attrs: BytesStart<'a>,
/// // ...
/// }
/// # impl<'a> SomeStruct<'a> {
/// # fn example(&self) -> Result<(), Error> {
/// # let mut writer = Writer::new(Vec::new());
///
/// writer.write_event(Event::Start(self.attrs.to_borrowed()))?;
/// // ...
/// writer.write_event(Event::End(self.attrs.to_end()))?;
/// # Ok(())

src/writer.rs Show resolved Hide resolved
src/writer.rs Show resolved Hide resolved
src/writer.rs Show resolved Hide resolved
src/writer.rs Outdated
.create_node(b"paired")
.with_attribute(("attr1", "value1"))
.with_attribute(("attr2", "value2"))
.write_text(BytesText::from_plain_str("text"))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should a write_cdata be added?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes ... ideally all types of events

Copy link
Collaborator Author

@dralley dralley Mar 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All events, or just the of types which would make sense to be nested inside a single pair of tags?

In my mind, this is just an optional high level API with a focus on being easy to use, and difficult to misuse. If you wanted to reproduce "weird" or invalid XML perfectly you would probably want to use the existing API instead, which provides total control.

I think of all the event types the only ones that make sense inside a pair of tags are Text, CData, and maybe PI? Comment, Start, End, Decl, and EOF would all be invalid.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

decl probably doesn't make any sense indeed

@dralley
Copy link
Collaborator Author

dralley commented Mar 22, 2021

I would suggest returning the &mut Writer in write_text/write_empty for people to write more things in the same expression.

Could you give a code example of what this would look like? ("writing more things in the same expression")

@tafia
Copy link
Owner

tafia commented Mar 22, 2021

Could you give a code example of what this would look like? ("writing more things in the same expression")

 writer
            .create_node(b"paired")
            .with_attribute(("attr1", "value1"))
            .with_attribute(("attr2", "value2"))
            .write_text(BytesText::from_plain_str("text"))?
            .create_node(b"paired")  // another node
            .with_attribute(("attr1", "value1"))
            .with_attribute(("attr2", "value2"))
            .write_text(BytesText::from_plain_str("text"))?
            .write_event(...)? // directly write an event

It is not really better. On the other hand we're not losing anything from returning a &mut self.

@dralley dralley force-pushed the simple-nested-write branch from f11d580 to 96057a0 Compare March 22, 2021 19:23
@dralley dralley changed the title Basic implementation of the NodeWriter concept Basic implementation of the ElementWriter concept Mar 22, 2021
@dralley dralley force-pushed the simple-nested-write branch 2 times, most recently from 1614465 to d8a4797 Compare March 22, 2021 20:02
src/writer.rs Outdated
}

#[test]
fn element_writer_scope() {
Copy link
Collaborator Author

@dralley dralley Mar 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to try this experiment, but I'm not strongly attached to it. What do you think? It seems useful for really simple cases of nesting, but less so more complex ones.

Multiple levels of nesting would get messy.

@dralley dralley force-pushed the simple-nested-write branch 11 times, most recently from 79e1ac1 to 4c11c60 Compare March 24, 2021 03:02
@dralley
Copy link
Collaborator Author

dralley commented Mar 24, 2021

@tafia This is ready for re-review, although I should add more tests before it actually merges. Just let me know if there's anything objectionable about the API.

Here's what a larger piece of code looks like when rewritten to use this strategy: dralley/rpmrepo_metadata@0498173

@tafia
Copy link
Owner

tafia commented Mar 25, 2021

Looks good to me, let me know when you want to merge.
Thanks!

@dralley dralley force-pushed the simple-nested-write branch 4 times, most recently from cdd8ba7 to 404e87d Compare March 25, 2021 04:49
}

/// Write some text inside the current element.
pub fn write_text_content(self, text: BytesText) -> Result<&'a mut Writer<W>> {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does a pair of tags with no text always get represented as [Start] [Text ""] [End]?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean? If possible I think we could optimize it and write an empty event but this is really not very important.

Copy link
Collaborator Author

@dralley dralley Mar 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just wanted to be sure that what I'm doing is correct - if it was ever necessary to write two separate start and end events with no text event (even an empty one) in the middle, this wouldn't be convenient. But if there needs to always be one (unless writing an empty tag, in which case you have better options than this function), then this is fine.

@dralley
Copy link
Collaborator Author

dralley commented Mar 25, 2021

@tafia Apart from the question, and your approval of the docstrings I just added, I think this is good to go.

src/writer.rs Outdated
}

/// Create a new scope for writing XML inside the current element.
pub fn write_nested_content<F>(mut self, closure: F) -> Result<&'a mut Writer<W>>
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would inner or scoped or any other word make more sense here?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inner seems better. Not a fan of scoped I think.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

src/events/mod.rs Outdated Show resolved Hide resolved
@dralley dralley force-pushed the simple-nested-write branch from 2dd8dae to e0602af Compare March 26, 2021 02:20
@@ -315,10 +315,13 @@ impl<'a> BytesStart<'a> {
}

/// Try to get an attribute
pub fn try_get_attribute(&'a self, attr_name: &[u8]) -> Result<Option<Attribute<'a>>> {
pub fn try_get_attribute<N: AsRef<[u8]> + Sized>(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Attribute names don't need to be escaped right? This would let users pass &str, &[u8], or Vec<u8>.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, names don't need escaping.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed one difference between a library I'm writing and an existing one that uses libxml2 - the one based on libxml2 seems to strip tab characters \t out of attribute values. I can't find anything in the spec that mentions that nor can I find it in libxml2 though - do you know whether that might be required?

@dralley dralley force-pushed the simple-nested-write branch from e0602af to b24cb9a Compare March 27, 2021 14:05
dralley added 2 commits March 27, 2021 10:24
Add an API that simplifies writing basic tag pairs and small amounts of
content nesting in a more declarative fashion.
@dralley dralley force-pushed the simple-nested-write branch from b24cb9a to 1e02176 Compare March 27, 2021 14:24
@tafia tafia merged commit d00f9ca into tafia:edition2018 Mar 30, 2021
@tafia
Copy link
Owner

tafia commented Mar 30, 2021

Great! Thanks a lot for your time.

@dralley dralley deleted the simple-nested-write branch March 30, 2021 03:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants