Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal for Get Element Text #413

Closed
andreastt opened this issue Oct 4, 2016 · 13 comments
Closed

Proposal for Get Element Text #413

andreastt opened this issue Oct 4, 2016 · 13 comments
Assignees
Labels
Milestone

Comments

@andreastt
Copy link
Member

https://www.w3.org/Bugs/Public/show_bug.cgi?id=29530

clmartin:

"12.5 getElementText()"

John Jansen and I were reading the specification for getElementText() and he remembers discussing it at a face to face but can't find the minutes. Below is what we think was agreed upon:

Get Element Text

GET /session/{session id}/element/{element id}/text

The Get Element Text command retrieves the textContent value of the given web element.

  1. If the current top-level browsing context is no longer open, return error with error code no such window.
  2. Handle any user prompts and return its value if it is an error.
  3. Let element result be the result of getting a known element by UUID parameter element id.
  4. If element result is a success, let element be element result's data. Otherwise, return element result.
  5. If element is stale, return error with error code stale element reference.
  6. Let element text be the result of calling textContent on the specified element.
  7. Let body be a JSON Object with the "value" memeber set to the element text.
  8. Return success with data body.

What do you think?

@andreastt andreastt added this to the Level 1 milestone Oct 4, 2016
@andreastt
Copy link
Member Author

Daniel Wagner-Hall:

As I recall, the reasons we bothered to specify our own algorithm were:

  • textContent will include any text which is hidden by CSS/other styling; i.e. <div>foo<span style="display: none;">bar</span></div>'s textContent will return "foobar" where we specify it to return "foo".
  • textContent doesn't perform any whitespace normalisation, so looks different to how the text will look in the browser; i.e. <div>foo bar</div>'s textContent will return "foo bar" where we specify it to return "foo bar".
  • We forcibly insert newlines at the start of new block-level elements; i.e. <div>foo<div>bar</div></div>'s textContent will return foobar where we specify it to return "foo\nbar".

The thing we really wanted to defer to was much closer to innerText, but innerText isn't standardised (and wasn't supported in Firefox until very recently).

@andreastt
Copy link
Member Author

David Burns :automatedtester:

We discussed in Sapporo that we want innerText[1](I know this isnt an official specification) but what it gives us most of the end state that we want.

There is an outstanding bug[2] for Roc's spec[1] to be incorporated into the html spec

[1] http://rocallahan.github.io/innerText-spec/
[2] whatwg/html#465

@andreastt
Copy link
Member Author

clmartin:

Just tested to verify, the major advantage of textContent is that all browsers support it and return the same value for it in most cases.

I tested some sample markup: https://jsfiddle.net/whqptr65/

textContent returned the exact same string for foo.textContent:
Edge: "\r\n bar\r\n \r\n Foo bar\r\n \r\n foo\r\n "
Chrome: "\r\n bar\r\n \r\n Foo bar\r\n \r\n foo\r\n "
Firefox: "\r\n bar\r\n \r\n Foo bar\r\n \r\n foo\r\n "

innerText failed in IE (not supported) and returned a different string for Chrome/Firefox/Edge as seen below for foo.innerText:

Edge: "bar \r\n\r\nFoo bar\r\nfoo "
Chrome: "bar\r\nFoo bar\r\n\r\nfoo"
Firefox: "bar\r\n\r\nFoo bar\r\n\r\nfoo"

I would argue in this case having something that works in all browsers the same way would be more valuable than something that works completely differently in each (and is unsupported in IE) not to mention not a spec.

I would also argue that a tester would know what content they can ignore and what is valuable to them, so hidden elements can be circumvented.

@andreastt
Copy link
Member Author

John Jansen:

We should follow HTML definition here. Need tests to make sure nothing is broken...

@andreastt
Copy link
Member Author

Simon Stewart:

We should avoid breaking existing selenium tests --- this method is used extensively, and we can "break the web tests" of many users if we're not extremely cautious.

@andreastt
Copy link
Member Author

andreastt commented Oct 4, 2016

juangj:

Reopened pending further discussion of the test results.

In summary, the only atoms tests that fail are these two tests about <title> elements: https://github.com/SeleniumHQ/selenium/blob/c10e8a955883f004452cdde18096d70738397788/javascript/webdriver/test/atoms/element_test.html#L151-L161

36 of ~800-ish tests from the Selenium Java suite failed, largely because of extra leading or trailing whitespace, or differing numbers of internal newlines.

For example, TextHandlingTest#testShouldHandleNestedBlockLevelElements fails:

Expected: is "Cheese\nSome text\nSome more text\nand also\nBrie"
     but: was "Cheese\n\nSome text\n\nSome more text\n\nand also\n\nBrie"

@andreastt
Copy link
Member Author

juangj:

We could also run this across a much broader suite of "real" tests if that seems helpful, though obviously Google isn't totally representative of WebDriver's user base.

@andreastt
Copy link
Member Author

We decided at the F2F in Lisbon to put the Get Text Selenium atom in the appendix and use that for the time being, as innerText as defined in HTML is not yet consistently implemented.

We should probably add some prose about the weakness of the algorithm we are choosing and the known edge cases we know we cannot support.

@andreastt andreastt added the bug label Oct 15, 2016
@shs96c
Copy link
Contributor

shs96c commented Jan 10, 2017

@AutomatedTester and @andreastt are we "happy" with the current state of the get element text atom in the selenium project? If so, I'll create that appendix.

I'm assigning this issue to you both so you see it. Feel free to unassign yourselves :)

@andreastt
Copy link
Member Author

@shs96c Not happy with it, but that’s what the WG decided to do.

@andreastt andreastt removed their assignment Jan 10, 2017
@shs96c
Copy link
Contributor

shs96c commented Jan 11, 2017 via email

shs96c added a commit to shs96c/webdriver-spec that referenced this issue Feb 3, 2017
The atom here is aware of the ShadowDOM and so should produce
better results if that's used. It's functionally equivalent
to the version used in Selenium 3.1

Closes w3c#413
shs96c added a commit to shs96c/webdriver-spec that referenced this issue Feb 3, 2017
The atom here is aware of the ShadowDOM and so should produce
better results if that's used. It's functionally equivalent
to the version used in Selenium 3.1

Closes w3c#413
shs96c added a commit to shs96c/webdriver-spec that referenced this issue Feb 3, 2017
The atom here is aware of the ShadowDOM and so should produce
better results if that's used. It's functionally equivalent
to the version used in Selenium 3.1

Closes w3c#413
@shs96c
Copy link
Contributor

shs96c commented Feb 10, 2017

I've started a conversation with the SFC about relicensing the compiled atoms for use in the spec.

shs96c added a commit to shs96c/webdriver-spec that referenced this issue Feb 24, 2017
The atom here is aware of the ShadowDOM and so should produce
better results if that's used. It's functionally equivalent
to the version used in Selenium 3.1

Closes w3c#413
shs96c added a commit to shs96c/webdriver-spec that referenced this issue Mar 1, 2017
The atom here is aware of the ShadowDOM and so should produce
better results if that's used. It's functionally equivalent
to the version used in Selenium 3.1

Closes w3c#413
shs96c added a commit to shs96c/webdriver-spec that referenced this issue Mar 8, 2017
The atom here is aware of the ShadowDOM and so should produce
better results if that's used. It's functionally equivalent
to the version used in Selenium 3.1

Closes w3c#413
shs96c added a commit that referenced this issue Mar 8, 2017
The atom here is aware of the ShadowDOM and so should produce
better results if that's used. It's functionally equivalent
to the version used in Selenium 3.1

Closes #413
@andreastt
Copy link
Member Author

Thanks for doing the hard work on this, @shs96c.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants