test(table): improve HTML validation for table extraction #874

Merged: 2 commits into opendatalab:dev on Nov 6, 2024

@myhloli (Collaborator) commented Nov 6, 2024

  • Add lxml dependency for HTML parsing
  • Update test case to use XPath and HTML parser for structure and content validation
  • Check for presence of essential HTML elements like <table>, <thead>, <tbody>, <tr>, and <td>
  • Validate column headers and specific row contents
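
The checks listed above could look roughly like the following. This is only a minimal sketch, not the test added in this PR: the sample table, the `validate_table_html` helper, and its parameters are hypothetical names chosen for illustration, and it assumes the extracted table is available as an HTML string. It parses the markup with lxml and uses XPath to check structure, column headers, and selected rows.

```python
from lxml import html

# Elements the PR description says the test should look for.
REQUIRED_TAGS = ("table", "thead", "tbody", "tr", "td")

# Hypothetical sample; a real test would use the HTML produced by table extraction.
SAMPLE_HTML = """
<table>
  <thead><tr><th>Name</th><th>Qty</th></tr></thead>
  <tbody>
    <tr><td>apple</td><td>3</td></tr>
    <tr><td>pear</td><td>5</td></tr>
  </tbody>
</table>
"""


def validate_table_html(markup, expected_headers, expected_rows):
    """Return a list of problems found; an empty list means the table passed.

    expected_rows maps a zero-based body row index to its expected cell texts.
    """
    if not markup.strip():
        return ["empty markup"]

    problems = []
    doc = html.document_fromstring(markup)

    # Structure: every essential element must appear at least once.
    for tag in REQUIRED_TAGS:
        if not doc.xpath(f"//{tag}"):
            problems.append(f"missing <{tag}>")

    # Content: header cells may be <th> or <td>, so match any child of the header row.
    headers = [c.text_content().strip() for c in doc.xpath("//table/thead/tr/*")]
    if headers != expected_headers:
        problems.append(f"unexpected headers: {headers}")

    # Content: compare only the rows the caller asked about.
    rows = [
        [td.text_content().strip() for td in tr.xpath("./td")]
        for tr in doc.xpath("//table/tbody/tr")
    ]
    for index, expected in expected_rows.items():
        if index >= len(rows) or rows[index] != expected:
            problems.append(f"row {index} does not match {expected}")

    return problems


if __name__ == "__main__":
    print(validate_table_html(SAMPLE_HTML, ["Name", "Qty"], {0: ["apple", "3"]}))
```

Returning a list of problems rather than asserting inside the helper keeps the failure message readable when several checks fail at once; a test can then simply assert that the list is empty.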

- Remove outdated version options (0.6.x, 0.7.x, 0.8.x)
- Add current version option (0.9.x)
@myhloli myhloli merged commit 1ae7a93 into opendatalab:dev Nov 6, 2024
1 of 2 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Nov 6, 2024