
Do you support using screen image as the screen parameter? #3

Open
felicitia opened this issue Jul 2, 2021 · 4 comments

@felicitia

Hello @tobyli, we're trying to use the pre-trained model to get the vectors, following your instruction under Quick Start.
Regarding -s/--screen, the path to the screen to encode: can you explain more about what this screen parameter should be? It looks like it should be a JSON file that contains the UI layout? If so, do you support using screenshot images directly? Thank you very much!

@tobyli
Owner

tobyli commented Jul 3, 2021

Thanks for the question, Yixue! That option takes the JSON hierarchical representation of a screen in the format of screens in the RICO dataset (https://interactionmining.org/rico). We don't support screenshot images as our model doesn't really use (pixel-based) visual information from the screens.
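For reference, a screen file in this format is a JSON view hierarchy rather than an image. Below is a minimal sketch of what such a file might look like; the field names (`activity_name`, `class`, `bounds`, `children`, `text`) are assumptions modeled loosely on the RICO dataset's published schema and have not been verified against this repo's parser, so check the RICO documentation for the exact keys.

```python
import json

# A minimal RICO-style view hierarchy (illustrative sketch only;
# field names are assumptions based on the RICO dataset's format).
screen = {
    # Hypothetical activity identifier for this screen.
    "activity_name": "com.example/.MainActivity",
    "activity": {
        "root": {
            "class": "android.widget.FrameLayout",
            "bounds": [0, 0, 1440, 2560],  # [x1, y1, x2, y2] in pixels
            "children": [
                {
                    "class": "android.widget.TextView",
                    "text": "Sign in",
                    "bounds": [100, 200, 600, 300],
                    "children": [],
                }
            ],
        }
    },
}

# Write the hierarchy to a file that could then be passed via -s/--screen.
with open("screen.json", "w") as f:
    json.dump(screen, f, indent=2)
```

Since the model relies on per-view metadata such as `text` and `class` rather than pixels, any source that can produce a hierarchy in this shape should in principle be usable.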

@felicitia
Author

Got it! Thanks for confirming this @tobyli. :) BTW, have you ever tried using a UI hierarchy reverse-engineered from screenshots (e.g., using REMAUI or UIED)? Sometimes the UI hierarchy code isn't available (like during the mock-up phase, etc.), so I'm wondering if you have any insights on how well that might work (i.e., how "good" the output vectors are when using a UI hierarchy obtained from reverse-engineering tools). And just to clarify, I'm only talking about the testing phase, getting the vectors from the pre-trained model, not the training phase (it makes a lot of sense to use RICO's data for training).

@tobyli
Owner

tobyli commented Jul 3, 2021

I haven't tried it, but it sounds like an intriguing idea! I think as long as the reverse engineering generates reasonable metadata for each view (e.g., text, className) as well as a reasonable hierarchical structure, it should work without a problem. Let me know if you decide to try it that way -- really curious about the result.

@felicitia
Author

Sure, will keep you posted if we end up trying this route :)
