Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AutoTVM] Add support for text buffers to ApplyHistoryBest #12521

Merged
merged 7 commits into from
Aug 23, 2022

Conversation

guberti
Copy link
Member

@guberti guberti commented Aug 21, 2022

Currently, AutoTVM's ApplyHistoryBest class does not support loading tuning logs from memory. This is a pet peeve of mine, as it requires you to work with a tempfile whenever writing autotuning tests. This is also just strange, as the rest of AutoTVM has support for text buffers (e.g. tvm.autotvm.callback.log_to_file supports passing in a text buffer, letting us write to but not read from them).

Additionally, ApplyHistoryBest handles input arguments very unintuitively. Before this PR, it allowed users to pass string filepaths, a list of string filepaths, or an Iterable (such as a list) of input and result tuples. However, it did not support taking in StringIO objects as mentioned above, nor pathlib.Path objects, nor combinations of a filepath and an Iterable of tuples.

In a perfect world, we would change ApplyHistoryBest to take as input a path-like object, file-like object, or an Iterable of input and result tuples (similar to what ApplyGraphBest takes as an argument). However, this would break the existing functionality to take as input a list of filepaths.

To be backwards compatible, while fixing this issue, this pull request defines a new type inside dispatcher.py:

Records = Union[
    Union[str, bytes, Path],  # Path-like objects
    TextIOBase,  # File-like objects
    Iterable[Tuple[MeasureInput, MeasureResult]],
]

It then rewrites ApplyHistoryBest.load so it takes the following arguments:

def load(self, records: Union[Records, Iterable[Records]]):

This PR also adds unit tests for this new functionality, and fixes a relevant bug in tests/micro/common/test_autotune.py in which a StringIO object was passed to apply_history_best, causing it to appear to pass but not actually read any data.

Copy link
Contributor

@areusch areusch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @guberti

python/tvm/autotvm/task/dispatcher.py Outdated Show resolved Hide resolved
Records = Union[
Union[str, bytes, Path], # Path-like objects
TextIOBase, # File-like objects
Iterable[Tuple[MeasureInput, MeasureResult]],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this use case supported currently? i'm not sure whether we should support bypassing the load path here (this is technically "defensive programming" which i recognize is not great but i also think it might be prudent here).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, ApplyHistoryBest currently supports taking an input of type Iterable[Tuple[MeasureInput, MeasureResult]]. This functionality is basically never used though, as autotuning (AFAIK) does not support exporting logs with this type. Do you think this functionality can be removed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah ok, if we remove it then let's do it not in this PR.

If is str, then it should be the filename of a records log file.
Each row of this file is an encoded record pair. If it is a list, it can either be
a list of paths to log files that will be loaded jointly or an iterator or records.
records : Records or iterator of Records objects, where a Records
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it's reasonable to allow for the variety of ways to read in records, but just wondering if there's a use case for supporting Iterator[Records]?

Copy link
Member Author

@guberti guberti Aug 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think supporting Iterable[Records]makes any sense, and would love to get rid of this functionality. However, the function ApplyHistoryBest currently supports taking as input an iterator of str paths, so removing this would break previously existing functionality. Thoughts @areusch?

python/tvm/autotvm/task/dispatcher.py Outdated Show resolved Hide resolved
@guberti
Copy link
Member Author

guberti commented Aug 23, 2022

@areusch I addressed your minor comments, but I think there is a broader discussion to be had about what types of arguments ApplyHistoryBest should take. Taking Union[Records, Iterable[Records]] is the "least bad" solution IMO that preserves all existing functionality, such as the ability to pass Iterable[str] (e.g. a list of filepaths) and Iterable[Tuple[MeasureInput, MeasureResult]].

If we're OK with breaking some rarely used functionality, though, this could be simplified a lot. autotvm.callback.log_to_file supports writing to and only to file-like and path-like objects, so if we wanted ApplyHistoryBest and ApplyGraphBest to match, I would change the function signature to:

class ApplyHistoryBest(DispatchContext):
    ...
    def load(self, records: Union[str, bytes, PathLike, TextIOBase]):

What's your opinion here?

@guberti guberti force-pushed the add-load-history-best-typing branch from 0ba0c15 to 131e56a Compare August 23, 2022 12:55
Copy link
Contributor

@areusch areusch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i agree there's a broader conversation to be had around the intefaces here. i'd like to avoid breaking autotvm functionality for now, so i'm ok with this.

Records = Union[
Union[str, bytes, Path], # Path-like objects
TextIOBase, # File-like objects
Iterable[Tuple[MeasureInput, MeasureResult]],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah ok, if we remove it then let's do it not in this PR.

@areusch areusch merged commit da5836f into apache:main Aug 23, 2022
xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 25, 2022
)

Currently, AutoTVM's ApplyHistoryBest class does not support loading tuning logs from memory. This is a pet peeve of mine, as it requires you to work with a tempfile whenever writing autotuning tests. This is also just strange, as the rest of AutoTVM has support for text buffers (e.g. tvm.autotvm.callback.log_to_file supports passing in a text buffer, letting us write to but not read from them).

Additionally, ApplyHistoryBest handles input arguments very unintuitively. Before this PR, it allowed users to pass string filepaths, a list of string filepaths, or an Iterable (such as a list) of input and result tuples. However, it did not support taking in StringIO objects as mentioned above, nor pathlib.Path objects, nor combinations of a filepath and an Iterable of tuples.

In a perfect world, we would change ApplyHistoryBest to take as input a path-like object, file-like object, or an Iterable of input and result tuples (similar to what ApplyGraphBest takes as an argument). However, this would break the existing functionality to take as input a list of filepaths.

To be backwards compatible, while fixing this issue, this pull request defines a new type inside dispatcher.py:

Records = Union[
    Union[str, bytes, Path],  # Path-like objects
    TextIOBase,  # File-like objects
    Iterable[Tuple[MeasureInput, MeasureResult]],
]
It then rewrites ApplyHistoryBest.load so it takes the following arguments:

def load(self, records: Union[Records, Iterable[Records]]):
This PR also adds unit tests for this new functionality, and fixes a relevant bug in tests/micro/common/test_autotune.py in which a StringIO object was passed to apply_history_best, causing it to appear to pass but not actually read any data.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants