
[Relay] Support for Tensorflow 2.0 #4102

Closed
ic opened this issue Oct 11, 2019 · 27 comments

@ic (Contributor) commented Oct 11, 2019

TF2 was released this week and is now the stable version. Some APIs have breaking changes or are not supported in Relay.

[Note: I may have been wrong about the resize op; the same problem exists with TF1.14.]
For example, the half_pixel_centers attribute in the resize op was introduced in TF2 (separate issue #4101), and namespaces like tf.gfile, tf.graph_util, and tf.GraphDef have been deprecated.

Walking through the TF tutorial, here are some elements needed to support TF2. This is definitely incomplete, so it is just a starting point for the support.
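To illustrate the namespace deprecations mentioned above, here is a minimal sketch. The mapping entries reflect the published TF 2.0 migration of these names; the helper function itself is hypothetical and only for illustration.

```python
# Hypothetical helper illustrating how TF1 namespaces deprecated in TF2
# map onto their tf.compat.v1 spellings (per the TF 2.0 migration notes).
V1_TO_COMPAT = {
    "tf.gfile": "tf.compat.v1.gfile",
    "tf.graph_util": "tf.compat.v1.graph_util",
    "tf.GraphDef": "tf.compat.v1.GraphDef",
    "tf.Session": "tf.compat.v1.Session",
}

def compat_name(name: str) -> str:
    """Return the tf.compat.v1 spelling for a TF1 name removed in TF2."""
    return V1_TO_COMPAT.get(name, name)

print(compat_name("tf.GraphDef"))  # tf.compat.v1.GraphDef
```

A frontend keeping TF1 compatibility would call through these compat.v1 names rather than the removed top-level ones.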

@srkreddy1238 self-assigned this Oct 11, 2019
@srkreddy1238 (Contributor)

@ic that was a good start on upgrading TF.
You could raise a formal PR and invite review to take it to completion.

We could use this issue to track the TF upgrade.

@ic (Contributor, Author) commented Oct 11, 2019

I'll be working on #4104, usually on Fridays.

@srkreddy1238 (Contributor)

What about backward compatibility with 1.12 or 1.14 while upgrading to 2.0?

One easy solution is to run the TF upgrade script to convert the code to 2.0 and use that, but the result may not be very native to 2.0 features.

  • Do we drop the old version and go to 2.0?
  • Maintain both:
    1. Move the old frontend to a compatibility module (relay.frontend.compatibility.from_tensorflow)? What would be the default CI version? Validating both sets of test cases here may be a challenge!
    2. Selectively change behaviour based on tf.__version__.

Any ideas @tqchen @tmoreau89 @yongwww @soiferj ?

@yongwww (Member) commented Oct 24, 2019

IMHO, both 1.x and 2.0 should be maintained for a while, since most working tf models were built on top of 1.x. Another option is to maintain a single from_tensorflow for both 1.x and 2.0; that should be possible since SavedModel is the universal format, though I am not sure how much work is needed. For TVM CI, maybe we can build different containers for this kind of backward compatibility testing. @tqchen

@ic (Contributor, Author) commented Oct 25, 2019

Google seems to be pushing for TF2.x, perhaps to "catch up" with PyTorch and the like on developer happiness? Still, I support @yongwww's point: existing work outside Google is on TF 1.x (and 0.x :-)), and it will linger for some time. So this PR, and the sub-PR about fixing the tutorial, target full backward compatibility, using the tf.compat.v1 APIs for now. Having said and done so, a couple of comments:

  • The new annotations like @tf.function really change a lot of syntax, and maintaining a single script for both 1.x and 2.x would be a pain for developers and users. We can detect at runtime which TF version is packaged and switch scripts accordingly (e.g. a facade from_tensorflow module that loads the right implementation), which is the same as @srkreddy1238's point (2).
  • Does it matter anyway? TVM users might be looking ahead, as the community is growing and penetrating production systems. It may be bold, but saner, to concentrate all our strength on what is clearly the near future: TF2.x.
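The runtime-switch idea in the first bullet can be sketched minimally as follows. Only the use of tf.__version__ comes from TF itself; the facade function name and the implementation-module names are hypothetical.

```python
# Hypothetical facade sketch: pick a frontend implementation based on the
# installed TensorFlow version string (tf.__version__), as suggested above.
def select_frontend(tf_version: str) -> str:
    """Return the name of the frontend module to load for this TF version."""
    major = int(tf_version.split(".")[0])
    return "from_tensorflow_v2" if major >= 2 else "from_tensorflow_v1"

# In a real facade one would do something like:
#   import tensorflow as tf
#   impl = select_frontend(tf.__version__)
```

The facade keeps a single public entry point while the two implementations evolve independently.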

@srkreddy1238 (Contributor)

Thanks for the comments @yongwww & @ic.

As @yongwww said, the frontend implementation may not be affected much, since SavedModel is universal apart from the operator-specific API changes in 2.0. However, TF 2.0 makes major modifications to the way we build and execute graphs (sessions, eager execution, tf.function, etc.). Test cases will demand more effort here.

I would suggest dual compatibility by:

  • Selectively switching within the frontend implementation.
  • Individual test cases for 1.x and 2.0.
  • The necessary documentation or tutorials to promote TF 2.0 in the coming days.

@tqchen (Member) commented Oct 29, 2019

It seems we have reached consensus on dual compatibility for now. In terms of implementation details, since TF2 diverges too much, we can also create a separate code path and switch at the highest level.

@yongwww (Member) commented Nov 9, 2019

@ic @srkreddy1238 if you guys are working on this support, would you mind sharing the status?

@cchung100m (Contributor)

Hi @ic @srkreddy1238

Would you mind sharing the latest status?

@ic (Contributor, Author) commented Dec 20, 2019

Sorry for the delay. The intention is to support TF1 and TF2 separately, while trying to see if partial migrations can help. No progress since the last update, although I hope to have some time soon to get back to it.

@FrozenGene (Member)

Maybe we could speed up the progress. Many folks have told me they are interested in the MobileNet V3 model. However, HardSwish is not defined in tensorflow r1.13, so we cannot run a MobileNet V3 end-to-end test.

@tqchen (Member) commented Apr 29, 2020

NOTE: the Mainline CI now uses tensorflow 2.0

@tqchen closed this as completed Apr 29, 2020
@srkreddy1238 (Contributor)

Good to see CI moving to 2.0.

In this context:

  • I have upgraded the TF test cases to use the pure 2.0 API (without compat.v1, tf.Graph, tf.Session, etc.).
  • I am trying to simplify the frontend to use model signatures instead of explicit shape and output-info arguments (in the interest of SavedModel & TF Serving signatures being the officially supported export formats for TF).

Most of this work is complete (except for a few operators failing with the 2.0 API).
I hope to upstream it soon once I am cleared by contribution policy approval.

@yongwww (Member) commented Nov 12, 2020

> Most of this work is completed (except few operators failing with 2.0 API). Hoping to upstream soon once I am cleared with contribution policy approval.

Hi @srkreddy1238, do you have any update? It would be super cool to see a PR.

@yongwww (Member) commented Dec 4, 2020

We have looked into models for image classification, object detection, segmentation, etc. It seems we need to enable support for TensorList, control flow, functions, and some missing operations. We would like to work on this project from scratch. As discussed above, we might create a new tf frontend for tf2.x (and I believe the parser for tf1.x will be deprecated in the future, as the tf community has stopped supporting 1.x).

Please comment here or ping me/us if anyone is interested in working with us on this. I have synced with Siva (@srkreddy1238) over the past few weeks; we will keep him and the community updated as we work on this.

@srkreddy1238 (Contributor)

TF 2.x differs at the front-facing API level, but the resulting graph does not change much.
A fresh parser would duplicate most of the layers we have today; I think we would end up with a lot of redundant work.

My understanding and advice:

  • The official TF2.x format is SavedModel.
  • The TF2.x parser, in my understanding, should support inputs as:
    (saved_model, signature_name, input_spec): a SavedModel input
    (concrete_function, input_spec): a manually built TF function
    (tfhub_url, input_spec): a TF Hub URL directly
  • Parsing any of these input options ultimately goes through Grappler and results in a graph_def.
  • There is no difference in this graph_def, and the existing parser does the job here.
  • The additional layers (TensorList, VarHandles, etc.) can follow the existing implementation at the node level.

TF2.x brings standardization without the need for additional arguments like inputs, outputs, and shapes (in a few cases) that the user previously had to supply manually.
I think TF2.x is an enhancement or amendment of the 1.x parser rather than a replacement.

Ref. https://github.com/srkreddy1238/tvm/tree/tf2.0 has the implementation.

  • This branch is outdated and I need to rebase it.
  • The 2.0 parser is part of the test script there.
  • I will need to clean it up a bit, and will send a PR soon as a baseline.
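For reference, the concrete_function-to-graph_def path described above can be sketched with TF 2.x APIs. The toy model is hypothetical; convert_variables_to_constants_v2 (under tensorflow.python) is the commonly used way to freeze a ConcreteFunction into a plain GraphDef.

```python
import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import (
    convert_variables_to_constants_v2,
)

# Toy TF2 function standing in for a real model (hypothetical example).
@tf.function(input_signature=[tf.TensorSpec([1, 4], tf.float32)])
def model(x):
    return tf.nn.relu(x)

concrete = model.get_concrete_function()
# Freeze variables into constants; the result exposes a plain GraphDef,
# which is the same representation the existing TF1 parser consumes.
frozen = convert_variables_to_constants_v2(concrete)
graph_def = frozen.graph.as_graph_def()
```

Once a graph_def is obtained this way, the remaining work is per-operator support, as noted above.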

@yongwww (Member) commented Dec 10, 2020

@srkreddy1238 yeah, I agree that the GraphDef protocol buffer definition changed only a little, and that creating a new parser would end up with some redundant work (copied from the existing parser). I tried to run the test cases in your shared branch and found that the existing parser works for most TF 2.x ops (previously I assumed we would have to modify most of the mappings in the parser). We can just update the existing parser for TF 2.x, and remove the unnecessary parts once we stop supporting 1.x.

Please rebase your work; it would be good to have it working on top of TF 2.4 (the latest version is 2.4.0rc4). We can work on the variable- and tensorlist-related ops at the same time.

@wx3000 commented Feb 12, 2021

@srkreddy1238: I am working on a new TF parser from scratch, to be introduced as ~/python/tvm/relay/frontend/tensorflow2.py. I will share this work soon.

The new parser will support the control flow v2 ops (While/If) instead of the control flow v1 ops (Enter/Exit/Merge); much of the existing parser code that deals with control flow v1 is not needed in the new parser. The new parser will support the TensorList* ops instead of the TensorArray* ops, which are no longer needed in TF2.x models. It will also handle stateful LSTM models (and similar stateful TF models). My goals this year include leveraging the TF XLA infrastructure to simplify the parser.

TF2.x introduced many significant changes and deprecated some old ways of doing things, so I think a fresh parser is the better way. We can still maintain the current parser for a while for TF1.x models. @yongwww: fyi.

@srkreddy1238 (Contributor)

@wx3000 thanks for the input.

I see TF2.x as an amendment rather than a replacement of the 1.x parser. The new operators needed in 2.x can be added as amendments over the existing operator/parser support, and the control flow ops and List ops can be built on the existing parser infra.

Ref. #7498

  • It extends the parser to accept a TF2.x concrete_function, from which it extracts the graph.
  • Once we have the graph, it's all about operator support.
  • I can see that most of the ops are supported by the existing parser.

I couldn't find a concrete motivation for a brand-new parser that mostly overlaps the existing one.

@yongwww any thoughts ?

@wx3000 commented Feb 23, 2021

@srkreddy1238: I took a brief look at PR #7498. It is useful for simple TF2 models, where most of the operators did not change much from TF1 to TF2. I am still not convinced that extending the current TF parser will be good in the long term, considering the complexity of supporting TF control flow v1 and v2 in the same code base. A clean start may provide a better foundation to build future work on.

@srkreddy1238 (Contributor)

@wx3000 I understand the complexity around the control flow ops, but I am worried about two independent copies of the ops across parsers. Acknowledging the complexity of the existing parser, and to avoid redundant ops across parsers, here are my thoughts:

  • Version-independent op implementations can move to a common file (tf_op.py).
  • A common parser entry point, from_tensorflow, accepting a graphdef or concrete_function (tensorflow.py).
  • A quick inspection of the graph can decide which parser version to use (TF1.x / TF2.x).
  • The parsers and their exclusive ops can go into individual files (tf_parser_1_x.py & tf_parser_2_x.py).
    Alternatively, the tf_parser_x_x.py files can hold only the parsers and tf_op.py can hold all operator implementations.
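The graph-inspection dispatch in the third bullet could look roughly like this. The op names are standard TF control-flow v1/v2 ops; the function and module names are hypothetical sketches, not TVM's actual API.

```python
# Hypothetical dispatch sketch: route a graph to the v1 or v2 parser based
# on which control-flow ops appear in it, as proposed above.
CONTROL_FLOW_V2_OPS = {"StatelessWhile", "While", "StatelessIf", "If"}
CONTROL_FLOW_V1_OPS = {"Enter", "Exit", "Merge", "Switch", "NextIteration"}

def pick_parser(op_names):
    """Return the parser module name to use for a graph's op list."""
    ops = set(op_names)
    if ops & CONTROL_FLOW_V2_OPS:
        return "tf_parser_2_x"
    if ops & CONTROL_FLOW_V1_OPS:
        return "tf_parser_1_x"
    # No control flow at all: either parser can handle a plain graph,
    # so default to the existing one.
    return "tf_parser_1_x"
```

In practice op_names would come from iterating over graph_def.node.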

@srkreddy1238 reopened this Feb 23, 2021
@wx3000 commented Feb 23, 2021

@srkreddy1238: I agree we should avoid duplication in the op implementations. This will speed up TF2.x model support and keep the code clean. Great idea!

I think we can have a series of small PRs, roughly in this order:

  1. A new entry point from_tensorflow() for v2-style control flow handling, in a new file tensorflow2.py.
  2. Common ops moved from tensorflow.py to tensorflow_ops.py (examples: _concatV2, _transpose, etc.); ops in this file are tested with both versions of the parser.
  3. New ops such as TensorList* go into tensorflow2_ops.py first; later we may move them to tensorflow_ops.py if the v1 parser needs them.
  4. A common parser entry point switching between v1/v2 based on inspection of the graph.
  5. A convenience entry point that takes the saved_model format or a concrete_function, in addition to a GraphDef.

We will soon submit items 1-3 from the list above. I don't know who will work on the rest at this point, but hopefully someone will pick it up.
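The module layout in items 1-3 could be laid out roughly as follows. All converter names and signatures here are simplified placeholders for illustration, not the actual TVM converter API.

```python
# tensorflow_ops.py (shared): version-independent converters, usable by
# both parsers. Converters here are placeholder stand-ins that just tag
# their inputs so the wiring is visible.
def _transpose(inputs, attrs):
    return ("transpose", inputs, attrs)

def _concat_v2(inputs, attrs):
    return ("concatenate", inputs, attrs)

CONVERT_MAP_COMMON = {"Transpose": _transpose, "ConcatV2": _concat_v2}

# tensorflow2_ops.py: v2-only ops start here and may graduate to the
# shared module if the v1 parser ever needs them.
def _tensor_list_reserve(inputs, attrs):
    return ("tensor_list_reserve", inputs, attrs)

# tensorflow2.py: the v2 parser composes its table from both modules,
# so common ops are defined exactly once.
CONVERT_MAP_V2 = {**CONVERT_MAP_COMMON, "TensorListReserve": _tensor_list_reserve}
```

This keeps each op converter defined in exactly one place while letting the two parsers expose different op tables.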

@yongwww (Member) commented Mar 10, 2021

I agree with the main ideas you both proposed above, @srkreddy1238 @wx3000: a common .py holding the op mappings for both v1 and v2, with exclusive ops going into the individual parsers (v1, v2).

@mippos commented Aug 23, 2021

Hi! What is the status of TensorFlow 2 support, or the schedule for it?

We use TensorFlow 2.x. Should TVM users still convert models back to TF 1.12, or can we go with TF2 models?
(At least https://tvm.apache.org/docs/dev/frontend/tensorflow.html still refers to TF 1.12...)

@masahi (Member) commented Jan 9, 2022

Please open a new issue if TF 2 support is desired. I'm not aware of anyone working actively on this now.

@masahi closed this as completed Jan 9, 2022
@yongwww (Member) commented Jan 12, 2022

> What is the status of Tensorflow2 support or schedule for it? Should TVM users still convert models back to TF1.12 or can we go with TF2 models?

TensorFlow 2.x is supported in TVM, no need to convert models back to TF 1.x.

@akash2826

Hi Team,

What is the status of TF 2.x support in TVM? I don't see any example of parsing a TF 2.x model in TVM.


10 participants