Error after upgrade from 3.6 to 3.7 #20686

Open
ThorvaldAagaard opened this issue Dec 24, 2024 · 4 comments

@ThorvaldAagaard

I have a model that uses combined input.

With versions prior to 3.7 it works fine when using this to predict:

image

Dropping the tf.function and using a normal predict also works fine.

The output is typically something like this:
image

After upgrading to Keras 3.7, the same model produces this output:
image

Do you see the difference?

shape[0] for x and b is now None.

But the real problem is the prediction, which is no longer correct.

Just to be sure, I printed the shapes of x and b before calling the tf.function:
(1238, 42)
(1238, 15)

It looks like there is a problem finding the batch size for combined input, so for now I will have to stay on 3.6.

@sonali-kumari1
Contributor

Hi @ThorvaldAagaard,

Thanks for reporting this issue. Within a tf.function context, not all dimensions may be known until execution time. So you can use the dynamic keras.ops.shape(x) from Keras or tf.shape(x) from TensorFlow instead of the static x.shape.
Attaching gist for your reference.
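
As a minimal sketch of the static versus dynamic shape distinction described above (this is not the reporter's code, which is only available as a screenshot): the function name `batch_sizes`, the `input_signature` with a `None` batch dimension, and the zero-filled inputs are assumptions chosen to reproduce a `None` static batch size with the shapes reported in this issue.

```python
import tensorflow as tf

# Filled in while tf.function traces the Python body (trace time only).
seen_static_shapes = []

# Assumption: the batch dimension is declared as unknown (None), which is
# one common way to end up with shape[0] == None inside a tf.function.
@tf.function(input_signature=[
    tf.TensorSpec(shape=[None, 42], dtype=tf.float32),
    tf.TensorSpec(shape=[None, 15], dtype=tf.float32),
])
def batch_sizes(x, b):
    # Static shape: a Python value fixed at trace time; None when unknown.
    seen_static_shapes.append((x.shape[0], b.shape[0]))
    # Dynamic shape: a tensor evaluated each time the graph runs.
    return tf.shape(x)[0], tf.shape(b)[0]

# Zero-filled stand-ins with the shapes printed in the report.
x = tf.zeros((1238, 42), dtype=tf.float32)
b = tf.zeros((1238, 15), dtype=tf.float32)

dyn_x, dyn_b = batch_sizes(x, b)
print("static batch sizes:", seen_static_shapes[0])
print("dynamic batch sizes:", int(dyn_x), int(dyn_b))
```

The static value is only a trace-time annotation, so a `None` there does not by itself mean the data passed in has the wrong shape; the dynamic value reflects the actual input.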

@ThorvaldAagaard
Author

Thank you for looking into this. I have added the printing using keras.ops.shape.

But then we are back to my real problem:

prediction with keras 3.6
image

prediction with keras 3.7
image

Using the same model and input.

Could it be a change that must be implemented in TensorFlow?

@sonali-kumari1
Contributor

Hi @ThorvaldAagaard,

I tried to replicate this issue and I am attaching two files where I ran the same model using both versions of Keras (3.6.0 and 3.7.0). Could you please provide more information about the model architecture you're using so that I can fully replicate the issue? Thanks!

@ThorvaldAagaard
Author

If you are just looking for the model architecture, this is what I used for training the model:
https://github.com/lorserker/ben/blob/main/scripts/training/opening%20lead/keras/lead_nn_keras.py

During my testing I found that training the model in 3.7 and predicting in 3.7 is fine.
Training in 3.6 and predicting in 3.6 is also fine.

But if you mix versions, the problem appears.

Based on the changes between 3.6 and 3.7 there should be no differences.

So basically, upgrading to 3.7 requires retraining the models.

If you unzip the file below, first try

python test_lead_nn.py

and note the number of predictions; it should be around 50%.

The zip file contains the data (as a NumPy array) used for training and the script to train the model.

Normally I use 100 epochs, but 10 is enough to verify that there is a change.

I have attached two models, both trained on the same data: one for Keras 3.6 and one for Keras 3.7.

You can just switch the model name in test_lead_nn.py while switching the Keras version, and you will see the problem.

test.zip
model.zip

(Split in two files due to github file size limit)
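
To put a number on the mismatch when mixing versions, one option is to save the prediction array from each Keras version (for example with `np.save`) and compare the two offline. The sketch below is only an illustration: `compare_predictions` is a hypothetical helper, not part of test_lead_nn.py, and the two small arrays are synthetic stand-ins for real model outputs.

```python
import numpy as np

def compare_predictions(pred_a, pred_b, atol=1e-5):
    """Summarize how far two prediction arrays of the same shape differ."""
    pred_a = np.asarray(pred_a)
    pred_b = np.asarray(pred_b)
    if pred_a.shape != pred_b.shape:
        raise ValueError(f"shape mismatch: {pred_a.shape} vs {pred_b.shape}")
    return {
        # Largest element-wise difference between the two outputs.
        "max_abs_diff": float(np.max(np.abs(pred_a - pred_b))),
        # True only if every element agrees within the tolerance.
        "allclose": bool(np.allclose(pred_a, pred_b, atol=atol)),
        # Fraction of rows where both outputs pick the same top class.
        "top1_agreement": float(
            np.mean(pred_a.argmax(axis=-1) == pred_b.argmax(axis=-1))
        ),
    }

# Synthetic stand-ins; in practice these would be loaded with np.load
# from files written under each Keras version.
pred_36 = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
pred_37 = np.array([[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]])

report = compare_predictions(pred_36, pred_37)
print(report)
```

If the same saved model gives a top-1 agreement well below 1.0 across versions on identical input, that points at model loading or inference rather than at the data.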
