Batches handled differently in the int reference fully connected kernel #53501
Comments
We are also seeing this issue. As a workaround, conv layers with 1x1 filter kernels can substitute for fully connected layers. Note that for quantized models this results in per-channel quantization being applied, which may affect your model size and output. There is a related issue tracking a fix for the optimized kernels on Arm embedded devices that use TensorFlow Lite for Microcontrollers: ARM-software/CMSIS_5#1398
With #54223 merged, the problem should now be solved. The fix was also imported into tflite-micro (see commit tensorflow/tflite-micro@1501b57).
@mansnils, could you please try it with the nightly version and let us know if the problem is fixed as per the above comment? Thanks!
This issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Thank you.
Closing as stale. Please reopen if you'd like to work on this further.
3D output shape support for the Fully_Connected layer under xtensa. The fix to the reference kernels was made with tensorflow/tensorflow#54223 (and the corresponding import from TF Lite to TFLM). TFLM currently does not have a test case for this (corresponding to what was added with tensorflow/tensorflow#54223). BUG=tensorflow/tensorflow#53501
System information
- TensorFlow installed from (source or binary):
Describe the current behavior
Almost all fully connected kernels compute the batch count differently from the int8 reference fully connected kernel. Is this intentional or a bug?
We have seen models (with keep_num_dims=True for fully_connected) where the batch count used by fully connected differs after the model has been converted from float to int8.
find . -name "fully_connected.h" |xargs grep "const int batches"|grep -v "//"
./lite/kernels/internal/reference/fully_connected.h: const int batches = FlatSizeSkipDim(output_shape, output_dims_count - 1);
./lite/kernels/internal/reference/fully_connected.h: const int batches = FlatSizeSkipDim(output_shape, output_dim_count - 1);
./lite/kernels/internal/reference/fully_connected.h: const int batches = FlatSizeSkipDim(output_shape, output_dim_count - 1);
./lite/kernels/internal/reference/fully_connected.h: const int batches = FlatSizeSkipDim(output_shape, output_dim_count - 1);
./lite/kernels/internal/reference/fully_connected.h: const int batches = FlatSizeSkipDim(output_shape, output_dim_count - 1);
./lite/kernels/internal/reference/integer_ops/fully_connected.h: const int batches = output_shape.Dims(0);
./lite/kernels/internal/reference/integer_ops/fully_connected.h: const int batches = FlatSizeSkipDim(output_shape, output_dim_count - 1);
./lite/kernels/internal/optimized/sparse_ops/fully_connected.h: const int batches = FlatSizeSkipDim(output_shape, output_dims_count - 1);
./lite/kernels/internal/optimized/sparse_ops/fully_connected.h: const int batches = thread_end - thread_start;
./lite/kernels/internal/optimized/sparse_ops/fully_connected.h: const int batches =
./lite/kernels/internal/optimized/integer_ops/fully_connected.h: const int batches = FlatSizeSkipDim(output_shape, output_dim_count - 1);
The typical case: const int batches = FlatSizeSkipDim(output_shape, output_dim_count - 1);
In the int8 ref case: const int batches = output_shape.Dims(0);
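To make the difference concrete, here is a minimal standalone sketch. The helper names and the [2, 10, 16] shape are made up for illustration; the real computations live in the fully_connected.h headers listed above.

#include <cstdio>
#include <vector>

// Illustrative stand-ins (not TFLite code) for the two batch computations:
// the first mirrors FlatSizeSkipDim(output_shape, output_dim_count - 1),
// the second mirrors output_shape.Dims(0).
int BatchesBySkipLastDim(const std::vector<int>& dims) {
  int batches = 1;
  for (size_t i = 0; i + 1 < dims.size(); ++i) batches *= dims[i];
  return batches;
}

int BatchesByDim0(const std::vector<int>& dims) {
  return dims[0];
}

int main() {
  // Hypothetical 3D output shape, e.g. from fully_connected with
  // keep_num_dims=True: [batch, time, units] = [2, 10, 16].
  const std::vector<int> output_shape = {2, 10, 16};
  std::printf("float/typical kernels: batches = %d\n",
              BatchesBySkipLastDim(output_shape));  // prints 20
  std::printf("int8 reference kernel: batches = %d\n",
              BatchesByDim0(output_shape));         // prints 2
  return 0;
}

For a 2D output shape such as [2, 16] both formulas agree (batches = 2), which is why the mismatch only shows up when keep_num_dims=True keeps more than two output dimensions.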
Describe the expected behavior
Batches are handled the same way in the int8 and float fully connected kernels.
Contributing
Change const int batches = output_shape.Dims(0);
To
const int output_dims_count = output_shape.DimensionsCount();
const int batches = FlatSizeSkipDim(output_shape, output_dims_count - 1);
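With that change, the 3D example above ([2, 10, 16]) gives batches = 20 in the int8 reference kernel as well, matching the float and optimized kernels.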