Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

google apis might return standard version of Base64 but current generator only support URL-safe version #22

Open
xiaoniu-578fa6bff964d005 opened this issue Feb 21, 2020 · 6 comments

Comments

@xiaoniu-578fa6bff964d005

Current generator only support URL-safe version. ref

It will panic when some apis(like text-to-speech) return standard version of base64 data.

@ggriffiniii
Copy link
Contributor

Thanks for the report. Sadly the text to speech api Discovery document says that resource is a type: "string" and format: "bytes". Google's documentation claims that should always be a urlsafe base64 encoding. Without any other indicator in the discovery document I'm not sure how to generate the correct logic. I'll take a look at what some of the other language binding generators do with this api to see if they've solved this in a reasonable way

@xiaoniu-578fa6bff964d005
Copy link
Author

Seems a lot of libraries under googleapis uses Protobuf, so they don't need to handle base64 problem.

@xiaoniu-578fa6bff964d005
Copy link
Author

I am not quite familiar with the design of this repo. Is there any specific design consideration that discourage us from using grpc interface or using async interface of reqwest?

@ggriffiniii
Copy link
Contributor

A lot of google apis' now support both grpc and rest. grpc uses protocol buffers as the serialization mechanism, while rest apis use json. This generated is only concerned with rest api's (at least for now).

Now back to the issue of base64 encoding. I looked at python and golang and they both treat the data as just a series of bytes and the user is responsible for decoding it themselves. I wanted to be a little more convenient and handle the decoding internally since it was documented clearly that it should be urlsafe. I've thought for a while that we will need to have a mechanism to manually tweak some of the api's during generation to give us a way of handling these non-conforming api's. An alternative in this case is that we could provide a base64 decoder that seamlessly handles both the standard alphabet as well as the urlsafe alphabet, but that doesn't feel like the correct way to handle this to me.

As a stopgap measure to allow you to make progress using the texttospeech api, I would recommend that you use the generator to produce your own version of the texttospeech bindings. You can then hand edit the part that uses google_api_bytes::Bytes to instead just treat it as a Vec and you can handle the decoding yourself just like all the other language bindings do.

Creating the bindings should be simple:

git clone https://github.com/google-apis-rs/generator.git
cd generator
cargo run --example any_api -- --name=texttospeech --version=v1 --output_dir=/tmp

The above commands will put a new crate in /tmp/google_texttospeech_v1/ (you can specify a different directory from /tmp using --output_dir or simply move the directory where you want it after creation). Open /tmp/google_texttospeech_v1/src/lib.rs in your favorite editor and substitute ::google_api_bytes::Bytes with Vec<u8>.

That's it. Now just use the local version of the crate in your app and your version of SynthesizeSpeechResponse will have audio_content that's simply a Vec that you'll need to decode.

@mwilliammyers
Copy link
Member

I agree, handling both alphabets seamlessly seems tricky and not the correct way.

I see the value in doing it both of the ways you suggested—configure or even hard-code per library vs leaving it up to the user. Although, it might make sense to just punt and be consistent with the Python and Go libraries?

@bes
Copy link

bes commented Nov 15, 2020

I followed @ggriffiniii's instructions, and created a library for datastore1. However, I did not have any luck with Vec<u8>.

AFAIK serde can only (de)serialize Vec<u8> from an array of numbers e.g. { "myField": [1, 0, 2, 5] }.

I chose to replace ::google_api_bytes::Bytes with String. Since I don't care about the actual transaction id (yet, anyway) I can just ignore that fact and let it be a String all the time. If I need the actual data later I will pull in Base 64 decoding.

Thanks for the step by step, it worked (almost) flawlessly!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants