Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Benchmarking loading JSON file vs YML file #2897

Merged

Conversation

salochara
Copy link
Contributor

@salochara salochara commented Jan 26, 2024

Motivation / Background

Continues with #2851

This Pull Request has been created because we are comparing loading times between JSON and YML files, as suggested by @thdaraujo.
It turns out that i18n already supports JSON files!

Here's a screenshot on the execution times.
While loading with YML file, took 0.320253 seconds, loading the translation with JSON only took 0.019719 seconds. Meaning, it is waay faster, approximately ~%175 percent faster!

CleanShot 2024-01-25 at 17 40 27@2x

Additional information

Also, I did some testing on my machine and ran the TestEsMxLocale test by loading the YML file and then ran it again with the JSON file, and the execution time is significantly faster when passing a JSON.

As a note, the reasoning for choosing YAML.load_file() and JSON.load_file() methods in the task, is because those are the methods that i18n is using.

load_yaml method: https://github.com/ruby-i18n/i18n/blob/d1762c6c09e50cb1ed32cf2b4f0f525a0531094b/lib/i18n/backend/base.rb#L253

load_json method: https://github.com/ruby-i18n/i18n/blob/d1762c6c09e50cb1ed32cf2b4f0f525a0531094b/lib/i18n/backend/base.rb#L271

I believe this path can lead us to make faker much much faster!
Please, let me know what you guys think! @thdaraujo @stefannibrasil

All the best!
Salo.

Checklist

Before submitting the PR make sure the following are checked:

  • This Pull Request is related to one change. Changes that are unrelated should be opened in separate PRs.
  • Commit message has a detailed description of what changed and why. If this PR fixes a related issue include it in the commit message. Ex: [Fix #issue-number]
  • Tests are added or updated if you fix a bug, refactor something, or add a feature.
  • Tests and Rubocop are passing before submitting your proposed changes.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
@stefannibrasil
Copy link
Contributor

Just wanted to say that this is in my todo, will get to it soon. Thanks for the PR description!

@thdaraujo
Copy link
Contributor

Hey @salochara , sorry for the late reply!

This is pretty exciting, thanks for doing the investigation. It looks very promising. I'll play around with this idea and get back to you soon.

I've been thinking about how we could approach this, and the potential ways are:
1- maybe convert all yaml locales to json, and only work with json from now on (could break other apps that depend on faker locales)
2- adding a build step that does the conversion when we build the gem and get rid of yaml before packaging the gem (would need to run tests against the package to make sure it all still works)

There are pros and cons to both approaches, so I have to think about it.

@psibi
Copy link
Member

psibi commented Feb 7, 2024

I have written some details on how I made the Haskell fakedata (which uses the same yaml file) faster here: https://psibi.in/posts/2019-09-23-fakedata-perf.html IIRC, the overhead involved was in parsing the yaml file and the IO overhead of reading yaml file was negligible.

@salochara
Copy link
Contributor Author

Hello! 👋
Hope you guys are doing great @thdaraujo @stefannibrasil

Let me know what are your thought on this PR,
I'm looking forward to keep on contributing to faker!

Copy link
Contributor

@thdaraujo thdaraujo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing the investigation, this is very cool!

Let's move the json file and use it as a test fixture for now.

lib/locales/es-MX.json Outdated Show resolved Hide resolved
tasks/benchmark.rake Outdated Show resolved Hide resolved
@salochara
Copy link
Contributor Author

Thanks for your review @thdaraujo
I'm currently traveling 🌴... I'll work on this next week!

@salochara
Copy link
Contributor Author

@thdaraujo
Hello! 👋 I'm back.

I've addressed your comments.
Let me know what you think!

@salochara salochara requested a review from thdaraujo March 7, 2024 23:59
Copy link
Contributor

@thdaraujo thdaraujo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, thank you! The performance improvement is quite impressive.

@thdaraujo thdaraujo merged commit 6754225 into faker-ruby:main Mar 17, 2024
8 checks passed
@salochara
Copy link
Contributor Author

Hi @thdaraujo
Hope you're doing great.

Just wondering if you're thinking on moving forward with this?
I believe that the time performance is quite good!

Let me know what you think.
All the best!
Salo.

@thdaraujo
Copy link
Contributor

Hey @salochara , sorry for the late reply!

Do you want to investigate how much work it would take to convert all the yaml files to json?
Maybe we could have a script that converts all yaml files to json.

If we have that, I think we could potentially package the gem without the yaml files and only keep the json files, or set up i18n to only load json files. I haven't investigated this idea yet, just throwing it out there to see what you think.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants