This repository was archived by the owner on Jan 15, 2024. It is now read-only.

[MODEL] BERT conversion scripts, SciBERT, BioBERT, ClinicalBERT #735

Merged 18 commits into dmlc:master from refactorbertconversioncode on Jun 8, 2019


Contributor

@leezu leezu commented May 29, 2019

This refactors the TensorFlow BERT conversion scripts, building on #732.

@eric-haibin-lin

@leezu leezu requested a review from eric-haibin-lin May 29, 2019 15:58
@leezu leezu requested a review from szha as a code owner May 29, 2019 15:58

codecov bot commented May 29, 2019

Codecov Report

❗ No coverage uploaded for pull request head (refactorbertconversioncode@ef74887).
The diff coverage is n/a.


codecov bot commented May 29, 2019

Codecov Report

Merging #735 into master will increase coverage by 0.65%.
The diff coverage is 100%.


@@            Coverage Diff             @@
##           master     #735      +/-   ##
==========================================
+ Coverage   89.95%   90.61%   +0.65%     
==========================================
  Files          64       64              
  Lines        6064     6064              
==========================================
+ Hits         5455     5495      +40     
+ Misses        609      569      -40

Impacted Files                            Coverage Δ
src/gluonnlp/data/utils.py                74.14% <ø> (ø) ⬆️
src/gluonnlp/model/utils.py               76.72% <100%> (ø) ⬆️
src/gluonnlp/model/bert.py                99.27% <100%> (ø) ⬆️
src/gluonnlp/data/dataloader.py           83.62% <0%> (-5.18%) ⬇️
src/gluonnlp/data/stream.py               89.61% <0%> (+0.54%) ⬆️
src/gluonnlp/model/sequence_sampler.py    92.22% <0%> (+15.9%) ⬆️

Contributor Author

leezu commented May 29, 2019

@eric-haibin-lin this PR only touches code in the script folder. All other changes are due to #732 and can be ignored while reviewing this.

@leezu leezu force-pushed the refactorbertconversioncode branch 3 times, most recently from 2c7a7ba to 501189c June 4, 2019 19:20
Member

mli commented Jun 4, 2019

Job PR-735/4 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-735/4/index.html

@szha szha added the "release focus" label (Progress focus for release) Jun 5, 2019
@leezu leezu force-pushed the refactorbertconversioncode branch from 501189c to d3a9c68 June 5, 2019 15:28
Member

mli commented Jun 5, 2019

Job PR-735/5 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-735/5/index.html

@leezu leezu force-pushed the refactorbertconversioncode branch from d3a9c68 to 20950a7 June 6, 2019 18:50
@leezu leezu force-pushed the refactorbertconversioncode branch from 20950a7 to c32f538 June 6, 2019 18:55
Member

mli commented Jun 6, 2019

Job PR-735/7 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-735/7/index.html

Member

mli commented Jun 7, 2019

Job PR-735/8 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-735/8/index.html

leezu added 3 commits June 7, 2019 13:40
Confirmed that the scibert_scivocab_uncased model loaded from PyTorch produces the
same output as the TensorFlow version (checked with compare_tf_gluon_model.py).
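
The compare_tf_gluon_model.py script itself is not shown in this conversation. As a rough illustration of the kind of check the commit message describes, the sketch below compares two model outputs numerically. It assumes both outputs have already been exported as NumPy arrays; the function name `outputs_match`, the tolerances, and the stand-in data are hypothetical and not taken from the script.

```python
import numpy as np


def outputs_match(reference, candidate, rtol=1e-5, atol=1e-6):
    """Return (match, max_abs_diff) for two model outputs of identical shape.

    Hypothetical helper: the tolerances are illustrative defaults, not the
    values used by compare_tf_gluon_model.py.
    """
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    if reference.shape != candidate.shape:
        raise ValueError(
            "shape mismatch: %s vs %s" % (reference.shape, candidate.shape))
    # Largest element-wise deviation, useful for reporting.
    max_abs_diff = float(np.max(np.abs(reference - candidate)))
    # Element-wise check: |candidate - reference| <= atol + rtol * |reference|
    match = bool(np.allclose(candidate, reference, rtol=rtol, atol=atol))
    return match, max_abs_diff


# Stand-in data in place of real model outputs (batch, sequence, hidden).
rng = np.random.default_rng(0)
tf_output = rng.standard_normal((2, 8, 768)).astype(np.float32)
gluon_output = tf_output + np.float32(1e-7)  # simulated float rounding noise

match, max_abs_diff = outputs_match(tf_output, gluon_output)
print("outputs match:", match, "max abs diff:", max_abs_diff)
```

Reporting the maximum absolute difference alongside the boolean makes it easier to tell ordinary floating-point noise from a genuinely mis-converted weight.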
@szha szha changed the title from "Refactor tf bert conversion scripts based on flexible vocab" to "[MODEL] BERT conversion scripts, SciBERT, BioBERT, ClinicalBERT" Jun 7, 2019
@leezu leezu force-pushed the refactorbertconversioncode branch from d229a41 to 5614e5b June 7, 2019 20:11
Member

mli commented Jun 7, 2019

Job PR-735/14 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-735/14/index.html

Member

@eric-haibin-lin eric-haibin-lin left a comment

nice work!

Member

mli commented Jun 8, 2019

Job PR-735/15 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-735/15/index.html

Member

mli commented Jun 8, 2019

Job PR-735/16 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-735/16/index.html

Member

mli commented Jun 8, 2019

Job PR-735/17 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-735/17/index.html

@szha szha merged commit 847c415 into dmlc:master Jun 8, 2019
@leezu leezu deleted the refactorbertconversioncode branch June 9, 2019 10:18