Summarization #9
base: summarization
Conversation
Mainly about data preprocessing.
Add data2idx.
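For reference, a minimal sketch of what a data2idx-style preprocessing helper might look like (the names `build_vocab`, `data2idx`, and the special tokens below are placeholders, not taken from this PR): map whitespace-tokenized article/abstract text to vocabulary indices.

```python
from collections import Counter

UNK, PAD = '<unk>', '<pad>'

def build_vocab(texts, max_size=50000):
    """Count whitespace tokens and keep the most frequent ones."""
    counts = Counter(tok for text in texts for tok in text.split())
    vocab = {PAD: 0, UNK: 1}
    for tok, _ in counts.most_common(max_size):
        vocab.setdefault(tok, len(vocab))
    return vocab

def data2idx(text, vocab):
    """Convert one tokenized string to a list of vocabulary indices."""
    return [vocab.get(tok, vocab[UNK]) for tok in text.split()]

vocab = build_vocab(["the cat sat", "the dog ran"])
print(data2idx("the cat ran away", vocab))  # unseen token 'away' maps to <unk>
```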
# NotImplemented
delete the unused code
return article, abstract
def write2file(url_file, out_file):
Delete the processed datasets from the PR; eventually they will go to S3.
class Seq2SeqEncoder(Block):
    pass
class SUMEncoder(Seq2SeqEncoder):
This class is the seq2seq+attention, right?
Already deleted the unused code and added loss.py as well as decode.py.
Removed datafile and seq2seq debug
change data transformer
fixed context vector shape and beamsearch
Update seq2seq + attention
__all__ = ['Seq2SeqEncoder', 'Seq2SeqDecoder', 'SUMEncoder', 'SUMDecoder', 'get_summ_encoder_decoder']
class Seq2SeqEncoder(Block):
Remove Seq2SeqEncoder
prefix=None, params=None):
super(SUMEncoder, self).__init__(prefix=prefix, params=params)
self.hidden_size = hidden_size
with self.name_scope():
""" | ||
_, length, _ = inputs.shape | ||
|
||
outputs, new_state = self.rnn_cells[0].unroll( |
layout = 'TNC'
Instead of unroll, use the rnn forward call; the output and output states will be generated accordingly.
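A minimal sketch of this suggestion, assuming a single-layer LSTM and placeholder dimensions: `rnn.LSTM` with `layout='TNC'` consumes the whole sequence in one forward call and returns both the outputs and the final states, so no explicit unroll is needed.

```python
import mxnet as mx
from mxnet.gluon import rnn

seq_len, batch_size, embed_dim, hidden_size = 35, 4, 128, 256

lstm = rnn.LSTM(hidden_size, num_layers=1, layout='TNC')
lstm.initialize()

inputs = mx.nd.random.uniform(shape=(seq_len, batch_size, embed_dim))  # TNC layout
begin_state = lstm.begin_state(batch_size=batch_size)
outputs, new_state = lstm(inputs, begin_state)
print(outputs.shape)  # (seq_len, batch_size, hidden_size)
```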
return [outputs, new_state]
class Attention(HybridBlock):
Ignore convolution in the encoder for attention for now
class Seq2SeqDecoder(Block):
Remove Seq2SeqDecoder
def forward(self, step_input, states):
    raise NotImplementedError
class SUMDecoder(Seq2SeqDecoder):
Change the name
Decode function: use attention over the encoder outputs to compute the step output.
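A rough sketch of such a decoding step, assuming gluonnlp's `MLPAttentionCell` and placeholder shapes (not the PR's exact code): the current decoder state queries the encoder outputs, and the resulting context vector is concatenated with the state before the output projection.

```python
import mxnet as mx
from gluonnlp.model.attention_cell import MLPAttentionCell

batch_size, src_len, hidden_size = 4, 35, 256
enc_outputs = mx.nd.random.uniform(shape=(batch_size, src_len, hidden_size))
dec_state = mx.nd.random.uniform(shape=(batch_size, 1, hidden_size))  # query for this step

attention_cell = MLPAttentionCell(units=2 * hidden_size, normalized=False)
attention_cell.initialize()

# context_vec: (batch, 1, hidden), att_weights: (batch, 1, src_len)
context_vec, att_weights = attention_cell(dec_state, enc_outputs, enc_outputs)
step_input = mx.nd.concat(dec_state, context_vec, dim=2)  # fed to the RNN / output layer
```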
print("vocab_lenght: ", len(self.vocab)) | ||
self.attention_cell = MLPAttentionCell(units=2*self._hidden_size, normalized=False, prefix= 'attention_') | ||
with self.name_scope(): |
Change to an LSTM layer similar to the encoder (see the sketch after this hunk).
)
)

with self.name_scope():
Merge the name_scope blocks together.
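A minimal sketch combining the two suggestions above (an `rnn.LSTM` layer mirroring the encoder, and a single `name_scope` for all child blocks); the class name and hyper-parameters are placeholders, not the PR's actual SUMDecoder:

```python
from mxnet.gluon import Block, nn, rnn
from gluonnlp.model.attention_cell import MLPAttentionCell

class SUMDecoderSketch(Block):
    """Illustrative skeleton only."""
    def __init__(self, vocab_size, embed_size, hidden_size, prefix=None, params=None):
        super(SUMDecoderSketch, self).__init__(prefix=prefix, params=params)
        self._hidden_size = hidden_size
        with self.name_scope():  # one scope for every child block
            self.embedding = nn.Embedding(vocab_size, embed_size)
            self.rnn = rnn.LSTM(hidden_size, num_layers=1, layout='NTC')
            self.attention_cell = MLPAttentionCell(units=2 * hidden_size,
                                                   normalized=False, prefix='attention_')
```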
@@ -0,0 +1,43 @@
import numpy as np
Remove sequence loss
We need to improve the code
loss_function = SoftmaxCELoss()
loss_function.initialize(init=mx.init.Uniform(0.02), ctx=ctx)
loss_function.hybridize()
remove hybridize
loss_function.hybridize()
# print "#56"
model.initialize(init=mx.init.Uniform(0.02), ctx=ctx)
model.hybridize()
remove hybridize
model.save_params(save_path)
# raise Exception("Save Model!")

# ## TODO: evaluation and rouge
add evaluation
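As a starting point, a toy ROUGE-1 recall implementation (a real evaluation would use a proper ROUGE package with ROUGE-2/ROUGE-L, stemming, and multiple references):

```python
from collections import Counter

def rouge_1_recall(reference_tokens, summary_tokens):
    """Fraction of reference unigrams that also appear in the generated summary."""
    ref_counts = Counter(reference_tokens)
    sum_counts = Counter(summary_tokens)
    overlap = sum(min(c, sum_counts[tok]) for tok, c in ref_counts.items())
    return overlap / max(1, sum(ref_counts.values()))

# 3 of the 4 reference unigrams are covered -> 0.75
print(rouge_1_recall('the cat sat down'.split(), 'the cat sat'.split()))
```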
model = SummarizationModel(vocab=my_vocab, encoder=encoder, decoder=decoder, hidden_dim=args.hidden_dim, embed_size=args.embedding_dim, prefix='summary_')
loss_function = SoftmaxCELoss()
change to SoftmaxCrossEntropy
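Presumably this refers to Gluon's built-in `SoftmaxCrossEntropyLoss`; a minimal sketch with placeholder shapes:

```python
import mxnet as mx
from mxnet.gluon.loss import SoftmaxCrossEntropyLoss

batch_size, tgt_len, vocab_size = 4, 10, 50000
loss_function = SoftmaxCrossEntropyLoss(axis=-1, sparse_label=True)

logits = mx.nd.random.uniform(shape=(batch_size, tgt_len, vocab_size))
labels = mx.nd.random.randint(0, vocab_size, shape=(batch_size, tgt_len)).astype('float32')
loss = loss_function(logits, labels)  # shape (batch_size,), averaged over the time axis
```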
Remove bucketing; use our own dataloader.
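A sketch of what the non-bucketing pipeline could look like, assuming the dataset yields (article_ids, abstract_ids) pairs of unequal length and using gluonnlp's batchify helpers (batch size and pad values are placeholders):

```python
import gluonnlp as nlp
from mxnet import gluon

# toy dataset of (article_ids, abstract_ids) pairs
dataset = gluon.data.SimpleDataset([([1, 2, 3, 4], [5, 6]),
                                    ([7, 8], [9, 10, 11])])

batchify_fn = nlp.data.batchify.Tuple(
    nlp.data.batchify.Pad(pad_val=0),   # pad article ids within the batch
    nlp.data.batchify.Pad(pad_val=0))   # pad abstract ids within the batch

data_loader = gluon.data.DataLoader(dataset, batch_size=2, shuffle=True,
                                    batchify_fn=batchify_fn)
for articles, abstracts in data_loader:
    print(articles.shape, abstracts.shape)  # (2, 4) and (2, 3) after padding
```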
@@ -0,0 +1,343 @@
import mxnet as mx
Add test cases for encoder and decoder following https://github.com/dmlc/gluon-nlp/blob/master/tests/unittest/test_convolutional_encoder.py
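A sketch of what such a test could look like, in the style of the linked gluon-nlp tests; the SUMEncoder constructor signature and import path are assumptions, not taken from this PR:

```python
import mxnet as mx
import pytest

# from <this PR's module> import SUMEncoder  # import path depends on the final layout

@pytest.mark.parametrize('batch_size,seq_len', [(1, 10), (4, 35)])
def test_sum_encoder_output_shape(batch_size, seq_len):
    hidden_size, embed_dim = 256, 128
    encoder = SUMEncoder(hidden_size=hidden_size)   # assumed constructor arguments
    encoder.initialize()
    inputs = mx.nd.random.uniform(shape=(batch_size, seq_len, embed_dim))
    outputs, state = encoder(inputs)                # encoder returns [outputs, new_state]
    assert outputs.shape == (batch_size, seq_len, hidden_size)
```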
update
* Loader (#9)
  * use DatasetLoader
  * fix lint
  * fix bug
  * fix lint
  * fix bug
  * fix bug
  * fix lint
  * fix argument
  * skip test
  * Update test_scripts.py
  * fix bug
  * fix a bug
  * move glob to utils
  * remove amp monkey patch
  * remove .DS_Store from repo
  * fix glue test filename
  * remove root option in the Glue Task interface
  * lint fix
  * fix lint