This repository was archived by the owner on Jul 7, 2023. It is now read-only.

Fix decoding in prepend mode #1726

Merged: 6 commits into tensorflow:master on Nov 21, 2019

Conversation

senarvi

@senarvi senarvi commented Oct 18, 2019

Changes

  1. As far as I can tell, prepending inputs to the targets is otherwise functional, except that during decoding the inputs are not saved to the "partial_targets" feature. This pull request attempts to fix that.

  2. There are a few extra calls to update_hparams_for_universal_transformer() that cause some Universal Transformer hyperparameter sets to crash. I removed them.

Some background

During training, preprocess_example_common() sets targets to the concatenation of inputs + zero token + targets:

https://github.com/tensorflow/tensor2tensor/blob/v1.14.1/tensor2tensor/data_generators/problem.py#L150

During inference, the same function would set the "partial_targets" feature to the concatenation of inputs + zero token. However, this function is not called during decoding!

The Transformer models already support reading the "partial_targets" feature, so all that was left was to save the inputs to "partial_targets" in the _*_input_tensor_to_features_dict() functions in decoding.py. Let me know if I've missed something and there's a better way to do it.
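The preprocessing described above can be sketched in plain Python (a toy stand-in for the actual TF graph op, assuming token-id lists and 0 as the separator token):

```python
# Toy model of preprocess_example_common's prepend behavior. This is an
# illustration, not the real implementation, which operates on tensors.
def prepend_inputs_to_targets(inputs, targets):
    # Training: targets become inputs + zero separator token + targets.
    return inputs + [0] + targets

def partial_targets(inputs):
    # Inference: "partial_targets" is inputs + zero separator token.
    return inputs + [0]

print(prepend_inputs_to_targets([5, 6, 7], [8, 9]))  # → [5, 6, 7, 0, 8, 9]
print(partial_targets([5, 6, 7]))                    # → [5, 6, 7, 0]
```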

@googlebot added the "cla: yes" (PR author has signed CLA) label on Oct 18, 2019
@senarvi
Author

senarvi commented Oct 18, 2019

I can see testFactoredTensorImplicitConversion failing. I doubt that my small change could have caused that.

The tiling of the partial targets was wrong: (a, b, c, d) --> (a, b, c, d, a, b, c, d)
The correct expansion is: (a, b, c, d) --> (a, a, b, b, c, c, d, d)
This is because the leading dimension is (batch_size * beam_size), not (beam_size * batch_size).

Basically, tf.tile needs to be replaced by tf.repeat, which is introduced in TF 1.15. This commit adds a workaround for TF 1.14.
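The two orderings can be sketched with plain Python lists (tile_like and repeat_like are illustrative names, not TF APIs):

```python
# tf.tile-style repetition copies the whole batch, while tf.repeat-style
# repetition interleaves copies of each element, which is what the
# (batch_size * beam_size) beam-search layout needs.
def tile_like(xs, n):
    # The whole sequence repeated n times.
    return xs * n

def repeat_like(xs, n):
    # Each element repeated n times in place.
    return [x for x in xs for _ in range(n)]

batch, beam_size = ["a", "b", "c", "d"], 2
print(tile_like(batch, beam_size))    # → ['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd']
print(repeat_like(batch, beam_size))  # → ['a', 'a', 'b', 'b', 'c', 'c', 'd', 'd']
```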
@senarvi
Author

senarvi commented Nov 13, 2019

The same problem with CommonLayersTest.testFactoredTensorImplicitConversion appears in all merge requests, so I assume it doesn't come from these changes.

```python
# Workaround for:
#   tf.one_hot(tf.repeat(partial_targets[:, i], [beam_size]),
#              vocab_size, 0.0, -1e9)
# Can be replaced by the above in future versions (from tf 1.15).
```
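The TF 1.14 workaround amounts to emulating tf.repeat with an expand/tile/reshape sequence. Sketched here with NumPy (an assumption about the exact ops used; the shapes are the point):

```python
import numpy as np

def repeat_workaround(x, repeats):
    # Emulate a per-element repeat of a 1-D array without tf.repeat:
    # add a trailing axis, tile along it, then flatten row by row.
    expanded = x[:, None]                    # shape (n, 1)
    tiled = np.tile(expanded, (1, repeats))  # shape (n, repeats)
    return tiled.reshape(-1)                 # shape (n * repeats,)

print(repeat_workaround(np.array([1, 2, 3]), 2).tolist())  # → [1, 1, 2, 2, 3, 3]
```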
Contributor


Hi @senarvi -- thanks for the PR, I'm going to be releasing on TF 1.15 really soon.

Can you "flip" the comments, i.e. make the 1.15 one main and put the other in a comment?

Also, a brief explanation of why this workaround is needed would be great.

```diff
@@ -957,6 +964,13 @@ def _decode_input_tensor_to_features_dict(feature_map, hparams):
   features["decode_length"] = (
       IMAGE_DECODE_LENGTH if input_is_image else tf.shape(x)[1] + 50)
   features["inputs"] = x
+  # Save inputs to "partial_targets" when prepending inputs to targets. Also
+  # keep "inputs" as some models crash if they don't exist.
+  if getattr(hparams, "prepend_mode", "none") != "none":
```
Contributor


```python
if hasattr(hparams, "prepend_mode"):
  ...
```

seems more idiomatic Python, here and above
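For what it's worth, the two checks are not quite equivalent when prepend_mode exists but is explicitly "none". A small sketch (Hparams here is a hypothetical stand-in for the real hparams object):

```python
class Hparams:
    """Hypothetical stand-in for the real hparams object."""
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

explicit_none = Hparams(prepend_mode="none")

# getattr with a default tests both existence and value:
print(getattr(explicit_none, "prepend_mode", "none") != "none")  # → False

# hasattr only tests existence, so it would treat prepend_mode="none"
# as prepending being enabled:
print(hasattr(explicit_none, "prepend_mode"))  # → True
```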

Contributor

@afrozenator afrozenator left a comment


Hi @senarvi -- Thanks for the PR, a few minor things and then I'll be able to merge this in. Thanks!

@afrozenator
Contributor

Thanks @senarvi -- I'll fix the changes on my end!

@afrozenator afrozenator merged commit c825d12 into tensorflow:master Nov 21, 2019
afrozenator added a commit that referenced this pull request Nov 21, 2019
@afrozenator
Contributor

Hi @senarvi -- Actually this makes both transformer_test and evolved_transformer_test fail on my end for the same reason; can you PTAL and fix?

[  FAILED  ] TransformerTest.testSlowVsFastNoInput
[ RUN      ] TransformerTest.test_session
[  SKIPPED ] TransformerTest.test_session
======================================================================
ERROR: testSlowVsFastNoInput (__main__.TransformerTest)
testSlowVsFastNoInput (__main__.TransformerTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File ".../absl/third_party/unittest3_backport/case.py", line 37, in testPartExecutor
    yield
  File ".../absl/third_party/unittest3_backport/case.py", line 162, in run
    testMethod()
  File ".../tensor2tensor/models/transformer_test.py", line 180, in testSlowVsFastNoInput
    fast_result = model._greedy_infer(features, decode_length)["outputs"]
  File ".../tensor2tensor/models/transformer.py", line 327, in _greedy_infer
    return self._fast_decode(features, decode_length)
  File ".../tensor2tensor/models/transformer.py", line 877, in _fast_decode
    cache=att_cache)
  File ".../tensor2tensor/models/transformer.py", line 1237, in fast_decode
    tf.TensorShape([None]),
  File "...//tensorflow/python/ops/control_flow_ops.py", line 2754, in while_loop
    return_same_structure)
  File "...//tensorflow/python/ops/control_flow_ops.py", line 2246, in BuildLoop
    pred, body, original_loop_vars, loop_vars, shape_invariants)
  File "...//tensorflow/python/ops/control_flow_ops.py", line 2171, in _BuildLoop
    body_result = body(*packed_vars_for_body)
  File ".../tensor2tensor/models/transformer.py", line 1192, in inner_loop
    logits, cache = symbols_to_logits_fn(next_id, i, cache)
  File ".../tensor2tensor/models/transformer.py", line 856, in symbols_to_logits_fn
    tf.less(i, partial_targets_length), forced_logits, lambda: ret)
  File "...//tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "...//tensorflow/python/ops/control_flow_ops.py", line 1225, in cond
    orig_res_t, res_t = context_t.BuildCondBranch(true_fn)
  File "...//tensorflow/python/ops/control_flow_ops.py", line 1062, in BuildCondBranch
    original_result = fn()
  File ".../tensor2tensor/models/transformer.py", line 853, in forced_logits
    partial_targets[:, i], [beam_size]), vocab_size, 0.0, -1e9)
  File "...//tensorflow/python/ops/array_ops.py", line 5372, in repeat
    return repeat_with_axis(input, repeats, axis, name)
  File "...//tensorflow/python/ops/array_ops.py", line 5246, in repeat_with_axis
    data.shape.dims[axis].assert_is_compatible_with(repeats.shape[0])
  File "...//tensorflow/python/framework/tensor_shape.py", line 277, in assert_is_compatible_with
    (self, other))
ValueError: Dimensions 3 and 1 are not compatible
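The shape check behind this error can be modeled in plain Python (repeat_1d is a hypothetical sketch, not TF code): tf.repeat received the length-1 repeats list [beam_size] for a batch dimension of 3, and TF's repeat_with_axis requires a repeats list to match the repeated dimension, while a scalar would broadcast.

```python
def repeat_1d(xs, repeats):
    # Hypothetical pure-Python model of tf.repeat on a 1-D input.
    if isinstance(repeats, int):
        repeats = [repeats] * len(xs)   # a scalar broadcasts to every element
    if len(repeats) != len(xs):         # a repeats list must match the input length
        raise ValueError(
            "Dimensions %d and %d are not compatible" % (len(xs), len(repeats)))
    return [x for x, r in zip(xs, repeats) for _ in range(r)]

print(repeat_1d(["a", "b", "c"], 2))   # → ['a', 'a', 'b', 'b', 'c', 'c']
# repeat_1d(["a", "b", "c"], [2])      # raises ValueError: Dimensions 3 and 1 ...
```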

afrozenator added a commit that referenced this pull request Nov 21, 2019
@afrozenator
Contributor

Sorry, reverted the revert -- this is good to go!

tensorflow-copybara pushed a commit that referenced this pull request Nov 21, 2019
PiperOrigin-RevId: 281806525