Conversation
Fixes hyperparameter sets universal_transformer_big and universal_transformer_base_tpu.
I can see
The dimensions of the tiled partial targets were wrong: (a, b, c, d) --> (a, b, c, d, a, b, c, d). The correct expansion is (a, b, c, d) --> (a, a, b, b, c, c, d, d), because the layout is (batch_size * beam_size), not (beam_size * batch_size). Essentially, tf.tile needs to be replaced by tf.repeat, which was introduced in TF 1.15. This is a workaround for TF 1.14.
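For illustration, a minimal sketch of the difference between the two ops (the values are made up, not taken from the PR):

```python
import tensorflow as tf

x = tf.constant([1, 2, 3, 4])  # e.g. one partial-target token per batch item

# tf.tile appends whole copies of the tensor: [1, 2, 3, 4, 1, 2, 3, 4]
wrong = tf.tile(x, [2])

# tf.repeat (available from TF 1.15) repeats each element in place,
# matching the (batch_size * beam_size) layout: [1, 1, 2, 2, 3, 3, 4, 4]
right = tf.repeat(x, 2)
```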
The same problem with
# Workaround for: tf.one_hot(
#     tf.repeat(partial_targets[:, i], [beam_size]), vocab_size, 0.0,
#     -1e9)
# Can be replaced by the above in future versions (from tf 1.15).
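The TF 1.14 replacement itself is not quoted here; a plausible equivalent of the commented-out tf.repeat call is the usual expand-tile-reshape pattern (a sketch assuming partial_targets is a 2-D int tensor and beam_size a Python int, not the exact code from the diff):

```python
# Sketch of a TF 1.14 substitute for
# tf.repeat(partial_targets[:, i], [beam_size]):
column = partial_targets[:, i]                           # shape [batch_size]
repeated = tf.reshape(
    tf.tile(tf.expand_dims(column, 1), [1, beam_size]),  # [batch_size, beam_size]
    [-1])                                                # [batch_size * beam_size]
forced_logits = tf.one_hot(repeated, vocab_size, 0.0, -1e9)
```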
Hi @senarvi -- thanks for the PR, I'm going to be releasing on TF 1.15 really soon.
Can you "flip" the comments, i.e. make the 1.15 one main and put the other in a comment?
Also, a brief explanation of why this workaround is needed would be great.
@@ -957,6 +964,13 @@ def _decode_input_tensor_to_features_dict(feature_map, hparams):
   features["decode_length"] = (
       IMAGE_DECODE_LENGTH if input_is_image else tf.shape(x)[1] + 50)
   features["inputs"] = x
+  # Save inputs to "partial_targets" when prepending inputs to targets. Also
+  # keep "inputs" as some models crash if they don't exist.
+  if getattr(hparams, "prepend_mode", "none") != "none":
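The quoted hunk stops at the if statement, so the body of the new block is not shown. A minimal sketch of the intent, assuming x is a [batch, length] int tensor of input ids (the helper name and the shapes are assumptions, not the actual diff):

```python
import tensorflow as tf

def _maybe_add_partial_targets(features, x, hparams):
  """Hypothetical helper mirroring the added lines above."""
  features["inputs"] = x
  if getattr(hparams, "prepend_mode", "none") != "none":
    # Mirror preprocess_example_common(): expose the inputs plus the zero
    # separator token to the decoder as "partial_targets".
    zero_separator = tf.zeros([tf.shape(x)[0], 1], dtype=x.dtype)
    features["partial_targets"] = tf.concat([x, zero_separator], axis=1)
  return features
```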
if hasattr(hparams, "prepend_mode"):
...
seems more idiomatic python, here and above
Hi @senarvi -- Thanks for the PR, a few minor things and then I'll be able to merge this in. Thanks!
Thanks @senarvi -- I'll fix the changes on my end!
This reverts commit c825d12.
Hi @senarvi -- Actually this makes both transformer_test and evolved_transformer_test fail on my end for the same reason, can you PTAL and fix?
Sorry, reverted the revert -- this is good to go!
PiperOrigin-RevId: 281806525
Changes
As far as I can tell, prepending inputs to the targets is otherwise functional, except that during decoding the inputs are not saved to the "partial_targets" feature. This pull request attempts to fix that.
There are a few extra calls to update_hparams_for_universal_transformer() that cause some Universal Transformer hyperparameter sets to crash. I removed them.

Some background
During training, preprocess_example_common() sets targets to the concatenation of inputs + zero token + targets: https://github.com/tensorflow/tensor2tensor/blob/v1.14.1/tensor2tensor/data_generators/problem.py#L150
It looks like during inference the function sets the feature "partial_targets" to the concatenation of inputs + zero token. However, this function is not called during decoding!
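A condensed paraphrase of the prepend logic in preprocess_example_common() (see the link above; details may differ from the actual function):

```python
import tensorflow as tf

def prepend_inputs(example, mode, hparams):
  # Paraphrased prepend logic; not a verbatim copy of preprocess_example_common().
  if hparams.prepend_mode != "none":
    # Zero separator token, matching the dtype of the input ids.
    sep = tf.zeros([1], dtype=example["inputs"].dtype)
    if mode == tf.estimator.ModeKeys.PREDICT:
      # Inference: the decoder should continue from inputs + separator.
      example["partial_targets"] = tf.concat([example["inputs"], sep], 0)
    else:
      # Training: targets become inputs + separator + original targets.
      example["targets"] = tf.concat(
          [example["inputs"], sep, example["targets"]], 0)
  return example
```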
The Transformer models already support reading the "partial_targets" feature, so all that was left was to save the inputs to "partial_targets" in the _*_input_tensor_to_features_dict() functions in decoding.py. Or let me know if I've missed something and there's a better way to do it.