This repository was archived by the owner on Jul 7, 2023. It is now read-only.

Documentation for creating own model #1589

Merged
merged 4 commits into from
Jun 7, 2019
97 changes: 94 additions & 3 deletions docs/new_model.md
Expand Up @@ -5,12 +5,103 @@ version](https://badge.fury.io/py/tensor2tensor.svg)](https://badge.fury.io/py/t
[![GitHub
Issues](https://img.shields.io/github/issues/tensorflow/tensor2tensor.svg)](https://github.com/tensorflow/tensor2tensor/issues)
[![Contributions
welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg)](CONTRIBUTING.md)
welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg)](../CONTRIBUTING.md)
[![Gitter](https://img.shields.io/gitter/room/nwjs/nw.js.svg)](https://gitter.im/tensor2tensor/Lobby)
[![License](https://img.shields.io/badge/License-Apache%202.0-brightgreen.svg)](https://opensource.org/licenses/Apache-2.0)

Here we show how to create your own model in T2T.

## The T2TModel class
## The T2TModel class - abstract base class for models

TODO: complete.
`T2TModel` has three typical usages:

1. Estimator: The method `make_estimator_model_fn` builds a `model_fn` for
the tf.Estimator workflow of training, evaluation, and prediction.
It runs the method `call`, which performs the core computation,
followed by `estimator_spec_train`, `estimator_spec_eval`, or
`estimator_spec_predict` depending on the tf.Estimator mode.
2. Layer: The method `call` enables `T2TModel` to be used as a callable by
itself. It calls the following methods:

* `bottom`, which transforms features according to `problem_hparams`' input
and target `Modality`s;
* `body`, which takes features and performs the core model computation to
return output and any auxiliary loss terms;
* `top`, which takes features and the body output, and transforms them
according to `problem_hparams`' input and target `Modality`s to return
the final logits;
* `loss`, which takes the logits, forms any missing training loss, and sums
all loss terms.
3. Inference: The method `infer` enables `T2TModel` to make sequence
predictions by itself.
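
The call order described for the Layer usage can be illustrated with a toy, framework-free sketch. `ToyModel` and its arithmetic are hypothetical stand-ins chosen for illustration; only the method names mirror `T2TModel`, and the real methods operate on Tensors and `Modality` objects rather than Python lists.

```python
# Toy illustration of the bottom -> body -> top -> loss call order.
# ToyModel is hypothetical; it is not part of tensor2tensor.


class ToyModel:

  def bottom(self, features):
    # Stand-in for the modality-specific input transform: scale the inputs.
    return {"inputs": [2.0 * v for v in features["inputs"]],
            "targets": features["targets"]}

  def body(self, features):
    # Core computation; returns the output plus an auxiliary loss term.
    output = [v + 1.0 for v in features["inputs"]]
    return output, {"extra": 0.5}

  def top(self, body_output, features):
    # Stand-in for the target modality's projection to logits.
    return [10.0 * v for v in body_output]

  def loss(self, logits, features):
    # Mean squared error against the targets.
    errors = [(l - t) ** 2 for l, t in zip(logits, features["targets"])]
    return sum(errors) / len(errors)

  def __call__(self, features):
    transformed = self.bottom(features)
    output, losses = self.body(transformed)
    logits = self.top(output, transformed)
    losses["training"] = self.loss(logits, transformed)
    return logits, losses


logits, losses = ToyModel()({"inputs": [1.0, 2.0], "targets": [30.0, 40.0]})
total_loss = sum(losses.values())  # training loss plus auxiliary terms
```

When subclassing `T2TModel`, usually only `body` needs to be overridden; the other steps come from the base class.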


## Creating your own model

1. Create a class that extends `T2TModel`.
In this example it is a copy of the existing basic fully connected network:
```python
from tensor2tensor.utils import t2t_model

class MyFC(t2t_model.T2TModel):
  pass
```

2. Implement the `body` method:
```python
import tensorflow as tf  # TensorFlow 1.x API (tf.layers), as used by T2T

from tensor2tensor.layers import common_layers
from tensor2tensor.utils import t2t_model


class MyFC(t2t_model.T2TModel):

  def body(self, features):
    hparams = self.hparams
    x = features["inputs"]
    shape = common_layers.shape_list(x)
    # Flatten the input; T2T feature tensors are 4D.
    x = tf.reshape(x, [-1, shape[1] * shape[2] * shape[3]])
    for i in range(hparams.num_hidden_layers):  # Create the hidden layers.
      x = tf.layers.dense(x, hparams.hidden_size, name="layer_%d" % i)
      x = tf.nn.dropout(x, keep_prob=1.0 - hparams.dropout)
      x = tf.nn.relu(x)
    # Expand back to 4D, as T2T expects.
    return tf.expand_dims(tf.expand_dims(x, axis=1), axis=1)
```

Method signature of `body`:
* Args:
  * features: dict of str to Tensor, where each Tensor has shape [batch_size,
    ..., hidden_size]. It typically contains the keys `inputs` and `targets`.

* Returns either `output` alone or a tuple `(output, losses)`:
  * output: Tensor of pre-logit activations with shape [batch_size, ...,
    hidden_size].
  * losses: Either a single loss as a scalar, a list, a Tensor (to be averaged),
    or a dictionary of losses. If losses is a dictionary with the key
    "training", losses["training"] is considered the final training
    loss and output is considered logits; self.top and self.loss will
    be skipped.
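
The accepted return shapes can be sketched with a small helper that turns either form into an `(output, losses)` pair. `normalize_body_output` is a hypothetical name used here for illustration, and plain floats stand in for Tensors; the sketch assumes that a missing loss is treated as zero and that a list of losses is summed, which may differ in detail from what T2T does internally.

```python
# Hypothetical helper: accept `output` or `(output, losses)` from body().
# Plain Python values stand in for Tensors.


def normalize_body_output(body_out):
  """Returns (output, losses_dict) for either accepted return shape."""
  if not isinstance(body_out, tuple):
    # body() returned only the output: no auxiliary loss.
    return body_out, {"extra": 0.0}
  output, losses = body_out
  if isinstance(losses, dict):
    return output, losses
  if isinstance(losses, (list, tuple)):
    # Assumption: several loss terms are combined by summation.
    return output, {"extra": sum(losses)}
  return output, {"extra": losses}


def skips_top_and_loss(losses):
  """True when body() already supplied the final training loss."""
  return "training" in losses


_, only_output = normalize_body_output("activations")
_, with_list = normalize_body_output(("activations", [0.25, 0.5]))
_, with_training = normalize_body_output(("logits", {"training": 1.5}))
```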

3. Register your model:
```python
from tensor2tensor.utils import registry
from tensor2tensor.utils import t2t_model


@registry.register_model
class MyFC(t2t_model.T2TModel):
  ...  # body() as implemented in step 2
```

4. Use it with the T2T tools like any other model

Keep in mind that model names are translated from CamelCase to snake_case (`MyFC` -> `my_fc`),
and that you need to point T2T to the directory containing your model with the `--t2t_usr_dir` flag.
For example, to train the model on the IMDB sentiment task on gcloud with 1 GPU worker, run
the following command from the directory containing your model class.

```bash
t2t-trainer \
--model=my_fc \
--t2t_usr_dir=. \
--cloud_mlengine --worker_gpu=1 \
--generate_data \
--data_dir='gs://data' \
--output_dir='gs://out' \
--problem=sentiment_imdb \
--hparams_set=basic_fc_small \
--train_steps=10000 \
--eval_steps=10
```