Restore keras seq2seq model

I'm working with the keras seq2seq example here:
https://github.com/keras-team/keras/blob/master/examples/lstm_seq2seq.py



I would like to persist the vocabulary and decoder so that I can load them again later and apply them to new sequences.



While the code calls model.save(), this is insufficient, because the decoding setup references a number of other variables that are deep pointers into the trained model:



encoder_model = Model(encoder_inputs, encoder_states)

decoder_state_input_h = Input(shape=(latent_dim,))
decoder_state_input_c = Input(shape=(latent_dim,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_outputs, state_h, state_c = decoder_lstm(
    decoder_inputs, initial_state=decoder_states_inputs)
decoder_states = [state_h, state_c]
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model(
    [decoder_inputs] + decoder_states_inputs,
    [decoder_outputs] + decoder_states)


I would like to translate this code to determine encoder_inputs, encoder_states, latent_dim, and decoder_inputs from a model loaded from disk. It's OK to assume I know the model architecture in advance. Is there a straightforward way to do this?



Update:
I have made some progress by reusing the decoder construction code and pulling out the layer inputs/outputs as needed.
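
Here model is the trained network loaded from disk, presumably along these lines (a sketch, assuming it was saved with model.save('s2s.h5') as in the example script):

from keras.models import Model, load_model
from keras.layers import Input

# Load the full training model saved by the example script.
model = load_model('s2s.h5')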



encoder_inputs = model.input[0]  # input_1
decoder_inputs = model.input[1]  # input_2
encoder_outputs, state_h_enc, state_c_enc = model.layers[2].output  # lstm_1
_, state_h_dec, state_c_dec = model.layers[3].output  # lstm_2
decoder_outputs = model.layers[4].output  # dense_1

encoder_states = [state_h_enc, state_c_enc]
encoder_model = Model(encoder_inputs, encoder_states)

latent_dim = 256  # TODO: infer this from the model. Should match lstm_1 outputs.

decoder_state_input_h = Input(shape=(latent_dim,))
decoder_state_input_c = Input(shape=(latent_dim,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]

decoder_states = [state_h_dec, state_c_dec]

decoder_model = Model(
    [decoder_inputs] + decoder_states_inputs,
    [decoder_outputs] + decoder_states)


However, when I try to construct the decoder model, I encounter this error:



RuntimeError: Graph disconnected: cannot obtain value for tensor Tensor("input_1:0", shape=(?, ?, 96), dtype=float32) at layer "input_1". The following previous layers were accessed without issue: 


As a test I tried Model(decoder_inputs, decoder_outputs) with the same result. It's not clear to me what is disconnected from the graph, since these layers are loaded from the model.

python tensorflow machine-learning keras

asked Jan 8 '18 at 21:48 by Robert Sim; edited Jan 18 '18 at 0:51

2 Answers

Answer by Robert Sim, answered Jan 18 '18 at 5:50 (score 3):

OK, I solved this problem and the decoder is producing reasonable results. In my code above I missed a couple of details in the decoder step: the inference decoder has to call() the trained LSTM and Dense layers again in order to wire them up to the new state inputs. (The output tensors pulled straight from the trained model still depend on the encoder input, which is why the graph appeared disconnected.) In addition, the new decoder inputs need unique names so they don't collide with input_1 and input_2 (this detail smells like a Keras bug).



encoder_inputs = model.input[0]  # input_1
encoder_outputs, state_h_enc, state_c_enc = model.layers[2].output  # lstm_1
encoder_states = [state_h_enc, state_c_enc]
encoder_model = Model(encoder_inputs, encoder_states)

decoder_inputs = model.input[1]  # input_2
decoder_state_input_h = Input(shape=(latent_dim,), name='input_3')
decoder_state_input_c = Input(shape=(latent_dim,), name='input_4')
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_lstm = model.layers[3]
decoder_outputs, state_h_dec, state_c_dec = decoder_lstm(
    decoder_inputs, initial_state=decoder_states_inputs)
decoder_states = [state_h_dec, state_c_dec]
decoder_dense = model.layers[4]
decoder_outputs = decoder_dense(decoder_outputs)

decoder_model = Model(
    [decoder_inputs] + decoder_states_inputs,
    [decoder_outputs] + decoder_states)
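
Note that latent_dim is used above but never derived from the loaded model; one way to avoid hard-coding it is to read it off the encoder LSTM (a sketch, assuming model.layers[2] is the encoder LSTM, consistent with the indexing above):

# Infer the hidden-state size instead of hard-coding latent_dim = 256.
# Assumes model.layers[2] is the encoder LSTM; the decoder LSTM uses the
# same number of units in this example.
latent_dim = model.layers[2].units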


A big drawback of this code is that it assumes we know the full architecture in advance. I would like to eventually be able to load an architecture-agnostic decoder.
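
For completeness, this is roughly how the restored encoder_model and decoder_model get driven at inference time. It is a sketch following the sampling loop in the original example; num_decoder_tokens, max_decoder_seq_length, target_token_index, and reverse_target_char_index are the vocabulary objects from lstm_seq2seq.py, which have to be persisted and reloaded separately (see the other answer):

import numpy as np

def decode_sequence(input_seq):
    # Encode the input sequence into the initial decoder state [h, c].
    states_value = encoder_model.predict(input_seq)

    # Seed the decoder with the start-of-sequence character ('\t' in the example).
    target_seq = np.zeros((1, 1, num_decoder_tokens))
    target_seq[0, 0, target_token_index['\t']] = 1.0

    decoded_sentence = ''
    while True:
        output_tokens, h, c = decoder_model.predict([target_seq] + states_value)

        # Greedily pick the most likely next character.
        sampled_token_index = np.argmax(output_tokens[0, -1, :])
        sampled_char = reverse_target_char_index[sampled_token_index]
        decoded_sentence += sampled_char

        # Stop at the end-of-sequence character or when the output grows too long.
        if sampled_char == '\n' or len(decoded_sentence) > max_decoder_seq_length:
            break

        # Feed the sampled character and the updated states back in.
        target_seq = np.zeros((1, 1, num_decoder_tokens))
        target_seq[0, 0, sampled_token_index] = 1.0
        states_value = [h, c]

    return decoded_sentence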






• I created a Keras PR here to provide a working example: https://github.com/keras-team/keras/pull/9119
  – Robert Sim, Jan 18 '18 at 22:21

Answer by user1236689, answered Apr 26 '18 at 17:08, edited Apr 26 '18 at 20:54 (score 1):

At the point in the Keras seq2seq example where training finishes, you have a complete encoder model and decoder model. You can save the architecture and weights of these models to disk and load them later. The following works for me:



          Save the models to disk:



with open('encoder_model.json', 'w', encoding='utf8') as f:
    f.write(encoder_model.to_json())
encoder_model.save_weights('encoder_model_weights.h5')

with open('decoder_model.json', 'w', encoding='utf8') as f:
    f.write(decoder_model.to_json())
decoder_model.save_weights('decoder_model_weights.h5')


          Later load the encoder and decoder:



from keras.models import model_from_json

def load_model(model_filename, model_weights_filename):
    with open(model_filename, 'r', encoding='utf8') as f:
        model = model_from_json(f.read())
    model.load_weights(model_weights_filename)
    return model

encoder = load_model('encoder_model.json', 'encoder_model_weights.h5')
decoder = load_model('decoder_model.json', 'decoder_model_weights.h5')


During prediction you will also need some other data, such as the number of encoder/decoder tokens, the dictionaries mapping characters to indices, and so on. You can save these to a file after training and load them later, just as with the models.
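
For example, one way to persist that extra state alongside the models (a sketch; the variable names are the ones used in the Keras example script):

import pickle

# After training: bundle the prediction-time metadata into a single file.
metadata = {
    'num_encoder_tokens': num_encoder_tokens,
    'num_decoder_tokens': num_decoder_tokens,
    'max_encoder_seq_length': max_encoder_seq_length,
    'max_decoder_seq_length': max_decoder_seq_length,
    'input_token_index': input_token_index,
    'target_token_index': target_token_index,
}
with open('seq2seq_metadata.pkl', 'wb') as f:
    pickle.dump(metadata, f)

# Later, before decoding: load it back.
with open('seq2seq_metadata.pkl', 'rb') as f:
    metadata = pickle.load(f)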





