Concatenate layer in Keras makes fitting fail
Whenever I concatenate the outputs of two layers (for example, because I want to use softmax on some outputs and another activation function on the rest), the network always fails to learn.



This is some example code to demonstrate the problem:



from tensorflow.keras.layers import (Lambda, Input, Dense, Concatenate, Dropout,
                                     Reshape, Conv2D, Flatten, MaxPooling2D)
from tensorflow.keras.models import Model
from tensorflow.keras.datasets import mnist
from tensorflow.keras.losses import mse, categorical_crossentropy, binary_crossentropy
from tensorflow.keras.utils import plot_model, to_categorical
from tensorflow.keras import backend as K
from tensorflow.keras import optimizers

import numpy as np
import matplotlib.pyplot as plt
import argparse
import os

# MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

no_cls = max(y_train)+1
width = 20

extra_dims = True

image_size = x_train.shape[1]
original_dim = image_size * image_size
x_train = np.reshape(x_train, [-1, original_dim])
x_test = np.reshape(x_test, [-1, original_dim])
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255
y_train = to_categorical(y_train, num_classes=width if extra_dims else no_cls)
y_test = to_categorical(y_test, num_classes=width if extra_dims else no_cls)

hidden_dim = 512
batch_sz = 256
eps = 10

ins = Input(shape=(original_dim,))
x = Dense(hidden_dim)(ins)
cls_pred = Dense(no_cls, activation="softmax")(x)
other = Dense(width-no_cls)(x)
outs = Concatenate()([cls_pred, other])

encoder = Model(ins, outs if extra_dims else cls_pred, name="encoder")
encoder.summary()

def cust_loss_fn(y_true, y_pred):
    return categorical_crossentropy(y_true[:no_cls], y_pred[:no_cls])

optimiser = optimizers.SGD(lr=0.003, clipvalue=0.1)
encoder.compile(optimizer=optimiser, loss=cust_loss_fn,
                metrics=["accuracy"])

encoder.fit(x_train, y_train,
            batch_size=batch_sz,
            epochs=eps,
            validation_data=(x_test, y_test))

score = encoder.evaluate(x_test, y_test)
print(score)

print(encoder.predict(x_train[0:10]))


With extra_dims = False, i.e. no Concatenate layer, the network consistently reaches 88% accuracy within the 10 epochs. When it is True, the network stays at around 8% accuracy and the loss does not drop at all during training.



Am I doing something wrong?
python tensorflow keras

asked Nov 12 at 18:53
TheAbelo2

  • Are you sure the model runs? That way of slicing in the loss function is wrong: it selects the first no_cls samples of the batch rather than the first no_cls outputs, so the shapes would not be consistent. It must be [:, :no_cls] instead.
    – today
    Nov 12 at 19:36

  • @today Thank you so much, I changed that line and it started learning, although much slower than before - guess this was to be expected though. Thanks for pointing it out! Wonder why this works without the concatenate layer though? Wasn't having any problems with extra_dims = False
    – TheAbelo2
    Nov 12 at 21:36

  • It does not run and gives me errors when I use keras directly. I did not test it using tensorflow.keras and I don't know why it works with that. It's strange indeed.
    – today
    Nov 13 at 4:57
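
To make the fix from the comments concrete, here is a minimal sketch of the corrected loss function, assuming the same tf.keras setup and the question's values (batch size 256, width 20, no_cls 10). The shape check shows why [:no_cls] alone slices batch samples instead of output columns:

import numpy as np
from tensorflow.keras.losses import categorical_crossentropy

# With y of shape (batch, width), y[:no_cls] keeps the first no_cls
# *rows* (samples); y[:, :no_cls] keeps the first no_cls *columns* (outputs).
y = np.zeros((256, 20))
print(y[:10].shape)     # (10, 20)  -- wrong axis: 10 samples, all 20 outputs
print(y[:, :10].shape)  # (256, 10) -- right axis: all samples, 10 class outputs

no_cls = 10

def cust_loss_fn(y_true, y_pred):
    # Compare only the class columns; the extra width-no_cls outputs
    # are left out of the cross-entropy.
    return categorical_crossentropy(y_true[:, :no_cls], y_pred[:, :no_cls])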