Problems when implementing Keras model in Tensorflow

I'm just starting off with Tensorflow.

I tried implementing a model to classify digits in the MNIST dataset.

I am familiar with Keras, so I first used it to create the model.

Keras code:

from keras.models import Sequential
from keras.layers import Dense
from keras.datasets import mnist
from os import path

import numpy as np

network = Sequential()
network.add(Dense(700, input_dim=784, activation='tanh'))
network.add(Dense(500, activation='tanh'))
network.add(Dense(500, activation='tanh'))
network.add(Dense(500, activation='tanh'))
network.add(Dense(10, activation='softmax'))

network.compile(loss='categorical_crossentropy', optimizer='adam')

(x_train, y_temp), (x_test, y_test) = mnist.load_data()
y_train = vectorize(y_temp)  # I defined this function to create vectors of the labels. It works without issues.

x_train = x_train.reshape(x_train.shape[0], x_train.shape[1]*x_train.shape[2])

network.fit(x_train, y_train, batch_size=100, epochs=3)

x_test = x_test.reshape(x_test.shape[0], x_test.shape[1]*x_test.shape[2])

scores = network.predict(x_test)

correct_pred = 0
for i in range(len(scores)):
    if np.argmax(scores[i]) == y_test[i]:
        correct_pred += 1

print((correct_pred/len(scores))*100)

The above code gives me an accuracy of around 92%.
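The `vectorize` helper is not shown in the question. As a rough sketch of what such a one-hot encoder might look like, assuming integer labels 0 to 9 (the names `one_hot`, `accuracy_percent`, and `num_classes` are illustrative and not the asker's code), together with a loop-free version of the accuracy count:

```python
import numpy as np


def one_hot(labels, num_classes=10):
    """Hypothetical stand-in for the question's `vectorize` helper."""
    labels = np.asarray(labels, dtype=np.int64)
    encoded = np.zeros((labels.shape[0], num_classes), dtype=np.float32)
    # Set a single 1.0 per row, at the column given by the label.
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded


def accuracy_percent(scores, labels):
    """Percentage of rows whose argmax matches the integer label."""
    predictions = np.argmax(scores, axis=1)
    return float(np.mean(predictions == np.asarray(labels)) * 100)
```

Under these assumptions, `y_train = one_hot(y_temp)` would play the role of `vectorize(y_temp)`, and `accuracy_percent(scores, y_test)` would replace the counting loop.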

I tried implementing the same model in Tensorflow:

import sys

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

data = input_data.read_data_sets('.', one_hot=True)

sess = tf.InteractiveSession()

x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])

w = tf.Variable(tf.zeros([784, 700]))
w2 = tf.Variable(tf.zeros([700, 500]))
w3 = tf.Variable(tf.zeros([500, 500]))
w4 = tf.Variable(tf.zeros([500, 500]))
w5 = tf.Variable(tf.zeros([500, 10]))

h1 = tf.nn.tanh(tf.matmul(x, w))
h2 = tf.nn.tanh(tf.matmul(h1, w2))
h3 = tf.nn.tanh(tf.matmul(h2, w3))
h4 = tf.nn.tanh(tf.matmul(h3, w4))
h = tf.matmul(h4, w5)

loss = tf.math.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=h, labels=y))
gradient_descent = tf.train.AdamOptimizer().minimize(loss)

correct_mask = tf.equal(tf.argmax(h, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_mask, tf.float32))

sess.run(tf.global_variables_initializer())

for i in range(3):
    batch_x, batch_y = data.train.next_batch(100)
    loss_print = tf.print(loss, output_stream=sys.stdout)
    sess.run([gradient_descent, loss_print], feed_dict={x: batch_x, y: batch_y})

ans = sess.run(accuracy, feed_dict={x: data.test.images, y: data.test.labels})

print(ans)

However, this code only gave me an accuracy of around 11%.
I tried increasing the number of epochs to 1000, but the result didn't change. Furthermore, the loss in every epoch was the same (2.30).

Am I missing something in the Tensorflow code?
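The constant loss of 2.30 is itself a clue. The small check below, using only the standard library, shows that softmax cross-entropy over ten identical logits equals ln(10) ≈ 2.3026 whatever the true label is. That is the value a network would report if its logits never moved away from zero (the function names are illustrative):

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, true_class):
    """Negative log-probability assigned to the true class."""
    return -math.log(softmax(logits)[true_class])


zero_logits = [0.0] * 10
losses = [cross_entropy(zero_logits, c) for c in range(10)]
print(losses[0], math.log(10))
```

An accuracy close to one in ten fits the same picture: identical logits for every image carry no information about the digit.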

edited Nov 15 '18 at 17:21

asked Nov 15 '18 at 17:06

Susmit Agrawal


One issue is that you have not considered the bias variables of Dense layers in your TF model.

– today
Nov 15 '18 at 17:12

The bias values are all zero. Does it still make a difference if I don't include them?

– Susmit Agrawal
Nov 15 '18 at 17:15

They are initially zero, but during training they change like the kernel weights. That's why they are called variables, not constants.

– today
Nov 15 '18 at 17:16
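To make the point in these comments concrete: a Keras `Dense` layer computes `activation(x @ kernel + bias)`, and by default its kernel is initialized with Glorot uniform and its bias with zeros. The NumPy sketch below mirrors that computation. `glorot_uniform` and `dense` are helper names written for this example, not library functions, and the input is random stand-in data rather than MNIST.

```python
import numpy as np


def glorot_uniform(fan_in, fan_out, rng):
    """Sample a (fan_in, fan_out) kernel from U(-limit, limit),
    with limit = sqrt(6 / (fan_in + fan_out))."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))


def dense(x, kernel, bias, activation=np.tanh):
    """Affine map followed by an activation, as in a fully connected layer."""
    return activation(x @ kernel + bias)


rng = np.random.default_rng(0)
kernel = glorot_uniform(784, 700, rng)
bias = np.zeros(700)       # starts at zero, but is trained like the kernel
x = rng.random((5, 784))   # five fake flattened images with values in [0, 1)
h1 = dense(x, kernel, bias)
print(h1.shape)
```

The bias starts at zero here too, yet because it is a trainable parameter it receives its own gradient, so leaving it out changes the model even though its initial value contributes nothing.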

python python-3.x tensorflow keras neural-network


1 Answer

It turns out the problem was that I initialized the weights as zeros!

Simply changing

w = tf.Variable(tf.zeros([784, 700]))
w2 = tf.Variable(tf.zeros([700, 500]))
w3 = tf.Variable(tf.zeros([500, 500]))
w4 = tf.Variable(tf.zeros([500, 500]))
w5 = tf.Variable(tf.zeros([500, 10]))

to

w = tf.Variable(tf.random_normal([784, 700], seed=42))
w2 = tf.Variable(tf.random_normal([700, 500], seed=42))
w3 = tf.Variable(tf.random_normal([500, 500], seed=42))
w4 = tf.Variable(tf.random_normal([500, 500], seed=42))
w5 = tf.Variable(tf.random_normal([500, 10], seed=42))

gave significant improvements.
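Why zeros stall training: since `tanh(0) = 0`, every hidden activation is zero, so the gradient reaching each weight matrix is a product that contains either a zero activation or a zero weight. The NumPy sketch below works the backward pass out by hand for a reduced two-layer version of the network. The layer sizes, the random stand-in data, and the helper name `weight_gradients` are assumptions made for illustration, not the code from the question.

```python
import numpy as np


def weight_gradients(x, y, w1, w2):
    """Manual backprop for logits = tanh(x @ w1) @ w2
    with mean softmax cross-entropy loss."""
    h1 = np.tanh(x @ w1)
    logits = h1 @ w2
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    d_logits = (probs - y) / x.shape[0]       # dLoss/dlogits
    grad_w2 = h1.T @ d_logits                 # zero whenever h1 is zero
    d_h1 = d_logits @ w2.T                    # zero whenever w2 is zero
    grad_w1 = x.T @ (d_h1 * (1.0 - h1 ** 2))  # tanh'(z) = 1 - tanh(z)^2
    return grad_w1, grad_w2


rng = np.random.default_rng(42)
x = rng.random((8, 20))                       # 8 fake samples, 20 features
y = np.eye(10)[rng.integers(0, 10, size=8)]   # random one-hot labels

zero_grads = weight_gradients(x, y, np.zeros((20, 16)), np.zeros((16, 10)))
rand_grads = weight_gradients(
    x, y, rng.normal(0.0, 0.1, (20, 16)), rng.normal(0.0, 0.1, (16, 10))
)

print([float(np.abs(g).max()) for g in zero_grads])
print([float(np.abs(g).max()) > 0 for g in rand_grads])
```

With all-zero weights both gradients are exactly zero, so no optimizer step can change anything. Two further points may be worth checking when comparing against Keras. `tf.random_normal` defaults to `stddev=1.0`, which is likely to saturate `tanh` with 784 inputs, so a smaller `stddev` or a Glorot-style initializer (the Keras `Dense` default) may train better. Also, `for i in range(3)` in the TensorFlow code performs three mini-batch updates, whereas `epochs=3` with `batch_size=100` in Keras corresponds to roughly 1,800 updates, assuming the 60,000-image training split.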

answered Nov 15 '18 at 18:06

Susmit Agrawal
