Use and modify variables in tensorflow bijectors


In the reference paper for TensorFlow Distributions (now Probability), it is mentioned that TensorFlow Variables can be used to construct Bijector and TransformedDistribution objects, i.e.:

import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

tf.enable_eager_execution()

shift = tf.Variable(1., dtype=tf.float32)
myBij = tfp.bijectors.Affine(shift=shift)

# Normal distribution centered in zero, then shifted to 1 using the bijection
myDistr = tfd.TransformedDistribution(
            distribution=tfd.Normal(loc=0., scale=1.),
            bijector=myBij,
            name="test")

# 2 samples of a normal centered at 1:
y = myDistr.sample(2)
# 2 samples of a normal centered at 0, obtained using inverse transform of myBij:
x = myBij.inverse(y)

I would now like to modify the shift variable (for example, after computing gradients of some likelihood function with respect to the shift, in order to update its value), so I do:

shift.assign(2.)
gx = myBij.forward(x)

I would expect gx = y + 1, but I see that gx = y. Indeed, myBij.shift still evaluates to 1.

If I try to modify the bijector directly, i.e.:

myBij.shift.assign(2.)

I get

AttributeError: 'tensorflow.python.framework.ops.EagerTensor' object has no attribute 'assign'

Computing gradients also does not work as expected:

with tf.GradientTape() as tape:
    gx = myBij.forward(x)
grad = tape.gradient(gx, shift)

This yields None, along with the following exception when the script ends:

Exception ignored in: <bound method GradientTape.__del__ of <tensorflow.python.eager.backprop.GradientTape object at 0x7f529c4702e8>>
Traceback (most recent call last):
File "~/.local/lib/python3.6/site-packages/tensorflow/python/eager/backprop.py", line 765, in __del__
AttributeError: 'NoneType' object has no attribute 'context'

What am I missing here?

Edit: I got it working with a graph/session, so it seems there is an issue with eager execution...

Note: I have tensorflow version 1.12.0 and tensorflow_probability version 0.5.0

edited Nov 16 '18 at 16:33

asked Nov 16 '18 at 13:38

swertz



python tensorflow machine-learning tensorflow-probability


1 Answer

If you are using eager mode, you will need to recompute everything from the variable forward. It is best to capture this logic in a function:

import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

tf.enable_eager_execution()

shift = tf.Variable(1., dtype=tf.float32)

def f():
  myBij = tfp.bijectors.Affine(shift=shift)

  # Normal distribution centered in zero, then shifted to 1 using the bijection
  myDistr = tfd.TransformedDistribution(
            distribution=tfd.Normal(loc=0., scale=1.),
            bijector=myBij,
            name="test")

  # 2 samples of a normal centered at 1:
  y = myDistr.sample(2)
  # 2 samples of a normal centered at 0, obtained using inverse
  # transform of myBij:
  x = myBij.inverse(y)
  return x, y

x, y = f()
shift.assign(2.)
gx, _ = f()

Regarding gradients, you will need to wrap calls to f() in a GradientTape, so that the read of the variable happens while the tape is recording.

answered Nov 26 '18 at 19:10

Brian Patton
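To make the "recompute from the variable forward" pattern concrete for likelihood fitting, here is a minimal sketch in plain Python, deliberately without TensorFlow so that it does not depend on any particular library version. It assumes a standard normal pushed through y = x + shift; the names log_prob, grad_log_prob and fit_shift, the data values and the learning rate are all hypothetical, and the hand-written gradient merely stands in for what tape.gradient would return.

```python
import math


def log_prob(data, shift):
    """Log-likelihood of data under Normal(0, 1) pushed through y = x + shift."""
    # Inverse transform is x = y - shift; a pure shift has unit Jacobian,
    # so no log-determinant term is needed.
    return sum(
        -0.5 * (y - shift) ** 2 - 0.5 * math.log(2.0 * math.pi) for y in data
    )


def grad_log_prob(data, shift):
    """Analytic derivative of log_prob with respect to shift.

    Stands in for tape.gradient(...) in this TensorFlow-free sketch.
    """
    return sum(y - shift for y in data)


def fit_shift(data, shift=1.0, learning_rate=0.1, steps=200):
    """Gradient ascent on the shift (hypothetical helper)."""
    for _ in range(steps):
        # The gradient is re-evaluated from the *current* shift on every
        # step; nothing computed with an older value is reused.
        shift += learning_rate * grad_log_prob(data, shift)
    return shift


data = [1.8, 2.1, 2.4, 1.7]  # made-up observations
fitted = fit_shift(data)     # moves from 1.0 towards the sample mean
```

The point of the sketch is only the loop structure: each step starts again from the current value of the parameter, which is what calling f() inside a fresh GradientTape achieves in eager mode.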


Thanks, I see! Now if I don't want to simply sample the function again, but (for instance) compute the likelihood of some fixed data using myDistr.log_prob(), and gradient-ascend the shift variable to maximise that likelihood: does it mean I have to re-create the bijection and transformed distribution object for each step? That seems to involve a lot of overhead (especially if the bijection is a complex normalising flow) compared to what is possible in "regular" graph mode...?

– swertz
Nov 27 '18 at 20:36


Yes, you should think of using eager mode more or less like using Python floats or NumPy arrays. If you change the value of shift and want to compute some chain using its updated value, you have to repeat the computation. The Bijector[s] will have converted shift to a tensor (tf.convert_to_tensor), which induces a read on the variable and returns a tf.Tensor with a fixed .numpy() value acquired at the time it was read. Note that Keras does things a little differently, and we are looking at a revamped trainable layers API in TFP.

– Brian Patton
Dec 5 '18 at 15:46
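The snapshot behaviour described in this comment can be imitated with a toy model in plain Python. This is only an analogy under stated assumptions: ToyVariable and SnapshotShift are hypothetical classes invented for illustration, not TensorFlow or TFP APIs, and the float(...) read plays the role that tf.convert_to_tensor plays in the comment.

```python
class ToyVariable:
    """Hypothetical stand-in for tf.Variable: a mutable box holding a float."""

    def __init__(self, value):
        self.value = value

    def assign(self, value):
        self.value = value


class SnapshotShift:
    """Hypothetical shift bijector that reads its variable once, at construction."""

    def __init__(self, shift):
        # Analogous to tf.convert_to_tensor(shift): the value is copied now,
        # and later assignments to the variable are not seen.
        self.shift = float(shift.value)

    def forward(self, x):
        return x + self.shift

    def inverse(self, y):
        return y - self.shift


shift = ToyVariable(1.0)
stale = SnapshotShift(shift)   # captures shift == 1.0
shift.assign(2.0)
fresh = SnapshotShift(shift)   # rebuilt after the update, captures 2.0

stale_out = stale.forward(0.5)  # still uses the old value
fresh_out = fresh.forward(0.5)  # uses the updated value
```

In this toy model stale.shift is a plain float, so it has no assign method either, which mirrors the AttributeError on myBij.shift.assign(2.) reported in the question.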


