Use and modify variables in TensorFlow bijectors
In the reference paper for TensorFlow Distributions (now Probability), it is mentioned that TensorFlow Variables can be used to construct Bijector and TransformedDistribution objects, i.e.:
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions
tf.enable_eager_execution()
shift = tf.Variable(1., dtype=tf.float32)
myBij = tfp.bijectors.Affine(shift=shift)
# Normal distribution centered in zero, then shifted to 1 using the bijection
myDistr = tfd.TransformedDistribution(
    distribution=tfd.Normal(loc=0., scale=1.),
    bijector=myBij,
    name="test")
# 2 samples of a normal centered at 1:
y = myDistr.sample(2)
# 2 samples of a normal centered at 0, obtained using inverse transform of myBij:
x = myBij.inverse(y)
I would now like to modify the shift variable (say, I might compute gradients of some likelihood function as a function of the shift and update its value), so I do:
shift.assign(2.)
gx = myBij.forward(x)
I would expect that gx=y+1, but I see that gx=y... And indeed, myBij.shift still evaluates to 1.
If I try to modify the bijector directly, i.e.:
myBij.shift.assign(2.)
I get
AttributeError: 'tensorflow.python.framework.ops.EagerTensor' object has no attribute 'assign'
Computing gradients also does not work as expected:
with tf.GradientTape() as tape:
    gx = myBij.forward(x)
grad = tape.gradient(gx, shift)
This yields None, and the following exception is printed when the script ends:
Exception ignored in: <bound method GradientTape.__del__ of <tensorflow.python.eager.backprop.GradientTape object at 0x7f529c4702e8>>
Traceback (most recent call last):
File "~/.local/lib/python3.6/site-packages/tensorflow/python/eager/backprop.py", line 765, in __del__
AttributeError: 'NoneType' object has no attribute 'context'
What am I missing here?
Edit: I got it working with a graph/session, so it seems there is an issue with eager execution...
Note: I have tensorflow version 1.12.0 and tensorflow_probability version 0.5.0
python tensorflow machine-learning tensorflow-probability
asked Nov 16 '18 at 13:38 by swertz, edited Nov 16 '18 at 16:33
1 Answer
If you are using eager mode, you will need to recompute everything from the variable forward. It is best to capture this logic in a function:
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions
tf.enable_eager_execution()
shift = tf.Variable(1., dtype=tf.float32)
def f():
    myBij = tfp.bijectors.Affine(shift=shift)
    # Normal distribution centered in zero, then shifted to 1 using the bijection
    myDistr = tfd.TransformedDistribution(
        distribution=tfd.Normal(loc=0., scale=1.),
        bijector=myBij,
        name="test")
    # 2 samples of a normal centered at 1:
    y = myDistr.sample(2)
    # 2 samples of a normal centered at 0, obtained using inverse
    # transform of myBij:
    x = myBij.inverse(y)
    return x, y
x, y = f()
shift.assign(2.)
gx, _ = f()
Regarding gradients, you will need to wrap the calls to f() in a GradientTape, so that the read of shift happens while the tape is recording.
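As a hand check on what such a tape should return in this example, the derivatives can be written out directly. This is only a sketch under assumptions: it takes the model from the question (a standard normal x pushed through y = x + b), writes b for shift, and uses y_1, ..., y_n for a hypothetical fixed data set that does not appear in the original post.

```latex
\begin{align*}
y &= x + b, \qquad x \sim \mathcal{N}(0, 1) \\
\frac{\partial y}{\partial b} &= 1 \\
\log p(y_i \mid b) &= -\tfrac{1}{2}(y_i - b)^2 - \tfrac{1}{2}\log(2\pi) \\
\frac{\partial}{\partial b} \sum_{i=1}^{n} \log p(y_i \mid b) &= \sum_{i=1}^{n} (y_i - b)
\end{align*}
```

So a gradient of None with respect to shift indicates that the read of the variable was not recorded, not that the derivative is zero. The log-likelihood gradient itself vanishes when b equals the sample mean of the y_i.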
Thanks, I see! Now if I don't want to simply sample the function again, but (for instance) compute the likelihood of some fixed data using myDistr.log_prob(), and gradient-ascend the shift variable to maximise that likelihood: does it mean I have to re-create the bijection and transformed distribution object for each step? That seems to involve a lot of overhead (especially if the bijection is a complex normalising flow) compared to what is possible in "regular" graph mode...?
– swertz
Nov 27 '18 at 20:36
Yes, you should think of using eager mode more or less like using Python floats or NumPy arrays. If you change the value of shift and want to compute some chain using its updated value, you have to repeat the computation. The Bijector[s] will have converted shift to a tensor (tf.convert_to_tensor), which induces a read on the variable and returns a tf.Tensor with a fixed .numpy() value acquired at the time it was read. Note that Keras does things a little differently, and we are looking at a revamped trainable layers API in TFP.
– Brian Patton
Dec 5 '18 at 15:46
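The read-once behaviour described in the reply above can be mimicked in plain Python. The sketch below is a toy model only: ToyVariable, ToyShift and log_likelihood_grad are hypothetical stand-ins invented for illustration, not TensorFlow or TFP APIs, and the data values are made up. It shows a bijector-like object going stale after an assign, and a gradient-ascent loop that rebuilds the object from the current variable value on every step.

```python
# Toy model only: hypothetical stand-ins, not TensorFlow or TFP classes.

class ToyVariable:
    """Mutable scalar holder, loosely playing the role of a variable."""

    def __init__(self, value):
        self.value = float(value)

    def assign(self, value):
        self.value = float(value)


class ToyShift:
    """Shift bijector that reads the variable once, at construction time."""

    def __init__(self, shift):
        # Snapshot of the current value, analogous to the read that
        # happens when a variable is converted to a tensor.
        self.shift = shift.value

    def forward(self, x):
        return [xi + self.shift for xi in x]

    def inverse(self, y):
        return [yi - self.shift for yi in y]


def log_likelihood_grad(data, shift):
    """Gradient of sum_i log N(data_i - shift; 0, 1) with respect to shift.

    The bijector is rebuilt from the current variable value on each call,
    so the result always reflects the latest assign.
    """
    bij = ToyShift(shift)
    return sum(bij.inverse(data))  # equals sum_i (data_i - shift)


shift = ToyVariable(1.0)
stale = ToyShift(shift)              # built while shift is 1.0
shift.assign(2.0)
stale_out = stale.forward([0.0])             # still uses the snapshot 1.0
fresh_out = ToyShift(shift).forward([0.0])   # rebuilt, uses 2.0

# Gradient ascent on the shift for fixed (made-up) data.
data = [2.5, 3.0, 3.5]
for _ in range(200):
    shift.assign(shift.value + 0.1 * log_likelihood_grad(data, shift))

print(stale_out, fresh_out, shift.value)
```

In this toy the shift converges to the sample mean of the data, which is where the hand-derived gradient is zero. It says nothing about how expensive rebuilding a real normalising flow would be.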
answered Nov 26 '18 at 19:10 by Brian Patton