Disable cache while running SQL - Google Big Query












0















Need some advice on cache issue we are facing in Google Big Query.



I'm using Talend tBigQueryInput component to run a Google Big Query SQL. However the SQL is not returning current data. We are suspecting it is returning stale data from cache.



Is there a disable cache option which can be embeded with in the SQL so that we are saying Big Query strictly not to take from cache?










share|improve this question























  • You have a low rate. Important on SO - you can mark accepted answer by using the tick on the left of the posted answer, below the voting. See meta.stackexchange.com/questions/5234/… for why it is important! Also important to vote on answer. Vote up answers that are helpful. ... You can check about what to do when someone answers your question - stackoverflow.com/help/someone-answers. Following these simple rules you increase your own reputation score and at the same time you keep us motivated to answer your questions :o) please consider!

    – Mikhail Berlyant
    Dec 6 '18 at 20:59
















0















Need some advice on cache issue we are facing in Google Big Query.



I'm using Talend tBigQueryInput component to run a Google Big Query SQL. However the SQL is not returning current data. We are suspecting it is returning stale data from cache.



Is there a disable cache option which can be embeded with in the SQL so that we are saying Big Query strictly not to take from cache?










share|improve this question























  • You have a low rate. Important on SO - you can mark accepted answer by using the tick on the left of the posted answer, below the voting. See meta.stackexchange.com/questions/5234/… for why it is important! Also important to vote on answer. Vote up answers that are helpful. ... You can check about what to do when someone answers your question - stackoverflow.com/help/someone-answers. Following these simple rules you increase your own reputation score and at the same time you keep us motivated to answer your questions :o) please consider!

    – Mikhail Berlyant
    Dec 6 '18 at 20:59














0












0








0








Need some advice on cache issue we are facing in Google Big Query.



I'm using Talend tBigQueryInput component to run a Google Big Query SQL. However the SQL is not returning current data. We are suspecting it is returning stale data from cache.



Is there a disable cache option which can be embeded with in the SQL so that we are saying Big Query strictly not to take from cache?










share|improve this question














Need some advice on cache issue we are facing in Google Big Query.



I'm using Talend tBigQueryInput component to run a Google Big Query SQL. However the SQL is not returning current data. We are suspecting it is returning stale data from cache.



Is there a disable cache option which can be embeded with in the SQL so that we are saying Big Query strictly not to take from cache?







sql caching google-bigquery talend






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Nov 15 '18 at 6:14









Pal1989Pal1989

62




62













  • You have a low rate. Important on SO - you can mark accepted answer by using the tick on the left of the posted answer, below the voting. See meta.stackexchange.com/questions/5234/… for why it is important! Also important to vote on answer. Vote up answers that are helpful. ... You can check about what to do when someone answers your question - stackoverflow.com/help/someone-answers. Following these simple rules you increase your own reputation score and at the same time you keep us motivated to answer your questions :o) please consider!

    – Mikhail Berlyant
    Dec 6 '18 at 20:59



















  • You have a low rate. Important on SO - you can mark accepted answer by using the tick on the left of the posted answer, below the voting. See meta.stackexchange.com/questions/5234/… for why it is important! Also important to vote on answer. Vote up answers that are helpful. ... You can check about what to do when someone answers your question - stackoverflow.com/help/someone-answers. Following these simple rules you increase your own reputation score and at the same time you keep us motivated to answer your questions :o) please consider!

    – Mikhail Berlyant
    Dec 6 '18 at 20:59

















You have a low rate. Important on SO - you can mark accepted answer by using the tick on the left of the posted answer, below the voting. See meta.stackexchange.com/questions/5234/… for why it is important! Also important to vote on answer. Vote up answers that are helpful. ... You can check about what to do when someone answers your question - stackoverflow.com/help/someone-answers. Following these simple rules you increase your own reputation score and at the same time you keep us motivated to answer your questions :o) please consider!

– Mikhail Berlyant
Dec 6 '18 at 20:59





You have a low rate. Important on SO - you can mark accepted answer by using the tick on the left of the posted answer, below the voting. See meta.stackexchange.com/questions/5234/… for why it is important! Also important to vote on answer. Vote up answers that are helpful. ... You can check about what to do when someone answers your question - stackoverflow.com/help/someone-answers. Following these simple rules you increase your own reputation score and at the same time you keep us motivated to answer your questions :o) please consider!

– Mikhail Berlyant
Dec 6 '18 at 20:59












2 Answers
2






active

oldest

votes


















1














Two quick thoughts come to mind:




  • You could examine the job history in the UI to see the jobs that Talend is running. From the statistics, they will report whether or not the results were served from cache. You could also examine the audit logs for this information.


  • You could execute an altered test query leveraging a non-deterministic aspect, which will ensure the results cannot be served from cache. Functions like CURRENT_TIMESTAMP(), CURRENT_DATE() etc should be sufficient for this purpose.



From scanning the public Talend docs, there's no configuration setting I can see that maps into setting configuration.query.useQueryCache to false for the query job, which is the behavior you're requesting.






share|improve this answer
























  • Hi Shollyman, thanks for your response. For point 1, could you please show where in UI will I be able to see the statistics about results served from cache? In Google Big Query--> Job History . There is no job history for the job run via Talend. Also for the jobs available there, there is no where it says if data is from cache. Also from where can I get the audit logs information. Thanks a lot for your help on this.

    – Pal1989
    Nov 15 '18 at 22:55













  • Sorry, I was unclear. By UI I meant the BigQuery web UI, not a Talend interface, which expose job history information. If you're using the interface in cloud console, the Query History link in the left nav will expose queries run in a project. cloud.google.com/bigquery/audit-logs has more information about BigQuery's audit log functionality.

    – shollyman
    Nov 16 '18 at 1:39



















0















Is there a disable cache option which can be embeded with in the SQL




No! There is no such option to be used within the query



Meantime, to enforce not using cache you can add something like below to your query



WHERE RAND() < 2    


This is just silly example - but you should get an idea :o)






share|improve this answer
























  • Hi Mikhail, thanks for your response. Can you please explain how using something like RAND() < 2 will force the query not to use cache results?

    – Pal1989
    Nov 15 '18 at 23:01











  • query result cache is valid for 24 hours and is used if underlying data is not changed, query itself is not changed and result is deterministic. by using non-deterministic function you effectively will force bq engine to not to use cached result - see more about Exceptions to query caching

    – Mikhail Berlyant
    Nov 15 '18 at 23:10











  • ... and obviously because RAND() is always less than 1 it will not affect result of your query :o)

    – Mikhail Berlyant
    Nov 15 '18 at 23:14











  • did this explained? make sense?

    – Mikhail Berlyant
    Nov 15 '18 at 23:37











  • Hi Mikhail, yes this makes sense now. Let me add this condition to SQL and monitor the query for a while. Thanks for your help.

    – Pal1989
    Nov 15 '18 at 23:42











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53313468%2fdisable-cache-while-running-sql-google-big-query%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























2 Answers
2






active

oldest

votes








2 Answers
2






active

oldest

votes









active

oldest

votes






active

oldest

votes









1














Two quick thoughts come to mind:




  • You could examine the job history in the UI to see the jobs that Talend is running. From the statistics, they will report whether or not the results were served from cache. You could also examine the audit logs for this information.


  • You could execute an altered test query leveraging a non-deterministic aspect, which will ensure the results cannot be served from cache. Functions like CURRENT_TIMESTAMP(), CURRENT_DATE() etc should be sufficient for this purpose.



From scanning the public Talend docs, there's no configuration setting I can see that maps into setting configuration.query.useQueryCache to false for the query job, which is the behavior you're requesting.






share|improve this answer
























  • Hi Shollyman, thanks for your response. For point 1, could you please show where in UI will I be able to see the statistics about results served from cache? In Google Big Query--> Job History . There is no job history for the job run via Talend. Also for the jobs available there, there is no where it says if data is from cache. Also from where can I get the audit logs information. Thanks a lot for your help on this.

    – Pal1989
    Nov 15 '18 at 22:55













  • Sorry, I was unclear. By UI I meant the BigQuery web UI, not a Talend interface, which expose job history information. If you're using the interface in cloud console, the Query History link in the left nav will expose queries run in a project. cloud.google.com/bigquery/audit-logs has more information about BigQuery's audit log functionality.

    – shollyman
    Nov 16 '18 at 1:39
















1














Two quick thoughts come to mind:




  • You could examine the job history in the UI to see the jobs that Talend is running. From the statistics, they will report whether or not the results were served from cache. You could also examine the audit logs for this information.


  • You could execute an altered test query leveraging a non-deterministic aspect, which will ensure the results cannot be served from cache. Functions like CURRENT_TIMESTAMP(), CURRENT_DATE() etc should be sufficient for this purpose.



From scanning the public Talend docs, there's no configuration setting I can see that maps into setting configuration.query.useQueryCache to false for the query job, which is the behavior you're requesting.






share|improve this answer
























  • Hi Shollyman, thanks for your response. For point 1, could you please show where in UI will I be able to see the statistics about results served from cache? In Google Big Query--> Job History . There is no job history for the job run via Talend. Also for the jobs available there, there is no where it says if data is from cache. Also from where can I get the audit logs information. Thanks a lot for your help on this.

    – Pal1989
    Nov 15 '18 at 22:55













  • Sorry, I was unclear. By UI I meant the BigQuery web UI, not a Talend interface, which expose job history information. If you're using the interface in cloud console, the Query History link in the left nav will expose queries run in a project. cloud.google.com/bigquery/audit-logs has more information about BigQuery's audit log functionality.

    – shollyman
    Nov 16 '18 at 1:39














1












1








1







Two quick thoughts come to mind:




  • You could examine the job history in the UI to see the jobs that Talend is running. From the statistics, they will report whether or not the results were served from cache. You could also examine the audit logs for this information.


  • You could execute an altered test query leveraging a non-deterministic aspect, which will ensure the results cannot be served from cache. Functions like CURRENT_TIMESTAMP(), CURRENT_DATE() etc should be sufficient for this purpose.



From scanning the public Talend docs, there's no configuration setting I can see that maps into setting configuration.query.useQueryCache to false for the query job, which is the behavior you're requesting.






share|improve this answer













Two quick thoughts come to mind:




  • You could examine the job history in the UI to see the jobs that Talend is running. From the statistics, they will report whether or not the results were served from cache. You could also examine the audit logs for this information.


  • You could execute an altered test query leveraging a non-deterministic aspect, which will ensure the results cannot be served from cache. Functions like CURRENT_TIMESTAMP(), CURRENT_DATE() etc should be sufficient for this purpose.



From scanning the public Talend docs, there's no configuration setting I can see that maps into setting configuration.query.useQueryCache to false for the query job, which is the behavior you're requesting.







share|improve this answer












share|improve this answer



share|improve this answer










answered Nov 15 '18 at 7:55









shollymanshollyman

1,63699




1,63699













  • Hi Shollyman, thanks for your response. For point 1, could you please show where in UI will I be able to see the statistics about results served from cache? In Google Big Query--> Job History . There is no job history for the job run via Talend. Also for the jobs available there, there is no where it says if data is from cache. Also from where can I get the audit logs information. Thanks a lot for your help on this.

    – Pal1989
    Nov 15 '18 at 22:55













  • Sorry, I was unclear. By UI I meant the BigQuery web UI, not a Talend interface, which expose job history information. If you're using the interface in cloud console, the Query History link in the left nav will expose queries run in a project. cloud.google.com/bigquery/audit-logs has more information about BigQuery's audit log functionality.

    – shollyman
    Nov 16 '18 at 1:39



















  • Hi Shollyman, thanks for your response. For point 1, could you please show where in UI will I be able to see the statistics about results served from cache? In Google Big Query--> Job History . There is no job history for the job run via Talend. Also for the jobs available there, there is no where it says if data is from cache. Also from where can I get the audit logs information. Thanks a lot for your help on this.

    – Pal1989
    Nov 15 '18 at 22:55













  • Sorry, I was unclear. By UI I meant the BigQuery web UI, not a Talend interface, which expose job history information. If you're using the interface in cloud console, the Query History link in the left nav will expose queries run in a project. cloud.google.com/bigquery/audit-logs has more information about BigQuery's audit log functionality.

    – shollyman
    Nov 16 '18 at 1:39

















Hi Shollyman, thanks for your response. For point 1, could you please show where in UI will I be able to see the statistics about results served from cache? In Google Big Query--> Job History . There is no job history for the job run via Talend. Also for the jobs available there, there is no where it says if data is from cache. Also from where can I get the audit logs information. Thanks a lot for your help on this.

– Pal1989
Nov 15 '18 at 22:55







Hi Shollyman, thanks for your response. For point 1, could you please show where in UI will I be able to see the statistics about results served from cache? In Google Big Query--> Job History . There is no job history for the job run via Talend. Also for the jobs available there, there is no where it says if data is from cache. Also from where can I get the audit logs information. Thanks a lot for your help on this.

– Pal1989
Nov 15 '18 at 22:55















Sorry, I was unclear. By UI I meant the BigQuery web UI, not a Talend interface, which expose job history information. If you're using the interface in cloud console, the Query History link in the left nav will expose queries run in a project. cloud.google.com/bigquery/audit-logs has more information about BigQuery's audit log functionality.

– shollyman
Nov 16 '18 at 1:39





Sorry, I was unclear. By UI I meant the BigQuery web UI, not a Talend interface, which expose job history information. If you're using the interface in cloud console, the Query History link in the left nav will expose queries run in a project. cloud.google.com/bigquery/audit-logs has more information about BigQuery's audit log functionality.

– shollyman
Nov 16 '18 at 1:39













0















Is there a disable cache option which can be embeded with in the SQL




No! There is no such option to be used within the query



Meantime, to enforce not using cache you can add something like below to your query



WHERE RAND() < 2    


This is just silly example - but you should get an idea :o)






share|improve this answer
























  • Hi Mikhail, thanks for your response. Can you please explain how using something like RAND() < 2 will force the query not to use cache results?

    – Pal1989
    Nov 15 '18 at 23:01











  • query result cache is valid for 24 hours and is used if underlying data is not changed, query itself is not changed and result is deterministic. by using non-deterministic function you effectively will force bq engine to not to use cached result - see more about Exceptions to query caching

    – Mikhail Berlyant
    Nov 15 '18 at 23:10











  • ... and obviously because RAND() is always less than 1 it will not affect result of your query :o)

    – Mikhail Berlyant
    Nov 15 '18 at 23:14











  • did this explained? make sense?

    – Mikhail Berlyant
    Nov 15 '18 at 23:37











  • Hi Mikhail, yes this makes sense now. Let me add this condition to SQL and monitor the query for a while. Thanks for your help.

    – Pal1989
    Nov 15 '18 at 23:42
















0















Is there a disable cache option which can be embeded with in the SQL




No! There is no such option to be used within the query



Meantime, to enforce not using cache you can add something like below to your query



WHERE RAND() < 2    


This is just silly example - but you should get an idea :o)






share|improve this answer
























  • Hi Mikhail, thanks for your response. Can you please explain how using something like RAND() < 2 will force the query not to use cache results?

    – Pal1989
    Nov 15 '18 at 23:01











  • query result cache is valid for 24 hours and is used if underlying data is not changed, query itself is not changed and result is deterministic. by using non-deterministic function you effectively will force bq engine to not to use cached result - see more about Exceptions to query caching

    – Mikhail Berlyant
    Nov 15 '18 at 23:10











  • ... and obviously because RAND() is always less than 1 it will not affect result of your query :o)

    – Mikhail Berlyant
    Nov 15 '18 at 23:14











  • did this explained? make sense?

    – Mikhail Berlyant
    Nov 15 '18 at 23:37











  • Hi Mikhail, yes this makes sense now. Let me add this condition to SQL and monitor the query for a while. Thanks for your help.

    – Pal1989
    Nov 15 '18 at 23:42














0












0








0








Is there a disable cache option which can be embeded with in the SQL




No! There is no such option to be used within the query



Meantime, to enforce not using cache you can add something like below to your query



WHERE RAND() < 2    


This is just silly example - but you should get an idea :o)






share|improve this answer














Is there a disable cache option which can be embeded with in the SQL




No! There is no such option to be used within the query



Meantime, to enforce not using cache you can add something like below to your query



WHERE RAND() < 2    


This is just silly example - but you should get an idea :o)







share|improve this answer












share|improve this answer



share|improve this answer










answered Nov 15 '18 at 16:14









Mikhail BerlyantMikhail Berlyant

60.4k43671




60.4k43671













  • Hi Mikhail, thanks for your response. Can you please explain how using something like RAND() < 2 will force the query not to use cache results?

    – Pal1989
    Nov 15 '18 at 23:01











  • query result cache is valid for 24 hours and is used if underlying data is not changed, query itself is not changed and result is deterministic. by using non-deterministic function you effectively will force bq engine to not to use cached result - see more about Exceptions to query caching

    – Mikhail Berlyant
    Nov 15 '18 at 23:10











  • ... and obviously because RAND() is always less than 1 it will not affect result of your query :o)

    – Mikhail Berlyant
    Nov 15 '18 at 23:14











  • did this explained? make sense?

    – Mikhail Berlyant
    Nov 15 '18 at 23:37











  • Hi Mikhail, yes this makes sense now. Let me add this condition to SQL and monitor the query for a while. Thanks for your help.

    – Pal1989
    Nov 15 '18 at 23:42



















  • Hi Mikhail, thanks for your response. Can you please explain how using something like RAND() < 2 will force the query not to use cache results?

    – Pal1989
    Nov 15 '18 at 23:01











  • query result cache is valid for 24 hours and is used if underlying data is not changed, query itself is not changed and result is deterministic. by using non-deterministic function you effectively will force bq engine to not to use cached result - see more about Exceptions to query caching

    – Mikhail Berlyant
    Nov 15 '18 at 23:10











  • ... and obviously because RAND() is always less than 1 it will not affect result of your query :o)

    – Mikhail Berlyant
    Nov 15 '18 at 23:14











  • did this explained? make sense?

    – Mikhail Berlyant
    Nov 15 '18 at 23:37











  • Hi Mikhail, yes this makes sense now. Let me add this condition to SQL and monitor the query for a while. Thanks for your help.

    – Pal1989
    Nov 15 '18 at 23:42

















Hi Mikhail, thanks for your response. Can you please explain how using something like RAND() < 2 will force the query not to use cache results?

– Pal1989
Nov 15 '18 at 23:01





Hi Mikhail, thanks for your response. Can you please explain how using something like RAND() < 2 will force the query not to use cache results?

– Pal1989
Nov 15 '18 at 23:01













query result cache is valid for 24 hours and is used if underlying data is not changed, query itself is not changed and result is deterministic. by using non-deterministic function you effectively will force bq engine to not to use cached result - see more about Exceptions to query caching

– Mikhail Berlyant
Nov 15 '18 at 23:10





query result cache is valid for 24 hours and is used if underlying data is not changed, query itself is not changed and result is deterministic. by using non-deterministic function you effectively will force bq engine to not to use cached result - see more about Exceptions to query caching

– Mikhail Berlyant
Nov 15 '18 at 23:10













... and obviously because RAND() is always less than 1 it will not affect result of your query :o)

– Mikhail Berlyant
Nov 15 '18 at 23:14





... and obviously because RAND() is always less than 1 it will not affect result of your query :o)

– Mikhail Berlyant
Nov 15 '18 at 23:14













did this explained? make sense?

– Mikhail Berlyant
Nov 15 '18 at 23:37





did this explained? make sense?

– Mikhail Berlyant
Nov 15 '18 at 23:37













Hi Mikhail, yes this makes sense now. Let me add this condition to SQL and monitor the query for a while. Thanks for your help.

– Pal1989
Nov 15 '18 at 23:42





Hi Mikhail, yes this makes sense now. Let me add this condition to SQL and monitor the query for a while. Thanks for your help.

– Pal1989
Nov 15 '18 at 23:42


















draft saved

draft discarded




















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53313468%2fdisable-cache-while-running-sql-google-big-query%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Bressuire

Vorschmack

Quarantine