How to return a specific variable from a SPARQL federated query (SERVICE keyword)?
I'm using a federated query to retrieve some information from a remote server, but I don't want to retrieve all the variables (select *) that I'm working with inside the federated query; I want to return just the count variable. How can I do that?
Code:
SERVICE <https://sparql.uniprot.org/sparql/> {
?sub_bp (rdfs:subClassOf|owl:someValuesFrom)* ?bp_iri .
?protein up:classifiedWith ?sub_bp.
?protein up:organism <http://purl.uniprot.org/taxonomy/10090> .
}
If it were not a federated query, I would do it like this:
SELECT distinct (count(distinct ?protein) as ?count) WHERE {
?sub_bp (rdfs:subClassOf|owl:someValuesFrom)* ?bp_iri .
?protein up:classifiedWith ?sub_bp.
?protein up:organism <http://purl.uniprot.org/taxonomy/10090> .
}
But in the federated query I cannot select variables, so is there a way to do what I want?
** EDIT 1 **
After @TallTed's response I noticed that I had skipped some details to keep the question simple, but the details turn out to be important, so I will describe the whole situation.
I have a local data set containing triples about biological processes and genes. I have to count how many genes are related to each biological process and divide that number by the total number of proteins identified in Uniprot for the same biological process (and its "children").
To do this, I first query my local data set, counting the genes for each biological process, and then I run a federated query to count all the proteins identified in Uniprot for each biological process (and its "children").
The full SPARQL code:
PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX uniprot: <http://purl.uniprot.org/core/>
PREFIX up:<http://purl.uniprot.org/core/>
PREFIX owl:<http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?bp_iri ?bp_count (count(distinct ?protein) as ?bp_total) ((?bp_count / ?bp_total) as ?divided) WHERE {
{
SELECT DISTINCT ?bp_iri (COUNT(?bp_iri) as ?bp_count) WHERE{
?genes_iri a uniprot:Gene .
?genes_iri obo:RO_0000056 ?bp_iri .
}group by ?bp_iri order by DESC(?bp_count)
}
SERVICE silent <https://sparql.uniprot.org/sparql/> {
?sub_bp (rdfs:subClassOf|owl:someValuesFrom)* ?bp_iri .
?protein up:classifiedWith ?sub_bp.
?protein up:organism <http://purl.uniprot.org/taxonomy/10090> .
}
}group by ?bp_iri ?bp_count ?bp_total order by DESC(?divided)
When I run this query using Jena ARQ (a query engine), the variable ?bp_iri is replaced at the moment of the HTTP request by a specific biological process IRI (one HTTP request for each biological process), as shown in the image below:
Note that in the explain image, the federated query is selecting everything (*). The problem is that I don't want to retrieve all the relations I'm dealing with in the federated query; I just want to retrieve the count, but count is an aggregate function that can only be placed right after the SELECT keyword. (I don't want to retrieve all the relations because this query returns A LOT of triples (on the order of tens of thousands, sometimes millions), and it's not necessary to have them on my computer just to count them.)
To solve this, I tried to create a subquery inside the federated query to select only the count (?bp_total) and not all the triples. Code used:
SERVICE silent <https://sparql.uniprot.org/sparql/> {
{
SELECT (count(distinct ?protein) as ?bp_total) WHERE {
?sub_bp (rdfs:subClassOf|owl:someValuesFrom)* ?bp_iri .
?protein up:classifiedWith ?sub_bp.
?protein up:organism <http://purl.uniprot.org/taxonomy/10090> .
}
}
}
Running the explain again, I noticed that when I put a subquery inside the federated query, the variable ?bp_iri is not replaced by the biological process IRI, as shown in the image below:
Considering this, how can I retrieve only the count from a federated query?
Sorry about the long post.
sparql jena arq federated-queries
How many processes are there in your dataset? How many processes are there in the Uniprot ontology?
– Stanislav Kralin
Nov 13 '18 at 12:50
I think the long post is not a problem, but I would suggest that the revised/expanded question should be a new question -- because it is substantially different from that which it "edits."
– TallTed
Nov 13 '18 at 14:35
edited Nov 13 '18 at 14:34
TallTed
asked Nov 12 '18 at 22:37
Gabriel Gusmao
1 Answer
As in "Using Wikidata label service in federated queries", include some of the things that are nominally optional...
Note -- your remote query must actually execute on the remote endpoint, else you will get varying errors.
This is the query you're trying to run on the Uniprot endpoint --
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT (COUNT(DISTINCT ?protein) AS ?count)
WHERE
{
?sub_bp (rdfs:subClassOf|owl:someValuesFrom)* ?bp_iri .
?protein up:classifiedWith ?sub_bp .
?protein up:organism taxon:10090 .
}
That gets an error --
Query evaluation exception.
: SPARQL execute failed:[PREFIX up: PREFIX taxon: PREFIX rdfs: PREFIX owl: SELECT (COUNT(DISTINCT ?protein) AS ?count) WHERE { ?sub_bp (rdfs:subClassOf|owl:someValuesFrom)* ?bp_iri . ?protein up:classifiedWith ?sub_bp . ?protein up:organism taxon:10090 . }] Exception:virtuoso.jdbc4.VirtuosoException: TN...: Exceeded 1000000000 bytes in transitive temp memory. use t_distinct, t_max or more T_MAX_memory options to limit the search or increase the pool
-- but that's not due to a syntax error; it's due to the ZeroOrMorePath over the rdfs:subClassOf or owl:someValuesFrom properties, i.e., the Property Path ((rdfs:subClassOf|owl:someValuesFrom)*) you're querying, which has to try MANY possibilities.
If you limit the depth of that path, the Uniprot endpoint can handle it, and you can run it through Federated SPARQL.
Here's a reduced depth query (which I arbitrarily tried with 3 "ZeroOrOnePath") --
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT (COUNT(DISTINCT ?protein) AS ?count)
WHERE
{
?sub_bp (rdfs:subClassOf|owl:someValuesFrom)?
/ (rdfs:subClassOf|owl:someValuesFrom)?
/ (rdfs:subClassOf|owl:someValuesFrom)? ?bp_iri .
?protein up:classifiedWith ?sub_bp .
?protein up:organism <http://purl.uniprot.org/taxonomy/10090> .
}
-- that got a result --
count
"77633"^^xsd:int
-- which I found was the same result down to a single level --
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT (COUNT(DISTINCT ?protein) AS ?count)
WHERE
{
?sub_bp (rdfs:subClassOf|owl:someValuesFrom)? ?bp_iri .
?protein up:classifiedWith ?sub_bp .
?protein up:organism <http://purl.uniprot.org/taxonomy/10090> .
}
I just ran this query through URIBurner.com (which permits Federated SPARQL for authenticated users) --
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT *
WHERE
{
SERVICE <https://sparql.uniprot.org/sparql>
{
SELECT (COUNT(DISTINCT ?protein) AS ?count)
WHERE
{
?sub_bp (rdfs:subClassOf|owl:someValuesFrom)? ?bp_iri .
?protein up:classifiedWith ?sub_bp .
?protein up:organism <http://purl.uniprot.org/taxonomy/10090> .
}
}
}
That still produces an error --
Virtuoso HTCLI Error HC001: Read Error in HTTP Client
-- which suggests different settings are in play on the Uniprot server when you go directly through their web query form, which uses JDBC against their SPARQL server, than when you go straight through HTTP, as with Federated SPARQL.
I think the solution you need is a local Uniprot mirror, or a connection to the public Uniprot instance that has different permissions/settings than the primary public endpoint.
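For the expanded question, one variation that may be worth trying is to have the remote subquery project ?bp_iri as well and group by it. A SPARQL subquery only exposes the variables it projects, so a subquery that selects only the count leaves no ?bp_iri for the outer query to join on. The following is only a sketch, untested against the live Uniprot endpoint. It assumes the single-level path used above, writes up:Gene where the question writes uniprot:Gene (both prefixes map to the same namespace), and lets the remote side compute a count for every process it knows, so it may still run into the endpoint's resource limits.

```sparql
PREFIX obo:  <http://purl.obolibrary.org/obo/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX up:   <http://purl.uniprot.org/core/>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>

SELECT ?bp_iri ?bp_count ?bp_total ((?bp_count / ?bp_total) AS ?divided)
WHERE
  {
    # Local part: number of genes per biological process.
    {
      SELECT ?bp_iri (COUNT(?bp_iri) AS ?bp_count)
      WHERE
        {
          ?genes_iri a up:Gene .
          ?genes_iri obo:RO_0000056 ?bp_iri .
        }
      GROUP BY ?bp_iri
    }
    # Remote part: only ?bp_iri and the aggregated count come back,
    # not the individual ?protein / ?sub_bp bindings.
    SERVICE <https://sparql.uniprot.org/sparql>
      {
        SELECT ?bp_iri (COUNT(DISTINCT ?protein) AS ?bp_total)
        WHERE
          {
            ?sub_bp (rdfs:subClassOf|owl:someValuesFrom)? ?bp_iri .
            ?protein up:classifiedWith ?sub_bp .
            ?protein up:organism <http://purl.uniprot.org/taxonomy/10090> .
          }
        GROUP BY ?bp_iri
      }
  }
ORDER BY DESC(?divided)
```

The two result sets are joined on ?bp_iri, so the outer query needs no aggregate of its own. If the remote side cannot cope with all processes at once, a VALUES block on ?bp_iri inside the remote subquery's WHERE clause, listing only the locally relevant IRIs, is one way to narrow it.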
Thank you TallTed, your explanation gave me clarity of some things that I didn't know. Unfortunately it didn't solve my problem, but I think it was my explanation because I tried to keep the question simple and I skipped some details. I edited the original post with the completed situation. Thank you anyway, it was helpful! :-D
– Gabriel Gusmao
Nov 13 '18 at 10:43
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53271110%2fhow-to-return-specific-variable-from-sparql-federated-query-service-keyword%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
As in Using Wikidata label service in federated queries, include some of the things that are nominally optional...
Note -- your remote query must actually execute on the remote endpoint, else you will get varying errors.
This is the query you're trying to run on the Uniprot endpoint --
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT (COUNT(DISTINCT ?protein) AS ?count)
WHERE
{
?sub_bp (rdfs:subClassOf|owl:someValuesFrom)* ?bp_iri .
?protein up:classifiedWith ?sub_bp .
?protein up:organism taxon:10090 .
}
That gets an error --
Query evaluation exception.
: SPARQL execute failed:[PREFIX up: PREFIX taxon: PREFIX rdfs: PREFIX owl: SELECT (COUNT(DISTINCT ?protein) AS ?count) WHERE { ?sub_bp (rdfs:subClassOf|owl:someValuesFrom)* ?bp_iri . ?protein up:classifiedWith ?sub_bp . ?protein up:organism taxon:10090 . }] Exception:virtuoso.jdbc4.VirtuosoException: TN...: Exceeded 1000000000 bytes in transitive temp memory. use t_distinct, t_max or more T_MAX_memory options to limit the search or increase the pool
-- but that's not due to a syntax error; it's due to the ZeroOrMorePath of rdfs:subClassOf
or owl:someValuesFrom
properties ((rdfs:subClassOf|owl:someValuesFrom)*
) Property Path you're querying, which has to try MANY possibilities.
If you limit the depth of that path, the Uniprot end point can handle it, and you can run it through Federated SPARQL.
Here's a reduced depth query (which I arbitrarily tried with 3 "ZeroOrOnePath") --
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT (COUNT(DISTINCT ?protein) AS ?count)
WHERE
{
?sub_bp (rdfs:subClassOf|owl:someValuesFrom)?
/ (rdfs:subClassOf|owl:someValuesFrom)?
/ (rdfs:subClassOf|owl:someValuesFrom)? ?bp_iri .
?protein up:classifiedWith ?sub_bp .
?protein up:organism <http://purl.uniprot.org/taxonomy/10090> .
}
-- that got a result --
count
"77633"xsd:int
-- which I found was the same result down to a single level --
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT (COUNT(DISTINCT ?protein) AS ?count)
WHERE
{
?sub_bp (rdfs:subClassOf|owl:someValuesFrom)? ?bp_iri .
?protein up:classifiedWith ?sub_bp .
?protein up:organism <http://purl.uniprot.org/taxonomy/10090> .
}
I just ran this query through URIBurner.com (which permits Federated SPARQL for authenticated users) --
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT *
WHERE
{
SERVICE <https://sparql.uniprot.org/sparql>
{
SELECT (COUNT(DISTINCT ?protein) AS ?count)
WHERE
{
?sub_bp (rdfs:subClassOf|owl:someValuesFrom)? ?bp_iri .
?protein up:classifiedWith ?sub_bp .
?protein up:organism <http://purl.uniprot.org/taxonomy/10090> .
}
}
}
That still produces an error --
Virtuoso HTCLI Error HC001: Read Error in HTTP Client
-- which suggests different settings are in play on the Uniprot server when you go directly through their web query form, which uses JDBC against their SPARQL server, then when you go straight through HTTP, as with Federated SPARQL.
I think the solution you need is a local Uniprot mirror, or a connection to the public Uniprot instance that has different permissions/settings than the primary public endpoint.
Thank you TallTed, your explanation gave me clarity of some things that I didn't know. Unfortunately it didn't solve my problem, but I think it was my explanation because I tried to keep the question simple and I skipped some details. I edited the original post with the completed situation. Thank you anyway, it was helpful! :-D
– Gabriel Gusmao
Nov 13 '18 at 10:43
add a comment |
As in Using Wikidata label service in federated queries, include some of the things that are nominally optional...
Note -- your remote query must actually execute on the remote endpoint, else you will get varying errors.
This is the query you're trying to run on the Uniprot endpoint --
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT (COUNT(DISTINCT ?protein) AS ?count)
WHERE
{
?sub_bp (rdfs:subClassOf|owl:someValuesFrom)* ?bp_iri .
?protein up:classifiedWith ?sub_bp .
?protein up:organism taxon:10090 .
}
That gets an error --
Query evaluation exception.
: SPARQL execute failed:[PREFIX up: PREFIX taxon: PREFIX rdfs: PREFIX owl: SELECT (COUNT(DISTINCT ?protein) AS ?count) WHERE { ?sub_bp (rdfs:subClassOf|owl:someValuesFrom)* ?bp_iri . ?protein up:classifiedWith ?sub_bp . ?protein up:organism taxon:10090 . }] Exception:virtuoso.jdbc4.VirtuosoException: TN...: Exceeded 1000000000 bytes in transitive temp memory. use t_distinct, t_max or more T_MAX_memory options to limit the search or increase the pool
-- but that's not due to a syntax error; it's due to the ZeroOrMorePath of rdfs:subClassOf
or owl:someValuesFrom
properties ((rdfs:subClassOf|owl:someValuesFrom)*
) Property Path you're querying, which has to try MANY possibilities.
If you limit the depth of that path, the Uniprot end point can handle it, and you can run it through Federated SPARQL.
Here's a reduced depth query (which I arbitrarily tried with 3 "ZeroOrOnePath") --
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT (COUNT(DISTINCT ?protein) AS ?count)
WHERE
{
?sub_bp (rdfs:subClassOf|owl:someValuesFrom)?
/ (rdfs:subClassOf|owl:someValuesFrom)?
/ (rdfs:subClassOf|owl:someValuesFrom)? ?bp_iri .
?protein up:classifiedWith ?sub_bp .
?protein up:organism <http://purl.uniprot.org/taxonomy/10090> .
}
-- that got a result --
count
"77633"xsd:int
-- which I found was the same result down to a single level --
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT (COUNT(DISTINCT ?protein) AS ?count)
WHERE
{
?sub_bp (rdfs:subClassOf|owl:someValuesFrom)? ?bp_iri .
?protein up:classifiedWith ?sub_bp .
?protein up:organism <http://purl.uniprot.org/taxonomy/10090> .
}
I just ran this query through URIBurner.com (which permits Federated SPARQL for authenticated users) --
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT *
WHERE
{
SERVICE <https://sparql.uniprot.org/sparql>
{
SELECT (COUNT(DISTINCT ?protein) AS ?count)
WHERE
{
?sub_bp (rdfs:subClassOf|owl:someValuesFrom)? ?bp_iri .
?protein up:classifiedWith ?sub_bp .
?protein up:organism <http://purl.uniprot.org/taxonomy/10090> .
}
}
}
That still produces an error --
Virtuoso HTCLI Error HC001: Read Error in HTTP Client
-- which suggests different settings are in play on the Uniprot server when you go directly through their web query form, which uses JDBC against their SPARQL server, then when you go straight through HTTP, as with Federated SPARQL.
I think the solution you need is a local Uniprot mirror, or a connection to the public Uniprot instance that has different permissions/settings than the primary public endpoint.
Thank you TallTed, your explanation gave me clarity of some things that I didn't know. Unfortunately it didn't solve my problem, but I think it was my explanation because I tried to keep the question simple and I skipped some details. I edited the original post with the completed situation. Thank you anyway, it was helpful! :-D
– Gabriel Gusmao
Nov 13 '18 at 10:43
add a comment |
answered Nov 13 '18 at 3:48
TallTed
How many processes are there in your dataset? How many processes are there in the Uniprot ontology?
– Stanislav Kralin
Nov 13 '18 at 12:50
I think the long post is not a problem, but I would suggest that the revised/expanded question should be a new question -- because it is substantially different from that which it "edits."
– TallTed
Nov 13 '18 at 14:35