Python - how to 'stream' data from my MongoDB collection?
I'm running a script that pushes data to a MongoDB database. Now I want a second Python script to print each new entry as soon as it is added to the database.
For example:
If the number 80 is added to the DB, the script should fetch 80 from the collection and print it to my console as soon as it's added to the database.
My current script runs fine. The only problem is that if I remove the time.sleep() call, it prints every entry in rapid succession.
Also, instead of printing only the new entry, it currently prints the whole collection plus the new entry. I want only the new one, because later I want the script to fetch the data and feed it into a Python array.
- I can't use change_stream since my DB is not a replica set. I'm fairly new to this, so I don't know much about replica sets.
- I could use a tailable cursor, but a capped collection wouldn't be the best choice, since I will be pushing data every 5 seconds, and having a "limit" (isn't that what capped means?) would not be ideal.
Any advice?
from pymongo import MongoClient
import time
import random
from pprint import pprint

client = MongoClient(port=27017)
arr = []
db = client.one
mycol = client["coll"]

while True:
    # No filter: every pass re-reads and prints the whole collection
    cursor = db.mycol.find()
    for document in cursor:
        print(document['num'])
    time.sleep(2)
python mongodb
If you want a script to read the DB and read only the "new ones" then it needs to know state -- the last thing it read. This is actually independent of change_stream or anything else. Do you have any ideas around how you might capture that state?
– Buzz Moschetti
Nov 14 '18 at 1:42
Not yet to be honest @BuzzMoschetti
– Jack022
Nov 14 '18 at 9:24
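One way to capture the state this comment describes is to remember the `_id` of the last document read and ask only for documents with a greater `_id`. This is a minimal sketch, not something proposed in the thread: it assumes the documents use the default ObjectId `_id` values, which start with a creation timestamp and are therefore only roughly ordered by insertion time (several writers within the same second can interleave). The helper names `build_new_docs_query` and `fetch_new` are hypothetical.

```python
def build_new_docs_query(last_id):
    """Return a filter for documents inserted after last_id (None means everything)."""
    return {} if last_id is None else {"_id": {"$gt": last_id}}


def fetch_new(collection, last_id=None):
    """Run one polling pass and return (new_documents, updated_last_id).

    `collection` is assumed to be a pymongo Collection.
    """
    # Sort ascending so the last element carries the greatest _id seen so far.
    cursor = collection.find(build_new_docs_query(last_id)).sort("_id", 1)
    docs = list(cursor)
    if docs:
        last_id = docs[-1]["_id"]
    return docs, last_id
```

In a polling loop you would call `docs, last_id = fetch_new(db.mycol, last_id)`, print each `doc['num']`, and then sleep. To skip documents that already exist at startup, `last_id` could be seeded from the newest document, for example with `db.mycol.find_one(sort=[("_id", -1)])`.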
asked Nov 13 '18 at 17:37 by Jack022
edited Nov 16 '18 at 15:11 by Nagaraj Tantri
1 Answer
You can save the creation time of documents and repeat queries for documents created since your last query:
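import datetime
import time
...
# Start from the current time so only documents created after startup are printed.
# A numeric 0 would not match a date field here, since MongoDB comparison
# operators generally match only values of the same BSON type.
# Assumes the writer stores 'created' as a UTC datetime.
last_query_time = datetime.datetime.now(datetime.timezone.utc)
while True:
    now = datetime.datetime.now(datetime.timezone.utc)
    # Bound the window to (last_query_time, now] so that a document created
    # while the query runs is not printed again on the next pass.
    cursor = db.mycol.find({'created': {'$gt': last_query_time, '$lte': now}})
    last_query_time = now
    for document in cursor:
        print(document['num'])
    time.sleep(2)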
Hey! Thanks for answering! I tried it but it printed nothing. I'm trying to find the error right now.
– Jack022
Nov 16 '18 at 23:20
Did you modify the creation of the documents to have a "created" field?
– roeen30
Nov 16 '18 at 23:23
You mean I need to add another field, "created", to each document, with a datetime as its value?
– Jack022
Nov 16 '18 at 23:35
Yes, otherwise it wouldn't be there...
– roeen30
Nov 16 '18 at 23:36
Got it, thank you!
– Jack022
Nov 16 '18 at 23:51
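For reference, the writer-side change discussed in these comments could look roughly like the sketch below. It is an illustration under assumptions rather than the asker's actual script: the helper names `with_created` and `insert_number` are hypothetical, and the timestamp is taken in UTC so that the reader can compare against the same clock convention.

```python
import datetime


def with_created(document, now=None):
    """Return a copy of `document` with a UTC 'created' timestamp added."""
    stamped = dict(document)
    stamped["created"] = now or datetime.datetime.now(datetime.timezone.utc)
    return stamped


def insert_number(collection, num):
    """Insert {'num': num} with a 'created' field.

    `collection` is assumed to be a pymongo Collection.
    """
    return collection.insert_one(with_created({"num": num}))
```

The pushing script would then call something like `insert_number(db.mycol, 80)` instead of inserting the bare document. If the collection grows large, an index on the field (`db.mycol.create_index("created")`) should keep the polling query from scanning every document.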
answered Nov 16 '18 at 23:15 by roeen30