PHP binary stream parsing performance
I have a binary file composed of many (10k+) records of 6 bytes each.
As an example, one record is a byte string like this: ÿvDV
Byte values: 152 + 118 + 68 + 86 + 27 + 15
Binary: 100110000111011001000100010101100001101100001111
From this string I have to extract some bits at specific positions and cast their values to integers:
1001100001110110 0100010001010110 0001101100001111
--     |-------| --     |-------| |------||------|
|          |      |          |        |       |
|          |      |          |        |       +------> 00001111 => $bin = 15
|          |      |          |        +--------------> 00011011 => $site = 27
|          |      |          +-----------------------> 001010110 => $x = 86
|          |      +----------------------------------> 01 => $dp = 1
|          +-----------------------------------------> 001110110 => $y = 118
+----------------------------------------------------> 10 => $tr = 2
Is there a faster approach than this one?
$binary = file_get_contents("/path/to/binary/file.dat");
$startbyte = 0; // I'm reading the 55th byte
// Stop once fewer than 6 bytes remain, so unpack() never sees a short record.
while ($startbyte + 6 <= strlen($binary)) {
    // Three big-endian unsigned 16-bit words per record.
    $record = unpack("n3", substr($binary, $startbyte, 6));
    $info = array(
        'tr'   => ($record[1] & 0xC000) >> 14,
        'y'    => ($record[1] & 0x01FF),
        'dp'   => ($record[2] & 0xC000) >> 14,
        'x'    => ($record[2] & 0x01FF),
        'site' => (int) (($record[3] & 0x3F00) >> 8),
        'bin'  => (int) ($record[3] & 0x003F) + 1
    );
    $startbyte += 6;
}
In your experience, is there a faster approach?
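One direction that may be worth benchmarking, offered as a sketch rather than a measured answer: decode every 16-bit word with a single unpack('n*') call and apply the same masks while stepping through the result three words at a time. This avoids one substr() and one unpack() call per record. The function name decodeRecords is hypothetical, and silently ignoring a trailing partial record is an assumption about the file.

```php
<?php
// Hypothetical helper (not from the original post): decode all 6-byte
// records in $binary with one unpack() call instead of one per record.
function decodeRecords(string $binary): array
{
    // Assumption: a trailing partial record, if any, is ignored.
    $usable = intdiv(strlen($binary), 6) * 6;
    if ($usable === 0) {
        return [];
    }

    // 'n*' reads every big-endian unsigned 16-bit word; keys start at 1.
    $words = unpack('n*', substr($binary, 0, $usable));
    $count = count($words);

    $records = [];
    for ($i = 1; $i <= $count; $i += 3) {
        $w1 = $words[$i];
        $w2 = $words[$i + 1];
        $w3 = $words[$i + 2];
        // Same masks and shifts as in the question.
        $records[] = [
            'tr'   => ($w1 & 0xC000) >> 14,
            'y'    => $w1 & 0x01FF,
            'dp'   => ($w2 & 0xC000) >> 14,
            'x'    => $w2 & 0x01FF,
            'site' => ($w3 & 0x3F00) >> 8,
            'bin'  => ($w3 & 0x003F) + 1,
        ];
    }
    return $records;
}

// The sample record from the question: bytes 152, 118, 68, 86, 27, 15.
$sample = pack('C6', 152, 118, 68, 86, 27, 15);
print_r(decodeRecords($sample)[0]);
```

For the sample record this yields tr = 2, y = 118, dp = 1, x = 86 and site = 27. bin comes out as 16 because the code adds 1 to the raw value 15 shown in the diagram.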
php optimization binary unpack bytestream
I'm voting to close this question as off-topic because I think it better suits Code Review.
– RiggsFolly Nov 15 '18 at 16:13
@RiggsFolly Maybe you're right, but maybe more eyeballs are better for this situation. The optimization tag does exist here and OP clearly showed a good understanding of the code at hand. The only thing I would ask for is maybe some sample data (20+ records?) and the current benchmark.
– MonkeyZeus Nov 15 '18 at 16:17
I don't see any room for improvement in this code at all. You should profile what is happening when the code is run. E.g., are you capping out on CPU, IO, or memory? Also, why does this need to run in less than one second? You may benefit from rethinking your approach and where/how this bit of code fits into the larger application.
– Sammitch Nov 15 '18 at 17:40
I agree with @Sammitch in regard to profiling the code; additionally, I trust their judgement that there is no room for improvement because I am not familiar with unpack(). If anything, I would choose to slightly slow down the code by reading the file one record at a time instead of using file_get_contents(), to avoid fatal memory exhaustion errors.
– MonkeyZeus Nov 15 '18 at 17:52
@MonkeyZeus hah I missed that one. I actually feel like changing it to $fh = fopen(...); while( $record = fread($fh, 6) ) { ... } would also be a slight performance improvement given PHP's solid IO internals. Though definitely not enough to shave 40% off the run time.
– Sammitch Nov 15 '18 at 18:55
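The streaming idea from the last two comments could look roughly like the sketch below. It reads several records per fread() call instead of one, and uses the offset argument of unpack() (available since PHP 7.1) to avoid a substr() per record. The generator name readRecords, the $recordsPerRead parameter and its default of 1024 are all hypothetical, and whether this beats file_get_contents() on a given file is something only a benchmark can tell.

```php
<?php
// Hypothetical generator (names are illustrative): stream 6-byte records
// from an open handle, reading $recordsPerRead records per fread() call.
function readRecords($fh, int $recordsPerRead = 1024): Generator
{
    $carry = ''; // bytes left over when a read ends mid-record
    while (!feof($fh)) {
        $chunk = fread($fh, 6 * $recordsPerRead);
        if ($chunk === false || $chunk === '') {
            break;
        }
        $carry .= $chunk;
        $usable = intdiv(strlen($carry), 6) * 6;

        for ($offset = 0; $offset < $usable; $offset += 6) {
            // Third argument is the byte offset to start unpacking from.
            $w = unpack('n3', $carry, $offset);
            yield [
                'tr'   => ($w[1] & 0xC000) >> 14,
                'y'    => $w[1] & 0x01FF,
                'dp'   => ($w[2] & 0xC000) >> 14,
                'x'    => $w[2] & 0x01FF,
                'site' => ($w[3] & 0x3F00) >> 8,
                'bin'  => ($w[3] & 0x003F) + 1,
            ];
        }
        // Keep any incomplete record for the next read.
        $carry = (string) substr($carry, $usable);
    }
}

// Demo on an in-memory stream holding the question's sample record twice.
$fh = fopen('php://memory', 'w+b');
fwrite($fh, str_repeat(pack('C6', 152, 118, 68, 86, 27, 15), 2));
rewind($fh);
foreach (readRecords($fh) as $info) {
    echo $info['x'], ',', $info['y'], "\n";
}
fclose($fh);
```

Only one block of records is held in memory at a time, which addresses the memory exhaustion concern without paying for a separate read per 6-byte record.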
asked Nov 15 '18 at 16:07 by Stefano Radaelli, edited Nov 15 '18 at 16:43