PHP binary stream parsing performance



























I have a binary file made up of many (10k+) records of 6 bytes each.



As an example record, I have a byte string like this: ÿvDV



Ascii : 152 + 118 + 68 + 86 + 27 + 15
Binary: 100110000111011001000100010101100001101100001111


From this string I have to extract certain bits at specific positions and convert their values to integers:



1001100001110110 0100010001010110 0001101100001111
--     |-------| --     |-------| |------||------|
|          |     |          |         |       |
|          |     |          |         |       +------> 00001111 => $bin = 15
|          |     |          |         +--------------> 00011011 => $site = 27
|          |     |          +------------------------> 001010110 => $x = 86
|          |     +-----------------------------------> 01 => $dp = 1
|          +-----------------------------------------> 001110110 => $y = 118
+----------------------------------------------------> 10 => $tr = 2
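The field layout in the diagram can be checked against the example bytes with a few lines of PHP. This is only a sketch: it rebuilds the record from the six byte values listed above, and the masks follow the diagram (8-bit site and bin fields), not necessarily the real file format.

```php
<?php
// Rebuild the example record from its six byte values.
$record = pack('C6', 152, 118, 68, 86, 27, 15);

// "n3" reads three unsigned 16-bit big-endian words, indexed from 1.
$w = unpack('n3', $record);

$fields = [
    'tr'   => $w[1] >> 14,     // top 2 bits of word 1
    'y'    => $w[1] & 0x01FF,  // low 9 bits of word 1
    'dp'   => $w[2] >> 14,     // top 2 bits of word 2
    'x'    => $w[2] & 0x01FF,  // low 9 bits of word 2
    'site' => $w[3] >> 8,      // high byte of word 3, as drawn in the diagram
    'bin'  => $w[3] & 0x00FF,  // low byte of word 3, as drawn in the diagram
];

print_r($fields);
```

The values come out as tr = 2, y = 118, dp = 1, x = 86, site = 27 and bin = 15, matching the diagram.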


Is there a faster approach than this one?



$binary = file_get_contents("/path/to/binary/file.dat");
$length = strlen($binary);
$startbyte = 0; // byte offset of the current record (in my case I'm reading from the 55th byte)
// Stop once fewer than 6 bytes remain, so unpack() never gets a short record.
while ($startbyte + 6 <= $length) {
    $record = unpack("n3", substr($binary, $startbyte, 6));
    $info = array(
        'tr'   => ($record[1] & 0xC000) >> 14,
        'y'    => ($record[1] & 0x01FF),
        'dp'   => ($record[2] & 0xC000) >> 14,
        'x'    => ($record[2] & 0x01FF),
        'site' => (int) (($record[3] & 0x3F00) >> 8),
        'bin'  => (int) (($record[3] & 0x003F) + 1),
    );
    $startbyte += 6;
}


In your experience, is there a faster approach?
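One variant that may be worth measuring is to call unpack() once for the whole buffer instead of once per record. The following is a minimal sketch, not a proven speedup: it assumes the file fits in memory, the helper name parseRecords is hypothetical, and the masks are copied from the loop above (including the + 1 on bin).

```php
<?php
// Hypothetical helper: decode every whole 6-byte record in $binary
// with a single unpack() call.
function parseRecords(string $binary): array
{
    // Ignore a trailing partial record, if any.
    $usable = strlen($binary) - (strlen($binary) % 6);
    if ($usable === 0) {
        return [];
    }

    // "n*" yields every unsigned 16-bit big-endian word, indexed from 1.
    $words = unpack('n*', substr($binary, 0, $usable));
    $count = count($words);

    $records = [];
    for ($i = 1; $i <= $count; $i += 3) {
        $records[] = [
            'tr'   => ($words[$i] & 0xC000) >> 14,
            'y'    => $words[$i] & 0x01FF,
            'dp'   => ($words[$i + 1] & 0xC000) >> 14,
            'x'    => $words[$i + 1] & 0x01FF,
            'site' => ($words[$i + 2] & 0x3F00) >> 8,
            'bin'  => ($words[$i + 2] & 0x003F) + 1,
        ];
    }
    return $records;
}

// Two copies of the example record from the question.
$sample  = str_repeat(pack('C6', 152, 118, 68, 86, 27, 15), 2);
$records = parseRecords($sample);
```

Timing both versions on the real file (for example with hrtime(true) around each loop) would show whether the single call helps at all.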










  • 1





    I'm voting to close this question as off-topic because I think it better suits Code Review.

    – RiggsFolly
    Nov 15 '18 at 16:13








  • 1





    @RiggsFolly Maybe you're right but maybe more eyeballs is better for this situation. The optimization tag does exist here and OP clearly showed a good understanding of the code at hand. The only thing I would ask for is maybe some sample data (20+ records?) and the current benchmark.

    – MonkeyZeus
    Nov 15 '18 at 16:17








  • 1





    I don't see any room for improvement in this code at all. You should profile what is happening when the code is run. E.g. are you capping out on CPU, IO, or memory? Also, why does this need to run in less than one second? You may benefit from rethinking your approach and where/how this bit of code fits into the larger application.

    – Sammitch
    Nov 15 '18 at 17:40






  • 1





    I agree with @Sammitch in regards to profiling the code; additionally, I trust their judgement in regard to having no room for improvement because I am not familiar with unpack(). If anything I would choose to slightly slow down the code by reading the file one record at a time instead of using file_get_contents() to avoid fatal memory exhaustion errors.

    – MonkeyZeus
    Nov 15 '18 at 17:52








  • 1





    @MonkeyZeus hah I missed that one. I actually feel like changing it to $fh = fopen(...); while( $record = fread($fh, 6) ) { ... } would also be a slight performance improvement given PHP's solid IO internals. Though definitely not enough to shave 40% off the run time.

    – Sammitch
    Nov 15 '18 at 18:55
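The fopen()/fread() loop suggested in the last comment could look roughly like the sketch below. It reads one record at a time, so memory use stays flat regardless of file size. The function name readRecords and the temporary-file usage are hypothetical, the masks are copied from the question's loop, and whether this is faster than file_get_contents() should be measured, not assumed.

```php
<?php
// Hypothetical generator: yield one decoded record per 6 bytes read from $path.
function readRecords(string $path): Generator
{
    $fh = fopen($path, 'rb');
    if ($fh === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        // Stop at EOF or on a trailing partial record (fewer than 6 bytes).
        while (($chunk = fread($fh, 6)) !== false && strlen($chunk) === 6) {
            $w = unpack('n3', $chunk);
            yield [
                'tr'   => ($w[1] & 0xC000) >> 14,
                'y'    => $w[1] & 0x01FF,
                'dp'   => ($w[2] & 0xC000) >> 14,
                'x'    => $w[2] & 0x01FF,
                'site' => ($w[3] & 0x3F00) >> 8,
                'bin'  => ($w[3] & 0x003F) + 1,
            ];
        }
    } finally {
        fclose($fh);
    }
}

// Usage with a temporary file holding the example record from the question.
$path = tempnam(sys_get_temp_dir(), 'rec');
file_put_contents($path, pack('C6', 152, 118, 68, 86, 27, 15));
$rows = iterator_to_array(readRecords($path), false);
unlink($path);
```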
















php optimization binary unpack bytestream






edited Nov 15 '18 at 16:43







Stefano Radaelli

















asked Nov 15 '18 at 16:07









Stefano Radaelli







