How to: Exclude a row from stddev/mean calculations and join later
So I'm trying to find machines that throw exceptionally many exceptions as compared to their group by environment and function. The intuition is that the load and task type should be fairly similar across the group, so if one machine throws a lot more exceptions, it's probably somehow in a bad state and should be serviced.
This works fairly well for large groups of machines, but for smaller groups there's a problem: If there's very few machines, and only one of them is throwing a lot of exceptions, it may not get detected. The reason is that because that datapoint is pars of the general stddev and mean calculation for the group, the mean and stddev are biased towards this outlier.
The solution would be to either somehow subtract that data point from the calculated stddev and mean for the entire group, or to calculate the stddev and mean for each machine/environment/function combination (excluding the machine in question from the stddev/mean calculation) instead of just by environment/function group.
Here's the current code that does it by environment/function. Is there an elegant solution to expand it to do the machine/environment/function?
// Find sick machines
let SickMachinesAt = (AtTime:datetime , TimeWindow:timespan = 1h, Sigmas:double = 3.0, MinimumExceptionsToTrigger:int = 10) {
// These are the exceptions we are looking at (time window constrained)
let Exceptions = exception
| where EventInfo_Time between((AtTime - TimeWindow ) .. AtTime);
// Calculate mean and stddev for each bin of environmentName + machineFunction
let MeanAndStdDev = Exceptions
| summarize count() by environmentName, machineFunction, machineName
| summarize avg(count_), stdev(count_) by environmentName, machineFunction
| order by environmentName, machineFunction;
let MachinesWithMeanAndStdDev = Exceptions
| summarize count() by environmentName, machineFunction, machineName
| join kind=fullouter MeanAndStdDev on environmentName, machineFunction;
let SickMachines = MachinesWithMeanAndStdDev |
project machineName,
machineFunction,
environmentName,
totalExceptionCount = count_,
cutoff = avg_count_ + Sigmas * stdev_count_,
signalStrength = ((count_ - avg_count_) / stdev_count_)
| where totalExceptionCount > cutoff and totalExceptionCount > MinimumExceptionsToTrigger
| order by signalStrength desc;
SickMachines
}
kusto azure-data-explorer
add a comment |
So I'm trying to find machines that throw exceptionally many exceptions as compared to their group by environment and function. The intuition is that the load and task type should be fairly similar across the group, so if one machine throws a lot more exceptions, it's probably somehow in a bad state and should be serviced.
This works fairly well for large groups of machines, but for smaller groups there's a problem: If there's very few machines, and only one of them is throwing a lot of exceptions, it may not get detected. The reason is that because that datapoint is pars of the general stddev and mean calculation for the group, the mean and stddev are biased towards this outlier.
The solution would be to either somehow subtract that data point from the calculated stddev and mean for the entire group, or to calculate the stddev and mean for each machine/environment/function combination (excluding the machine in question from the stddev/mean calculation) instead of just by environment/function group.
Here's the current code that does it by environment/function. Is there an elegant solution to expand it to do the machine/environment/function?
// Find sick machines
let SickMachinesAt = (AtTime:datetime , TimeWindow:timespan = 1h, Sigmas:double = 3.0, MinimumExceptionsToTrigger:int = 10) {
// These are the exceptions we are looking at (time window constrained)
let Exceptions = exception
| where EventInfo_Time between((AtTime - TimeWindow ) .. AtTime);
// Calculate mean and stddev for each bin of environmentName + machineFunction
let MeanAndStdDev = Exceptions
| summarize count() by environmentName, machineFunction, machineName
| summarize avg(count_), stdev(count_) by environmentName, machineFunction
| order by environmentName, machineFunction;
let MachinesWithMeanAndStdDev = Exceptions
| summarize count() by environmentName, machineFunction, machineName
| join kind=fullouter MeanAndStdDev on environmentName, machineFunction;
let SickMachines = MachinesWithMeanAndStdDev |
project machineName,
machineFunction,
environmentName,
totalExceptionCount = count_,
cutoff = avg_count_ + Sigmas * stdev_count_,
signalStrength = ((count_ - avg_count_) / stdev_count_)
| where totalExceptionCount > cutoff and totalExceptionCount > MinimumExceptionsToTrigger
| order by signalStrength desc;
SickMachines
}
kusto azure-data-explorer
add a comment |
So I'm trying to find machines that throw exceptionally many exceptions as compared to their group by environment and function. The intuition is that the load and task type should be fairly similar across the group, so if one machine throws a lot more exceptions, it's probably somehow in a bad state and should be serviced.
This works fairly well for large groups of machines, but for smaller groups there's a problem: If there's very few machines, and only one of them is throwing a lot of exceptions, it may not get detected. The reason is that because that datapoint is pars of the general stddev and mean calculation for the group, the mean and stddev are biased towards this outlier.
The solution would be to either somehow subtract that data point from the calculated stddev and mean for the entire group, or to calculate the stddev and mean for each machine/environment/function combination (excluding the machine in question from the stddev/mean calculation) instead of just by environment/function group.
Here's the current code that does it by environment/function. Is there an elegant solution to expand it to do the machine/environment/function?
// Find sick machines
let SickMachinesAt = (AtTime:datetime , TimeWindow:timespan = 1h, Sigmas:double = 3.0, MinimumExceptionsToTrigger:int = 10) {
// These are the exceptions we are looking at (time window constrained)
let Exceptions = exception
| where EventInfo_Time between((AtTime - TimeWindow ) .. AtTime);
// Calculate mean and stddev for each bin of environmentName + machineFunction
let MeanAndStdDev = Exceptions
| summarize count() by environmentName, machineFunction, machineName
| summarize avg(count_), stdev(count_) by environmentName, machineFunction
| order by environmentName, machineFunction;
let MachinesWithMeanAndStdDev = Exceptions
| summarize count() by environmentName, machineFunction, machineName
| join kind=fullouter MeanAndStdDev on environmentName, machineFunction;
let SickMachines = MachinesWithMeanAndStdDev |
project machineName,
machineFunction,
environmentName,
totalExceptionCount = count_,
cutoff = avg_count_ + Sigmas * stdev_count_,
signalStrength = ((count_ - avg_count_) / stdev_count_)
| where totalExceptionCount > cutoff and totalExceptionCount > MinimumExceptionsToTrigger
| order by signalStrength desc;
SickMachines
}
kusto azure-data-explorer
So I'm trying to find machines that throw exceptionally many exceptions as compared to their group by environment and function. The intuition is that the load and task type should be fairly similar across the group, so if one machine throws a lot more exceptions, it's probably somehow in a bad state and should be serviced.
This works fairly well for large groups of machines, but for smaller groups there's a problem: If there's very few machines, and only one of them is throwing a lot of exceptions, it may not get detected. The reason is that because that datapoint is pars of the general stddev and mean calculation for the group, the mean and stddev are biased towards this outlier.
The solution would be to either somehow subtract that data point from the calculated stddev and mean for the entire group, or to calculate the stddev and mean for each machine/environment/function combination (excluding the machine in question from the stddev/mean calculation) instead of just by environment/function group.
Here's the current code that does it by environment/function. Is there an elegant solution to expand it to do the machine/environment/function?
// Find sick machines
let SickMachinesAt = (AtTime:datetime , TimeWindow:timespan = 1h, Sigmas:double = 3.0, MinimumExceptionsToTrigger:int = 10) {
// These are the exceptions we are looking at (time window constrained)
let Exceptions = exception
| where EventInfo_Time between((AtTime - TimeWindow ) .. AtTime);
// Calculate mean and stddev for each bin of environmentName + machineFunction
let MeanAndStdDev = Exceptions
| summarize count() by environmentName, machineFunction, machineName
| summarize avg(count_), stdev(count_) by environmentName, machineFunction
| order by environmentName, machineFunction;
let MachinesWithMeanAndStdDev = Exceptions
| summarize count() by environmentName, machineFunction, machineName
| join kind=fullouter MeanAndStdDev on environmentName, machineFunction;
let SickMachines = MachinesWithMeanAndStdDev |
project machineName,
machineFunction,
environmentName,
totalExceptionCount = count_,
cutoff = avg_count_ + Sigmas * stdev_count_,
signalStrength = ((count_ - avg_count_) / stdev_count_)
| where totalExceptionCount > cutoff and totalExceptionCount > MinimumExceptionsToTrigger
| order by signalStrength desc;
SickMachines
}
kusto azure-data-explorer
kusto azure-data-explorer
asked Nov 14 '18 at 19:20
Anders Sewerin JohansenAnders Sewerin Johansen
607826
607826
add a comment |
add a comment |
1 Answer
1
active
oldest
votes
One option to avoid missed detects due to strong outlier values is to use percentiles based detection. For this purpose you could use make-series followed by the built-in series_outliers function.
Another option is to remove outliers from the calculation and then join back on the data and requires multiple joins. Assuming your exceptions are in Exceptions which contain the dimesnions: environmentName, machineFunction, machineName, you could remove all machines with counts above the 98th percentile with the following pseudo-query:
let ExceptionsCounts = Exceptions
| summarize counts = count() by environmentName, machineFunction, machineName;
let ExceptionsCleansed = ExceptionsCounts
| summarize p98 = percentile(counts, 98) by environmentName, machineFunction
| join kind=inner (ExceptionsCounts) on environmentName, machineFunction
| where counts < p98;
From there you could use ExceptionsCleansed to calculate the mean/stddev and continue with the detection on the original Exceptions with the calculated figures with exactly the same query you posted.
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53307372%2fhow-to-exclude-a-row-from-stddev-mean-calculations-and-join-later%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
One option to avoid missed detects due to strong outlier values is to use percentiles based detection. For this purpose you could use make-series followed by the built-in series_outliers function.
Another option is to remove outliers from the calculation and then join back on the data and requires multiple joins. Assuming your exceptions are in Exceptions which contain the dimesnions: environmentName, machineFunction, machineName, you could remove all machines with counts above the 98th percentile with the following pseudo-query:
let ExceptionsCounts = Exceptions
| summarize counts = count() by environmentName, machineFunction, machineName;
let ExceptionsCleansed = ExceptionsCounts
| summarize p98 = percentile(counts, 98) by environmentName, machineFunction
| join kind=inner (ExceptionsCounts) on environmentName, machineFunction
| where counts < p98;
From there you could use ExceptionsCleansed to calculate the mean/stddev and continue with the detection on the original Exceptions with the calculated figures with exactly the same query you posted.
add a comment |
One option to avoid missed detects due to strong outlier values is to use percentiles based detection. For this purpose you could use make-series followed by the built-in series_outliers function.
Another option is to remove outliers from the calculation and then join back on the data and requires multiple joins. Assuming your exceptions are in Exceptions which contain the dimesnions: environmentName, machineFunction, machineName, you could remove all machines with counts above the 98th percentile with the following pseudo-query:
let ExceptionsCounts = Exceptions
| summarize counts = count() by environmentName, machineFunction, machineName;
let ExceptionsCleansed = ExceptionsCounts
| summarize p98 = percentile(counts, 98) by environmentName, machineFunction
| join kind=inner (ExceptionsCounts) on environmentName, machineFunction
| where counts < p98;
From there you could use ExceptionsCleansed to calculate the mean/stddev and continue with the detection on the original Exceptions with the calculated figures with exactly the same query you posted.
add a comment |
One option to avoid missed detects due to strong outlier values is to use percentiles based detection. For this purpose you could use make-series followed by the built-in series_outliers function.
Another option is to remove outliers from the calculation and then join back on the data and requires multiple joins. Assuming your exceptions are in Exceptions which contain the dimesnions: environmentName, machineFunction, machineName, you could remove all machines with counts above the 98th percentile with the following pseudo-query:
let ExceptionsCounts = Exceptions
| summarize counts = count() by environmentName, machineFunction, machineName;
let ExceptionsCleansed = ExceptionsCounts
| summarize p98 = percentile(counts, 98) by environmentName, machineFunction
| join kind=inner (ExceptionsCounts) on environmentName, machineFunction
| where counts < p98;
From there you could use ExceptionsCleansed to calculate the mean/stddev and continue with the detection on the original Exceptions with the calculated figures with exactly the same query you posted.
One option to avoid missed detects due to strong outlier values is to use percentiles based detection. For this purpose you could use make-series followed by the built-in series_outliers function.
Another option is to remove outliers from the calculation and then join back on the data and requires multiple joins. Assuming your exceptions are in Exceptions which contain the dimesnions: environmentName, machineFunction, machineName, you could remove all machines with counts above the 98th percentile with the following pseudo-query:
let ExceptionsCounts = Exceptions
| summarize counts = count() by environmentName, machineFunction, machineName;
let ExceptionsCleansed = ExceptionsCounts
| summarize p98 = percentile(counts, 98) by environmentName, machineFunction
| join kind=inner (ExceptionsCounts) on environmentName, machineFunction
| where counts < p98;
From there you could use ExceptionsCleansed to calculate the mean/stddev and continue with the detection on the original Exceptions with the calculated figures with exactly the same query you posted.
answered Nov 19 '18 at 17:59
RoyORoyO
261
261
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53307372%2fhow-to-exclude-a-row-from-stddev-mean-calculations-and-join-later%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown