Pandas dataframe find first and last element given condition and calculate slope












The situation:



I have a pandas DataFrame with data about the production of a product. Each product is produced in 3 phases. The phases are not of fixed length: the number of cycles in a phase (how long it lasts) varies. During each production phase, the temperature of the product is measured at every cycle.



Please see the table below:



[image of the table; not reproduced here]



The problem:



I need to calculate the slope for each cycle of each phase for each product and add it to the DataFrame in a new column called "Slope". The column highlighted in yellow is one I added manually in an Excel file. The real dataset contains hundreds of parameters (not only temperatures), so in reality I need to calculate the slope for many columns, which is why I tried to define a function.



My solution is not working at all:



This is the code I tried, but it does not work. I am trying to find the first and last rows for a given product and phase, take the temperature from each, and use the difference between the two to calculate the slope.
This is all I could come up with so far (I created another column called "Max_cylce_no", which stores the maximum cycle number for each phase):



    temp_at_start = -1

    def slope(col_name):
        global temp_at_start
        start_cycle_no = 1
        if row["Cycle"] == 1:
            temp_at_start = row["Temperature"]
            start_row = df.index(row)

        cycle_numbers = row["Max_cylce_no"]
        last_cycle_row = cycle_numbers + start_row

        last_temp = df.loc[last_cycle_row, "Temperature"]


And the way I would like to apply it:



df.apply(slope("Temperature"), axis=1)


Unfortunately, I get a NameError right away: name 'row' is not defined.
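For context, the error arises because slope("Temperature") is called immediately, before apply() runs, and nothing named row exists inside the function at that point. The sketch below reproduces this on a tiny made-up frame (the real table was only posted as an image) and shows one way a row-wise function can receive the row from apply(). The helper first_diff and its parameters are hypothetical names used only for illustration.

```python
import pandas as pd

# Hypothetical miniature frame; the real data was not posted as text.
df = pd.DataFrame({"Cycle": [1, 2, 3], "Temperature": [20.0, 22.0, 26.0]})

def slope(col_name):
    # No parameter or variable called `row` exists in this scope.
    return row[col_name]

try:
    # slope("Temperature") is evaluated first, so apply() is never reached.
    df.apply(slope("Temperature"), axis=1)
except NameError as exc:
    error_message = str(exc)

def first_diff(row, col_name, start_value):
    # With axis=1, apply() passes each row as the first positional argument;
    # extra keyword arguments given to apply() are forwarded to the function.
    return row[col_name] - start_value

diffs = df.apply(
    first_diff,
    axis=1,
    col_name="Temperature",
    start_value=df["Temperature"].iloc[0],
)
```

Passing the function object itself (not the result of calling it) is what lets apply() supply the row. A row-wise apply() still cannot see the other rows of the same phase, which is why a grouped approach tends to fit this problem better.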



Could you please help me and point me in the right direction on how to solve this problem? It is giving me a really hard time. :(



Thank you in advance!











  • Providing images as a source of data is not really helpful if we want to try our solutions. Can you provide the data as text?
    – Yuca
    Nov 10 at 19:26















Tags: python, pandas, dataframe






asked Nov 10 at 19:19 by hunsnowboarder
























1 Answer

















Accepted answer (score 2)










I believe you need GroupBy.transform: subtract the first value from the last and divide by the group length:



    # Per group: (last value - first value) / number of rows in the group
    f = lambda x: (x.iloc[-1] - x.iloc[0]) / len(x)
    df['new'] = df.groupby(['Product_no','Phase_no'])['Temperature'].transform(f)
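As a rough, self-contained illustration of the snippet above, here is a sketch on made-up data. The column names Product_no, Phase_no and Temperature follow the answer; the sample values, the extra Pressure column, the `_slope` suffix and the helper name first_last_slope are assumptions added only to show how the same transform might be repeated over many parameter columns, as the question asks.

```python
import pandas as pd

# Made-up sample data; only the key and Temperature column names come from the thread.
df = pd.DataFrame({
    "Product_no":  [1, 1, 1, 1, 1, 2, 2, 2],
    "Phase_no":    [1, 1, 1, 2, 2, 1, 1, 1],
    "Cycle":       [1, 2, 3, 1, 2, 1, 2, 3],
    "Temperature": [20.0, 23.0, 26.0, 30.0, 34.0, 10.0, 10.0, 13.0],
    "Pressure":    [1.0, 1.5, 4.0, 2.0, 2.0, 5.0, 4.0, 2.0],
})

def first_last_slope(x):
    # Same formula as the answer: (last - first) / number of rows in the group.
    return (x.iloc[-1] - x.iloc[0]) / len(x)

keys = ["Product_no", "Phase_no"]
value_cols = ["Temperature", "Pressure"]  # hypothetical list of parameter columns

for col in value_cols:
    # transform() broadcasts the per-group scalar back to every row of the group.
    df[col + "_slope"] = df.groupby(keys)[col].transform(first_last_slope)

print(df)
```

This assumes the rows within each product and phase are already ordered by cycle; if they might not be, sorting by the key columns and Cycle first would be a reasonable precaution. The divisor here is the number of rows in the group, as in the answer, not the number of intervals between the first and last cycle.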





  • Nice one, I believe this is the output OP required.
    – pygo
    Nov 10 at 19:49






  • You are amazing! Thank you so much! It works like a charm!
    – hunsnowboarder
    Nov 10 at 19:59










  • @hunsnowboarder - You are welcome!
    – jezrael
    Nov 10 at 19:59











answered Nov 10 at 19:27 by jezrael







