
Element-wise subtraction of DataFrames in Python pandas

The following works, but I was surprised that I needed to use numpy to do this.

import numpy as np
import pandas as pd
from io import StringIO

csv = '''\
pool,employee,xd1,xd2,xd1_bar,xd2_bar
1,a,-5.25,-3.92,-4.25,-3.42
1,b,-4.25,-3.92,-4.25,-3.42
1,c,-4.25,-2.92,-4.25,-3.42
1,d,-3.25,-2.92,-4.25,-3.42
2,e,-1.25,-0.92,-0.5,-0.16999999999999998
2,f,-1.25,1.08,-0.5,-0.16999999999999998
2,g,-0.25,0.08,-0.5,-0.16999999999999998
2,h,0.75,-0.92,-0.5,-0.16999999999999998
3,i,3.75,3.08,4.75,3.58
3,j,4.75,2.08,4.75,3.58
3,k,4.75,4.08,4.75,3.58
3,l,5.75,5.08,4.75,3.58
'''

data = pd.read_csv(StringIO(csv))

c1 = ["xd1", "xd2"]
c2 = ["xd1_bar", "xd2_bar"]

data_sub = data.join(
    pd.DataFrame(np.array(data[c1]) - np.array(data[c2]), columns=["x1_dev", "x2_dev"])
)

I expected the following to work:

data[c1].sub(data[c2])
  • What are c1 and c2 here? Commented Jul 4, 2020 at 15:42

1 Answer


The way you did it is correct. pandas DataFrame subtraction aligns on both column labels and index, and the c1 and c2 columns have different names, so nothing lines up and the result is all NaN.
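To see the alignment behaviour in isolation, here is a minimal sketch on a made-up two-row frame. The values are hypothetical; only the column names follow the question. It also shows one possible way to subtract by position while staying in pandas, by converting the right-hand side with to_numpy() so it carries no labels.

```python
import pandas as pd

# Small illustrative frame (hypothetical values, column names as in the question).
df = pd.DataFrame({
    "xd1": [1.0, 2.0],
    "xd2": [3.0, 4.0],
    "xd1_bar": [0.5, 0.5],
    "xd2_bar": [1.0, 1.0],
})
c1 = ["xd1", "xd2"]
c2 = ["xd1_bar", "xd2_bar"]

# Label-aligned subtraction: the two sides share no column names, so the
# result has all four columns and every cell is NaN.
aligned = df[c1].sub(df[c2])

# Positional subtraction: strip the labels from the right-hand side first,
# then rename the result columns.
positional = df[c1].sub(df[c2].to_numpy())
positional.columns = ["x1_dev", "x2_dev"]

print(aligned)
print(positional)
```

Because the left-hand side stays a DataFrame, positional keeps the original index, so it can be joined back without passing index= by hand.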

Do fix your output by passing the index, though. A new DataFrame gets a default RangeIndex, but your original one may have a different index, and you don't want to lose information in the join.

data_sub = data.join(
    pd.DataFrame(np.array(data[c1]) - np.array(data[c2]), columns=["x1_dev", "x2_dev"], index=data.index)
)
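As a rough illustration of why index= matters, the sketch below uses a hypothetical frame whose index is 10, 11 instead of the default 0, 1 (for example, after filtering). The frame and its values are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a non-default integer index.
df = pd.DataFrame(
    {"xd1": [1.0, 2.0], "xd1_bar": [0.5, 0.5]},
    index=[10, 11],
)

diff = np.array(df[["xd1"]]) - np.array(df[["xd1_bar"]])

# Without index=, the new frame gets a RangeIndex (0, 1). Those labels do not
# match 10 and 11, so the joined column is all NaN.
lost = df.join(pd.DataFrame(diff, columns=["x1_dev"]))

# Passing index=df.index keeps the rows lined up.
kept = df.join(pd.DataFrame(diff, columns=["x1_dev"], index=df.index))

print(lost)
print(kept)
```

With the data in the question the index is already 0..11, so both versions give the same result there; the difference only shows up once the index is something else.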

1 Comment

An alternative: data.join(np.subtract(data[c1],data[c2]),rsuffix='_dev')
