Subtracting values of attributes within one Pandas Dataframe from another dataframe

Question

This problem involves three separate dataframes:

df1 represents the 'Total' for products 1, 2, 3, containing 'Value1' and 'Value2'.
df2 represents 'Customer1' for products 1, 2, 3, containing 'Value1' and 'Value2'.
df3 represents 'Customer2' for products 1, 2, 3, containing 'Value1' and 'Value2'.

df2 & df3 are essentially subsets of df1.

I would like to create another dataframe, df4, that subtracts df2 and df3 from df1. In df4, the 'Market' column should read 'remaining customers'.

This is what I have done so far

import pandas as pd


d1 = {'Market': ['Total', 'Total','Total'], 'Product Code': [1, 2, 3], 
'Value1':[10, 20, 30], 'Value2':[5, 15, 25]}
df1 = pd.DataFrame(data=d1)
df1



d2 = {'Market': ['Customer1', 'Customer1','Customer1'], 'Product Code': [1, 
2, 3], 'Value1':[3, 14, 10], 'Value2':[2, 4, 6]}
df2 = pd.DataFrame(data=d2)
df2


d3 = {'Market': ['Customer2', 'Customer2','Customer2'], 'Product Code': [1, 
2, 3], 'Value1':[3, 3, 4], 'Value2':[2, 6, 10]}
df3 = pd.DataFrame(data=d3)
df3

This produces the following result:

  Market  Product Code  Value1  Value2
0  Total             1      10       5
1  Total             2      20      15
2  Total             3      30      25

      Market  Product Code  Value1  Value2
0  Customer1             1       3       2
1  Customer1             2      14       4
2  Customer1             3      10       6

      Market  Product Code  Value1  Value2
0  Customer2             1       3       2
1  Customer2             2       3       6
2  Customer2             3       4      10

To create df4, I tried the following code, but it raises the error "TypeError: unsupported operand type(s) for -: 'str' and 'str'". Can anyone help?

df4 = df1-(df2+df3)

print(df4)
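For context, the error comes from the 'Market' column: arithmetic between two dataframes is applied to every column, including the string one. A minimal sketch of that behaviour, using cut-down frames whose names (`total`, `cust`) are made up for illustration and whose 'Market' column is forced to object dtype as an assumption about how the strings are stored:

```python
import pandas as pd

# Hypothetical cut-down frames: one string column and one numeric column.
# dtype=object is set explicitly so the strings are plain Python str objects.
total = pd.DataFrame({'Market': pd.Series(['Total'], dtype=object),
                      'Value1': [10]})
cust = pd.DataFrame({'Market': pd.Series(['Customer1'], dtype=object),
                     'Value1': [3]})

# Restricting the operation to the numeric column works.
numeric_diff = total[['Value1']] - cust[['Value1']]

# Subtracting the whole frames also attempts 'Total' - 'Customer1'.
try:
    total - cust
    failed = False
except TypeError:
    failed = True

print(numeric_diff)
print(failed)
```

The numeric subtraction succeeds, while the whole-frame subtraction fails as soon as it reaches the string column.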

cs95 · Accepted Answer · 2018-02-19 03:43:39Z


Drop Market, set Product Code as the index, and perform index aligned arithmetic on the product codes. After that, just reset the index and insert Market into the result.

df1, df2, df3 = [
      df.drop(columns='Market').set_index('Product Code') for df in [df1, df2, df3]
]

df4 = (df1 - (df2 + df3)).reset_index()
df4.insert(0, 'Market', 'RemainingCustomers')

               Market  Product Code  Value1  Value2
0  RemainingCustomers             1       4       1
1  RemainingCustomers             2       3       5
2  RemainingCustomers             3      16       9
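A self-contained sketch of the same index-aligned idea, written so that df1, df2 and df3 are not rebound and keep their 'Market' column. The helper name `by_product` is hypothetical and not part of the answer above:

```python
import pandas as pd

df1 = pd.DataFrame({'Market': ['Total'] * 3, 'Product Code': [1, 2, 3],
                    'Value1': [10, 20, 30], 'Value2': [5, 15, 25]})
df2 = pd.DataFrame({'Market': ['Customer1'] * 3, 'Product Code': [1, 2, 3],
                    'Value1': [3, 14, 10], 'Value2': [2, 4, 6]})
df3 = pd.DataFrame({'Market': ['Customer2'] * 3, 'Product Code': [1, 2, 3],
                    'Value1': [3, 3, 4], 'Value2': [2, 6, 10]})


def by_product(df):
    """Hypothetical helper: drop the label column and index by product code."""
    return df.drop(columns='Market').set_index('Product Code')


# Rows are matched on 'Product Code' rather than on row position.
df4 = (by_product(df1) - (by_product(df2) + by_product(df3))).reset_index()
df4.insert(0, 'Market', 'RemainingCustomers')
print(df4)
```

Because the subtraction aligns on the product code, this version should still pair the right rows if the three frames list their products in different orders.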


piRSquared · Accepted Answer · 2018-02-19 04:05:33Z

Not exactly what OP asked for but it seemed to me that this might be a better way to manage the data.

df = pd.concat([df1, df2, df3]).set_index(['Product Code', 'Market'])

formula = 'RemainingCustomers = Total - Customer1 - Customer2'
df = df.unstack().stack(0).eval(formula).unstack()
df

Market       Customer1        Customer2         Total        RemainingCustomers       
                Value1 Value2    Value1 Value2 Value1 Value2             Value1 Value2
Product Code                                                                          
1                    3      2         3      2     10      5                  4      1
2                   14      4         3      6     20     15                  3      5
3                   10      6         4     10     30     25                 16      9

And

df['RemainingCustomers']

              Value1  Value2
Product Code                
1                  4       1
2                  3       5
3                 16       9

If we insist on the requested output

df.stack(0).reset_index().query(
    'Market == "RemainingCustomers"').reindex(columns=df1.columns)

                Market  Product Code  Value1  Value2
2   RemainingCustomers             1       4       1
6   RemainingCustomers             2       3       5
10  RemainingCustomers             3      16       9

Or

df.stack(0).xs(
    'RemainingCustomers', level=1, drop_level=False
).reset_index().reindex(columns=df1.columns)

               Market  Product Code  Value1  Value2
0  RemainingCustomers             1       4       1
1  RemainingCustomers             2       3       5
2  RemainingCustomers             3      16       9

jpp · Accepted Answer · 2018-02-19 03:53:36Z

Here is one way:

cols = ['Value1', 'Value2']
df4 = df1[cols].subtract(df2[cols].add(df3[cols]))\
               .assign(**{'Market': 'RemainingCustomers', 'Product Code': [1, 2, 3]})\
               .sort_index(axis=1)

#                Market  Product Code  Value1  Value2
# 0  RemainingCustomers             1       4       1
# 1  RemainingCustomers             2       3       5
# 2  RemainingCustomers             3      16       9

Explanation

df1[cols].subtract(df2[cols].add(df3[cols])) performs calculation on specified columns only.
assign(**{'Market': 'RemainingCustomers', 'Product Code': [1, 2, 3]}) adds extra columns required for results dataframe.
sort_index(axis=1) reorders columns for desired output.
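The three steps above can be sketched as a runnable whole. This variant takes 'Product Code' from df1 instead of hard-coding [1, 2, 3], which is a small change of my own rather than part of the answer. It still aligns on the default row index, so it assumes the three frames list the products in the same row order:

```python
import pandas as pd

df1 = pd.DataFrame({'Market': ['Total'] * 3, 'Product Code': [1, 2, 3],
                    'Value1': [10, 20, 30], 'Value2': [5, 15, 25]})
df2 = pd.DataFrame({'Market': ['Customer1'] * 3, 'Product Code': [1, 2, 3],
                    'Value1': [3, 14, 10], 'Value2': [2, 4, 6]})
df3 = pd.DataFrame({'Market': ['Customer2'] * 3, 'Product Code': [1, 2, 3],
                    'Value1': [3, 3, 4], 'Value2': [2, 6, 10]})

cols = ['Value1', 'Value2']
df4 = (
    df1[cols]
    .subtract(df2[cols].add(df3[cols]))            # numeric columns only
    .assign(**{'Market': 'RemainingCustomers',     # scalar is broadcast
               'Product Code': df1['Product Code']})  # aligned on row index
    .sort_index(axis=1)                            # alphabetical column order
)
print(df4)
```

The alphabetical sort happens to give the requested column order here only because 'Market' < 'Product Code' < 'Value1' < 'Value2'.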

BENY · Accepted Answer · 2018-02-19 04:36:04Z


Maybe we can use select_dtypes

(df1.select_dtypes(exclude='object')
 - df2.select_dtypes(exclude='object')
 - df3.select_dtypes(exclude='object'))\
    .drop(columns='Product Code')\
    .combine_first(df1)\
    .assign(Market='remaining customers')
Out[133]: 
                Market  Product Code  Value1  Value2
0  remaining customers           1.0       4       1
1  remaining customers           2.0       3       5
2  remaining customers           3.0      16       9


1 Comment

cs95 Over a year ago

This definitely works but you should segment your answer across multiple lines :)
