You can sum Pandas DataFrame columns (all of them or a selected subset) using sum(), iloc[], eval(), and loc[]. Among these, the DataFrame.sum() function returns the sum of the values along the requested axis; to add up the column values within each row, use axis=1. In this article, I will explain how to sum Pandas DataFrame columns row by row, for all or for given columns, with examples.
Key Points –
- The sum() method can be used to compute the sum of values across specified DataFrame columns.
- By default, the sum() function operates along the index (i.e., vertically), summing each column.
- The axis parameter controls the direction of the sum; axis=0 sums down the rows and returns one total per column, while axis=1 sums across the columns and returns one total per row.
- You can sum specific columns by selecting them before applying the sum() function, enabling focused calculations.
- The sum() function skips NA/null values by default (skipna=True), so a missing value does not turn the whole result into NaN.
- The sum() method can be chained with other DataFrame operations, allowing for concise and readable data manipulation.
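As a minimal sketch of the axis and NA points above, here is a small made-up DataFrame named scores (an assumption for illustration, not the article's data) with one missing value. It shows the default column-wise sum, the row-wise sum, and what changes when skipna=False is passed.

```python
import pandas as pd

# Hypothetical sample data; None becomes NaN in the numeric column
scores = pd.DataFrame(
    {"mathematics": [80, 90, None], "science": [85, 95, 80]},
    index=["r1", "r2", "r3"],
)

# axis=0 is the default: one total per column
column_totals = scores.sum()

# axis=1: one total per row, NaN is skipped by default
row_totals = scores.sum(axis=1)

# skipna=False: a NaN in the row makes that row's total NaN
strict_row_totals = scores.sum(axis=1, skipna=False)

print(column_totals)
print(row_totals)
print(strict_row_totals)
```

Here row r3 has a total of 80.0 with the default settings and NaN when skipna=False is used.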
Quick Examples of Sum DataFrame Columns
If you are in a hurry, below are some quick examples of how to sum all or given columns of a pandas DataFrame.
# Quick examples of sum DataFrame columns

# Example 1: Using DataFrame.sum()
# To sum all numeric columns for each row
df2 = df.sum(axis=1, numeric_only=True)

# Example 2: Add the sum of all numeric columns for each row to the DataFrame
df['Sum'] = df.sum(axis=1, numeric_only=True)

# Example 3: Just a few columns to sum
df['Sum'] = df['mathematics'] + df['science'] + df['english']

# Example 4: Remove english column from the column list
col_list = list(df)
col_list.remove('english')

# Example 5: Sum specific columns
# (assumes df does not already contain a 'Sum' column)
col_list = list(df)
col_list.remove('english')
df['Sum'] = df[col_list].sum(axis=1, numeric_only=True)

# Example 6: Select columns at positions 1 and 2 (the slice end is excluded)
df['Sum'] = df.iloc[:, 1:3].sum(axis=1)

# Example 7: Select columns at positions 1 and 2
# To sum using DataFrame.iloc[]
df['Sum'] = df.iloc[:, [1, 2]].sum(axis=1)

# Example 8: Using DataFrame.iloc[]
# To select columns at positions 2 and 3 to sum
df['Sum'] = df.iloc[:, [2, 3]].sum(axis=1)

# Example 9: Sum columns mathematics and science for rows r2 to r4
df['Sum'] = df.loc['r2':'r4', ['mathematics', 'science']].sum(axis=1)

# Example 10: Using DataFrame.eval() function
# To sum columns for each row
df2 = df.eval('Sum = mathematics + english')

# Example 11: Using DataFrame.loc[] and eval function
# To sum specific rows
df2 = df.loc['r2':'r4'].eval('Sum = mathematics + science')
Now, let's create a DataFrame with a few rows and columns, execute these examples and validate the results. Our DataFrame contains the column names student_name, mathematics, science, and english.
# Create DataFrame
import pandas as pd

studentdetails = {
    "student_name": ["Ram", "Sam", "Scott", "Ann", "John"],
    "mathematics": [80, 90, 85, 70, 95],
    "science": [85, 95, 80, 90, 75],
    "english": [90, 85, 80, 70, 95]
}
index_labels = ['r1', 'r2', 'r3', 'r4', 'r5']
df = pd.DataFrame(studentdetails, index=index_labels)
print("Create DataFrame:\n", df)
Yields below output.
Using DataFrame.sum() to Sum All Columns
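Use DataFrame.sum() to get the sum/total of a DataFrame along either axis. To add up the columns for each row, pass axis=1. By default, this method uses axis=0, which sums down the rows and returns one total per column. Because student_name holds strings, pass numeric_only=True so that only the numeric columns are summed; since pandas 2.0, non-numeric columns are no longer dropped automatically.
# Using DataFrame.sum() to sum each row
df2 = df.sum(axis=1, numeric_only=True)
print("Get sum of all the columns for each row:\n", df2)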
Yields below output. The above example calculates the sum of all numeric columns for each row. This also returns the index for each row to identify the result.
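Since the DataFrame mixes a string column with numeric ones, it can help to be explicit about which columns take part in the sum. The sketch below rebuilds the article's DataFrame and shows two ways to restrict the row-wise sum to numeric columns: the numeric_only=True argument and select_dtypes(). Treat it as one possible approach, not the only one.

```python
import pandas as pd

# Same data as the article's example DataFrame
df = pd.DataFrame(
    {
        "student_name": ["Ram", "Sam", "Scott", "Ann", "John"],
        "mathematics": [80, 90, 85, 70, 95],
        "science": [85, 95, 80, 90, 75],
        "english": [90, 85, 80, 70, 95],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

# Option 1: let sum() ignore non-numeric columns
row_totals = df.sum(axis=1, numeric_only=True)

# Option 2: select the numeric columns first, then sum
same_totals = df.select_dtypes(include="number").sum(axis=1)

print(row_totals)
print(row_totals.equals(same_totals))
```

Both options return a Series indexed by r1 to r5 with one total per row.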
Add Sum Columns to DataFrame Using DataFrame.sum()
If you look at the above output, the column values that are part of the sum are not returned by the DataFrame.sum() function. However, you can keep all columns alongside the total by assigning the result of DataFrame.sum() to a new DataFrame column. Here I add a column 'Sum', which is the sum of the columns 'mathematics', 'science', and 'english'.
# Add the sum of all numeric columns for each row to the DataFrame
df['Sum'] = df.sum(axis=1, numeric_only=True)
print("Get sum of all the columns for each row:\n", df)

# If you have only a few columns to sum
df['Sum'] = df['mathematics'] + df['science'] + df['english']
print("Get sum of all the columns for each row:\n", df)
Yields below output.
# Output:
# Get sum of all the columns for each row:
    student_name  mathematics  science  english  Sum
r1           Ram           80       85       90  255
r2           Sam           90       95       85  270
r3         Scott           85       80       80  245
r4           Ann           70       90       70  230
r5          John           95       75       95  265
Calculate the Sum of the Given Columns
To calculate the sum of a given column or a list of columns, use the sum() function. First, create a list with the columns you want, slice the DataFrame with that list, and then apply the sum() function. Use df['Sum']=df[col_list].sum(axis=1) to get the total for each row.
# Create list of the numeric columns to sum
col_list = ['mathematics', 'science']

# Get sum of specific columns for each row
df['Sum'] = df[col_list].sum(axis=1)
print("Get sum of specific columns for each row:\n", df)
Yields below output.
# Output:
# Get sum of specific columns for each row:
    student_name  mathematics  science  english  Sum
r1           Ram           80       85       90  165
r2           Sam           90       95       85  185
r3         Scott           85       80       80  165
r4           Ann           70       90       70  160
r5          John           95       75       95  170
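When it is easier to say which columns to leave out than which to keep, the column list can be built by exclusion. The following is a small sketch using Index.difference() on the article's data; the exclude list is an assumption chosen for illustration.

```python
import pandas as pd

# Same data as the article's example DataFrame
df = pd.DataFrame(
    {
        "student_name": ["Ram", "Sam", "Scott", "Ann", "John"],
        "mathematics": [80, 90, 85, 70, 95],
        "science": [85, 95, 80, 90, 75],
        "english": [90, 85, 80, 70, 95],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

# Columns to leave out of the sum (illustrative choice)
exclude = ["student_name", "english"]

# difference() returns the remaining column labels, sorted
col_list = df.columns.difference(exclude)

# Row-wise sum over the remaining columns only
row_totals = df[col_list].sum(axis=1)

print(list(col_list))
print(row_totals)
```

Note that difference() sorts the labels, which does not matter for a sum but may matter if you also use col_list to order columns.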
Using DataFrame.iloc[] to Get Sum of Columns
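Use DataFrame.iloc[] to select the columns to sum by integer position, and call sum(axis=1) on the result. Positions start at 0 and the end of a slice is excluded, so iloc[:,1:3] selects the columns at positions 1 and 2.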
# Select columns at positions 1 and 2 (mathematics, science) with a slice
df['Sum'] = df.iloc[:, 1:3].sum(axis=1)
print("Get sum of specific columns for each row:\n", df)

# Select columns at positions 1 and 2 with a list
# Using DataFrame.iloc[]
df['Sum'] = df.iloc[:, [1, 2]].sum(axis=1)
print("Get sum of specific columns for each row:\n", df)

# Using DataFrame.iloc[]
# To select columns at positions 2 and 3 (science, english) to sum
df['Sum'] = df.iloc[:, [2, 3]].sum(axis=1)
print("Get sum of specific columns for each row:\n", df)
The first two selections (mathematics and science) yield the output below.
# Output:
# Get sum of specific columns for each row:
    student_name  mathematics  science  english  Sum
r1           Ram           80       85       90  165
r2           Sam           90       95       85  185
r3         Scott           85       80       80  165
r4           Ann           70       90       70  160
r5          John           95       75       95  170
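To make the position rules of iloc[] concrete, this short sketch on the article's data checks that the slice 1:3 and the list [1, 2] pick the same columns, and shows what the list [2, 3] sums instead.

```python
import pandas as pd

# Same data as the article's example DataFrame
df = pd.DataFrame(
    {
        "student_name": ["Ram", "Sam", "Scott", "Ann", "John"],
        "mathematics": [80, 90, 85, 70, 95],
        "science": [85, 95, 80, 90, 75],
        "english": [90, 85, 80, 70, 95],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

# Slice 1:3 covers positions 1 and 2; position 3 is excluded
by_slice = df.iloc[:, 1:3].sum(axis=1)

# The explicit list [1, 2] selects the same two columns
by_list = df.iloc[:, [1, 2]].sum(axis=1)

# Positions 2 and 3 are science and english
last_two = df.iloc[:, [2, 3]].sum(axis=1)

print(list(df.columns[1:3]))
print(by_slice.equals(by_list))
print(last_two)
```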
Using DataFrame.loc[] to Sum All Rows From Row r2 to r4
By using DataFrame.loc[], select the columns by label, and then use sum(axis=1) to calculate the total of those columns. With loc[] you can also specify the rows you want to sum. Rows that are not selected with loc[] get NaN in the Sum column, and the column becomes float because NaN is a floating-point value.
# Sum columns mathematics and science for rows r2 to r4
df['Sum'] = df.loc['r2':'r4', ['mathematics', 'science']].sum(axis=1)
print("Get sum of specific columns for each row:\n", df)
Yields below output.
# Output:
# Get sum of specific columns for each row:
    student_name  mathematics  science  english    Sum
r1           Ram           80       85       90    NaN
r2           Sam           90       95       85  185.0
r3         Scott           85       80       80  165.0
r4           Ann           70       90       70  160.0
r5          John           95       75       95    NaN
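The NaN values come from index alignment: the partial result only has labels r2 to r4, and the assignment matches rows by label. The sketch below shows this on the article's data, plus one possible way to fill the unselected rows with 0 using reindex(); the Sum_filled column name is made up for illustration.

```python
import pandas as pd

# Same data as the article's example DataFrame
df = pd.DataFrame(
    {
        "student_name": ["Ram", "Sam", "Scott", "Ann", "John"],
        "mathematics": [80, 90, 85, 70, 95],
        "science": [85, 95, 80, 90, 75],
        "english": [90, 85, 80, 70, 95],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

# Label slices with loc[] include both endpoints: r2, r3 and r4
partial = df.loc["r2":"r4", ["mathematics", "science"]].sum(axis=1)

# Assignment aligns on the index, so r1 and r5 become NaN
df["Sum"] = partial

# Optional: give unselected rows a 0 instead of NaN (hypothetical column name)
df["Sum_filled"] = partial.reindex(df.index, fill_value=0)

print(df)
```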
To understand the differences between loc[] and iloc[], read the article pandas difference between loc[] vs iloc[]
Sum of Columns Using DataFrame.eval() Function
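Use DataFrame.eval('Sum = mathematics + english') to sum specific columns for each row. The eval() function evaluates the expression string against the column names and returns a new DataFrame that includes the Sum column; the original df is left unchanged unless you pass inplace=True.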
# Using DataFrame.eval() function to sum the columns for each row
df2 = df.eval('Sum = mathematics + english')
print("Get sum of specific columns for each row:\n", df2)
Yields below output.
# Output:
# Get sum of specific columns for each row:
    student_name  mathematics  science  english  Sum
r1           Ram           80       85       90  170
r2           Sam           90       95       85  175
r3         Scott           85       80       80  165
r4           Ann           70       90       70  140
r5          John           95       75       95  190
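One detail worth checking for yourself is whether eval() changes the DataFrame it is called on. This is a minimal sketch on a two-row subset of the article's data (the small frame and the had_sum_before variable are assumptions for illustration): by default the result is a new DataFrame, and inplace=True writes the column into the original.

```python
import pandas as pd

# Two rows of the article's data, enough to show the behaviour
df = pd.DataFrame(
    {"mathematics": [80, 90], "english": [90, 85]},
    index=["r1", "r2"],
)

# Default: eval() returns a new DataFrame with the extra column
df2 = df.eval("Sum = mathematics + english")

# The original is untouched at this point
had_sum_before = "Sum" in df.columns

# inplace=True adds the column to df itself and returns None
df.eval("Sum = mathematics + english", inplace=True)

print(df2)
print(had_sum_before)
print(df)
```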
Using DataFrame.loc[] and eval Function to Sum Specific Columns
You can use DataFrame.loc['r2':'r4'].eval('Sum = mathematics + science') to get the sum of specific columns for specific rows. DataFrame.loc[] first restricts the DataFrame to the requested rows, and eval() then computes the sum on that subset, so the result contains only those rows.
# Using DataFrame.loc[] and eval function to sum specific rows
df2 = df.loc['r2':'r4'].eval('Sum = mathematics + science')
print("Get sum of specific columns for specific rows:\n", df2)
Yields below output.
# Output:
# Get sum of specific columns for specific rows:
    student_name  mathematics  science  english  Sum
r2           Sam           90       95       85  185
r3         Scott           85       80       80  165
r4           Ann           70       90       70  160
Complete Example For Sum DataFrame Columns
import pandas as pd

studentdetails = {
    "student_name": ["Ram", "Sam", "Scott", "Ann", "John"],
    "mathematics": [80, 90, 85, 70, 95],
    "science": [85, 95, 80, 90, 75],
    "english": [90, 85, 80, 70, 95]
}
index_labels = ['r1', 'r2', 'r3', 'r4', 'r5']
df = pd.DataFrame(studentdetails, index=index_labels)
print(df)

# Using DataFrame.sum() to sum each row (numeric columns only)
df2 = df.sum(axis=1, numeric_only=True)
print("Get sum of all columns for each row:\n", df2)

# Add the row sums to the DataFrame
df['Sum'] = df.sum(axis=1, numeric_only=True)
print("Get sum of all columns for each row:\n", df)

# Just a few columns to sum
df['Sum'] = df['mathematics'] + df['science'] + df['english']
print("Get sum of all columns for each row:\n", df)

# Sum specific columns
col_list = list(df)
col_list.remove('english')
col_list.remove('Sum')  # leave out the total added above
df['Sum'] = df[col_list].sum(axis=1, numeric_only=True)
print("Get sum of specific columns for each row:\n", df)

# Select columns at positions 1 and 2 with a slice (end is excluded)
df['Sum'] = df.iloc[:, 1:3].sum(axis=1)
print("Get sum of specific columns for each row:\n", df)

# Select columns at positions 1 and 2 using DataFrame.iloc[]
df['Sum'] = df.iloc[:, [1, 2]].sum(axis=1)
print("Get sum of specific columns for each row:\n", df)

# Using DataFrame.iloc[] to select columns at positions 2 and 3 to sum
df['Sum'] = df.iloc[:, [2, 3]].sum(axis=1)
print("Get sum of specific columns for each row:\n", df)

# Sum columns mathematics and science for rows r2 to r4
df['Sum'] = df.loc['r2':'r4', ['mathematics', 'science']].sum(axis=1)
print("Get sum of specific columns for each row:\n", df)

# Using DataFrame.eval() function to sum columns for each row
df2 = df.eval('Sum = mathematics + english')
print("Get sum of specific columns for each row:\n", df2)

# Using DataFrame.loc[] and eval function to sum specific rows
df2 = df.loc['r2':'r4'].eval('Sum = mathematics + science')
print("Get sum of specific columns for each row:\n", df2)
Frequently Asked Questions of Sum DataFrame Columns
How do I sum a single column in a Pandas DataFrame?
To sum a single column in a DataFrame, you can use the .sum() method on that column. For example, df['column_name'].sum().
How do I sum multiple columns in a Pandas DataFrame?
To sum multiple columns in a DataFrame, select them with a list of column names and then call the .sum() method. For example, df[['column1', 'column2', 'column3']].sum() returns one total per selected column.
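As a quick illustration of the answer above, the sketch below uses a made-up DataFrame with the placeholder names column1 to column3 and compares the per-column totals with the per-row totals and a single grand total.

```python
import pandas as pd

# Hypothetical data using the placeholder column names from the answer
df = pd.DataFrame(
    {
        "column1": [1, 2, 3],
        "column2": [10, 20, 30],
        "column3": [100, 200, 300],
    }
)
cols = ["column1", "column2", "column3"]

# Default axis=0: a Series with one total per selected column
column_totals = df[cols].sum()

# axis=1: a Series with one total per row
row_totals = df[cols].sum(axis=1)

# Summing the column totals gives a single grand total
grand_total = df[cols].sum().sum()

print(column_totals)
print(row_totals)
print(grand_total)
```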
How can I sum columns and store the result as a new column in the DataFrame?
You can create a new column to store the sum of the desired columns. For example, df['sum_column'] = df[['column1', 'column2', 'column3']].sum(axis=1)
How do I sum a specific range of columns in a Pandas DataFrame?
You can slice the DataFrame to select the range of columns you want to sum and then use the .sum() method. For example, df.iloc[:, 2:5].sum() returns the total of each column at positions 2 to 4; add axis=1 to get one total per row instead.
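A range of columns can be given by position or by label. The sketch below, using the article's student data as an assumed example, shows that a label slice with loc[] includes its end label while a position slice with iloc[] excludes its end position.

```python
import pandas as pd

# Same data as the article's example DataFrame
df = pd.DataFrame(
    {
        "student_name": ["Ram", "Sam", "Scott", "Ann", "John"],
        "mathematics": [80, 90, 85, 70, 95],
        "science": [85, 95, 80, 90, 75],
        "english": [90, 85, 80, 70, 95],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

# Label slice: both mathematics and english are included
by_label = df.loc[:, "mathematics":"english"].sum(axis=1)

# Position slice: positions 1, 2 and 3; position 4 would be excluded
by_position = df.iloc[:, 1:4].sum(axis=1)

print(by_label)
print(by_label.equals(by_position))
```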
How can I sum columns along a specific axis (row-wise instead of column-wise)?
To sum columns along rows (row-wise), you can use the .sum() method with the axis parameter set to 1. For example, df.sum(axis=1).
Conclusion
In this article, you have learned how to calculate the sum of Pandas DataFrame columns for all or specified columns for each row with the help of the df.sum() function, the eval() function, the loc[] attribute, and the iloc[] attribute of Pandas.
Happy Learning !!