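Syntax:
pandas.DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwargs)
Parameters:
- func: A custom or built-in function that is applied to each row or column.
- axis: 0 or 'index' applies the function to each column (the default); 1 or 'columns' applies it to each row.
- raw: If False (the default), each row or column is passed to the function as a Series; if True, it is passed as a NumPy array.
- result_type: Controls the shape of the result when axis=1. It can be 'expand', 'reduce', or 'broadcast'; with the default (None), the shape depends on what the function returns.
- args: Positional arguments that are passed to the function in addition to the row or column.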
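As a rough illustration of how these parameters interact, the sketch below uses a small made-up DataFrame and a hypothetical helper named scaled_total; neither comes from the examples in this guide. It shows axis=0 versus axis=1, an extra argument passed through args, and result_type='expand'.

```python
import pandas

# Small made-up DataFrame, used only for this illustration
df = pandas.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

def scaled_total(values, factor):
    """Hypothetical helper: sum the values and multiply by a factor."""
    return values.sum() * factor

# axis=0 (the default): the function receives each column as a Series
col_totals = df.apply(scaled_total, axis=0, args=(2,))

# axis=1: the function receives each row as a Series
row_totals = df.apply(scaled_total, axis=1, args=(2,))

# result_type='expand' turns a list-like return value into separate columns
expanded = df.apply(
    lambda row: [row["a"], row["a"] + row["b"]],
    axis=1,
    result_type="expand",
)

print(col_totals)
print(row_totals)
print(expanded)
```

With axis=0 the result has one entry per column, while with axis=1 it has one entry per row, which is the mode used in the rest of this guide.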
Method 1: Apply() with Custom Function
To perform an operation on every row of a Pandas DataFrame, write a function that does the computation and pass the function name to the apply() method with axis=1. The function is then called once for each row.
Example:
In this example, we have a DataFrame named “analysis” with four integer columns. We write two custom functions that operate on these four columns.
- “operation1” adds the four column values in each row. The result for each row is stored in the “Addition Result” column.
- “operation2” multiplies the four column values in each row. The result for each row is stored in the “Multiplication Result” column.
import pandas
# Create the DataFrame using lists
analysis = pandas.DataFrame([[23,1000,34,56],
                             [23,700,11,0],
                             [23,20,4,2],
                             [21,400,32,45],
                             [21,100,456,78],
                             [23,800,90,12],
                             [21,400,32,45],
                             [20,120,1,67],
                             [23,100,90,12],
                             [22,450,76,56],
                             [22,40,0,1],
                             [22,12,45,0]
                            ],columns=['points1','points2','points3','points4'])
# Display the DataFrame
print(analysis)
# Function that adds the first four values of each row.
# .iloc selects by position; positional access with row[0] is deprecated in recent pandas versions.
def operation1(row):
    return row.iloc[0]+row.iloc[1]+row.iloc[2]+row.iloc[3]
# Function that multiplies the first four values of each row.
def operation2(row):
    return row.iloc[0]*row.iloc[1]*row.iloc[2]*row.iloc[3]
# Pass the function to the apply() method and store the result in the 'Addition Result' column
analysis['Addition Result'] = analysis.apply(operation1, axis=1)
# Pass the function to the apply() method and store the result in the 'Multiplication Result' column
analysis['Multiplication Result'] = analysis.apply(operation2, axis=1)
print()
print(analysis)
Output:
Explanation:
We pass “operation1” and “operation2” to the apply() function one after the other. You can see that the sum of the four values in each row is stored in the “Addition Result” column and the product of the four values in each row is stored in the “Multiplication Result” column.
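Positional indexing ties the functions to the column order. As an alternative sketch, the same two operations can be written against column labels. The names point_cols, add_points, and multiply_points are made up for this illustration, and only the first three rows of the data are used to keep it short.

```python
import pandas

# First three rows of the example data
analysis = pandas.DataFrame(
    [[23, 1000, 34, 56], [23, 700, 11, 0], [23, 20, 4, 2]],
    columns=["points1", "points2", "points3", "points4"],
)

# Hypothetical list of the columns the functions should use
point_cols = ["points1", "points2", "points3", "points4"]

def add_points(row):
    # Select the four values by label, then add them
    return row[point_cols].sum()

def multiply_points(row):
    # Select the four values by label, then multiply them
    return row[point_cols].prod()

analysis["Addition Result"] = analysis.apply(add_points, axis=1)
analysis["Multiplication Result"] = analysis.apply(multiply_points, axis=1)
print(analysis)
```

Because the functions pick their inputs by label, adding result columns to the DataFrame does not change what they compute.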
Method 2: Apply() with Lambda Expression
Here, we pass a lambda expression as the argument to the apply() function and do the computation inside the lambda itself.
Example:
Let’s add and multiply the four values in each row, as in the previous example, and store the results in two columns.
import pandas
# Create the DataFrame using lists
analysis = pandas.DataFrame([[23,1000,34,56],
                             [23,700,11,0],
                             [23,20,4,2],
                             [21,400,32,45],
                             [21,100,456,78],
                             [23,800,90,12],
                             [21,400,32,45],
                             [20,120,1,67],
                             [23,100,90,12],
                             [22,450,76,56],
                             [22,40,0,1],
                             [22,12,45,0]
                            ],columns=['points1','points2','points3','points4'])
# Add the first four values of each row and store the result in the 'Addition Result' column.
analysis['Addition Result'] = analysis.apply(lambda record: record.iloc[0]+record.iloc[1]+record.iloc[2]+record.iloc[3], axis=1)
# Multiply the first four values of each row and store the result in the 'Multiplication Result' column.
analysis['Multiplication Result'] = analysis.apply(lambda record: record.iloc[0]*record.iloc[1]*record.iloc[2]*record.iloc[3], axis=1)
print()
print(analysis)
Output:
Explanation:
The expressions that are used are as follows:
analysis['Addition Result'] = analysis.apply(lambda record: record.iloc[0]+record.iloc[1]+record.iloc[2]+record.iloc[3], axis=1)
analysis['Multiplication Result'] = analysis.apply(lambda record: record.iloc[0]*record.iloc[1]*record.iloc[2]*record.iloc[3], axis=1)
You can see that the sum of all values across each row is stored in the “Addition Result” column and the product of all values across each row is stored in the “Multiplication Result” column.
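For plain sums and products like these, the built-in DataFrame.sum() and DataFrame.prod() methods with axis=1 should give the same values without a lambda. This is a different technique from apply(), shown here only as a cross-check on a three-row subset of the data; the variable names are illustrative.

```python
import pandas

# Three rows taken from the example data
analysis = pandas.DataFrame(
    [[23, 1000, 34, 56], [23, 20, 4, 2], [22, 40, 0, 1]],
    columns=["points1", "points2", "points3", "points4"],
)

# Row-wise sum and product through apply() with a lambda
sum_via_lambda = analysis.apply(
    lambda record: record["points1"] + record["points2"]
    + record["points3"] + record["points4"],
    axis=1,
)
prod_via_lambda = analysis.apply(
    lambda record: record["points1"] * record["points2"]
    * record["points3"] * record["points4"],
    axis=1,
)

# The same reductions using the built-in row-wise methods
sum_builtin = analysis.sum(axis=1)
prod_builtin = analysis.prod(axis=1)

print(sum_via_lambda.tolist() == sum_builtin.tolist())
print(prod_via_lambda.tolist() == prod_builtin.tolist())
```

The lambda form remains the more flexible choice when the per-row logic is not a simple reduction.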
Method 3: Apply() with Pandas.Series
If you want to modify each row and get back an updated row rather than a single value, return a pandas.Series from the lambda expression. The apply() function then assembles the returned Series into a new DataFrame.
Example:
Let’s subtract 10 from all columns.
import pandas
# Create the DataFrame using lists
analysis = pandas.DataFrame([[23,1000,34,56],
                             [23,700,11,0],
                             [23,20,4,2],
                             [21,400,32,45],
                             [21,100,456,78],
                             [23,800,90,12],
                             [21,400,32,45],
                             [20,120,1,67],
                             [23,100,90,12],
                             [22,450,76,56],
                             [22,40,0,1],
                             [22,12,45,0]
                            ],columns=['points1','points2','points3','points4'])
# Subtract 10 from every value in each row.
# The returned Series has no index, so the resulting columns are labelled 0 to 3.
analysis = analysis.apply(lambda record: pandas.Series([record.iloc[0]-10, record.iloc[1]-10, record.iloc[2]-10, record.iloc[3]-10]), axis=1)
print(analysis)
Output:
Explanation:
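The expression that is used is as follows:
analysis = analysis.apply(lambda record: pandas.Series([record.iloc[0]-10, record.iloc[1]-10, record.iloc[2]-10, record.iloc[3]-10]), axis=1)
You can see that 10 is subtracted from the values in all columns.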
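One side effect worth knowing: a Series built from a plain list has no labels, so the resulting DataFrame gets columns named 0, 1, 2, 3 instead of the original names. The sketch below, on a two-row subset of the data, contrasts that with returning record - 10 directly, which keeps the labels. The variable names unlabelled and labelled are illustrative only.

```python
import pandas

# Two rows taken from the example data
analysis = pandas.DataFrame(
    [[23, 1000, 34, 56], [23, 700, 11, 0]],
    columns=["points1", "points2", "points3", "points4"],
)

# A Series built from a plain list carries no column labels
unlabelled = analysis.apply(
    lambda record: pandas.Series([value - 10 for value in record]),
    axis=1,
)

# Subtracting from the row itself returns a Series that keeps the labels
labelled = analysis.apply(lambda record: record - 10, axis=1)

print(list(unlabelled.columns))
print(list(labelled.columns))
print(labelled)
```

Both results hold the same numbers; only the column labels differ.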
Method 4: Apply() with NumPy Functions
Let’s use NumPy functions to perform the computation on each row of a Pandas DataFrame.
Example:
Let’s use NumPy functions to return the following:
- The sum of each row using numpy.sum().
- The average of each row using numpy.mean().
- The maximum value in each row using numpy.max().
- The minimum value in each row using numpy.min().
We store the results in four different columns.
import numpy
import pandas
# Create the DataFrame using lists
analysis = pandas.DataFrame([[23,1000,34,56],
                             [23,700,11,0],
                             [23,20,4,2],
                             [21,400,32,45],
                             [21,100,456,78],
                             [23,800,90,12],
                             [21,400,32,45],
                             [20,120,1,67],
                             [23,100,90,12],
                             [22,450,76,56],
                             [22,40,0,1],
                             [22,12,45,0]
                            ],columns=['points1','points2','points3','points4'])
# Select only the four points columns so that the result columns added below
# are not fed into the later computations
points = analysis[['points1','points2','points3','points4']]
# Get the sum of each row and store it in 'Sum Result'
analysis['Sum Result'] = points.apply(numpy.sum, axis=1)
# Get the average of each row and store it in 'Average Result'
analysis['Average Result'] = points.apply(numpy.mean, axis=1)
# Get the maximum value from each row and store it in 'Max Result'
analysis['Max Result'] = points.apply(numpy.max, axis=1)
# Get the minimum value from each row and store it in 'Min Result'
analysis['Min Result'] = points.apply(numpy.min, axis=1)
print(analysis)
Output:
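One thing to keep in mind with this example: apply() with axis=1 passes every column that exists in the DataFrame at that moment, so a result column added earlier becomes an input to the next computation. The sketch below shows the difference on a two-row subset of the data; point_cols, mean_all, and mean_points are names invented for this illustration.

```python
import numpy
import pandas

# Two rows taken from the example data
analysis = pandas.DataFrame(
    [[23, 1000, 34, 56], [23, 20, 4, 2]],
    columns=["points1", "points2", "points3", "points4"],
)

# Remember the original columns before adding any result columns
point_cols = list(analysis.columns)

analysis["Sum Result"] = analysis[point_cols].apply(numpy.sum, axis=1)

# Applied to the whole frame, the mean also includes 'Sum Result'
mean_all = analysis.apply(numpy.mean, axis=1)

# Restricted to the original columns, it is the mean of the four points only
mean_points = analysis[point_cols].apply(numpy.mean, axis=1)

print(mean_all)
print(mean_points)
```

Selecting the input columns explicitly keeps each statistic independent of the order in which the result columns are added.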
Conclusion
In this guide, we explained how to use the apply() function on every row of a Pandas DataFrame. We demonstrated four approaches: a custom function, a lambda expression, a lambda that returns a pandas.Series, and NumPy functions. In each case, passing axis=1 makes apply() call the function once for each row.