In Pandas, you can iterate over the rows of a DataFrame in several ways. However, row-by-row iteration is usually much slower than vectorized operations, so it is best avoided whenever a vectorized alternative exists.
Here are some ways to iterate over rows in a DataFrame:
1. Using the iterrows() method:
The iterrows() method returns an iterator that yields each DataFrame row as an (index, Series) pair. Here's an example:
import pandas as pd
df = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie'],
                   'age': [25, 30, 35]})
for index, row in df.iterrows():
    print(index, row['name'], row['age'])
Output:
0 Alice 25
1 Bob 30
2 Charlie 35
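One caveat with iterrows() is that each row comes back as a single Series, so column dtypes are not preserved across the row. The following is a minimal sketch of that behaviour; the DataFrame and its column names (qty, ratio) are made up for illustration:

```python
import pandas as pd

# Hypothetical frame with one integer column and one float column.
df = pd.DataFrame({'qty': [1, 2], 'ratio': [0.5, 1.5]})

# Take the first (index, Series) pair produced by iterrows().
index, row = next(df.iterrows())

# The row Series has to hold both values under one dtype,
# so the integer value is upcast to float.
print(row.dtype)   # float64
print(row['qty'])  # 1.0
```

If the original dtypes matter, itertuples() is usually the safer choice because it reads values column by column.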
2. Using the itertuples() method:
The itertuples() method returns an iterator that yields each DataFrame row as a namedtuple. It is generally faster than iterrows() because it does not build a Series for every row. Here's an example:
import pandas as pd
df = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie'],
                   'age': [25, 30, 35]})
for row in df.itertuples():
    print(row.Index, row.name, row.age)
Output:
0 Alice 25
1 Bob 30
2 Charlie 35
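If the index and the field names are not needed, itertuples() accepts index=False and name=None, which yields plain tuples. A short sketch, reusing the example data from above:

```python
import pandas as pd

df = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie'],
                   'age': [25, 30, 35]})

# index=False drops the index field; name=None yields regular tuples
# instead of namedtuples.
rows = list(df.itertuples(index=False, name=None))

# Each tuple can be unpacked directly in the loop header.
for name, age in rows:
    print(name, age)
```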
3. Using the apply() method:
The apply() method, called with axis=1, applies a function to each row of the DataFrame. The function receives the row as a Series and may return a scalar or a Series; apply() collects the return values into a new Series or DataFrame. Here's an example:
import pandas as pd
df = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie'],
                   'age': [25, 30, 35]})
def print_row(row):
    print(row['name'], row['age'])
df.apply(print_row, axis=1)
Output:
Alice 25
Bob 30
Charlie 35
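apply() is more commonly used to compute a value per row than to print. As a rough sketch, the function below (describe, a name chosen here for illustration) returns a string for each row, and apply() gathers the results into a Series:

```python
import pandas as pd

df = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie'],
                   'age': [25, 30, 35]})

def describe(row):
    # row is a Series holding one row's values, keyed by column name.
    return f"{row['name']} is {row['age']} years old"

# axis=1 passes rows (not columns) to the function.
descriptions = df.apply(describe, axis=1)
print(descriptions.tolist())
```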
Note that vectorized operations, such as those provided by Pandas and NumPy, can be much faster than iterating over the rows of a DataFrame. Whenever possible, use vectorized operations instead of iteration.
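As a small illustration of the difference, the sketch below computes the same result twice, once with a row loop and once with a vectorized expression. The age_next_year column name is invented for this example:

```python
import pandas as pd

df = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie'],
                   'age': [25, 30, 35]})

# Row-by-row: one Python-level step per row.
looped = [row.age + 1 for row in df.itertuples()]

# Vectorized: a single operation on the whole column.
df['age_next_year'] = df['age'] + 1

# Vectorized filtering with a boolean mask instead of an if inside a loop.
over_28 = df.loc[df['age'] > 28, 'name'].tolist()

print(looped)
print(over_28)
```

Both approaches give the same numbers here; the vectorized form simply avoids the per-row Python overhead, which matters more as the DataFrame grows.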