Slicing and Dicing Pandas DataFrames | by Punyakeerthi BL | Jun, 2024

Before continuing with this article, please read the following for context:

When working with data in pandas DataFrames, you often need to select specific rows, columns, or subsets for further analysis or manipulation. This is where loc and iloc come in as powerful tools for data selection.

Why Use `loc` and `iloc`?

Imagine you have a large dataset of customer information in a DataFrame. You might want to:

Filter rows based on specific criteria (e.g., customers from a particular region)
Select columns containing relevant data (e.g., purchase history)
Retrieve specific data points by their row and column labels

loc and iloc make these tasks efficient and intuitive, letting you target data by label or by position within the DataFrame.
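As a rough sketch of those three tasks, assuming a small made-up customer table (the column names region and purchases and the customer IDs are hypothetical, chosen only for illustration):

```python
import pandas as pd

# Hypothetical customer data; IDs, regions, and purchase counts are made up.
customers = pd.DataFrame(
    {"region": ["West", "East", "West"], "purchases": [3, 7, 1]},
    index=["c001", "c002", "c003"],
)

# 1. Filter rows by a criterion (customers in the West region).
west = customers.loc[customers["region"] == "West"]

# 2. Select one column of interest for every row.
history = customers.loc[:, "purchases"]

# 3. Retrieve a single value by row label and column label ...
by_label = customers.loc["c002", "purchases"]

# ... or the same value by row position and column position.
by_position = customers.iloc[1, 1]

print(west)
print(history)
print(by_label, by_position)
```

Both lookups in step 3 reach the same cell; loc addresses it by name, iloc by position.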

Understanding `loc`

Purpose: Selects rows and/or columns by label.
Syntax: df.loc[row_labels, column_labels]
Parameters:
row_labels: Can be a single label, a list of labels, a slice, or a boolean array for filtering.
Single label: Selects the row with that specific label.
List of labels: Selects the rows corresponding to the labels in the list.
Slice: Selects rows within a range of labels. Unlike ordinary Python slicing, both the start and the stop label are included.
Boolean array: Selects rows where the corresponding element in the array is True.
column_labels (optional): Similar to row_labels, but for selecting columns. If not provided, all columns are selected for the chosen rows.

Example:

import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 22, 38],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Miami']}
# Use the names as row labels so that loc can look rows up by name
df = pd.DataFrame(data, index=data['Name'])
# Select row with label 'Bob' (using single label)
print(df.loc['Bob'])
# Select rows with labels 'Alice' and 'Charlie' (using list of labels)
print(df.loc[['Alice', 'Charlie']])
# Select rows where Age is greater than 25 (using boolean array)
print(df.loc[df['Age'] > 25])
# Select 'Name' and 'City' columns (using column labels)
print(df.loc[:, ['Name', 'City']])

Output:

Name            Bob
Age              30
City    Los Angeles
Name: Bob, dtype: object
            Name  Age      City
Alice      Alice   25  New York
Charlie  Charlie   22   Chicago
        Name  Age         City
Bob      Bob   30  Los Angeles
David  David   38        Miami
            Name         City
Alice      Alice     New York
Bob          Bob  Los Angeles
Charlie  Charlie      Chicago
David      David        Miami
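The example above does not exercise label slices or combined row and column selection. A minimal sketch of both, assuming the same people with their names used as row labels:

```python
import pandas as pd

names = ["Alice", "Bob", "Charlie", "David"]
df = pd.DataFrame(
    {"Age": [25, 30, 22, 38],
     "City": ["New York", "Los Angeles", "Chicago", "Miami"]},
    index=names,
)

# Label slice: both 'Bob' and 'David' are included in the result.
bob_to_david = df.loc["Bob":"David"]

# Rows and columns together: a boolean mask for rows, a list for columns.
cities_over_25 = df.loc[df["Age"] > 25, ["City"]]

# A single cell, addressed by row label and column label.
charlie_age = df.loc["Charlie", "Age"]

print(bob_to_david)
print(cities_over_25)
print(charlie_age)
```

Passing a list such as ["City"] keeps the result a DataFrame; passing the bare label "City" would return a Series instead.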

Understanding `iloc`

Purpose: Selects rows and/or columns by integer position.
Syntax: df.iloc[row_positions, column_positions]
Parameters:
row_positions: Can be an integer, a list of integers, or a slice for positional selection.
Integer: Selects the row at that specific position (0-based indexing, starting from the first row).
List of integers: Selects the rows corresponding to the positions in the list.
Slice: Selects rows within a range of positions. As in Python slicing, the stop position is excluded.
column_positions (optional): Similar to row_positions, but for selecting columns by position. If not provided, all columns are selected for the chosen rows.

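Example:

import pandas as pd
# Same DataFrame as in the loc example, with the names as row labels
names = ['Alice', 'Bob', 'Charlie', 'David']
df = pd.DataFrame({'Name': names, 'Age': [25, 30, 22, 38],
                   'City': ['New York', 'Los Angeles', 'Chicago', 'Miami']}, index=names)
# Select second row (using integer position)
print(df.iloc[1])
# Select first two rows (using list of positions)
print(df.iloc[[0, 1]])
# Select rows from position 1 (inclusive) to 3 (exclusive)
print(df.iloc[1:3])
# Select first column (using integer position for column)
print(df.iloc[:, 0])

Output:

Name            Bob
Age              30
City    Los Angeles
Name: Bob, dtype: object
        Name  Age         City
Alice  Alice   25     New York
Bob      Bob   30  Los Angeles
            Name  Age         City
Bob          Bob   30  Los Angeles
Charlie  Charlie   22      Chicago
Alice        Alice
Bob            Bob
Charlie    Charlie
David        David
Name: Name, dtype: object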
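To round this out, here is a short sketch of selecting rows and columns by position at the same time, plus a side-by-side of how iloc and loc treat the end of a slice. It assumes the same people as before, with their names as row labels:

```python
import pandas as pd

df = pd.DataFrame(
    {"Age": [25, 30, 22, 38],
     "City": ["New York", "Los Angeles", "Chicago", "Miami"]},
    index=["Alice", "Bob", "Charlie", "David"],
)

# Row positions 0 and 1, column position 1 only; the stop (2) is excluded.
first_two_cities = df.iloc[0:2, [1]]

# Negative positions count from the end, as with Python lists.
last_row = df.iloc[-1]

# Positional slices exclude the stop; label slices include it.
# These two expressions therefore select the same rows (Bob and Charlie).
by_position = df.iloc[1:3]
by_label = df.loc["Bob":"Charlie"]

print(first_two_cities)
print(last_row)
print(by_position.equals(by_label))
```

The endpoint difference is a common source of off-by-one mistakes when switching between the two indexers.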
