Index. Source: interactive chaos
The index in a DataFrame acts as an address for accessing data in specific rows and columns. The row index is usually just called the index, while for columns we usually use the column name (label).
With an index, we can access and refer to data more conveniently than with plain numerical positions. An index can also be hierarchical (multi-level) when the data is more complex. A well-chosen index also organizes the data in a more structured way, which makes data analysis more efficient.
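A minimal sketch of these ideas, assuming pandas is installed. The frames `df` and `multi` and all values in them are made up for illustration and are not from the original article.

```python
import pandas as pd

# Hypothetical data: row labels are product codes, column labels are names
df = pd.DataFrame(
    {"product": ["apple", "banana", "cherry"], "price": [3, 1, 7]},
    index=["p1", "p2", "p3"],
)

row_labels = df.index.tolist()       # the row index
column_labels = df.columns.tolist()  # the column labels
banana_price = df.loc["p2", "price"] # address one cell by row and column label

# A hierarchical (multi-level) index: year first, then quarter
multi = pd.DataFrame(
    {"sales": [10, 20, 30, 40]},
    index=pd.MultiIndex.from_tuples(
        [("2023", "Q1"), ("2023", "Q2"), ("2024", "Q1"), ("2024", "Q2")],
        names=["year", "quarter"],
    ),
)

sales_2024_q1 = multi.loc[("2024", "Q1"), "sales"]  # one cell, both levels given
sales_2023 = multi.loc["2023"]                      # all quarters of one year
```

Here the labels `p1`, `p2`, `p3` play the role of the address: the cell is found by name, not by counting rows.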
Function | Structure | Usage |
---|---|---|
*iloc* | *df.iloc[row_indexes, column_indexes]* | Take data by integer position |
*loc* | *df.loc[row_indexes, column_indexes]* | Take data by index label (often non-integer) |
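The difference between the two accessors can be sketched on a small frame. The frame below and its labels (`s1` to `s4`) are hypothetical. One detail worth noting is that an `iloc` slice excludes its end position, while a `loc` slice includes its end label.

```python
import pandas as pd

# Hypothetical frame with string row labels
df = pd.DataFrame(
    {"name": ["Ana", "Budi", "Citra", "Dewi"], "score": [80, 65, 90, 75]},
    index=["s1", "s2", "s3", "s4"],
)

by_position = df.iloc[1, 1]        # 2nd row, 2nd column, by integer position
by_label = df.loc["s2", "score"]   # the same cell, by row and column label

pos_slice = df.iloc[0:2]           # positions 0 and 1 (end position excluded)
label_slice = df.loc["s1":"s3"]    # labels s1, s2, s3 (end label included)
```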
Case | Structure | Example |
---|---|---|
Take 1 row/column only | df.iloc[single_row_index, single_column_index] | *df.iloc[100, 3] →* 101st row, 4th column |
Take some rows/columns with indexes in a list | df.iloc[row_indexes_list, column_indexes_list] | *df.iloc[[20, 30, 40], [4, 5]] →* 21st, 31st, and 41st rows, then 5th and 6th columns |
Take some rows/columns with indexes in a range | df.iloc[row_indexes_range, column_indexes_range] | *df.iloc[15:20, 2:5] →* 16th to 20th rows, then 3rd to 5th columns (the end of the range is excluded) |
Take rows/columns with an open-ended range | df.iloc[:row_index, column_index:] | *df.iloc[:5, 3:] →* first 5 rows, then the 4th column onward. A range with no number in front takes everything BEFORE the end position, while a range with no number at the END takes everything from the start position onward |
Take the last 1 row/column | df.iloc[-1, -1] | *df.iloc[-1, -1] →* last row, last column |
Take some last rows and last columns | df.iloc[-row_index:, -column_index:] | *df.iloc[-100:, -3:] →* last 100 rows, then last 3 columns |
Examples of Each Integer Index Case
Take one specific row
Take some rows/columns with indexes in a list
Take some rows/columns with indexes in a range
Take the last 1 row/column
Take all rows above a specific row
Take the last 1 row
Take the last 5 rows
Take all rows above the last few rows
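The cases listed above can be sketched on a small frame. This is an illustration under assumptions, not the original article's code: the 10x5 frame is hypothetical, and each value is built as `row * 10 + column` so the selected positions are easy to read off.

```python
import numpy as np
import pandas as pd

# Hypothetical 10x5 frame: value at (row r, column c) is r * 10 + c
df = pd.DataFrame(
    np.arange(10).reshape(-1, 1) * 10 + np.arange(5),
    columns=list("ABCDE"),
)

one_row = df.iloc[3]                  # one specific row (the 4th), as a Series
picked = df.iloc[[2, 4, 6], [1, 3]]   # rows and columns given as lists
ranged = df.iloc[5:8, 1:4]            # 6th to 8th rows, 2nd to 4th columns
last_cell = df.iloc[-1, -1]           # last row, last column
rows_above = df.iloc[:3]              # all rows above position 3
last_row = df.iloc[-1]                # the last row, as a Series
last_five = df.iloc[-5:]              # the last 5 rows
all_but_last_five = df.iloc[:-5]      # all rows above the last 5 rows
```

Passing a single integer for the row, as in `df.iloc[3]`, returns a Series. Passing a list or a range returns a DataFrame.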
Examples of **row/column_indexes** with a non-integer index that are commonly used: