.apply
.apply runs a custom function on every value in a column and returns the transformed result. For instance, suppose we want to apply a logarithmic transformation to the price column.
log_price = dataset["price"].apply(np.log)
log_price.hist()
<aside>
💡 **.apply** is used to transform data. pandas does not provide a dedicated method for every mathematical formula, so we pass the function we need (here np.log) to .apply ourselves.
</aside>
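As a small sketch of the note above: NumPy functions such as np.log also accept a whole Series directly, so for this particular transformation .apply and the direct call should give the same values. The prices Series below is made-up illustration data, not the dataset used in these notes.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dataset["price"]
prices = pd.Series([326.0, 554.0, 2757.0, 18823.0], name="price")

# Transformation through .apply
log_via_apply = prices.apply(np.log)

# Same transformation by calling the NumPy function on the Series directly
log_direct = np.log(prices)

# Compare the two results within floating-point tolerance
same = np.allclose(log_via_apply, log_direct)
print(same)
```

.apply remains the more general tool, because it accepts any Python function, not only NumPy ones.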
Create an anonymous function using lambda. Note that lambda is a Python keyword, not a method, so it is written without a leading dot. It lets us define a small function inline, without declaring a separate named function with def. For example, suppose we want the function f(x) = log(x) + 1.
dataset["price"].apply(lambda x: np.log(x) + 1)
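To make the idea of an anonymous function concrete, the sketch below compares the lambda with an equivalent named function. The name log_plus_one and the sample prices are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dataset["price"]
prices = pd.Series([1.0, 10.0, 100.0], name="price")

def log_plus_one(x):
    """Named version of f(x) = log(x) + 1."""
    return np.log(x) + 1

# Anonymous (lambda) version and named version of the same function
via_lambda = prices.apply(lambda x: np.log(x) + 1)
via_named = prices.apply(log_plus_one)

# Both approaches should produce the same values
print(np.allclose(via_lambda, via_named))
```

A lambda is convenient for a one-off expression like this one. A named function is usually easier to read once the logic needs more than a single line.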
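Combining a named function with .apply. For instance, suppose we want to flag outliers using the interquartile range (IQR) rule: a price is an outlier if it lies more than 1.5 × IQR above the third quartile or below the first quartile.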
q1, q3 = dataset["price"].quantile([0.25,0.75])
iqr = q3 - q1
def outliers(x):
    if (x > q3 + 1.5 * iqr) or (x < q1 - 1.5 * iqr):
        return True
    else:
        return False
dataset["price_outliers"] = dataset["price"].apply(outliers)
dataset["price_outliers"]
0 False
1 False
2 False
3 False
4 False
...
53935 False
53936 False
53937 False
53938 False
53939 False
Name: price_outliers, Length: 53940, dtype: bool
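The same IQR rule can also be written as a boolean mask over the whole column, without .apply. The sketch below checks that both approaches agree on a small made-up Series. The names lower, upper, flag_apply and flag_mask are assumptions for illustration, and the value 5000 is deliberately placed far from the other prices.

```python
import pandas as pd

# Hypothetical prices; 5000 is deliberately far from the rest
prices = pd.Series([300, 320, 310, 330, 305, 5000], name="price")

# Quartiles and IQR fences, as in the notes above
q1, q3 = prices.quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

def outliers(x):
    # True when a single value falls outside the IQR fences
    return (x > upper) or (x < lower)

# Element-wise version with .apply
flag_apply = prices.apply(outliers)

# Column-wise version with comparison operators and | (element-wise "or")
flag_mask = (prices > upper) | (prices < lower)

# Number of values flagged as outliers
print(flag_mask.sum())
```

The mask version uses | instead of or, because or cannot combine two whole Series.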