Vectorize your data
Pandas vectorized methods
Concepts covered
- Lambda refresher
  - Lambda functions are small, anonymous inline functions defined on the fly in Python.
  - lambda x: x >= 1 takes an input x and returns x >= 1, a boolean that is either True or False.
- map()
  - Creates a new Series by applying the lambda function to each element.
  - Called on a Series; returns a new Series.
- applymap()
  - Creates a new DataFrame by applying the lambda function to each element.
  - Called on a DataFrame; returns a new DataFrame. (pandas 2.1 renamed this method to DataFrame.map() and deprecated applymap().)
- df.apply(numpy.mean)
  - Gets the mean of every column in a DataFrame.
  - Gives the same result as df.mean() for numeric columns.
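As a quick refresher, the lambda from the list above can be compared with an ordinary named function. This is a minimal sketch; the names at_least_one, at_least_one_def, and values are made up for illustration.

```python
# A lambda bound to a name (hypothetical name, for illustration only)
at_least_one = lambda x: x >= 1

# The equivalent function written with def
def at_least_one_def(x):
    return x >= 1

values = [0, 1, 2]

# Both callables return a boolean for each input
lambda_results = [at_least_one(v) for v in values]
def_results = [at_least_one_def(v) for v in values]

print(lambda_results)  # [False, True, True]
print(def_results)     # [False, True, True]
```

In practice the lambda is usually passed inline to map() or applymap() rather than bound to a name.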
In [50]:
import pandas as pd
import numpy as np
In [51]:
# columns
columns = ['one', 'two']
In [52]:
# index
index = ['a', 'b', 'c', 'd']
In [53]:
# lists
one = [1, 2, 3, 4]
two = [1, 2, 3, 4]
In [54]:
# dictionary
d = {
'one': one,
'two': two
}
In [55]:
# DataFrame
df = pd.DataFrame(d, columns=columns, index=index)
In [56]:
df
Out[56]:
   one  two
a    1    1
b    2    2
c    3    3
d    4    4
In [58]:
# mean of every single column in df
df.apply(np.mean)
Out[58]:
one    2.5
two    2.5
dtype: float64
In [61]:
# the built-in DataFrame.mean() method gives the same result
df.mean()
Out[61]:
one    2.5
two    2.5
dtype: float64
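To check that the two approaches agree, the results can be compared directly. The sketch below rebuilds the same small DataFrame so it runs on its own; the variable names are illustrative, and the axis=1 lines are an extra showing the per-row version of the same call.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"one": [1, 2, 3, 4], "two": [1, 2, 3, 4]},
    index=["a", "b", "c", "d"],
)

col_means_apply = df.apply(np.mean)  # np.mean applied to each column
col_means_builtin = df.mean()        # built-in column means

# Both are Series indexed by column name and hold the same values
same_columns = np.allclose(col_means_apply, col_means_builtin)

# axis=1 computes one mean per row instead of one per column
row_means_apply = df.apply(np.mean, axis=1)
row_means_builtin = df.mean(axis=1)
same_rows = np.allclose(row_means_apply, row_means_builtin)

print(same_columns, same_rows)
```

When a built-in method such as df.mean() exists, it is usually the simpler choice; apply() is more useful for functions pandas does not already provide.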
In [64]:
# .map() on a particular column (a Series)
# goes through every value in the column and evaluates whether it is >= 1
df['one'].map(lambda x: x >= 1)
Out[64]:
a    True
b    True
c    True
d    True
Name: one, dtype: bool
In [65]:
# .applymap() on the whole DataFrame
# applies the same check (>= 1) to every element
df.applymap(lambda x: x >= 1)
Out[65]:
    one   two
a  True  True
b  True  True
c  True  True
d  True  True
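For a simple comparison like this one, pandas operators already work element by element, so the lambda is not strictly needed. The sketch below is an illustration, not part of the original notebook: it rebuilds the same DataFrame and compares the lambda-based results with the operator form. The name elementwise is made up, and the hasattr check assumes that pandas 2.1 or later provides DataFrame.map while older versions only provide applymap.

```python
import pandas as pd

df = pd.DataFrame(
    {"one": [1, 2, 3, 4], "two": [1, 2, 3, 4]},
    index=["a", "b", "c", "d"],
)

# Series: lambda through map() versus the comparison operator
series_by_lambda = df["one"].map(lambda x: x >= 1)
series_by_operator = df["one"] >= 1

# DataFrame: pick DataFrame.map when available, else fall back to applymap
elementwise = df.map if hasattr(pd.DataFrame, "map") else df.applymap
frame_by_lambda = elementwise(lambda x: x >= 1)
frame_by_operator = df >= 1

# Each pair holds the same boolean values
series_match = bool((series_by_lambda == series_by_operator).all())
frame_match = bool((frame_by_lambda == frame_by_operator).all().all())

print(series_match, frame_match)
```

The operator form avoids calling a Python function once per element, which tends to matter on large data; map() and applymap() remain useful when the per-element logic cannot be written with built-in operators.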