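The difference between map, applymap, and apply in pandas
pandas offers several ways to apply a function to rows, to columns, or to individual elements, and they are easy to confuse. The sections below look at how the corresponding functions differ.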
apply()
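apply() is a DataFrame method that applies a function to each row or each column of the DataFrame.
Note: apply() works on whole rows or whole columns, and the axis parameter selects which. The default, axis=0, passes each column to the function; axis=1 passes each row.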
import pandas as pd
import numpy as np
frame = pd.DataFrame(np.random.rand(4, 3), columns = list('abc'), index = ['Utah', 'Ohio', 'Texas', 'Oregon'])
print(frame)
# Output:
# a b c
# Utah 0.443188 0.919623 0.550259
# Ohio 0.013923 0.557696 0.723975
# Texas 0.865469 0.720604 0.081306
# Oregon 0.506174 0.212421 0.061561
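# Range (max minus min) of the Series passed in; with the default axis=0 that is one column at a time
func = lambda x: x.max() - x.min()
print(frame.apply(func))
# Output: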
# a 0.851545
# b 0.707202
# c 0.662415
# dtype: float64
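To make the effect of axis concrete, here is a minimal sketch that runs the same function along both axes. The small frame and the names spread, per_column, and per_row are made up for illustration so that the results are easy to check by hand.

```python
import pandas as pd

# Hand-written values (illustrative only) so the results are reproducible
frame = pd.DataFrame(
    {"a": [1.0, 4.0, 2.0], "b": [3.0, 0.5, 2.5]},
    index=["Utah", "Ohio", "Texas"],
)

spread = lambda x: x.max() - x.min()

# axis=0 (default): the function receives each column as a Series
per_column = frame.apply(spread)

# axis=1: the function receives each row as a Series
per_row = frame.apply(spread, axis=1)

print(per_column)  # indexed by column labels: a, b
print(per_row)     # indexed by row labels: Utah, Ohio, Texas
```

With axis=0 the result is indexed by column labels, and with axis=1 by row labels.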
applymap()
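applymap() is also a DataFrame method, but it applies a function to every individual element of the DataFrame rather than to whole rows or columns. Since pandas 2.1, applymap() is deprecated in favor of DataFrame.map(), which does the same thing.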
import pandas as pd
import numpy as np
frame = pd.DataFrame(np.random.rand(4, 3), columns = list('abc'), index = ['Utah', 'Ohio', 'Texas', 'Oregon'])
print(frame)
# Output:
# a b c
# Utah 0.443188 0.919623 0.550259
# Ohio 0.013923 0.557696 0.723975
# Texas 0.865469 0.720604 0.081306
# Oregon 0.506174 0.212421 0.061561
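# Format each element to two decimals and append '%' (the value itself is not multiplied by 100)
func = lambda x: f'{x:.2f}%'
print(frame.applymap(func))
# Output (for the frame printed above):
# a b c
# Utah 0.44% 0.92% 0.55%
# Ohio 0.01% 0.56% 0.72%
# Texas 0.87% 0.72% 0.08%
# Oregon 0.51% 0.21% 0.06%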
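Because applymap() is deprecated in recent pandas versions, code that has to run on both old and new releases can pick whichever element-wise method is available. The following is a minimal sketch under that assumption; the helper as_percent and the sample values are hypothetical. Unlike the lambda above, it multiplies by 100 so that the result is a true percentage.

```python
import pandas as pd

# Illustrative values chosen so the formatted results are exact
frame = pd.DataFrame(
    {"a": [0.25, 0.5], "b": [0.125, 1.0]},
    index=["Utah", "Ohio"],
)

def as_percent(x):
    # Scale a fraction to a percentage and keep two decimals
    return f"{x * 100:.2f}%"

# DataFrame.map exists in pandas >= 2.1; older versions only have applymap
elementwise = frame.map if hasattr(pd.DataFrame, "map") else frame.applymap
formatted = elementwise(as_percent)

print(formatted)
```

Either way the function is called once per element, and the result has the same shape and labels as the original frame.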
map()
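The map() used here is the pandas method Series.map(), not Python's built-in higher-order function map(), although the idea is the same. Series.map() applies a function to each element of a Series, so to use it on a DataFrame you first select a single column or row and then call .map() on it.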
import pandas as pd
import numpy as np
frame = pd.DataFrame(np.random.rand(4, 3), columns = list('abc'), index = ['Utah', 'Ohio', 'Texas', 'Oregon'])
print(frame)
# Output:
# a b c
# Utah 0.443188 0.919623 0.550259
# Ohio 0.013923 0.557696 0.723975
# Texas 0.865469 0.720604 0.081306
# Oregon 0.506174 0.212421 0.061561
# Format each element of column 'a' to two decimals and append '%'
func = lambda x: f'{x:.2f}%'
print(frame['a'].map(func))
# Output (for the frame printed above):
# Utah 0.44%
# Ohio 0.01%
# Texas 0.87%
# Oregon 0.51%
# Name: a, dtype: object
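Series.map() also accepts a dict as a lookup table, and it behaves differently from Python's built-in map(). The sketch below shows both points; the Series values and the names labelled, grades, and plain are made up for illustration.

```python
import pandas as pd

s = pd.Series([0.25, 0.5, 0.75], index=["Utah", "Ohio", "Texas"], name="a")

# Series.map with a function: returns a new Series that keeps the index
labelled = s.map(lambda x: f"{x:.2f}%")

# Series.map with a dict: each value is looked up; values with no key become NaN
grades = s.map({0.25: "low", 0.5: "mid"})

# Python's built-in map returns a lazy iterator over the values and drops the index
plain = list(map(lambda x: x * 2, s))

print(labelled)
print(grades)
print(plain)
```

Series.map() is usually the more convenient choice inside pandas code because the result stays aligned with the original index.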