Use pd.merge to do this.
The on parameter names the columns to join on. If both tables have an id column, it makes a good key: rows whose id values match are merged.
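A minimal sketch of the id case described above. The users and scores tables and their column names are hypothetical and do not come from the original session; they only illustrate passing on explicitly.

```python
import pandas as pd

# Hypothetical tables that share an "id" column (illustration only).
users = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
scores = pd.DataFrame({"id": [2, 3, 4], "score": [20, 30, 40]})

# on="id" names the key column explicitly;
# how="inner" keeps only the ids present in both tables.
merged = pd.merge(users, scores, on="id", how="inner")
print(merged)
```

Only ids 2 and 3 appear in both tables, so the inner merge should keep just those two rows.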
>>> import numpy as np
>>> import pandas as pd
>>> data1 = {
...     'one': pd.Series([1,2,3]),
...     'two': pd.Series([11,22,33])
... }
>>> df1 = pd.DataFrame(data = data1)
>>> df1
   one  two
0    1   11
1    2   22
2    3   33
>>> data2 = {
...     'one': pd.Series([1,2,3,4,5,6]),
...     'two': pd.Series([11,22,33]),
...     'three': pd.Series([111,222,333]),
...     'four': pd.Series([1111,2222,3333,4444,5555,6666])
... }
>>> df2 = pd.DataFrame(data = data2)
>>> df2
   one   two  three  four
0    1  11.0  111.0  1111
1    2  22.0  222.0  2222
2    3  33.0  333.0  3333
3    4   NaN    NaN  4444
4    5   NaN    NaN  5555
5    6   NaN    NaN  6666
>>> df2[df2['one']<3]
   one   two  three  four
0    1  11.0  111.0  1111
1    2  22.0  222.0  2222
>>> df = pd.merge(df1, df2, how='inner')
>>> df
   one  two  three  four
0    1   11  111.0  1111
1    2   22  222.0  2222
2    3   33  333.0  3333
>>> df1
   one  two
0    1   11
1    2   22
2    3   33
>>> df2
   one   two  three  four
0    1  11.0  111.0  1111
1    2  22.0  222.0  2222
2    3  33.0  333.0  3333
3    4   NaN    NaN  4444
4    5   NaN    NaN  5555
5    6   NaN    NaN  6666
>>> pd.merge(df1, df2, how='inner')
   one  two  three  four
0    1   11  111.0  1111
1    2   22  222.0  2222
2    3   33  333.0  3333
>>> pd.merge(df2, df1, how='inner')
   one   two  three  four
0    1  11.0  111.0  1111
1    2  22.0  222.0  2222
2    3  33.0  333.0  3333
>>> five = pd.Series([1,2,3,4,5,6])
>>> df2['five'] = five
>>> df2
   one   two  three  four  five
0    1  11.0  111.0  1111     1
1    2  22.0  222.0  2222     2
2    3  33.0  333.0  3333     3
3    4   NaN    NaN  4444     4
4    5   NaN    NaN  5555     5
5    6   NaN    NaN  6666     6
>>> df1
   one  two
0    1   11
1    2   22
2    3   33
>>> pd.merge(df2, df1, how='inner')
   one   two  three  four  five
0    1  11.0  111.0  1111     1
1    2  22.0  222.0  2222     2
2    3  33.0  333.0  3333     3
>>> pd.merge(df1, df2, how='inner')
   one  two  three  four  five
0    1   11  111.0  1111     1
1    2   22  222.0  2222     2
2    3   33  333.0  3333     3
>>> df1
   one  two
0    1   11
1    2   22
2    3   33
>>> df2
   one   two  three  four  five
0    1  11.0  111.0  1111     1
1    2  22.0  222.0  2222     2
2    3  33.0  333.0  3333     3
3    4   NaN    NaN  4444     4
4    5   NaN    NaN  5555     5
5    6   NaN    NaN  6666     6
>>> six = pd.Series([-1, -2, -3])
>>> df1['six'] = six
>>> df1
   one  two  six
0    1   11   -1
1    2   22   -2
2    3   33   -3
>>> df2
   one   two  three  four  five
0    1  11.0  111.0  1111     1
1    2  22.0  222.0  2222     2
2    3  33.0  333.0  3333     3
3    4   NaN    NaN  4444     4
4    5   NaN    NaN  5555     5
5    6   NaN    NaN  6666     6
>>> pd.merge(df1, df2, how='inner')
   one  two  six  three  four  five
0    1   11   -1  111.0  1111     1
1    2   22   -2  222.0  2222     2
2    3   33   -3  333.0  3333     3
>>> pd.merge(df2, df1, how='inner')
   one   two  three  four  five  six
0    1  11.0  111.0  1111     1   -1
1    2  22.0  222.0  2222     2   -2
2    3  33.0  333.0  3333     3   -3
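None of the merges in the session pass on. In that case pd.merge joins on the columns the two frames have in common, which here are one and two. The sketch below makes that default visible. The left and right frames are simplified, integer-only stand-ins (an assumption for illustration, not the df1 and df2 from the session), and one row is made to disagree on two on purpose.

```python
import pandas as pd

# Simplified stand-ins for df1 and df2 (hypothetical data, integers only).
left = pd.DataFrame({"one": [1, 2, 3], "two": [11, 22, 33]})
right = pd.DataFrame({
    "one": [1, 2, 3, 4],
    "two": [11, 22, 99, 44],   # row with one == 3 disagrees on "two"
    "four": [1111, 2222, 3333, 4444],
})

# Without on, the join key is every column name shared by both frames.
implicit = pd.merge(left, right, how="inner")
explicit = pd.merge(left, right, how="inner", on=["one", "two"])
print(implicit.equals(explicit))

# Joining on "one" alone keeps the row with one == 3, and the
# non-key column "two" is duplicated with the default _x / _y suffixes.
by_one = pd.merge(left, right, how="inner", on="one")
print(list(by_one.columns))
```

If a shared column is not really part of the key, naming the key with on avoids silently dropping rows that differ only in that column.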
Original post: https://www.cnblogs.com/alex-bn-lee/p/11870500.html