python - Nearest match merge on two columns (pandas) -


Similar to one of the previous questions (merge dataframes on nearest datetime / timestamp), I want to merge two pandas data frames on two datetime columns using the nearest match.

Let a and b be two dataframes as follows:

import pandas as pd

a = pd.DataFrame({"id": ["a", "a", "c", "b", "b"],
                  "init_date": ["01/01/2015", "07/02/2014", "08/02/1999", "01/01/1991", "06/22/2014"],
                  "fin_date": ["04/16/1923", "09/24/1945", "06/24/1952", "11/26/1988", "10/05/1990"]})

In [15]: a
Out[15]:
  id    fin_date   init_date
0  a  04/16/1923  01/01/2015
1  a  09/24/1945  07/02/2014
2  c  06/24/1952  08/02/1999
3  b  11/26/1988  01/01/1991
4  b  10/05/1990  06/22/2014

b = pd.DataFrame({"id": ["a", "a", "c", "b", "b"],
                  "date": ["02/15/2015", "06/30/2014", "07/02/1999", "10/05/1990", "06/24/2014"],
                  "fin_date": ["12/10/1926", "01/01/1944", "08/21/1955", "12/12/1987", "11/05/1991"],
                  "value": ["3", "5", "1", "7", "8"]})

In [11]: b
Out[11]:
  id        date    fin_date value
0  a  02/15/2015  12/10/1926     3
1  a  06/30/2014  01/01/1944     5
2  c  07/02/1999  08/21/1955     1
3  b  10/05/1990  12/12/1987     7
4  b  06/24/2014  11/05/1991     8

The resulting data frame should be the following:

In [21]: c
Out[21]:
  id    fin_date   init_date value
0  a  04/16/1923  01/01/2015     3
1  a  09/24/1945  07/02/2014     5
2  c  06/24/1952  08/02/1999     1
3  b  11/26/1988  01/01/1991     7
4  b  10/05/1990  06/22/2014     8

In the general problem there may be no close match for either init_date or fin_date; however, I am interested in a solution for the case where there are exact matches on init_date, for example.

Note that one difficulty is that one match might be closer in value on init_date than on the final date, while a competing match might be the opposite. In that case, I prefer the one closer on init_date. To my knowledge, after attempting an approach similar to the one in the link, reindexing with "nearest" is not implemented for multi-indexing.
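The preference described above (closest on init_date first, fin_date only as a tie-breaker) can be sketched as follows. This is only one possible approach, not an established recipe: it pairs every row of a with every row of b that shares the same id, measures both date gaps, and keeps the best candidate per row of a. The function name nearest_merge and the helper columns _row, _init_gap and _fin_gap are made up for this illustration, and the sketch assumes all dates use the MM/DD/YYYY format shown in the example.

```python
import pandas as pd

a = pd.DataFrame({
    "id": ["a", "a", "c", "b", "b"],
    "init_date": ["01/01/2015", "07/02/2014", "08/02/1999", "01/01/1991", "06/22/2014"],
    "fin_date": ["04/16/1923", "09/24/1945", "06/24/1952", "11/26/1988", "10/05/1990"],
})
b = pd.DataFrame({
    "id": ["a", "a", "c", "b", "b"],
    "date": ["02/15/2015", "06/30/2014", "07/02/1999", "10/05/1990", "06/24/2014"],
    "fin_date": ["12/10/1926", "01/01/1944", "08/21/1955", "12/12/1987", "11/05/1991"],
    "value": ["3", "5", "1", "7", "8"],
})


def nearest_merge(a, b, fmt="%m/%d/%Y"):
    """Hypothetical helper: attach to each row of `a` the `value` of the
    nearest row of `b` with the same id (init_date first, then fin_date)."""
    left = a.copy()
    left["_row"] = range(len(left))  # remember which row of `a` each pair came from

    # All candidate pairs sharing an id; b's fin_date becomes fin_date_b.
    pairs = left.merge(b, on="id", how="left", suffixes=("", "_b"))

    # Absolute gaps on both date columns.
    init_gap = (pd.to_datetime(pairs["init_date"], format=fmt)
                - pd.to_datetime(pairs["date"], format=fmt)).abs()
    fin_gap = (pd.to_datetime(pairs["fin_date"], format=fmt)
               - pd.to_datetime(pairs["fin_date_b"], format=fmt)).abs()
    pairs = pairs.assign(_init_gap=init_gap, _fin_gap=fin_gap)

    # Smallest init_date gap wins; fin_date gap only breaks ties.
    best = (pairs.sort_values(["_row", "_init_gap", "_fin_gap"])
                 .drop_duplicates("_row"))
    return best[["id", "fin_date", "init_date", "value"]].reset_index(drop=True)


c = nearest_merge(a, b)
print(c)
```

Two caveats with this sketch: the same row of b may be matched to several rows of a, and the intermediate table grows with the product of the group sizes per id, so it may not suit very large groups.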

Thank you, I appreciate the help.

pd.merge(a, b[['id', 'fin_date', 'value']], on=['id', 'fin_date'], how='left')
