发布于2023-03-21 14:23 阅读(1051) 评论(0) 点赞(19) 收藏(1)
我有两个数据框,我希望在 DF1 中获得一个列,该列将具有“当前日期”列的值加上与 DF2 中的相关状态和技术相关的天数。例如,在下面的“新日期”列中的第一个值是 18/03/2022 + 1095 天,因为它正在检查技术是否 = 风和状态 = 施工。
东风1
当前日期 | 技术 | 地位 | 代码要求的新日期 |
---|---|---|---|
18/03/2022 | 风 | 建造 | 16/12/2022 |
15/02/2022 | 太阳的 | 建造 | 15/11/2022 |
24/01/2022 | 电池 | 申请获批 | 24/10/2022 |
23/09/2020 | 风 | 申请获批 | 24/03/2023 |
18/11/2021 | 太阳的 | 已提交申请 | 18/11/2023 |
25/06/2020 | 太阳的 | 申请获批 | 25/03/2021 |
27/02/2020 | 风 | 已提交申请 | 25/02/2025 |
10/03/2022 | 电池 | 已提交申请 | 09/03/2024 |
东风2
技术 | 已提交申请 | 申请获批 | 建造 |
---|---|---|---|
电池 | 730 | 273.75 | 273.75 |
太阳能光伏 | 730 | 273.75 | 273.75 |
风 | 1825 | 912.5 | 1095 |
DataFrame.melt
与将值转换为 timedeltas 一起使用to_timedelta
(如果需要更准确的删除.astype(int)
):
df2 = (df2.melt('Technology', var_name='Status', value_name='New Date')
.assign(**{'New Date':
lambda x: pd.to_timedelta(x['New Date'].astype(int), unit='d')}))
print (df2)
Technology Status New Date
0 Battery Application submitted 730 days
1 Solar Photovoltaics Application submitted 730 days
2 Wind Application submitted 1825 days
3 Battery Application approved 273 days
4 Solar Photovoltaics Application approved 273 days
5 Wind Application approved 912 days
6 Battery Construction 273 days
7 Solar Photovoltaics Construction 273 days
8 Wind Construction 1095 days
然后使用左连接并添加列Current Date
:
df = df1.merge(df2, on=['Technology','Status'], how='left')
df['New Date'] += pd.to_datetime(df['Current Date'], dayfirst=True)
print (df)
Current Date Technology Status New Date
0 18/03/2022 Wind Construction 2025-03-17
1 15/02/2022 Solar Construction NaT
2 24/01/2022 Battery Application approved 2022-10-24
3 23/09/2020 Wind Application approved 2023-03-24
4 18/11/2021 Solar Application submitted NaT
5 25/06/2020 Solar Application approved NaT
6 27/02/2020 Wind Application submitted 2025-02-25
7 10/03/2022 Battery Application submitted 2024-03-09
对于匹配Solar Photovoltaics
值,可以使用拆分并选择第一个值:
df2['Technology'] = df2['Technology'].str.split().str[0]
df2 = (df2.melt('Technology', var_name='Status', value_name='New Date')
.assign(**{'New Date':
lambda x: pd.to_timedelta(x['New Date'].astype(int), unit='d')}))
print (df2)
Technology Status New Date
0 Battery Application submitted 730 days
1 Solar Application submitted 730 days
2 Wind Application submitted 1825 days
3 Battery Application approved 273 days
4 Solar Application approved 273 days
5 Wind Application approved 912 days
6 Battery Construction 273 days
7 Solar Construction 273 days
8 Wind Construction 1095 days
df = df1.merge(df2, on=['Technology','Status'], how='left')
df['New Date'] += pd.to_datetime(df['Current Date'], dayfirst=True)
print (df)
Current Date Technology Status New Date
0 18/03/2022 Wind Construction 2025-03-17
1 15/02/2022 Solar Construction 2022-11-15
2 24/01/2022 Battery Application approved 2022-10-24
3 23/09/2020 Wind Application approved 2023-03-24
4 18/11/2021 Solar Application submitted 2023-11-18
5 25/06/2020 Solar Application approved 2021-03-25
6 27/02/2020 Wind Application submitted 2025-02-25
7 10/03/2022 Battery Application submitted 2024-03-09
作者:黑洞官方问答小能手
链接:https://www.pythonheidong.com/blog/article/1944122/ca2ea539ae4c3780ddd6/
来源:python黑洞网
任何形式的转载都请注明出处,如有侵权 一经发现 必将追究其法律责任
昵称:
评论内容:(最多支持255个字符)
---无人问津也好,技不如人也罢,你都要试着安静下来,去做自己该做的事,而不是让内心的烦躁、焦虑,坏掉你本来就不多的热情和定力
Copyright © 2018-2021 python黑洞网 All Rights Reserved 版权所有,并保留所有权利。 京ICP备18063182号-1
投诉与举报,广告合作请联系vgs_info@163.com或QQ3083709327
免责声明:网站文章均由用户上传,仅供读者学习交流使用,禁止用做商业用途。若文章涉及色情,反动,侵权等违法信息,请向我们举报,一经核实我们会立即删除!