How to combine year, month, and day columns into a single datetime column?
Problem description
I have the following dataframe df:
id lat lon year month day
0 381 53.30660 -0.54649 2004 1 2
1 381 53.30660 -0.54649 2004 1 3
2 381 53.30660 -0.54649 2004 1 4
I want to create a new column df['Date'] where the year, month, and day columns are combined according to the format yyyy-m-d.
Following this post, I did:
`df['Date']=pd.to_datetime(df['year']*10000000000
+df['month']*100000000
+df['day']*1000000,
format='%Y-%m-%d%')`
The result is not what I expected: it starts from 1970 instead of 2004, and it also contains a time component, which I did not specify:
id lat lon year month day Date
0 381 53.30660 -0.54649 2004 1 2 1970-01-01 05:34:00.102
1 381 53.30660 -0.54649 2004 1 3 1970-01-01 05:34:00.103
2 381 53.30660 -0.54649 2004 1 4 1970-01-01 05:34:00.104
Since the dates should be in the 2004-1-2 format, what am I doing wrong?
Solution
There is a simpler way:
In [250]: df['Date']=pd.to_datetime(df[['year','month','day']])
In [251]: df
Out[251]:
id lat lon year month day Date
0 381 53.3066 -0.54649 2004 1 2 2004-01-02
1 381 53.3066 -0.54649 2004 1 3 2004-01-03
2 381 53.3066 -0.54649 2004 1 4 2004-01-04
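For anyone who wants to reproduce this outside an IPython session, here is a minimal self-contained sketch. It rebuilds the sample frame from the question, applies the same `pd.to_datetime` call, and also shows why the attempt in the question lands in 1970: when `pd.to_datetime` receives plain integers with no usable format, it reads them as nanoseconds since the Unix epoch, and 20040102000000 ns is about 5 hours 34 minutes. The `DateText` column is not part of the original answer; it is only one possible way to get unpadded yyyy-m-d text, since a datetime column always displays as 2004-01-02.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [381, 381, 381],
    "lat": [53.30660, 53.30660, 53.30660],
    "lon": [-0.54649, -0.54649, -0.54649],
    "year": [2004, 2004, 2004],
    "month": [1, 1, 1],
    "day": [2, 3, 4],
})

# The approach from the answer: pass the three columns as a DataFrame.
df["Date"] = pd.to_datetime(df[["year", "month", "day"]])

# Why the question's attempt shows 1970: the arithmetic yields integers such as
# 20040102000000, and integers are interpreted as nanoseconds since 1970-01-01.
as_int = df["year"] * 10000000000 + df["month"] * 100000000 + df["day"] * 1000000
epoch_based = pd.to_datetime(as_int)

# Assumption: if unpadded text like "2004-1-2" is wanted, build a string column.
df["DateText"] = (
    df["year"].astype(str) + "-" + df["month"].astype(str) + "-" + df["day"].astype(str)
)

print(df[["Date", "DateText"]])
print(epoch_based)
```

Keeping `Date` as a real datetime column is usually the better choice for filtering and resampling; the text column is only for display.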
From the docs:
Assembling a datetime from multiple columns of a DataFrame. The keys can be common abbreviations like [year, month, day, minute, second, ms, us, ns] or plurals of the same.
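As a rough illustration of the quoted passage, the sketch below assembles timestamps from a frame that also carries `hour` and `minute` columns, and renames columns that do not already use the recognized keys. The frames `parts` and `raw` and the column names `yr`, `mo`, `dy` are made up for this example.

```python
import pandas as pd

# Hypothetical frame with extra time components alongside year/month/day.
parts = pd.DataFrame({
    "year": [2004, 2004],
    "month": [1, 1],
    "day": [2, 3],
    "hour": [5, 18],
    "minute": [34, 0],
})

# Every column is a recognized key, so the whole frame can be passed.
stamps = pd.to_datetime(parts)

# Hypothetical frame whose column names are not recognized keys:
# rename them first, then assemble.
raw = pd.DataFrame({"yr": [2004], "mo": [1], "dy": [2]})
renamed = raw.rename(columns={"yr": "year", "mo": "month", "dy": "day"})
dates = pd.to_datetime(renamed[["year", "month", "day"]])

print(stamps)
print(dates)
```

Selecting only the date-part columns, as the answer does with `df[['year','month','day']]`, avoids errors from unrelated columns such as `id` or `lat`.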