Cartesian Product using data.table package
Solution 1:
If you first construct full names from the first and last in the cust
-dataframe, you can then use CJ
(cross-join). You cannot use all three vectors since there would be 99 items and teh first names would get inappropriately mixed with last names.
> nrow(CJ(dates$date, cust$first.name, cust$last.name ) )
[1] 99
This returns the desired data.table object:
> CJ(dates$date,paste(cust$first.name, cust$last.name) )
V1 V2
1: 2012-08-28 George Smith
2: 2012-08-28 Henry Smith
3: 2012-08-28 John Doe
4: 2012-08-29 George Smith
5: 2012-08-29 Henry Smith
6: 2012-08-29 John Doe
7: 2012-08-30 George Smith
8: 2012-08-30 Henry Smith
9: 2012-08-30 John Doe
10: 2012-08-31 John Doe
11: 2012-08-31 George Smith
12: 2012-08-31 Henry Smith
13: 2012-09-01 John Doe
14: 2012-09-01 George Smith
15: 2012-09-01 Henry Smith
16: 2012-09-02 George Smith
17: 2012-09-02 Henry Smith
18: 2012-09-02 John Doe
19: 2012-09-03 Henry Smith
20: 2012-09-03 John Doe
21: 2012-09-03 George Smith
22: 2012-09-04 Henry Smith
23: 2012-09-04 John Doe
24: 2012-09-04 George Smith
25: 2012-09-05 George Smith
26: 2012-09-05 Henry Smith
27: 2012-09-05 John Doe
28: 2012-09-06 George Smith
29: 2012-09-06 Henry Smith
30: 2012-09-06 John Doe
31: 2012-09-07 George Smith
32: 2012-09-07 Henry Smith
33: 2012-09-07 John Doe
V1 V2
Solution 2:
merge.data.table(x, y)
is a convenience function that wraps a call to x[y]
, so the merge needs to be based on columns that are in both data.table
s. (That's what that error message is trying to tell you).
One work-around is to add a dummy column to both data.tables, whose only purpose is to make the merge possible:
## Add a column "k", and append it to each data.table's vector of keyed columns.
setkeyv(cust.dt[,k:=1], c(key(cust.dt), "k"))
setkeyv(dates.dt[,k:=1], c(key(dates.dt), "k"))
## Merge and then remove the dummy column
res <- merge(dates.dt, cust.dt, by="k")
head(res[,k:=NULL])
# date first.name last.name
# 1: 2012-08-28 George Smith
# 2: 2012-08-28 Henry Smith
# 3: 2012-08-28 John Doe
# 4: 2012-08-29 George Smith
# 5: 2012-08-29 Henry Smith
# 6: 2012-08-29 John Doe
## Maybe also clean up cust.dt and dates.dt
# cust.dt[,k:=NULL]
# dates.dt[,k=NULL]