Solution to the first problem (find the best-selling product in each country and total that product's value) using dplyr:
library(tidyverse)

df %>%
  group_by(Country, Product_name) %>%
  summarise(sum_value = sum(Value, na.rm = TRUE)) %>%  # total value per country and product
  ungroup() %>%
  group_by(Country) %>%
  filter(sum_value == max(sum_value))  # keep the top product(s) in each country
# A tibble: 3 x 3
# Groups:   Country [3]
  Country Product_name sum_value
   <fctr>       <fctr>     <int>
1 Denmark        Apple       887
2   Japan        Juice       650
3  Sweden        Apple       344
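
For comparison, here is a minimal sketch of the same per-country maximum written with slice_max(), which exists in dplyr 1.0.0 and later. The data frame toy is invented purely for illustration and is not the data from the question; only the column names Country, Product_name and Value are taken from the code above.

```r
library(dplyr)

# Hypothetical toy data, made up for this sketch (not the question's data)
toy <- data.frame(
  Country      = c("Denmark", "Denmark", "Denmark", "Japan", "Japan"),
  Product_name = c("Apple", "Apple", "Pie", "Juice", "Pie"),
  Value        = c(10, 20, 25, 7, NA)
)

top_per_country <- toy %>%
  group_by(Country, Product_name) %>%
  # .groups = "drop_last" leaves the result grouped by Country only
  summarise(sum_value = sum(Value, na.rm = TRUE), .groups = "drop_last") %>%
  # with_ties = TRUE (the default) keeps every product tied for the maximum,
  # which matches filter(sum_value == max(sum_value))
  slice_max(sum_value, n = 1, with_ties = TRUE) %>%
  ungroup()

print(top_per_country)
```

Setting with_ties = FALSE would instead return exactly one row per country, which may or may not be what you want when two products have the same total.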
# Top two products by total value within each country and year
df %>%
  group_by(Country, Product_name, Year) %>%
  summarise(sum_value = sum(Value, na.rm = TRUE)) %>%
  ungroup() %>%
  group_by(Country, Year) %>%
  arrange(desc(sum_value), .by_group = TRUE) %>%
  slice(1:2)
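To get a sensible output I had to modify the data slightly, so this is the output with every year set to 1987 (to get the top n rows per group instead, change the 2 in 1:2 to n):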
# A tibble: 6 x 4
# Groups:   Country, Year [3]
  Country Product_name  Year sum_value
   <fctr>       <fctr> <int>     <int>
1 Denmark        Apple  1987       887
2 Denmark          Pie  1987       320
3   Japan        Juice  1987       650
4   Japan          Pie  1987       544
5  Sweden        Apple  1987       344
6  Sweden       Banana  1987       310