How do I get the sum of values in pivot_wider instead of a list?

Published: 2022-10-16  Tags: r dplyr tidyverse

Problem description

data <- data.frame(row_id = 0:19, Prediction = c(4.20631885375613, 
0.677197140556434, 0.889543113836738, 37.8093227242093, 105.860956599905, 
17.2609337360412, 0.41323004743284, 6.94073422786919, 2.08635131353358, 
72.7283615643886, 12.2655072861912, 3.77794122863612, 4.50660941933039, 
0.877724474431314, 2.86251575017408, 31.3229122662926, 2.32802608836313, 
0.616664152263578, 2.00202294742939, 1.39842036444256), Explanation.1.Strength = c("", 
"", "", "", "+++", "", "--", "", "", "+++", "", "", "", "", "", 
"", "", "", "", ""), Explanation.1.Feature = c("", "", "", "", 
"is_overnight_shipping", "", "number_items", "", "", "is_overnight_shipping", 
"", "", "", "", "", "", "", "", "", ""), Explanation.1.Value = c("", 
"", "", "", "'1'", "", "'1'", "", "", "'1'", "", "", "", "", 
"", "", "", "", "", ""))

The code I used:

data %>% 
  mutate(Explanation.1.Strength = if_else(Explanation.1.Strength == "", "unknown", Explanation.1.Strength)) %>%
  pivot_wider(Explanation.1.Feature, names_from = Explanation.1.Strength, values_from = Explanation.1.Value)

The output comes back as list columns. How can I get the sum of the values instead of list output?

Recommended answer

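We can use values_fn in pivot_wider to return a sum. Because Explanation.1.Value holds character strings, the sum here counts the entries in each cell rather than adding numbers. If the missing elements are NA, use sum(!is.na(.)); if they are just blank (""), use sum(nzchar(.)).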

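library(dplyr)
library(tidyr)
data %>% 
  # label rows with no strength as "unknown" so they get their own column
  mutate(Explanation.1.Strength = if_else(Explanation.1.Strength == "", "unknown",
                                          Explanation.1.Strength)) %>%
  # name id_cols explicitly; it may not be matched positionally in newer tidyr versions
  pivot_wider(id_cols = Explanation.1.Feature, names_from = Explanation.1.Strength,
              values_from = Explanation.1.Value,
              # count the non-NA entries that land in each cell
              values_fn = list(Explanation.1.Value = ~sum(!is.na(.))))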
# A tibble: 3 x 4
#  Explanation.1.Feature   unknown `+++`  `--`
#  <chr>                     <int> <int> <int>
#1 ""                           17    NA    NA
#2 "is_overnight_shipping"      NA     2    NA
#3 "number_items"               NA    NA     1
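
For the blank-string case mentioned above, here is a minimal sketch of the nzchar variant. The data frame toy and its column names (feature, strength, value) are made up for illustration and only mirror the shape of the question's data; values_fill = 0L is an optional extra, not part of the original answer, that turns empty cells into 0 instead of NA.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data mirroring the shape of the question's data frame
toy <- data.frame(
  feature  = c("", "", "is_overnight_shipping", "is_overnight_shipping", "number_items"),
  strength = c("", "", "+++", "+++", "--"),
  value    = c("", "", "'1'", "'1'", "'1'")
)

counts <- toy %>%
  # label blank strengths so they form their own column
  mutate(strength = if_else(strength == "", "unknown", strength)) %>%
  pivot_wider(
    id_cols     = feature,
    names_from  = strength,
    values_from = value,
    # count the non-empty strings that fall into each cell
    values_fn   = function(x) sum(nzchar(x)),
    # cells with no matching rows become 0 instead of NA
    values_fill = 0L
  )

print(counts)
```

Note that with nzchar the blank rows count as 0, so the unknown column holds zeros here, whereas the !is.na version above counts the empty strings as present.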
