Tidyr: How to nest into individual list-columns vs list of data frames?

Created on 6 May 2016  ·  3 Comments  ·  Source: tidyverse/tidyr

As directed (https://github.com/hadley/dplyr/issues/1800), I'm posting this question here.

What's the best way to do this variation on nesting a data frame? I didn't want to use tidyr::nest() because I didn't want the other variables buried in data frames -- I wanted each of them as its own list-column. Something like this:

library(dplyr)
mtcars %>%
  group_by(cyl) %>%
  summarize(gear = split(gear, cyl),
            wt = split(wt, cyl))
#> Source: local data frame [3 x 3]
#> 
#>     cyl       gear         wt
#>   <dbl>     <list>     <list>
#> 1     4 <dbl [11]> <dbl [11]>
#> 2     6  <dbl [7]>  <dbl [7]>
#> 3     8 <dbl [14]> <dbl [14]>

Since then, I found another way to do it using purrr::transpose():

library(purrr)
library(dplyr)
library(tidyr)

mtcars %>% 
  group_by(cyl) %>% 
  nest(gear, wt) %>% 
  bind_cols(transpose(.$data)) %>% 
  select(-data)
#> Source: local data frame [3 x 3]
#> 
#>     cyl       gear         wt
#>   <dbl>     <list>     <list>
#> 1     6  <dbl [7]>  <dbl [7]>
#> 2     4 <dbl [11]> <dbl [11]>
#> 3     8 <dbl [14]> <dbl [14]>
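
For readers on newer package versions, here is a rough sketch of the same idea. It assumes tidyr 1.0 or later, where nest() expects a named argument such as data = c(gear, wt), and it swaps the transpose() step for purrr::map() with a column name to pull each vector out of the nested data frames. The name by_cyl is only illustrative. The select() call is there because newer nest() keeps every column that is not nested.

```r
library(dplyr)
library(tidyr)
library(purrr)

by_cyl <- mtcars %>%
  select(cyl, gear, wt) %>%          # keep only the key and the columns to nest
  nest(data = c(gear, wt)) %>%       # one row per cyl, with a data frame in `data`
  mutate(gear = map(data, "gear"),   # extract each column as its own list-column
         wt   = map(data, "wt")) %>%
  select(-data)                      # drop the intermediate data frames

by_cyl
```

This is not the code from the original post, only an approximation of it under the stated version assumptions.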

All 3 comments

I think the simplest way is probably:

library(dplyr)

mtcars %>%
  group_by(cyl) %>%
  summarize(gear = list(gear), wt = list(wt))

I'm not sure if this is worthy of a new top-level verb or not.

Could you give a bit more info on the broader context?
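
A short sketch of why the list() version works: summarize() wants one value per group, and list(gear) wraps the whole group vector in a length-one list, so each cell of the result holds a full vector. The name by_cyl is only illustrative, and the unnest() call at the end assumes the tidyr 1.0+ interface where the columns to unnest are passed as c(gear, wt).

```r
library(dplyr)
library(tidyr)

# One row per cyl; gear and wt are list-columns holding each group's vector.
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarize(gear = list(gear), wt = list(wt))

# Number of values stored in each cell (one entry per cyl group).
lengths(by_cyl$gear)

# Unnesting both list-columns together restores one row per car.
flat <- unnest(by_cyl, c(gear, wt))
nrow(flat)
```

Because both list-columns have matching lengths within each row, they can be unnested in a single call.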

Your way to create the result is much cleaner.

The immediate trigger was creating examples to figure out why I could not unnest() certain data frames, which turned out to be an issue with dplyr::combine() (https://github.com/hadley/dplyr/issues/1780).

I will close. If this matters in real life, it will come up again.

Seeing the last solution makes me realize summarise_each() is another option.

library(dplyr)
mtcars %>%
  group_by(cyl) %>%
  summarise_each("list", gear, wt)
#> Source: local data frame [3 x 3]
#> 
#>     cyl       gear         wt
#>   <dbl>     <list>     <list>
#> 1     4 <dbl [11]> <dbl [11]>
#> 2     6  <dbl [7]>  <dbl [7]>
#> 3     8 <dbl [14]> <dbl [14]>
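
summarise_each() has since been deprecated in dplyr. A hedged sketch of what is probably the closest current equivalent, assuming dplyr 1.0 or later, uses across() and passes list as the function to apply to each selected column. The name by_cyl is only illustrative.

```r
library(dplyr)

# Apply list() to gear and wt within each cyl group.
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(across(c(gear, wt), list))

by_cyl
```

The result should have the same shape as the output above: one row per cyl, with gear and wt as list-columns.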