I have a CSV file. It contains the output of previous R operations, filled with index numbers (such as [1], [[1]]). When it is read into R, it looks like this, for example:
      v1
    1 [1] 789
    2 [[1]]
    3 [1] "png" "d115" "dx06" "slz"
    4 [1] 787
    5 [[1]]
    6 [1] "d010" "hc"
    7 [1] 949
    8 [[1]]
    9 [1] "hc" "dx06"

(I don't know why there is wasted space between the line number and the output data.)
I need the above data to appear as follows (without [1] or [[1]] or the quotes " ", and with the data placed beside its corresponding number), like:
    789 png,d115,dx06,slz
    787 d010,hc
    949 hc,dx06

(Possibly 789 and its corresponding data png,d115,dx06,slz should be separated by a tab, and likewise for each row.)
How can I achieve this in R?
We can create a grouping variable ('indx') and split the 'v1' column by that grouping index, after removing the bracketed part at the beginning and the quotes (") within each string. Assuming that you need the first column to hold the numeric element and the second column the non-numeric part, we can use a regex to replace the spaces with ',' (as shown in the expected result) and then rbind the list elements.
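    # A new group starts on the row just before each '[[1]]' marker
    indx <- cumsum(c(grepl('\\[\\[', df1$v1)[-1], FALSE))
    # Strip quotes and the leading index, split by group, then build one row per group
    do.call(rbind, lapply(split(gsub('"|^.*\\]', '', df1$v1), indx), function(x)
      data.frame(ind = x[1],
                 val = gsub('\\s+', ',', gsub('^\\s+|\\s+$', '', x[-1][x[-1] != ''])))))
    #   ind val
    #1  789 png,d115,dx06,slz
    #2  787 d010,hc
    #3  949 hc,dx06

Data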
    df1 <- structure(list(v1 = c("[1] 789", "[[1]]",
        "[1] \"png\" \"d115\" \"dx06\" \"slz\"",
        "[1] 787", "[[1]]", "[1] \"d010\" \"hc\"",
        "[1] 949", "[[1]]", "[1] \"hc\" \"dx06\"")),
        .Names = "v1", class = "data.frame",
        row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9"))
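The question also asks for a tab between each number and its values. Here is a minimal sketch of one way to get that, assuming the same input shape as above. It carries its own copy of the sample column so it runs standalone. The names `v1`, `clean` and `res` are illustrative only, and the `trimws()`/`as.numeric()` steps are additions that are not in the code above.

```r
# Standalone copy of the sample column (same values as df1$v1 above)
v1 <- c('[1] 789', '[[1]]', '[1] "png" "d115" "dx06" "slz"',
        '[1] 787', '[[1]]', '[1] "d010" "hc"',
        '[1] 949', '[[1]]', '[1] "hc" "dx06"')

# A new group starts on the row just before each '[[1]]' marker
indx <- cumsum(c(grepl('\\[\\[', v1)[-1], FALSE))

# Drop quotes and the leading '[1]' / '[[1]]' index, then trim whitespace
clean <- trimws(gsub('"|^.*\\]', '', v1))

res <- do.call(rbind, lapply(split(clean, indx), function(x) {
  vals <- x[-1][x[-1] != '']            # skip the now-empty '[[1]]' row
  data.frame(ind = as.numeric(x[1]),    # assumption: first element is always numeric
             val = gsub('\\s+', ',', vals),
             stringsAsFactors = FALSE)
}))

# Tab-separated, no quotes, no row names, no header.
# file = "" prints to the console; a file path could be used instead.
write.table(res, file = "", sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
```

Converting `ind` with `as.numeric()` will yield `NA` (with a warning) if a group's first row is not a number, so this is worth checking on the real file.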