Creating one variable from a list of variables in R?


I have a sequence of variables in a dataframe (over 100), and I would like to create an indicator variable that flags whether particular text patterns are present in any of those variables. Below is an example with three variables. One solution I've found is to use tidyr::unite() followed by dplyr::mutate(), but I'm interested in a solution where I do not have to unite the variables.

library(dplyr)
library(tidyr)

c1<-c("T1", "X1", "T6", "R5")
c2<-c("R4", "C6", "C7", "X3")
c3<-c("C5", "C2", "X4", "T2")

df<-data.frame(c1, c2, c3)

  c1 c2 c3
1 T1 R4 C5
2 X1 C6 C2
3 T6 C7 X4
4 R5 X3 T2

code.vec<-c("T1", "T2", "T3", "T4") #Text patterns of interest
code_regex<-paste(code.vec, collapse="|")

new<-df %>% 
  unite(all_c, c1:c3, remove=FALSE) %>% 
  mutate(indicator=if_else(grepl(code_regex, all_c), 1, 0)) %>% 
  select(-(all_c))

  c1 c2 c3 indicator
1 T1 R4 C5 1
2 X1 C6 C2 0
3 T6 C7 X4 0
4 R5 X3 T2 1

The example above produces the desired result, but I feel there should be a way to do this in the tidyverse without uniting the variables. SAS handles this very easily with an ARRAY statement and a DO loop, and I'm hoping R has a similarly good way of handling it.

The real dataframe has many additional variables besides the "c" fields to search, so a solution that searches every column would require first subsetting the dataframe to only the variables I want to search, and then joining the result back with the other variables.
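For readers coming from SAS, the ARRAY/DO-loop idea can be written almost literally in base R. This is only a sketch: the `id` column and the `search_cols` vector are made up for illustration, standing in for the extra variables and the list of fields to search.

```r
# Sample data from the question, plus a hypothetical non-searched column `id`.
df <- data.frame(
  id = 1:4,
  c1 = c("T1", "X1", "T6", "R5"),
  c2 = c("R4", "C6", "C7", "X3"),
  c3 = c("C5", "C2", "X4", "T2"),
  stringsAsFactors = FALSE
)
code_regex <- paste(c("T1", "T2", "T3", "T4"), collapse = "|")

# Columns to search; this vector plays the role of the SAS ARRAY.
search_cols <- c("c1", "c2", "c3")

# Start with "no match" for every row, then OR in each column's matches.
found <- rep(FALSE, nrow(df))
for (col in search_cols) {
  found <- found | grepl(code_regex, df[[col]])
}

# Store the result as a 0/1 indicator; the other columns are left untouched.
df$indicator <- as.integer(found)
```

Because the loop only reads the columns named in `search_cols`, no subsetting and re-joining of the dataframe is needed.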

edited 11 hours ago

asked 11 hours ago

patward5656

425

You said you don't want to use unite, but it's worth noting that passing the argument remove = FALSE has unite create a column of the united variables leaving the others intact. Might be convenient in this case.

– camille
11 hours ago

Yes, it is convenient. And it does work. I just feel like there may be a simpler approach I'm missing that doesn't need to create a united variable.

– patward5656
11 hours ago


r dplyr tidyverse mutate


3 Answers

We can use the tidyverse:

library(tidyverse)

df %>%
    mutate_all(str_detect, pattern = code_regex) %>%
    reduce(`+`) %>% 
    mutate(df, indicator = .)
#  c1 c2 c3 indicator
#1 T1 R4 C5         1
#2 X1 C6 C2         0
#3 T6 C7 X4         0
#4 R5 X3 T2         1

Or using base R:

Reduce(`+`, lapply(df, grepl, pattern = code_regex))
#[1] 1 0 0 1
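One thing worth keeping in mind with the `+` reduction: it counts matching columns per row, so it only equals a 0/1 indicator when at most one column matches. A small sketch of the difference, using the question's data with one value changed on purpose (row 1 of `c2` is set to "T2" purely for illustration):

```r
df <- data.frame(
  c1 = c("T1", "X1", "T6", "R5"),
  c2 = c("T2", "C6", "C7", "X3"),  # row 1 changed to "T2" so that it matches twice
  c3 = c("C5", "C2", "X4", "T2"),
  stringsAsFactors = FALSE
)
code_regex <- paste(c("T1", "T2", "T3", "T4"), collapse = "|")

# One logical vector of matches per column.
hits <- lapply(df, grepl, pattern = code_regex)

# Summing gives the number of matching columns in each row.
match_count <- Reduce(`+`, hits)

# OR-ing gives a true indicator, which is then stored as 0/1.
indicator <- as.integer(Reduce(`|`, hits))
```

Here `match_count` is 2 for the first row while `indicator` stays at 1, so `|` (or a `> 0` comparison on the sum) is the safer choice if a strict indicator is wanted.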

answered 11 hours ago

akrun

424k13209287

This tidyverse solution seems to only work in the scenario where all of the columns are being searched. I have other variables in my real dataset, and when using it for that the output is all NA. Does this have something to do with the reduce function?

– patward5656
10 hours ago

@patward5656 That is an easy fix. df %>% mutate_at(vars(starts_with("c")), str_detect, pattern = code_regex) %>% reduce(`+`) %>% mutate(df, indicator = .)

– akrun
10 hours ago

c1<-c("T1", "X1", "T6", "R5")
c2<-c("R4", "C6", "C7", "X3")
c3<-c("C5", "C2", "X4", "T2")
z1<-c("C5", "C2", "X4", "T2")
df<-data.frame(c1, c2, c3, z1)

df %>%
  mutate_at(vars(starts_with("c")), str_detect, pattern = code_regex) %>%
  reduce(`+`) %>%
  mutate(df, indicator = .)

  c1 c2 c3 z1 indicator
1 T1 R4 C5 C5        NA
2 X1 C6 C2 C2        NA
3 T6 C7 X4 X4        NA
4 R5 X3 T2 T2        NA
Warning message:
In Ops.factor(.x, .y) : ‘+’ not meaningful for factors

This produced NAs, it seems.

– patward5656
10 hours ago

@patward5656 I would use transmute_at instead of mutate_at: df %>% transmute_at(vars(starts_with("c")), str_detect, pattern = code_regex) %>% reduce(`+`)

– akrun
10 hours ago


Thanks. I believe transmute_at() solves it perfectly.

– patward5656
10 hours ago
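The NA result in this exchange can be reproduced without dplyr. The sketch below is a base R stand-in for the two pipelines, not the code anyone in the thread ran: `kept_all` imitates what mutate_at() hands to the reduction (the searched columns plus the untouched `z1`), and `only_searched` imitates transmute_at(), which keeps only the selected columns. `z1` is built as a factor explicitly, since that is what data.frame() produced by default before R 4.0.

```r
c1 <- c("T1", "X1", "T6", "R5")
c3 <- c("C5", "C2", "X4", "T2")
z1 <- factor(c("C5", "C2", "X4", "T2"))  # untouched column, stored as a factor
code_regex <- "T1|T2|T3|T4"

h1 <- grepl(code_regex, c1)
h3 <- grepl(code_regex, c3)

# mutate_at()-like: the untouched factor column is still in the list.
kept_all <- list(c1 = h1, c3 = h3, z1 = z1)
# transmute_at()-like: only the searched columns remain.
only_searched <- list(c1 = h1, c3 = h3)

# Adding a factor is not meaningful, so R warns and returns NA for every row.
bad <- suppressWarnings(Reduce(`+`, kept_all))

# With only logical columns, the sum is the per-row match count.
good <- Reduce(`+`, only_searched)
```

In R 4.0 and later, where `z1` would be a character column by default, the same mistake fails with an error instead of silently returning NA.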


Using base R, we can use sapply with grepl to search for the pattern in every column, then assign 1 to rows that have at least one match.

df$indicator <- as.integer(rowSums(sapply(df, grepl, pattern = code_regex)) > 0)

df
#  c1 c2 c3 indicator
#1 T1 R4 C5         1
#2 X1 C6 C2         0
#3 T6 C7 X4         0
#4 R5 X3 T2         1

If there are other columns and we only want to search the columns whose names start with "c", we can use grep to select them.

cols <- grep("^c", names(df))
as.integer(rowSums(sapply(df[cols], grepl, pattern = code_regex)) > 0)

Using dplyr, we can do

library(dplyr)

# Note: mutate_at() keeps the columns it does not modify, so rowSums()
# only works here if every remaining column is logical or numeric.
df$indicator <- as.integer(df %>%
              mutate_at(vars(c1:c3), ~grepl(code_regex, .)) %>%
              rowSums() > 0)
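For what it's worth, newer dplyr releases can express this in a single mutate() call. The sketch below assumes dplyr 1.0.4 or later, which is newer than this thread, and relies on if_any(); the `id` column is a hypothetical extra variable added to show that non-searched columns are left alone.

```r
library(dplyr)

df <- data.frame(
  id = 1:4,  # hypothetical column that should not be searched
  c1 = c("T1", "X1", "T6", "R5"),
  c2 = c("R4", "C6", "C7", "X3"),
  c3 = c("C5", "C2", "X4", "T2"),
  stringsAsFactors = FALSE
)
code_regex <- paste(c("T1", "T2", "T3", "T4"), collapse = "|")

# if_any() applies the predicate to each selected column and combines the
# results row-wise with OR, so no united or intermediate column is needed.
new <- df %>%
  mutate(indicator = as.integer(if_any(starts_with("c"), ~ grepl(code_regex, .x))))
```

Any tidyselect helper can replace starts_with("c") here, for example c1:c3 or a character vector wrapped in all_of().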

edited 11 hours ago

answered 11 hours ago

Ronak Shah

49k104370

This is a good solution, but in the real data there are additional variables that I do not want to pattern search, so this would require me to index the dataframe to include only the columns I want to search first. Will edit my original post to include this information.

– patward5656
11 hours ago

The purrr solution looks like what I was looking for: one line of code that doesn't involve uniting the variables.

– patward5656
11 hours ago

@patward5656 I think the purrr solution would not give you the expected output. I changed it to use mutate_at, which should work on a range of columns. Moreover, you can use column numbers directly in cols for sapply, say columns 3:5 or 1:3, to find the pattern in those columns.

– Ronak Shah
11 hours ago


Base R with apply

cols <- grep("^c", names(df))  # columns to search, as defined in the answer above

apply(df[cols], 1, function(x) sum(grepl(code_regex, x)))

# [1] 1 0 0 1
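A caveat that applies to all the grepl-based versions: the pattern is matched anywhere in the string, so "T1" would also match a value such as "T10". If the codes are meant to match exactly, %in% can be used inside the same apply call. This is a sketch under that assumption; the "T10" value and the `code_vec` name are introduced only for illustration.

```r
df <- data.frame(
  c1 = c("T1", "T10", "T6", "R5"),  # "T10" added to show a partial match
  c2 = c("R4", "C6", "C7", "X3"),
  c3 = c("C5", "C2", "X4", "T2"),
  stringsAsFactors = FALSE
)
code_vec <- c("T1", "T2", "T3", "T4")
code_regex <- paste(code_vec, collapse = "|")
cols <- grep("^c", names(df))

# Regex search: "T10" counts as a hit because it contains "T1".
regex_hit <- as.integer(apply(df[cols], 1, function(x) any(grepl(code_regex, x))))

# Exact search: a value must be equal to one of the codes.
exact_hit <- as.integer(apply(df[cols], 1, function(x) any(x %in% code_vec)))
```

An alternative that keeps the regex is to anchor it, for example by wrapping the alternation as "^(T1|T2|T3|T4)$".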

answered 10 hours ago

nsinghs

1,262621

