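I have written a function that increments the values of certain columns in a specific row. The function walks through my data frame, finds the required row (by checking sex, age, deprivation, and then number of partners), increments the numbers in the relevant columns according to those risk factors, and then recalculates the risk (my code is for sexually transmitted infection testing).
However, this does not change the values in my existing data frame; instead it creates a new variable, patientRow, that holds the new values. I need help working out how to get these changes into my existing data frame. Thanks!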
    adaptRisk <- function(dataframe, sexNum, ageNum, deprivationNum, partnerNum, testResult){
      sexRisk = subset(dataframe, sex == sexNum)
      ageRisk = subset(sexRisk, age == ageNum)
      depRisk = subset(ageRisk, deprivation == deprivationNum)
      patientRow = subset(depRisk, partners == partnerNum)
      if (testResult == "positive") {
        patientRow$tested <- patientRow$tested + 1
        patientRow$infected <- patientRow$infected + 1
      } else if (testResult == "negative") {
        patientRow$tested <- patientRow$tested + 1
      }
      patientRow <- transform(patientRow, risk = infected/tested)
      return(patientRow)
    }
Here is the head of my data frame, to give you an idea:
         sex   age deprivation partners tested infected        risk
    1 Female 16-19         1-2      0-1    132        1 0.007575758
    2 Female 16-19         1-2        2     25        1 0.040000000
    3 Female 16-19         1-2      >=3     30        1 0.033333333
    4 Female 16-19           3      0-1     80        2 0.025000000
    5 Female 16-19           3        2     12        1 0.083333333
    6 Female 16-19           3      >=3     18        1 0.055555556
The dput of my data is:
    structure(list(
      sex = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("Female", "Male"), class = "factor"),
      age = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("16-19", "20-24", "25-34", "35-44"), class = "factor"),
      deprivation = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("1-2", "3", "4-5"), class = "factor"),
      partners = structure(c(2L, 3L, 1L, 2L, 3L, 1L), .Label = c(">=3", "0-1", "2"), class = "factor"),
      tested = c(132L, 25L, 30L, 80L, 12L, 18L),
      infected = c(1L, 1L, 1L, 2L, 1L, 1L),
      uninfected = c(131L, 24L, 29L, 78L, 11L, 17L),
      risk = c(0.00757575757575758, 0.04, 0.0333333333333333, 0.025, 0.0833333333333333, 0.0555555555555556)),
      .Names = c("sex", "age", "deprivation", "partners", "tested", "infected", "uninfected", "risk"),
      row.names = c(NA, 6L), class = "data.frame")
Example function call:
    adaptRisk(data, "Female", "16-19", 3, 2, "positive")
         sex   age deprivation partners tested infected uninfected      risk
    5 Female 16-19           3        2     13        2         11 0.1538462
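Answer:
I adjusted your function using base R syntax (the full code is below). It gets the job done, but it is not the most elegant code.
The problem: the subset calls create several extra (and unnecessary) data frames instead of replacing the values in place where the conditions match. The function also returns a different data frame, containing only the matching row, so your existing data frame is never updated.
I changed it so that the filtering is applied only to the elements you want to change.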
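A minimal sketch of the difference, using a made-up three-row data frame (the names toy, copy, and hit are illustrative and do not come from the question):

```r
# Toy data frame for illustration only (not the asker's data).
toy <- data.frame(
  partners = c("0-1", "2", ">=3"),
  tested   = c(80L, 12L, 18L),
  stringsAsFactors = TRUE
)

# subset() returns a new data frame; editing it leaves `toy` untouched.
copy <- subset(toy, partners == "2")
copy$tested <- copy$tested + 1
toy$tested   # still 80 12 18

# Assigning through a logical index changes the matching cell of `toy` itself.
hit <- toy$partners == "2"
toy$tested[hit] <- toy$tested[hit] + 1
toy$tested   # now 80 13 18
```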
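transform can have unexpected side effects, and you were previously recalculating the entire risk column. Now only the affected value is recalculated.
You may want to build in a warning or a stop for the case where the filter returns more than one record.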
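One way such a check might look, as a sketch only: matchOneRow is a hypothetical helper that is not part of the function below, and the toy data frame deliberately contains a duplicated combination to trigger the error.

```r
# Hypothetical helper (an assumption, not in the original answer): return the
# index of the matching row, but stop unless exactly one row matches.
matchOneRow <- function(data, sexNum, ageNum, deprivationNum, partnerNum) {
  rows <- which(data$sex == sexNum & data$age == ageNum &
                  data$deprivation == deprivationNum &
                  data$partners == partnerNum)
  if (length(rows) != 1) {
    stop("expected exactly one matching row, found ", length(rows))
  }
  rows
}

# Toy data with a deliberately duplicated combination (rows 2 and 3).
toy <- data.frame(
  sex         = c("Female", "Female", "Female"),
  age         = c("16-19", "16-19", "16-19"),
  deprivation = c("3", "3", "3"),
  partners    = c("0-1", "2", "2")
)

matchOneRow(toy, "Female", "16-19", "3", "0-1")   # returns 1
# matchOneRow(toy, "Female", "16-19", "3", "2") would stop: two rows match
```

The same check also catches the case where no row matches, which would otherwise pass silently.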
You can now use
    df <- adaptRisk(df, "Female", "16-19", "3", "2", "positive")
to replace the values in the data frame you pass to the function.
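The reassignment matters because an R function works on its own copy of its arguments (copy-on-modify), so the caller only sees changes through the return value. A small sketch with a hypothetical function, bumpTested, that is not part of the answer:

```r
# Hypothetical function for illustration: adds 1 to every `tested` count.
bumpTested <- function(data) {
  data$tested <- data$tested + 1
  data
}

df <- data.frame(tested = c(12, 18))

bumpTested(df)        # prints the modified copy; df itself is unchanged
df$tested             # 12 18

df <- bumpTested(df)  # keep the result by assigning it back
df$tested             # 13 19
```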
Example
    # affects row 5
    adaptRisk(df, "Female", "16-19", "3", "2", "positive")
         sex   age deprivation partners tested infected uninfected        risk
    1 Female 16-19         1-2      0-1    132        1        131 0.007575758
    2 Female 16-19         1-2        2     25        1         24 0.040000000
    3 Female 16-19         1-2      >=3     30        1         29 0.033333333
    4 Female 16-19           3      0-1     80        2         78 0.025000000
    5 Female 16-19           3        2     13        2         11 0.153846154
    6 Female 16-19           3      >=3     18        1         17 0.055555556

    # affects row 5
    adaptRisk(df, "Female", "16-19", "3", "2", "negative")
         sex   age deprivation partners tested infected uninfected        risk
    1 Female 16-19         1-2      0-1    132        1        131 0.007575758
    2 Female 16-19         1-2        2     25        1         24 0.040000000
    3 Female 16-19         1-2      >=3     30        1         29 0.033333333
    4 Female 16-19           3      0-1     80        2         78 0.025000000
    5 Female 16-19           3        2     13        1         11 0.076923077
    6 Female 16-19           3      >=3     18        1         17 0.055555556
Function:
    adaptRisk <- function(data, sexNum, ageNum, deprivationNum, partnerNum, testResult){
      if (testResult == "positive") {
        data$tested[data$sex == sexNum & data$age == ageNum & data$deprivation == deprivationNum & data$partners == partnerNum] <-
          data$tested[data$sex == sexNum & data$age == ageNum & data$deprivation == deprivationNum & data$partners == partnerNum] + 1
        data$infected[data$sex == sexNum & data$age == ageNum & data$deprivation == deprivationNum & data$partners == partnerNum] <-
          data$infected[data$sex == sexNum & data$age == ageNum & data$deprivation == deprivationNum & data$partners == partnerNum] + 1
        data$risk[data$sex == sexNum & data$age == ageNum & data$deprivation == deprivationNum & data$partners == partnerNum] <-
          data$infected[data$sex == sexNum & data$age == ageNum & data$deprivation == deprivationNum & data$partners == partnerNum]/
          data$tested[data$sex == sexNum & data$age == ageNum & data$deprivation == deprivationNum & data$partners == partnerNum]
      } else if (testResult == "negative") {
        data$tested[data$sex == sexNum & data$age == ageNum & data$deprivation == deprivationNum & data$partners == partnerNum] <-
          data$tested[data$sex == sexNum & data$age == ageNum & data$deprivation == deprivationNum & data$partners == partnerNum] + 1
        data$risk[data$sex == sexNum & data$age == ageNum & data$deprivation == deprivationNum & data$partners == partnerNum] <-
          data$infected[data$sex == sexNum & data$age == ageNum & data$deprivation == deprivationNum & data$partners == partnerNum]/
          data$tested[data$sex == sexNum & data$age == ageNum & data$deprivation == deprivationNum & data$partners == partnerNum]
      }
      return(data)
    }
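Since the same row condition is repeated in every assignment, one possible tidy-up is to build the logical mask once and reuse it. The following is a sketch of that idea rather than a replacement for the function above; adaptRiskCompact and hit are illustrative names, and the small data frame is rows 4 to 6 of the question's data rebuilt by hand without the uninfected column.

```r
# Sketch: same update as above, with the row mask computed a single time.
# `adaptRiskCompact` and `hit` are illustrative names (assumptions).
adaptRiskCompact <- function(data, sexNum, ageNum, deprivationNum,
                             partnerNum, testResult) {
  hit <- data$sex == sexNum & data$age == ageNum &
    data$deprivation == deprivationNum & data$partners == partnerNum

  # Any other testResult leaves the data unchanged, as in the original.
  if (testResult %in% c("positive", "negative")) {
    data$tested[hit] <- data$tested[hit] + 1
    if (testResult == "positive") {
      data$infected[hit] <- data$infected[hit] + 1
    }
    # Recalculate risk only for the affected row.
    data$risk[hit] <- data$infected[hit] / data$tested[hit]
  }
  data
}

# Rows 4-6 of the question's data, rebuilt by hand for this example.
df <- data.frame(
  sex         = factor(rep("Female", 3)),
  age         = factor(rep("16-19", 3)),
  deprivation = factor(rep("3", 3)),
  partners    = factor(c("0-1", "2", ">=3")),
  tested      = c(80L, 12L, 18L),
  infected    = c(2L, 1L, 1L),
  risk        = c(0.025, 1 / 12, 1 / 18)
)

df <- adaptRiskCompact(df, "Female", "16-19", "3", "2", "positive")
df[2, c("tested", "infected", "risk")]  # tested 13, infected 2, risk 2/13
```

As in the function above, an uninfected column would not be updated by this sketch; add that step if the column needs to stay consistent with tested and infected.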