r - PCA on control and treated data for different time points with replicates
I am new to PCA and am confused. I have data with 12 samples: 6 control and 6 treated. There are 2 time points each for control and treated, with 3 replicates per time point, which makes 12 samples in total.
My data looks like this:
                 c21 c22  c23 c41 c42  c43  t21 t22  t23 t41 t42  t43
ensg00000000003  660 451  493 355 495  444  743 259  422 204 149  623
ensg00000000005    0   0    0   0   0    0    0   0    0   0   0    0
ensg00000000419  978 928 1161 641 810  807 1265 361  998 326 239 1055
ensg00000000457  234 248  444 192 218  326  615 122  395 134 100  406
ensg00000000460 1096 919 1253 693 907 1185 1648 381 1119 422 269 1267
Now I want to carry out a PCA on this data that shows, for every gene, one point for the control samples and one point for the treated samples (so that I can calculate the Euclidean distance between each gene's control and treated points). The first 6 samples should be taken as the control point and the last 6 samples as the treated point. Note: I need the genes plotted on the PCA graph for control and treated samples, not the samples themselves.
I have already run a PCA, but it takes the whole data set and gives only one point per gene, not a separate point for control and treated for every gene. How can I deal with this? Can anyone help?
df <- read.table(text = "
                 c21 c22  c23 c41 c42  c43  t21 t22  t23 t41 t42  t43
ensg00000000003  660 451  493 355 495  444  743 259  422 204 149  623
ensg00000000005    0   0    0   0   0    0    0   0    0   0   0    0
ensg00000000419  978 928 1161 641 810  807 1265 361  998 326 239 1055
ensg00000000457  234 248  444 192 218  326  615 122  395 134 100  406
ensg00000000460 1096 919 1253 693 907 1185 1648 381 1119 422 269 1267
", header = TRUE)
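Simply rearrange the input data prior to the PCA. The control and treatment observations should be stacked below each other, so that each gene contributes one row for control and one row for treated.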
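# Split into control (columns 1-6) and treated (columns 7-12)
dfc <- df[, 1:6]
dft <- df[, 7:12]
# Strip the letter prefix so both halves share the same column names
names(dfc) <- gsub("[[:alpha:]*]", "", names(dfc))
names(dft) <- gsub("[[:alpha:]*]", "", names(dft))
# Tag the treated rows so row names stay unique after stacking
rownames(dft) <- paste0(rownames(dft), "_t")
df1 <- rbind(dfc, dft)
summary(pca <- princomp(df1))
biplot(pca)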
Note that this answer does not endorse the statistical approach; it only answers the programming question.
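The question also mentions wanting the Euclidean distance between each gene's control and treated points. As a rough sketch of how that could follow from the stacked layout, the snippet below pulls the scores out of `princomp` and measures the distance per gene. Restricting the distance to the first two components (the ones `biplot` draws) is an assumption made here, not something the question specifies, and the names `stacked`, `ctrl_scores`, `trt_scores` and `gene_dist` are made up for illustration.

```r
# Example data copied from the question (5 genes, 6 control + 6 treated columns)
df <- read.table(text = "
c21 c22 c23 c41 c42 c43 t21 t22 t23 t41 t42 t43
ensg00000000003 660 451 493 355 495 444 743 259 422 204 149 623
ensg00000000005 0 0 0 0 0 0 0 0 0 0 0 0
ensg00000000419 978 928 1161 641 810 807 1265 361 998 326 239 1055
ensg00000000457 234 248 444 192 218 326 615 122 395 134 100 406
ensg00000000460 1096 919 1253 693 907 1185 1648 381 1119 422 269 1267
", header = TRUE)

# Stack control rows on top of treated rows, with matching column names
ctrl <- as.matrix(df[, 1:6])
trt  <- as.matrix(df[, 7:12])
colnames(ctrl) <- sub("^[a-z]", "", colnames(ctrl))
colnames(trt)  <- sub("^[a-z]", "", colnames(trt))
stacked <- rbind(ctrl, trt)
rownames(stacked) <- c(paste0(rownames(df), "_c"), paste0(rownames(df), "_t"))

# PCA on the stacked matrix; princomp stores the projected points in $scores
pca <- princomp(stacked)
scores <- pca$scores[, 1:2]   # assumption: only the two plotted components

# Rows 1..n are the control points, rows n+1..2n the treated points
n <- nrow(df)
ctrl_scores <- scores[seq_len(n), , drop = FALSE]
trt_scores  <- scores[n + seq_len(n), , drop = FALSE]

# Euclidean distance between each gene's control and treated point
gene_dist <- sqrt(rowSums((ctrl_scores - trt_scores)^2))
names(gene_dist) <- rownames(df)
print(gene_dist)
```

A gene whose control and treated rows are identical, such as the all-zero `ensg00000000005` here, lands on the same point twice and so gets a distance of 0. Dropping the `[, 1:2]` subset would give the distance over all components instead. Like the answer, this only illustrates the mechanics and is not a recommendation of the statistical approach.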