galgo accepts an expression matrix and a survival object and searches for robust gene expression signatures related to a given outcome.

galgo(population = 30, generations = 2, nCV = 5,
  distancetype = "pearson", TournamentSize = 2, period = 1825,
  OS, prob_matrix, res_dir = "",
  start_galgo_callback = callback_default,
  end_galgo_callback = callback_base_return_pop,
  report_callback = callback_base_report,
  start_gen_callback = callback_default,
  end_gen_callback = callback_default,
  verbose = 2)

Arguments

population

a number indicating how many solutions make up the population that will be evolved

generations

a number indicating the number of iterations of the galgo algorithm

nCV

number of cross-validation sets

distancetype

character; one of 'pearson' (centered Pearson), 'uncentered' (uncentered Pearson), 'spearman' or 'euclidean'

TournamentSize

a number indicating the size of the tournaments for the selection procedure

period

a number indicating the outcome period over which the restricted mean survival time (RMST) is evaluated

OS

a survival object (see Surv function from the survival package)

prob_matrix

a matrix or data.frame. Must be an expression matrix with features in rows and samples in columns

res_dir

a character string indicating where to save the intermediate and final output of the algorithm

start_galgo_callback

optional callback function for the start of the galgo execution

end_galgo_callback

optional callback function for the end of the galgo execution

report_callback

optional callback function for reporting results during the galgo execution

start_gen_callback

optional callback function for the beginning of each generation

end_gen_callback

optional callback function for the end of each generation

verbose

select the level of information printed during galgo execution
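
The distancetype options listed above can be illustrated in base R. This is a minimal sketch, not galgo's internal implementation: the helper sample_distance is hypothetical, and it assumes that each correlation-based distance is one minus the corresponding similarity between sample columns.

```r
# Hypothetical helper (not part of GSgalgoR): distance between the sample
# columns of a features-by-samples matrix, for each distancetype label.
sample_distance <- function(x, type = c("pearson", "uncentered",
                                        "spearman", "euclidean")) {
  type <- match.arg(type)
  switch(type,
    # centered Pearson: 1 - correlation between columns
    pearson = as.dist(1 - cor(x, method = "pearson")),
    # uncentered Pearson: 1 - cosine similarity between columns
    uncentered = {
      norms <- sqrt(colSums(x^2))
      as.dist(1 - crossprod(x) / outer(norms, norms))
    },
    # Spearman: 1 - rank correlation between columns
    spearman = as.dist(1 - cor(x, method = "spearman")),
    # dist() compares rows, so transpose to compare samples
    euclidean = dist(t(x))
  )
}

# Toy matrix: 20 features in rows, 4 samples in columns
set.seed(1)
toy <- matrix(rnorm(20 * 4), nrow = 20,
              dimnames = list(NULL, paste0("s", 1:4)))
d_centered   <- sample_distance(toy, "pearson")
d_uncentered <- sample_distance(toy, "uncentered")
```

Under these assumed definitions, the two Pearson variants coincide once every sample column has been mean-centered; on uncentered data they generally differ.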

Value

an object of type 'galgo.Obj' that corresponds to a list with the elements $Solutions and $ParetoFront. $Solutions is an \(l \times (n + 5)\) matrix, where \(n\) is the number of features evaluated and \(l\) is the number of solutions obtained. The \(l \times n\) submatrix is binary: each row represents the chromosome of an evolved solution from the solution population, in which each feature is either present (1) or absent (0). Column \(n + 1\) gives the number of clusters \(k\) for each solution. Columns \(n + 2\) to \(n + 5\) give the SC Fitness and Survival Fitness values, the solution rank, and the crowding distance of the solution in the final Pareto front, respectively. For easier interpretation of the 'galgo.Obj', the output can be reshaped using the to_list and to_dataframe functions.
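
The column layout of $Solutions can be made concrete with a toy matrix. The sketch below uses a hand-written stand-in named sol, laid out as described above; its values are invented for illustration and are not real galgo output, and all variable names are hypothetical.

```r
# Toy stand-in for a $Solutions matrix (hypothetical values, NOT real output):
# l = 3 solutions, n = 4 features, followed by 5 summary columns.
n <- 4
sol <- rbind(
  c(1, 0, 1, 1,  2, 0.15,  69.5, 1, Inf),
  c(0, 1, 1, 0,  4, 0.08, 193.0, 1, 1.7),
  c(1, 1, 0, 0,  9, 0.01, 264.6, 1, Inf)
)

chromosomes  <- sol[, seq_len(n), drop = FALSE]  # binary l x n submatrix
k            <- sol[, n + 1]                     # number of clusters
sc_fitness   <- sol[, n + 2]                     # SC Fitness
surv_fitness <- sol[, n + 3]                     # Survival Fitness
pareto_rank  <- sol[, n + 4]                     # solution rank
crowding     <- sol[, n + 5]                     # crowding distance

# Indices of the features present (1) in each solution
selected <- lapply(seq_len(nrow(chromosomes)),
                   function(i) which(chromosomes[i, ] == 1))
```

In practice the to_list and to_dataframe functions mentioned above are the intended way to reshape the output; manual indexing like this is only meant to show how the columns are arranged.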

Examples

# load example dataset
library(breastCancerTRANSBIG)
data(transbig)
Train <- transbig
rm(transbig)
#> Warning: object 'transbig' not found
expression <- Biobase::exprs(Train)
clinical <- Biobase::pData(Train)
OS <- survival::Surv(time = clinical$t.rfs, event = clinical$e.rfs)

# We will use a reduced dataset for the example
expression <- expression[sample(seq_len(nrow(expression)), 100), ]

# Now we scale the expression matrix
expression <- t(scale(t(expression)))

# Run galgo
output <- GSgalgoR::galgo(generations = 5, population = 15,
  prob_matrix = expression, OS = OS)
#> Using CPU for computing pearson distance
#> Generation 1 Non-dominated solutions:
#>           k                      rnkIndex    CrowD
#> result.3  2 0.14882360  69.46322        1      Inf
#> result.4  9 0.00656122 264.62890        1      Inf
#> result.10 4 0.07773017 192.97399        1 1.695167
#> Generation 2 Non-dominated solutions:
#>           k                      rnkIndex    CrowD
#> result.3  2 0.14882360  69.46322        1      Inf
#> result.1  7 0.02571842 372.07256        1      Inf
#> result.10 4 0.07773017 192.97399        1 1.106057
#> result.13 5 0.03596436 205.87446        1 0.824781
#> Generation 3 Non-dominated solutions:
#>           k                      rnkIndex     CrowD
#> result.3  2 0.14882360  69.46322        1       Inf
#> result.1  7 0.02571842 372.07256        1       Inf
#>           4 0.06304767 212.49021        1 0.8401363
#>           3 0.08357203  77.11589        1 0.8206758
#> result.10 4 0.07773017 192.97399        1 0.5063178
#> Generation 4 Non-dominated solutions:
#>           k                      rnkIndex     CrowD
#> result.3  2 0.14882360  69.46322        1       Inf
#> result.1  7 0.02571842 372.07256        1       Inf
#>           4 0.06304767 212.49021        1 0.8494499
#>           3 0.08357203  77.11589        1 0.8334064
#> result.10 4 0.07773017 192.97399        1 0.5099930
#> Generation 5 Non-dominated solutions:
#>           k                      rnkIndex     CrowD
#> result.1  7 0.02571842 372.07256        1       Inf
#>           2 0.16568616  44.93058        1       Inf
#>           2 0.13112577  85.75740        1 0.6696392
#>           8 0.04581630 287.44268        1 0.6687258
#>           2 0.08329750 164.24513        1 0.6272345
#>           4 0.06304767 212.49021        1 0.4575554
#> result.3  2 0.14882360  69.46322        1 0.3282574
#> result.10 4 0.07773017 192.97399        1 0.2585055
outputDF <- to_dataframe(output)
outputList <- to_list(output)
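
The "Non-dominated solutions" listings printed for each generation refer to Pareto dominance over the two fitness values. As a rough sketch (not galgo's implementation), and assuming that both objectives are maximized, a solution is non-dominated when no other solution is at least as good on both objectives and strictly better on at least one. The helper non_dominated and the toy values below are hypothetical.

```r
# Hypothetical helper: flag rows of a two-column fitness matrix that are
# non-dominated, assuming larger is better on both objectives.
non_dominated <- function(fit) {
  vapply(seq_len(nrow(fit)), function(i) {
    dominated_by <- apply(fit, 1, function(other) {
      # 'other' dominates row i if it is no worse everywhere
      # and strictly better somewhere
      all(other >= fit[i, ]) && any(other > fit[i, ])
    })
    !any(dominated_by)
  }, logical(1))
}

# Toy fitness values (illustrative only, not galgo output)
fit <- cbind(sc   = c(0.15, 0.08, 0.01, 0.05),
             surv = c(70, 190, 260, 100))
non_dominated(fit)
```

Here the fourth row is dominated by the second, which is better on both objectives, while the first three rows trade one objective against the other and therefore all remain on the front.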