Example, the aforementioned Stouffer method) to aggregate the module memberships across the ten data sets. Here, we used the average aggregation implemented in the WGCNA function consensusKME. Module membership measures allow one to efficiently annotate all methylation profiles on the array [51]. Further details on the consensus module approach can be found in [23,47].

Numerous network inference algorithms have been developed, including ARACNE [52] and BANJO [53]. A comparison of different network inference algorithms lies beyond the scope of this biology paper; a recent review article compares the performance of WGCNA to ARACNE and other algorithms [49]. Advantages of WGCNA include i) module preservation statistics, which are used in this article, ii) powerful functions for consensus module analysis, iii) the availability of module membership measures, and iv) proven methods for finding modules.

Horvath et al. Genome Biology 2012, 13:R97 http://genomebiology.com/2012/13/10/R

Module preservation analysis

Our module preservation analysis is based on the approach described in [24] and implemented in the modulePreservation function of the WGCNA R package. This function implements several powerful network-based statistics for evaluating module preservation. For each module in the reference data (for example, a brain methylation data set) one observes a value of a module preservation statistic in the test data (for example, the MSC methylation data set). An advantage of these network-based preservation statistics is that they make few assumptions regarding module definition and module properties. Traditional cross-tabulation-based statistics are inferior for the purposes of our study. While cross-tabulation approaches are intuitive, they have several disadvantages.
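To illustrate why a network-based statistic needs no module assignment in the test data, the following Python sketch evaluates a simple density-type quantity: the mean absolute pairwise correlation, in the test data, among the probes of a reference module. This is not the WGCNA implementation. The function name mean_abs_correlation, the probe IDs and all numeric values are hypothetical and chosen only for illustration, and the contrast with an arbitrary probe set is a rough stand-in for a proper significance assessment.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_abs_correlation(profiles, members):
    """Mean |correlation| over all pairs of probes in `members`.

    `profiles` maps a probe ID to its values across the samples of one
    data set; `members` is a list of probe IDs (hypothetical helper).
    """
    pairs = list(combinations(members, 2))
    total = sum(abs(pearson(profiles[a], profiles[b])) for a, b in pairs)
    return total / len(pairs)


# Toy TEST data set: four samples, five probes (made-up values).
test_data = {
    "cg1": [0.1, 0.4, 0.6, 0.9],
    "cg2": [0.2, 0.5, 0.7, 0.8],
    "cg3": [0.9, 0.6, 0.5, 0.1],  # anti-correlated with cg1 and cg2
    "cg4": [0.5, 0.1, 0.9, 0.3],
    "cg5": [0.6, 0.3, 0.4, 0.7],
}

# Module membership comes from the REFERENCE data only; the test data
# are never clustered.
reference_module = ["cg1", "cg2", "cg3"]

module_density = mean_abs_correlation(test_data, reference_module)

# An arbitrary probe set of the same size, for contrast.
contrast_density = mean_abs_correlation(test_data, ["cg1", "cg4", "cg5"])

print(f"reference module density in test data: {module_density:.3f}")
print(f"contrast set density in test data:     {contrast_density:.3f}")
```

In this toy example the reference module remains densely correlated in the test data while the contrast set does not, which is the kind of evidence a density-based preservation statistic summarizes. Absolute correlations are used so that anti-correlated probes count toward density, which corresponds to an unsigned network; a signed analysis would treat them differently.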
To begin with, cross-tabulation approaches are only applicable if the module assignment in the test data results from applying a module detection procedure to the test data. Even when modules are defined using a module detection procedure, cross-tabulation-based approaches face potential pitfalls. A module found in the reference data set will be deemed non-reproducible in the test data set if no matching module can be identified by the module detection approach in the test data set. Such non-preservation may be called weak non-preservation: ‘the module cannot be found using the current parameter settings of the module detection procedure’. Here, by contrast, we are interested in establishing strong non-preservation: ‘the module cannot be found irrespective of the parameter settings of the module detection procedure’. Strong non-preservation is difficult to establish using cross-tabulation approaches that rely on module assignment in the test data set. A second disadvantage of a cross-tabulation-based approach is that it requires finding a matching test module for each reference module. This may be difficult when a reference module overlaps with several test modules or when the overlaps are small. A third disadvantage is that cross-tabulating module membership between two networks may miss the fact that the patterns of density or connectivity between module nodes are highly preserved between the two networks. The correlation network-based statistics implemented in the modulePreservation function do not require the module assignment in the test network but require the user to input DNA methylation data underlying a reference data set a.