Centering, scaling, and transformations: improving the biological information content of metabolomics data

Author(s): van den Berg RA, Hoefsloot HC, Westerhuis JA, Smilde AK, van der Werf MJ

Abstract

Background: Extracting relevant biological information from large data sets is a major challenge in functional genomics research. Different aspects of the data hamper their biological interpretation. For instance, 5000-fold differences in concentration for different metabolites are present in a metabolomics data set, while these differences are not proportional to the biological relevance of these metabolites. However, data analysis methods are not able to make this distinction. Data pretreatment methods can correct for aspects that hinder the biological interpretation of metabolomics data sets by emphasizing the biological information in the data set and thus improving their biological interpretability.

Results: Different data pretreatment methods, i.e. centering, autoscaling, pareto scaling, range scaling, vast scaling, log transformation, and power transformation, were tested on a real-life metabolomics data set. They were found to greatly affect the outcome of the data analysis and thus the rank of the metabolites that are most important from a biological point of view. Furthermore, the stability of the rank, the influence of technical errors on data analysis, and the preference of data analysis methods for selecting highly abundant metabolites were all affected by the data pretreatment method applied before data analysis.
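The seven pretreatment methods named above can be sketched in a few lines of NumPy. This is a minimal illustration using the commonly cited definitions of these methods, not the authors' own code. The function names, the samples-by-metabolites layout (rows are samples, columns are metabolites), the use of the sample standard deviation (ddof=1), the base-10 logarithm, and the square root as the power transformation are all assumptions made for this example.

```python
import numpy as np

# Assumed layout: X has one row per sample and one column per metabolite.
# All function names below are hypothetical, chosen for this sketch.

def center(X):
    # Subtract each metabolite's mean so values fluctuate around zero.
    return X - X.mean(axis=0)

def autoscale(X):
    # Centering, then division by the standard deviation (unit variance).
    return center(X) / X.std(axis=0, ddof=1)

def pareto_scale(X):
    # Centering, then division by the square root of the standard deviation.
    return center(X) / np.sqrt(X.std(axis=0, ddof=1))

def range_scale(X):
    # Centering, then division by the observed range (max minus min).
    return center(X) / (X.max(axis=0) - X.min(axis=0))

def vast_scale(X):
    # Autoscaling multiplied by mean/std, i.e. divided by the
    # coefficient of variation, which favors stable metabolites.
    s = X.std(axis=0, ddof=1)
    return center(X) / s * (X.mean(axis=0) / s)

def log_transform(X):
    # Base-10 logarithm followed by centering; requires strictly positive data.
    logged = np.log10(X)
    return logged - logged.mean(axis=0)

def power_transform(X):
    # Square root followed by centering; requires non-negative data.
    rooted = np.sqrt(X)
    return rooted - rooted.mean(axis=0)

# Made-up concentrations for two metabolites that differ greatly in abundance.
X = np.array([[1000.0, 1.0],
              [2000.0, 3.0],
              [3000.0, 2.0]])

for method in (center, autoscale, pareto_scale, range_scale,
               vast_scale, log_transform, power_transform):
    print(method.__name__)
    print(method(X))
```

Every function returns column-centered data, so the outputs can be passed directly to a method such as PCA. The scaling methods differ only in the factor each metabolite is divided by, which is what changes the relative weight of abundant and scarce metabolites.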

Conclusion: Different pretreatment methods emphasize different aspects of the data and each pretreatment method has its own merits and drawbacks. The choice of a pretreatment method depends on the biological question to be answered, the properties of the data set, and the data analysis method selected. For the explorative analysis of the validation data set used in this study, autoscaling and range scaling performed better than the other pretreatment methods. That is, range scaling and autoscaling were able to remove the dependence of the rank of the metabolites on the average concentration and the magnitude of the fold changes, and showed biologically sensible results after PCA (principal component analysis). In conclusion, selecting a proper data pretreatment method is an essential step in the analysis of metabolomics data and greatly affects the metabolites that are identified as the most important.
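The dependence of metabolite rank on average concentration can be illustrated with a toy example. The sketch below is not the study's validation data set: the numbers are invented, the helper name pc1_loadings is hypothetical, and PCA is computed through a singular value decomposition of the pretreated matrix. It compares the first principal component after centering alone with the one obtained after autoscaling, for two metabolites whose concentrations differ about 5000-fold.

```python
import numpy as np

def pc1_loadings(Z):
    # Z must already be column-centered. The rows of Vt from the SVD are
    # the principal axes; absolute values remove the arbitrary sign.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return np.abs(Vt[0])

# Invented data: column 0 is a highly abundant metabolite,
# column 1 a metabolite at roughly 5000-fold lower concentration.
X = np.array([[5000.0, 1.0],
              [5500.0, 1.4],
              [4500.0, 0.6],
              [5200.0, 0.9],
              [4800.0, 1.1]])

centered = X - X.mean(axis=0)
autoscaled = centered / X.std(axis=0, ddof=1)

centered_loadings = pc1_loadings(centered)
auto_loadings = pc1_loadings(autoscaled)

print("PC1 loadings, centering only:", centered_loadings)
print("PC1 loadings, autoscaling:   ", auto_loadings)
```

With centering only, the first component is driven almost entirely by the abundant metabolite, because its variance is larger by many orders of magnitude. After autoscaling both metabolites carry equal weight, so their contribution to the component no longer reflects concentration.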
