Original Research

Local in Time Statistics for detecting weak gene expression signals in blood – illustrated for prediction of metastases in breast cancer in the NOWAC Post-genome Cohort

Pages 11-28 | Published online: 10 Jul 2017

Figures & data

Table 1 Number of case-control pairs with gene expression data Xg,p in each stratum and year before diagnosis in the screening group

Table 2 Number of case-control pairs with gene expression data Xg,p in each stratum and year before diagnosis in the clinical group

Figure 1 Overview of hypothesis tests, prediction methods, variables, and strata.

Notes: Illustration of the association between the data Xg,p, the different hypothesis tests, the prediction methods, the variables used in these tests and methods, and the strata.

Table 3 Association between tumor size and metastases

Figure 2 The statistics sp,(g) and mp,(g).

Notes: Plots are shown for unstandardized data for sp,(g) and mp,(g), and for standardized data for sp,(g). Curves are shown for the data in the periods closest to and furthest from diagnosis. (A) Data from the screening group. (B) Data from the clinical group.
Abbreviations: st.dev., standard deviation; w.r.t., with respect to.

Figure 3 The statistic wp,(g) and hypothesis H0-node.

Notes: Plots of the statistic (A) and of p-values against time (B) for the statistic wp,(g), for the screening group (left panel) and the clinical group (right panel). (A) The statistic wp,(g), where the two periods contain 50 (screening) or 25 (clinical) case-control pairs in which the case is without metastases. (B) p-values against time for the statistic wp,(g) (H0-node). Each plot has one curve for genes with order 50 (black), 200 (red), 500 (green), 1000 (blue), and 2000 (light blue). The p-value at time point t is the p-value for the time period whose midpoint is closest to t, taken after the p-values have been smoothed using a median filter with window size 99. The resulting curve is then smoothed using a mean filter with a window size of 1 month. The dotted horizontal line indicates the 0.05 level of significance, and the long vertical lines mark the years before diagnosis.
Abbreviation: w.r.t., with respect to.
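
The smoothing described in the notes to panel (B) can be sketched roughly as follows. This is an illustrative reading of the caption, not the authors' code: the function names, the truncated handling of window edges, the use of days as the time unit, and the choice of 30 days for "1 month" are all assumptions.

```python
from statistics import fmean, median

def median_filter(values, window):
    """Running median; the window is truncated at both ends (an assumption)."""
    half = window // 2
    return [median(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]

def p_value_curve(midpoints, p_values, grid, median_window=99, mean_window=30.0):
    """Smoothed p-value curve on `grid` (times in days, hypothetical unit)."""
    # Step 1: median-filter the per-period p-values (window size 99 in the caption).
    smoothed = median_filter(p_values, median_window)
    # Step 2: at each time t, use the period whose midpoint is closest to t.
    nearest = []
    for t in grid:
        j = min(range(len(midpoints)), key=lambda k: abs(midpoints[k] - t))
        nearest.append(smoothed[j])
    # Step 3: mean filter over a time window of about 1 month (assumed 30 days).
    half = mean_window / 2
    return [fmean([v for s, v in zip(grid, nearest) if abs(s - t) <= half])
            for t in grid]

# Toy data: 20 periods, one outlying p-value that the median filter suppresses.
midpoints = [10 * i for i in range(20)]
p_values = [0.2] * 20
p_values[7] = 0.9
grid = list(range(0, 200, 5))
curve = p_value_curve(midpoints, p_values, grid, median_window=3)
```

In this toy run the single outlier at the eighth period is removed by the median filter, so the resulting curve stays flat at about 0.2.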

Figure 4 Hypotheses H0-case-ctrl and H0-time.

Notes: Plots of p-values against time for the hypotheses H0-case-ctrl and H0-time, using the datasets with cases without metastases. (A) Results for data from the screening group. (B) Results for data from the clinical group. Each plot has one curve for genes with order 50 (black), 200 (red), 500 (green), 1000 (blue), and 2000 (light blue). The p-value at time point t is the p-value for the time period whose midpoint is closest to t, taken after the p-values have been smoothed using a median filter with window size 99. The resulting curve is then smoothed using a mean filter with a window size of 1 month. The dotted horizontal line indicates the 0.05 level of significance, and the long vertical lines mark the years before diagnosis.

Table 4 Number of correctly and wrongly classified cases in the screening group and the clinical group

Figure 5 Prediction results.

Notes: (A) Correctly (green) or wrongly (red) classified cases plotted against time to diagnosis for the screening (upper panel) and the clinical group (lower panel). A circle is plotted above every fifth case. Long vertical lines are plotted to indicate the years. Cases with metastases are plotted on the line labeled “With”, while cases without metastases are plotted on the line labeled “Without”. (B) Fraction of correctly classified cases with (red) and without (black) metastases over time for the screening (upper panel) and the clinical group (lower panel). The fraction for each point in time is computed using a moving window of 1 year (clinical) or 100 days (screening). The resulting curve is then smoothed using a median filter with a window size of 1 year (clinical) or 100 days (screening).
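
The moving-window fraction in panel (B) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the centered window, and the toy data are hypothetical, and the subsequent median smoothing mentioned in the notes is left out.

```python
def moving_fraction(times, correct, grid, window):
    """Fraction of correctly classified cases near each grid point.

    times   -- time to diagnosis for each case, in days (assumed unit)
    correct -- True if the case was classified correctly
    window  -- full window width in days, e.g. 365 (clinical) or 100 (screening)
    """
    half = window / 2
    fractions = []
    for t in grid:
        # Cases whose time to diagnosis falls inside the centered window.
        hits = [c for s, c in zip(times, correct) if abs(s - t) <= half]
        # An empty window has no defined fraction.
        fractions.append(sum(hits) / len(hits) if hits else float("nan"))
    return fractions

# Toy data: four cases, evaluated at two time points with a 100-day window.
times = [10, 20, 30, 200]
correct = [True, False, True, True]
fractions = moving_fraction(times, correct, grid=[20, 200], window=100)
```

Such a fraction would be computed separately for cases with and without metastases to obtain the two curves in each panel.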

Table 5 Top 10 differentially expressed genes in clinically and screening-detected cases when comparing cases with metastases to cases without metastases

Figure 6 ROC curves.

Notes: ROC curves obtained when predicting the metastasis status of the cases. (A) ROC curve for the screening group in year 1 before diagnosis. (B) ROC curve for the clinical group in years 3–4 before diagnosis.
Abbreviation: ROC, receiver operating characteristic.

Figure S1 Boxplots illustrating how the score used in the predictor depends on the number of genes included in the score.

Notes: The score has been normalized by dividing by the number of genes included in the score. The scores for the cases with metastases should be positive (lower panel), while the scores for the cases without metastases should be negative (upper panel). (A) Scores for case-control pairs from the screening group, around 6 months before diagnosis. (B) Scores for case-control pairs from the clinical group, around 2 years and 6 months before diagnosis.