2,740
Views
3
CrossRef citations to date
0
Altmetric
Original Research

Non-small cell lung cancer patients treated with Anti-PD1 immunotherapy show distinct microbial signatures and metabolic pathways according to progression-free survival and PD-L1 status

, , , , , , , , , , & show all
Article: 2204746 | Received 19 Oct 2022, Accepted 16 Apr 2023, Published online: 12 May 2023

ABSTRACT

Due to the high variance in response rates concerning anti-PD1 immunotherapy (IT), there is an unmet need to discover innovative biomarkers to predict immune checkpoint inhibitor (ICI)-efficacy. Our study included 62 Caucasian advanced-stage non-small cell lung cancer (NSCLC) patients treated with anti-PD1 ICI. Gut bacterial signatures were evaluated by metagenomic sequencing and correlated with progression-free survival (PFS), PD-L1 expression and other clinicopathological parameters. We confirmed the predictive role of PFS-related key bacteria with multivariate statistical models (Lasso- and Cox-regression) and validated on an additional patient cohort (n = 60). We find that alpha-diversity showed no significant difference in any comparison. However, there was a significant difference in beta-diversity between patients with long- (>6 months) vs. short (≤6 months) PFS and between chemotherapy (CHT)-treated vs. CHT-naive cases. Short PFS was associated with increased abundance of Firmicutes (F) and Actinobacteria phyla, whereas elevated abundance of Euryarchaeota was specific for low PD-L1 expression. F/Bacteroides (F/B) ratio was significantly increased in patients with short PFS. Multivariate analysis revealed an association between Alistipes shahii, Alistipes finegoldii, Barnesiella visceriola, and long PFS. In contrast, Streptococcus salivarius, Streptococcus vestibularis, and Bifidobacterium breve were associated with short PFS. Using Random Forest machine learning approach, we find that taxonomic profiles performed superiorly in predicting PFS (AUC = 0.74), while metabolic pathways including Amino Acid Synthesis and Fermentation were better predictors of PD-L1 expression (AUC = 0.87). We conclude that specific metagenomic features of the gut microbiome, including bacterial taxonomy and metabolic pathways might be suggestive of ICI efficacy and PD-L1 expression in NSCLC patients.

Introduction

The combination of anti-PD1 immune checkpoint inhibitor- (ICI) and chemotherapy (CHT) represents the backbone for current combination strategies in the front-line setting of advanced-stage non-small cell lung cancer (NSCLC) patients. Nevertheless, even today, only about 20% of all NSCLC patients exhibit stable disease or respond properly to immunotherapy (IT), and only a limited number of patients experience durable benefitsCitation1,Citation2. In recent years, multiple studies have shown that the gut microbiome might influence ICI efficacy, and CHT might modulate the gut flora. CHT enhances IT efficacy and acts through cancer neoantigen production that T-cells might recognizeCitation3. Antigens of commensal microbiota can pass through the intestinal barrier and result in T-cell priming, cytokine and interferon production stimulation, and Toll-like receptor activation through the gut-lung axisCitation4,Citation5. This phenomenon is called molecular mimicry, where epitopes produced by microbial species in the gut as part of their natural gene expression programs can resemble tumor neoantigens promoting “autoreactive” T-cells and potent anti-tumor immunityCitation6.

Others identified recently a key role for Bacteroidales in the immunostimulation associated with ICICitation7,Citation8 Others showed a connection between the composition of the gut microbiome and ICI efficacy in malignant melanomaCitation9–12, kidney-Citation13, colorectal and gastrointestinal cancersCitation14,Citation15, and NSCLCCitation16,Citation17, or in syngeneic mouse modelsCitation18,Citation19. In a large-scale, multi-cancer cohort of epithelial malignancies, fecal microbiota transplantation (FMT) was also shown to increase ICI efficacy in experimental animalsCitation20.

PD-L1 signaling affects gut mucosa toleranceCitation21. However, it is still unclear if there is a direct linkage between the host microbiome and tumor PD-L1 expression. Jang and colleagues hypothesize that through the microbiome-gut-lung axis metagenomic signatures might interact with the tumor and its immune microenvironmentCitation22. Despite the well-established positive predictive value of tumor PD-L1 immunohistochemistry (IHC) expression in NSCLCCitation23, clinical evidence shows that more than 50% of PD-L1 high-expressing patients might not respond to PD-1/PD-L1 blockadeCitation24. Altered gut microbiome was also associated with ICI toxicityCitation25 and other treatments, such as antibioticsCitation26, steroidsCitation27, or proton-pump inhibitors (PPI)Citation28. Others reported that the administration of CHT profoundly disrupts the gut microbiome and affects adverse event frequency, progression-free- (PFS), and overall survival (OS) of cancer patientsCitation29,Citation30.

Given the high variance in response rates concerning immunotherapy, there is an unmet need to discover innovative biomarkers to efficiently select patients for ICI treatments. Our study analyzed the gut microbiome of 62 immunotherapy-treated NSCLC patients using shotgun metagenomics. In this study, we have evaluated the gut bacterial diversity, taxonomy, and metagenome pathways according to PFS, PD-L1 expression, and other clinicopathological parameters, and the effects of antibiotic-, antacid- and steroid therapy. To our knowledge, this is the most comprehensive fecal metagenome analysis of Caucasian lung cancer patients treated with anti-PD ICI.

Methods

Study population and treatments

We enrolled consecutive patients with histologically confirmed adenocarcinoma (ADC), squamous cell carcinoma, and NSCLC not otherwise specified (NSCLC-NOS). All patients were diagnosed with advanced-stage NSCLC (stage IIIB/IV). The clinical TNM (Tumor, Node, Metastasis) stage was determined according to the Union for International Cancer Control (8th edition) at the time of diagnosis. Clinicopathological data included age at the time of diagnosis, gender, smoking pack year (PY), line of immunotherapy (first-line (CHT-naïve) vs. subsequent line (CHT-treated)), the administration of thoracic radiotherapy (RT), tumor PD-L1 expression (IHC, <50% vs.≥50%), the occurrence of treatment-related adverse events (trAEs, toxicity) and PFS (STable 1a and b). The detailed inclusion and exclusion criteria of our study cohorts are shown in the Supplemental Methods.

For irAEs the Common Terminology Criteria for Adverse Events (CTCAE) v5.0 was applied. In line with othersCitation17, ICI-treated patients with complete response (CR), partial response (PR), or stable disease (SD) lasting for at least six months were classified as long PFS (>6 months). Likewise, patients who experienced progressive disease within six months of treatment initiation were classified as short PFS (≤6 months). Of note, this classification estimates better the long-term benefits and is more accurate and rigorous in estimating disease progression for a subset of patients who suffer pseudo-progression. PFS was defined as the elapsed time from the commencement of ICI therapy to disease progression according to the aforementioned RECIST 1.1 criteria. The date of the last follow-up included in this analysis was 1st of December, 2021. Treatments across all centers were conducted under the current National Comprehensive Cancer Network guidelines. Treatments including antibiotics, steroids, and antacids such as proton pump inhibitors and histamine-blocker administration (PPI/H-blocker) were also included. Antibiotic (AB) use, steroid, and PPI/H-blocker treatments prior to the initiation ICI therapy (60 days before) were also recorded. All patients were assessed with 0–1 ECOG performance status at the initiation of IT. After signing informed consent, patients with baseline stool samples defined as collected within seven days, either before or after the first iv dose of ICI administration were included. On the collection day, all samples were placed in the −80°C freezer until microbiome isolation and sequencing was initiated.

STable 2a and b show the systemic therapy that patients received at the time of sampling for the Discovery and Validation cohorts, respectively, approved by the Institutional Oncology Teams between 2017 and 2019 at the National Koranyi Institute of Pulmonology, Budapest, Hungary, and at the County Hospital of Pulmonology, Torokbalint, Hungary. STable 2c shows the comparison of clinicopathological parameters according to the type of IT drug in the Discovery cohort. STable 3 compares first-line (CHT-naïve) versus subsequent-line (CHT-treated) patients in the Discovery cohort. A total of n = 9 patients received Chemotherapy-Immunotherapy combination (CHT+IO) treatment, and a total of n = 4 received Chemoradiation-followed by Immunotherapy combination (CRT+IO) treatment, listed in STable 2a-b. Patients treated with PD-1 inhibitors durvalumab (n = 7) and atezolizumab (n = 4) participated in phase III clinical trials included in this cohort. The sample collection is not a therapeutic intervention and does not require listing on clinicaltrials.gov.

Taxonomic assignment

The reads were adaptor-trimmed and quality-filtered for a minimum mean Q-score of 30. Quality check was performed using fastQC, to remove the adapter regions, low-quality reads, and human DNA contaminations (bwa (version 0.7.4-r385) passing per sequence quality score, per base N content, and adapter contentCitation31.

Kraken2 was used (version 2.0.8)Citation32 with the MiniKraken2 database for the taxonomic assignment. The result files were merged into a data matrix with KrakenTools (v1.2) combine_kreports.py script. The read counts were normalized by rarefaction using the smallest sample as minimum depth with inclusion criteria of min. 1 read in min. 1 sample per taxa. A significant proportion of the reads had fallen in the unclassified bin (mean = 0.58, SD = 0.086). Taxa stratified the results for statistical analysis. In further analyses, only taxa within the domains of Bacteria and Archaea were included; we excluded all viral and eukaryotic taxonomic units. Rare (present maximum 10% in all samples) and low abundance (support of less than 0.01% abundance) taxa were discarded from the subsequent analysis. After filtering, a Bayesian-Multiplicative replacement of zeros was carried out using the z Composition R package, followed by central log-ratio (CLR) transformation of count and ratio values as implemented in scikit-bio. In CLR-transformation, sample vectors undergo a transformation based on the logarithm of the ratio between the individual elements and the geometric mean of the vector. The Shannon index was used to measure alpha diversities, which quantifies the entropy of the distributions of taxa and functional proportions.

Statistical analyses

The visualization of the microbiome’s taxonomic and pathway composition (beta-diversity) was carried out with Uniform Manifold Approximation and Projection (UMAP) using the CLR values as input matrix (scikit-learn v0.24)Citation33. The compositional similarities between different groups were investigated with permutational analysis of variance (PERMANOVA), and the differential abundance testing was done using the Wilcoxon rank-sum test. The associations between CLR-normalized abundances of taxa were investigated with Spearman’s rank correlation for network analyses. Survival analysis was carried out using Kaplan-Meier (KM) curves, and survival curves were compared using the log-rank test. Cut-offs for KM curves and specificity/sensitivity values for taxa were defined by Receiver Operating Characteristic (ROC) curve analysis using the binary outcomes of short vs. long PFS.

The least absolute shrinkage and selection operator (Lasso) regression was used to select the most predictive markers from our high-dimensional data and reduce the interaction between markers to avoid overfitting. When fitting the generalized linear model, Lasso regression is distinguished by variable selection and complexity regularization. The optimal hyperparameters were selected via a 5-fold cross-validation procedure using ROC-AUC as the target metric. To identify relevant predictor factors, Cox-proportional hazard regression was performed. The analysis was two-sided, with a significance threshold of = 0.05. The backwards elimination method was used in multivariate Cox regression, where parameters (p < 0.1) were included. Harrel’s C-index was calculated to assess the fit quality of our multivariate model that performed above 0.7 (fair) in all analyses.

Processing of fecal samples, DNA extraction, sequencing steps, and the methodology of metagenome pathway assessment, PD-L1 IHC, and machine learning approach are described in Supplemental Methods.

Results

A total of n = 62 consecutive advanced-stage NSCLC patients treated with ICI were enrolled in our Discovery cohort. (STable 1a). There were 46 patients with long- and 16 with short PFS. 30 patients received first-line IT (CHT-naïve), and 32 patients received ICIs in a subsequent-line (CHT-treated) (STable 3). We also analyzed an ICI-treated Validation cohort of advanced-stage NSCLC patients treated between 2017 and 2018 at the same institutions with the same treatments and guidelines to validate the relative abundance of key bacterial species associated with long- or short PFS (n = 60, STable 1b). STable 4 shows the comparison in clinical parameters between the Discovery and Validation cohorts, where the proportion of PD-L1 low IHC expressing patients (p = 0.001) and CHT-treated patients (p < 0.001) were significantly higher compared to the Discovery cohort. SFig 1A shows the study design on a flow chart. In the Discovery cohort, CHT-naive and PD-L1-high patients showed significantly better PFS compared to CHT-treated (p = 0.0016) and PD-L1-low patients (p = 0.0041), respectively (SFig 1C-D). No significant difference was detected in PFS according to gender (p = 0.055, SFig 1B), toxicity (p = 0.872, SFig 1E), antibiotic treatment (p = 0.247, SFig 1F), antacid medication (p = 0.88, SFig 1 G) and steroid treatment (p = 0.0894, SFig 1 H). STable 5a-b shows uni- and multivariate Cox hazard regression with the baseline clinical parameters, where PD-L1 status (high vs. low) was the only significant predictor in the multivariate analysis (p = 0.005, HR: [3.861]).

The composition of bacterial communities differs significantly according to progression-free survival and chemotherapy treatment.

First, alpha-diversity was assessed in fecal samples according to PFS, PD-L1 expression, and CHT treatment for all major phylogenetic levels, including phylum, class, genus, and species. Shannon index showed no significant differences in any comparisons (). UMAP plot and beta-diversity analyses showed that short PFS patients represent a significantly different bacterial composition compared to long PFS patients (p < 0.001, ), whereas patient groups do not differ significantly according to PD-L1 expression (p = 0.508, ). Interestingly, front-line CHT-treated patients exhibit a significantly different bacterial community than CHT-naive patients (p = 0.015, ).

Figure 1. Alpha diversity and composition of bacterial communities (beta diversity). Shannon diversity index was calculated at phylum, class, genus and species taxonomy level according to long vs short PFS (cutoff 6 months), PD-L1 expression (≥50% high, <50% low) and front-line chemotherapy (CHT)-treatment prior to ICI. There was no significant difference in Shannon diversity index regarding PFS (A), PD-L1 expression (B) and CHT-treatment (C) using either taxonomic level. Diversity (Shannon and Simpson) indices and p-values for all alpha-diversity comparisons are listed in STable 3. Ordination plot using UMAP was generated from normalized, CLR-transformed bacterial abundances in the same comparisons. Permanova analysis was used to assess significant differences between the composition of bacterial communities. There was a significant difference between long- vs short PFS patients (F = 3.379, p = 0.0006, D) and between CHT-treated vs CHT-naive patients (F = 2.139, p = 0.0156, F), whereas no significant difference was detected between PD-L1 high vs PD-L1 low patients (F = 0.916, p = 0.0156, D) regarding the composition of bacterial species. Metric data are shown as mean and corresponding standard deviation (SD). Statistical significance *P < 0.05; **P < 0.01, ***P<.001. N/A: data not available.

Figure 1. Alpha diversity and composition of bacterial communities (beta diversity). Shannon diversity index was calculated at phylum, class, genus and species taxonomy level according to long vs short PFS (cutoff 6 months), PD-L1 expression (≥50% high, <50% low) and front-line chemotherapy (CHT)-treatment prior to ICI. There was no significant difference in Shannon diversity index regarding PFS (A), PD-L1 expression (B) and CHT-treatment (C) using either taxonomic level. Diversity (Shannon and Simpson) indices and p-values for all alpha-diversity comparisons are listed in STable 3. Ordination plot using UMAP was generated from normalized, CLR-transformed bacterial abundances in the same comparisons. Permanova analysis was used to assess significant differences between the composition of bacterial communities. There was a significant difference between long- vs short PFS patients (F = 3.379, p = 0.0006, D) and between CHT-treated vs CHT-naive patients (F = 2.139, p = 0.0156, F), whereas no significant difference was detected between PD-L1 high vs PD-L1 low patients (F = 0.916, p = 0.0156, D) regarding the composition of bacterial species. Metric data are shown as mean and corresponding standard deviation (SD). Statistical significance *P < 0.05; **P < 0.01, ***P<.001. N/A: data not available.

Firmicutes and Actinobacteria phyla are more abundant in patients with short PFS, along with an increased F/B ratio.

shows phylum composition for every patient, where all phyla with CLR≥-1 were included. For further analysis, we removed all non-bacterial taxonomic units such as fungi (Ascomycota), protozoa (Apicomplexa), viruses, chordata (human host), and plants (Streptophyta). Firmicutes, Bacteroidetes, Proteobacteria, Actinobacteria, and Verrucomicrobia represent high abundance phyla (CLR≥0). In contrast, low abundance phyla (0>CLR≥-1) included Euryarchaeota, Cyanobacteria, Spirochetes, and Tenericutes in the gut microbiome of our patient cohort (). Firmicutes (p = 0.011, ) and Actinobacteria (p = 0.004, ) were significantly more abundant in patients with short PFS. Euryarchaeota were significantly more abundant in PD-L1-low patients than in PD-L1 high expressors (p = 0.001, SFig 2A), and Cyanobacteria were significantly more abundant in CHT-naive patients (compared to CHT-treated, p = 0.029, SFig 2B). An increased Firmicutes/Bacteroidetes (F/B) ratio was associated with short PFS (p = 0.013, ). Furthermore, we detected an increased Actinobacteria to Proteobacteria ratio (A/P ratio) in patients with short PFS (p = 0.011, ).

Figure 2. Differentially abundant phyla and outcomes. Phylum composition (fraction of total reads) of all patients in the Discovery cohort value is shown in the stacked bar chart (A). After removing phyla with minimal abundance (Clr<-1) and excluding non-Bacteria and non-Archaea taxa, Wilcoxon rank-sum tests were performed to compare their abundance in patients with long- vs short PFS (B). Firmicutes and Actinobacteria are significantly more abundant in patients with short PFS vs long PFS (4.704 vs 3.612 and 6.213 vs 5.935, p = 0.0114 and p = 0.0046, respectively, C and D). Moreover, both Firmicutes/Bacteroidetes (F/B, 1.294 vs 1.132, p = 0.0137, E) and Actinobacteria/Proteobacteria ratio (A/P, 1.068 vs 0.827, p = 0.0113, F) were significantly increased in patients with short PFS. Metric data are shown as mean and corresponding standard deviation (SD). Statistical significance *P < 0.05; **P < 0.01, ***P<.001.

Figure 2. Differentially abundant phyla and outcomes. Phylum composition (fraction of total reads) of all patients in the Discovery cohort value is shown in the stacked bar chart (A). After removing phyla with minimal abundance (Clr<-1) and excluding non-Bacteria and non-Archaea taxa, Wilcoxon rank-sum tests were performed to compare their abundance in patients with long- vs short PFS (B). Firmicutes and Actinobacteria are significantly more abundant in patients with short PFS vs long PFS (4.704 vs 3.612 and 6.213 vs 5.935, p = 0.0114 and p = 0.0046, respectively, C and D). Moreover, both Firmicutes/Bacteroidetes (F/B, 1.294 vs 1.132, p = 0.0137, E) and Actinobacteria/Proteobacteria ratio (A/P, 1.068 vs 0.827, p = 0.0113, F) were significantly increased in patients with short PFS. Metric data are shown as mean and corresponding standard deviation (SD). Statistical significance *P < 0.05; **P < 0.01, ***P<.001.

Multiple genera and species are differentially abundant in patients with long- vs. short progression-free survival

We performed Wilcoxon rank-sum test for all detected genera and species (high abundance: CLR≥0; low abundance 0>CLR≥-1) and 11 genera and 26 species showed a significant difference in abundance according to PFS (). We labeled these taxa as key genera and species. High-abundance genera Streptococcus (p = 0.017), Bifidobacterium (p = 0.002), Collinsella (p = 0.024), and Lactobacillus (p = 0.043) were more abundant in patients with short PFS, whereas Alistipes (p = 0.018), Paraprevotella (p = 0.031) and Barnesiella (p = 0.001) were more abundant in patients with long PFS. Among low-abundance genera, Spiroplasma (p = 0.034), Helicobacter (p = 0.033), and Buchnera (p = 0.021) were rather associated with long PFS, while Methanospaera (p = 0.02) was more associated with short PFS (). Multiple species showed increased abundance in patients with long PFS (), from which the most significant taxa included Alistipes shahii (p < 0.001), Barnesiella visceriola (p < 0.001), Butyricimonas faecalis (p = 0.001), Bacteroides sp. A1C1 (p = 0.003) and Alistipes finegoldii (p = 0.004). Patients with short PFS patients were associated with a significantly increased abundance of variety of Streptococcus species, such as S. salivarius (p = 0.002), S. vestibularis (p = 0.005), and S. parasanguinis (p = 0.01); Bifidobacteria, such as B. longum (p = 0.017), B. adolescentis (p = 0.006) and B. breve (p = 0.007). Moreover, there was an increased abundance of Collinsella aerofaciens (p = 0.004) and low-abundance Streptococci in short PFS patients ().

Figure 3. Differentially abundant genera and species and outcomes. Cluster analyses. Horizontal bar charts show the abundance of key genera and species with significantly different abundance between patients with long- vs short PFS, Wilcoxon ranksum test, p < 0.05 (A). Only genera and species (Clr>-1) from the Bacteria and Archaea domains were included in the analyses. High abundance taxa (CLR≥0) and low abundance taxa (0 > CLR ≥ −1) are separately labeled. Heatmap displays Z-scores (blue=low, red=high) for every cell generated from individual CLR-transformed abundances from key species for all patients (B). Axis X shows patients (IDs) in the Discovery cohort, whereas indicator bars on top reflect their PFS (red/blue, short vs long), PD-L1 (green/red, high vs low) and front-line CHT-treatment (purple/pink, CHT-treated vs CHT-naive) as previously described (B). Axis Y shows key bacterial species clustered to representative groups (STable 7). There were two outlier patients (S2, S57), in the whole cohort and two outlier patients in cluster B (S50, S53), who cannot be clustered to any groups. Patient clusters are compared shown in stacked bar charts (C) according to their composition of long vs short PFS, PD-L1 high vs low and CHT-treated vs naive patients. Cluster a represents significantly more patients with long PFS (compared to cluster B, p = 0.003, C) with an increased abundance of beneficial α and γ bacteria, a decreased abundance of β and a decreased- or variable abundance of δ bacteria; cluster B is characterized by a decreased abundance of α and γ, an increased abundance of δ and an increased or variable abundance of β bacteria. Cluster A1a and A1b represent a PD-L1-low and high subcluster in cluster A, with no significant difference in patients according to PFS. Cluster A1b consists of significantly more CHT-naive patients than cluster B in general. Fisher’s exact test was used to calculate differences among all clusters and subclusters. Metric data are shown as mean and corresponding standard deviation (SD). Statistical significance *P < 0.05; **P < 0.01, ***P<.001.

Figure 3. Differentially abundant genera and species and outcomes. Cluster analyses. Horizontal bar charts show the abundance of key genera and species with significantly different abundance between patients with long- vs short PFS, Wilcoxon ranksum test, p < 0.05 (A). Only genera and species (Clr>-1) from the Bacteria and Archaea domains were included in the analyses. High abundance taxa (CLR≥0) and low abundance taxa (0 > CLR ≥ −1) are separately labeled. Heatmap displays Z-scores (blue=low, red=high) for every cell generated from individual CLR-transformed abundances from key species for all patients (B). Axis X shows patients (IDs) in the Discovery cohort, whereas indicator bars on top reflect their PFS (red/blue, short vs long), PD-L1 (green/red, high vs low) and front-line CHT-treatment (purple/pink, CHT-treated vs CHT-naive) as previously described (B). Axis Y shows key bacterial species clustered to representative groups (STable 7). There were two outlier patients (S2, S57), in the whole cohort and two outlier patients in cluster B (S50, S53), who cannot be clustered to any groups. Patient clusters are compared shown in stacked bar charts (C) according to their composition of long vs short PFS, PD-L1 high vs low and CHT-treated vs naive patients. Cluster a represents significantly more patients with long PFS (compared to cluster B, p = 0.003, C) with an increased abundance of beneficial α and γ bacteria, a decreased abundance of β and a decreased- or variable abundance of δ bacteria; cluster B is characterized by a decreased abundance of α and γ, an increased abundance of δ and an increased or variable abundance of β bacteria. Cluster A1a and A1b represent a PD-L1-low and high subcluster in cluster A, with no significant difference in patients according to PFS. Cluster A1b consists of significantly more CHT-naive patients than cluster B in general. Fisher’s exact test was used to calculate differences among all clusters and subclusters. Metric data are shown as mean and corresponding standard deviation (SD). Statistical significance *P < 0.05; **P < 0.01, ***P<.001.

Significant taxonomical associations with PD-L1 expression and front-line CHT treatment were revealed using the previous methodology, where multiple genera and species showed a significant difference in PD-L1 expression (SFig 3A). SFig 3B show Receiver Operator Characteristic (ROC) curves and a table for genera with the highest AUC to predict in patients PD-L1 high vs. low expression. Methanobrevibacter smithii showed the best AUC to predict PD-L1 phenotype among species (SFig 3B). Differentially abundant genera and species in CHT-treated vs. CHT-naive patients are shown in SFig 3C and D.

CHT-treated patients exhibit a significantly different taxonomic composition from CHT-naive () patients; therefore we evaluated differential abundance between patients with short vs. long PFS in CHT-treated patients (STable 6). A. shahii, S. salivarius, S. vestibularis, and B. adolescentis still showed differential abundance between patients with long- vs. short PFS. Interestingly, in this selected CHT-treated population, Escherichia coli showed significantly increased abundance in patients with long PFS and Ruminococcus bicirculans in patients with short PFS but not in the whole cohort.

Next, a heatmap was generated to reveal clusters formed by key bacterial species and patients. Bacteria were clustered into four groups labeled with Greek letters (). Cluster α constitutes high-abundance Bacteroides, including B. uniformis, B. sp. A1C1, and B. ovatus. Cluster β includes B. breve and a variety of Streptococci. Cluster γ represents a heterogeneous microbial community with bacteria beneficial for PFS, including Bacteroides, Alistipes, and Butyricimonas species. Short PFS-associated species B. adolescentis, B. longum, C. aerofaciens, and S. salivarius comprise Cluster δ. Bacterial clusters and their representation among patients are described in STable 7. Of note, we observed that at least two bacterial clusters are altered in patients with short PFS, compared to patients with long PFS, including underrepresentation of beneficial (cluster α, γ), or overrepresentation of detrimental bacteria (clusters β). Underrepresentation of a singular beneficial- or overrepresentation of a singular detrimental bacterial cluster was not essentially associated with short PFS. Characteristics and statistical comparison of sub-clusters derived from patients relative to PFS, PD-L1 expression, and CHT are shown in . Altogether, patients with long PFS are significantly overrepresented in cluster A, and patients with short PFS patients in cluster B ().

Networks of bacterial communities are significantly different in patients with long vs. short PFS.

Interaction networks were generated using Spearman’s coefficients among bacterial taxa, including all samples (n = 62), and only samples of patients with long (n = 46) or short PFS (n = 16). shows legends for network diagrams. In the whole cohort, Actinobacteria vs. Firmicutes showed a moderate positive correlation (r = 0.406), whereas there was a moderate negative correlation between Euryarchaeota vs. Firmicutes (r = −0.522), Euryarchaeota vs. Bacteroidetes (r = −0.412) and Bacteroidetes vs. Actinobacteria (r = −0.409) phyla. Proteobacteria and Verrucomicrobia exhibited no significant associations with other phyla (). We also compared bacterial correlations in patients with short vs. long PFS. According to differential analysis, where the long PFS group was set as reference, Verrucomicrobia’s correlation has decreased with Firmicutes and increased with Bacteroidetes. Bacteroidetes’ correlation has decreased, and Euryarchaeota’s has increased with Actinobacteria (). show a network of critical genera in the whole cohort (D) and on the Tiffany diagram (E). Interestingly, Actinobacteria, such as Bifidobacterium, Actinomyces, and Collinsella, correlate positively with Firmicutes, such as Lactobacillus and Streptococcus. In contrast, Bacteroidetes Butyricimonas, Barnesiella, and Alistipes negatively correlate with key Firmicutes and Actinobacteria genera (). The connectivity network and differential network for all genera (CLR>0) are shown in SFig 4A-B.

Figure 4. Bacterial networks in patients with long- vs short PFS. Network maps show the correlation, phylogenetic origin and representation in patients with long- vs short PFS of key genera and species. Nodes represent taxa, where association with long (blue)- or short (red) PFS and phylogenetic classification are color-coded (A). Size of nodes reflects median CLR value of the taxon in the whole cohort. Edges represent significant (p < 0.01) positive (red) or negative (blue) correlations (r > [0.4]) among nodes. Thickness of edges reflect rho values according to Spearman’s correlation coefficients. Correlations r < [0.4] and p ≥ 0.01 are not shown in networks. Nodes and taxa without minimum one significant correlation (r > [0.4], p < 0.01) and a minimum median CLR of 0 (genera) or − 1 (species) are not shown in networks (B,D,F,G). Using the Diffany network analyzer module from Cytoscape, differential analysis of long- vs short PFS networks were performed, where red arrows depict a decreasing level of association-, while green arrows denote an increasing level of association between two taxa in bacterial networks. Diffany diagrams use the long PFS population as reference point and shows how the short PFS population differs from it (C,E,H). Long PFS patient-networks were used as reference for Diffany graphs, so diagrams depict how the network differs in patients with short PFS. Panels show whole cohort network and Diffany diagram for phyla (B,C), and for key genera (D,E). Panel F shows whole cohort network for key species. Number of nodes, edges, average number of neighbors, characteristic path length between nodes, clustering coefficient, density, heterogeneity and centralization of networks are shown in the whole cohort (G) and compared between long- and short PFS patient networks (H). Dendrogram depicts clusters in the whole cohort species, revealing a cluster (A) with high betweenness centrality and a cluster (B) with relatively low betweenness centrality (I).

Figure 4. Bacterial networks in patients with long- vs short PFS. Network maps show the correlation, phylogenetic origin and representation in patients with long- vs short PFS of key genera and species. Nodes represent taxa, where association with long (blue)- or short (red) PFS and phylogenetic classification are color-coded (A). Size of nodes reflects median CLR value of the taxon in the whole cohort. Edges represent significant (p < 0.01) positive (red) or negative (blue) correlations (r > [0.4]) among nodes. Thickness of edges reflect rho values according to Spearman’s correlation coefficients. Correlations r < [0.4] and p ≥ 0.01 are not shown in networks. Nodes and taxa without minimum one significant correlation (r > [0.4], p < 0.01) and a minimum median CLR of 0 (genera) or − 1 (species) are not shown in networks (B,D,F,G). Using the Diffany network analyzer module from Cytoscape, differential analysis of long- vs short PFS networks were performed, where red arrows depict a decreasing level of association-, while green arrows denote an increasing level of association between two taxa in bacterial networks. Diffany diagrams use the long PFS population as reference point and shows how the short PFS population differs from it (C,E,H). Long PFS patient-networks were used as reference for Diffany graphs, so diagrams depict how the network differs in patients with short PFS. Panels show whole cohort network and Diffany diagram for phyla (B,C), and for key genera (D,E). Panel F shows whole cohort network for key species. Number of nodes, edges, average number of neighbors, characteristic path length between nodes, clustering coefficient, density, heterogeneity and centralization of networks are shown in the whole cohort (G) and compared between long- and short PFS patient networks (H). Dendrogram depicts clusters in the whole cohort species, revealing a cluster (A) with high betweenness centrality and a cluster (B) with relatively low betweenness centrality (I).

depicts the whole cohort network using key PFS-associated species, where Bacteroidetes species strongly correlate with each other and show a strong negative correlation with C. aerofaciens. Streptococcus species showed a similar reciprocal correlation within their taxonomic niche, whereas S. vestibularis, S. anginosus, and S. gordonii were negatively correlated with multiple Bacteroidetes species (). SFig 4C shows the differences between networks generated from patients with short and long PFS.

Characteristics of the whole cohort network and comparison of key species’ long vs. short PFS networks are shown in . The Long PFS network is more interconnected with higher number of edges for the same number of nodes, decreased characteristic path length, increased network density, heterogeneity, and centralization (). Dendrogram displays nodes (taxa) according to betweenness centrality. Cluster A bacteria, such as S. vestibularis, B. uniformis, B. faecalis, A. shahii, B. visceriola, B. sp. A1C1, and B. heparinolyticus are considered central hubs in the PFS-related gut microbiome with a multitude of connections. Paraprevotella xylanophila is the most isolated species, with the lowest number of associations with other species from cluster B (SFig 4I).

Key genera and species according to CHT-regimen and PD-L1 status (SFig 3) were also used to generate interaction networks. Genera and species overrepresented within the CHT-naïve or treated population positively correlate with each other (SFig 5A-D). Regarding PD-L1 status, Methanobrevibacter smithii, the only species significantly overrepresented in PD-L1-low patients shows negative correlation with PD-L1-high status associated taxa Lachnoclostridium sp.YL32 and Ruminococcus gnavus (SFig 5D).

Taxa associated with ICI toxicity and history of Antibiotic-, Steroid, and PPI/H-blocker treatments

Next, we aimed to reveal differentially abundant taxa according to the presence of ICI adverse events (toxicity) and medications are taken before IT, including antibiotics, antacids, and steroids. Treatment toxicity was associated with a slight decrease in the abundance of genera Absiella and Blautia () and a pronounced increase in the abundance of Prevotella dentalis (). Antibiotic treatment affected multiple taxa, significantly decreasing the abundance of Anaerostipes, Christensenella, Longibaculum, Lachnospira, Anaerostipes Hadrus, and Erysipelothrichaceae bacterium GAM147 (). In contrast, Eggerthella (E) and E. lenta, Bifidobacterium bifidum, and Bacteroides xylanisolvens were significantly overrepresented in patients with a history of antibiotic treatment (). Antacid medication was associated with a modest decrease in Synergistetes phylum () and a marked increase in species Streptococcus (S) equinus, S. parasanguinis, and S. salivarius (), whose increased abundance was also detected in short PFS patients (). The phylum Proteobacteria and its prominent genus Escherichia (E) and species E. coli were significantly more abundant in steroid-treated patients, similarly to Longibaculum and Erysipelothrichaceae bacterium GAM147 ().

Figure 5. Differentially abundant taxa according to ICI-toxicity and antibiotic, antacid, or steroid treatment before immunotherapy. Bar charts show the CLR-normalized abundance of taxa with significantly different abundance (Wilcoxon ranksum test, p < 0.05) according to ICI-toxicity (A-A’), Antibiotic (AB)-treatment prior IT (B-B’), Antacid treatment prior or during IT (C-C’) and steroid treatment prior or during IT (D-D’). Only genera and species (Clr>-1) from the Bacteria and Archaea domains were included in the analyses. Genera Absiella and Blautia showed significantly decreased abundance in patients with no ICI-toxicity (p = 0.031 and p = 0.032, A), whereas species Prevotella dentalis was significantly more abundant in patients with ICI-toxicity (p = 0.0046, A’). Genera Anaerostipes (p = 0.0156), Christensenella (p = 0.0032), Lachnospira (p = 0.0323) and Longibaculum (p < 0.001) showed significant depletion, whereas Eggerthella (p = 0.0012) was significantly increased in AB-treated patients (B). Regarding species, Anaerostipes hadrus (p = 0.0244) and Erysipelothrichaceae bacterium GAM147 (p < 0.001) showed a significantly decreased abundance, whereas Bacteroides xylanisolvens (p = 0.0295), Bifidobacterium Bifidum (p = 0.0096) and Eggerthella lentha (p = 0.0053) showed a significantly increased abundance in AB-treated patients (B’). Low-abundance phylum Synergistetes showed a slight, but significant decrease in patients with a history of- or ongoing Antacid medication (C), whereas Streptococci (S) S. equinus (p < 0.001), S. parasanguinis (p = 0.0021) and S. salivarius (p = 0.0018) exhibited significantly increased abundances in Antacid treated patients (C’). Steroid treatment prior or during IT was associated with a significantly increased abundance of Proteobacteria phylum (p = 0.017) and genera Escherichia (p = 0.0108) and Longibaculum (p = 0.0103), whereas abundance of genus Desulfovibrio (p = 0.0217) was significantly decreased in the same group of patients (D). Two species showed significantly increased abundance in steroid-treated patients: Erysipelothrichaceae bacterium GAM147 (p = 0.0266) and Escherichia coli (p = 0.0298) (D’). Horizontal colored bars above charts show the phylogenetic origin of taxa (see description in Figure 4A).

Figure 5. Differentially abundant taxa according to ICI-toxicity and antibiotic, antacid, or steroid treatment before immunotherapy. Bar charts show the CLR-normalized abundance of taxa with significantly different abundance (Wilcoxon ranksum test, p < 0.05) according to ICI-toxicity (A-A’), Antibiotic (AB)-treatment prior IT (B-B’), Antacid treatment prior or during IT (C-C’) and steroid treatment prior or during IT (D-D’). Only genera and species (Clr>-1) from the Bacteria and Archaea domains were included in the analyses. Genera Absiella and Blautia showed significantly decreased abundance in patients with no ICI-toxicity (p = 0.031 and p = 0.032, A), whereas species Prevotella dentalis was significantly more abundant in patients with ICI-toxicity (p = 0.0046, A’). Genera Anaerostipes (p = 0.0156), Christensenella (p = 0.0032), Lachnospira (p = 0.0323) and Longibaculum (p < 0.001) showed significant depletion, whereas Eggerthella (p = 0.0012) was significantly increased in AB-treated patients (B). Regarding species, Anaerostipes hadrus (p = 0.0244) and Erysipelothrichaceae bacterium GAM147 (p < 0.001) showed a significantly decreased abundance, whereas Bacteroides xylanisolvens (p = 0.0295), Bifidobacterium Bifidum (p = 0.0096) and Eggerthella lentha (p = 0.0053) showed a significantly increased abundance in AB-treated patients (B’). Low-abundance phylum Synergistetes showed a slight, but significant decrease in patients with a history of- or ongoing Antacid medication (C), whereas Streptococci (S) S. equinus (p < 0.001), S. parasanguinis (p = 0.0021) and S. salivarius (p = 0.0018) exhibited significantly increased abundances in Antacid treated patients (C’). Steroid treatment prior or during IT was associated with a significantly increased abundance of Proteobacteria phylum (p = 0.017) and genera Escherichia (p = 0.0108) and Longibaculum (p = 0.0103), whereas abundance of genus Desulfovibrio (p = 0.0217) was significantly decreased in the same group of patients (D). Two species showed significantly increased abundance in steroid-treated patients: Erysipelothrichaceae bacterium GAM147 (p = 0.0266) and Escherichia coli (p = 0.0298) (D’). Horizontal colored bars above charts show the phylogenetic origin of taxa (see description in Figure 4A).

Multivariate statistical models

ROC curves were generated, and AUC values were measured to reveal the predictive power of key bacterial taxa regarding long or short PFS and to assess their prospect as clinical tests in predicting ICI efficacy (). Results of Wilcoxon tests ROC-analyses are shown in Stable 8. B. faecalis, S. parasanguinis, B. breve, and B. visceriola represent high-specificity, but mediocre sensitivity taxa, while S. vestibularis, S. salivarius, A. finegoldii, B. adolescentis, and B. sp. A1C1 represent a high-sensitivity, but mediocre specificity taxa. A. shahii is the only species showing strong specificity and sensitivity to predict PFS.

Figure 6. Multivariate and survival analysis. XY chart visualize key species according to AUC value from ROC analysis (axis Y) and p-value of Wilcoxon rank sum test (axis X) relative to the binary classification of long vs short PFS (A). Association of bacterial species with PFS is depicted in red (short) and blue (long). The normal value of Lasso coefficient is depicted in full (>0.5), half (0.1–0.5) and empty (not significant) spheres (A). Sensitivity-Specificity chart plots the same species, where specificity (axis Y), and sensitivity (axis X) are denoted in percentage (B). Association of bacterial species with PFS is depicted in red (short) and blue (long). The p-value from Cox (M) analysis is depicted in full (p < 0.05) and empty (p ≥ 0.05) spheres (B). Abundance of A. shahii (p = 0.0181) and A. finegoldii (p = 0.041) was significantly increased in patients with long- (vs short) PFS, whereas abundance of S. salivarius (p = 0.018), S. vestibularis (p = 0.0029) and B. breve (p = 0.0051) was significantly increased in patients with short- (vs long) PFS according to validation on an additional patient cohort. There was no significant difference in the case of B. adolescentis (p = 0.692). There were no abundance data available for B. visceriola, and A. dispar in the validation cohort’s Metaphlan2 dataset (C). Hierarchical pyramid-model shows key bacterial species according to long (blue) vs short (red) PFS (E): Level 1: Significant (p < 0.05) Wilcoxon Rank sum test; Level 2: Level 1 requirement and at least fair (Auc>0.7) ROC curve or at least one multivariate analysis (Lasso or Cox) with significant result; Level 3: Level 1 requirement and at least fair (Auc>0.7) ROC curve and at least one multivariate analysis (Lasso or Cox) showing significant result; Level 4: Level 1 requirement and at least fair (Auc>0.7) ROC curve and both multivariate analyses (Lasso and Cox) showing significant results (E). Bacteria on Level 3 and 4 were validated on an additional patient cohort (C) and KM curves were generated with cutoffs derived from ROC analysis for all species on levels 3 and 4 (D). A. shahii, A. finegoldii and B. visceriola – high patients showed significantly increased PFS compared to patients with low abundance of these species. Patients with increased abundances of S. vestibularis, S. salivarius and B. breve exhibited decreased PFS, compared to patients with low abundance of these species. Panel D displays KM curves with median PFS in months, p-values of Log-rank tests and patient numbers at risk in every comparison. X-axis represents elapsed (progression-free) time in months. Y-axis shows progression-free survival probability in percent. Encircled: Successfully validated; Underlined: KM curve analysis shows significant difference (Log-rank test).

Figure 6. Multivariate and survival analysis. XY chart visualize key species according to AUC value from ROC analysis (axis Y) and p-value of Wilcoxon rank sum test (axis X) relative to the binary classification of long vs short PFS (A). Association of bacterial species with PFS is depicted in red (short) and blue (long). The normal value of Lasso coefficient is depicted in full (>0.5), half (0.1–0.5) and empty (not significant) spheres (A). Sensitivity-Specificity chart plots the same species, where specificity (axis Y), and sensitivity (axis X) are denoted in percentage (B). Association of bacterial species with PFS is depicted in red (short) and blue (long). The p-value from Cox (M) analysis is depicted in full (p < 0.05) and empty (p ≥ 0.05) spheres (B). Abundance of A. shahii (p = 0.0181) and A. finegoldii (p = 0.041) was significantly increased in patients with long- (vs short) PFS, whereas abundance of S. salivarius (p = 0.018), S. vestibularis (p = 0.0029) and B. breve (p = 0.0051) was significantly increased in patients with short- (vs long) PFS according to validation on an additional patient cohort. There was no significant difference in the case of B. adolescentis (p = 0.692). There were no abundance data available for B. visceriola, and A. dispar in the validation cohort’s Metaphlan2 dataset (C). Hierarchical pyramid-model shows key bacterial species according to long (blue) vs short (red) PFS (E): Level 1: Significant (p < 0.05) Wilcoxon Rank sum test; Level 2: Level 1 requirement and at least fair (Auc>0.7) ROC curve or at least one multivariate analysis (Lasso or Cox) with significant result; Level 3: Level 1 requirement and at least fair (Auc>0.7) ROC curve and at least one multivariate analysis (Lasso or Cox) showing significant result; Level 4: Level 1 requirement and at least fair (Auc>0.7) ROC curve and both multivariate analyses (Lasso and Cox) showing significant results (E). Bacteria on Level 3 and 4 were validated on an additional patient cohort (C) and KM curves were generated with cutoffs derived from ROC analysis for all species on levels 3 and 4 (D). A. shahii, A. finegoldii and B. visceriola – high patients showed significantly increased PFS compared to patients with low abundance of these species. Patients with increased abundances of S. vestibularis, S. salivarius and B. breve exhibited decreased PFS, compared to patients with low abundance of these species. Panel D displays KM curves with median PFS in months, p-values of Log-rank tests and patient numbers at risk in every comparison. X-axis represents elapsed (progression-free) time in months. Y-axis shows progression-free survival probability in percent. Encircled: Successfully validated; Underlined: KM curve analysis shows significant difference (Log-rank test).

Next, we established a gut bacteria-based multivariate model for predicting IT response, where Lasso regression and Cox proportional hazard regression were used. Univariate- (U) and multivariate (M) models were used for Cox regression. Univariate analysis (Cox (U)) was performed for all key species () with a significant, ROC curve (AUC>0.7). In multivariate Cox models, key species with significant (p < 0.05) Cox (U) predictive values and confounding parameters such as gender, PD-L1 expression, and CHT-treatment were included (). To exemplify the hierarchy of critical species according to their association with PFS, a pyramid model is shown in .

Table 1. Results of Uni- and Multivariate Cox regression for key species. Green color represents significant (<0.05) p-values and dark green color shows only a trend. Cox (M) analysis was performed only for taxa with a significant result (p < 0.1) in Cox (U) analysis. Grey colored taxa are low-abundance. For Lasso regression, Lasso coeff (n) is only shown for significant (p < 0.05) predictors.

An additional patient cohort (n = 60, STable 1b) was used for validation. According to our hierarchical pyramid model, we aimed to validate level 3 and 4 species with our validation cohort, where at least one multivariate regression model (Lasso or multivariate Cox) showed a significant predictive role. We confirmed species A. shahii (p = 0.0181) and A. finegoldii (p = 0.041) that were significantly increased in patients with long vs. short PFS, and species S. salivarius (p = 0.018), S. vestibularis (p = 0.0029) and B. breve (p = 0.0051), that were significantly increased in patients with short vs. long PFS (). Abundance of B. adolescentis did not show a significant difference in the validation cohort (SFigure 6A). Cut-off values were generated based on the ROC curves, and KM analysis was performed for level 3 and 4 species. An increased abundance of A. shahii (p < 0.001), A. finegoldii (p = 0.0084), and B. visceriola (p = 0.015) was associated with significantly improved PFS. Conversely, increased abundance of S. salivarius (p = 0.0389), S. vestibularis (p = 0.0349), and B. breve (p = 0.0022) was associated with significantly shortened PFS (). According to KM analysis, B. adolescentis and A. dispar showed no significant difference in PFS (SFig 6B-C). Results of Cox regression for key phyla and genera are shown in STable 9. Lasso regression underpinned the negative predictive role of both Firmicutes and Actinobacteria, and Firmicutes were also identified as an independent negative predictor by multivariate Cox regression.

Metabolic Pathways and machine learning approach

In a sub-cohort of 37 patients, metagenome pathways were analyzed from the MetaCyc Metabolic Pathway Database. A total of 73 individual Metacyc Pathways were identified. First, Metacyc pathways were classified into 16 “Metacyc SuperPathways” to capture the essential metabolic routes in the bacterial domain (SFigure 7). The stacked bar chart () shows the distribution of SuperPathways in every patient, where only a limited fraction (11.23%) of high-quality reads were classified as unmapped or unintegrated.

Figure 7. Metagenome Pathways and machine learning approach. Abundance-distribution of Metacyc Superpathways per patient is shown in a stacked bar chart, where X axis shows patient IDs grouped according to PFS and Y axis represents the relative abundance of the correspondent Superpathway, color coded (A). Patients were grouped as long vs short PFS, PD-L1-high vs low and CHT-treated vs CHT-naive, as previously described. None of the superpathways showed significant difference according to PFS (B), whereas Nucleoside and Nucleotide degradation, Amino acid biosynthesis and Fermentation superpathways were significantly more abundant in PD-L1 low patients (compared to PD-L1 high patients, C). Regarding chemotherapy, chemo-naive (0) patients showed significantly increased abundance of the Glycolysis superpathway and significantly decreased abundance of the Amino acid degradation and Carbohydrate biosynthesis superpathways (compared to CHT-treated patients, D). Multiple individual Metacyc pathway are differentially abundant between different patient groups, including Adenosine Ribonucleotide de novo Biosynthesis (p = 0.024), CDP-diacylglycerol Biosynthesis I-II (p = 0.023) according to PFS (E); Guanosine Ribonucleotide de novo Biosynthesis (p = 0.037), Inosine 5’ Phosphate Degradation (p = 0.015) L-histidine Degradation III (p = 0.034), Pyruvate Fermentation to Isobutanol (p = 0.008), L-valine Biosynthesis (p = 0.008), GDP-mannose Biosynthesis (p = 0.002) and Anaerobic Gondoate Biosynthesis (p = 0.004) according to PD-L1 expression (E) and GDP-mannose Biosynthesis (p = 0.023), Adenosine- and Guanosine Deoxyribonucleotides de novo Biosynthesis II (p = 0.02, respectively), Superpathway of Guanosine Nucleotides de novo Biosynthesis II (p = 0.019), Glycolysis IV (p = 0.035) and CDP-diacylglycerol Biosynthesis I-II (p = 0.026, respectively) according to Chemotherapy-regime (E). 5-fold cross validation performed on Random Forest (RF) machine learning models show that PFS is best predicted with key species (AUC: 0.74; Recall: 0.64; F1: 0.56, F) and PD-L1 expression is best predicted with pathways (AUC: 0.88; Recall: 0.87; F1: 0.82, F’). The model fitted fairly to the prediction of first-line (CHT-treated) or subsequent-line (CHT-treated) IT for both key species (AUC: 0.76; Recall: 0.86; F1: 0.82, F”) and pathways (AUC: 0.8; Recall: 1; F1: 0.88, F”). Metric data are shown as mean and corresponding standard deviation (SD). Statistical significance *P < 0.05; **P < 0.01, ***P<.001.

Figure 7. Metagenome Pathways and machine learning approach. Abundance-distribution of Metacyc Superpathways per patient is shown in a stacked bar chart, where X axis shows patient IDs grouped according to PFS and Y axis represents the relative abundance of the correspondent Superpathway, color coded (A). Patients were grouped as long vs short PFS, PD-L1-high vs low and CHT-treated vs CHT-naive, as previously described. None of the superpathways showed significant difference according to PFS (B), whereas Nucleoside and Nucleotide degradation, Amino acid biosynthesis and Fermentation superpathways were significantly more abundant in PD-L1 low patients (compared to PD-L1 high patients, C). Regarding chemotherapy, chemo-naive (0) patients showed significantly increased abundance of the Glycolysis superpathway and significantly decreased abundance of the Amino acid degradation and Carbohydrate biosynthesis superpathways (compared to CHT-treated patients, D). Multiple individual Metacyc pathway are differentially abundant between different patient groups, including Adenosine Ribonucleotide de novo Biosynthesis (p = 0.024), CDP-diacylglycerol Biosynthesis I-II (p = 0.023) according to PFS (E); Guanosine Ribonucleotide de novo Biosynthesis (p = 0.037), Inosine 5’ Phosphate Degradation (p = 0.015) L-histidine Degradation III (p = 0.034), Pyruvate Fermentation to Isobutanol (p = 0.008), L-valine Biosynthesis (p = 0.008), GDP-mannose Biosynthesis (p = 0.002) and Anaerobic Gondoate Biosynthesis (p = 0.004) according to PD-L1 expression (E) and GDP-mannose Biosynthesis (p = 0.023), Adenosine- and Guanosine Deoxyribonucleotides de novo Biosynthesis II (p = 0.02, respectively), Superpathway of Guanosine Nucleotides de novo Biosynthesis II (p = 0.019), Glycolysis IV (p = 0.035) and CDP-diacylglycerol Biosynthesis I-II (p = 0.026, respectively) according to Chemotherapy-regime (E). 5-fold cross validation performed on Random Forest (RF) machine learning models show that PFS is best predicted with key species (AUC: 0.74; Recall: 0.64; F1: 0.56, F) and PD-L1 expression is best predicted with pathways (AUC: 0.88; Recall: 0.87; F1: 0.82, F’). The model fitted fairly to the prediction of first-line (CHT-treated) or subsequent-line (CHT-treated) IT for both key species (AUC: 0.76; Recall: 0.86; F1: 0.82, F”) and pathways (AUC: 0.8; Recall: 1; F1: 0.88, F”). Metric data are shown as mean and corresponding standard deviation (SD). Statistical significance *P < 0.05; **P < 0.01, ***P<.001.

According to PFS, none of the SuperPathways showed a significant difference in long- vs. short-term survival (). In contrast, SuperPathways Nucleoside and Nucleotide degradation (p = 0.012), Amino Acid Biosynthesis (p = 0.023), and Fermentation (p = 0.007) were significantly more abundant in PD-L1-low expressor patients (compared to PD-L1-high expressors, ). Patients were also assessed according to CHT treatment. There was an increased abundance of Glycolysis (p = 0.032) in CHT-naive patients, whereas there were increased abundances of Amino Acid Degradation (p = 0.035) and Carbohydrate Biosynthesis (p = 0.044) in CHT-treated patients (). Among individual MetaCyc Pathways, patients with short PFS exhibited a significantly increased abundance of Adenosine Ribonucleotide de novo Biosynthesis (p = 0.024). In contrast, CDP-diacylglycerol Biosynthesis I-II are significantly more abundant in patients with long PFS (p = 0.023) (). There was a more dominant shift in pathway representation according to PD-L1 expression and CHT-regime, where 7–7 pathways showed a significant difference in distinct patient groups, respectively ().

Next, a Random Forest (RF) model was built, where 5K-fold cross-validation was used to evaluate the performance of key species and Metacyc Pathway abundances in predicting short vs. long PFS, high vs. low PD-L1 expression, and CHT-treatment or naivete. PFS was best predicted with key species (AUC = 0.74), whereas pathways alone (AUC = 0.63) or combined with taxonomy (AUC = 0.63) gave a poor performance in predicting PFS (). In contrast, PD-L1 expression was best predicted with pathways (AUC = 0.87) and gave a lower performance for species (AUC = 0.63) (’). The machine learning algorithm predicted the CHT-regime with a good performance using pathways (AUC = 0.8) and with a fair performance using species alone or with species combined with pathways (AUC = 0.76, respectively) (”). These results suggest that the taxonomic profile is best suited to predict long or short PFS, while the pathway profile predicts better the PD-L1 phenotype of patients.

Discussion

The reasons behind the divergent response rates to cancer IT are still poorly understood, and even today, PD-L1 represents the most widely used predictive biomarker in clinics. Therefore, our study analyzed a large-scale, real-life cross-sectional cohort to uncover the theranostic role of the gut microbiome in NSCLC patients. To the best of our knowledge, our study is among the first to provide insights into the specific interactions of the gut microbiome with CHT and IT by analyzing CHT-naïve and CHT-pretreated patients. Our most important findings include that alpha-diversity is not associated with IT response, CHT treatment, or PD-L1 status in NSCLC; in contrast, beta-diversity shows significant association with PFS on IT. Phyla Firmicutes, Actinobacteria, and species Streptococcus salivarius, Streptococcus vestibularis, and Bifidobacterium breve are overrepresented in patients with short PFS. In contrast, Alistipes shahii and Alistipes finegoldii are overrepresented in patients with long PFS.

Others showed that a higher gut microbiota alpha diversity was associated with increased response rates to ICI therapy and improved PFSCitation16,Citation17. In contrast, we found no significant difference in alpha-diversity in any comparison, including PFS, CHT-treatment, and PD-L1 phenotype. However, beta-diversity significantly differed between patients with short and long PFS, as in the case of other NSCLCCitation16,Citation17 and melanoma studiesCitation9–11. Moreover, we detected a significant gut microbial compositional difference between CHT-treated and CHT-naive patients that underlines the effect of CHT on the gut microbiome. It is noteworthy that bacterial richness and alpha-diversity are parameters that are difficult to assess in the routine clinical practice, whereas detecting or quantifying individual species is cost-effective. Thus, we identified specific indicator species that could predict outcomes with high reliability and we established a hierarchical model stratifying taxa based on the extent of their predictive role regarding long or short PFS. Notably, apart from univariate statistical tests, we have also implemented various multivariate statistical models, such as Lasso- and Cox proportional hazard regression, to define the predictive relevance of a particular taxon based on solid evidence.

Accordingly, we analyzed the differences regarding major bacterial phyla and found that Firmicutes and Actinobacteria were associated with short PFS. The Firmicutes/Bacteroidetes (F/B) ratio is widely accepted to play an important role in maintaining normal intestinal homeostasis and is impaired in dysbiosisCitation34,Citation35. To our knowledge, this is the first NSCLC study to confirm that the F/B ratio is significantly increased in patients with short PFS. At the species level, multivariate analysis and validation on an additional cohort unequivocally confirmed the positive predictive role (performance to predict long PFS) of Alistipes shahii and Alistipes finegoldii and the negative predictive role (performance to predict short PFS) of Bifidobacterium breve, Streptococcus Salivarius, and Streptococcus vestibularis. A highlighted finding of our analysis is the consistent association of Streptococci with short PFS, where both the genus and multiple species show a stringent association with worse outcomes. In a mixed cohort of epithelial tumors, including NSCLC and renal cell carcinoma (RCC) specimens, Routy and colleaguesCitation20 associated taxa Akkermansia muciniphila and Alistipes with better- and Bifidobacteria with worse clinical outcomes, using shotgun metagenomics, similar to the current study. We used the latest version of the KRAKEN reference database that enabled us to discover more extensively the potential predictive role of rare- and low abundance taxa, including an array of Streptococcus species (S. gordonii, S. equinus, and S. anginosus), whose association with negative outcomes was also confirmed by multivariate Cox regression.

Two other NSCLC studies have interpreted the gut microbiome in the context of ICI-response so far. According to Jin et al., Alistipes putredinis, Bifidobacterium longum, and Prevotella copri were associated with better outcomes, while unclassified Ruminococcus was linked with impaired ICI-responseCitation16. In contrast, Zhang et al. identified Desulfovibrio, Actinomycetales, Bifidobacterium, Anaerostipes, Faecalibacterium, and Alistipes as overrepresented in responder patients and different Fusobacteria in non-respondersCitation17. However, none of these studies interpreted the taxonomic profiles at the species level. Both research groups assessed the gut microbiome in an East-Asian patient cohort, whose baseline gut commensal flora significantly differed from CaucasiansCitation36. Other possible explanations for the discrepancies include the different sample collection frame times, reference databases (Metaphlan vs. KRAKEN), and study endpoints. It is important to highlight that previous NSCLC studies interpreting alpha-diversity and taxonomic profiles according to ICI-responseCitation16,Citation17 used 16S rRNA sequencing works with distinct reference datasets and limited taxonomical scope compared to shotgun metagenomics.

Next, we showed that the network of bacterial communities is more interconnected in patients with long PFS, with stronger correlations between taxa. S. vestibularis, B. uniformis, B. faecalis, A. shahii, and B. visceriola were the central hubs in the community network. Our study is unique in terms of first interpreting taxonomic differences according to CHT treatment or naivete in the context of ICI efficacy. While multiple taxa show differential abundance between the CHT-naive and CHT-treated population, CHT-treatment does not shift the microbiome into a more short- or long PFS-like profile. The fact that differentially abundant taxa (in the context of PFS on ICI) in the CHT-treated patient subgroup and in the whole cohort significantly overlap underpins the principle that CHT pretreatment is not detrimental to the microbiome concerning ICI efficacy. CHT-pretreated patients exhibit slightly worse outcomes (vs. CHT-naive patients), possibly due to their tumor’s lower PD-L1 expression. Further studies are needed to confirm these findings.

We revealed that treatment toxicity is not associated with an extensive shift in the microbiome. Prevotella dentalis is the only species with a marked increase in patients with ICI-related trAEs. There is still a contradiction about whether AB treatment significantly affects ICI efficacy. While Pinato and colleagues claim that AB therapy reduced response rates prior to, but not during ICI treatmentCitation26, others reported the opposite or the lack of significance in multivariate analysisCitation37,Citation38. Our findings indicate a noticeable change in microbial signature, with the substantial alteration experienced in Eggerthellas; specifically, E. lentha, and no taxa associated with short PFS show significantly increased abundance in the AB-treated group. So far, only preclinical studies have reported the shift in the intestinal microbiome due to systemic corticosteroid treatmentCitation27,Citation39. We found only a minor bacterial conversion in steroid-treated patients with an increased abundance of Proteobacteria, including Escherichia and E.coli. In contrast, we observed a notable increase of short PFS-associated Streptococcus species in patients treated with antacids. This is supported by findings of a recent meta-analysis claiming that PPI medication disrupts the gut microbiome and worsens the outcomes of ICI therapyCitation40.

Analyzing functional metabolic profiles of the gut microbiome is an outstanding advantage of shotgun metagenomics compared to 16S RNA sequencing. SuperPathways Nucleoside and Nucleotide degradation, Amino Acid Biosynthesis, and Fermentation were significantly increased in PD-L1 low patients, whereas Amino Acid Degradation and Carbohydrate Biosynthesis were significantly increased in CHT-treated patients, and Glycolysis was significantly increased in CHT-naive patients. When analyzing taxonomic profiles, we revealed that Euryarchaeota phylumCitation41 and its prominent species Methanobrevibacter smithii showed a strong association with PD-L1 expression being overrepresented in low expressors, which might be connected to the fact that there was an increased rate of fermentationCitation41 in PD-L1-low patients.

Our microbiome-based machine learning approach integrated metagenome data to overcome limitations of previously used traditional tumor biomarkers and test more complex bacterial microbial associations. Lately, multiple machine learning-based biomarker study was published on radiographic imaging, proteomics, and transcriptomics to predict response to ICICitation42,Citation43 or showing a network-based method to select ICI-response-associated transcriptomic biomarkers effectivelyCitation44. Advanced machine learning models in the future can help researchers to identify complex microbial signatures that might help to select patients for ICI therapy. These models outperformed traditional biomarkers and have made robust machine learning-based personalized predictions even with fewer case number training sets compared to predictions based on other conventional ICI treatment biomarkers. Our Random Forest model confirmed that predicting PD-L1 phenotype has a higher accuracy using pathways. In contrast, taxonomic profiles performed better when predicting short vs. long PFS.

The strengths of our study include the short turnaround time of sample collection, the careful attention to related methods using the latest and most comprehensive databases, and the measurement of multiple confounding factors associated with host genetics and exposures. PFS-associated taxa are presented in a hierarchical manner supported by multivariate testing, a novel approach in our research. Limitations of our study include that we cannot assess whether altered microbiota contribute to or exist as a consequence of disease. However, this is a widespread issue in the field currently. Therefore, we are cautious about interpreting our results and encourage further studies with a larger sample size. Moreover, our Validation cohort included significantly more patients treated with ICI subsequent line than the Discovery cohort, which included more ICI first-line administration. Additionally, we did not have next-generation sequencing performed on patients’ tumor tissue samples to be able to discern the potential impact of tumor mutation burden (TMB). However, unlike PD-L1, TMB has not been confirmed to be predictive for OS in NSCLC.Citation45

Conclusion

Our study shows the complexity of the gut microbiome, a highly diverse system, in NSCLC. We were able to define an outcome-related common gut microbial signature with internal cross-validation and using an additional cohort. Multiple Streptococcus and Bifidobacteria species showed a stringent association with impaired ICI efficacy, whereas the presence of Alistipes and Barnesiella was linked to better outcomes. The machine learning approach revealed that the PD-L1 phenotype is best predicted from metabolic pathways, while the taxonomic profile is more suited to predict outcomes. To our knowledge, this is the first comprehensive data in NSCLC using metagenomic visualization tools to extract knowledge efficiently to support further studies in the field.

Author’s contributions

David Dora: Conceptualization, formal analysis, investigation, data curation, visualization, supervision, funding acquisition, writing-original draft, writing – review and editing

Balazs Ligeti data curation, formal analysis, investigation, visualization, funding acquisition

Tamas Kovacs formal analysis, investigation, visualization

Peter Revisnyei: data curation, formal analysis, visualization

Gabrielly Galffy: data curation, resources

Edit Dulka: data curation, resources

Daniel Krizsán: data curation, formal analysis

Regina Kalcsevszki: data curation, formal analysis

Zsolt Megyesfalvi: formal analysis, validation, writing – review and editing

Balazs Dome: investigation, validation, supervision, resources, funding acquisition, writing – review and editing

Glen J. Weiss: validation, supervision, resources, writing – review and editing

Zoltan Lohinai: Conceptualization, investigation, supervision, resources, funding acquisition, writing-original draft, writing – review and editing

Consent for publication

All authors have reviewed and approved the final version of the manuscript.

Ethics approval and consent to participate

This study was carried out under the World Medical Association’s Helsinki Declaration study criteria. The national ethics commission officially accepted the study (Hungarian Scientific and Research Ethics Committee of the Medical Research Council (ETTTUKEB - 50,302-2/2017/EKU)). All patients who participated in/were recruited for the study gave their written consent. After data collection, patient-IDs were removed so none of the included individuals can be recognized directly or indirectly.

List of Abbreviations

PD-L1=

programmed death – ligand 1

PD-1=

programmed death 1

ICI=

immune checkpoint inhibitor

CHT=

chemotherapy

NSCLC=

non-small cell lung cancer

IT=

Immunotherapy

IHC=

Immunohistochemistry

PFS=

progression-free survival

trAE=

treatment – related adverse events

AB=

antibiotic

CHT+IO=

chemotherapy – Immunotherapy combination

CRT+IO=

chemoradiation followed by Immunotherapy combination

UMAP=

Uniform Manifold Approximation and Projection

CLR=

centred-log ratio

AUC=

area under the curve

KN=

Kaplan Meier

RF=

Random Forest

Supplemental material

Supplemental Material

Download MS Word (32.6 MB)

Acknowledgments

The team thanks Emese Bato for her assistance in processing metagenomic sequencing data using the KRAKEN database.

Disclosure statement

GJW: is a current employee of SOTIO Biotech Inc., a former employee of Unum Therapeutics; reports personal fees from Imaging Endpoints II, MiRanostics Consulting, Paradigm, International Genomics Consortium, Angiex, GLG Council, Guidepoint Global, Genomic Health, Oncocare, Rafael Pharmaceuticals, Gossamer Bio, and SPARC, Harvest Integrated Research Organization-all outside this submitted work; has ownership interest in Unum Therapeutics (now Cogent Biosciences), MiRanostics Consulting, Exact Sciences, Moderna, Agenus, Aurinia Pharmaceuticals, and Circulogene-outside the submitted work; and has issued patents: PCT/US2008/072787, PCT/US2010/043777, PCT/US2011/020612, and PCT/US2011/037616-all outside the submitted work.

Data availability statement

The authors confirm that the data supporting the findings of this study are available within the article [and/or] its supplementary materials. Unprocessed and raw data that support the findings of this study are available from the corresponding author, [ZL], upon reasonable request.

Supplementary material

Supplemental data for this article can be accessed online at https://doi.org/10.1080/2162402X.2023.2204746

Correction Statement

This article has been republished with minor changes. These changes do not impact the academic content of the article.

Additional information

Funding

ZL acknowledges funding from the Hungarian National Research, Development and Innovation Office (#124652, #129664 and #128666). ZL received funding from the 2018 LCFA-BMS/IASLC Young Investigator Scholarship Award. DD acknowledges funding from the Hungarian National Research, Development and Innovation Office (#142287) and was supported by the Hungarian Academy of Sciences (Bolyai Scientific Fellowship). BL acknowledges funding from the Hungarian National Research, Development and Innovation Office (#138055) and from the Thematic Excellence Program (TKP2020-NKA-11). BD and ZM acknowledge funding from the Hungarian National Research, Development and Innovation Office (KH130356 to BD; 2020-1.1.6-JÖVŐ, TKP2021-EGA-33 and FK-143751 to BD and ZM). BD was also supported by the Austrian Science Fund (FWF I3522, FWF I3977 and I4677). ZM was supported by the UNKP-20-3 and UNKP-21-3 New National Excellence Program of the Ministry for Innovation and Technology of Hungary and by the Hungarian Respiratory Society (MPA #2020). ZM is also a recipient of the IASLC/ILCF Young Investigator Grant 2022. 

References

  • Gettinger S, Horn L, Jackman D, Spigel D, Antonia S, Hellmann M, Powderly J, Heist R, Sequist LV, Smith DC, et al. Five-year follow-up of nivolumab in previously treated advanced non–small-cell lung cancer: results from the CA209-003 Study. J Clin Oncol. 2018 Jun 10;36(17):1675–15. 10.1200/JCO.2017.77.0412. Epub 2018 Mar 23. PMID: 29570421.
  • Garon EB, Hellmann MD, Rizvi NA, Carcereny E, Leighl NB, Ahn MJ, Eder JP, Balmanoukian AS, Aggarwal C, Horn L, et al. Five-year overall survival for patients with advanced non‒small-cell lung cancer treated with Pembrolizumab: results from the phase I KEYNOTE-001 study. J Clin Oncol. 2019 Oct 1;37(28):2518–2527. doi: 10.1200/JCO.19.00934. Epub 2019 Jun 2. PMID: 31154919; PMCID: PMC6768611.
  • Chen G, Emens LA Chemoimmunotherapy: reengineering tumor immunity. Cancer Immunol Immunother. 2013 Feb;62(2):203–216. Epub 2013 Feb 7. PMID: 23389507; PMCID: PMC3608094. doi:10.1007/s00262-012-1388-0.
  • Zitvogel L, Ayyoub M, Routy B, Kroemer G. Microbiome and anticancer Immunosurveillance. Cell. 2016 Apr 7;165(2):276–287. 10.1016/j.cell.2016.03.001. PMID: 27058662.
  • de Kivit S, Tobin MC, Forsyth CB, Keshavarzian A, Landay AL. Regulation of intestinal immune responses through TLR activation: implications for pro- and prebiotics. Front Immunol. 2014 Feb 18;5:60. 10.3389/fimmu.2014.00060. PMID: 24600450; PMCID: PMC3927311.
  • Boesch M, Baty F, Rothschild SI, Tamm M, Joerger M, Früh M, Brutsche MH Tumour neoantigen mimicry by microbial species in cancer immunotherapy. Br J Cancer. 2021 Aug;125(3):313–323. Epub 2021 Apr 6. PMID: 33824481; PMCID: PMC8329167. doi:10.1038/s41416-021-01365-2.
  • Yi M, Yu S, Qin S, Liu Q, Xu H, Zhao W, Chu Q, Wu K. Gut microbiome modulates efficacy of immune checkpoint inhibitors. J Hematol Oncol. 2018 Mar 27;11(1):47. doi: 10.1186/s13045-018-0592-6. PMID: 29580257; PMCID: PMC5870075.
  • Heshiki Y, Vazquez-Uribe R, Li J, Ni Y, Quainoo S, Imamovic L, Li J, Sørensen M, Chow BKC, Weiss GJ, et al. Predictable modulation of cancer treatment outcomes by the gut microbiota. Microbiome. 2020 Mar 5;8(1):28. 10.1186/s40168-020-00811-2. PMID: 32138779; PMCID: PMC7059390.
  • Gopalakrishnan V, Spencer CN, Nezi L, Reuben A, Andrews MC, Karpinets TV, Prieto PA, Vicente D, Hoffman K, Wei SC, et al. Gut microbiome modulates response to anti–PD-1 immunotherapy in melanoma patients. Science. 2018 Jan 5;359(6371):97–103. 10.1126/science.aan4236. Epub 2017 Nov 2. PMID: 29097493; PMCID: PMC5827966.
  • Matson V, Fessler J, Bao R, Chongsuwat T, Zha Y, Alegre ML, Luke JJ, Gajewski TF. The commensal microbiome is associated with anti-PD-1 efficacy in metastatic melanoma patients. Science. 2018 Jan 5;359(6371):104–108. doi: 10.1126/science.aao3290. PMID: 29302014; PMCID: PMC6707353.
  • Peters BA, Wilson M, Moran U, Pavlick A, Izsak A, Wechter T, Weber JS, Osman I, Ahn J. Relating the gut metagenome and metatranscriptome to immunotherapy responses in melanoma patients. Genome Med. 2019 Oct 9;11(1):61. doi: 10.1186/s13073-019-0672-4. PMID: 31597568; PMCID: PMC6785875.
  • Limeta A, Ji B, Levin M, Gatto F, Nielsen J. Meta-analysis of the gut microbiota in predicting response to cancer immunotherapy in metastatic melanoma. JCI Insight. 2020 Dec 3;5(23):e140940. 10.1172/jci.insight.140940. PMID: 33268597; PMCID: PMC7714408.
  • Derosa L, Routy B, Fidelle M, Iebba V, Alla L, Pasolli E, Segata N, Desnoyer A, Pietrantonio F, Ferrere G, et al. Gut bacteria composition drives primary resistance to cancer immunotherapy in Renal Cell Carcinoma patients. Eur Urol. 2020 Aug;78(2):195–206. Epub 2020 May 4. PMID: 32376136. doi:10.1016/j.eururo.2020.04.044.
  • Xu X, Lv J, Guo F, Li J, Jia Y, Jiang D, Wang N, Zhang C, Kong L, Liu Y, et al. Gut microbiome influences the efficacy of PD-1 Antibody immunotherapy on MSS-Type Colorectal Cancer via Metabolic pathway. Front Microbiol. 2020 Apr 30;11:814. 10.3389/fmicb.2020.00814. PMID: 32425919; PMCID: PMC7212380.
  • Peng Z, Cheng S, Kou Y, Wang Z, Jin R, Hu H, Zhang X, Gong JF, Li J, Lu M, et al. The gut microbiome is associated with clinical response to Anti-PD-1/PD-L1 immunotherapy in Gastrointestinal cancer. Cancer Immunol Res. 2020 Oct;8(10):1251–1261. Epub 2020 Aug 27. PMID: 32855157. doi:10.1158/2326-6066.CIR-19-1014.
  • Jin Y, Dong H, Xia L, Yang Y, Zhu Y, Shen Y, Zheng H, Yao C, Wang Y, Lu S The diversity of gut microbiome is associated with favorable responses to Anti-programmed death 1 immunotherapy in Chinese patients with NSCLC. J Thorac Oncol. 2019 Aug;14(8):1378–1389. Epub 2019 Apr 23. PMID: 31026576. doi:10.1016/j.jtho.2019.04.007.
  • Zhang C, Wang J, Sun Z, Cao Y, Mu Z, Ji X Commensal microbiota contributes to predicting the response to immune checkpoint inhibitors in non-small-cell lung cancer patients. Cancer Sci. 2021 Aug;112(8):3005–3017. Epub 2021 Jun 23. PMID: 34028936; PMCID: PMC8353904. doi:10.1111/cas.14979.
  • Vetizou M, Pitt JM, Daillere R, Lepage P, Waldschmitt N, Flament C, Rusakiewicz S, Routy B, Roberti MP, Duong CP, et al. Anticancer immunotherapy by CTLA-4 blockade relies on the gut microbiota. Science. 2015;350(6264):1079–1084. Epub 2015 Nov 5. PMID: 26541610; PMCID: PMC4721659. doi:10.1126/science.aad1329.
  • Sivan A, Corrales L, Hubert N, Williams JB, Aquino-Michaels K, Earley ZM, Benyamin FW, Lei YM, Jabri B, Alegre ML, et al. Commensal Bifidobacterium promotes antitumor immunity and facilitates anti-PD-L1 efficacy. Science. 2015 Nov 27;350(6264):1084–1089. doi: 10.1126/science.aac4255. Epub 2015 Nov 5. PMID: 26541606; PMCID: PMC4873287.
  • Routy B, Le Chatelier E, Derosa L, Duong CPM, Alou MT, Daillère R, Fluckiger A, Messaoudene M, Rauber C, Roberti MP, et al. Gut microbiome influences efficacy of PD-1–based immunotherapy against epithelial tumors. Science. 2018 Jan 5;359(6371):91–97. 10.1126/science.aan3706. Epub 2017 Nov 2. PMID: 29097494.
  • Chulkina M, Beswick EJ, Pinchuk IV., Pinchuk IV. Role of PD-L1 in gut mucosa tolerance and chronic inflammation. Int J Mol Sci. 2020 Dec 1;21(23):9165. 10.3390/ijms21239165. PMID: 33271941; PMCID: PMC7730745.
  • Jang HJ, Choi JY, Kim K, Yong SH, Kim YW, Kim SY, Kim EY, Jung JY, Kang YA, Park MS, et al. Relationship of the lung microbiome with PD-L1 expression and immunotherapy response in lung cancer. Respir Res. 2021;22(1):322. doi:10.1186/s12931-021-01919-1.
  • Tseng JS, Yang TY, Wu CY, Ku WH, Chen KC, Hsu KH, Huang YH, Su KY, Yu SL, Chang GC. Characteristics and predictive value of PD-L1 status in real-world Non-Small Cell Lung Cancer patients. J Immunother. 2018;41(6):292–299. doi:10.1097/CJI.0000000000000226.
  • Reck M, Rodriguez-Abreu D, Robinson AG, Hui R, Csoszi T, Fulop A, Gottfried M, Peled N, Tafreshi A, Cuffe S, et al. Pembrolizumab versus Chemotherapy for PD-L1–Positive Non–Small-Cell Lung Cancer. N Engl J Med. 2016;375(19):1823–1833. doi:10.1056/NEJMoa1606774.
  • Liu W, Ma F, Sun B, Liu Y, Tang H, Luo J, Chen H, Luo Z. Intestinal microbiome associated with immune-related adverse events for patients treated with Anti-PD-1 inhibitors, a real-world study. Front Immunol. 2021 Dec 16;12:756872. 10.3389/fimmu.2021.756872. PMID: 34975845; PMCID: PMC8716485.
  • Pinato DJ, Howlett S, Ottaviani D, Urus H, Patel A, Mineo T, Brock C, Power D, Hatcher O, Falconer A, et al. Association of prior antibiotic treatment with survival and response to immune checkpoint inhibitor therapy in patients with cancer. JAMA Oncol. 2019 Dec 1;5(12):1774–1778. Erratum in: JAMA Oncol. 2020 Feb 1;6(2):302. PMID: 31513236; PMCID: PMC6743060. doi:10.1001/jamaoncol.2019.2785.
  • Huang EY, Inoue T, Leone VA, Dalal S, Touw K, Wang Y, Musch MW, Theriault B, Higuchi K, Donovan S, et al. Using corticosteroids to reshape the gut microbiome: implications for inflammatory bowel diseases. Inflamm Bowel Dis. 2015 May;21(5):963–972. PMID: 25738379; PMCID: PMC4402247. doi:10.1097/MIB.0000000000000332.
  • Imhann F, Vich Vila A, Bonder MJ, Lopez Manosalva AG, Koonen DPY, Fu J, Wijmenga C, Zhernakova A, Weersma RK. The influence of proton pump inhibitors and other commonly used medication on the gut microbiota. Gut Microbes. 2017 Jul 4;8(4):351–358. 10.1080/19490976.2017.1284732. Epub 2017 Jan 24. Erratum for: Addendum to: Imhann F, Bonder MJ, Vich Vila A, Fu J, Mujagic Z, Vork L, Tigchelaar EF, Jankipersadsing SA, Cenit MC, Harmsen HJ, Dijkstra G, Franke L, Xavier RJ, Jonkers D, Wijmenga C, Weersma RK, Zhernakova A. Proton pump inhibitors affect the gut microbiome. Gut 2016. PMID: 28118083; PMCID: PMC5570416.
  • Galloway-Peña JR, Smith DP, Sahasrabhojane P, Wadsworth WD, Fellman BM, Ajami NJ, Shpall EJ, Daver N, Guindani M, Petrosino JF, et al. Characterization oF oral and gut microbiome temporal variability in hospitalized cancer patients. Genome Med. 2017;9(1):21–1. doi:10.1186/s13073-017-0409-1.
  • Deng X, Li Z, Li G, Li B, Jin X, Lyu G. Comparison of microbiota in patients treated by surgery or chemotherapy by 16S rRNA sequencing reveals potential biomarkers for colorectal cancer therapy. Front Microbiol. 2018;9:1607. doi:10.3389/fmicb.2018.01607.
  • Andrews S. FastQC: a quality control tool for high throughput sequence data. 2010. Available online at: http://www.bioinformatics.babraham.ac.uk/projects/fastqc.
  • Wood DE, Lu J, Langmead B. Improved metagenomic analysis with Kraken 2. Genome Biol. 2019 Nov 28;20(1):257. doi: 10.1186/s13059-019-1891-0. PMID: 31779668; PMCID: PMC6883579.
  • McInnes L, Healy J, Melville J. Umap: uniform manifold approximation and projection for dimension reduction. (2018). arXiv preprint arXiv:1802.03426
  • Shen ZH, Zhu CX, Quan YS, Yang ZY, Wu S, Luo WW, Tan B, Wang XY. Relationship between intestinal microbiota and ulcerative colitis: mechanisms and clinical application of probiotics and fecal microbiota transplantation. World J Gastroenterol. 2018 Jan 7;24(1):5–14. doi: 10.3748/wjg.v24.i1.5. PMID: 29358877; PMCID: PMC5757125.
  • Ley RE, Bäckhed F, Turnbaugh P, Lozupone CA, Knight RD, Gordon JI. Obesity alters gut microbial ecology. Proc Natl Acad Sci U S A. 2005 Aug 2;102(31):11070–11075. doi: 10.1073/pnas.0504978102. Epub 2005 Jul 20. PMID: 16033867; PMCID: PMC1176910.
  • Ang QY, Alba DL, Upadhyay V, Bisanz JE, Cai J, Lee HL, Barajas E, Wei G, Noecker C, Patterson AD, et al. The East Asian gut microbiome is distinct from colocalized white subjects and connected to metabolic health. Elife. 2021 Oct 7;10:e70349. PMID: 34617511; PMCID: PMC8612731. doi:10.7554/eLife.70349.
  • Khan U, Ho K, Hwang EK, Peña C, Brouwer J, Hoffman K, Betel D, Sonnenberg GF, Faltas B, Saxena A, et al. Impact of use of antibiotics on response to immune checkpoint inhibitors and tumor microenvironment. Am J Clin Oncol. 2021 Jun 1;44(6):247–253. doi: 10.1097/COC.0000000000000813. PMID: 33826550.
  • Hakozaki T, Okuma Y, Omori M, Hosomi Y Impact of prior antibiotic use on the efficacy of nivolumab for non-small cell lung cancer. Oncol Lett. 2019 Mar;17(3):2946–2952. Epub 2019 Jan 8. PMID: 30854072; PMCID: PMC6365976. doi:10.3892/ol.2019.9899.
  • Zhang J, Feng D, Law HK, Wu Y, Zhu GH, Huang WY, Kang Y, Claesen J. IntegrativE analysis of gut microbiota and fecal metabolites in rats after Prednisone treatment. Microbiol Spectr. 2021 Dec 22;9(3):e0065021. 10.1128/Spectrum.00650-21. Epub 2021 Nov 10. PMID: 34756061; PMCID: PMC8579919.
  • Chen B, Yang C, Dragomir MP, Chi D, Chen W, Horst D, Calin GA, Li Q. Association of proton pump inhibitor use with survival outcomes in cancer patients treated with immune checkpoint inhibitors: a systematic review and meta-analysis. Ther Adv Med Oncol. 2022 Jul 15;14:17588359221111703. 10.1177/17588359221111703. PMID: 35860836; PMCID: PMC9290095.
  • Horz HP, Conrads G. The discussion goes on: what is the role of Euryarchaeota in humans? Archaea. Dec 30 2010;2010:967271. PMID: 21253553; PMCID: PMC3021867. doi:10.1155/2010/967271.
  • Yin X, Liao H, Yun H, Lin N, Li S, Xiang Y, Ma X. Artificial intelligence-based prediction of clinical outcome in immunotherapy and targeted therapy of lung cancer. Semin Cancer Biol. 2022 Nov;86(Pt 2):146–159. doi:10.1016/j.semcancer.2022.08.002.
  • Jin W, Luo Q. When artificial intelligence meets PD-1/PD-L1 inhibitors: population screening, response prediction and efficacy evaluation. Comput Biol Med. 2022 Jun;145:105499 . Epub 2022 Apr 7. PMID: 35439641. doi:10.1016/j.compbiomed.2022.105499.
  • Kong J, Ha D, Lee J, Kim I, Park M, Im SH, Shin K, Kim S. Network-based machine learning approach to predict immunotherapy response in cancer patients. Nat Commun. 2022 Jun 28;13(1):3703. doi:10.1038/s41467-022-31535-6.
  • Addeo A, Friedlaender A, Banna GL, Weiss GJ. TMB or not TMB as a biomarker: that is the question. Crit Rev Oncol Hematol. 2021 Jul;163:103374 . Epub 2021 Jun 2. PMID: 34087341. doi:10.1016/j.critrevonc.2021.103374.