Get Data

This analysis is based almost entirely on the code in notebooks 09 and 11.

We have latent variable (LV) expression analysis data:
- Latent Variable Feather File

We are also using variant data for the tumor samples that have it (cNFs, pNFs, MPNSTs):
- Exome-Seq variants
- WGS Variants

Let’s see if there are any LVs that split based on gene variant. Because we’re having trouble scaling with the number of latent variables, I only look at variants that occur in less than 1% of the population (the code below filters on gnomAD_AF < 0.01). Note that this differs from notebook 11.

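# Setup assumed from the calls below; the rendered notebook does not show it.
library(synapser)   # synTableQuery
library(dplyr)      # select, filter, group_by, %>%
library(tidyr)      # spread, gather
library(ggplot2)    # plots
synLogin()

wgs.vars=synTableQuery("SELECT Hugo_Symbol,Protein_position,specimenID,IMPACT,FILTER,ExAC_AF,gnomAD_AF FROM syn20551862")$asDataFrame()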
exome.vars=synTableQuery("SELECT Hugo_Symbol,Protein_position,specimenID,IMPACT,FILTER,ExAC_AF,gnomAD_AF FROM syn20554939")$asDataFrame()
all.vars<-rbind(select(wgs.vars,'Hugo_Symbol','Protein_position','specimenID','IMPACT','gnomAD_AF'),
    select(exome.vars,'Hugo_Symbol','Protein_position','specimenID','IMPACT','gnomAD_AF'))%>%
  subset(gnomAD_AF<0.01)
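
One side effect of the filter above is worth keeping in mind: `subset()` drops rows where the condition evaluates to `NA`, so any variant with no gnomAD allele frequency is removed together with the common ones. A minimal sketch on invented values (the toy data frame and its contents are hypothetical, not taken from the Synapse tables):

```r
# Toy variant table; the values are invented for illustration only.
toy.vars <- data.frame(
  Hugo_Symbol = c("NF1", "SUZ12", "EED", "CDKN2A"),
  gnomAD_AF   = c(0.0002, 0.2, NA, 0.009),
  stringsAsFactors = FALSE
)

# Same filter as above: an NA comparison counts as FALSE, so EED is dropped.
rare.only <- subset(toy.vars, gnomAD_AF < 0.01)

# Alternative that also keeps variants with no gnomAD frequency.
rare.or.unseen <- subset(toy.vars, is.na(gnomAD_AF) | gnomAD_AF < 0.01)

print(rare.only$Hugo_Symbol)
print(rare.or.unseen$Hugo_Symbol)
```

Whether variants absent from gnomAD should count as rare is a design choice; the sketch only shows that the two filters differ.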


#fn <- tempfile(pattern = "", fileext = ".feather")
#download.file('https://github.com/Sage-Bionetworks/nf-lv-viz/raw/master/data/filt_nf_mp_res.feather', destfile = fn)
mp_res<-synTableQuery("SELECT * FROM syn21046991")$asDataFrame()%>%
  filter(isCellLine != "TRUE")%>%
  select(latent_var,id,value,specimenID,tumorType,modelOf,diagnosis)

Merge data together

For the purposes of this analysis we want only those samples with genomic data and only those latent variables that are highly variable across those samples (standard deviation above 0.025).

samps<-intersect(mp_res$specimenID,all.vars$specimenID)

mp_res<-mp_res%>%
  subset(specimenID%in%samps)%>%
  group_by(latent_var) %>%
  mutate(sd_value = sd(value)) %>%
  filter(sd_value > 0.025) %>%
  ungroup()
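
As a small illustration of the variability filter above, here is the same `group_by`/`mutate`/`filter` pattern on a toy table. The latent variable names and scores are invented; only the 0.025 cutoff comes from the code above.

```r
library(dplyr)

# Toy LV scores; names and values are invented for illustration.
toy.lv <- data.frame(
  latent_var = rep(c("LV A", "LV B"), each = 4),
  specimenID = rep(paste0("s", 1:4), times = 2),
  value      = c(0.10, 0.50, 0.90, 0.30,   # LV A varies across samples
                 0.20, 0.21, 0.20, 0.21),  # LV B is nearly constant
  stringsAsFactors = FALSE
)

variable.lv <- toy.lv %>%
  group_by(latent_var) %>%
  mutate(sd_value = sd(value)) %>%   # one standard deviation per latent variable
  filter(sd_value > 0.025) %>%       # same cutoff as above
  ungroup()

print(unique(variable.lv$latent_var))
```

Because the standard deviation is computed after restricting to the shared samples, a latent variable that varies only among samples without genomic data would be dropped.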

Retrieve Variant Data

Let’s join the LV scores to the variant calls and, for each tumor type, count the genes that are mutated in more than one sample but in fewer than all but one of the samples.

data.with.var<-mp_res%>%subset(specimenID%in%samps)%>%
  left_join(all.vars,by='specimenID')

tab<-subset(data.with.var,!tumorType%in%c('Other','High Grade Glioma','Low Grade Glioma'))

top.genes=tab%>%
  group_by(tumorType)%>%
  mutate(numSamps=n_distinct(specimenID))%>%    #samples per tumor type
  group_by(tumorType,Hugo_Symbol)%>%
  mutate(numMutated=n_distinct(specimenID))%>%  #samples with a variant in this gene
  ungroup()%>%
  subset(numMutated>1)%>%                       #drop genes seen in a single sample
  subset(numMutated<(numSamps-1))%>%            #drop genes mutated in (nearly) every sample
  select(tumorType,Hugo_Symbol,numSamps,numMutated)%>%
  distinct()

gene.count=top.genes%>%group_by(tumorType)%>%mutate(numGenes=n_distinct(Hugo_Symbol))%>%select(tumorType,numGenes)%>%distinct()

DT::datatable(gene.count)
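
To make the two gene filters concrete, the sketch below runs the same counting logic on a toy tumor type with four samples. The gene names and calls are invented; with four samples, only genes mutated in exactly two samples satisfy both `numMutated > 1` and `numMutated < numSamps - 1`.

```r
library(dplyr)

# Toy variant calls for one tumor type with 4 samples (invented data).
toy.calls <- data.frame(
  tumorType   = "cNF",
  specimenID  = c("s1", "s2", "s3", "s4",   # GENE1: mutated in all 4 samples
                  "s1", "s2",               # GENE2: mutated in 2 samples
                  "s1",                     # GENE3: mutated in 1 sample
                  "s2", "s3", "s4"),        # GENE4: mutated in 3 samples
  Hugo_Symbol = c(rep("GENE1", 4), rep("GENE2", 2), "GENE3", rep("GENE4", 3)),
  stringsAsFactors = FALSE
)

toy.top <- toy.calls %>%
  group_by(tumorType) %>%
  mutate(numSamps = n_distinct(specimenID)) %>%
  group_by(tumorType, Hugo_Symbol) %>%
  mutate(numMutated = n_distinct(specimenID)) %>%
  ungroup() %>%
  filter(numMutated > 1, numMutated < (numSamps - 1)) %>%
  select(tumorType, Hugo_Symbol, numSamps, numMutated) %>%
  distinct()

print(toy.top)
```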

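Test significance of each gene/latent variable

Now we can test every latent variable and gene, pooling across tumor types: a Welch t-test (the default for t.test) compares the LV scores of mutated and wild-type samples.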

red.genes<-c("NF1","SUZ12","CDKN2A","EED")##for testing

vals<-tab%>%#subset(Hugo_Symbol%in%red.genes)%>%
    mutate(mutated=ifelse(is.na(IMPACT),'WT','Mutated'))%>%
  select(latent_var,tumorType,value,Hugo_Symbol,specimenID,mutated)%>%
  distinct()%>%
  spread(key=Hugo_Symbol,value='mutated',fill='WT')

counts<-vals%>%
  gather(key=gene,value=status,-c(latent_var,tumorType,value,specimenID))%>% 
    select(latent_var,tumorType,value,gene,specimenID,status)%>%
    group_by(latent_var,tumorType,gene)%>%
    mutate(numVals=n_distinct(status))%>%
    subset(numVals==2)%>%ungroup()
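
The `spread`/`gather` round trip above exists to give every sample an explicit status for every gene: the joined table only has rows for variants that were called, so wild-type pairs are missing until `fill='WT'` creates them. A minimal sketch on invented data (the toy table is hypothetical):

```r
library(dplyr)
library(tidyr)

# Toy joined table: one row per sample/gene with a variant call (invented).
toy.tab <- data.frame(
  latent_var  = "LV A",
  specimenID  = c("s1", "s1", "s2", "s3"),
  value       = c(0.1, 0.1, 0.5, 0.9),
  Hugo_Symbol = c("NF1", "EED", "NF1", "EED"),
  mutated     = "Mutated",
  stringsAsFactors = FALSE
)

# Wide: one column per gene; sample/gene pairs with no call become 'WT'.
toy.wide <- toy.tab %>%
  spread(key = Hugo_Symbol, value = mutated, fill = "WT")

# Long again: every sample now has a status for every gene.
toy.long <- toy.wide %>%
  gather(key = gene, value = status, -c(latent_var, specimenID, value))

print(toy.long)
```

The four input rows become six rows (three samples by two genes), two of them wild type.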

#so now we have only the latent variable/tumor type/gene groups that contain both mutated and WT samples
with.sig<-counts%>%ungroup()%>%subset(gene%in%top.genes$Hugo_Symbol)%>%
    group_by(latent_var,gene)%>%
  mutate(pval=t.test(value~status)$p.value)%>%ungroup()%>%
  group_by(latent_var)%>%
  mutate(corP=p.adjust(pval))%>%ungroup()%>%
  select(latent_var,gene,pval,corP)%>%distinct()

sig.vals<-subset(with.sig,corP<0.05)

DT::datatable(sig.vals)
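
Two details of the correction step are easy to miss. `p.adjust()` uses the Holm method by default, and in the pipeline above it runs inside `mutate()` before `distinct()`, so each p-value is present once per sample row when it is corrected. The sketch below, on invented p-values, shows that repeating p-values in this way makes the Holm-adjusted values larger, that is, more conservative. The factor of 10 is arbitrary.

```r
# Three invented raw p-values, one per gene.
raw.p <- c(geneA = 0.001, geneB = 0.01, geneC = 0.04)

# Holm is the default method of p.adjust().
holm.once <- p.adjust(raw.p)

# The same p-values repeated for 10 sample rows each, as happens when the
# correction runs on per-sample rows before distinct().
holm.repeated <- p.adjust(rep(raw.p, each = 10))

print(holm.once)
print(unique(holm.repeated))
```

Collapsing to one row per latent variable and gene before calling `p.adjust()` would correct for the number of tests rather than the number of rows; the results shown in this notebook were produced with the row-level version.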

Interesting! Some genes actually pass multiple-testing correction. What do they look like? Here let’s write the messiest possible code to print and plot them.

for(ct in unique(sig.vals$latent_var)){
  tplot<-sig.vals[which(sig.vals$latent_var==ct),]
  if(nrow(tplot)==0)
    next
  print(tplot)
  p<-counts%>%
    subset(latent_var==ct)%>%
    subset(gene%in%tplot$gene)%>%
    ggplot(aes(x=gene,y=value,col=status))+
    geom_boxplot(outlier.shape=NA)+
    geom_point(position=position_jitterdodge(),aes(group=status))+
    theme(axis.text.x = element_text(angle = 90, hjust = 1))+
    ggtitle(paste(ct,'scores'))
#    if(method=='cibersort')
#      p<-p+scale_y_log10()
  print(p)
}
## # A tibble: 28 x 4
##    latent_var              gene                  pval   corP
##    <chr>                   <chr>                <dbl>  <dbl>
##  1 24,PID_DELTANP63PATHWAY AASDH         0.0000000955 0.0129
##  2 24,PID_DELTANP63PATHWAY ALDH4A1       0.0000000955 0.0129
##  3 24,PID_DELTANP63PATHWAY ASB6          0.0000000955 0.0129
##  4 24,PID_DELTANP63PATHWAY ASPSCR1       0.0000000955 0.0129
##  5 24,PID_DELTANP63PATHWAY CAPN14        0.0000000955 0.0129
##  6 24,PID_DELTANP63PATHWAY CCDC13        0.0000000955 0.0129
##  7 24,PID_DELTANP63PATHWAY CEP57L1       0.0000000955 0.0129
##  8 24,PID_DELTANP63PATHWAY CTNS          0.0000000955 0.0129
##  9 24,PID_DELTANP63PATHWAY DDIT3         0.0000000955 0.0129
## 10 24,PID_DELTANP63PATHWAY DKFZP434O1614 0.0000000955 0.0129
## # … with 18 more rows

## # A tibble: 7 x 4
##   latent_var gene        pval       corP
##   <chr>      <chr>      <dbl>      <dbl>
## 1 LV 804     ABCC11  3.02e-10 0.0000408 
## 2 LV 804     ATG2A   3.58e- 9 0.000483  
## 3 LV 804     CARD10  7.39e-11 0.00000996
## 4 LV 804     EPSTI1  3.02e-10 0.0000408 
## 5 LV 804     L3HYPDH 2.49e-11 0.00000336
## 6 LV 804     SCGN    2.49e-11 0.00000336
## 7 LV 804     WWC3    2.27e-11 0.00000307

## # A tibble: 2 x 4
##   latent_var             gene                pval     corP
##   <chr>                  <chr>              <dbl>    <dbl>
## 1 937,PID_HIF1_TFPATHWAY AC078925.1 0.00000000201 0.000272
## 2 937,PID_HIF1_TFPATHWAY ZDHHC11B   0.00000000201 0.000272

## # A tibble: 6 x 4
##   latent_var gene             pval     corP
##   <chr>      <chr>           <dbl>    <dbl>
## 1 LV 891     ADAMTS3 0.000000350   0.0472  
## 2 LV 891     PLA1A   0.000000350   0.0472  
## 3 LV 891     RASSF7  0.0000000892  0.0120  
## 4 LV 891     UNC45B  0.0000000892  0.0120  
## 5 LV 891     VCX3B   0.00000000501 0.000676
## 6 LV 891     ZNF335  0.000000350   0.0472

## # A tibble: 10 x 4
##    latent_var gene            pval   corP
##    <chr>      <chr>          <dbl>  <dbl>
##  1 LV 117     AP5M1    0.000000268 0.0362
##  2 LV 117     BRICD5   0.000000268 0.0362
##  3 LV 117     CYP4B1   0.000000268 0.0362
##  4 LV 117     GSDMC    0.000000268 0.0362
##  5 LV 117     MT1H     0.000000268 0.0362
##  6 LV 117     MUC6     0.000000268 0.0362
##  7 LV 117     OR4C5    0.000000268 0.0362
##  8 LV 117     OR5M11   0.000000268 0.0362
##  9 LV 117     SLC25A21 0.000000268 0.0362
## 10 LV 117     WDR52    0.000000268 0.0362

## # A tibble: 1 x 4
##   latent_var gene          pval    corP
##   <chr>      <chr>        <dbl>   <dbl>
## 1 LV 418     APC   0.0000000102 0.00138

## # A tibble: 1 x 4
##   latent_var gene           pval     corP
##   <chr>      <chr>         <dbl>    <dbl>
## 1 LV 588     APC   0.00000000370 0.000500

## # A tibble: 4 x 4
##   latent_var gene            pval    corP
##   <chr>      <chr>          <dbl>   <dbl>
## 1 LV 72      APC     0.0000000516 0.00697
## 2 LV 72      EIF2S3L 0.000000131  0.0177 
## 3 LV 72      HNRNPDL 0.0000000767 0.0103 
## 4 LV 72      MYCT1   0.000000131  0.0177

## # A tibble: 13 x 4
##    latent_var              gene                 pval    corP
##    <chr>                   <chr>               <dbl>   <dbl>
##  1 17,SVM NK cells resting ARHGAP29     0.0000000246 0.00332
##  2 17,SVM NK cells resting ATRX         0.0000000246 0.00332
##  3 17,SVM NK cells resting C3           0.0000000246 0.00332
##  4 17,SVM NK cells resting CCSER2       0.0000000246 0.00332
##  5 17,SVM NK cells resting CMIP         0.0000000246 0.00332
##  6 17,SVM NK cells resting IGIP         0.0000000246 0.00332
##  7 17,SVM NK cells resting NCF2         0.0000000246 0.00332
##  8 17,SVM NK cells resting RP1-241P17.4 0.0000000246 0.00332
##  9 17,SVM NK cells resting SAP30L       0.0000000246 0.00332
## 10 17,SVM NK cells resting SEMA3A       0.0000000246 0.00332
## 11 17,SVM NK cells resting SLC24A5      0.0000000246 0.00332
## 12 17,SVM NK cells resting SLC3A2       0.0000000246 0.00332
## 13 17,SVM NK cells resting SMCHD1       0.0000000246 0.00332

## # A tibble: 15 x 4
##    latent_var gene             pval        corP
##    <chr>      <chr>           <dbl>       <dbl>
##  1 LV 663     ARHGAP29     5.87e- 8 0.00791    
##  2 LV 663     ATRX         5.87e- 8 0.00791    
##  3 LV 663     C3           5.87e- 8 0.00791    
##  4 LV 663     CCSER2       5.87e- 8 0.00791    
##  5 LV 663     CMIP         5.87e- 8 0.00791    
##  6 LV 663     IGHV3-38     6.04e-12 0.000000815
##  7 LV 663     IGIP         5.87e- 8 0.00791    
##  8 LV 663     ITM2B        6.04e-12 0.000000815
##  9 LV 663     NCF2         5.87e- 8 0.00791    
## 10 LV 663     RP1-241P17.4 5.87e- 8 0.00791    
## 11 LV 663     SAP30L       5.87e- 8 0.00791    
## 12 LV 663     SEMA3A       5.87e- 8 0.00791    
## 13 LV 663     SLC24A5      5.87e- 8 0.00791    
## 14 LV 663     SLC3A2       5.87e- 8 0.00791    
## 15 LV 663     SMCHD1       5.87e- 8 0.00791

## # A tibble: 1 x 4
##   latent_var gene            pval   corP
##   <chr>      <chr>          <dbl>  <dbl>
## 1 LV 861     C9orf129 0.000000206 0.0278

## # A tibble: 3 x 4
##   latent_var              gene              pval   corP
##   <chr>                   <chr>            <dbl>  <dbl>
## 1 720,PID_FANCONI_PATHWAY CBR1      0.0000000934 0.0126
## 2 720,PID_FANCONI_PATHWAY PCMTD1    0.000000153  0.0207
## 3 720,PID_FANCONI_PATHWAY SPATA31A3 0.0000000934 0.0126

## # A tibble: 1 x 4
##   latent_var gene          pval   corP
##   <chr>      <chr>        <dbl>  <dbl>
## 1 LV 407     CEP170 0.000000207 0.0279

## # A tibble: 1 x 4
##   latent_var gene            pval     corP
##   <chr>      <chr>          <dbl>    <dbl>
## 1 LV 615     CNKSR2 0.00000000104 0.000140

## # A tibble: 3 x 4
##   latent_var gene            pval    corP
##   <chr>      <chr>          <dbl>   <dbl>
## 1 LV 88      EIF2S3L 0.0000000704 0.00950
## 2 LV 88      MYCT1   0.0000000704 0.00950
## 3 LV 88      SCAMP3  0.0000000598 0.00807

## # A tibble: 1 x 4
##   latent_var gene         pval   corP
##   <chr>      <chr>       <dbl>  <dbl>
## 1 LV 666     ESRRA 0.000000141 0.0190

## # A tibble: 1 x 4
##   latent_var gene         pval   corP
##   <chr>      <chr>       <dbl>  <dbl>
## 1 LV 839     ESRRA 0.000000219 0.0295

## # A tibble: 3 x 4
##   latent_var gene             pval   corP
##   <chr>      <chr>           <dbl>  <dbl>
## 1 LV 979     FAM160A2 0.0000000997 0.0135
## 2 LV 979     RALBP1   0.0000000997 0.0135
## 3 LV 979     SERPINB6 0.000000241  0.0326

## # A tibble: 1 x 4
##   latent_var                 gene             pval     corP
##   <chr>                      <chr>           <dbl>    <dbl>
## 1 4,REACTOME_NEURONAL_SYSTEM FAM185A 0.00000000426 0.000575

## # A tibble: 1 x 4
##   latent_var                                 gene        pval      corP
##   <chr>                                      <chr>      <dbl>     <dbl>
## 1 827,KEGG_B_CELL_RECEPTOR_SIGNALING_PATHWAY FAM185A 1.34e-10 0.0000180

## # A tibble: 1 x 4
##   latent_var gene           pval   corP
##   <chr>      <chr>         <dbl>  <dbl>
## 1 LV 100     FAM185A 0.000000178 0.0241

## # A tibble: 1 x 4
##   latent_var gene        pval      corP
##   <chr>      <chr>      <dbl>     <dbl>
## 1 LV 315     FAM185A 5.62e-10 0.0000758

## # A tibble: 1 x 4
##   latent_var gene        pval      corP
##   <chr>      <chr>      <dbl>     <dbl>
## 1 LV 331     FAM185A 5.40e-10 0.0000728

## # A tibble: 2 x 4
##   latent_var gene            pval   corP
##   <chr>      <chr>          <dbl>  <dbl>
## 1 LV 484     FAM185A  0.000000135 0.0182
## 2 LV 484     PPP1R13B 0.000000147 0.0199

## # A tibble: 1 x 4
##   latent_var gene            pval    corP
##   <chr>      <chr>          <dbl>   <dbl>
## 1 LV 529     FAM185A 0.0000000456 0.00615

## # A tibble: 3 x 4
##   latent_var gene            pval    corP
##   <chr>      <chr>          <dbl>   <dbl>
## 1 LV 9       FAM185A 0.0000000536 0.00723
## 2 LV 9       PHF3    0.0000000669 0.00903
## 3 LV 9       SCAMP3  0.000000117  0.0158

## # A tibble: 1 x 4
##   latent_var                               gene            pval    corP
##   <chr>                                    <chr>          <dbl>   <dbl>
## 1 517,REACTOME_SIGNALING_BY_EGFR_IN_CANCER GOLGA8S 0.0000000141 0.00190

## # A tibble: 2 x 4
##   latent_var gene         pval       corP
##   <chr>      <chr>       <dbl>      <dbl>
## 1 LV 120     GOLGA8T  5.24e-11 0.00000706
## 2 LV 120     TRBV11-1 1.57e- 7 0.0212

## # A tibble: 1 x 4
##   latent_var gene            pval    corP
##   <chr>      <chr>          <dbl>   <dbl>
## 1 LV 821     GOLGA8T 0.0000000709 0.00957

## # A tibble: 2 x 4
##   latent_var gene          pval   corP
##   <chr>      <chr>        <dbl>  <dbl>
## 1 LV 969     GPANK1 0.000000189 0.0255
## 2 LV 969     VCX3B  0.000000363 0.0490

## # A tibble: 3 x 4
##   latent_var gene          pval   corP
##   <chr>      <chr>        <dbl>  <dbl>
## 1 LV 326     GXYLT1 0.000000115 0.0155
## 2 LV 326     LRRC45 0.000000113 0.0153
## 3 LV 326     VCX3B  0.000000142 0.0192

## # A tibble: 1 x 4
##   latent_var gene          pval   corP
##   <chr>      <chr>        <dbl>  <dbl>
## 1 LV 870     HIVEP1 0.000000228 0.0308

## # A tibble: 3 x 4
##   latent_var gene          pval   corP
##   <chr>      <chr>        <dbl>  <dbl>
## 1 LV 754     HPS5   0.000000228 0.0307
## 2 LV 754     MYOM3  0.000000296 0.0399
## 3 LV 754     PNPLA1 0.000000296 0.0399

## # A tibble: 2 x 4
##   latent_var                          gene             pval    corP
##   <chr>                               <chr>           <dbl>   <dbl>
## 1 743,MIPS_55S_RIBOSOME_MITOCHONDRIAL IGHV3-38 0.0000000368 0.00497
## 2 743,MIPS_55S_RIBOSOME_MITOCHONDRIAL ITM2B    0.0000000368 0.00497

## # A tibble: 2 x 4
##   latent_var          gene              pval     corP
##   <chr>               <chr>            <dbl>    <dbl>
## 1 82,PID_RAC1_PATHWAY KRT18    0.00000000310 0.000418
## 2 82,PID_RAC1_PATHWAY TRBV11-1 0.000000122   0.0164

## # A tibble: 1 x 4
##   latent_var gene             pval    corP
##   <chr>      <chr>           <dbl>   <dbl>
## 1 LV 186     KRTAP2-2 0.0000000628 0.00847

## # A tibble: 1 x 4
##   latent_var gene            pval   corP
##   <chr>      <chr>          <dbl>  <dbl>
## 1 LV 904     KRTAP2-2 0.000000314 0.0423

## # A tibble: 2 x 4
##   latent_var            gene          pval    corP
##   <chr>                 <chr>        <dbl>   <dbl>
## 1 767,SVM B cells naive LRCH4 0.000000345  0.0465 
## 2 767,SVM B cells naive TPSD1 0.0000000251 0.00338

## # A tibble: 1 x 4
##   latent_var                     gene           pval    corP
##   <chr>                          <chr>         <dbl>   <dbl>
## 1 13,REACTOME_GLUCOSE_METABOLISM LRRC45 0.0000000279 0.00377

## # A tibble: 1 x 4
##   latent_var                      gene            pval    corP
##   <chr>                           <chr>          <dbl>   <dbl>
## 1 557,SVM Dendritic cells resting LRRC45 0.00000000836 0.00113

## # A tibble: 1 x 4
##   latent_var                                       gene         pval   corP
##   <chr>                                            <chr>       <dbl>  <dbl>
## 1 637,REACTOME_METABOLISM_OF_LIPIDS_AND_LIPOPROTE… LRRC45    1.48e-7 0.0199

## # A tibble: 1 x 4
##   latent_var                                      gene         pval    corP
##   <chr>                                           <chr>       <dbl>   <dbl>
## 1 879,REACTOME_HEPARAN_SULFATE_HEPARIN_HS_GAG_ME… LRRC…     2.07e-8 0.00280

## # A tibble: 2 x 4
##   latent_var                         gene             pval    corP
##   <chr>                              <chr>           <dbl>   <dbl>
## 1 975,SVM T cells CD4 memory resting LRRC45   0.0000000720 0.00971
## 2 975,SVM T cells CD4 memory resting TRBV11-1 0.0000000674 0.00910

## # A tibble: 1 x 4
##   latent_var gene           pval    corP
##   <chr>      <chr>         <dbl>   <dbl>
## 1 LV 479     LRRC45 0.0000000122 0.00164

## # A tibble: 1 x 4
##   latent_var gene           pval    corP
##   <chr>      <chr>         <dbl>   <dbl>
## 1 LV 504     LRRC45 0.0000000613 0.00827

## # A tibble: 2 x 4
##   latent_var gene              pval     corP
##   <chr>      <chr>            <dbl>    <dbl>
## 1 LV 745     LRRC45   0.00000000226 0.000304
## 2 LV 745     TRBV11-1 0.000000149   0.0201

## # A tibble: 2 x 4
##   latent_var gene           pval   corP
##   <chr>      <chr>         <dbl>  <dbl>
## 1 LV 884     LRRC45 0.0000000990 0.0134
## 2 LV 884     OR10G2 0.0000000911 0.0123

## # A tibble: 1 x 4
##   latent_var gene           pval    corP
##   <chr>      <chr>         <dbl>   <dbl>
## 1 LV 674     LRRIQ1 0.0000000372 0.00502

## # A tibble: 1 x 4
##   latent_var gene           pval     corP
##   <chr>      <chr>         <dbl>    <dbl>
## 1 LV 54      NAPA  0.00000000203 0.000274

## # A tibble: 3 x 4
##   latent_var                                   gene       pval      corP
##   <chr>                                        <chr>     <dbl>     <dbl>
## 1 840,MIPS_39S_RIBOSOMAL_SUBUNIT_MITOCHONDRIAL PHF3   5.35e-10 0.0000722
## 2 840,MIPS_39S_RIBOSOMAL_SUBUNIT_MITOCHONDRIAL RPL10L 1.15e- 8 0.00155  
## 3 840,MIPS_39S_RIBOSOMAL_SUBUNIT_MITOCHONDRIAL SCAMP3 7.65e- 8 0.0103

## # A tibble: 1 x 4
##   latent_var                      gene          pval    corP
##   <chr>                           <chr>        <dbl>   <dbl>
## 1 671,REACTOME_COLLAGEN_FORMATION PRDM2 0.0000000200 0.00270

## # A tibble: 1 x 4
##   latent_var gene          pval    corP
##   <chr>      <chr>        <dbl>   <dbl>
## 1 LV 595     PURA  0.0000000224 0.00302

## # A tibble: 1 x 4
##   latent_var gene         pval   corP
##   <chr>      <chr>       <dbl>  <dbl>
## 1 LV 163     RAB44 0.000000224 0.0303

## # A tibble: 1 x 4
##   latent_var              gene          pval    corP
##   <chr>                   <chr>        <dbl>   <dbl>
## 1 865,PID_FANCONI_PATHWAY RGPD8 0.0000000398 0.00537

## # A tibble: 1 x 4
##   latent_var gene       pval          corP
##   <chr>      <chr>     <dbl>         <dbl>
## 1 LV 742     SCAMP3 5.78e-14 0.00000000780

## # A tibble: 1 x 4
##   latent_var gene        pval       corP
##   <chr>      <chr>      <dbl>      <dbl>
## 1 LV 16      TBC1D3B 1.60e-11 0.00000216

## # A tibble: 1 x 4
##   latent_var gene            pval   corP
##   <chr>      <chr>          <dbl>  <dbl>
## 1 LV 192     TRBV11-1 0.000000347 0.0468

## # A tibble: 1 x 4
##   latent_var gene            pval   corP
##   <chr>      <chr>          <dbl>  <dbl>
## 1 LV 245     TRBV11-1 0.000000201 0.0272

## # A tibble: 1 x 4
##   latent_var gene             pval    corP
##   <chr>      <chr>           <dbl>   <dbl>
## 1 LV 282     TRBV11-1 0.0000000103 0.00139

## # A tibble: 1 x 4
##   latent_var gene              pval    corP
##   <chr>      <chr>            <dbl>   <dbl>
## 1 LV 423     TRBV11-1 0.00000000796 0.00107

## # A tibble: 1 x 4
##   latent_var gene            pval   corP
##   <chr>      <chr>          <dbl>  <dbl>
## 1 LV 465     TRBV11-1 0.000000131 0.0177

## # A tibble: 1 x 4
##   latent_var gene            pval   corP
##   <chr>      <chr>          <dbl>  <dbl>
## 1 LV 672     TRBV11-1 0.000000123 0.0166

## # A tibble: 1 x 4
##   latent_var gene              pval    corP
##   <chr>      <chr>            <dbl>   <dbl>
## 1 LV 940     TRBV11-1 0.00000000971 0.00131

## # A tibble: 1 x 4
##   latent_var   gene         pval   corP
##   <chr>        <chr>       <dbl>  <dbl>
## 1 42,DMAP_HSC1 VCX3B 0.000000138 0.0187

## # A tibble: 1 x 4
##   latent_var                                      gene         pval    corP
##   <chr>                                           <chr>       <dbl>   <dbl>
## 1 66,REACTOME_METABOLISM_OF_LIPIDS_AND_LIPOPROTE… VCX3B     3.79e-8 0.00511

## # A tibble: 1 x 4
##   latent_var                     gene         pval   corP
##   <chr>                          <chr>       <dbl>  <dbl>
## 1 70,MIPS_PA700_20S_PA28_COMPLEX VCX3B 0.000000272 0.0367

## # A tibble: 1 x 4
##   latent_var gene         pval   corP
##   <chr>      <chr>       <dbl>  <dbl>
## 1 LV 149     VCX3B 0.000000251 0.0339

## # A tibble: 1 x 4
##   latent_var gene          pval    corP
##   <chr>      <chr>        <dbl>   <dbl>
## 1 LV 571     VCX3B 0.0000000251 0.00339

## # A tibble: 1 x 4
##   latent_var gene      pval       corP
##   <chr>      <chr>    <dbl>      <dbl>
## 1 LV 629     VCX3B 2.02e-11 0.00000272

## # A tibble: 1 x 4
##   latent_var gene          pval    corP
##   <chr>      <chr>        <dbl>   <dbl>
## 1 LV 634     VCX3B 0.0000000188 0.00254

## # A tibble: 1 x 4
##   latent_var gene           pval    corP
##   <chr>      <chr>         <dbl>   <dbl>
## 1 LV 676     VCX3B 0.00000000985 0.00133

## # A tibble: 1 x 4
##   latent_var gene          pval    corP
##   <chr>      <chr>        <dbl>   <dbl>
## 1 LV 760     VCX3B 0.0000000284 0.00383

## # A tibble: 1 x 4
##   latent_var gene          pval    corP
##   <chr>      <chr>        <dbl>   <dbl>
## 1 LV 851     VCX3B 0.0000000291 0.00393

## # A tibble: 1 x 4
##   latent_var gene           pval    corP
##   <chr>      <chr>         <dbl>   <dbl>
## 1 LV 976     VCX3B 0.00000000814 0.00110

#}

Breaking down by tumor type

At first glance it seems that many of these are separating cNFs (i.e. mast cell signaling) from the other tumor types. However, I’m getting the same error I get in notebook 11, so I am unsure how to proceed.

#this is a failed attempt to group by tumor type
#with.sig<-counts%>%ungroup()%>%subset(gene%in%top.genes$Hugo_Symbol)%>%
#    group_by(latent_var,tumorType,gene)%>%
#  mutate(pval=t.test(value~status)$p.value)%>%
#  ungroup()%>%
#  group_by(latent_var)%>%
#  mutate(corP=p.adjust(pval))%>%ungroup()%>%
#  select(latent_var,tumorType,gene,pval,corP)%>%distinct()

#sig.vals<-subset(with.sig,corP<0.05)

#DT::datatable(sig.vals)
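
The error itself is not recorded here. One common way a per-group `t.test()` fails is when a tumor type/gene group has fewer than two samples in one of the two statuses; this is an assumption about the cause, not a diagnosis. Under that assumption, a guard like the following would let the grouped pipeline run and leave `NA` where no test is possible. The helper name `safe.t.p` is hypothetical and the scores are invented.

```r
# Hypothetical helper: Welch t-test p-value, or NA when the test cannot run.
safe.t.p <- function(value, status) {
  tryCatch(
    t.test(value ~ status)$p.value,
    error = function(e) NA_real_
  )
}

# Toy group with three samples per status: the test runs.
ok <- safe.t.p(c(0.1, 0.2, 0.3, 0.8, 0.9, 1.0),
               c("WT", "WT", "WT", "Mutated", "Mutated", "Mutated"))

# Toy group with a single mutated sample: t.test() stops, the helper returns NA.
small <- safe.t.p(c(0.1, 0.2, 0.9),
                  c("WT", "WT", "Mutated"))

print(c(ok = ok, small = small))
```

Inside the grouped `mutate()` this would replace `t.test(value~status)$p.value` with `safe.t.p(value, status)`; `p.adjust()` leaves `NA` entries as `NA` and does not count them as tests.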