1 Map of sampling stations in the Okinawa Trough

Load required package:

library(marmap)

Import bathymetry data from NOAA:

b = getNOAA.bathy(lon1 = 115, lon2 = 140, lat1 = 35, lat2 = 20, resolution = 1)

Import dataframe with the latitude and longitude of sampling points from CTD (in decimal degrees):

pts <- read.csv("Mirai_CTD_lat_long.csv")
pts$Station<- as.character(pts$Station)
str(pts)

Make dataframe with map labels and their lat/long:

Clon = c(127.5,131.25, 121.6, 121.03, 120.8, 126)
Clat= c(26, 31.93, 31.1, 23, 27.2, 22)
CLabels= c("Okinawa", "Japan", "China", "Taiwan", "East China Sea", "Philippine Sea")
Countries = cbind.data.frame(Clon, Clat, CLabels)
str(Countries)

Make a dataframe with vent sites:

vents <- read.csv("VentSites.csv")
str(vents)

Make plot one layer at a time:

plot(b, image = TRUE, land = TRUE, xlim=c(120,135), ylim=c(22, 32), lwd = 0.03, bpal = list(c(0, max(b), grey(.7), grey(.9), grey(.95)),
c(min(b), 0, "darkblue", "lightblue"))) #bathymetry colors
plot(b, lwd = 0.8, deep = 0, shallow = 0, step = 0, add = TRUE) # highlight coastline with solid black line
plot(b, deep=-200, shallow=-200, lwd = 0.4, drawlabels=TRUE, add=TRUE)  # Add -200m isobath
plot(b, deep=-1000, shallow=-1000, lwd = 0.4, drawlabels=TRUE, add=TRUE)  # Add -1000m isobath
plot(b, deep=-2000, shallow=-2000, lwd = 0.4, drawlabels=TRUE, add=TRUE)  # Add -2000m isobath
plot(b, deep=-3000, shallow=-3000, lwd = 0.4, drawlabels=TRUE, add=TRUE)  # Add -3000m isobath # Add -3000m isobath
points(pts, pch = 16, col = 'red') # Add sampling points
text(pts, labels=pts$Station, adj= 1.5, cex=1.2) # Add sampling station numbers
text(Countries, labels=Countries$CLabels, cex= 1.7, adj = -0.2) # Add map labels 
points(vents, pch = 17, col = 'blue') # Add vent points

2 Sequence analysis

2.1 Identify and Count Amplicon Sequence Variants (ASVs)

Sample collection, DNA extraction, PCR, & sequencing

Seawater collected from the subsurface chlorophyll maximum (SCM), mid-water column, and near-bottom was collected in 10 L Niskin bottles at each sampling station and surface seawater was collected by bucket. Five Liter replicates from each depth (4.5 from surface) were sequentially filtered through 10.0 and 0.2 um pore-size Polytetrafluoroethylene (PTFE) filters. Filters were flash frozen in liquid nitrogen and stored at -80 C until DNA extraction. DNA was extracted following the manufacturers protocols for the Qiagen/MoBio DNeasy PowerWater Kit, including the optional heating step to lyse cells. Amplicon PCR and sequencing library preparation followed the procedures described in the Illumina 16S Metagenomic Sequencing Library Preparation manual modified to include universal eukaryotic primers for the v4 region of the 18S rRNA gene (F: Stoeck et al. 2010, R: Brisbin et al. 2018) and amplicon PCR conditions most appropriate for these primers (annealing at 58C). The 220 samples were separated into 4 sequencing runs on the Illumina MiSeq sequencing platform with v3 chemistry for 2x300-bp sequencing.

Bioinformatic processing with QIIME 2

Demultiplexed paired-end sequences provided by the OIST sequencing center were imported to QIIME 2 (v2018.11) software (Bolyen et al. 2019). The Divisive Amplicon Denoising Algorithm (DADA) was implemented with the DADA2 plug-in for Qiime2 for quality filtering, chimera removal, and feature table construction (Callahan et al. 2015) separately for each sequencing run before merging the four resulting feature tables. Taxonomy was assigned to Amplicon Sequence Variants (ASVs) in the feature table by training a classifier on the Protist Ribosomal Reference (PR2) database to classify ASVs (Guillo et al. 2012). The feature table and assigned taxonomy were then exported from Qiime2 for further analysis.

Below is an example Qiime2 workflow for 1 sequencing run:

Activate an anaconda environment for qiime2: source activate qiime2-2018.11

Import sequencing data (all sequences for eacg sequencing run must be in their own directory):

qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path ./seqs/seqrun1/ --input-format CasavaOneEightSingleLanePerSampleDirFmt --output-path run1_demux-paired-end.qza

Visualize squencing results (use these graphs to decide what length to trim sequences)

qiime demux summarize --i-data run1_demux-paired-end.qza --o-visualization run1_demux.qzv

Run the DADA2 algorithm:

qiime dada2 denoise-paired --i-demultiplexed-seqs run1_demux-paired-end.qza --output-dir ./dd2run1 --o-representative-sequences run1_rep-seqs --p-trim-left-f 10 --p-trim-left-r 10 --p-trunc-len-f 260 --p-trunc-len-r 250 --p-n-threads 3

Check results:

qiime feature-table summarize --i-table ./dd2run1/table.qza --o-visualization ./dd2run1/table.qzv --m-sample-metadata-file ~/desktop/MiraiFilters/sampleMaptxt.txt

qiime metadata tabulate --m-input-file dd2run1/denoising_stats.qza --o-visualization dd2run1/denoising_stats.qzv

Merge feature tables from multiple sequencing runs:

qiime feature-table merge --i-tables ./dd2run1/table.qza --i-tables ./dd2run2/table.qza --i-tables ./dd2run3/table.qza --i-tables ./dd2run4/table.qza --o-merged-table mergedtable.qza

qiime feature-table merge-seqs --i-data run1_rep-seqs.qza --i-data run2_rep-seqs.qza --i-data run3_rep-seqs.qza --i-data run4_rep-seqs.qza --o-merged-data merged-rep-seqs.qza

Import taxonomy database:

qiime tools import --type 'FeatureData[Sequence]' --input-path pr2_version_4.11.1_mothur.fasta --output-path pr2.qza

qiime tools import --type 'FeatureData[Taxonomy]' --input-format HeaderlessTSVTaxonomyFormat --input-path pr2_version_4.11.1_mothur.tax -output-path pr2_tax.qza

Select V4 region from PR2 sequences:

qiime feature-classifier extract-reads --i-sequences pr2.qza --p-f-primer CCAGCASCYGCGGTAATTCC --p-r-primer ACTTTCGTTCTTGATYRA --o-reads pr2_v4.qza

Train the classifier:

qiime feature-classifier fit-classifier-naive-bayes --i-reference-reads pr2_v4.qza --i-reference-taxonomy pr2_tax.qza --o-classifier pr2_v4_classifier.qza

Classify:

qiime feature-classifier classify-sklearn --i-classifier pr2_v4_classifier.qza --i-reads merged-rep-seqs.qza --o-classification taxonomy.qza

qiime metadata tabulate --m-input-file taxonomy.qza --o-visualization taxonomy.qzv

Export the .tsv taxonomy file from taxonomy.qzv for use in following steps.

2.2 Load data into R & prepare phyloseq objects

Load packages:

library(tidyverse)
library(qiime2R)
library('phyloseq')
library('vegan')
library('ggplot2')
library(RColorBrewer)
library(DESeq2)
library(reshape)
library("wesanderson")
library("gridExtra")
library("metagMisc")
library("wesanderson")
library("CoDaSeq")
library("ggbiplot")
library(ggpubr)

Set default colors:

ap<- c("#F1B6A1", "#D4A52A", "#E3E5DB", "#A5CFCC", "#0E899F", "#A83860", "#ED91BC", "#DB5339", "#F58851", "#42465C", "#1E479A", "#F7CDA4", "#CF529C", "#11638C")

Convert QIIME 2 artifacts to phyloseq objects:

phyloseq<-qza_to_phyloseq(features="mergedtable.qza")

Import sample data:

metatable <- read.csv("sampleMap.csv")
row.names(metatable) <- metatable[[1]]

metatable$Station <- as.character(metatable$Station)
metatable$filter <- as.character(metatable$filter)
metatable$VentANAHTM<-as.factor(metatable$VentANAHTM)

META<- sample_data(metatable)
str(META)

## 'data.frame':    212 obs. of  32 variables:
## Formal class 'sample_data' [package "phyloseq"] with 4 slots
##   ..@ .Data    :List of 32
##   .. ..$ : Factor w/ 212 levels "A6-F173-DNA",..: 3 6 9 12 96 97 115 118 129 134 ...
##   .. ..$ : Factor w/ 212 levels "A6-F173-DNA",..: 3 6 9 12 96 97 115 118 129 134 ...
##   .. ..$ : chr  "2" "2" "2" "2" ...
##   .. ..$ : Factor w/ 8 levels "1","31","32",..: 7 7 8 8 1 1 4 4 6 6 ...
##   .. ..$ : Factor w/ 14 levels "C02","C03","C04",..: 1 1 1 1 1 1 1 1 1 1 ...
##   .. ..$ : Factor w/ 4 levels "Bottom","Mid",..: 4 4 4 4 1 1 2 2 3 3 ...
##   .. ..$ : Factor w/ 5 levels "Bottom","Deep",..: 5 5 5 5 1 1 3 3 4 4 ...
##   .. ..$ : Factor w/ 2 levels "D","S": 2 2 2 2 1 1 1 1 2 2 ...
##   .. ..$ : Factor w/ 109 levels "10Bott1","10Bott2",..: 71 71 72 72 65 65 67 67 69 69 ...
##   .. ..$ : int  0 0 0 0 1000 1000 700 700 92 92 ...
##   .. ..$ : chr  "10" "2" "10" "2" ...
##   .. ..$ : Factor w/ 2 levels "n","y": 2 2 2 2 2 2 2 2 2 2 ...
##   .. ..$ : Factor w/ 4 levels "KG","MOT","NOT",..: 1 1 1 1 1 1 1 1 1 1 ...
##   .. ..$ : Factor w/ 3 levels "KG","NOT","SOT": 1 1 1 1 1 1 1 1 1 1 ...
##   .. ..$ : Factor w/ 3 levels "n","p","y": 3 3 3 3 3 3 3 3 3 3 ...
##   .. ..$ : Factor w/ 4 levels "KG","NOT","O",..: 1 1 1 1 1 1 1 1 1 1 ...
##   .. ..$ : Factor w/ 2 levels "n","y": 2 2 2 2 2 2 2 2 2 2 ...
##   .. ..$ : num  0 0 0 0 1056 ...
##   .. ..$ : num  25.25 25.25 25.25 25.25 4.46 ...
##   .. ..$ : num  34.7 34.7 34.7 34.7 34.4 ...
##   .. ..$ : num  202.2 202.2 202.2 202.2 69.8 ...
##   .. ..$ : num  23.1 23.1 23.1 23.1 27.3 ...
##   .. ..$ : num  0.0698 0.0698 0.0698 0.0698 0.051 ...
##   .. ..$ : num  0.061 0.061 0.061 0.061 1.107 ...
##   .. ..$ : num  0.521 0.521 0.521 0.521 11.128 ...
##   .. ..$ : num  0.06 0.06 0.06 0.06 37.51 ...
##   .. ..$ : num  0.01 0.01 0.01 0.01 0.02 0.02 0 0 0.11 0.11 ...
##   .. ..$ : num  1.05 1.05 1.05 1.05 111.04 ...
##   .. ..$ : num  0.033 0.033 0.033 0.033 2.666 ...
##   .. ..$ : num  0.05 0.05 0.05 0.05 0.03 0.03 0.03 0.03 0.04 0.04 ...
##   .. ..$ : num  26.3 26.3 26.3 26.3 26.3 ...
##   .. ..$ : num  126 126 126 126 126 ...
##   ..@ names    : chr  "X" "SampleID" "Station" "Bottle" ...
##   ..@ row.names: chr  "A8-F17-DNA" "B8-F18-DNA" "C8-F19-DNA" "D8-F20-DNA" ...
##   ..@ .S3Class : chr "data.frame"

Load PR2 taxonomy:

taxonomy <- read.csv("taxonomy.csv", stringsAsFactors = FALSE)
names(taxonomy) <- c("row", "tax", "Confidence")
row.names(taxonomy) <-taxonomy[[1]]
taxonomy <- taxonomy[,(-1)]
taxonomy <-  separate(taxonomy, tax, c("D0","D1", "D2", "D3", "D4", "D5", "D6", "D7", "D8", "D9", "D10", "D11", "D12", "D13", "D14"), sep = ";", fill = "right")
taxonomy <- taxonomy[,c(1:8)]
taxmat <- as.matrix(taxonomy)
TAX = tax_table(taxmat)
str(TAX)

## Formal class 'taxonomyTable' [package "phyloseq"] with 1 slot
##   ..@ .Data: chr [1:23115, 1:8] "Eukaryota" "Eukaryota" "Eukaryota" "Eukaryota" ...
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : chr [1:23115] "000a220fe8edd771dd82f6915627ef3e" "000c79760dd16d6692018deb8ea1b164" "000d9eaa21eaff08f34fb1a1a87601d5" "000e6f109e01181b2b1eda5d6c99319c" ...
##   .. .. ..$ : chr [1:8] "D0" "D1" "D2" "D3" ...

Add taxonomy to phyloseq object:

ps = merge_phyloseq(phyloseq, TAX, META)

2.3 Prevalence Plots

Prevalence plots display the total abundance of an ASV in the full data set v. the fraction of total samples that ASV is found in.

prevdf = apply(X = otu_table(ps),
               MARGIN = ifelse(taxa_are_rows(ps), yes = 1, no = 2),
               FUN = function(x){sum(x > 0)})

prevdf = data.frame(Prevalence = prevdf,
                    TotalAbundance = taxa_sums(ps),
                    tax_table(ps))

#plyr::ddply(prevdf, "D2", function(df1){cbind(mean(df1$Prevalence),sum(df1$Prevalence))})

Prevalence plot:

prevplot1<-ggplot(prevdf, aes(TotalAbundance, Prevalence / nsamples(ps),color=D2)) +  geom_point(size = 2, alpha = 0.7) + 
  theme_bw()+
  scale_x_log10() +  xlab("Total Abundance") + ylab("Prevalence [Frac. Samples]") +
  facet_wrap(~D1) + theme(legend.position="none")

prevplot1

We don’t apply prevalence filter because we are interested in rare ASVs in deep water masses and at hydrothermal vents.

2.4 Basic summary statistics

OTUs <- data.frame(otu_table(ps))

Total number of ASVs in the data set:

OTUsRS<- OTUs
OTUsRS$RowSum <- rowSums(OTUsRS)
OTUsRSnoZero <- OTUsRS$RowSum!=0
sum(OTUsRSnoZero)

## [1] 22656

Total number of ASVs per sample (range and mean):

OTUs0 <- OTUs!=0 #is this number not a zero? true (1) or false (0)
csums <- colSums(OTUs0) # col sums = observed ASV richness
csumdf <- as.data.frame(csums)
max(csumdf$csums) #1906

## [1] 1906

min(csumdf$csums) #49

## [1] 49

mean(csumdf$csums) #730

## [1] 729.9194

2.5 Alpha Diversity

2.5.1 All samples ~ depth

Observed Richness

plugin <- ps  %>%
            estimate_richness(measures = "Observed") %$% Observed
char <- ps %>% sample_data %$% Depth
filter <- ps %>% sample_data %$% filter

richness<- data.frame(plugin, char, filter )
names(richness) <- c("richness", "Depth", "filter")
richness$Depth<- factor(richness$Depth, levels = c("Bottom", "Mid", "SCM", "Surface"))

RichDepth<- ggplot(richness, aes(x=Depth, y=richness, fill = filter))+geom_boxplot() + theme_bw() + scale_fill_manual(values = c("#E3E5DB", "#11638C"), labels=c("10.0", "0.2")) + theme(text = element_text(size=14)) +ylab("Observed Richness") +ggtitle("A")

RichDepth

Significance testing for Observed Richness between depths:

pairwise.wilcox.test(richness$richness, richness$Depth)

## 
##  Pairwise comparisons using Wilcoxon rank sum test 
## 
## data:  richness$richness and richness$Depth 
## 
##         Bottom  Mid     SCM    
## Mid     1       -       -      
## SCM     3.9e-08 8.5e-08 -      
## Surface 1       1       1.2e-07
## 
## P value adjustment method: holm

SCM is significantly different (higher richness) than all other depths.

Shannon Index

plugin <- ps  %>%
            estimate_richness(measures = "Shannon") %$% Shannon
char <- ps %>% sample_data %$% Depth
filter <- ps %>% sample_data %$% filter

shan <- data.frame(plugin, char, filter )
names(shan) <- c("diversity", "Depth", "filter")
shan$Depth<- factor(shan$Depth, levels = c("Bottom", "Mid", "SCM", "Surface"))

ShanDepth<- ggplot(shan, aes(x=Depth, y=diversity, fill = filter))+geom_boxplot() + theme_bw() + scale_fill_manual(values = c("#E3E5DB", "#11638C"), labels=c("10.0", "0.2")) + theme(text = element_text(size=14)) +ylab("Shannon Index") + ggtitle("B")


ShanDepth

Significance testing for Shannon Index between depths:

pairwise.wilcox.test(shan$diversity, shan$Depth)

## 
##  Pairwise comparisons using Wilcoxon rank sum test 
## 
## data:  shan$diversity and shan$Depth 
## 
##         Bottom  Mid     SCM
## Mid     1       -       -  
## SCM     1.8e-07 1.4e-06 -  
## Surface 5.4e-10 2.3e-08 1  
## 
## P value adjustment method: holm

SCM and Surface are significantly different (higer Shannon Index) than Mid and Bottom. SCM and SUrface are not significantly different from eachother and Mid and Bottom are not significantly different from eachother

2.5.2 Bottom Water samples ~ station

bottomraw<- subset_samples(ps, Depth=="Bottom")

Observed Richness

plugin <- bottomraw  %>%
            estimate_richness(measures = "Observed") %$% Observed
char <- bottomraw %>% sample_data %$% Station
filter <- bottomraw %>% sample_data %$% filter
ventplume<- bottomraw %>% sample_data %$% VentPlume
vent <- bottomraw %>% sample_data %$% Vent

rich <- data.frame(plugin, char, filter, ventplume, vent )
names(rich) <- c("diversity", "Station", "filter", "VentPlume", "Vent")
rich$Station<- factor(rich$Station, levels=c("11", "10", "9", "8", "3", "4", "2", "5", "12", "13", "14", "15", "17", "18"))

RichBottom<- ggplot(rich, aes(x=Station, y=diversity, fill = filter))+geom_boxplot() + theme_bw() + scale_fill_manual(values = c("#E3E5DB", "#11638C"), labels=c("10.0", "0.2")) + theme(text = element_text(size=14)) +ylab("Observed Richness") +xlab("Station") + ggtitle("C")
RichBottom

pairwise.wilcox.test(rich$diversity, rich$VentPlume)

## 
##  Pairwise comparisons using Wilcoxon rank sum test 
## 
## data:  rich$diversity and rich$VentPlume 
## 
##   n    p   
## p 0.34 -   
## y 0.34 0.34
## 
## P value adjustment method: holm

Shannon Index

plugin <- bottomraw  %>%
            estimate_richness(measures = "Shannon") %$% Shannon
char <- bottomraw %>% sample_data %$% Station
filter <- bottomraw %>% sample_data %$% filter
ventplume<- bottomraw %>% sample_data %$% VentPlume
vent<- bottomraw %>% sample_data %$% Vent

shan <- data.frame(plugin, char, filter, ventplume, vent )
names(shan) <- c("diversity", "Station", "filter", "VentPlume", "Vent")
shan$Station<- factor(shan$Station, levels=c("11", "10", "9", "8", "3", "4", "2", "5", "12", "13", "14", "15", "17", "18"))

ShanBottom<- ggplot(shan, aes(x=Station, y=diversity, fill = filter))+geom_boxplot()+  theme_bw()+ scale_fill_manual(values = c("#E3E5DB", "#11638C"), labels=c("10.0", "0.2")) + theme(text = element_text(size=14)) +ylab("Shannon Index") +xlab("Station") +ggtitle("D")
ShanBottom

pairwise.wilcox.test(shan$diversity, shan$VentPlume)

## 
##  Pairwise comparisons using Wilcoxon rank sum test 
## 
## data:  shan$diversity and shan$VentPlume 
## 
##   n    p   
## p 1.00 -   
## y 0.41 1.00
## 
## P value adjustment method: holm

Put figures together

supp_alpha<-ggarrange(RichDepth, ShanDepth, RichBottom, ShanBottom, common.legend = TRUE, legend = "bottom")

supp_alpha

#ggsave("alphadiversity_panel.pdf")

2.6 Distance and Ordination

Taxonomic Filtering

Remove ASV’s without a D1 (“Kingdom”) taxonomy assignment
Remove ASV’s assigned as Metazoa at D2

#table(tax_table(psN)[, "D1"], exclude = NULL)
ps<-subset_taxa(ps, D1 != "")
ps<-subset_taxa(ps, D1 != "NA")
ps<-subset_taxa(ps, D2 != "Metazoa")

OTUs <- data.frame(otu_table(ps))

Check ASV richness to see how much it changes after filtering:

OTUs0 <- OTUs!=0
csums <- colSums(OTUs0)
csumdf <- as.data.frame(csums)
max(csumdf$csums) #1800

## [1] 1800

min(csumdf$csums) #45

## [1] 45

mean(csumdf$csums) #685

## [1] 684.3128

2.6.1 Atchison distance

compute CLR normalization, CLR = centered log-ratio, log(x/gx) where gx is the geomentric mean of vector x

OTU4clr<- data.frame(t(data.frame(otu_table(ps))))
row.names(OTU4clr) <- gsub("\\.", "", row.names(OTU4clr))
OTUs.clr <- codaSeq.clr(OTU4clr + 0.5, samples.by.row=TRUE)
OTU2 <- otu_table(as.matrix(OTUs.clr), taxa_are_rows = FALSE)

row.names(metatable) <- gsub("-", "", row.names(metatable))
META2<- sample_data(metatable)

psCLR <- phyloseq(OTU2,TAX,META2)

2.6.2 PCoA for all samples

ordu = ordinate(psCLR, "PCoA", "euclidean")
p<-plot_ordination(psCLR, ordu, color="Depth", shape="filter")+theme_bw() +scale_color_manual(values=c(ap[8], ap[12], ap[5], ap[4])) + scale_shape_discrete(labels=c("10.0", "0.2")) +geom_point(size=3) +theme(text = element_text(size=16)) 
p

ggsave("pcoa_allsamples.tiff", width = 6, height = 4)

PERMANOVA by depth

meta <- metatable[row.names(metatable) %in% row.names(OTU2),]
set.seed(1)
adonis(vegdist(OTU2, method = "bray") ~ Depth, data = meta)

## 
## Call:
## adonis(formula = vegdist(OTU2, method = "bray") ~ Depth, data = meta) 
## 
## Permutation: free
## Number of permutations: 999
## 
## Terms added sequentially (first to last)
## 
##            Df  SumsOfSqs    MeanSqs F.Model      R2 Pr(>F)  
## Depth       3 1.0065e+35 3.3549e+34  3286.9 0.97944  0.043 *
## Residuals 207 2.1128e+33 1.0207e+31         0.02056         
## Total     210 1.0276e+35                    1.00000         
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

2.6.3 PCoA by depth (shows differences in filter-size)

The strongest determinant of clustering seems to be depth. To better observe clustering within depth groups, subset by depth and then filter type.

2.6.3.1 Surface

physeqSurf<- subset_samples(psCLR, Depth == "Surface")
ordu = ordinate(physeqSurf, "PCoA", "euclidean")
pSurf<-plot_ordination(physeqSurf, ordu, color="Depth", shape="filter")+theme_bw() +scale_color_manual(values= c(ap[4]))+ geom_point(size=3) +ggtitle("A. Surface") +theme(text = element_text(size=16)) + guides(color = FALSE) + scale_shape_discrete(labels=c("10.0", "0.2"))
pSurf

2.6.3.2 SCM

physeqSCM<- subset_samples(psCLR, Depth == "SCM")
ordu = ordinate(physeqSCM, "PCoA", "euclidean")
pSCM<-plot_ordination(physeqSCM, ordu, color="Depth", shape="filter")+theme_bw() +scale_color_manual(values=c(ap[5]))+ geom_point(size=3) +ggtitle("B. SCM") +theme(text = element_text(size=16))
pSCM

Again, strong clustering by filter type.

2.6.3.3 Mid

physeqMid<- subset_samples(psCLR, Depth == "Mid")
ordu = ordinate(physeqMid, "PCoA", "euclidean")
pMid<-plot_ordination(physeqMid, ordu, color="Depth", shape = "filter")+theme_bw() +scale_color_manual(values=ap[12])+ geom_point(size=3) +ggtitle("C. Mid")+theme(text = element_text(size=16))
pMid

Samples still cluster by filter pore-size, but now on the secondary axis.

2.6.3.4 Bottom

physeqBottom<- subset_samples(psCLR, Depth == "Bottom")
ordu = ordinate(physeqBottom, "PCoA", "euclidean")
pBottom<-plot_ordination(physeqBottom, ordu,color="Depth", shape = "filter")+theme_bw() +scale_color_manual(values=ap[8])+ geom_point(size=3) +ggtitle("D. Bottom")+theme(text = element_text(size=16))
pBottom

Supplemental Figure PCoA

supp_pcoa<-ggarrange(pSurf, pSCM, pMid, pBottom, common.legend = TRUE, legend = "bottom")

supp_pcoa

#ggsave("Supplemental_PCoA_byDepth.png", width = 6, height = 6 )

2.7 Relative Abundance Plots

Merge Samples –> collapse replicates (for Relative Abundance Plots)

Prepare metadata table:

metatable <- read.csv("sampleMap.csv")
row.names(metatable) <- metatable[[1]]
metatable$desc <- paste(metatable$Station, metatable$Depth, metatable$filter, sep="-")
metatable$Depth <- as.character(metatable$Depth)
metatable$Station <-as.character(metatable$Station)
metatable$filter<-as.character(metatable$filter)
metatable<- metatable[,-which(names(metatable) == "SampleID")]
META<- sample_data(metatable)
ps2<- ps
sample_data(ps2) <- META

print(ps2)

## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 19574 taxa and 211 samples ]
## sample_data() Sample Data:       [ 211 samples by 32 sample variables ]
## tax_table()   Taxonomy Table:    [ 19574 taxa by 8 taxonomic ranks ]

Merge:

mergedps <- merge_samples(ps2, "desc")
print(mergedps)

## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 19574 taxa and 112 samples ]
## sample_data() Sample Data:       [ 112 samples by 32 sample variables ]
## tax_table()   Taxonomy Table:    [ 19574 taxa by 8 taxonomic ranks ]

number of samples reduced from 211 to 112

Fix meta table in phyloseq object:

meta2<-as.data.frame(sample_data(mergedps))
split<- do.call(rbind, strsplit(row.names(meta2), "-"))
meta2$Depth <- split[,2]
meta2$filter<-split[,3]
meta2$desc <- row.names(meta2)
meta2$Station <- as.character(meta2$Station)
meta2$Depth_f <- factor(meta2$Depth, levels=c('Surface','SCM','Mid','Bottom'))
meta2$Station_f <- factor(meta2$Station, levels=c("11", "10", "9", "8", "3", "4", "2", "5", "12", "13", "14", "15", "17", "18"))
META <-sample_data(meta2)
sample_data(mergedps)<-META

Taxa glom - for clean RA plots (so that there is a not a line in the plot for each ASV)

mergedps<- subset_taxa(mergedps, D2 != "Metazoa")
mergedps<-subset_taxa(mergedps, D1 != "")
mergedps<-subset_taxa(mergedps, D1 != "NA")

mergedps = tax_glom(mergedps, "D2")

remove groups that are so low in abundance they are not visible in the RA plot (only necessary to decrease the number of colors needed)

'%ni%' <- Negate('%in%')
LowPrev<- c("Alveolata_X", "Amoebozoa_X", "Apicomplexa", "Apusomonadidae", "Conosa", "Discoba", "Eukaryota_XX", "Foraminifera", "Hilomonadea","Fungi", "Katablepharidophyta", "Lobosa", "Mesomycetozoa", "Metamonada", "Metazoa", "Opisthokonta_X", "Perkinsea", "Streptophyta", "Rhodophyta","Telonemia", "")
mergedHighPrev<- subset_taxa(mergedps, D2 %ni% LowPrev)

mergedRAhp <- transform_sample_counts(mergedHighPrev, function(OTU) 100* OTU/sum(OTU))

ap2<- c("#F7CDA4", "#DB5339", "#E3E5DB", "#A5CFCC", "#11638C", "#A83860","#D4A52A", "#CF529C", "#1E479A", "#F1B6A1", "#42465C",  "#0E899F")


taxabarplot<-plot_bar(mergedRAhp, x= "Station_f", fill = "D2", facet_grid=filter~Depth_f) + scale_y_continuous(expand = c(0, 0)) + ggtitle("") + scale_fill_manual(values= ap2) + theme(legend.title=element_blank()) + geom_bar(aes(fill=D2), stat="identity", position="stack", width =0.9) +theme_classic() + theme(text = element_text(size=14))+theme(axis.text.x = element_text(angle = 90)) + xlab("Station") +ylab("Relative Abundance(%)")

taxaplot_nolegend <- taxabarplot + theme(legend.position="none")

taxaplot_nolegend

#ggsave("taxaplot_nolegend_newcolors.pdf", width = 8, height = 5)

take a closer look at bottom samples:

BottomTaxa<-subset_samples(mergedRAhp, Depth == "Bottom")

Bottombarplot<-plot_bar(BottomTaxa, x= "Station_f", fill = "D2", facet_grid=~filter) + scale_y_continuous(expand = c(0, 0)) + ggtitle("") + scale_fill_manual(values=ap2) + theme(legend.title=element_blank()) + geom_bar(aes(fill=D2), stat="identity", position="stack", width =0.9) +theme_classic() + theme(text = element_text(size=14))+theme(axis.text.x = element_text(angle = 90)) + xlab("Station") +ylab("Relative Abundance(%)")

Bottombarplot_nolegend <- Bottombarplot+ theme(legend.position="none")

Bottombarplot_nolegend

Differences visible at higher taxonomic ranking?

taxabarplotD1<-plot_bar(mergedRAhp, x= "Station_f", fill = "D1", facet_grid=filter~Depth_f) + scale_y_continuous(expand = c(0, 0)) + ggtitle("") + scale_fill_manual(values=ap) + theme(legend.title=element_blank()) + geom_bar(aes(fill=D1), stat="identity", position="stack", width =0.9) +theme_classic() + theme(text = element_text(size=14))+theme(axis.text.x = element_text(angle = 90)) + xlab("Station") +ylab("Relative Abundance(%)")

taxabarplotD1

2.8 Regional Comparisons

combine upper and lower filter —> consider whole community

#remove samples that do not have a corresponding top or bottom filter
ps_pairedfilters <- subset_samples(ps, SampleID %ni% c("F61", "F77", "F83", "F142", "F251", "264"))

mergedps <- merge_samples(ps_pairedfilters, "StnDepthRep")
print(mergedps)

## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 19574 taxa and 104 samples ]
## sample_data() Sample Data:       [ 104 samples by 32 sample variables ]
## tax_table()   Taxonomy Table:    [ 19574 taxa by 8 taxonomic ranks ]

OTU4clr<- data.frame(t(data.frame(otu_table(mergedps))))
row.names(OTU4clr) <- gsub("\\.", "", row.names(OTU4clr))
OTUs.clr <- codaSeq.clr(OTU4clr + 0.5, samples.by.row=TRUE)
OTU2 <- otu_table(as.matrix(OTUs.clr), taxa_are_rows = TRUE)

metatable9 <- metatable[!duplicated(metatable$StnDepthRep),] 
metatable9$rn <- lapply(metatable9$StnDepthRep, function(x) paste('X', x, sep = ""))

row.names(metatable9) <- metatable9$rn

META2<- sample_data(metatable9)

mpsCLR <- phyloseq(OTU2,TAX,META2)

pcoa_ordu = ordinate(mpsCLR, "PCoA", "euclidean")
p_all_filters_merged<-plot_ordination(mpsCLR, pcoa_ordu, color="Depth")+theme_bw() +scale_color_manual(values=c(ap[8], ap[12], ap[5], ap[4])) +geom_point(size=3) +theme(text = element_text(size=16)) 
p_all_filters_merged

2.8.1 Surface

physeqSurf<- subset_samples(mpsCLR, Depth == "Surface")

OTUsSurf <- data.frame(t(otu_table(physeqSurf)))
metaS <- metatable9[row.names(metatable9) %in% row.names(OTUsSurf),]
Surface_merged<- prcomp(OTUsSurf)
plot(Surface_merged, type="lines")

PCA_Surf_merged<-ggbiplot(Surface_merged, var.axes = FALSE, groups= metaS$Region_SKN, ellipse=FALSE) +theme_bw() +scale_color_manual(values=c(ap[4], ap[11], ap[2], ap[1]))+  theme(legend.position="bottom") +ggtitle("A. Surface")+  
  geom_point(aes(colour=metaS$Region_SKN), size = 6) +geom_text(aes(label=metaS$Station),hjust=-0.2, vjust=0.1, size = 6) 
PCA_Surf_merged

set.seed(1)

adonis2(vegdist(OTUsSurf, method = "euclidean") ~ Region_SKN, data = metaS)

## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
## 
## adonis2(formula = vegdist(OTUsSurf, method = "euclidean") ~ Region_SKN, data = metaS)
##            Df SumOfSqs      R2     F Pr(>F)   
## Region_SKN  2     9018 0.12778 1.758  0.004 **
## Residual   24    61553 0.87222                
## Total      26    70571 1.00000                
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

tst<-adonis_pairwise(x=metaS, dd=vegdist(OTUsSurf, method = "euclidean"), group.var="Region_SKN")
tst$Adonis.tab

##   Comparison         R2        F   df     p  p.adj
## 1     KG.NOT 0.10538560 2.120401 1;18 0.003 0.0045
## 2     KG.SOT 0.14144256 2.141678 1;13 0.001 0.0030
## 3    NOT.SOT 0.06412787 1.164875 1;17 0.208 0.2080

0.8 is cutoff for multicollinearity

temp is 0.97 correlated with DO

cca_litdir = ordinate(physeqSurf, formula  = ~ Temp + Salinity + TURB + CDOM+ NITRAT+ NITRIT+ SILCAT+ PHSPHT+NH4 , "RDA")
p0 <- plot_ordination(physeqSurf, cca_litdir, type = "samples", color = "Region_SKN") +theme_bw() +scale_color_manual(values=c(ap[4], ap[11], ap[2], ap[1]))+ theme(text = element_text(size = 16)) +  theme(legend.position="bottom") +ggtitle("A. Surface") + geom_point(size = 4) +geom_text(aes(label=Station),hjust=-0.3, vjust=0.1, size = 4, color = "black")
# Now add the environmental variables as arrows
arrowmat = vegan::scores(cca_litdir, display = "bp")
arrowdf <- data.frame(labels = rownames(arrowmat), arrowmat)
arrow_map = aes(xend = 3* RDA1, yend = 3* RDA2, x = 0, y = 0, shape = NULL, color = NULL, 
    label = labels)
label_map = aes(x = 3.5 * RDA1, y = 3.5 * RDA2, shape = NULL, color = NULL, 
    label = labels)
arrowhead = arrow(length = unit(0.02, "npc"))
surfRDA_plot = p0 + geom_segment(arrow_map, size = 0.5, data = arrowdf, color = "gray", 
    arrow = arrowhead) + geom_text(label_map, size = 4, data = arrowdf)
surfRDA_plot

surfRDA <- rda(OTUsSurf ~ Temp + Salinity + TURB + CDOM+ NITRAT+ NITRIT+ PHSPHT+ SILCAT+ NH4, data = metaS)
anova(surfRDA, perm = 999)

## Permutation test for rda under reduced model
## Permutation: free
## Number of permutations: 999
## 
## Model: rda(formula = OTUsSurf ~ Temp + Salinity + TURB + CDOM + NITRAT + NITRIT + PHSPHT + SILCAT + NH4, data = metaS)
##          Df Variance      F Pr(>F)  
## Model     9   1053.0 1.1973  0.048 *
## Residual 17   1661.2                
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

anova(surfRDA, by="term", perm=999)

## Permutation test for rda under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
## 
## Model: rda(formula = OTUsSurf ~ Temp + Salinity + TURB + CDOM + NITRAT + NITRIT + PHSPHT + SILCAT + NH4, data = metaS)
##          Df Variance      F Pr(>F)  
## Temp      1   170.95 1.7493  0.019 *
## Salinity  1   107.80 1.1032  0.279  
## TURB      1   114.04 1.1670  0.219  
## CDOM      1   124.22 1.2712  0.142  
## NITRAT    1    74.70 0.7645  0.860  
## NITRIT    1   120.78 1.2360  0.172  
## PHSPHT    1   129.36 1.3238  0.108  
## SILCAT    1   122.12 1.2497  0.164  
## NH4       1    89.05 0.9113  0.556  
## Residual 17  1661.25                
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

mod<-varpart(OTUsSurf, metaS["Temp"], metaS["Region_SKN"])
mod

## 
## Partition of variance in RDA 
## 
## Call: varpart(Y = OTUsSurf, X = metaS["Temp"],
## metaS["Region_SKN"])
## 
## Explanatory tables:
## X1:  metaS["Temp"]
## X2:  metaS["Region_SKN"] 
## 
## No. of explanatory tables: 2 
## Total variation (SS): 70571 
##             Variance: 2714.3 
## No. of observations: 27 
## 
## Partition table:
##                      Df R.squared Adj.R.squared Testable
## [a+b] = X1            1   0.06298       0.02550     TRUE
## [b+c] = X2            2   0.12778       0.05510     TRUE
## [a+b+c] = X1+X2       3   0.17731       0.07001     TRUE
## Individual fractions                                    
## [a] = X1|X2           1                 0.01491     TRUE
## [b]                   0                 0.01059    FALSE
## [c] = X2|X1           2                 0.04451     TRUE
## [d] = Residuals                         0.92999    FALSE
## ---
## Use function 'rda' to test significance of fractions of interest

#plot(mod)

2.8.2 SCM

physeqSCM<- subset_samples(mpsCLR, Depth == "SCM")

OTUsSCM <- data.frame(t(otu_table(physeqSCM)))
metaSCM<- metatable9[row.names(metatable9) %in% row.names(OTUsSCM),]
SCM_merged<- prcomp(OTUsSCM)
plot(SCM_merged, type="lines")

PCA_SCM<-ggbiplot(SCM_merged, var.axes = FALSE, groups= metaSCM$Region_SKN, ellipse=FALSE) +theme_bw() +scale_color_manual(values=c(ap[4], ap[11], ap[2], ap[1]))+  theme(legend.position="bottom") +ggtitle("C. SCM")  +  
  geom_point(aes(colour=metaSCM$Region_SKN), size = 6) +geom_text(aes(label=metaSCM$Station),hjust=-0.2, vjust=0.1, size = 6) 
PCA_SCM

#ggsave("PCA_SCM_merged.png")

set.seed(1)
adonis2(vegdist(OTUsSCM, method = "euclidean") ~ Region_SKN, data = metaSCM)

## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
## 
## adonis2(formula = vegdist(OTUsSCM, method = "euclidean") ~ Region_SKN, data = metaSCM)
##            Df SumOfSqs     R2      F Pr(>F)  
## Region_SKN  2    13727 0.1113 1.4403  0.015 *
## Residual   23   109600 0.8887                
## Total      25   123326 1.0000                
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

tst<-adonis_pairwise(x=metaSCM, dd=vegdist(OTUsSCM, method = "euclidean"), group.var="Region_SKN")
tst$Adonis.tab

##   Comparison         R2        F   df     p p.adj
## 1     KG.NOT 0.08237242 1.436268 1;16 0.066 0.099
## 2     KG.SOT 0.08384723 1.189773 1;13 0.236 0.236
## 3    NOT.SOT 0.08915413 1.663970 1;17 0.008 0.024

cca_litdir = ordinate(physeqSCM, formula  = ~ Temp + Salinity + DO + m + TURB + CDOM+ NITRAT+ NITRIT+SILCAT+ PHSPHT+NH4 , "RDA")
p0 = plot_ordination(physeqSCM, cca_litdir, type = "samples", color = "Region_SKN")   +theme_bw() +scale_color_manual(values=c(ap[4], ap[11], ap[2], ap[1]))+ theme(text = element_text( size = 16)) +  theme(legend.position="bottom") +ggtitle("B. SCM") + geom_point(size = 4) +geom_text(aes(label=Station),hjust=-0.3, vjust=0.1, size = 4, color = "black")
# Now add the environmental variables as arrows
arrowmat = vegan::scores(cca_litdir, display = "bp")
arrowdf <- data.frame(labels = rownames(arrowmat), arrowmat)
arrow_map = aes(xend = 3* RDA1, yend = 3* RDA2, x = 0, y = 0, shape = NULL, color = NULL, 
    label = labels)
label_map = aes(x = 3.5 * RDA1, y = 3.5 * RDA2, shape = NULL, color = NULL, 
    label = labels)
arrowhead = arrow(length = unit(0.02, "npc"))
SCMRDA_plot = p0 + geom_segment(arrow_map, size = 0.5, data = arrowdf, color = "gray", 
    arrow = arrowhead) + geom_text(label_map, size = 4, data = arrowdf)
SCMRDA_plot

SCMRDA <- rda(OTUsSCM ~ Temp + Salinity + DO + m + TURB + CDOM+ NITRAT+ NITRIT+ SILCAT+ PHSPHT+NH4, data = metaSCM)
anova(SCMRDA, perm=1000)

## Permutation test for rda under reduced model
## Permutation: free
## Number of permutations: 999
## 
## Model: rda(formula = OTUsSCM ~ Temp + Salinity + DO + m + TURB + CDOM + NITRAT + NITRIT + SILCAT + PHSPHT + NH4, data = metaSCM)
##          Df Variance      F Pr(>F)
## Model    11   2269.3 1.0843  0.182
## Residual 14   2663.7

anova(SCMRDA, by="term", perm=1000)

## Permutation test for rda under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
## 
## Model: rda(formula = OTUsSCM ~ Temp + Salinity + DO + m + TURB + CDOM + NITRAT + NITRIT + SILCAT + PHSPHT + NH4, data = metaSCM)
##          Df Variance      F Pr(>F)
## Temp      1   221.94 1.1665  0.244
## Salinity  1   222.05 1.1671  0.211
## DO        1   194.20 1.0207  0.399
## m         1   235.40 1.2372  0.157
## TURB      1   168.39 0.8850  0.662
## CDOM      1   235.99 1.2403  0.185
## NITRAT    1   169.84 0.8927  0.624
## NITRIT    1   221.86 1.1660  0.217
## SILCAT    1   231.81 1.2184  0.196
## PHSPHT    1   248.65 1.3068  0.113
## NH4       1   119.18 0.6264  0.967
## Residual 14  2663.74

mod<-varpart(OTUsSCM, metaSCM["Temp"], metaSCM["Region_SKN"], metaSCM["m"])
mod

## 
## Partition of variance in RDA 
## 
## Call: varpart(Y = OTUsSCM, X = metaSCM["Temp"],
## metaSCM["Region_SKN"], metaSCM["m"])
## 
## Explanatory tables:
## X1:  metaSCM["Temp"]
## X2:  metaSCM["Region_SKN"]
## X3:  metaSCM["m"] 
## 
## No. of explanatory tables: 3 
## Total variation (SS): 123326 
##             Variance: 4933.1 
## No. of observations: 26 
## 
## Partition table:
##                       Df R.square Adj.R.square Testable
## [a+d+f+g] = X1         1  0.04499      0.00520     TRUE
## [b+d+e+g] = X2         2  0.11130      0.03402     TRUE
## [c+e+f+g] = X3         1  0.04485      0.00505     TRUE
## [a+b+d+e+f+g] = X1+X2  3  0.14631      0.02990     TRUE
## [a+c+d+e+f+g] = X1+X3  2  0.08216      0.00235     TRUE
## [b+c+d+e+f+g] = X2+X3  3  0.15853      0.04379     TRUE
## [a+b+c+d+e+f+g] = All  4  0.18198      0.02616     TRUE
## Individual fractions                                   
## [a] = X1 | X2+X3       1              -0.01763     TRUE
## [b] = X2 | X1+X3       2               0.02382     TRUE
## [c] = X3 | X1+X2       1              -0.00374     TRUE
## [d]                    0               0.01492    FALSE
## [e]                    0               0.00088    FALSE
## [f]                    0               0.01350    FALSE
## [g]                    0              -0.00560    FALSE
## [h] = Residuals                        0.97384    FALSE
## Controlling 1 table X                                  
## [a+d] = X1 | X3        1              -0.00271     TRUE
## [a+f] = X1 | X2        1              -0.00413     TRUE
## [b+d] = X2 | X3        2               0.03874     TRUE
## [b+e] = X2 | X1        2               0.02470     TRUE
## [c+e] = X3 | X1        1              -0.00285     TRUE
## [c+f] = X3 | X2        1               0.00976     TRUE
## ---
## Use function 'rda' to test significance of fractions of interest

2.8.3 Mid

physeqMid<- subset_samples(mpsCLR, altDepth == "Mid")

OTUsMid <- data.frame(t(otu_table(physeqMid)))
metaM <- metatable9[row.names(metatable9) %in% row.names(OTUsMid),]
Mid_10<- prcomp(OTUsMid)
plot(Mid_10, type="lines")

PCA_Mid<-ggbiplot(Mid_10, var.axes = FALSE, groups= metaM$Region_SKN, ellipse=FALSE) +theme_bw() +scale_color_manual(values=c(ap[4], ap[11], ap[2], ap[1])) +  theme(legend.position="bottom") +ggtitle("E. Mid")  +  
  geom_point(aes(colour=metaM$Region_SKN), size = 6) +geom_text(aes(label=metaM$Station),hjust=-0.2, vjust=0.1, size = 6) 
PCA_Mid

set.seed(1)
adonis2(vegdist(OTUsMid, method = "euclidean") ~ Region_SKN, data = metaM)

## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
## 
## adonis2(formula = vegdist(OTUsMid, method = "euclidean") ~ Region_SKN, data = metaM)
##            Df SumOfSqs      R2     F Pr(>F)   
## Region_SKN  2    10230 0.16536 1.783  0.003 **
## Residual   18    51634 0.83464                
## Total      20    61864 1.00000                
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

tst<-adonis_pairwise(x=metaM, dd=vegdist(OTUsMid, method = "euclidean"), group.var="Region_SKN")
tst$Adonis.tab

##   Comparison         R2        F   df     p p.adj
## 1     KG.NOT 0.12746450 1.753022 1;12 0.012 0.018
## 2     KG.SOT 0.20403858 2.307081  1;9 0.009 0.018
## 3    NOT.SOT 0.08623793 1.415652 1;15 0.038 0.038

RDA 0.8 is cutoff for multicollinearity Temp colinn with DO, CDOM, Nitrate, Silicate, Phosphate

cca_litdir = ordinate(physeqMid, formula  = ~ Temp + Salinity + TURB + NITRIT + NH4 , "RDA")
p0 = plot_ordination(physeqMid, cca_litdir, type = "samples", color = "Region_SKN") +theme_bw() +scale_color_manual(values=c(ap[4], ap[11], ap[2], ap[1]))+ theme(text = element_text( size = 16)) +  theme(legend.position="bottom") +ggtitle("C. Mid") + geom_point(size = 4) +geom_text(aes(label=Station),hjust=-0.3, vjust=0.1, size = 4, color = "black")
# Now add the environmental variables as arrows
arrowmat = vegan::scores(cca_litdir, display = "bp")
# Add labels, make a data.frame
arrowdf <- data.frame(labels = rownames(arrowmat), arrowmat)
# Define the arrow aesthetic mapping
arrow_map = aes(xend = 3* RDA1, yend = 3* RDA2, x = 0, y = 0, shape = NULL, color = NULL, 
    label = labels)
label_map = aes(x = 3.5 * RDA1, y = 3.5 * RDA2, shape = NULL, color = NULL, 
    label = labels)
# Make a new graphic
arrowhead = arrow(length = unit(0.02, "npc"))
MidRDA_plot = p0 + geom_segment(arrow_map, size = 0.5, data = arrowdf, color = "gray", 
    arrow = arrowhead) + geom_text(label_map, size = 4, data = arrowdf)
MidRDA_plot

MidRDA <- rda(OTUsMid ~ Temp + Salinity + TURB + NITRIT + NH4, data = metaM)
anova(MidRDA, perm = 1000)

## Permutation test for rda under reduced model
## Permutation: free
## Number of permutations: 999
## 
## Model: rda(formula = OTUsMid ~ Temp + Salinity + TURB + NITRIT + NH4, data = metaM)
##          Df Variance      F Pr(>F)   
## Model     5   968.31 1.3671  0.005 **
## Residual 15  2124.88                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

anova(MidRDA, by="term", perm=1000)

## Permutation test for rda under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
## 
## Model: rda(formula = OTUsMid ~ Temp + Salinity + TURB + NITRIT + NH4, data = metaM)
##          Df Variance      F Pr(>F)   
## Temp      1   134.16 0.9470  0.467   
## Salinity  1   256.16 1.8083  0.005 **
## TURB      1   171.17 1.2083  0.191   
## NITRIT    1   182.03 1.2850  0.183   
## NH4       1   224.79 1.5868  0.022 * 
## Residual 15  2124.88                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

mod<-varpart(OTUsMid, metaM["Region_SKN"], metaM["Salinity"], metaM["NH4"] )
mod

## 
## Partition of variance in RDA 
## 
## Call: varpart(Y = OTUsMid, X = metaM["Region_SKN"],
## metaM["Salinity"], metaM["NH4"])
## 
## Explanatory tables:
## X1:  metaM["Region_SKN"]
## X2:  metaM["Salinity"]
## X3:  metaM["NH4"] 
## 
## No. of explanatory tables: 3 
## Total variation (SS): 61864 
##             Variance: 3093.2 
## No. of observations: 21 
## 
## Partition table:
##                       Df R.square Adj.R.square Testable
## [a+d+f+g] = X1         2  0.16536      0.07262     TRUE
## [b+d+e+g] = X2         1  0.06127      0.01186     TRUE
## [c+e+f+g] = X3         1  0.07250      0.02368     TRUE
## [a+b+d+e+f+g] = X1+X2  3  0.21302      0.07414     TRUE
## [a+c+d+e+f+g] = X1+X3  3  0.21811      0.08013     TRUE
## [b+c+d+e+f+g] = X2+X3  2  0.12181      0.02423     TRUE
## [a+b+c+d+e+f+g] = All  4  0.25600      0.07000     TRUE
## Individual fractions                                   
## [a] = X1 | X2+X3       2               0.04576     TRUE
## [b] = X2 | X1+X3       1              -0.01013     TRUE
## [c] = X3 | X1+X2       1              -0.00414     TRUE
## [d]                    0               0.01068    FALSE
## [e]                    0               0.01165    FALSE
## [f]                    0               0.01651    FALSE
## [g]                    0              -0.00034    FALSE
## [h] = Residuals                        0.93000    FALSE
## Controlling 1 table X                                  
## [a+d] = X1 | X3        2               0.05645     TRUE
## [a+f] = X1 | X2        2               0.06228     TRUE
## [b+d] = X2 | X3        1               0.00056     TRUE
## [b+e] = X2 | X1        1               0.00152     TRUE
## [c+e] = X3 | X1        1               0.00751     TRUE
## [c+f] = X3 | X2        1               0.01237     TRUE
## ---
## Use function 'rda' to test significance of fractions of interest

2.8.4 Bottom

(except mid for station 3 and 18)

physeqBott<- subset_samples(mpsCLR, altDepth == "Bottom")
ordu = ordinate(physeqBott, "PCoA", "euclidean")
pBott<-plot_ordination(physeqBott, ordu, color="Vent")+theme_bw() +scale_color_manual(values=c(ap[4], ap[11], ap[2], ap[1]))+ geom_point(size=3) +ggtitle("Bottom") +theme(text = element_text(size=16)) + guides(color = FALSE) +geom_text(aes(label=Station),hjust=-0.2, vjust=0.1)
pBott

OTUsBot <- data.frame(t(otu_table(physeqBott)))
metaB <- metatable9[row.names(metatable9) %in% row.names(OTUsBot),]
Bot_10<- prcomp(OTUsBot)
plot(Bot_10, type="lines")

PCA_Bot<-ggbiplot(Bot_10, var.axes = FALSE, groups= metaB$Region_SKN, ellipse=FALSE) +theme_bw() +scale_color_manual(values=c(ap[4], ap[11], ap[2], ap[1])) +  theme(legend.position="bottom") +ggtitle("G. Bottom")  +  
  geom_point(aes(colour=metaB$Region_SKN, shape = metaB$Vent), size = 6) +geom_text(aes(label=metaB$Station),hjust=-0.2, vjust=0.1, size = 6) 
PCA_Bot

set.seed(1)
adonis2(vegdist(OTUsBot, method = "euclidean") ~ Region_SKN, data = metaB)

## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
## 
## adonis2(formula = vegdist(OTUsBot, method = "euclidean") ~ Region_SKN, data = metaB)
##            Df SumOfSqs      R2      F Pr(>F)    
## Region_SKN  2    12835 0.14019 1.9566  0.001 ***
## Residual   24    78718 0.85981                  
## Total      26    91553 1.00000                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

tst<-adonis_pairwise(x=metaB, dd=vegdist(OTUsBot, method = "euclidean"), group.var="Region_SKN")
tst$Adonis.tab

##   Comparison         R2        F   df     p  p.adj
## 1     KG.NOT 0.09434725 1.770991 1;17 0.001 0.0015
## 2     KG.SOT 0.13176166 1.972847 1;13 0.004 0.0040
## 3    NOT.SOT 0.10556144 2.124356 1;18 0.001 0.0015

Vent Testing for Bottom Water Bottom-water by Hydrothermal Activity

PERMANOVA

set.seed(1)
physeqBottom<- subset_samples(psCLR, Depth == "Bottom")
OTUsBottom <- data.frame(otu_table(physeqBottom))
metab <- metatable[row.names(metatable) %in% row.names(OTUsBottom),]
adonis2(vegdist(OTUsBottom, method = "euclidean") ~ Vent, data = metab)

## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
## 
## adonis2(formula = vegdist(OTUsBottom, method = "euclidean") ~ Vent, data = metab)
##          Df SumOfSqs      R2      F Pr(>F)    
## Vent      1     6878 0.05272 2.8382  0.001 ***
## Residual 51   123585 0.94728                  
## Total    52   130462 1.00000                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

cca_litdir = ordinate(physeqBott, formula  = ~ Temp + Salinity + DO + TURB + NITRIT+NH4 , "RDA")
p0 = plot_ordination(physeqBott, cca_litdir, type = "samples", color = "Region_SKN", shape = "Vent")+theme_bw() +scale_color_manual(values=c(ap[4], ap[11], ap[2], ap[1]))+ theme(text = element_text(size =16)) +  theme(legend.position="bottom") +ggtitle("D. Bottom") + geom_point(size = 4) +geom_text(aes(label=Station),hjust=-0.3, vjust=0.1, size = 4, color = "black") +scale_shape_manual(values=c(20, 17))
# Now add the environmental variables as arrows
arrowmat = vegan::scores(cca_litdir, display = "bp")
# Add labels, make a data.frame
arrowdf <- data.frame(labels = rownames(arrowmat), arrowmat)
# Define the arrow aesthetic mapping
arrow_map = aes(xend = 3* RDA1, yend = 3* RDA2, x = 0, y = 0, shape = NULL, color = NULL, 
    label = labels)
label_map = aes(x = 3.5 * RDA1, y = 3.5 * RDA2, shape = NULL, color = NULL, 
    label = labels)
# Make a new graphic
arrowhead = arrow(length = unit(0.02, "npc"))
BottRDA_plot = p0 + geom_segment(arrow_map, size = 0.5, data = arrowdf, color = "gray", 
    arrow = arrowhead) + geom_text(label_map, size = 4, data = arrowdf)
BottRDA_plot

BotRDA <- rda(OTUsBot ~ Temp + Salinity + DO + TURB + NITRIT+NH4, data = metaB)
anova(BotRDA, perm=500)

## Permutation test for rda under reduced model
## Permutation: free
## Number of permutations: 999
## 
## Model: rda(formula = OTUsBot ~ Temp + Salinity + DO + TURB + NITRIT + NH4, data = metaB)
##          Df Variance      F Pr(>F)    
## Model     6   1106.2 1.5267  0.001 ***
## Residual 20   2415.1                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

anova(BotRDA, by="term", perm=999)

## Permutation test for rda under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
## 
## Model: rda(formula = OTUsBot ~ Temp + Salinity + DO + TURB + NITRIT + NH4, data = metaB)
##          Df Variance      F Pr(>F)   
## Temp      1   184.89 1.5311  0.036 * 
## Salinity  1   121.51 1.0062  0.389   
## DO        1   170.18 1.4093  0.035 * 
## TURB      1   303.69 2.5149  0.002 **
## NITRIT    1   147.24 1.2194  0.154   
## NH4       1   178.66 1.4795  0.039 * 
## Residual 20  2415.09                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

RsquareAdj(BotRDA)$r.square

## [1] 0.3141403

mod<-varpart(OTUsBot, metaB["DO"], metaB["TURB"], metaB["Region_SKN"], metaB["NH4"] )
mod

## 
## Partition of variance in RDA 
## 
## Call: varpart(Y = OTUsBot, X = metaB["DO"], metaB["TURB"],
## metaB["Region_SKN"], metaB["NH4"])
## 
## Explanatory tables:
## X1:  metaB["DO"]
## X2:  metaB["TURB"]
## X3:  metaB["Region_SKN"]
## X4:  metaB["NH4"] 
## 
## No. of explanatory tables: 4 
## Total variation (SS): 91553 
##             Variance: 3521.3 
## No. of observations: 27 
## 
## Partition table:
##                             Df R.square Adj.R.square Testable
## [aeghklno] = X1              1  0.05461      0.01679     TRUE
## [befiklmo] = X2              1  0.06963      0.03241     TRUE
## [cfgjlmno] = X3              2  0.14019      0.06854     TRUE
## [dhijkmno] = X4              1  0.05206      0.01414     TRUE
## [abefghiklmno] = X1+X2       2  0.12957      0.05703     TRUE
## [acefghjklmno] = X1+X3       3  0.18373      0.07726     TRUE
## [adeghijklmno] = X1+X4       2  0.10596      0.03146     TRUE
## [bcefgijklmno] = X2+X3       3  0.20087      0.09663     TRUE
## [bdefhijklmno] = X2+X4       2  0.13193      0.05959     TRUE
## [cdfghijklmno] = X3+X4       3  0.18523      0.07896     TRUE
## [abcefghijklmno] = X1+X2+X3  4  0.24683      0.10989     TRUE
## [abdefghijklmno] = X1+X2+X4  3  0.19078      0.08523     TRUE
## [acdefghijklmno] = X1+X3+X4  4  0.22878      0.08856     TRUE
## [bcdefghijklmno] = X2+X3+X4  4  0.24914      0.11263     TRUE
## [abcdefghijklmno] = All      5  0.29556      0.12783     TRUE
## Individual fractions                                         
## [a] = X1 | X2+X3+X4          1               0.01521     TRUE
## [b] = X2 | X1+X3+X4          1               0.03927     TRUE
## [c] = X3 | X1+X2+X4          2               0.04261     TRUE
## [d] = X4 | X1+X2+X3          1               0.01794     TRUE
## [e]                          0              -0.00561    FALSE
## [f]                          0               0.01450    FALSE
## [g]                          0               0.01042    FALSE
## [h]                          0              -0.00195    FALSE
## [i]                          0              -0.00665    FALSE
## [j]                          0               0.01025    FALSE
## [k]                          0               0.00107    FALSE
## [l]                          0              -0.00271    FALSE
## [m]                          0              -0.00688    FALSE
## [n]                          0               0.00094    FALSE
## [o]                          0              -0.00059    FALSE
## [p] = Residuals              0               0.87217    FALSE
## Controlling 2 tables X                                       
## [ae] = X1 | X3+X4            1               0.00960     TRUE
## [ag] = X1 | X2+X4            1               0.02563     TRUE
## [ah] = X1 | X2+X3            1               0.01326     TRUE
## [be] = X2 | X3+X4            1               0.03367     TRUE
## [bf] = X2 | X1+X4            1               0.05377     TRUE
## [bi] = X2 | X1+X3            1               0.03263     TRUE
## [cf] = X3 | X1+X4            2               0.05710     TRUE
## [cg] = X3 | X2+X4            2               0.05303     TRUE
## [cj] = X3 | X1+X2            2               0.05286     TRUE
## [dh] = X4 | X2+X3            1               0.01599     TRUE
## [di] = X4 | X1+X3            1               0.01130     TRUE
## [dj] = X4 | X1+X2            1               0.02819     TRUE
## Controlling 1 table X                                        
## [aghn] = X1 | X2             1               0.02462     TRUE
## [aehk] = X1 | X3             1               0.00872     TRUE
## [aegl] = X1 | X4             1               0.01732     TRUE
## [bfim] = X2 | X1             1               0.04024     TRUE
## [beik] = X2 | X3             1               0.02809     TRUE
## [befl] = X2 | X4             1               0.04545     TRUE
## [cfjm] = X3 | X1             2               0.06047     TRUE
## [cgjn] = X3 | X2             2               0.06422     TRUE
## [cfgl] = X3 | X4             2               0.06482     TRUE
## [dijm] = X4 | X1             1               0.01467     TRUE
## [dhjn] = X4 | X2             1               0.02718     TRUE
## [dhik] = X4 | X3             1               0.01042     TRUE
## ---
## Use function 'rda' to test significance of fractions of interest

variance inflation factors:

vif.cca(BotRDA)

##       Temp   Salinity         DO       TURB     NITRIT        NH4 
## 153.138941   8.823227 152.480736   2.272523   4.752871   1.564100

“anything above 10 should be inspected or discarded” - from docs

Put the plots together:

ggarrange(surfRDA_plot, SCMRDA_plot, MidRDA_plot, BottRDA_plot, common.legend = TRUE, legend = "bottom")

#ggsave("RDA4Depths.eps", width = 8, height =8)

2.9 DESeq2 – Differential abundance testing

subset rawcount phyloseq object to just hold bottom samples:

bottomraw<- subset_samples(ps, Depth=="Bottom")

bottomraw<-subset_taxa(bottomraw, D2 != "Metazoa")
bottomraw<-subset_taxa(bottomraw, D1 != "")
bottomraw<-subset_taxa(bottomraw, D1 != "NA")
ddse <- phyloseq_to_deseq2(bottomraw, ~Vent)

## converting counts to integer mode

ddse2 <- DESeq(ddse, test="Wald", fitType="parametric")

res<- results(ddse2,  cooksCutoff = FALSE)
alpha = 0.01
sigtab = res[which(res$padj < alpha), ]
dim(sigtab)

## [1] 46  6

sigtab = cbind(as(sigtab, "data.frame"), as(tax_table(bottomraw)[rownames(sigtab), ], "matrix"))
sigtab = sigtab[sigtab$D2 != "Fungi",] # this fungus classification seems to be a mistake (based on original publication) and would otherwise only be classified as Eukaryota and, therefore, filtered from our data set at the very beginning of the analysis

x = tapply(sigtab$log2FoldChange, sigtab$D2, function(x) max(x))
x = sort(x, TRUE)
sigtab$D2 = factor(as.character(sigtab$D2), levels=names(x))
#D3 level
x = tapply(sigtab$log2FoldChange, sigtab$D3, function(x) max(x))
x = sort(x, TRUE)
sigtab$D3 = factor(as.character(sigtab$D3), levels=names(x))
# D4 level
x = tapply(sigtab$log2FoldChange, sigtab$D4, function(x) max(x))
x = sort(x, TRUE)
sigtab$D4 = factor(as.character(sigtab$D4), levels=names(x))
#plot F0
sigplot<- ggplot(sigtab, aes(x=D2, y=log2FoldChange, color = D3)) + geom_point(size=5, alpha = 0.7) + theme(legend.title=element_blank()) + theme_bw() +
   ggtitle(" ") +
  theme(axis.text.x = element_text(angle = -45, hjust = 0, vjust=0.75)) + scale_color_manual(values=ap) + theme(text = element_text(size=16))
noLegendsigplot <- sigplot +theme(legend.position = "none")
noLegendsigplot

#ggsave("sigplotnoLegend.pdf", height = 4, width = 6)

sigplot

#ggsave("sigplot.pdf")

3 CTD Plots

CTDnames <- c("scan", "depSM", "prDM", "t090C", "c0S/m", "t190C", "c1S/m", "oxRaw1", "oxRaw2", "CStarTr0", "CStarAt0", "v4", "flSP", "seaTurbMtr", "wetCDOM", "altM", "dz/dtM", "modError", "pumps", "sbeox0Mm", "sbeox1Mm", "nbin", "salPSU", "Density", "potemp090C", "salPSU2", "Density2", "potemp190C", "cStation")

CTD data

C01 <- read.table("CTDdata/C01M001.txt", col.names = CTDnames)
C01$cStation <- "C01"
C02 <- read.table("CTDdata/C02M001.txt", col.names = CTDnames)
C02$cStation <- "C02"
C03 <- read.table("CTDdata/C03M001.txt", col.names = CTDnames)
C03$cStation <- "C03"
C04 <- read.table("CTDdata/C04M001.txt", col.names = CTDnames)
C04$cStation <-"C04"
C05 <- read.table("CTDdata/C05M001.txt", col.names = CTDnames)
C05$cStation<- "C05"
C06 <- read.table("CTDdata/C06M001.txt", col.names = CTDnames)
C06$cStation <- "C06"
C08 <- read.table("CTDdata/C08M001.txt", col.names = CTDnames)
C08$cStation <- "C08"
C09 <- read.table("CTDdata/C09M001.txt", col.names = CTDnames)
C09$cStation <- "C09"
C10 <- read.table("CTDdata/C10M002.txt", col.names = CTDnames)
C10$cStation <- "C10"
C11 <- read.table("CTDdata/C11M001.txt", col.names = CTDnames)
C11$cStation <- "C11"
C12 <- read.table("CTDdata/C12M001.txt", col.names = CTDnames)
C12$cStation <- "C12"
C13 <- read.table("CTDdata/C13M001.txt", col.names = CTDnames)
C13$cStation <- "C13"
C14 <- read.table("CTDdata/C14M001.txt", col.names = CTDnames)
C14$cStation <- "C14"
C15 <- read.table("CTDdata/C15M001.txt", col.names = CTDnames)
C15$cStation <- "C15"
C16 <- read.table("CTDdata/C16M001.txt", col.names = CTDnames)
C16$cStation <- "C16"
C17 <- read.table("CTDdata/C17M001.txt", col.names = CTDnames)
C17$cStation <- "C17"
C18 <- read.table("CTDdata/C18M001.txt", col.names = CTDnames)
C18$cStation <- "C18"

ctdALL <- rbind(C01, C02, C03, C04, C05, C06, C08, C09, C10, C11, C12, C13, C14, C15, C16, C17, C18)

ctdALL_turb <- ctdALL[, c("depSM", "seaTurbMtr", "cStation")]
names(ctdALL_turb) <- c("Depth", "Value", "Station")
ctdALL_turb$var <- "Turbidity"

ctdALL_CDOM <- ctdALL[, c("depSM", "wetCDOM", "cStation")]
names(ctdALL_CDOM) <- c("Depth", "Value", "Station")
ctdALL_CDOM <- ctdALL_CDOM[ctdALL_CDOM$Depth >200,]
ctdALL_CDOM$var <- "CDOM"

ctd_turb_CDOM <- rbind(ctdALL_CDOM, ctdALL_turb)
ctd_turb_CDOM$Depth <- ctd_turb_CDOM$Depth * -1
  
stations <- list(C01, C02, C03, C04, C05, C06, C08, C09, C10, C11, C12, C13, C14, C15, C16, C17, C18)

Bottle Chem Data

nuts<- read.csv("CTDdata/MR1703C20171023071730.csv", stringsAsFactors = FALSE)
tCO2 <- read.csv("CTDdata/TCO2.csv", stringsAsFactors = FALSE)

nuts$CRUISE <- gsub("-", "", nuts$CRUISE)
nuts$CRUISE <- str_sub(nuts$CRUISE, 1, str_length(nuts$CRUISE)-1)
nuts$ID <- paste(nuts$CRUISE, nuts$STNNBR, "00", nuts$CASTNO, formatC(nuts$SAMPNO, width =2, flag = "0"), sep = "")
nuts$ID <- gsub(" ", "", nuts$ID)

depth <- nuts[, c("ID", "CTDDPT", "DEPTH", "STNNBR", "CASTNO")]

tCO2n <- merge(tCO2, depth, by = "ID" )

tCO2n$CTDDPT <- gsub("-999", "1", tCO2n$CTDDPT )

tCO2n <- tCO2n[tCO2n$TCARBN != "   -999.0", ]

tCO2n <- tCO2n[tCO2n$STNNBR != "   C01" & tCO2n$STNNBR != "   C16" ,]

tCO2n$TCARBN <- as.numeric(tCO2n$TCARBN)
tCO2n$CTDDPT<- as.numeric(tCO2n$CTDDPT) * -1

tCplot <- ggplot(tCO2n, aes(x= TCARBN, y=CTDDPT)) + geom_point()+
  facet_grid(STNNBR~. , scales = "free_y" ) +theme_bw() + theme(text = element_text(size =10)) +theme(axis.text.x = element_text(angle=60, hjust=1))

tCplot

ggsave("tcarbon.eps", width =5, height = 8)

NH4

nuts$CTDDPT <- gsub("-999", "1", nuts$CTDDPT )
nuts$CTDDPT<- as.numeric(nuts$CTDDPT) * -1

## Warning: NAs introduced by coercion

nh4 <- nuts[nuts$NH4 > 0,]
nh4 <- nh4[-c(1,2),]
nh4 <- nh4[nh4$CTDDPT < -100,]
nh4$STNNBR = gsub(" ", "",nh4$STNNBR )
nh4<- nh4[nh4$STNNBR != "C06",]

nh4$STNNBR <- factor(nh4$STNNBR , levels=c("C11", "C10", "C09", "C08", "C03", "C04", "C02", "C05", "C12", "C13", "C14", "C15", "C16", "C17", "C18"))

NH4plot <- ggplot(nh4, aes(x= NH4, y=CTDDPT)) + geom_point()+
  facet_grid(STNNBR~. , scales = "free_y" ) +theme_bw() + theme(text = element_text(size =10))+theme(axis.text.x = element_text(angle=60, hjust=1))

NH4plot

#ggsave("nh4plot.pdf", width = 5, height= 8)

Turbidity, CDOM

turb_CDOM_plot <- ggplot(ctd_turb_CDOM, aes(x= Value, y=Depth)) + geom_point()+
  facet_grid( Station~ var ,scales="free") +theme_bw()

turb_CDOM_plot

Salinity, Chla, showing station 13 is different

ctdALL_psu <- ctdALL[, c("depSM", "salPSU", "cStation")]
names(ctdALL_psu) <- c("Depth", "Value", "Station")
ctdALL_psu$var <- "Salinity"
ctdALL_psu <- ctdALL_psu[ctdALL_psu$Depth <500,]

ctdALL_fluor <- ctdALL[, c("depSM", "flSP", "cStation")]
names(ctdALL_fluor) <- c("Depth", "Value", "Station")
ctdALL_fluor <- ctdALL_fluor[ctdALL_fluor$Depth <500,]
ctdALL_fluor$var <- "Fluorescence"

ctd_psu_fluor <- rbind(ctdALL_fluor, ctdALL_psu)
ctd_psu_fluor$Depth <- ctd_psu_fluor$Depth * -1

ctd_psu_fluor<- ctd_psu_fluor[ctd_psu_fluor$Station %ni% c("C06","C16", "C01"),]

psu_fluor_plot <- ggplot(ctd_psu_fluor, aes(x= Value, y=Depth, color = Station)) + geom_point()+ facet_grid(~ var ,scales="free") +theme_bw() + scale_color_manual(values=ap) +  theme(text = element_text(size =12))

psu_fluor_plot

ggsave("psu_fluor_plot.eps" )

## Saving 8 x 6.5 in image

Nitrite/Nitrate/Phosphate plots showing elevation at station 13

nnp <- nuts[nuts$CTDDPT > -510,]
nnp <- nnp[-c(1,2),]

nitrit <- nnp[, c("STNNBR", "CTDDPT", "NITRIT")]
names(nitrit)<- c("Station", "Depth", "Value")
nitrit$Var <- "Nitrite"

nitrat<- nnp[, c("STNNBR", "CTDDPT", "NITRAT")]
names(nitrat)<- c("Station", "Depth", "Value")
nitrat$Var <- "Nitrate"

phspht <- nnp[, c("STNNBR", "CTDDPT", "PHSPHT")]
names(phspht)<- c("Station", "Depth", "Value")
phspht$Var <- "Phosphate"

nnp<- rbind(nitrit, nitrat, phspht)

nnp$Value <- as.numeric(nnp$Value)
nnp<- nnp[nnp$Value != -999,]

nnp$Station = gsub(" ", "",nnp$Station )
nnp<- nnp[nnp$Station %ni% c("C06","C16", "C01"),]
nnp<- nnp[nnp$Var != "NA",]

nnp_plot <- ggplot(nnp, aes(x= Value, y=Depth, color = Station)) + geom_point()+ facet_grid(~ Var ,scales="free") +theme_bw() + scale_color_manual(values=ap) +  theme(text = element_text(size =12))

nnp_plot

ggsave("nnp_plot.eps")

## Saving 8 x 6.5 in image

For Salinity transects and isosurface plots:

library(oce)

## Loading required package: testthat

## 
## Attaching package: 'testthat'

## The following objects are masked from 'package:magrittr':
## 
##     equals, is_less_than, not

## The following object is masked from 'package:dplyr':
## 
##     matches

## The following object is masked from 'package:purrr':
## 
##     is_null

## Loading required package: gsw

## 
## Attaching package: 'oce'

## The following object is masked from 'package:scales':
## 
##     rescale

## The following object is masked from 'package:marmap':
## 
##     plotProfile

library(marmap)
stn1<-read.ctd("CTDdata/C01M001.cnv")
stn2 <- read.ctd("CTDdata/C02M001.cnv")
stn3 <- read.ctd("CTDdata/C03M001.cnv")
stn4 <- read.ctd("CTDdata/C04M001.cnv")
stn5 <- read.ctd("CTDdata/C05M001.cnv")
stn6 <- read.ctd("CTDdata/C06M001.cnv")
stn8 <- read.ctd("CTDdata/C08M001.cnv")
stn9 <- read.ctd("CTDdata/C09M001.cnv")
stn10 <- read.ctd("CTDdata/C10M002.cnv")
stn11 <- read.ctd("CTDdata/C11M001.cnv")
stn12 <- read.ctd("CTDdata/C12M001.cnv")
stn13 <- read.ctd("CTDdata/C13M001.cnv")
stn14 <- read.ctd("CTDdata/C14M001.cnv")
stn15 <- read.ctd("CTDdata/C15M001.cnv")
stn16 <- read.ctd("CTDdata/C16M001.cnv")
stn17 <- read.ctd("CTDdata/C17M001.cnv")
stn18 <- read.ctd("CTDdata/C18M001.cnv")

ctd.list <- list(stn1,stn2, stn3, stn4, stn5, stn6, stn8, stn9, stn10, stn11, stn12, stn13, stn14, stn15, stn16, stn17, stn18)

section = list(stn5, stn2, stn4, stn3)%>%
  as.section()

## Warning in as.section(.): estimated waterDepth as max(pressure) for CTDs
## numbered 1:4

lon <- section[["longitude", "byStation"]]
lat <- section[["latitude", "byStation"]]
lon1 <- min(lon) - 0.5
lon2 <- max(lon) + 0.5
lat1 <- min(lat) - 0.5
lat2 <- max(lat) + 0.5

b <- getNOAA.bathy(lon1, lon2, lat1, lat2, resolution=1, keep=TRUE)

## File already exists ; loading 'marmap_coord_125.584166666667;24.9158333333333;128.000333333333;27.0015_res_1.csv'

plot(section, which="salinity", xtype = 'distance',showBottom=FALSE, ztype = 'image')
loni <- seq(min(lon), max(lon), 0.1)
lati <- approx(lon, lat, loni, rule=2)$y
dist <- rev(geodDist(loni, lati, alongPath=TRUE))
bottom <- get.depth(b, x=loni, y=lati, locator=FALSE)
polygon(c(dist, min(dist), max(dist)), c(-bottom$depth, 10000, 10000), col='grey')

section%>%plot(which = c("map", "temperature", "salinity", "oxygen"),
               ztype = "image", showStations = TRUE)

#polygon(c(dist, min(dist), max(dist)), c(-bottom$depth, 10000, 10000), col='grey')

section = list(stn11, stn10, stn9, stn8, stn5, stn12, stn14, stn13)%>%
  as.section()

## Warning in as.section(.): estimated waterDepth as max(pressure) for CTDs
## numbered 1:8

plot(section, which=c("map", "temperature", "salinity", "oxygen"), xtype = 'distance',showBottom=FALSE, ztype = 'image', showStations=TRUE)

#polygon(c(dist, min(dist), max(dist)), c(-bottom$depth, 10000, 10000), col='grey')

require(tidyverse)
require(lubridate)

## Loading required package: lubridate

## 
## Attaching package: 'lubridate'

## The following object is masked from 'package:plyr':
## 
##     here

## The following object is masked from 'package:reshape':
## 
##     stamp

## The following object is masked from 'package:IRanges':
## 
##     %within%

## The following objects are masked from 'package:S4Vectors':
## 
##     second, second<-

## The following object is masked from 'package:base':
## 
##     date

require(oce)
require(sf)

## Loading required package: sf

## Warning: package 'sf' was built under R version 3.5.2

## Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0

Isosurface plots

require(ocedata)

## Loading required package: ocedata

data("coastlineWorld")
data("coastlineWorldMedium")
data("coastlineWorldFine")

Meta <- read.csv("sampleMap.csv")

Bottom water Salinity Plot

surfaceB <- Meta[Meta$Depth =="Bottom",]
surfB<- surfaceB[!duplicated(surfaceB$cStation),]

x=surfB$Long
y=surfB$Lat
z=surfB$Temp
s=surfB$Salinity

interp.temp = interpBarnes(x = x, 
                           y = y, 
                           z = z, 
                           xg = pretty(x, n = 50) , 
                           yg = pretty(y, n = 50))
interp.sal = interpBarnes(x = x, 
                           y = y, 
                           z = s, 
                           xg = pretty(x, n = 100) , 
                           yg = pretty(y, n = 100))

#pdf("isosurface_Bottom_Salinity.pdf", 7, 6)
lonlim <- c(123, 130.9)
latlim <- c(25.2, 28.4)
par(fig=c(0,0.95,0,.9))
## make a base
mapPlot(coastlineWorldFine, projection="+proj=longlat",
        col="lightgray", 
        longitudelim=lonlim, 
        latitudelim=latlim,
        clip = TRUE)
## overlay gridded sst layer
mapImage(interp.sal$xg, interp.sal$yg, interp.sal$zg, filledContour = TRUE, 
       col = oce.colors9A(120), zlim = c(34.3,34.6))
## add polygon
mapPolygon(coastlineWorldFine, 
        col="lightgray")
## ## add points
mapPoints(longitude = surfB$Long,
          latitude = surfB$Lat, 
          pch = 20, 
          cex = 1.75)
## add station name on the map
mapText(longitude = surfB$Long+0.16,
          latitude = surfB$Lat,
        labels = surfB$Station)
par(fig=c(0.4,1,0.0,.9), new = TRUE)
drawPalette(zlim = c(34.35,34.65), col = oce.colors9A(120))
#dev.off()

4 Session Info

sessionInfo()

## R version 3.5.0 (2018-04-23)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS High Sierra 10.13.6
## 
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
##  [1] grid      stats4    parallel  stats     graphics  grDevices utils    
##  [8] datasets  methods   base     
## 
## other attached packages:
##  [1] ocedata_0.1.5               sf_0.8-0                   
##  [3] lubridate_1.7.4             oce_1.1-1                  
##  [5] gsw_1.0-5                   testthat_2.2.1             
##  [7] ggpubr_0.2.1                magrittr_1.5               
##  [9] ggbiplot_0.55               scales_1.0.0               
## [11] plyr_1.8.4                  CoDaSeq_0.99.2             
## [13] car_3.0-3                   carData_3.0-2              
## [15] zCompositions_1.3.2-1       truncnorm_1.0-8            
## [17] NADA_1.6-1                  survival_2.44-1.1          
## [19] MASS_7.3-51.4               ALDEx2_1.14.1              
## [21] breakaway_4.6.11            metagMisc_0.0.4            
## [23] gridExtra_2.3               wesanderson_0.3.6.9000     
## [25] reshape_0.8.8               DESeq2_1.22.2              
## [27] SummarizedExperiment_1.12.0 DelayedArray_0.8.0         
## [29] BiocParallel_1.16.6         matrixStats_0.54.0         
## [31] Biobase_2.42.0              GenomicRanges_1.34.0       
## [33] GenomeInfoDb_1.18.2         IRanges_2.16.0             
## [35] S4Vectors_0.20.1            BiocGenerics_0.28.0        
## [37] RColorBrewer_1.1-2          vegan_2.5-5                
## [39] lattice_0.20-38             permute_0.9-5              
## [41] phyloseq_1.26.1             qiime2R_0.99.1             
## [43] forcats_0.4.0               stringr_1.4.0              
## [45] dplyr_0.8.3                 purrr_0.3.2                
## [47] readr_1.3.1                 tidyr_0.8.3                
## [49] tibble_2.1.3                ggplot2_3.2.1              
## [51] tidyverse_1.2.1             marmap_1.0.3               
## 
## loaded via a namespace (and not attached):
##   [1] readxl_1.3.1           backports_1.1.4        Hmisc_4.2-0           
##   [4] igraph_1.2.4.1         lazyeval_0.2.2         sp_1.3-1              
##   [7] splines_3.5.0          digest_0.6.20          foreach_1.4.7         
##  [10] htmltools_0.3.6        checkmate_1.9.4        memoise_1.1.0         
##  [13] cluster_2.1.0          openxlsx_4.1.0.1       Biostrings_2.50.2     
##  [16] annotate_1.60.1        modelr_0.1.4           colorspace_1.4-1      
##  [19] blob_1.2.0             rvest_0.3.4            rgdal_1.4-4           
##  [22] haven_2.1.1            xfun_0.8               crayon_1.3.4          
##  [25] RCurl_1.95-4.12        jsonlite_1.6           genefilter_1.64.0     
##  [28] zeallot_0.1.0          iterators_1.0.12       ape_5.3               
##  [31] glue_1.3.1             gtable_0.3.0           zlibbioc_1.28.0       
##  [34] XVector_0.22.0         Rhdf5lib_1.4.3         shape_1.4.4           
##  [37] abind_1.4-5            DBI_1.0.0              Rcpp_1.0.2            
##  [40] xtable_1.8-4           htmlTable_1.13.1       units_0.6-3           
##  [43] foreign_0.8-71         bit_1.1-14             Formula_1.2-3         
##  [46] htmlwidgets_1.3        httr_1.4.0             acepack_1.4.1         
##  [49] pkgconfig_2.0.2        XML_3.98-1.20          nnet_7.3-12           
##  [52] locfit_1.5-9.1         tidyselect_0.2.5       rlang_0.4.0           
##  [55] reshape2_1.4.3         AnnotationDbi_1.44.0   munsell_0.5.0         
##  [58] cellranger_1.1.0       tools_3.5.0            cli_1.1.0             
##  [61] generics_0.0.2         RSQLite_2.1.1          ade4_1.7-13           
##  [64] broom_0.5.2            evaluate_0.14          biomformat_1.10.1     
##  [67] yaml_2.2.0             knitr_1.23             bit64_0.9-7           
##  [70] zip_2.0.3              ncdf4_1.16.1           nlme_3.1-140          
##  [73] xml2_1.2.2             compiler_3.5.0         rstudioapi_0.10       
##  [76] curl_3.3               e1071_1.7-2            ggsignif_0.5.0        
##  [79] geneplotter_1.60.0     stringi_1.4.3          Matrix_1.2-17         
##  [82] classInt_0.4-1         multtest_2.38.0        vctrs_0.2.0           
##  [85] pillar_1.4.2           data.table_1.12.2      bitops_1.0-6          
##  [88] raster_2.9-23          R6_2.4.0               latticeExtra_0.6-28   
##  [91] KernSmooth_2.23-15     rio_0.5.16             codetools_0.2-16      
##  [94] assertthat_0.2.1       rhdf5_2.26.2           withr_2.1.2           
##  [97] GenomeInfoDbData_1.2.0 adehabitatMA_0.3.13    mgcv_1.8-28           
## [100] hms_0.5.0              rpart_4.1-15           class_7.3-15          
## [103] rmarkdown_1.13         base64enc_0.1-3

Protist diversity and biogeography in the Okinawa Trough

Maggi Brisbin

4/16/2019