This script can be used to analyse biodiversity data on alien and native species occurrences and to evaluate the incidence of alien species across environmental domains and EUNIS habitats.

Loading libraries needed for analyses

In [3]:
library(BiodiversityR)
In [4]:
library(ggplot2)
In [5]:
library(reshape2)
In [6]:
library(cowplot)
In [7]:
library(plyr)
In [9]:
library(dplyr)
In [10]:
library(ggrepel)
In [11]:
library(ggpubr)
Attaching package: ‘ggpubr’


The following object is masked from ‘package:plyr’:

    mutate


The following object is masked from ‘package:cowplot’:

    get_legend


Verify active working directory

In [12]:
getwd()
'/home/jovyan/work'

Open and read the CSV files in the input folder. There are five input datasets with occurrence information on alien and native species in Italy.

In [ ]:
freshwater = read.csv2(file ='/home/jovyan/work/input/Dataset_Biodiversity_Freshwaters_LifeWatch_2015.csv', header = TRUE, stringsAsFactors = T)
In [ ]:
head(freshwater)
A data.frame: 6 × 18
| | catalognumber | eventdate | locality | decimallatitude | decimallongitude | namepublishedinyear | kingdom | phylum | class | order | family | genus | providedscientificname | scientificname | scientificnameauthorship | eunishabitatstypecode | alien | eunisspeciesgroups |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | <int> | <fct> | <fct> | <fct> | <fct> | <int> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <int> | <fct> |
| 1 | 33031 | 2011 | lago di avigliana piccolo | 45.054 | 7.392 | NA | Animalia | Arthropoda | Insecta | Diptera | Chironomidae | Ablabesmyia | Ablabesmyia longistyla | Ablabesmyia (Ablabesmyia) longistyla | fittkau, 1962 | c1.1 | 0 | Invertebrates |
| 2 | 34288 | 2011 | lago sirio | 45.485 | 7.885 | NA | Animalia | Arthropoda | Insecta | Diptera | Chironomidae | Ablabesmyia | Ablabesmyia longistyla | Ablabesmyia (Ablabesmyia) longistyla | fittkau, 1962 | c1.2 | 0 | Invertebrates |
| 3 | 33120 | 1999/2011 | lago di candia | 45.324 | 7.912 | NA | Animalia | Arthropoda | Insecta | Diptera | Chironomidae | Ablabesmyia | Ablabesmyia longistyla | Ablabesmyia (Ablabesmyia) longistyla | fittkau, 1962 | c1.2 | 0 | Invertebrates |
| 4 | 33316 | 2011 | lago di mergozzo | 45.956 | 8.463 | NA | Animalia | Arthropoda | Insecta | Diptera | Chironomidae | Ablabesmyia | Ablabesmyia longistyla | Ablabesmyia (Ablabesmyia) longistyla | fittkau, 1962 | c1.1 | 0 | Invertebrates |
| 5 | 5727 | 1998/2011 | lago maggiore | 45.955 | 8.634 | NA | Animalia | Arthropoda | Insecta | Diptera | Chironomidae | Ablabesmyia | Ablabesmyia longistyla | Ablabesmyia (Ablabesmyia) longistyla | fittkau, 1962 | c1.1 | 0 | Invertebrates |
| 6 | 33032 | 2011 | lago di avigliana piccolo | 45.054 | 7.392 | NA | Animalia | Arthropoda | Insecta | Diptera | Chironomidae | Ablabesmyia | Ablabesmyia monilis | Ablabesmyia (Ablabesmyia) monilis | (linnaeus, 1758) | c1.1 | 0 | Invertebrates |
In [ ]:
marine = read.csv(file ='/home/jovyan/work/input/Dataset_Biodiversity_Marine_Lifewatch_2015.csv', header = TRUE, stringsAsFactors = T)
head(marine)
A data.frame: 6 × 18
| | catalognumber | eventdate | waterbody | locality | decimallatitude | decimallongitude | namepublishedinyear | phylum | class | order | family | genus | providedscientificname | scientificname | scientificnameauthorship | eunishabitatstypecode | alien | eunisspeciesgroups |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | <int> | <fct> | <fct> | <fct> | <dbl> | <dbl> | <int> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <int> | <fct> |
| 1 | 5384 | 2011 | Eastern-Central Tyrrhenian | Golfo di Napoli - circalittoral rocks | 40.70981 | 14.26501 | NA | Mollusca | Gastropoda | | Cerithiopsidae | Cerithiopsis | Cerithiopsis leucophrys | Cerithiopsis leucophrys | | A4 | 0 | Invertebrates |
| 2 | 5385 | 2011 | Eastern-Central Tyrrhenian | Golfo di Napoli - circalittoral rocks | 40.70981 | 14.26501 | NA | Mollusca | Gastropoda | | Cerithiopsidae | Cerithiopsis | Cerithiopsis nicephorus | Cerithiopsis nicephorus | | A4 | 0 | Invertebrates |
| 3 | 5383 | 2011 | Eastern-Central Tyrrhenian | Golfo di Napoli - circalittoral rocks | 40.70981 | 14.26501 | NA | Mollusca | Gastropoda | Neogastropoda | Mangeliidae | Brachycythara | Brachycythara zenetouae | Brachycythara zenetouae | | A4 | 0 | Invertebrates |
| 4 | 5387 | 2011 | Eastern-Central Tyrrhenian | Golfo di Napoli - circalittoral rocks | 40.70981 | 14.26501 | NA | Mollusca | Gastropoda | Neogastropoda | Clathurellidae | Clathurella | Clathurella gracilis | Clathurella gracilis | | A4 | 0 | Invertebrates |
| 5 | 4039 | 2001 | Eastern-Central Tyrrhenian | Banco Santa Croce - infralittoral | 40.66900 | 14.41900 | NA | Cnidaria | Hydrozoa | Anthoathecata | Cordylophoridae | Cordylophora | Cordylophora neapolitana | Cordylophora neapolitana | | A3 | 0 | Invertebrates |
| 6 | 8164 | 2010/2011 | Northern Ionian | Santa Maria di Leuca - deep sea beds | 39.61000 | 18.46500 | NA | Cnidaria | Anthozoa | Scleractinia | Caryophylliidae | Kadophellia | Kadophellia bathyalis | Kadophellia bathyalis | | A6 | 0 | Invertebrates |
In [ ]:
transitional = read.csv(file ='/home/jovyan/work/input/Dataset_Biodiversity_TransitionalWaters_LifeWatch_2015.csv', header = TRUE, stringsAsFactors = T)
head(transitional)
A data.frame: 6 × 18
| | catalognumber | eventdate | locality | decimallatitude | decimallongitude | namepublishedinyear | kingdom | phylum | class | order | family | genus | providedscientificname | scientificname | scientificnameauthorship | eunishabitatstypecode | alien | eunisspeciesgroups |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | <int> | <fct> | <fct> | <dbl> | <dbl> | <int> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <int> | <fct> |
| 1 | 1 | 2006/2007 | Acquatina | 40.440 | 18.240 | NA | Plantae | Chlorophyta | Ulvophyceae | Dasycladales | Polyphysaceae | Acetabularia | Acetabularia acetabulum | Acetabularia acetabulum | (Linnaeus) P.C.Silva, 1952 | X03 | 0 | Algae |
| 2 | 2 | 2006/2007 | Acquatina | 40.440 | 18.240 | NA | Chromista | Ochrophyta | Phaeophyceae | Dictyotales | Dictyotaceae | Dictyota | Dictyota dichotoma var. intricata | Dictyota dichotoma var. intricata | (C.Agardh) Greville, 1830 | X03 | 0 | Algae |
| 3 | 3 | 1998/2010 | Acquatina | 40.440 | 18.240 | NA | Chromista | Ochrophyta | Bacillariophyceae | Licmophorales | Licmophoraceae | Licmosphenia | Licmosphenia schmidtii | Licmosphenia schmidtii | Mereschkowsky, 1901 | X03 | 0 | Algae |
| 4 | 27 | 2010/2011 | Isola e lago di Varano | 41.879 | 15.747 | NA | Plantae | Rhodophyta | Florideophyceae | Ceramiales | Callithamniaceae | Aglaothamnion | Aglaothamnion tenuissimum var. tenuissimum | Aglaothamnion tenuissimum var. tenuissimum | | X03 | 0 | Algae |
| 5 | 28 | 2010/2011 | Isola e lago di Varano | 41.879 | 15.747 | NA | Plantae | Rhodophyta | Florideophyceae | Ceramiales | Ceramiaceae | Ceramium | Ceramium ciliatum var. ciliatum | Ceramium ciliatum var. ciliatum | | X03 | 0 | Algae |
| 6 | 29 | 2010/2011 | Isola e lago di Varano | 41.879 | 15.747 | NA | Plantae | Rhodophyta | Florideophyceae | Ceramiales | Ceramiaceae | Ceramium | Ceramium siliquosum var. siliquosum | Ceramium siliquosum var. siliquosum | | X03 | 0 | Algae |
In [ ]:
Aterrestrial = read.csv2(file ='/home/jovyan/work/input/Dataset_Biodiversity_Terrestrial_Habitats_LifeWatch_2015.csv', header = TRUE, stringsAsFactors = T)
head(Aterrestrial)
A data.frame: 6 × 20
| | eventdate | locality | decimallatitude | decimallongitude | catalognumber | kingdom | phylum | class | family | order | genus | subgenus | specificepithet | infraspecificepithet | providedscientificname | scientificname | scientificnameauthorship | eunishabitatstypecode | alien | eunisspeciesgroups |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | <fct> | <fct> | <fct> | <fct> | <int> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <lgl> | <fct> | <fct> | <fct> | <fct> | <int> | <fct> |
| 1 | 1951/2009 | bosco fontana | 45.2010994 | 10.74440002 | 1 | Animalia | Arthropoda | Insecta | Carabidae | Coleoptera | Abax | | | NA | Abax (Abax) continuus | Abax (Abax) continuus | ganglbauer, 1891 | g1 | 0 | Invertebrates |
| 2 | 2011/2012 | pnfc - camaldoli - mixed woodland | 43.83888 | 11.80378 | 2 | Animalia | Arthropoda | Insecta | Carabidae | Coleoptera | Abax | | | NA | Abax (Abax) parallelepipedus curtulus | Abax (Abax) parallelepipedus curtulus | (fairmaire, 1856) | g4 | 0 | Invertebrates |
| 3 | 2009 | sasso fratino - mixed woodland | 43.843039 | 11.80425 | 3 | Animalia | Arthropoda | Insecta | Carabidae | Coleoptera | Abax | | | NA | Abax (Abax) parallelepipedus curtulus | Abax (Abax) parallelepipedus curtulus | (fairmaire, 1856) | g4 | 0 | Invertebrates |
| 4 | 2010 | monte rufeno | 42.7928009 | 11.88969994 | 4 | Animalia | Arthropoda | Insecta | Melandryidae | Coleoptera | Abdera | | | NA | Abdera (Abdera) bifasciata | Abdera (Abdera) bifasciata | (marsham, 1802) | g1.7 | 0 | Invertebrates |
| 5 | 2010 | monte rufeno | 42.7928009 | 11.88969994 | 5 | Animalia | Arthropoda | Insecta | Melandryidae | Coleoptera | Abdera | | | NA | Abdera (Abdera) quadrifasciata | Abdera (Abdera) quadrifasciata | (curtis, 1829) | g1.7 | 0 | Invertebrates |
| 6 | 2009 | sasso fratino - mixed woodland | 43.843039 | 11.80425 | 6 | Animalia | Arthropoda | Insecta | Histeridae | Coleoptera | Abraeus | | | NA | Abraeus (Abraeus) perpusillus | Abraeus (Abraeus) perpusillus | (marsham, 1802) | g4 | 0 | Invertebrates |
In [ ]:
PFterrestrial = read.csv2(file ='/home/jovyan/work/input/Dataset_Biodiversity_Terrestrial_Plants_and_Fungi_LifeWatch_2015.csv', header = TRUE, stringsAsFactors = T)
head(PFterrestrial)
A data.frame: 6 × 19
| | eventdate | startdayofyear | enddayofyear | locality | decimallatitude | decimallongitude | identificationid | namepublishedinyear | phylum | class | family | order | genus | providedscientificname | scientificname | scientificnameauthorship | eunishabitatstypecode | alien | eunisspeciesgroups |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <int> | <lgl> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <int> | <fct> |
| 1 | 1978/2003 | 01/01/1978 | 31/12/2003 | bosco fontana | 45.2010994 | 10.74440002 | 1426 | NA | Tracheophyta | Magnoliopsida | Ranunculaceae | Ranunculales | Ranunculus | Ranunculus serpens ssp. nemorosus | Ranunculus polyanthemos subsp. serpens | (Schrank) Baltisb. | g1 | 0 | Flowering plants |
| 2 | 2011 | 01/01/2011 | 31/12/2011 | pnfc - camaldoli - mixed woodland | 43.83888 | 11.80378 | 2011 | NA | Tracheophyta | Magnoliopsida | Ranunculaceae | Ranunculales | Ranunculus | Ranunculus serpens ssp. nemorosus | Ranunculus polyanthemos subsp. serpens | (Schrank) Baltisb. | g4 | 0 | Flowering plants |
| 3 | 1999/2011 | 01/01/1999 | 31/12/2011 | val masino | 46.23778 | 9.5544 | 868 | NA | Tracheophyta | Pinopsida | Pinaceae | Pinales | Abies | Abies alba | Abies alba | mill. | g3 | 0 | Conifers |
| 4 | 2011 | 01/01/2011 | 31/12/2011 | pnfc - camaldoli - mixed woodland | 43.83888 | 11.80378 | 841 | NA | Tracheophyta | Pinopsida | Pinaceae | Pinales | Abies | Abies alba | Abies alba | mill. | g4 | 0 | Conifers |
| 5 | 2009 | 01/01/2009 | 31/12/2009 | sasso fratino - mixed woodland | 43.843039 | 11.80425 | 860 | NA | Tracheophyta | Pinopsida | Pinaceae | Pinales | Abies | Abies alba | Abies alba | mill. | g4 | 0 | Conifers |
| 6 | 1999/2011 | 01/01/1999 | 31/12/2011 | tarvisio | 46.5253277 | 13.59661 | 866 | NA | Tracheophyta | Pinopsida | Pinaceae | Pinales | Abies | Abies alba | Abies alba | mill. | g3 | 0 | Conifers |

Data wrangling and data cleaning

Merge the five datasets and work on a single data frame. First, add a column to each original dataset to record its provenance.

In [ ]:
freshwater$dataset <- c("Freshwater")
In [ ]:
marine$dataset <- c("Marine")
In [ ]:
transitional$dataset <- c("Transitional")
In [ ]:
Aterrestrial$dataset <- c("Terrestrial_A")
In [ ]:
PFterrestrial$dataset <- c("Terrestrial_PF")
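
The same labelling can be written once for all tables by looping over a named list. The following is only a minimal sketch on toy data: the `_toy` data frames and the `datasets` list are illustrative assumptions and do not appear in the original script.

```r
# Toy stand-ins for two of the input tables (hypothetical values)
freshwater_toy <- data.frame(scientificname = c("Salmo trutta", "Gobio gobio"))
marine_toy     <- data.frame(scientificname = "Ebria tripartita")

# Named list: the list names become the provenance labels
datasets <- list(Freshwater = freshwater_toy, Marine = marine_toy)

# Add a 'dataset' column holding the list name to every table
datasets <- Map(function(df, label) {
  df$dataset <- label   # the single label is recycled over all rows
  df
}, datasets, names(datasets))

str(datasets$Freshwater)
```

Keeping the tables in one list also makes later steps (column selection, `rbind`) easy to apply with `lapply` and `do.call`.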

Check column names in each dataset and keep only columns needed for analyses

In [ ]:
names(freshwater)
names(marine) 
names(transitional) 
names(Aterrestrial) 
names(PFterrestrial)
  1. 'catalognumber'
  2. 'eventdate'
  3. 'locality'
  4. 'decimallatitude'
  5. 'decimallongitude'
  6. 'namepublishedinyear'
  7. 'kingdom'
  8. 'phylum'
  9. 'class'
  10. 'order'
  11. 'family'
  12. 'genus'
  13. 'providedscientificname'
  14. 'scientificname'
  15. 'scientificnameauthorship'
  16. 'eunishabitatstypecode'
  17. 'alien'
  18. 'eunisspeciesgroups'
  19. 'dataset'
  1. 'catalognumber'
  2. 'eventdate'
  3. 'waterbody'
  4. 'locality'
  5. 'decimallatitude'
  6. 'decimallongitude'
  7. 'namepublishedinyear'
  8. 'phylum'
  9. 'class'
  10. 'order'
  11. 'family'
  12. 'genus'
  13. 'providedscientificname'
  14. 'scientificname'
  15. 'scientificnameauthorship'
  16. 'eunishabitatstypecode'
  17. 'alien'
  18. 'eunisspeciesgroups'
  19. 'dataset'
  1. 'catalognumber'
  2. 'eventdate'
  3. 'locality'
  4. 'decimallatitude'
  5. 'decimallongitude'
  6. 'namepublishedinyear'
  7. 'kingdom'
  8. 'phylum'
  9. 'class'
  10. 'order'
  11. 'family'
  12. 'genus'
  13. 'providedscientificname'
  14. 'scientificname'
  15. 'scientificnameauthorship'
  16. 'eunishabitatstypecode'
  17. 'alien'
  18. 'eunisspeciesgroups'
  19. 'dataset'
  1. 'eventdate'
  2. 'locality'
  3. 'decimallatitude'
  4. 'decimallongitude'
  5. 'catalognumber'
  6. 'kingdom'
  7. 'phylum'
  8. 'class'
  9. 'family'
  10. 'order'
  11. 'genus'
  12. 'subgenus'
  13. 'specificepithet'
  14. 'infraspecificepithet'
  15. 'providedscientificname'
  16. 'scientificname'
  17. 'scientificnameauthorship'
  18. 'eunishabitatstypecode'
  19. 'alien'
  20. 'eunisspeciesgroups'
  21. 'dataset'
  1. 'eventdate'
  2. 'startdayofyear'
  3. 'enddayofyear'
  4. 'locality'
  5. 'decimallatitude'
  6. 'decimallongitude'
  7. 'identificationid'
  8. 'namepublishedinyear'
  9. 'phylum'
  10. 'class'
  11. 'family'
  12. 'order'
  13. 'genus'
  14. 'providedscientificname'
  15. 'scientificname'
  16. 'scientificnameauthorship'
  17. 'eunishabitatstypecode'
  18. 'alien'
  19. 'eunisspeciesgroups'
  20. 'dataset'
In [ ]:
#Select the following common variables: eventdate, decimallatitude, decimallongitude, phylum, class, order, family, genus, 
#scientificname, scientificnameauthorship, eunishabitatstypecode, alien, eunisspeciesgroups and dataset 
BiotopeVars <- c("eventdate", "decimallatitude", "decimallongitude", "phylum", "class", "order", "family",
                 "genus", "scientificname", "scientificnameauthorship", "eunishabitatstypecode", "alien", 
                 "eunisspeciesgroups", "dataset")
In [ ]:
freshwater_var <- freshwater[BiotopeVars]
marine_var <- marine[BiotopeVars]
transitional_var <- transitional[BiotopeVars]
Aterrestrial_var <- Aterrestrial[BiotopeVars]
PFterrestrial_var <- PFterrestrial[BiotopeVars]
In [ ]:
#Check if selection worked correctly
names(freshwater_var)
names(marine_var)
names(transitional_var)
names(Aterrestrial_var)
names(PFterrestrial_var)
  1. 'eventdate'
  2. 'decimallatitude'
  3. 'decimallongitude'
  4. 'phylum'
  5. 'class'
  6. 'order'
  7. 'family'
  8. 'genus'
  9. 'scientificname'
  10. 'scientificnameauthorship'
  11. 'eunishabitatstypecode'
  12. 'alien'
  13. 'eunisspeciesgroups'
  14. 'dataset'
  1. 'eventdate'
  2. 'decimallatitude'
  3. 'decimallongitude'
  4. 'phylum'
  5. 'class'
  6. 'order'
  7. 'family'
  8. 'genus'
  9. 'scientificname'
  10. 'scientificnameauthorship'
  11. 'eunishabitatstypecode'
  12. 'alien'
  13. 'eunisspeciesgroups'
  14. 'dataset'
  1. 'eventdate'
  2. 'decimallatitude'
  3. 'decimallongitude'
  4. 'phylum'
  5. 'class'
  6. 'order'
  7. 'family'
  8. 'genus'
  9. 'scientificname'
  10. 'scientificnameauthorship'
  11. 'eunishabitatstypecode'
  12. 'alien'
  13. 'eunisspeciesgroups'
  14. 'dataset'
  1. 'eventdate'
  2. 'decimallatitude'
  3. 'decimallongitude'
  4. 'phylum'
  5. 'class'
  6. 'order'
  7. 'family'
  8. 'genus'
  9. 'scientificname'
  10. 'scientificnameauthorship'
  11. 'eunishabitatstypecode'
  12. 'alien'
  13. 'eunisspeciesgroups'
  14. 'dataset'
  1. 'eventdate'
  2. 'decimallatitude'
  3. 'decimallongitude'
  4. 'phylum'
  5. 'class'
  6. 'order'
  7. 'family'
  8. 'genus'
  9. 'scientificname'
  10. 'scientificnameauthorship'
  11. 'eunishabitatstypecode'
  12. 'alien'
  13. 'eunisspeciesgroups'
  14. 'dataset'
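
Comparing the five name lists by eye is error-prone. As a hedged sketch, the check can be automated with `intersect` and `setdiff`; the toy tables `a` and `b` and the vector `wanted` below are assumptions made for illustration only.

```r
# Toy tables whose columns only partly overlap (hypothetical)
a <- data.frame(eventdate = "2011", phylum = "Mollusca", waterbody = "Tyrrhenian")
b <- data.frame(eventdate = "2009", phylum = "Chordata", kingdom = "Animalia")

# Columns we would like to keep in every table
wanted <- c("eventdate", "phylum")

# Columns present in every table
common <- Reduce(intersect, lapply(list(a, b), names))

# Wanted columns that are missing from at least one table
missing_cols <- setdiff(wanted, common)

# Subset only when nothing is missing, so that rbind() will not fail later
stopifnot(length(missing_cols) == 0)
a_var <- a[wanted]
b_var <- b[wanted]
identical(names(a_var), names(b_var))
```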
In [ ]:
#Merge datasets into a single data frame
fullDB <- rbind(marine_var, transitional_var, freshwater_var, Aterrestrial_var, PFterrestrial_var)
dim(fullDB)
head(fullDB)
  1. 32329
  2. 14
A data.frame: 6 × 14
| | eventdate | decimallatitude | decimallongitude | phylum | class | order | family | genus | scientificname | scientificnameauthorship | eunishabitatstypecode | alien | eunisspeciesgroups | dataset |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | <fct> | <chr> | <chr> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <int> | <fct> | <chr> |
| 1 | 2011 | 40.70981 | 14.26501 | Mollusca | Gastropoda | | Cerithiopsidae | Cerithiopsis | Cerithiopsis leucophrys | | A4 | 0 | Invertebrates | Marine |
| 2 | 2011 | 40.70981 | 14.26501 | Mollusca | Gastropoda | | Cerithiopsidae | Cerithiopsis | Cerithiopsis nicephorus | | A4 | 0 | Invertebrates | Marine |
| 3 | 2011 | 40.70981 | 14.26501 | Mollusca | Gastropoda | Neogastropoda | Mangeliidae | Brachycythara | Brachycythara zenetouae | | A4 | 0 | Invertebrates | Marine |
| 4 | 2011 | 40.70981 | 14.26501 | Mollusca | Gastropoda | Neogastropoda | Clathurellidae | Clathurella | Clathurella gracilis | | A4 | 0 | Invertebrates | Marine |
| 5 | 2001 | 40.669 | 14.419 | Cnidaria | Hydrozoa | Anthoathecata | Cordylophoridae | Cordylophora | Cordylophora neapolitana | | A3 | 0 | Invertebrates | Marine |
| 6 | 2010/2011 | 39.61 | 18.465 | Cnidaria | Anthozoa | Scleractinia | Caryophylliidae | Kadophellia | Kadophellia bathyalis | | A6 | 0 | Invertebrates | Marine |
In [ ]:
#The column "alien" encodes the species status numerically: "0" for native species and "1" for alien species.
#To make the analysis easier to read, replace the numerical values with character values.
fullDB <- fullDB %>% mutate(alien=recode(alien, "1"="yes", "0"="no"))
In [ ]:
#Check classes
str(fullDB)
'data.frame':	32329 obs. of  14 variables:
 $ eventdate               : Factor w/ 187 levels "1984/1999","1984/2010",..: 45 45 45 45 25 43 19 9 29 45 ...
 $ decimallatitude         : chr  "40.70981" "40.70981" "40.70981" "40.70981" ...
 $ decimallongitude        : chr  "14.26501" "14.26501" "14.26501" "14.26501" ...
 $ phylum                  : Factor w/ 43 levels "Annelida","Arthropoda",..: 21 21 21 21 12 12 21 20 24 21 ...
 $ class                   : Factor w/ 130 levels "","Actinopterygii",..: 30 30 30 30 37 3 30 62 7 30 ...
 $ order                   : Factor w/ 494 levels "","Acanthoecida",..: 1 1 176 176 18 231 1 256 34 155 ...
 $ family                  : Factor w/ 1834 levels "","Acanthephyridae",..: 173 173 517 217 243 155 741 866 88 768 ...
 $ genus                   : Factor w/ 5246 levels "Aaptos","Abra",..: 329 329 239 394 435 935 943 1123 1234 1395 ...
 $ scientificname          : Factor w/ 11179 levels "Aaptos aaptos",..: 615 619 441 805 875 1833 1852 2129 2363 2657 ...
 $ scientificnameauthorship: Factor w/ 6181 levels "","(A. Costa, 1871)",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ eunishabitatstypecode   : Factor w/ 78 levels "A1","A2","A3",..: 4 4 4 4 3 6 3 7 5 4 ...
 $ alien                   : chr  "no" "no" "no" "no" ...
 $ eunisspeciesgroups      : Factor w/ 14 levels "Algae","Cyanobacteria",..: 5 5 5 5 5 5 5 1 1 5 ...
 $ dataset                 : chr  "Marine" "Marine" "Marine" "Marine" ...
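
For readers without dplyr, the same recoding can be sketched in base R with a named lookup vector. This is an illustrative alternative on a toy vector (`alien_int` is a hypothetical name), not the method used in the script. It also shows that missing codes stay `NA`, which is why the `NA` check that follows is needed.

```r
# Toy status codes: 0 = native, 1 = alien, NA = unknown (hypothetical values)
alien_int <- c(0L, 1L, NA, 0L)

# Named lookup vector: the names are the codes, the values are the new labels
lookup <- c("0" = "no", "1" = "yes")

# Index the lookup by the code converted to character; NA codes remain NA
status <- unname(lookup[as.character(alien_int)])
status
```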
In [ ]:
#Check values for variable "Alien"
fullDB[is.na(fullDB$alien),]
A data.frame: 6 × 14
| | eventdate | decimallatitude | decimallongitude | phylum | class | order | family | genus | scientificname | scientificnameauthorship | eunishabitatstypecode | alien | eunisspeciesgroups | dataset |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | <fct> | <chr> | <chr> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <chr> | <fct> | <chr> |
| 18553 | 2011 | 45.956 | 8.463 | Arthropoda | Insecta | Diptera | Chironomidae | Epoicocladius | Epoicocladius ephemerae | (kieffer, 1924) | c1.1 | NA | Invertebrates | Freshwater |
| 27768 | 1999/2000 | 41.91619 | 15.34007 | Tracheophyta | Magnoliopsida | Asterales | Compositae | Crepis | Crepis sancta | (l.) bornm. | e1 | NA | Flowering plants | Terrestrial_PF |
| 27769 | 2004/2005 | 40.626095 | 16.947044 | Tracheophyta | Magnoliopsida | Asterales | Compositae | Crepis | Crepis sancta | (l.) bornm. | e1 | NA | Flowering plants | Terrestrial_PF |
| 30123 | 1999/2000 | 41.91619 | 15.34007 | Tracheophyta | Magnoliopsida | Ranunculales | Papaveraceae | Papaver | Papaver dubium | l. | e1 | NA | Flowering plants | Terrestrial_PF |
| 30124 | 1999/2000 | 41.91619 | 15.34007 | Tracheophyta | Magnoliopsida | Ranunculales | Papaveraceae | Papaver | Papaver hybridum | l. | e1 | NA | Flowering plants | Terrestrial_PF |
| 31532 | 1999/2000 | 41.91619 | 15.34007 | Tracheophyta | Magnoliopsida | Brassicales | Brassicaceae | Sisymbrium | Sisymbrium orientale | l. | e1 | NA | Flowering plants | Terrestrial_PF |
In [ ]:
#There are six records without information about the "alien" status.
#Look up each species in EASIN and assign its status accordingly.
fullDB$alien[fullDB$scientificname=="Sisymbrium orientale"] <- "no"
fullDB$alien[fullDB$scientificname=="Papaver hybridum"] <- "no"
fullDB$alien[fullDB$scientificname=="Papaver dubium"] <- "no"
fullDB$alien[fullDB$scientificname=="Crepis sancta"] <- "yes"
In [ ]:
#Delete the remaining records with "NA" in the variable "alien", i.e. those for which no information about the species status could be found in EASIN.
fullDB <- fullDB[!is.na(fullDB$alien),]

Check whether the total number of species per dataset matches the sum of the alien and native species counts per dataset

In [ ]:
fullDB %>% 
  group_by(dataset) %>%
  summarise(
    totSpecies=length(unique(scientificname))
      )
A tibble: 5 × 2
| dataset | totSpecies |
|---|---|
| <chr> | <int> |
| Freshwater | 1750 |
| Marine | 3802 |
| Terrestrial_A | 2217 |
| Terrestrial_PF | 2952 |
| Transitional | 2025 |
In [ ]:
fullDB %>% 
  group_by(dataset, alien) %>%
  summarise(
    totSpecies=length(unique(scientificname))
  )
`summarise()` has grouped output by 'dataset'. You can override using the
`.groups` argument.
A grouped_df: 10 × 3
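| dataset | alien | totSpecies |
|---|---|---|
| <chr> | <chr> | <int> |
| Freshwater | no | 1693 |
| Freshwater | yes | 67 |
| Marine | no | 3745 |
| Marine | yes | 59 |
| Terrestrial_A | no | 2198 |
| Terrestrial_A | yes | 20 |
| Terrestrial_PF | no | 2817 |
| Terrestrial_PF | yes | 141 |
| Transitional | no | 1930 |
| Transitional | yes | 98 |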
In [ ]:
#The totals do not match: for example, Freshwater has 1693 native + 67 alien = 1760 species, but only 1750 unique species.
#This means that some species have been assigned both statuses (alien and native), probably in error. Check which species these are.
fullDB %>% 
  group_by(scientificname) %>%
  filter(
    n_distinct(alien)>1)
A grouped_df: 237 × 14
| eventdate | decimallatitude | decimallongitude | phylum | class | order | family | genus | scientificname | scientificnameauthorship | eunishabitatstypecode | alien | eunisspeciesgroups | dataset |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| <fct> | <chr> | <chr> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <fct> | <chr> | <fct> | <chr> |
| 2002/2008 | 45.3953 | 12.5733 | Arthropoda | Malacostraca | Amphipoda | Corophiidae | Monocorophium | Monocorophium sextonae | (Crawford, 1937) | A4 | no | Invertebrates | Marine |
| 1999 | 40.7 | 14.35 | Arthropoda | Malacostraca | Amphipoda | Corophiidae | Monocorophium | Monocorophium sextonae | (Crawford, 1937) | A4 | yes | Invertebrates | Marine |
| 2000 | 40.92133 | 9.53367 | Sarcomastigophora | Phytomastigophora | Ebriida | Ebriidae | Ebria | Ebria tripartita | (Schumann) Lemmermann, 1899 | A7 | no | Algae | Marine |
| 2004/2011 | 45.2 | 12.8 | Sarcomastigophora | Phytomastigophora | Ebriida | Ebriidae | Ebria | Ebria tripartita | (Schumann) Lemmermann, 1899 | A7 | no | Algae | Marine |
| 2002/2012 | 43.725 | 13.2311 | Sarcomastigophora | Phytomastigophora | Ebriida | Ebriidae | Ebria | Ebria tripartita | (Schumann) Lemmermann, 1899 | A7 | no | Algae | Marine |
| 1986/2012 | 45.7008 | 13.7138 | Sarcomastigophora | Phytomastigophora | Ebriida | Ebriidae | Ebria | Ebria tripartita | (Schumann) Lemmermann, 1899 | A7 | no | Algae | Marine |
| 2000/2010 | 40.70981 | 14.26501 | Sarcomastigophora | Phytomastigophora | Ebriida | Ebriidae | Ebria | Ebria tripartita | (Schumann) Lemmermann, 1899 | A7 | yes | Algae | Marine |
| 2010/2012 | 40.29348 | 18.42189 | Sarcomastigophora | Phytomastigophora | Ebriida | Ebriidae | Ebria | Ebria tripartita | (Schumann) Lemmermann, 1899 | A7 | no | Algae | Marine |
| 2010/2012 | 39.95863 | 18.43965 | Sarcomastigophora | Phytomastigophora | Ebriida | Ebriidae | Ebria | Ebria tripartita | (Schumann) Lemmermann, 1899 | A7 | no | Algae | Marine |
| 2000/2008 | 41.87900162 | 15.74699974 | Arthropoda | Thecostraca | Balanomorpha | Balanidae | Amphibalanus | Amphibalanus eburneus | (Gould, 1841) | X03 | no | Invertebrates | Transitional |
| 2009 | 41.87900162 | 15.74699974 | Arthropoda | Insecta | Lepidoptera | Nymphalidae | Hipparchia | Hipparchia (Neohipparchia) statilinus | (Hufnagel, 1766) | X03 | yes | Invertebrates | Transitional |
| 1982/1989 | 41.38000107 | 12.93000031 | Arthropoda | Thecostraca | Balanomorpha | Balanidae | Amphibalanus | Amphibalanus eburneus | (Gould, 1841) | X03 | no | Invertebrates | Transitional |
| 1984/1989 | 41.34999847 | 12.97500038 | Arthropoda | Thecostraca | Balanomorpha | Balanidae | Amphibalanus | Amphibalanus eburneus | (Gould, 1841) | X03 | no | Invertebrates | Transitional |
| 1983/1989 | 41.40000153 | 12.89999962 | Arthropoda | Thecostraca | Balanomorpha | Balanidae | Amphibalanus | Amphibalanus eburneus | (Gould, 1841) | X02 | no | Invertebrates | Transitional |
| 1985/1989 | 41.32400131 | 13.33699989 | Arthropoda | Thecostraca | Balanomorpha | Balanidae | Amphibalanus | Amphibalanus eburneus | (Gould, 1841) | X03 | no | Invertebrates | Transitional |
| 1964 | 40.94200134 | 14.03600025 | Arthropoda | Thecostraca | Balanomorpha | Balanidae | Amphibalanus | Amphibalanus eburneus | (Gould, 1841) | X03 | no | Invertebrates | Transitional |
| 1982/1989 | 41.27399826 | 13.40400028 | Arthropoda | Thecostraca | Balanomorpha | Balanidae | Amphibalanus | Amphibalanus eburneus | (Gould, 1841) | X03 | no | Invertebrates | Transitional |
| 1999/2011 | 41.88000107 | 15.35000038 | Arthropoda | Thecostraca | Balanomorpha | Balanidae | Amphibalanus | Amphibalanus eburneus | (Gould, 1841) | X03 | no | Invertebrates | Transitional |
| 2006/2007 | 40.43999863 | 18.23999977 | Rhodophyta | Florideophyceae | Ceramiales | Rhodomelaceae | Acanthophora | Acanthophora nayadiformis | (Delile) Papenfuss, 1968 | X03 | no | Algae | Transitional |
| 1986/2011 | 40.48099899 | 17.32600021 | Rhodophyta | Florideophyceae | Ceramiales | Rhodomelaceae | Acanthophora | Acanthophora nayadiformis | (Delile) Papenfuss, 1968 | X02 | yes | Algae | Transitional |
| 1980/2007 | 45.37799835 | 12.37150002 | Arthropoda | Thecostraca | Balanomorpha | Balanidae | Amphibalanus | Amphibalanus eburneus | (Gould, 1841) | X03 | yes | Invertebrates | Transitional |
| 2010/2011 | 40.20199966 | 18.44099998 | | Ebriophyceae | Ebriales | Ebriaceae | Ebria | Ebria tripartita | (Schumann) Lemmermann, 1899 | X03 | no | Algae | Transitional |
| 2000/2011 | 45.37799835 | 12.37150002 | | Ebriophyceae | Ebriales | Ebriaceae | Ebria | Ebria tripartita | (Schumann) Lemmermann, 1899 | X03 | no | Algae | Transitional |
| 2010/2011 | 41.87900162 | 15.74699974 | Chordata | Actinopterygii | Perciformes | Gobiidae | Knipowitschia | Knipowitschia panizzae | (Verga, 1841) | X03 | no | Fishes | Transitional |
| 2010/2011 | 41.88000107 | 15.35000038 | Chordata | Actinopterygii | Perciformes | Gobiidae | Knipowitschia | Knipowitschia panizzae | (Verga, 1841) | X03 | no | Fishes | Transitional |
| 2001/2002 | 45.37799835 | 12.37150002 | Chordata | Actinopterygii | Perciformes | Gobiidae | Knipowitschia | Knipowitschia panizzae | (Verga, 1841) | X03 | no | Fishes | Transitional |
| 1989/2010 | 40.48099899 | 17.32600021 | Chordata | Actinopterygii | Perciformes | Gobiidae | Knipowitschia | Knipowitschia panizzae | (Verga, 1841) | X02 | no | Fishes | Transitional |
| 1998/2013 | 41.45700073 | 15.95800018 | Chordata | Teleostei | Cypriniformes | Cyprinidae | Cyprinus | Cyprinus carpio | Linnaeus, 1758 | J5.3 | yes | Fishes | Transitional |
| 1999/2011 | 41.88000107 | 15.35000038 | Chordata | Teleostei | Cypriniformes | Cyprinidae | Cyprinus | Cyprinus carpio | Linnaeus, 1758 | X03 | yes | Fishes | Transitional |
| 2003/2006 | 41.45700073 | 15.95800018 | Chordata | Aves | Anseriformes | Anatidae | Anas | Anas platyrhynchos | Linnaeus, 1758 | J5.3 | no | Birds | Transitional |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 2000/2011 | 41.613 | 15.663 | Chordata | Actinopterygii | Cypriniformes | Cyprinidae | Scardinius | Scardinius erythrophthalmus | (linnaeus, 1758) | c2.3 | no | Fishes | Freshwater |
| 1980/2012 | 41.90060043 | 12.47999954 | Chordata | Aves | Anseriformes | Anatidae | Anas | Anas platyrhynchos | linnaeus, 1758 | x11 | no | Birds | Terrestrial_A |
| 1980/2012 | 41.2344 | 13.06686 | Chordata | Aves | Anseriformes | Anatidae | Anas | Anas platyrhynchos | linnaeus, 1758 | g2 | no | Birds | Terrestrial_A |
| 1993/2001 | 43.83888 | 11.80378 | Arthropoda | Insecta | Lepidoptera | Nymphalidae | Hipparchia | Hipparchia (Neohipparchia) statilinus | (hufnagel, 1766) | g4 | no | Invertebrates | Terrestrial_A |
| 1980/2012 | 41.2344 | 13.06686 | Arthropoda | Insecta | Lepidoptera | Nymphalidae | Hipparchia | Hipparchia (Neohipparchia) statilinus | (hufnagel, 1766) | g2 | no | Invertebrates | Terrestrial_A |
| 2002/2009 | 40.9944 | 16.3751 | Arthropoda | Insecta | Lepidoptera | Nymphalidae | Hipparchia | Hipparchia (Neohipparchia) statilinus | (hufnagel, 1766) | e1 | no | Invertebrates | Terrestrial_A |
| 1951/2009 | 45.2010994 | 10.74440002 | Arthropoda | Insecta | Coleoptera | Curculionidae | Xylosandrus | Xylosandrus germanus | (blandford, 1894) | g1 | no | Invertebrates | Terrestrial_A |
| 1998/2009 | 45.2010994 | 10.74440002 | Arthropoda | Insecta | Coleoptera | Curculionidae | Xylosandrus | Xylosandrus germanus | (blandford, 1894) | g1 | yes | Invertebrates | Terrestrial_A |
| 1978/2003 | 45.2010994 | 10.74440002 | Tracheophyta | Pinopsida | Cupressales | Cupressaceae | Chamaecyparis | Chamaecyparis lawsoniana | (a. murray bis) parl. | g1 | yes | Conifers | Terrestrial_PF |
| 2011 | 43.83888 | 11.80378 | Tracheophyta | Pinopsida | Cupressales | Cupressaceae | Chamaecyparis | Chamaecyparis lawsoniana | (a. murray bis) parl. | g4 | no | Conifers | Terrestrial_PF |
| 1978/2003 | 45.2010994 | 10.74440002 | Tracheophyta | Magnoliopsida | Rosales | Rosaceae | Crataegus | Crataegus germanica | (l.) kuntze | g1 | no | Flowering plants | Terrestrial_PF |
| 1999/2011 | 42.7928009 | 11.88969994 | Tracheophyta | Magnoliopsida | Rosales | Rosaceae | Crataegus | Crataegus germanica | (l.) kuntze | g1.7 | yes | Flowering plants | Terrestrial_PF |
| 1995/2003 | 40.780212 | 16.398739 | Tracheophyta | Magnoliopsida | Rosales | Rosaceae | Crataegus | Crataegus germanica | (l.) kuntze | g1 | no | Flowering plants | Terrestrial_PF |
| 2004/2005 | 40.626095 | 16.947044 | Tracheophyta | Magnoliopsida | Rosales | Rosaceae | Crataegus | Crataegus germanica | (l.) kuntze | g2 | no | Flowering plants | Terrestrial_PF |
| 2011 | 43.83888 | 11.80378 | Tracheophyta | Magnoliopsida | Myrtales | Onagraceae | Epilobium | Epilobium montanum | L. | g4 | no | Flowering plants | Terrestrial_PF |
| 2009 | 43.843039 | 11.80425 | Tracheophyta | Magnoliopsida | Myrtales | Onagraceae | Epilobium | Epilobium montanum | L. | g4 | yes | Flowering plants | Terrestrial_PF |
| 1997/2002 | 42.96860123 | 13.03110027 | Tracheophyta | Magnoliopsida | Myrtales | Onagraceae | Epilobium | Epilobium montanum | L. | g1 | yes | Flowering plants | Terrestrial_PF |
| 1999/2011 | 46.23778 | 9.5544 | Tracheophyta | Magnoliopsida | Liliales | Liliaceae | Lilium | Lilium martagon | l. | g3 | no | Flowering plants | Terrestrial_PF |
| 2011 | 43.83888 | 11.80378 | Tracheophyta | Magnoliopsida | Liliales | Liliaceae | Lilium | Lilium martagon | l. | g4 | no | Flowering plants | Terrestrial_PF |
| 2009 | 43.843039 | 11.80425 | Tracheophyta | Magnoliopsida | Liliales | Liliaceae | Lilium | Lilium martagon | l. | g4 | yes | Flowering plants | Terrestrial_PF |
| 1997/2002 | 42.96860123 | 13.03110027 | Tracheophyta | Magnoliopsida | Liliales | Liliaceae | Lilium | Lilium martagon | l. | g1 | no | Flowering plants | Terrestrial_PF |
| 1999/2000 | 41.91619 | 15.34007 | Tracheophyta | Magnoliopsida | Poales | Poaceae | Paspalum | Paspalum distichum | L. | b1 | yes | Flowering plants | Terrestrial_PF |
| 2011 | 43.83888 | 11.80378 | Bryophyta | Bryopsida | Leucodontales | Leucodontaceae | Pterogonium | Pterogonium gracile | Smith, 1803 | g4 | no | Mosses and liverworts | Terrestrial_PF |
| 2009 | 43.843039 | 11.80425 | Bryophyta | Bryopsida | Leucodontales | Leucodontaceae | Pterogonium | Pterogonium gracile | Smith, 1803 | g4 | yes | Mosses and liverworts | Terrestrial_PF |
| 1999/2001 | 43.50944 | 10.438611 | Tracheophyta | Magnoliopsida | Vitales | Vitaceae | Vitis | Vitis vinifera | L. | g2 | yes | Flowering plants | Terrestrial_PF |
| 1978/2003 | 45.2010994 | 10.74440002 | Tracheophyta | Magnoliopsida | Vitales | Vitaceae | Vitis | Vitis vinifera | L. | g1 | no | Flowering plants | Terrestrial_PF |
| 2004/2011 | 41.9667 | 12.05 | Tracheophyta | Magnoliopsida | Vitales | Vitaceae | Vitis | Vitis vinifera | L. | b1 | yes | Flowering plants | Terrestrial_PF |
| 1999/2011 | 41.849361 | 13.588139 | Tracheophyta | Magnoliopsida | Vitales | Vitaceae | Vitis | Vitis vinifera | L. | g1.68 | yes | Flowering plants | Terrestrial_PF |
| 1999/2000 | 41.91619 | 15.34007 | Tracheophyta | Magnoliopsida | Vitales | Vitaceae | Vitis | Vitis vinifera | L. | g2 | no | Flowering plants | Terrestrial_PF |
| 1995/2003 | 40.780212 | 16.398739 | Tracheophyta | Magnoliopsida | Vitales | Vitaceae | Vitis | Vitis vinifera | L. | g1 | no | Flowering plants | Terrestrial_PF |
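
The logic of this consistency check can be reproduced on a small example. The sketch below uses base R (`tapply`) instead of dplyr, and the `toy` data frame holds made-up status values purely for illustration.

```r
# Toy occurrence table; the statuses here are invented for the example
toy <- data.frame(
  scientificname = c("Cyprinus carpio", "Cyprinus carpio",
                     "Anas platyrhynchos", "Salmo trutta", "Salmo trutta"),
  alien = c("yes", "no", "no", "no", "no"),
  stringsAsFactors = FALSE
)

# Number of distinct status values recorded for each species
n_status <- tapply(toy$alien, toy$scientificname,
                   function(x) length(unique(x)))

# Species that carry more than one status need manual review
conflicting <- names(n_status)[n_status > 1]
conflicting
```

A species listed in `conflicting` is counted once in the per-dataset total but twice in the alien/native breakdown, which explains the mismatch in the counts.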
In [ ]:
#After a search on EASIN and GRIIS, the status of each species was ascertained and can now be amended.
fullDB$alien[fullDB$scientificname=="Monocorophium sextonae"] <- "yes"
fullDB$alien[fullDB$scientificname=="Ebria tripartita"] <- "no"
fullDB$alien[fullDB$scientificname=="Amphibalanus eburneus"] <- "yes"
fullDB$alien[fullDB$scientificname=="Hipparchia (Neohipparchia) statilinus"] <- "no"
fullDB$alien[fullDB$scientificname=="Acanthophora nayadiformis"] <- "yes"
fullDB$alien[fullDB$scientificname=="Knipowitschia panizzae"] <- "no"
fullDB$alien[fullDB$scientificname=="Cyprinus carpio"] <- "yes"
fullDB$alien[fullDB$scientificname=="Anas platyrhynchos"] <- "no"
fullDB$alien[fullDB$scientificname=="Erythrocladia irregularis"] <- "no"
fullDB$alien[fullDB$scientificname=="Alburnus alburnus"] <- "yes"
fullDB$alien[fullDB$scientificname=="Cobitis taenia"] <- "no"
fullDB$alien[fullDB$scientificname=="Cyclops vicinus"] <- "yes"
fullDB$alien[fullDB$scientificname=="Girardia tigrina"] <- "yes"
fullDB$alien[fullDB$scientificname=="Gobio gobio"] <- "yes"
fullDB$alien[fullDB$scientificname=="Paspalum distichum"] <- "yes"
fullDB$alien[fullDB$scientificname=="Rutilus rutilus"] <- "yes"
fullDB$alien[fullDB$scientificname=="Salmo trutta"] <- "no"
fullDB$alien[fullDB$scientificname=="Scardinius erythrophthalmus"] <- "yes"
fullDB$alien[fullDB$scientificname=="Xylosandrus germanus"] <- "yes"
fullDB$alien[fullDB$scientificname=="Chamaecyparis lawsoniana"] <- "yes"
fullDB$alien[fullDB$scientificname=="Crataegus germanica"] <- "yes"
fullDB$alien[fullDB$scientificname=="Epilobium montanum"] <- "no"
fullDB$alien[fullDB$scientificname=="Lilium martagon"] <- "no"
fullDB$alien[fullDB$scientificname=="Pterogonium gracile"] <- "yes"
fullDB$alien[fullDB$scientificname=="Vitis vinifera"] <- "no"
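
A long run of one-line assignments is easy to mistype. One possible alternative, shown here only as a sketch on toy data, keeps the verified statuses in a single named vector and applies them in one step. The names `toy`, `corrections` and `hit` are illustrative assumptions; the two statuses are taken from the corrections listed above.

```r
# Toy table with inconsistent statuses (hypothetical rows)
toy <- data.frame(
  scientificname = c("Cyprinus carpio", "Cyprinus carpio",
                     "Ebria tripartita", "Abies alba"),
  alien = c("yes", "no", "yes", "no"),
  stringsAsFactors = FALSE
)

# Verified status per species: names are species, values are statuses
corrections <- c("Cyprinus carpio" = "yes", "Ebria tripartita" = "no")

# Rows whose species appears in the lookup table
hit <- toy$scientificname %in% names(corrections)

# Overwrite only those rows; as.character() guards against factor columns,
# which would otherwise index the lookup by integer code instead of by name
toy$alien[hit] <- unname(corrections[as.character(toy$scientificname[hit])])
toy$alien
```

Species absent from the lookup, such as the last toy row, keep their original status.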
In [ ]:
#check values of EUNIS habitats
levels(fullDB$eunishabitatstypecode)
  1. 'A1'
  2. 'A2'
  3. 'A3'
  4. 'A4'
  5. 'A5'
  6. 'A6'
  7. 'A7'
  8. 'J5.1'
  9. 'J5.3'
  10. 'X02'
  11. 'X03'
  12. 'c1'
  13. 'c1.1'
  14. 'c1.2'
  15. 'c1.3'
  16. 'c1.6'
  17. 'c1.63'
  18. 'c2'
  19. 'c2.1'
  20. 'c2.2'
  21. 'c2.3'
  22. 'c3'
  23. 'j5.5'
  24. 'e1'
  25. 'g1'
  26. 'g1.68'
  27. 'g1.7'
  28. 'g2'
  29. 'g4'
  30. 'h1'
  31. 'h3'
  32. 'j4'
  33. 'x11'
  34. 'b1'
  35. 'e1.71'
  36. 'e1.72'
  37. 'e2.22'
  38. 'e3.51'
  39. 'e4'
  40. 'e4.1'
  41. 'e4.11'
  42. 'e4.12'
  43. 'e4.3'
  44. 'e4.31'
  45. 'e4.33'
  46. 'e4.34'
  47. 'e4.4'
  48. 'e4.41'
  49. 'e4.42'
  50. 'e4.43'
  51. 'e4.52'
  52. 'e5.58'
  53. 'e5.59'
  54. 'f1'
  55. 'f2.1'
  56. 'f2.11'
  57. 'f2.12'
  58. 'f2.21'
  59. 'f2.24'
  60. 'f4.21'
  61. 'f5'
  62. 'f6'
  63. 'g3'
  64. 'g3.13'
  65. 'g3.1e'
  66. 'g3.31'
  67. 'g3.44'
  68. 'h2'
  69. 'h2.3'
  70. 'h2.31'
  71. 'h2.42'
  72. 'h2.5'
  73. 'h3.13'
  74. 'h5.3'
  75. 'j5'
  76. 'j5.1'
  77. 'x02'
  78. 'x03'
In [ ]:
#There are duplicates in column eunishabitatstypecode because some values are in uppercase and others in lowercase.
#Convert all values to uppercase.
fullDB$eunishabitatstypecode <- toupper(fullDB$eunishabitatstypecode)
fullDB$eunishabitatstypecode <- as.factor(fullDB$eunishabitatstypecode)
levels(fullDB$eunishabitatstypecode)
  1. 'A1'
  2. 'A2'
  3. 'A3'
  4. 'A4'
  5. 'A5'
  6. 'A6'
  7. 'A7'
  8. 'B1'
  9. 'C1'
  10. 'C1.1'
  11. 'C1.2'
  12. 'C1.3'
  13. 'C1.6'
  14. 'C1.63'
  15. 'C2'
  16. 'C2.1'
  17. 'C2.2'
  18. 'C2.3'
  19. 'C3'
  20. 'E1'
  21. 'E1.71'
  22. 'E1.72'
  23. 'E2.22'
  24. 'E3.51'
  25. 'E4'
  26. 'E4.1'
  27. 'E4.11'
  28. 'E4.12'
  29. 'E4.3'
  30. 'E4.31'
  31. 'E4.33'
  32. 'E4.34'
  33. 'E4.4'
  34. 'E4.41'
  35. 'E4.42'
  36. 'E4.43'
  37. 'E4.52'
  38. 'E5.58'
  39. 'E5.59'
  40. 'F1'
  41. 'F2.1'
  42. 'F2.11'
  43. 'F2.12'
  44. 'F2.21'
  45. 'F2.24'
  46. 'F4.21'
  47. 'F5'
  48. 'F6'
  49. 'G1'
  50. 'G1.68'
  51. 'G1.7'
  52. 'G2'
  53. 'G3'
  54. 'G3.13'
  55. 'G3.1E'
  56. 'G3.31'
  57. 'G3.44'
  58. 'G4'
  59. 'H1'
  60. 'H2'
  61. 'H2.3'
  62. 'H2.31'
  63. 'H2.42'
  64. 'H2.5'
  65. 'H3'
  66. 'H3.13'
  67. 'H5.3'
  68. 'J4'
  69. 'J5'
  70. 'J5.1'
  71. 'J5.3'
  72. 'J5.5'
  73. 'X02'
  74. 'X03'
  75. 'X11'
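A minimal sketch of why the uppercase conversion removes the duplicates, using a few codes from the level listing above (the object names `codes`, `n_before` and `n_after` are illustrative only):

```r
# Codes that differ only by case, taken from the levels printed above
codes <- factor(c("J5.1", "j5.1", "X02", "x02", "c1"))
n_before <- nlevels(codes)

# toupper() works on character vectors, so convert, then rebuild the factor
codes_upper <- factor(toupper(as.character(codes)))
n_after <- nlevels(codes_upper)

print(levels(codes_upper))
cat(n_before, "levels before,", n_after, "levels after\n")
```

Rebuilding the factor after `toupper()` matters: the conversion returns a character vector, and only `as.factor()`/`factor()` recomputes the set of levels.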
In [ ]:
#The detailed classification of habitats could be problematic for downstream analyses because
#1) not all habitats are classified down to the third or fourth level of the EUNIS classification (comparison issues)
#2) many of these habitats are underrepresented
#For this reason it is better to work at a higher level of habitat classification (e.g. first level; A1, B1, C1, etc.)
#Add a column with this higher-level classification for EUNIS habitats

fullDB$eunishabitatsLevel1 <- sub("(^[^.]+)\\..*", "\\1", fullDB$eunishabitatstypecode)
fullDB$eunishabitatsLevel1 <- as.factor(fullDB$eunishabitatsLevel1)
# check result
levels(fullDB$eunishabitatsLevel1)
  1. 'A1'
  2. 'A2'
  3. 'A3'
  4. 'A4'
  5. 'A5'
  6. 'A6'
  7. 'A7'
  8. 'B1'
  9. 'C1'
  10. 'C2'
  11. 'C3'
  12. 'E1'
  13. 'E2'
  14. 'E3'
  15. 'E4'
  16. 'E5'
  17. 'F1'
  18. 'F2'
  19. 'F4'
  20. 'F5'
  21. 'F6'
  22. 'G1'
  23. 'G2'
  24. 'G3'
  25. 'G4'
  26. 'H1'
  27. 'H2'
  28. 'H3'
  29. 'H5'
  30. 'J4'
  31. 'J5'
  32. 'X02'
  33. 'X03'
  34. 'X11'
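To make the regular expression above easier to follow, here is a small sketch on a handful of codes from the listing. The pattern keeps everything before the first dot and leaves codes without a dot untouched. The vector names are illustrative, and the second pattern is only an equivalent alternative, not the notebook's own choice:

```r
codes <- c("E4.31", "G3.1E", "C1.63", "A4", "X02")

# Pattern used above: capture the text before the first dot, drop the rest
level1 <- sub("(^[^.]+)\\..*", "\\1", codes)
print(level1)

# Equivalent alternative: delete the first dot and everything after it
level1_alt <- sub("\\..*$", "", codes)
print(identical(level1, level1_alt))
```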
In [ ]:
#Even at the first level of classification it will be difficult to compare habitats across environmental domains because
#there are huge differences in numbers. For example there are only 3 habitats in freshwater ecosystems (C) with many records 
#and 18 habitats in terrestrial ecosystems with few records each.
#Further merge habitats based on EUNIS classification
fullDB$eunisL0 <- plyr::revalue(fullDB$eunishabitatsLevel1, c("A1"="Marine benthic",
                                                      "A2"="Marine benthic",
                                                      "A3"="Marine benthic",
                                                      "A4"="Marine benthic",
                                                      "A5"="Marine benthic",
                                                      "A6"="Marine benthic",
                                                      "A7"="Marine pelagic",
                                                      "B1"="Coastal",
                                                      "E1"="Grasslands",
                                                      "E2"="Grasslands",
                                                      "E3"="Grasslands",
                                                      "E4"="Grasslands",
                                                      "E5"="Grasslands",
                                                      "F1"="Heathland, scrub and tundra",
                                                      "F2"="Heathland, scrub and tundra",
                                                      "F4"="Heathland, scrub and tundra",
                                                      "F5"="Heathland, scrub and tundra",
                                                      "F6"="Heathland, scrub and tundra",
                                                      "G1"="Woodlands",
                                                      "G2"="Woodlands",
                                                      "G3"="Woodlands",
                                                      "G4"="Woodlands",
                                                      "H1"="Inland habitats with sparse vegetation",
                                                      "H2"="Inland habitats with sparse vegetation",
                                                      "H3"="Inland habitats with sparse vegetation",
                                                      "H5"="Inland habitats with sparse vegetation",
                                                      "J4"="Artificial terrestrial",
                                                      "J5"="Artificial waters",
                                                      "X11"="Artificial terrestrial",
                                                      "C1"="Standing waters",
                                                      "C2"="Running waters",
                                                      "C3"="Water-fringing vegetation",
                                                      "X02"="Saline lagoons",
                                                      "X03"="Brackish lagoons"))
In [ ]:
#check macro-habitats levels                                                                                                         
levels(fullDB$eunisL0)
  1. 'Marine benthic'
  2. 'Marine pelagic'
  3. 'Coastal'
  4. 'Standing waters'
  5. 'Running waters'
  6. 'Water-fringing vegetation'
  7. 'Grasslands'
  8. 'Heathland, scrub and tundra'
  9. 'Woodlands'
  10. 'Inland habitats with sparse vegetation'
  11. 'Artificial terrestrial'
  12. 'Artificial waters'
  13. 'Saline lagoons'
  14. 'Brackish lagoons'
In [ ]:
#Add column to specify environmental domains of habitats
fullDB$domain <- plyr::revalue(fullDB$eunishabitatsLevel1, c("A1"="Marine",
                                                   "A2"="Marine",
                                                   "A3"="Marine",
                                                   "A4"="Marine",
                                                   "A5"="Marine",
                                                   "A6"="Marine",
                                                   "A7"="Marine",
                                                   "B1"="Marine",
                                                   "E1"="Terrestrial",
                                                   "E2"="Terrestrial",
                                                   "E3"="Terrestrial",
                                                   "E4"="Terrestrial",
                                                   "E5"="Terrestrial",
                                                   "F1"="Terrestrial",
                                                   "F2"="Terrestrial",
                                                   "F4"="Terrestrial",
                                                   "F5"="Terrestrial",
                                                   "F6"="Terrestrial",
                                                   "G1"="Terrestrial",
                                                   "G2"="Terrestrial",
                                                   "G3"="Terrestrial",
                                                   "G4"="Terrestrial",
                                                   "H1"="Terrestrial",
                                                   "H2"="Terrestrial",
                                                   "H3"="Terrestrial",
                                                   "H5"="Terrestrial",
                                                   "J4"="Artificial",
                                                   "J5"="Artificial",
                                                   "X11"="Artificial",
                                                   "C1"="Freshwater",
                                                   "C2"="Freshwater",
                                                   "C3"="Freshwater",
                                                   "X02"="Transitional",
                                                   "X03"="Transitional"))
In [ ]:
#check levels of environmental domains                                                    
levels(fullDB$domain)
  1. 'Marine'
  2. 'Freshwater'
  3. 'Terrestrial'
  4. 'Artificial'
  5. 'Transitional'
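The long `revalue()` mapping can also be read as a rule: the first letter of the code decides the domain, except for the X codes, which need explicit entries (X02 and X03 are transitional, X11 is artificial). The sketch below expresses that rule in base R. The helper name `to_domain` is hypothetical and is not part of the notebook:

```r
# Hypothetical helper: map level-1 EUNIS codes to environmental domains
to_domain <- function(level1) {
  by_letter <- c(A = "Marine", B = "Marine", C = "Freshwater",
                 E = "Terrestrial", F = "Terrestrial", G = "Terrestrial",
                 H = "Terrestrial", J = "Artificial")
  # Look up by first letter; codes starting with X yield NA at this step
  out <- unname(by_letter[substr(level1, 1, 1)])
  # X codes cannot be resolved by letter alone, so list them explicitly
  by_code <- c(X02 = "Transitional", X03 = "Transitional", X11 = "Artificial")
  is_x <- level1 %in% names(by_code)
  out[is_x] <- unname(by_code[level1[is_x]])
  out
}

example_codes <- c("A4", "B1", "C2", "G3", "J5", "X02", "X11")
example_domains <- to_domain(example_codes)
print(data.frame(code = example_codes, domain = example_domains))
```

Any code not covered by either lookup comes back as `NA`, which makes unmapped habitats easy to spot.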
In [ ]:
#Apply a threshold to the dataset based on the number of records (occurrences) per macro-habitat
#Keep only macro-habitats whose number of records is >5% of the total occurrences in the related domain
In [ ]:
#Assign a number to each row in the dataset representing the occurrences and then aggregate the dataset
fullDB$occ <- c(1)
In [ ]:
ThresholdDB <- aggregate(fullDB$occ,
               by=list(fullDB$eunisL0,
               fullDB$domain), FUN="sum")
head(ThresholdDB)
A data.frame: 6 × 3
  Group.1                    Group.2     x
  <fct>                      <fct>       <dbl>
1 Marine benthic             Marine      6936
2 Marine pelagic             Marine      5283
3 Coastal                    Marine      608
4 Standing waters            Freshwater  4552
5 Running waters             Freshwater  2096
6 Water-fringing vegetation  Freshwater  52
In [ ]:
ThresholdDB <- plyr::rename(ThresholdDB, c("Group.1"="eunisL0", "Group.2"="domain", "x"="occEunis"))
In [ ]:
#Add total number of occurrences per domain
ThresholdDB$occDomain <- plyr::revalue(ThresholdDB$domain, c("Transitional"=3744,
                                                             "Marine"=12781,
                                                             "Terrestrial"=8537,
                                                             "Freshwater"=6542,
                                                             "Artificial"=452))
In [ ]:
#Calculate ratio
str(ThresholdDB)

ThresholdDB$ratio <- ThresholdDB$occEunis/as.numeric(as.character(ThresholdDB$occDomain))

Threshold <- 0.05
for (b in 1:nrow(ThresholdDB)) {
  if (!is.na(ThresholdDB$ratio[b])) {
    if(ThresholdDB$ratio[b] < Threshold){ThresholdDB$occEunis[b] <- 0} 
  }
}
'data.frame':	14 obs. of  4 variables:
 $ eunisL0  : Factor w/ 14 levels "Marine benthic",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ domain   : Factor w/ 5 levels "Marine","Freshwater",..: 1 1 1 2 2 2 3 3 3 3 ...
 $ occEunis : num  6936 5283 608 4552 2096 ...
 $ occDomain: Factor w/ 5 levels "12781","6542",..: 1 1 1 2 2 2 3 3 3 3 ...
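The per-domain totals above are typed in by hand. As an alternative sketch (not the notebook's own method), the totals can be derived from the aggregated counts with `ave()`, shown here on the six rows printed by `head(ThresholdDB)`. Summing those rows gives 12,827 for Marine and 6,700 for Freshwater, slightly different from the hard-coded 12,781 and 6,542, so deriving the totals from the data may be safer. The column name `keep` is illustrative:

```r
# The six rows printed by head(ThresholdDB) above
ThresholdDB <- data.frame(
  eunisL0  = c("Marine benthic", "Marine pelagic", "Coastal",
               "Standing waters", "Running waters", "Water-fringing vegetation"),
  domain   = c("Marine", "Marine", "Marine",
               "Freshwater", "Freshwater", "Freshwater"),
  occEunis = c(6936, 5283, 608, 4552, 2096, 52)
)

# Total occurrences per domain, repeated on every row of that domain
ThresholdDB$occDomain <- ave(ThresholdDB$occEunis, ThresholdDB$domain, FUN = sum)

# Share of the domain's occurrences held by each macro-habitat
ThresholdDB$ratio <- ThresholdDB$occEunis / ThresholdDB$occDomain

# Vectorised threshold check (no loop needed)
Threshold <- 0.05
ThresholdDB$keep <- ThresholdDB$ratio >= Threshold
print(ThresholdDB)
```

With these totals, "Coastal" and "Water-fringing vegetation" still fall below the 5% threshold, in line with the selection made in the next cell.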
In [ ]:
#By applying a threshold of 0.05 the "Water-fringing vegetation", "Coastal", "Heathland, scrub and tundra"
#and "Inland habitats with sparse vegetation" macro-habitats are cut off
#Create a "clean" dataset after thresholding, to be used for downstream analyses
cleanDB <- fullDB[which(fullDB$eunisL0=="Marine benthic"|
                          fullDB$eunisL0=="Marine pelagic"|
                          fullDB$eunisL0=="Standing waters"|
                          fullDB$eunisL0=="Running waters"|
                          fullDB$eunisL0=="Woodlands"|
                          fullDB$eunisL0=="Grasslands"|
                          fullDB$eunisL0=="Artificial waters"|
                          fullDB$eunisL0=="Artificial terrestrial"|
                          fullDB$eunisL0=="Saline lagoons"|
                          fullDB$eunisL0=="Brackish lagoons"),]
dim(cleanDB)
head(cleanDB)
  1. 31180
  2. 18
A data.frame: 6 × 18
eventdatedecimallatitudedecimallongitudephylumclassorderfamilygenusscientificnamescientificnameauthorshipeunishabitatstypecodealieneunisspeciesgroupsdataseteunishabitatsLevel1eunisL0domainocc
<fct><chr><chr><fct><fct><fct><fct><fct><fct><fct><fct><chr><fct><chr><fct><fct><fct><dbl>
12011 40.7098114.26501MolluscaGastropoda Cerithiopsidae Cerithiopsis Cerithiopsis leucophrys A4noInvertebratesMarineA4Marine benthicMarine1
22011 40.7098114.26501MolluscaGastropoda Cerithiopsidae Cerithiopsis Cerithiopsis nicephorus A4noInvertebratesMarineA4Marine benthicMarine1
32011 40.7098114.26501MolluscaGastropodaNeogastropodaMangeliidae BrachycytharaBrachycythara zenetouae A4noInvertebratesMarineA4Marine benthicMarine1
42011 40.7098114.26501MolluscaGastropodaNeogastropodaClathurellidae Clathurella Clathurella gracilis A4noInvertebratesMarineA4Marine benthicMarine1
52001 40.669 14.419 CnidariaHydrozoa AnthoathecataCordylophoridaeCordylophora Cordylophora neapolitanaA3noInvertebratesMarineA3Marine benthicMarine1
62010/201139.61 18.465 CnidariaAnthozoa Scleractinia CaryophylliidaeKadophellia Kadophellia bathyalis A6noInvertebratesMarineA6Marine benthicMarine1

Analyses on species count

In [ ]:
#Abundance analyses - number of alien and native species per macro-habitat and domain
#Calculate number of species per environmental domain
speciesNumb <-  cleanDB %>% 
    group_by(domain, alien) %>%
    summarise(
      totSpecies=length(unique(scientificname))
    )  
head(speciesNumb)
`summarise()` has grouped output by 'domain'. You can override using the
`.groups` argument.
A grouped_df: 6 × 3
  domain       alien  totSpecies
  <fct>        <chr>  <int>
  Marine       no     3744
  Marine       yes    58
  Freshwater   no     1640
  Freshwater   yes    62
  Terrestrial  no     4677
  Terrestrial  yes    137
In [ ]:
# Plot 1 - species numbers by domains

speciesNumb <- speciesNumb %>% mutate(alienInv=ifelse(alien=="no", totSpecies, totSpecies*-1))


barplot1 <- ggplot(speciesNumb, aes(x=domain, y=alienInv))+
  geom_bar(aes(fill=alien), stat = "identity", position = "identity", alpha=0.8, color="black", width = 0.5)+
  geom_label(aes(label=format(totSpecies, big.mark = ",")), vjust=ifelse(speciesNumb$alienInv<0, 1.20, -0.2))+
  scale_fill_manual(labels=c("no"="Native", "yes"="Alien"), values = c("no"="aquamarine4", "yes"="black"))+
  scale_x_discrete(limits=c("Artificial", "Freshwater", "Transitional",  "Marine", "Terrestrial"))+
  geom_hline(yintercept=0, size=1)+
  theme_classic()+ labs(title="Number of native and alien species in environmental domains")+
  theme(axis.ticks = element_blank(), axis.line = element_blank(), axis.text.y = element_blank(),
        axis.text.x = element_text(color="black", size = 14), axis.title = element_blank(), legend.position = "none",
        legend.title = element_blank(), legend.text = element_text(size = 14))
barplot1
In [ ]:
#Save plot as png
png("/home/jovyan/work/output/SpeciesNumbByDomains.png", width = 500, height = 600, units = "px")
barplot1
dev.off()
png: 2
In [ ]:
#Calculate number of species per EUNIS macro-habitat

speciesNumb2 <-  cleanDB %>% 
  group_by(eunisL0, alien) %>%
  summarise(
    totSpecies=length(unique(scientificname))
  )  
head(speciesNumb2)
`summarise()` has grouped output by 'eunisL0'. You can override using the
`.groups` argument.
A grouped_df: 6 × 3
  eunisL0          alien  totSpecies
  <fct>            <chr>  <int>
  Marine benthic   no     2858
  Marine benthic   yes    51
  Marine pelagic   no     1253
  Marine pelagic   yes    9
  Standing waters  no     1369
  Standing waters  yes    51
In [ ]:
# Plot 2 - species number by EUNIS macro-habitats

speciesNumb2 <- speciesNumb2 %>% mutate(alienInv=ifelse(alien=="no", totSpecies, totSpecies*-1))
In [ ]:
barplot2 <- ggplot(speciesNumb2, aes(x=eunisL0, y=alienInv))+
  geom_bar(aes(fill=alien), stat = "identity", position = "identity", alpha=0.8, color="black", width = 0.7)+
  geom_label(aes(label=format(totSpecies, big.mark = ",")), hjust=ifelse(speciesNumb2$alienInv<0, 1.20, -0.2))+
  scale_fill_manual(labels=c("no"="Native", "yes"="Alien"), values = c("no"="aquamarine4", "yes"="black"))+
  scale_x_discrete(limits=c("Artificial waters","Artificial terrestrial", "Grasslands","Woodlands", "Running waters","Standing waters", 
                            "Saline lagoons","Brackish lagoons","Marine pelagic","Marine benthic"))+
  coord_flip()+
  geom_hline(yintercept=0, size=1)+
  theme_classic()+ labs(title="Number of native and alien species in EUNIS habitats")+
  theme(axis.ticks = element_blank(), axis.line = element_blank(), axis.text.x = element_blank(),
        axis.text.y = element_text(color="black", size = 12), axis.title = element_blank(), legend.position = "bottom",
        legend.title = element_blank(), legend.text = element_text(size = 12))
barplot2
In [ ]:
#Save plot as png
png("/home/jovyan/work/output/SpeciesNumbByMacro-Habitats.png", width = 8000, height = 5000, units = "px", res = 500)
barplot2
dev.off()
png: 2
In [ ]:
#Calculate the species incidence on macro-habitats and domains 
speciesNumb3 <-  cleanDB %>% 
  group_by(domain,eunisL0, alien) %>%
  summarise(
    totSpecies=length(unique(scientificname))
  )  
head(speciesNumb3) 
`summarise()` has grouped output by 'domain', 'eunisL0'. You can override using
the `.groups` argument.
A grouped_df: 6 × 4
  domain      eunisL0          alien  totSpecies
  <fct>       <fct>            <chr>  <int>
  Marine      Marine benthic   no     2858
  Marine      Marine benthic   yes    51
  Marine      Marine pelagic   no     1253
  Marine      Marine pelagic   yes    9
  Freshwater  Standing waters  no     1369
  Freshwater  Standing waters  yes    51

Command used to clear the garbage occupying memory; use it when the RAM is full

In [ ]:
SpeciesInc <- dcast(speciesNumb3, eunisL0+domain~alien, value.var = "totSpecies")
head(SpeciesInc)

SpeciesInc$incidence <- SpeciesInc$yes/SpeciesInc$no
A data.frame: 6 × 4
  eunisL0          domain       no    yes
  <fct>            <fct>        <int> <int>
1 Marine benthic   Marine       2858  51
2 Marine pelagic   Marine       1253  9
3 Standing waters  Freshwater   1369  51
4 Running waters   Freshwater   598   34
5 Grasslands       Terrestrial  855   23
6 Woodlands        Terrestrial  4089  119
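As a worked example of the incidence defined above, the sketch below recomputes it for the six rows shown. Incidence here is the ratio of alien to native species (`yes/no`), not the alien share of all species. The extra column `alien_share` is an illustrative addition to make that difference visible:

```r
# The six rows printed by head(SpeciesInc) above
species_counts <- data.frame(
  eunisL0 = c("Marine benthic", "Marine pelagic", "Standing waters",
              "Running waters", "Grasslands", "Woodlands"),
  no  = c(2858, 1253, 1369, 598, 855, 4089),
  yes = c(51, 9, 51, 34, 23, 119)
)

# Incidence as used in the notebook: alien species per native species
species_counts$incidence <- species_counts$yes / species_counts$no

# For comparison only: alien species as a fraction of all species
species_counts$alien_share <- species_counts$yes /
  (species_counts$yes + species_counts$no)

# Habitats ordered from highest to lowest incidence
print(species_counts[order(-species_counts$incidence), ])
```

Because alien counts are small relative to native counts, the two measures are close, but the ratio is always the larger of the two.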
In [ ]:
## Plot 3 - Alien incidence across environmental domains

box1 <-  ggboxplot(SpeciesInc, x="domain", y="incidence", xlab = FALSE, ylab = "Incidence\n",
                   width = 0.9, size = 0.7, font.label = list(size=14), fill = "domain", alpha=0.7)+
  scale_fill_manual(values = c("Marine"="darkslategray2", "Terrestrial"="indianred4", "Freshwater"="palegreen4", 
                               "Transitional"="steelblue4", "Artificial"="gray"))+
  scale_x_discrete(limits=c("Marine","Transitional", "Freshwater", "Terrestrial",
                            "Artificial"))+
  labs(title="Alien species incidence in environmental domains")+
  theme_pubclean()+theme(axis.text = element_text(colour = "black", size = 11), axis.ticks.x = element_blank(),
                         axis.title.x = element_blank(), legend.position = "none")


box1
In [ ]:
#Save plot as png
png("/home/jovyan/work/output/AlienIncidenceDomains.png", width = 1500, height = 1500, units = "px", res = 200)
box1
dev.off()
png: 2
In [ ]:
## Plot 4 - Alien incidence in EUNIS macro-habitats

barplot3 <-  ggplot(SpeciesInc, aes(x=eunisL0, y=incidence))+
  geom_bar(aes(fill=domain), stat = "identity", width = .5, alpha=.7, color="black")+
  ylab("Incidence\n")+
  scale_fill_manual(values = c("Terrestrial"="indianred4","Marine" ="darkslategray2", 
                               "Transitional"="steelblue4", "Freshwater"="palegreen4", "Artificial"="gray"))+
  scale_x_discrete(limits=c("Marine pelagic","Marine benthic", "Saline lagoons", "Brackish lagoons",
                            "Standing waters","Running waters", "Grasslands","Woodlands",
                            "Artificial terrestrial", "Artificial waters"))+
  labs(title="Alien species incidence in EUNIS habitats")+
  theme_pubclean()+theme(axis.text = element_text(colour = "black", size = 11), 
                         axis.text.x = element_text(angle = 45, hjust = 1),
                         axis.ticks.x = element_blank(),
                         axis.title.x = element_blank(), legend.position = "none")
barplot3
In [ ]:
#Save plot as png
png("/home/jovyan/work/output/AlienIncidenceHabitats.png", width = 1500, height = 1500, units = "px", res = 200)
barplot3
dev.off()
png: 2
In [ ]:
#Chi-Square Test of Independence
#The chi-square test evaluates whether there is a significant association between two categorical variables,
#here EUNIS habitat and species status (alien vs. native).

#Prepare dataset for analysis
chiHab <- SpeciesInc[,-1]
rownames(chiHab) <- SpeciesInc[,1]
In [ ]:
#Run chi-square test
chisq <- chisq.test(chiHab[,2:3], simulate.p.value=T)
#simulate.p.value is used because some expected counts are small (below 5)
In [ ]:
#Check test results
chisq
	Pearson's Chi-squared test with simulated p-value (based on 2000
	replicates)

data:  chiHab[, 2:3]
X-squared = 88.58, df = NA, p-value = 0.0004998

The results indicate that habitat and species status (alien/native) are significantly associated (p-value < 0.05).

Chi-square statistic is 88.58
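The statistic can be reproduced by hand from the observed counts, which may help in reading the test output. The sketch below uses the ten habitat rows of the contingency table and the usual formula, the sum of (observed - expected)^2 / expected. The p-value computed here is the asymptotic one, which is an assumption that differs from the simulated p-value used in the notebook:

```r
habitats <- c("Marine benthic", "Marine pelagic", "Standing waters",
              "Running waters", "Grasslands", "Woodlands",
              "Artificial terrestrial", "Artificial waters",
              "Saline lagoons", "Brackish lagoons")

# Observed native ("no") and alien ("yes") species counts per habitat
observed <- matrix(
  c(2858, 51, 1253, 9, 1369, 51, 598, 34, 855, 23,
    4089, 119, 155, 10, 96, 7, 886, 37, 1508, 79),
  ncol = 2, byrow = TRUE, dimnames = list(habitats, c("no", "yes"))
)

# Expected counts under independence: row total * column total / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson chi-square statistic and its degrees of freedom
x_squared <- sum((observed - expected)^2 / expected)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

# Asymptotic p-value (the notebook uses a simulated p-value instead)
p_asymptotic <- pchisq(x_squared, df = df, lower.tail = FALSE)

cat("X-squared =", round(x_squared, 2), " df =", df, "\n")
```

Two expected alien counts fall below 5 (both artificial habitats), which is the usual reason for preferring a simulated p-value over the asymptotic one.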

In [ ]:
#Extract and inspect observed and expected counts
chisq$observed
round(chisq$expected, 2)
A matrix: 10 × 2 of type int
                        no    yes
Marine benthic          2858  51
Marine pelagic          1253  9
Standing waters         1369  51
Running waters          598   34
Grasslands              855   23
Woodlands               4089  119
Artificial terrestrial  155   10
Artificial waters       96    7
Saline lagoons          886   37
Brackish lagoons        1508  79
A matrix: 10 × 2 of type dbl
                        no       yes
Marine benthic          2822.27  86.73
Marine pelagic          1224.37  37.63
Standing waters         1377.66  42.34
Running waters          613.16   18.84
Grasslands              851.82   26.18
Woodlands               4082.54  125.46
Artificial terrestrial  160.08   4.92
Artificial waters       99.93    3.07
Saline lagoons          895.48   27.52
Brackish lagoons        1539.68  47.32
In [ ]:
#Extract chi-square residuals
round(chisq$residuals, 3)
A matrix: 10 × 2 of type dbl
                        no      yes
Marine benthic           0.673  -3.837
Marine pelagic           0.818  -4.667
Standing waters         -0.233   1.331
Running waters          -0.612   3.492
Grasslands               0.109  -0.621
Woodlands                0.101  -0.577
Artificial terrestrial  -0.402   2.291
Artificial waters       -0.393   2.242
Saline lagoons          -0.317   1.807
Brackish lagoons        -0.807   4.606
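Each Pearson residual is (observed - expected) / sqrt(expected). The sketch below recomputes the residuals of the alien ("yes") column from the observed and rounded expected counts printed above, then flags habitats whose absolute residual exceeds 2. That cut-off is a common rule of thumb and an assumption here, not something the notebook states:

```r
habitats <- c("Marine benthic", "Marine pelagic", "Standing waters",
              "Running waters", "Grasslands", "Woodlands",
              "Artificial terrestrial", "Artificial waters",
              "Saline lagoons", "Brackish lagoons")

# Alien ("yes") column: observed counts and expected counts as printed above
obs_yes <- c(51, 9, 51, 34, 23, 119, 10, 7, 37, 79)
exp_yes <- c(86.73, 37.63, 42.34, 18.84, 26.18, 125.46, 4.92, 3.07, 27.52, 47.32)

# Pearson residuals; small differences from the printed values come from
# using expected counts rounded to two decimals
res_yes <- setNames((obs_yes - exp_yes) / sqrt(exp_yes), habitats)
print(round(res_yes, 2))

# Habitats with fewer (negative) or more (positive) aliens than expected
flagged <- names(res_yes)[abs(res_yes) > 2]
print(flagged)
```

Negative residuals (both marine habitats) mean fewer alien species than expected under independence. Positive ones (running waters, artificial habitats, brackish lagoons) mean more.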
In [ ]:
#Residuals can be used to evaluate the cells of the contingency table that contribute the most to the total Chi-square score.
#A correlation plot can be used to visualise the value of residuals contributing the most to the chi-square score
#as well as the negative or positive associations between categorical variables
library(corrplot) #corrplot() and COL2() come from the corrplot package, which is not loaded above
corrplot(chisq$residuals, is.corr = FALSE, addCoef.col = "black", col = COL2("BrBG"), tl.col = "black", cl.pos = "n")
In [ ]:
#Analysis of variance
#The ANOVA tests whether mean incidence differs significantly among domains
specinc.aov <- aov(incidence~domain, data = SpeciesInc)
#Summary of the analysis
summary(specinc.aov)
#The p-value is below the 0.05 significance level, hence the differences in incidence between domains are significant
            Df   Sum Sq   Mean Sq F value  Pr(>F)   
domain       4 0.003643 0.0009108   13.27 0.00714 **
Residuals    5 0.000343 0.0000686                   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
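The ANOVA table can be reproduced from the species counts in the contingency table, which also shows how little data sits behind it: each domain contributes only two macro-habitats. The sketch below rebuilds the ten incidence values and refits the model. The object names are illustrative:

```r
# Native ("no") and alien ("yes") species counts for the ten macro-habitats,
# two per domain, in the order of the contingency table above
inc <- data.frame(
  domain = factor(rep(c("Marine", "Freshwater", "Terrestrial",
                        "Artificial", "Transitional"), each = 2)),
  no  = c(2858, 1253, 1369, 598, 855, 4089, 155, 96, 886, 1508),
  yes = c(51, 9, 51, 34, 23, 119, 10, 7, 37, 79)
)
inc$incidence <- inc$yes / inc$no

# One-way ANOVA of incidence by domain
fit <- aov(incidence ~ domain, data = inc)
anova_table <- summary(fit)[[1]]
f_value <- anova_table[["F value"]][1]
p_value <- anova_table[["Pr(>F)"]][1]

# Mean incidence per domain, the quantities compared by the test
domain_means <- tapply(inc$incidence, inc$domain, mean)
print(round(domain_means, 4))
cat("F =", round(f_value, 2), " p =", signif(p_value, 3), "\n")
```

With two observations per group, the residual variance rests on 5 degrees of freedom, so the result is best read as indicative.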
In [ ]:
#Tukey test for multiple pairwise comparisons
TukeyHSD(specinc.aov)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = incidence ~ domain, data = SpeciesInc)

$domain
                                  diff          lwr        upr     p adj
Freshwater-Marine         3.454112e-02  0.001308933 0.06777332 0.0432056
Terrestrial-Marine        1.548782e-02 -0.017744368 0.04872002 0.4317264
Artificial-Marine         5.620269e-02  0.022970502 0.08943489 0.0056185
Transitional-Marine       3.456029e-02  0.001328100 0.06779248 0.0431142
Terrestrial-Freshwater   -1.905330e-02 -0.052285493 0.01417889 0.2802635
Artificial-Freshwater     2.166157e-02 -0.011570622 0.05489376 0.2020280
Transitional-Freshwater   1.916664e-05 -0.033213025 0.03325136 1.0000000
Artificial-Terrestrial    4.071487e-02  0.007482679 0.07394706 0.0225229
Transitional-Terrestrial  1.907247e-02 -0.014159724 0.05230466 0.2795951
Transitional-Artificial  -2.164240e-02 -0.054874594 0.01158979 0.2025161

Analyses on species occurrences

In [ ]:
#Abundance analyses - number of alien and native species records per macro-habitat and domain

#Calculate and plot number of alien and native occurrences per macro-habitat

#Aggregate dataset and rename variables
subDB1 <- aggregate(cleanDB$occ, by=list(cleanDB$eunisL0, cleanDB$alien), FUN="sum")
subDB1 <- plyr::rename(subDB1, c("Group.1"="eunisL0", "Group.2"="alien", "x"="occ"))
head(subDB1)
A data.frame: 6 × 3
  eunisL0          alien  occ
  <fct>            <chr>  <dbl>
1 Marine benthic   no     6826
2 Marine pelagic   no     5253
3 Standing waters  no     4357
4 Running waters   no     1971
5 Grasslands       no     1618
6 Woodlands        no     6310
In [ ]:
#Generate barplot for visualisation

subDB1 <- subDB1 %>% mutate(alienInv=ifelse(alien=="no", occ, occ*-1))

ggplot(subDB1, aes(x=eunisL0, y=alienInv))+
  geom_bar(aes(fill=alien), stat = "identity", position = "identity", alpha=0.8, color="black", width = 0.7)+
  geom_label(aes(label=format(occ, big.mark = ",")), hjust=ifelse(subDB1$alienInv<0, 1.20, -0.2))+
  scale_fill_manual(labels=c("no"="Native", "yes"="Alien"), values = c("no"="aquamarine4", "yes"="black"))+
  scale_x_discrete(limits=c("Artificial waters","Artificial terrestrial", "Grasslands","Woodlands", "Running waters","Standing waters", 
                            "Saline lagoons","Brackish lagoons","Marine pelagic","Marine benthic"))+
  coord_flip()+
  geom_hline(yintercept=0, size=1)+
  theme_classic()+ labs(title="Number of native and alien species occurrences in EUNIS habitats")+
  theme(axis.ticks = element_blank(), axis.line = element_blank(), axis.text.x = element_blank(),
        axis.text.y = element_text(color="black", size = 12), axis.title = element_blank(), legend.position = "bottom",
        legend.title = element_blank(), legend.text = element_text(size = 12))
In [ ]:
#Calculate and plot number of alien and native occurrences per domain

#Aggregate dataset and rename variables
subDB2 <- aggregate(cleanDB$occ, by=list(cleanDB$domain, cleanDB$alien), FUN="sum")
subDB2 <- plyr::rename(subDB2, c("Group.1"="domain", "Group.2"="alien", "x"="occ"))
head(subDB2)
A data.frame: 6 × 3
  domain        alien  occ
  <fct>         <chr>  <dbl>
1 Marine        no     12079
2 Freshwater    no     6328
3 Terrestrial   no     7928
4 Artificial    no     434
5 Transitional  no     3588
6 Marine        yes    140
In [ ]:
#Generate barplot for visualisation
subDB2 <- subDB2 %>% mutate(alienInv=ifelse(alien=="no", occ, occ*-1))

ggplot(subDB2, aes(x=domain, y=alienInv))+
  geom_bar(aes(fill=alien), stat = "identity", position = "identity", alpha=0.8, color="black", width = 0.5)+
  geom_label(aes(label=format(occ, big.mark = ",")), vjust=ifelse(subDB2$alienInv<0, 1.20, -0.2))+
  scale_fill_manual(labels=c("no"="Native", "yes"="Alien"), values = c("no"="aquamarine4", "yes"="black"))+
  scale_x_discrete(limits=c("Artificial","Transitional", "Freshwater", "Terrestrial", "Marine"))+
  geom_hline(yintercept=0, size=1)+
  theme_classic()+labs(title="Number of native and alien species occurrences in environmental domains")+
  theme(axis.ticks = element_blank(), axis.line = element_blank(), axis.text.y = element_blank(),
        axis.text.x = element_text(color="black", size = 12), axis.title = element_blank(), legend.position = "none",
        legend.title = element_blank(), legend.text = element_text(size = 12))
In [ ]:
#Calculate the species incidence on macro-habitats and domains 

HabInc <- dcast(cleanDB, eunisL0+domain~alien, value.var = "occ", fun.aggregate = sum)
head(HabInc)

HabInc$incidence <- HabInc$yes/HabInc$no
A data.frame: 6 × 4
  eunisL0          domain       no    yes
  <fct>            <fct>        <dbl> <dbl>
1 Marine benthic   Marine       6826  110
2 Marine pelagic   Marine       5253  30
3 Standing waters  Freshwater   4357  195
4 Running waters   Freshwater   1971  125
5 Grasslands       Terrestrial  1618  32
6 Woodlands        Terrestrial  6310  143
In [ ]:
#Plot alien species incidence in environmental domains
ggboxplot(HabInc, x="domain", y="incidence", xlab = FALSE, ylab = "Incidence\n",
                   width = 0.9, size = 0.7, font.label = list(size=14), fill = "domain", alpha=0.7)+
  scale_fill_manual(values = c("Marine"="darkslategray2", "Terrestrial"="indianred4", "Freshwater"="palegreen4", 
                               "Transitional"="steelblue4", "Artificial"="gray"))+
    scale_x_discrete(limits=c("Marine","Transitional", "Freshwater", "Terrestrial",
                              "Artificial"))+
  labs(title="Alien species incidence in environmental domains")+
  theme_pubclean()+theme(axis.text = element_text(colour = "black", size = 11), axis.ticks.x = element_blank(),
                         axis.title.x = element_blank(), legend.position = "none")

Incidence values remain consistent between number of species and species occurrences!
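One way to check this consistency numerically is to compare how the two incidence measures rank the habitats. The sketch below does so for the six macro-habitats whose counts are printed above, using a Spearman rank correlation. The choice of Spearman and the restriction to six habitats are assumptions made for illustration:

```r
habitats <- c("Marine benthic", "Marine pelagic", "Standing waters",
              "Running waters", "Grasslands", "Woodlands")

# Incidence based on number of species (alien / native)
inc_species <- c(51, 9, 51, 34, 23, 119) / c(2858, 1253, 1369, 598, 855, 4089)

# Incidence based on number of occurrences (alien / native)
inc_occ <- c(110, 30, 195, 125, 32, 143) / c(6826, 5253, 4357, 1971, 1618, 6310)

comparison <- data.frame(habitat = habitats,
                         species = round(inc_species, 4),
                         occurrences = round(inc_occ, 4))
print(comparison)

# Rank agreement between the two measures
rho <- cor(inc_species, inc_occ, method = "spearman")
cat("Spearman rho =", rho, "\n")
```

For these six habitats both measures give the same ordering, from marine pelagic (lowest) to running waters (highest).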

In [ ]:
#Plot alien species incidence in EUNIS habitats
ggplot(HabInc, aes(x=eunisL0, y=incidence))+
    geom_bar(aes(fill=domain), stat = "identity", width = .5, alpha=.7, color="black")+
    ylab("Incidence\n")+
    scale_fill_manual(values = c("Terrestrial"="indianred4","Marine" ="darkslategray2", 
                                 "Transitional"="steelblue4", "Freshwater"="palegreen4", "Artificial"="gray"))+
    scale_x_discrete(limits=c("Marine pelagic","Marine benthic", "Saline lagoons", "Brackish lagoons",
                            "Standing waters","Running waters", "Grasslands","Woodlands",
                            "Artificial terrestrial", "Artificial waters"))+
    labs(title="Alien species incidence in EUNIS habitats")+
    theme_pubclean()+theme(axis.text = element_text(colour = "black", size = 11), 
                           axis.text.x = element_text(angle = 45, hjust = 1),
                           axis.ticks.x = element_blank(),
                         axis.title.x = element_blank(), legend.position = "none")

Analysis of the contribution of alien taxonomic groups in different environmental domains

In [ ]:
#Generate dataframe for analysis
alienDB_domain = dcast(cleanDB[which(cleanDB$alien=="yes"),], domain+eunisL0~eunisspeciesgroups, value.var = "occ", fun.aggregate = sum)
head(alienDB_domain)
dim(alienDB_domain)
A data.frame: 6 × 14
  domain       eunisL0          Algae  Cyanobacteria  Fishes  Flowering plants  Invertebrates  Amphibians  Birds  Reptiles  Mosses and liverworts  Mammals  Conifers  Fungi
  <fct>        <fct>            <dbl>  <dbl>          <dbl>   <dbl>             <dbl>          <dbl>       <dbl>  <dbl>     <dbl>                  <dbl>    <dbl>     <dbl>
1 Marine       Marine benthic   52     0              1       1                 56             0           0      0         0                      0        0         0
2 Marine       Marine pelagic   21     0              0       0                 9              0           0      0         0                      0        0         0
3 Freshwater   Standing waters  1      2              123     5                 63             0           0      0         1                      0        0         0
4 Freshwater   Running waters   0      0              101     5                 18             1           0      0         0                      0        0         0
5 Terrestrial  Grasslands       0      0              0       31                1              0           0      0         0                      0        0         0
6 Terrestrial  Woodlands        0      0              0       91                11             0           0      0         17                     1        13        10
  1. 10
  2. 14
In [ ]:
#To run the analysis two datasets should be used with (1) taxonomic groups abundances and (2) environmental information
alienData <- alienDB_domain[,3:ncol(alienDB_domain)]
head(alienData)
A data.frame: 6 × 12
  Algae  Cyanobacteria  Fishes  Flowering plants  Invertebrates  Amphibians  Birds  Reptiles  Mosses and liverworts  Mammals  Conifers  Fungi
  <dbl>  <dbl>          <dbl>   <dbl>             <dbl>          <dbl>       <dbl>  <dbl>     <dbl>                  <dbl>    <dbl>     <dbl>
1 52     0              1       1                 56             0           0      0         0                      0        0         0
2 21     0              0       0                 9              0           0      0         0                      0        0         0
3 1      2              123     5                 63             0           0      0         1                      0        0         0
4 0      0              101     5                 18             1           0      0         0                      0        0         0
5 0      0              0       31                1              0           0      0         0                      0        0         0
6 0      0              0       91                11             0           0      0         17                     1        13        10
In [ ]:
#Compute rank abundances for alien species for all domains and taxonomic groups
RankAbu <- rankabundance(alienData)
RankAbu
A matrix: 12 × 8 of type dbl
                       rank  abundance  proportion  plower  pupper  accumfreq  logabun  rankfreq
Invertebrates          1     270        32.8        14.1    51.5    32.8       2.4      8.3
Fishes                 2     239        29.0        -3.1    61.2    61.8       2.4      16.7
Flowering plants       3     133        16.2        -8.1    40.4    78.0       2.1      25.0
Algae                  4     123        14.9        -2.2    32.1    93.0       2.1      33.3
Mosses and liverworts  5     18         2.2         -2.2    6.5     95.1       1.3      41.7
Conifers               6     13         1.6         -1.8    5.0     96.7       1.1      50.0
Fungi                  7     10         1.2         -1.4    3.8     97.9       1.0      58.3
Birds                  8     5          0.6         -0.6    1.8     98.5       0.7      66.7
Mammals                9     5          0.6         -0.6    1.9     99.1       0.7      75.0
Reptiles               10    4          0.5         -0.5    1.5     99.6       0.6      83.3
Cyanobacteria          11    2          0.2         -0.2    0.7     99.9       0.3      91.7
Amphibians             12    1          0.1         -0.1    0.4     100.0      0.0      100.0
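To show what the main columns of this table mean, the sketch below recomputes several of them in base R from the abundances printed above. The formulas are inferred from the printed values and are an assumption about how `rankabundance()` derives them. The confidence bounds `plower` and `pupper` are not reproduced:

```r
# Alien occurrences per taxonomic group, already ordered by abundance
abundance <- c(Invertebrates = 270, Fishes = 239, `Flowering plants` = 133,
               Algae = 123, `Mosses and liverworts` = 18, Conifers = 13,
               Fungi = 10, Birds = 5, Mammals = 5, Reptiles = 4,
               Cyanobacteria = 2, Amphibians = 1)

n_groups <- length(abundance)
total <- sum(abundance)

rank_table <- data.frame(
  rank       = seq_len(n_groups),
  abundance  = unname(abundance),
  proportion = round(100 * unname(abundance) / total, 1),          # % of all records
  accumfreq  = round(100 * cumsum(unname(abundance)) / total, 1),  # cumulative %
  logabun    = round(log10(unname(abundance)), 1),                 # log10 abundance
  rankfreq   = round(100 * seq_len(n_groups) / n_groups, 1),       # rank as % of groups
  row.names  = names(abundance)
)
print(rank_table)
```

Read this way, the first four groups (invertebrates, fishes, flowering plants and algae) account for about 93% of all alien occurrences.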
In [ ]:
#Plot rank abundance curve
rankabunplot(RankAbu, scale = "abundance", specnames = c(1:6))
In [ ]:
#Generate environmental dataset
domain <- as.data.frame(alienDB_domain[,1:2])
head(domain)
str(domain)
A data.frame: 6 × 2
  domain       eunisL0
  <fct>        <fct>
1 Marine       Marine benthic
2 Marine       Marine pelagic
3 Freshwater   Standing waters
4 Freshwater   Running waters
5 Terrestrial  Grasslands
6 Terrestrial  Woodlands
'data.frame':	10 obs. of  2 variables:
 $ domain : Factor w/ 5 levels "Marine","Freshwater",..: 1 1 2 2 3 3 4 4 5 5
 $ eunisL0: Factor w/ 14 levels "Marine benthic",..: 1 2 4 5 7 9 11 12 13 14
In [ ]:
#Plot rank abundance curves across environmental domains
alienrank <- rankabuncomp(alienData, y=domain, factor = "domain", scale = "abundance", 
                            return.data = T,specnames = c(1:5), legend = FALSE)
In [ ]:
#Use ggplot
alienRankPlot <- ggplot(data = alienrank[which(alienrank$Grouping!="Artificial"),], aes(x=rank, y=abundance))+
    scale_x_continuous(expand = c(0,1), sec.axis = dup_axis(labels = NULL, name = NULL))+
    scale_y_continuous(expand = c(0,1), sec.axis = dup_axis(labels = NULL, name = NULL))+
    geom_line(aes(color=Grouping), size=1, alpha=0.7, linetype="dashed")+
    geom_point(aes(color=Grouping), size=4)+
    geom_text_repel(data = subset(alienrank[which(alienrank$Grouping!="Artificial"),], labelit==T),
                   aes(label=species), angle=0, nudge_x = 1.5, nudge_y = 0, show.legend = F)+
    facet_wrap(~factor(Grouping, levels=c("Marine", "Transitional", "Freshwater", "Terrestrial")))+
    theme_pubclean()+
    scale_color_manual(values = c("Terrestrial"="indianred4","Marine" ="darkslategray3", 
    "Transitional"="steelblue4", "Freshwater"="palegreen4"))+
    labs(x="\nTaxonomic groups rank", y="Alien abundance\n")+
    theme(panel.grid = element_blank(), legend.position = "none", axis.ticks.y.right = element_blank(),
          axis.text = element_text(colour = "black"),
          axis.ticks.x.top = element_blank(), strip.background = element_rect(fill = "transparent",
                                                                              color = "black"),
          strip.placement = "outside",
          strip.text = element_text(size=12))
  
alienRankPlot
In [ ]:
png("/home/jovyan/work/output/AlienRankAbundanceCurves.png", width = 4000, height = 4000, units = "px", res = 500)
alienRankPlot
dev.off()
png: 2
In [ ]:
#gc()
A matrix: 2 × 6 of type dbl
        used     (Mb)   gc trigger  (Mb)   max used  (Mb)
Ncells  2391377  127.8  4762535     254.4  4566178   243.9
Vcells  4099994  31.3   8388608     64.0   7022986   53.6