Chromosome-scale genome assembly of the sea louse Caligus rogercresseyi by SMRT sequencing and Hi-C analysis

Caligus
genome
Authors

Cristian Gallardo-Escárate, Valentina Valenzuela-Muñoz, Gustavo Nuñez-Acuña, Diego Valenzuela-Miranda, Ana Teresa Gonçalves, Hugo Escobar-Sepulveda, Ivan Liachko, Bradley Nelson, Steven Roberts & Wesley Warren

DOI

https://doi.org/10.1038/s41597-021-00842-w

Citation

Gallardo-Escárate, C., Valenzuela-Muñoz, V., Nuñez-Acuña, G. et al. Chromosome-scale genome assembly of the sea louse Caligus rogercresseyi by SMRT sequencing and Hi-C analysis. Sci Data 8, 60 (2021). https://doi.org/10.1038/s41597-021-00842-w

Abstract

Caligus rogercresseyi, commonly known as the sea louse, is an ectoparasitic copepod that impacts salmon aquaculture in Chile, causing losses of hundreds of millions of dollars per year. In this study, we report a chromosome-scale assembly of the sea louse (C. rogercresseyi) genome based on single-molecule real-time (SMRT) sequencing and proximity ligation (Hi-C) analysis. Coding and non-coding RNAs, specifically long non-coding RNAs (lncRNAs) and microRNAs (miRNAs), were identified through whole-transcriptome sequencing of different life stages. A total of 23,686 protein-coding genes and 12,558 non-coding RNAs were annotated. In addition, 6,308 lncRNAs and 5,774 miRNAs were found to be transcriptionally active from larval to adult stages. Taken together, this genomic resource for C. rogercresseyi represents a valuable tool for developing sustainable control strategies in the salmon aquaculture industry.

Data Availability

DNA and RNA sequencing runs were deposited in the NCBI Sequence Read Archive (SRA)^[50](https://www.nature.com/articles/s41597-021-00842-w#ref-CR50 "NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRP229458 (2019)."),[51](https://www.nature.com/articles/s41597-021-00842-w#ref-CR51 "NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRP212140 (2019)."),[52](https://www.nature.com/articles/s41597-021-00842-w#ref-CR52 "NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRP067375 (2015).")^. The assembled genome has been deposited at NCBI Assembly with the accession number ASM1338718v1^[53](https://www.nature.com/articles/s41597-021-00842-w#ref-CR53 "NCBI Assembly https://identifiers.org/insdc.gca:GCA_013387185.1 (2020).")^. Additional files containing repeated sequences, gene structure, and functional prediction were deposited in the *Figshare* database^[54](https://www.nature.com/articles/s41597-021-00842-w#ref-CR54 "Gallardo-Escárate, C. Additional annotation files_GenSAS. figshare https://doi.org/10.6084/m9.figshare.12847493 (2020).")^.
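
For readers who want to script access to these deposits, the accessions cited in this statement can be collected in one place. The following is a minimal sketch, not part of the original publication: the variable names and the `data_links` helper are hypothetical, and the URLs are simply the resolver links given in the references, with no network request made.

```python
# Accessions copied from the Data Availability statement.
SRA_STUDIES = ["SRP229458", "SRP212140", "SRP067375"]
ASSEMBLY_ACCESSION = "GCA_013387185.1"  # assembly name: ASM1338718v1
FIGSHARE_DOI = "10.6084/m9.figshare.12847493"


def data_links():
    """Return a label -> URL mapping for each deposited dataset.

    Hypothetical helper: it only formats the resolver URLs listed in
    the references and does not contact any server.
    """
    links = {
        f"SRA study {acc}": f"https://identifiers.org/ncbi/insdc.sra:{acc}"
        for acc in SRA_STUDIES
    }
    links["Genome assembly"] = (
        f"https://identifiers.org/insdc.gca:{ASSEMBLY_ACCESSION}"
    )
    links["Figshare annotation files"] = f"https://doi.org/{FIGSHARE_DOI}"
    return links


if __name__ == "__main__":
    for label, url in data_links().items():
        print(f"{label}: {url}")
```

Keeping the accessions in plain constants makes it straightforward to pass them to whichever download tool is preferred.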
