PyBDA: a command line tool for automated analysis of big biological data sets
JournalArticle (Originalarbeit in einer wissenschaftlichen Zeitschrift)
 
ID 4526726
Author(s) Dirmeier, Simon; Emmenlauer, Mario; Dehio, Christoph; Beerenwinkel, Niko
Author(s) at UniBasel Dehio, Christoph; Emmenlauer, Mario
Year 2019
Title PyBDA: a command line tool for automated analysis of big biological data sets
Journal BMC bioinformatics
Volume 20
Number 1
Pages / Article-Number 564
Keywords Big data; Command line; Computing cluster; Data analysis; Grid engine; Machine learning; Pipeline
Mesh terms Algorithms; Automation; Computational Biology, methods; Computing Methodologies; HeLa Cells; Humans; Image Processing, Computer-Assisted; Machine Learning
Abstract Analysing large and high-dimensional biological data sets poses significant computational difficulties for bioinformaticians due to a lack of accessible tools that scale to hundreds of millions of data points. We developed a novel machine learning command line tool called PyBDA for automated, distributed analysis of big biological data sets. By using Apache Spark in the backend, PyBDA scales to data sets beyond the size of current applications. It uses Snakemake to automatically schedule jobs to a high-performance computing cluster. We demonstrate the utility of the software by analysing image-based RNA interference data of 150 million single cells. PyBDA allows automated, easy-to-use data analysis using common statistical methods and machine learning algorithms. It can be used entirely through simple command line calls, making it accessible to a broad user base. PyBDA is available at https://pybda.rtfd.io.
Publisher BioMed Central
ISSN/ISBN 1471-2105
URL https://doi.org/10.1186/s12859-019-3087-8
edoc-URL https://edoc.unibas.ch/74629/
Full Text on edoc Available
Digital Object Identifier DOI 10.1186/s12859-019-3087-8
PubMed ID http://www.ncbi.nlm.nih.gov/pubmed/31718539
ISI-Number WOS:000497733100002
Document type (ISI) Journal Article
 
   
