This year’s conference kicked off with an interesting Data Science panel that included Tanya Cashorali (TCB Analytics), Jerald Schindler (Alkermes), John Reynders (Alexion Pharmaceuticals), and Lihua Yu (H3 Biomedicine). Without a doubt, data science is important and involves much more than big data: it spans the entire workflow from data creation to insight generation and decision-making. One key message that came across is that not all data is big data, and that we need improvements in data collection and in infrastructure support for data analysis and management. Other key take-home messages: breaking down data silos, educating about responsibility, being responsible enough not to act independently on data, creating an environment that motivates everybody within a team to share, not rewarding bad data behavior, and incentivizing data sharing so that everyone gets more out in return. The big issue here is culture. If the culture rewards bad behavior, bad behavior will dominate. Companies can manage this at the level of an individual institution, but the deeper challenge is the world of many independent labs and companies. Data science clearly needs the right tool set to address data analysis, management, and sharing demands in order to keep up with the advancements of science.
- Tanya Cashorali: “Data science can include data wrangling, summarization, exploration and visualization. Ultimately, you’re optimizing the decision-making process. How you go from data to insight is less important than the quality of the results.”
- John Reynders: “Data science is the application of mathematical, algorithmic, statistical, and computational learning methods to broad-scale, heterogeneous, and complex data to create insight and generate hypotheses. More importantly, the ability to ask the right question is key.”
- Jerald Schindler: “Data Science is the offspring of statistics and computer science, which is used to collect, store, retrieve, display, and analyze information for knowledge, insights, and decision-making.”
- Lihua Yu: “At its core, data science is about discovering insights, testing hypotheses, and providing creative solutions using ‘big data’ robustly, systematically, and objectively.”
Additional interesting quotes worth highlighting:
- Lihua Yu:
- “We have a big responsibility to democratize data.”
- “Make data very accessible to enable scientific conversation.”
- “If you cannot explain it simply, you do not understand it deeply.”
- Tanya Cashorali:
- “Everybody should become a data scientist instead of being a friend of a data scientist.”
- “Data scientists need a clear hypothesis, if left to play they will flounder.”
- “We may not all be doctors but we need to be involved in our own care by knowing how to use the data.”
- John Reynders:
- “Data science should help us decide the next relevant question to ask.”
- “Data science is a person asking a question and human insight pushing an answer forward.”
- “I want the doctor who is treating my daughters to be a human being, thank you very much, but supported by machine learning.”
- “Data science is not big data, but leverages big data.”
- Jerald Schindler: “If we’re sharing data, we need to be responsible about how we use it.”
- Irene Blat: “Data are just data, it’s what you do with it that makes it actionable.”
- Carl Zimmer:
- “Height is a great example of how hard it is to get from knowing something is heritable to understanding how.”
- “Computer in the 1880s: a woman writing out data by hand.”
- “The origin of modern intelligence tests began in the early 1900s; they were also intertwined with eugenics and led to some of our darkest history, and all this based on flawed information about the Kallikak family tree.”
- “I am concerned that as we have more access to our genes we will fall back on old ideas of heredity to understand it.”
- “There is a need for journalists in getting better at understanding and explaining the complexity of science – this in relation to the role of the environment on intelligence. There are lots of intriguing correlations, but correlations are not associations.”
- “Heredity is so central to our own identity, but the science shows it is marvelously complex and still kind of magical.”
- “Epigenetics has completely taken over pop culture, racing ahead of the science.”
A number of major announcements coincided with Bio-IT, as listed below:
Benjamin Franklin Award
- This year’s Benjamin Franklin Award – an award presented annually by Bioinformatics.org to an individual who has, in her or his practice, promoted free and open access to the materials and methods used in the life sciences – was presented to Desmond Higgins, Professor of Bioinformatics at University College Dublin (UCD) Conway Institute. Professor Higgins has been working on methods and software for DNA and protein sequence alignment since 1985. He wrote a series of sequence alignment programs called CLUSTAL. He later co-developed T-Coffee (Tree-based Consistency Objective Function for Alignment Evaluation), and his current research group at the UCD Conway Institute develops new, open source tools for evolutionary biology and the multivariate analysis of omics data.
Best Practices Award
- As every year, the commercial sector actively followed the Best Practices awards leading up to this event (see the pre-conference finalists). This year’s Best Practices award winners included:
- Project Platypus by Takeda Pharmaceuticals – Clinical & Health IT – for developing a Data and Analytics Hub platform that addresses issues of data transparency, trust, and accessibility to support efficient generation of data insights for functions across R&D.
- AstraZeneca (Discovery Sciences, IMED Biotech Unit), in “Informatics & Knowledge Management”, for its Deep Learning for Automated Phenotypic Image Analysis software and corresponding workflows, which are based on convolutional neural networks and improve automated image analysis.
- Celgene Laboratory Instrument Mobile Alert (LabAlert) system – in IT Infrastructure – which allows a user to get warning or error notifications generated by laboratory instruments delivered instantly to the user’s mobile devices.
- Center for Innovation and Bioinformatics, Neurological Clinical Research Institute, Massachusetts General Hospital, in “Personalized & Translational Medicine”, for developing NeuroBank, a patient-centric platform enabling researchers to capture and link patients’ data from multiple observational clinical research and natural history studies to medical images, genetic information, tissue repositories, and patient-reported outcomes.
- Alexion Pharmaceuticals’ SmartPanel Rare Genetic Disease Diagnosis Algorithm Competition Platform as the Judges’ Prize.
- Honorable Mentions for Pfizer/SciBite, for building an Artificial Intelligence-driven tool for enabling pharmaceutical acquisitions and collaborations, and The Jackson Laboratory for Genomic Medicine for building its Clinical Knowledgebase (CKB) that provides evidence-based information to clinicians, researchers, and ultimately patients.
- Six commercial products were selected and honored among the 46 originally considered new products or product updates. Winners were selected across five product categories:
- Storage Infrastructure & Hardware: PetaGene, PetaSuite Cloud Edition, Version 1.2
- Data Visualization & Exploration: Nanome, NanoPro
- Analysis & Data Computing: Sinequa, Sinequa ES v10
- Genomic Data Services: Diploid, Moon 1.0
- Data Integration & Management: The Hyve BV, RADAR-base
- The “People’s Choice Award” went to the genomics data analysis and interpretation platform ROSALIND by OnRamp Bioinformatics, and the Judges’ Prize to Linguamatics for iScite 2.0.
Following is a review of the different announcements that coincided with the conference, spanning commercial product launches, new partnerships and collaborations, product integrations, and other interesting and relevant topics.
Strategic partnerships and collaborations
- Google Cloud for Life Sciences adds new products and partners: Google is actively working in the life science sector, as demonstrated by a new product launch and the announcement of new partnerships. The complete list of partners now includes BC Platforms, the Broad Institute’s FireCloud, Dell EMC, DNAstack, Elastifile, Komprise, OnRamp.Bio, Petagene, Seven Bridges Genomics, and WuXi NextCODE.
- PierianDx partners with ScienceVision to bring precision medicine support to Southeast Asia: Under the partnership, ScienceVision will commercialize and distribute the PierianDx Clinical Genomics Workspace™ (CGW) platform for clinical genomic informatics, classification, interpretation, and clinical reporting.
- Google Cloud showcased its recent progress, one example being Variant Transforms, which helps organizations structure genomic variant data in BigQuery.
- Bluebee introduces BLUEBASE which runs on the Bluebee core data analysis platform and provides post-sequencing intelligent data aggregation, data querying, and deep knowledge mining. BLUEBASE is designed for use by diagnostic assay developers, pharmaceutical researchers, clinical trial operators, and investigators of population-scale initiatives.
- Linguamatics debuts iScite, which supports artificial intelligence-based scientific searches and guides its users via an Answer Routing Engine that delivers insightful answers to their search questions across biomedical data sources.
- L7 Informatics Announces the Availability of Microsoft Genomics on the L7 Enterprise Science Platform (ESP): The expanded offering enhances ESP’s configurability and content portfolio. Furthermore, L7 has validated the performance of ESP on Microsoft Azure, enabling its users to confidently utilize Azure for their scientific process and data management needs.
- BioBam Launches a New Version of Blast2GO for the analysis of novel genomes: With the launch of Version 5, Blast2GO is now an all-in-one solution for functional genomics analysis of newly sequenced organisms. With this latest version, users will benefit from several new visualization and workflow features allowing them to perform high-throughput as well as exploratory analysis in just one place.
- Dotmatics announced Bioregister 3.0, a registration system which offers new capabilities that enhance intellectual property protection and support emerging methodologies in biologic drug discovery.
- Illumina to acquire Edico Genome for $100M: Edico’s platform will be accessible through Illumina sequencers or on the cloud. Illumina will work with cloud storage providers such as Amazon to make Edico’s pipelines available on Illumina’s BaseSpace analysis platform.
- Agilent Completes Genohm Buyout, Strengthens Portfolio: The acquisition will allow Agilent to widen its growth prospects and further expand its software portfolio by adding LIMS and workflow management. This strategic acquisition complements Agilent’s own sales efforts and will help it to offer better services to its customers. Learn more about Genohm in the enlightenbio company spotlight “Genohm Aims to Hit the Sweet Spot With a Customizable SLIMS + ELN Solution for Lab Information Management”.
Other News of Interest
- Lab7 Systems Appoints New CEO and Changes Name to L7 Informatics: The new name reflects the expansion of L7 Informatics’ product offering and its mission of providing synchronized solutions for science and health and pioneering scientific process and data management (SPDM) in order to accelerate discoveries and drive higher quality of healthcare. Along with the name change, the company also announced its new website at l7informatics.com.
- Broad Institute is seeking every drug ever developed to build the Drug Repurposing Hub to find new potential uses.
- Broad Institute Spin Out Aims To Bring Precision Medicine To Autoimmune Disease: Celsius Therapeutics is launching with $65 million from funders including Third Rock Ventures, GV (formerly Google Ventures), Heritage Provider Network, Casdin Capital, and Alexandria Venture Investments. It aims to use Regev and Kuchroo’s techniques for constructing detailed profiles of individual cells to develop drugs for autoimmune diseases and cancer immunotherapies.