Sunday, February 28, 2021
NikolaNews

Facebook Releases AI Model for Protein Sequence Processing

September 22, 2020

A team of scientists at Facebook AI Research has released a deep-learning model for processing protein data from DNA sequences. The model contains approximately 700M parameters, was trained on 250 million protein sequences, and learned representations of biological properties that can be used to improve the current state of the art in several genomics prediction tasks.

The team described the model and several experiments in a paper published on bioRxiv. Using techniques similar to those used in natural language processing (NLP), the researchers trained a Transformer deep-learning model using unsupervised learning on sequences of amino acids, which represent the genetic encoding of proteins. The Transformer learned a representation, or embedding, of the sequences that the researchers showed encodes several properties of the proteins, such as 3D structure and evolutionary relationships. The team also showed that the embedding, when used as an input feature, can improve performance of other machine-learning tasks on genetic sequence data, such as predicting the evolutionary fitness of genetic mutations.

Deep-learning models for NLP typically use an embedding—a transformation of high-dimensional vectors into a lower-dimensional space—as the first layer in their networks. These embeddings often encode relationships about the original data in interesting ways; for example, in Google’s famous word2vec embedding, performing vector arithmetic in the embedding space can produce results such as “Paris – France + Poland = Warsaw.”
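The vector arithmetic described above can be illustrated with a small sketch. The two-dimensional vectors below are hand-built for illustration and are not real word2vec embeddings; the `analogy` helper and the `EMBEDDINGS` table are hypothetical names, not part of any library.

```python
import math

# Toy 2-d vectors, hand-built for illustration (NOT real word2vec output).
# Axis 0 loosely encodes "which country"; axis 1 encodes "is a capital city".
EMBEDDINGS = {
    "France": [1.0, 0.0],
    "Paris": [1.0, 1.0],
    "Poland": [2.0, 0.0],
    "Warsaw": [2.0, 1.0],
    "Germany": [3.0, 0.0],
    "Berlin": [3.0, 1.0],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def analogy(a, b, c, embeddings=EMBEDDINGS):
    """Return the word whose vector is closest to a - b + c.

    The three query words are excluded from the candidates.
    """
    target = [
        x - y + z
        for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])
    ]
    candidates = {w: v for w, v in embeddings.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(candidates[w], target))


if __name__ == "__main__":
    print(analogy("Paris", "France", "Poland"))
```

In this toy space, subtracting "France" from "Paris" leaves only the "capital" direction, so adding "Poland" lands on the vector for its capital.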

To learn embeddings for protein sequences, the team built a Transformer neural network with 669.2M parameters, based on the BERT model used for NLP, and used self-supervised learning to train the model on 250 million sequences from the UniParc database. The training data consists of sequences of amino acids; similar to masked language modeling in NLP training, each input sequence was “corrupted” by replacing random parts of the sequence with a special mask token, and the network was trained to correctly identify the removed amino acids.
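The corruption step can be sketched roughly as follows. This is a simplified illustration, not the team's code: the 15% masking rate is borrowed from BERT's usual default and is an assumption here, the `<mask>` token name and the `mask_sequence` helper are hypothetical, and BERT's additional random-replacement rules are omitted.

```python
import random

MASK = "<mask>"  # hypothetical name for the special mask token


def mask_sequence(sequence, mask_fraction=0.15, rng=None):
    """Replace a random subset of residues with the mask token.

    Returns (tokens, targets): tokens is the corrupted sequence as a list,
    and targets maps each masked position to the residue that was removed,
    which is what the network would be trained to predict.
    """
    tokens = list(sequence)
    if not tokens:
        return [], {}
    rng = rng or random.Random()
    # Always mask at least one position so every sequence yields a target.
    n_mask = max(1, round(len(tokens) * mask_fraction))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    targets = {pos: tokens[pos] for pos in positions}
    for pos in positions:
        tokens[pos] = MASK
    return tokens, targets


if __name__ == "__main__":
    # An arbitrary example string of one-letter amino acid codes.
    corrupted, answers = mask_sequence("MKTAYIAKQRQISFVKSHFSRQ",
                                       rng=random.Random(0))
    print(corrupted)
    print(answers)
```

Because the targets come from the sequence itself, no human labels are needed, which is what makes the training self-supervised.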

After training, the team investigated the properties of the network’s learned embedding. The embedding maps each amino acid into a point in embedding space; the researchers noted that the space had a “distinct clustering of hydrophobic and polar residues, aromatic amino acids, and organization by molecular weight and charge.” A protein or gene can also be mapped into the space by averaging the points of its constituent amino acids. Using principal component analysis (PCA) on the embedding representation of orthologous genes from different species, the scientists noted that “linear dimensionality reduction recovers species and orthology as primary axes of variation.”
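Mapping a whole protein into the space by averaging can be sketched as below. The per-residue vectors are made-up two-dimensional values for illustration only (the real embeddings are much higher-dimensional and come from the trained network); `TOY_TABLE`, `mean_pool`, and `embed_protein` are hypothetical names.

```python
# Made-up 2-d vectors for three amino acids, for illustration only.
TOY_TABLE = {
    "A": [1.0, 0.0],    # alanine
    "K": [-1.0, 1.0],   # lysine
    "D": [-1.0, -1.0],  # aspartate
}


def mean_pool(residue_vectors):
    """Average a list of equal-length vectors into a single vector."""
    if not residue_vectors:
        raise ValueError("need at least one residue vector")
    n = len(residue_vectors)
    dim = len(residue_vectors[0])
    return [sum(vec[i] for vec in residue_vectors) / n for i in range(dim)]


def embed_protein(sequence, table=TOY_TABLE):
    """Map a protein to one point by averaging its residues' vectors."""
    return mean_pool([table[residue] for residue in sequence])


if __name__ == "__main__":
    print(embed_protein("KD"))  # [-1.0, 0.0]
```

Averaging gives every protein a vector of the same size regardless of its length, which is what allows techniques such as PCA to compare proteins directly.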

Besides encoding chemical and genetic relationships, the embedding was also useful as input to further machine-learning tasks. One such task is secondary structure prediction. In this task, a machine-learning model tries to predict the local three-dimensional form of portions of a protein chain. By including the embedding representation of the input sequence, the team improved the state-of-the-art result by 2.5 percentage points. The embedding data also improved the task of predicting tertiary protein structure and the effect of mutations.
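One plausible way to use an embedding as an input feature is to concatenate it with a conventional per-residue encoding before passing it to a downstream predictor. The article does not say how the team combined features, so the sketch below is an assumption; `build_features` is a hypothetical helper, and the embedding values in the usage example are placeholders rather than model output.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard one-letter codes


def one_hot(residue):
    """Encode one residue as a 20-element indicator vector."""
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]


def build_features(sequence, residue_embeddings):
    """Concatenate each residue's one-hot code with its embedding vector.

    residue_embeddings must hold one vector per residue; in practice these
    would come from the pretrained model.
    """
    if len(sequence) != len(residue_embeddings):
        raise ValueError("need exactly one embedding per residue")
    return [
        one_hot(residue) + list(embedding)
        for residue, embedding in zip(sequence, residue_embeddings)
    ]


if __name__ == "__main__":
    # Placeholder 2-d embeddings for a two-residue sequence.
    features = build_features("AC", [[0.5, -0.5], [0.1, 0.2]])
    print(len(features), len(features[0]))  # 2 22
```

A secondary structure predictor could then be trained on these rows, one per residue, with the structure class at each position as the label.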

The team’s lead author, Alex Rives, highlighted several of the results on Twitter. When asked by deep-learning researcher Gwern Branwen why the team only used 700M parameters in their model, Rives noted that this was the most that could fit on a single GPU. Branwen replied,

You could probably fit more, I didn’t see any mention of reversible layers or reduced precision. Reducing context window length is also an option; it’s unlikely you saturated the full 1024 window (eg predict the 1024th token as accurately as the 2nd).

Facebook isn’t the only major tech company applying its NLP expertise to genomics problems. Google recently announced that its BigBird NLP model also achieved new state-of-the-art performance on two genomics tasks. While Google has not released its BigBird code, Facebook has open-sourced its model, which is available on GitHub.


Credit: Google News

NikolaNews

NikolaNews.com is an online News Portal which aims to share news about blockchain, AI, Big Data, and Data Privacy and more!

© 2019 NikolaNews.com - Global Tech Updates
