NexaBio: Advanced Protein Structure Prediction Models

NexaBio is a two-stage model suite for predicting protein structure from amino acid sequences. It comprises two complementary models:

  • NexaBio_1: A Convolutional Neural Network (CNN) and Bidirectional LSTM (BiLSTM) model for secondary structure prediction.
  • NexaBio_2: A Variational Autoencoder (VAE) and Diffusion-based model for tertiary (3D) structure prediction.

NexaBio is a core component of the Nexa Scientific Model Suite, a collection of machine learning models advancing scientific discovery.

Model Overview

NexaBio_1: Secondary Structure Prediction

  • Architecture: CNN combined with BiLSTM for robust sequence modeling.
  • Input: Amino acid sequence (one-hot encoded or embedded).
  • Output: Secondary structure classifications (e.g., Helix, Sheet, Coil).
  • Use Case: Identification of local structural motifs and protein folding patterns.
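
To make the input and output formats above concrete, here is a minimal sketch of one-hot encoding a sequence and decoding per-residue class indices into labels. The alphabet order, the handling of unknown residues, the three-letter label set (H, E, C) and the helper names `one_hot_encode` and `decode_labels` are all illustrative assumptions; the preprocessing NexaBio_1 actually expects may differ, so check the model card before relying on it.

```python
# Illustrative only: the alphabet order and the three-class label map are
# assumptions, not taken from the NexaBio model cards.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
SS_LABELS = ("H", "E", "C")  # helix, sheet (strand), coil


def one_hot_encode(sequence):
    """Return one 20-element row per residue; unknown residues become all zeros."""
    rows = []
    for residue in sequence.upper():
        row = [0] * len(AMINO_ACIDS)
        index = AA_INDEX.get(residue)
        if index is not None:
            row[index] = 1
        rows.append(row)
    return rows


def decode_labels(class_indices):
    """Map per-residue class indices (0, 1, 2) to a label string such as 'HHC'."""
    return "".join(SS_LABELS[i] for i in class_indices)


encoded = one_hot_encode("MKT")
print(len(encoded), len(encoded[0]))  # residues, features per residue
print(decode_labels([0, 0, 2]))
```

An embedding-based input would replace `one_hot_encode` with a lookup of integer token IDs; the decoding step would stay the same as long as the model emits one class per residue.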

NexaBio_2: Tertiary Structure Prediction

  • Architecture: VAE integrated with a Diffusion Model for generative 3D modeling.
  • Input: Amino acid sequence (optionally augmented with secondary structure predictions).
  • Output: 3D coordinates of protein backbone atoms.
  • Use Case: Full tertiary structure prediction for structural analysis and design.
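
Predicted backbone coordinates can be sanity-checked before further analysis. The sketch below assumes the output has been reduced to one Cα position per residue, in ångströms and in sequence order; that layout is an assumption, not something the model card specifies. It relies on the fact that consecutive Cα atoms joined by a trans peptide bond sit roughly 3.8 Å apart. The function names and the 0.5 Å tolerance are hypothetical choices for illustration.

```python
import math


def consecutive_distances(coords):
    """Distances between each pair of neighbouring points in the chain."""
    return [math.dist(a, b) for a, b in zip(coords, coords[1:])]


def flag_unusual_spacing(coords, expected=3.8, tolerance=0.5):
    """Return indices i where the distance from residue i to i+1 is off.

    expected:  typical C-alpha to C-alpha spacing in angstroms (trans peptide).
    tolerance: arbitrary threshold chosen for this example.
    """
    return [
        i
        for i, d in enumerate(consecutive_distances(coords))
        if abs(d - expected) > tolerance
    ]


# Toy chain: the last step is far too long to be a real peptide link.
ca_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (3.8, 3.8, 9.0)]
print(flag_unusual_spacing(ca_coords))
```

A flagged index points to a likely chain break or a badly placed residue. Cis peptide bonds give a shorter spacing (about 3 Å), so a small number of flags is not necessarily an error.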

Applications

  • Structural Bioinformatics: Enabling precise protein structure analysis for research.
  • Drug Discovery: Supporting protein-ligand interaction studies and therapeutic design.
  • Protein Engineering: Facilitating the design of novel proteins for industrial and medical applications.
  • Synthetic Biology: Generating protein structures for biotechnological innovation.
  • Academic Research: Serving as a tool for educational and exploratory studies.

Getting Started

Example Usage

from transformers import AutoModel

# Initialize the secondary structure prediction model
model_sec = AutoModel.from_pretrained("Allanatrix/NexaBio_1")

# Initialize the tertiary structure prediction model
model_ter = AutoModel.from_pretrained("Allanatrix/NexaBio_2")

# To run inference on an amino acid sequence, follow the input formatting described in each model's documentation

For comprehensive instructions, including inference APIs and preprocessing details, consult the individual model cards on Hugging Face.

Citation and License

If you use NexaBio in your research or applications, please cite this repository and include a link to the Nexa R&D Space.
The models and associated code are licensed under the Boost Software License 1.1 (BSL-1.1).

Part of the Nexa Scientific Ecosystem

Discover other components of the Nexa Scientific Stack:


Developed and maintained by Allan, an independent machine learning researcher specializing in scientific AI and infrastructure.
