Documentation

Overview

1. Functions of TMKit

1.1 Modules summary

import tmkit as tmk
#   Tool module  Function class   Note
1   tmk.fetch    Quality control  fetch example data
2   tmk.qc       Quality control  generate and extract metrics of sequences and structures
3   tmk.seq      Sequence         parse sequences and structures
4   tmk.msa      Sequence         produce commands for generating multiple sequence alignment
5   tmk.feature  Feature          protein biological features
6   tmk.collate  Mapping          seek difference between RCSB and PDBTM structures
7   tmk.topo     Topology         transmembrane protein topologies
8   tmk.rrc      Feature          performance evaluation of residue contact prediction
9   tmk.ppi      Connectivity     protein connectivity
10  tmk.mut      Annotation       transmembrane protein's mutation data processing
11  tmk.vs       Visualization    visualize protein structures
12  tmk.cath     Annotation       access protein domains and families
13  tmk.mapping  Mapping          conversion between protein identifiers
14  tmk.edge     Edge extraction  rewiring of connections between residues

1.2 Module functions

...

Visualization

Identifying the protein-protein interaction (PPI) interfaces of proteins is critical to understanding the biological processes they govern.

...

Sequence

The sequence pre-processing module is a fundamental component of TMKit, designed to handle sequence reading in diverse formats, sequence retrieval from various sources, and multiple sequence alignment (MSA) generation.
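As a rough illustration of the sequence-reading step, the sketch below parses FASTA-formatted text in plain Python. It does not use TMKit; the function name `read_fasta` and the toy records are assumptions made for this example only.

```python
import io


def read_fasta(lines):
    """Parse FASTA-formatted lines into a {record name: sequence} dict.

    Hypothetical helper for illustration; not part of TMKit.
    """
    chunks = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # The record name is the first whitespace-delimited token of the header.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks[name] = []
        elif name is not None:
            # Sequence lines may be wrapped, so collect and join them later.
            chunks[name].append(line)
    return {key: "".join(value) for key, value in chunks.items()}


# Toy input with made-up sequences.
toy_text = ">seq1 toy record\nMKTAYIAK\nQRQISFVK\n>seq2\nGAVLIP\n"
records = read_fasta(io.StringIO(toy_text))
print(records)
```

Wrapped sequence lines are joined per record, so the result is independent of the line width used in the file.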

...

Quality control

This module evaluates criteria such as the experimental method, resolution, subclass, and sequence length to qualify proteins in bulk.
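The kind of bulk filtering described above could look like the following sketch. It is written in plain Python rather than with the TMKit API; the `Entry` record, the `passes_qc` function, the threshold values, and the toy entries are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    """Hypothetical per-protein metadata record (not a TMKit type)."""
    pdb_id: str
    method: str                   # experimental method label
    resolution: Optional[float]   # in angstroms; None when not reported
    subclass: str                 # e.g. "alpha-helical" or "beta-barrel"
    seq_len: int


def passes_qc(entry, methods=("x-ray",), max_resolution=3.5,
              subclasses=("alpha-helical",), min_len=50, max_len=1000):
    """Return True if the entry satisfies every criterion (thresholds are illustrative)."""
    if entry.method not in methods:
        return False
    if entry.resolution is None or entry.resolution > max_resolution:
        return False
    if entry.subclass not in subclasses:
        return False
    return min_len <= entry.seq_len <= max_len


# Toy entries with made-up identifiers and values.
entries = [
    Entry("toy1", "x-ray", 2.1, "alpha-helical", 320),
    Entry("toy2", "nmr", None, "alpha-helical", 120),
    Entry("toy3", "x-ray", 4.2, "alpha-helical", 410),
    Entry("toy4", "x-ray", 2.8, "beta-barrel", 250),
    Entry("toy5", "x-ray", 3.0, "alpha-helical", 35),
]
kept = [e.pdb_id for e in entries if passes_qc(e)]
print(kept)
```

Keeping each criterion as a separate check makes it easy to report which rule rejected an entry if that becomes necessary.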

...

Topology

TMKit can be used to obtain more detailed non-TM topologies, namely side 1, side 2, strand, coil, inside, loop, and interfacial. Besides structure-derived topologies, TMKit also supplies predicted topologies by embedding TMHMM and Phobius, which can be run from the command-line interface (CLI) and within Python.
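To make the idea of a per-residue topology concrete, here is a minimal sketch that collapses a string of per-residue labels into contiguous segments. The single-letter alphabet ("i" for inside, "M" for membrane, "o" for outside) and the function name `topology_segments` are assumptions for this example; they are not TMKit's own labels or API.

```python
from itertools import groupby


def topology_segments(labels):
    """Collapse per-residue labels into (label, start, end) runs.

    Positions are 1-based and inclusive. Hypothetical helper, not part of TMKit.
    """
    segments = []
    pos = 1
    for label, run in groupby(labels):
        length = sum(1 for _ in run)
        segments.append((label, pos, pos + length - 1))
        pos += length
    return segments


# Toy label string: 3 inside, 5 membrane, 3 outside residues.
toy_labels = "iiiMMMMMooo"
segments = topology_segments(toy_labels)
print(segments)
```

The same function works for any label alphabet, so finer-grained classes such as loop or interfacial could be represented by additional symbols.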

...

Mapping

Identifier mapping between structural and sequence data (e.g., FASTA residue IDs and PDB residue IDs) is an important technical prerequisite for the correct interpretation of biological findings.
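The sketch below shows why such a mapping is needed: PDB residue numbering may start at an arbitrary offset, skip residues, or carry insertion codes, whereas FASTA positions simply count from 1. It assumes the FASTA sequence was derived from the observed residues of the structure, so the two lists align one-to-one in order; the function name `build_id_map` and the toy residue IDs are hypothetical.

```python
def build_id_map(pdb_residue_ids):
    """Map 1-based FASTA positions to PDB residue IDs and back.

    Each PDB residue ID is a (number, insertion_code) tuple.
    Hypothetical helper for illustration; not part of TMKit.
    """
    fasta_to_pdb = {i: rid for i, rid in enumerate(pdb_residue_ids, start=1)}
    pdb_to_fasta = {rid: i for i, rid in fasta_to_pdb.items()}
    return fasta_to_pdb, pdb_to_fasta


# Toy numbering: starts at 27, has an insertion code (28A), and skips 29-30.
toy_pdb_ids = [(27, ""), (28, ""), (28, "A"), (31, "")]
fasta_to_pdb, pdb_to_fasta = build_id_map(toy_pdb_ids)

print(fasta_to_pdb[3])         # PDB ID of the third FASTA residue
print(pdb_to_fasta[(31, "")])  # FASTA position of PDB residue 31
```

When the FASTA sequence instead comes from the full construct, unobserved residues have no PDB counterpart and an alignment step would be needed first.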

...

Annotation

Amino acid residues of transmembrane proteins that are involved in mutations or functional domains can be annotated using the MutHTP, Pred-MutHTP, and CATH databases.

...

Connectivity

Studying the connections of a protein to others in a PPI network is crucial to understanding its biological role.

...

Edge extraction

We provide a high-performance computing library for extracting connections between residues by constructing bipartite and unipartite graphs (where residue connections are treated as edges) and assigning features in linear time with respect to the number of residues used.
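As a rough picture of what treating residue connections as edges means, the sketch below enumerates residue pairs with the standard library. It reads "unipartite" as pairs within one chain and "bipartite" as pairs between two chains, which is an assumption about the terminology. It illustrates graph construction only and is not the library's linear-time feature assignment; the function names and toy chains are hypothetical.

```python
from itertools import combinations, product


def unipartite_edges(residues):
    """All unordered residue pairs within a single chain."""
    return list(combinations(residues, 2))


def bipartite_edges(residues_a, residues_b):
    """All residue pairs with one residue from each of two chains."""
    return list(product(residues_a, residues_b))


# Toy chains identified by (chain label, residue position).
chain_a = [("A", i) for i in range(1, 5)]  # 4 residues
chain_b = [("B", i) for i in range(1, 4)]  # 3 residues

uni = unipartite_edges(chain_a)
bi = bipartite_edges(chain_a, chain_b)
print(len(uni), len(bi))
```

Note that the number of pairs grows quadratically with chain length in this naive enumeration, which is one reason an optimized implementation matters for long proteins.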

...

Feature

A set of transmembrane protein-specific and general-purpose features is provided by TMKit in support of machine learning modelling.