Journal articles on the topic 'Sequence motif'

1

Roebuck, K. A., D. P. Szeto, K. P. Green, Q. N. Fan, and W. E. Stumph. "Octamer and SPH motifs in the U1 enhancer cooperate to activate U1 RNA gene expression." Molecular and Cellular Biology 10, no. 1 (January 1990): 341–52. http://dx.doi.org/10.1128/mcb.10.1.341-352.1990.

Abstract:
The transcriptional enhancer of a chicken U1 small nuclear RNA gene has been shown to extend over approximately 50 base pairs of DNA sequence located 180 to 230 base pairs upstream of the U1 transcription initiation site. It is composed of multiple functional motifs, including a GC box, an octamer motif, and a novel SPH motif. The contributions of these three distinct sequence motifs to enhancer function were studied with an oocyte expression assay. Under noncompetitive conditions in oocytes, the SPH motif is capable of stimulating U1 RNA transcription in the absence of the other functional motifs, whereas the octamer motif by itself lacks this ability. However, to form a transcription complex that is stable to challenge by a second competing small nuclear RNA transcription unit, both the octamer and SPH motifs are required. The GC box, although required for full enhancer activity, is not essential for stable complex formation in oocytes. Site-directed mutagenesis was used to study the DNA sequence requirements of the SPH motif. Functional activity of the SPH motif is spread throughout a 24-base-pair region 3' of the octamer but is particularly dependent upon sequences near an SphI restriction site located at the center of the SPH motif. Using embryonic chicken tissue as a source material, we identified and partially purified a factor, termed SBF, that binds sequence specifically to the SPH motif of the U1 enhancer. The ability of this factor to recognize and bind to mutant enhancer DNA fragments in vitro correlates with the functional activity of the corresponding enhancer sequences in vivo.
3

Xing, Eric P., Wei Wu, Michael I. Jordan, and Richard M. Karp. "LOGOS: A Modular Bayesian Model for De Novo Motif Detection." Journal of Bioinformatics and Computational Biology 02, no. 01 (March 2004): 127–54. http://dx.doi.org/10.1142/s0219720004000508.

Abstract:
The complexity of the global organization and internal structure of motifs in higher eukaryotic organisms raises significant challenges for motif detection techniques. To achieve successful de novo motif detection, it is necessary to model the complex dependencies within and among motifs and to incorporate biological prior knowledge. In this paper, we present LOGOS, an integrated LOcal and GlObal motif Sequence model for biopolymer sequences, which provides a principled framework for developing, modularizing, extending and computing expressive motif models for complex biopolymer sequence analysis. LOGOS consists of two interacting submodels: HMDM, a local alignment model capturing biological prior knowledge and positional dependency within the motif local structure; and HMM, a global motif distribution model modeling frequencies and dependencies of motif occurrences. Model parameters can be fit using training motifs within an empirical Bayesian framework. A variational EM algorithm is developed for de novo motif detection. LOGOS improves over existing models that ignore biological priors and dependencies in motif structures and motif occurrences, and demonstrates superior performance on both semi-realistic test data and cis-regulatory sequences from yeast and Drosophila genomes with regard to sensitivity, specificity, flexibility and extensibility.
4

Zhai, Xiandun, and Adilai Tuerxun. "DNA Sequence Specificity Prediction Algorithm Based on Artificial Intelligence." Mathematical Problems in Engineering 2022 (October 3, 2022): 1–8. http://dx.doi.org/10.1155/2022/4150106.

Abstract:
DNA sequence specificity refers to the ability of DNA sequences to bind specific proteins. These proteins play a central role in gene regulation such as transcription and alternative splicing. Obtaining DNA sequence specificity is very important for establishing the regulatory model of the biological system and identifying pathogenic variants. Motifs are sequence patterns shared by fragments of DNA sequences that bind to specific proteins. At present, some motif mining algorithms have been proposed, which perform well under the condition of given motif length. This research is based on deep learning. As for the description of motif level, this paper constructs an AI based method to predict the length of the motif. The experimental results show that the prediction accuracy on the test set is more than 90%.
5

Wang, Mengchi, David Wang, Kai Zhang, Vu Ngo, Shicai Fan, and Wei Wang. "Motto: Representing Motifs in Consensus Sequences with Minimum Information Loss." Genetics 216, no. 2 (August 19, 2020): 353–58. http://dx.doi.org/10.1534/genetics.120.303597.

Abstract:
Sequence analysis frequently requires intuitive understanding and convenient representation of motifs. Typically, motifs are represented as position weight matrices (PWMs) and visualized using sequence logos. However, in many scenarios, in order to interpret the motif information or search for motif matches, it is compact and sufficient to represent motifs by wildcard-style consensus sequences (such as [GC][AT]GATAAG[GAC]). Based on mutual information theory and Jensen-Shannon divergence, we propose a mathematical framework to minimize the information loss in converting PWMs to consensus sequences. We name this representation as sequence Motto and have implemented an efficient algorithm with flexible options for converting motif PWMs into Motto from nucleotides, amino acids, and customized characters. We show that this representation provides a simple and efficient way to identify the binding sites of 1156 common transcription factors (TFs) in the human genome. The effectiveness of the method was benchmarked by comparing sequence matches found by Motto with PWM scanning results found by FIMO. On average, our method achieves a 0.81 area under the precision-recall curve, significantly (P-value < 0.01) outperforming all existing methods, including maximal positional weight, Cavener’s method, and minimal mean square error. We believe this representation provides a distilled summary of a motif, as well as the statistical justification.
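The wildcard-style consensus quoted in this abstract can be used directly as a regular expression, since its bracket groups are already valid character classes. The sketch below only illustrates scanning a sequence with such a consensus; it does not reproduce the Motto conversion from PWMs, and the function name and test sequence are hypothetical.

```python
import re

# Consensus string quoted in the abstract; each [..] group is a regex character class.
CONSENSUS = "[GC][AT]GATAAG[GAC]"

def find_matches(sequence, consensus=CONSENSUS):
    """Return (start, matched_text) for every match, including overlapping ones."""
    # Wrapping the pattern in a lookahead lets overlapping matches be reported.
    pattern = re.compile(f"(?=({consensus}))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]

# Hypothetical input sequence, made up for illustration.
hits = find_matches("ttCAGATAAGGccGTGATAAGA")
print(hits)
```

Positions are 0-based. A consensus scan of this kind is an exact match against the allowed characters, whereas PWM scanning tools score each position.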
6

Wright, Elisé P., Mahmoud A. S. Abdelhamid, Michelle O. Ehiabor, Melanie C. Grigg, Kelly Irving, Nicole M. Smith, and Zoë A. E. Waller. "Epigenetic modification of cytosines fine tunes the stability of i-motif DNA." Nucleic Acids Research 48, no. 1 (November 28, 2019): 55–62. http://dx.doi.org/10.1093/nar/gkz1082.

Abstract:
i-Motifs are widely used in nanotechnology, play a part in gene regulation and have been detected in human nuclei. As these structures are composed of cytosine, they are potential sites for epigenetic modification. In addition to 5-methyl- and 5-hydroxymethylcytosine modifications, recent evidence has suggested biological roles for 5-formylcytosine and 5-carboxylcytosine. Herein the human telomeric i-motif sequence was used to examine how these four epigenetic modifications alter the thermal and pH stability of i-motifs. Changes in melting temperature and transitional pH depended on both the type of modification and its position within the i-motif forming sequence. The cytosines most sensitive to modification were next to the first and third loops within the structure. Using previously described i-motif forming sequences, we screened the MCF-7 and MCF-10A methylomes to map 5-methylcytosine and found the majority of sequences were differentially methylated in MCF7 (cancerous) and MCF10A (non-cancerous) cell lines. Furthermore, i-motif forming sequences stable at neutral pH were significantly more likely to be epigenetically modified than traditional acidic i-motif forming sequences. This work has implications not only in the epigenetic regulation of DNA, but also allows discrete tunability of i-motif stability for nanotechnological applications.
7

Maurer-Stroh, Sebastian, He Gao, Hao Han, Lies Baeten, Joost Schymkowitz, Frederic Rousseau, Louxin Zhang, and Frank Eisenhaber. "Motif Discovery with Data Mining in 3D Protein Structure Databases: Discovery, Validation and Prediction of the U-Shape Zinc Binding ("Huf-Zinc") Motif." Journal of Bioinformatics and Computational Biology 11, no. 01 (February 2013): 1340008. http://dx.doi.org/10.1142/s0219720013400088.

Abstract:
Data mining in protein databases, derivatives from more fundamental protein 3D structure and sequence databases, has considerable unearthed potential for the discovery of sequence motif—structural motif—function relationships as the finding of the U-shape (Huf-Zinc) motif, originally a small student's project, exemplifies. The metal ion zinc is critically involved in universal biological processes, ranging from protein-DNA complexes and transcription regulation to enzymatic catalysis and metabolic pathways. Proteins have evolved a series of motifs to specifically recognize and bind zinc ions. Many of these, so called zinc fingers, are structurally independent globular domains with discontinuous binding motifs made up of residues mostly far apart in sequence. Through a systematic approach starting from the BRIX structure fragment database, we discovered that there exists another predictable subset of zinc-binding motifs that not only have a conserved continuous sequence pattern but also share a characteristic local conformation, despite being included in totally different overall folds. While this does not allow general prediction of all Zn binding motifs, a HMM-based web server, Huf-Zinc, is available for prediction of these novel, as well as conventional, zinc finger motifs in protein sequences. The Huf-Zinc webserver can be freely accessed through this URL ( http://mendel.bii.a-star.edu.sg/METHODS/hufzinc/ ).
8

Liu, Xiang-Qin, and Jing Yang. "Bacterial Thymidylate Synthase with Intein, Group II Intron, and Distinctive ThyX Motifs." Journal of Bacteriology 186, no. 18 (September 15, 2004): 6316–19. http://dx.doi.org/10.1128/jb.186.18.6316-6319.2004.

Abstract:
The ThyX class of thymidylate synthases was previously characterized by a common ThyX motif, RHRX7S. We report bacterial ThyX sequences having distinctive ThyX motifs, suggesting a more general ThyX motif, R/THRX7-8S. One ThyX sequence has an intein in its ThyX motif that was shown to do protein splicing and a group II intron in its gene, suggesting a hot spot for these self-splicing mobile elements.
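As a rough illustration, the generalized motif can be written as a regular expression. This sketch assumes that R/THRX7-8S reads as "R or T, then H, then R, then any 7 to 8 residues, then S"; that reading, the function name, and the test sequences are assumptions and are not taken from the paper.

```python
import re

# Assumed reading of R/THRX7-8S: [RT] H R, then 7 or 8 arbitrary residues, then S.
THYX_PATTERN = re.compile(r"[RT]HR.{7,8}S")

def find_thyx_motif(protein):
    """Return (start, matched_text) of the first candidate ThyX motif, or None."""
    match = THYX_PATTERN.search(protein.upper())
    return (match.start(), match.group(0)) if match else None

# Hypothetical sequences, made up for illustration.
print(find_thyx_motif("MKRHRAAAAAAASGG"))   # spacer of 7 residues
print(find_thyx_motif("THRGGGGGGGGS"))      # spacer of 8 residues
```

A match only flags a candidate; whether a sequence is a ThyX enzyme would still need the kind of sequence and functional evidence the paper describes.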
9

Pal, Soumitra, Jan Hoinka, and Teresa M. Przytycka. "Co-SELECT reveals sequence non-specific contribution of DNA shape to transcription factor binding in vitro." Nucleic Acids Research 47, no. 13 (June 21, 2019): 6632–41. http://dx.doi.org/10.1093/nar/gkz540.

Abstract:
Understanding the principles of DNA binding by transcription factors (TFs) is of primary importance for studying gene regulation. Recently, several lines of evidence suggested that both DNA sequence and shape contribute to TF binding. However, the following compelling question is yet to be considered: in the absence of any sequence similarity to the binding motif, can DNA shape still increase binding probability? To address this challenge, we developed Co-SELECT, a computational approach to analyze the results of in vitro HT-SELEX experiments for TF–DNA binding. Specifically, Co-SELECT leverages the presence of motif-free sequences in late HT-SELEX rounds, and their enrichment in weak binders allows Co-SELECT to detect evidence for the role of DNA shape features in TF binding. Our approach revealed that, even in the absence of the sequence motif, TFs have a propensity to bind to DNA molecules of the shape consistent with the motif specific binding. This provides the first direct evidence that shape features that accompany the preferred sequence motifs also bestow an advantage for weak, sequence non-specific binding.
10

Gunawardana, D., V. A. Likic, and K. R. Gayler. "A Comprehensive Bioinformatics Analysis of the Nudix Superfamily in Arabidopsis thaliana." Comparative and Functional Genomics 2009 (2009): 1–13. http://dx.doi.org/10.1155/2009/820381.

Abstract:
Nudix enzymes are a superfamily with a conserved common reaction mechanism that provides the capacity for the hydrolysis of a broad spectrum of metabolites. We used hidden Markov models based on Nudix sequences from the PFAM and PROSITE databases to identify Nudix hydrolases encoded by the Arabidopsis genome. 25 Nudix hydrolases were identified and classified into 11 individual families by pairwise sequence alignments. Intron phases were strikingly conserved in each family. Phylogenetic analysis showed that all multimember families formed monophyletic clusters. Conserved familial sequence motifs were identified with the MEME motif analysis algorithm. One motif (motif 4) was found in three diverse families. All proteins containing motif 4 demonstrated a degree of preference for substrates containing an ADP moiety. We conclude that HMM model-based genome scanning and MEME motif analysis, respectively, can significantly improve the identification and assignment of function of new members of this mechanistically-diverse protein superfamily.
11

Motta-Mena, Laura B., Sarah A. Smith, Michael J. Mallory, Jason Jackson, Jiarong Wang, and Kristen W. Lynch. "A Disease-associated Polymorphism Alters Splicing of the Human CD45 Phosphatase Gene by Disrupting Combinatorial Repression by Heterogeneous Nuclear Ribonucleoproteins (hnRNPs)." Journal of Biological Chemistry 286, no. 22 (April 20, 2011): 20043–53. http://dx.doi.org/10.1074/jbc.m111.218727.

Abstract:
Alternative splicing is typically controlled by complexes of regulatory proteins that bind to sequences within or flanking variable exons. The identification of regulatory sequence motifs and the characterization of sequence motifs bound by splicing regulatory proteins have been essential to predicting splicing regulation. The activation-responsive sequence (ARS) motif has previously been identified in several exons that undergo changes in splicing upon T cell activation. hnRNP L binds to this ARS motif and regulates ARS-containing exons; however, hnRNP L does not function alone. Interestingly, the proteins that bind together with hnRNP L differ for different exons that contain the ARS core motif. Here we undertake a systematic mutational analysis of the best characterized context of the ARS motif, namely the ESS1 sequence from CD45 exon 4, to understand the determinants of binding specificity among the components of the ESS1 regulatory complex and the relationship between protein binding and function. We demonstrate that different mutations within the ARS motif affect specific aspects of regulatory function and disrupt the binding of distinct proteins. Most notably, we demonstrate that the C77G polymorphism, which correlates with autoimmune disease susceptibility in humans, disrupts exon silencing by preventing the redundant activity of hnRNPs K and E2 to compensate for the weakened function of hnRNP L. Therefore, these studies provide an important example of the functional relevance of combinatorial function in splicing regulation and suggest that additional polymorphisms may similarly disrupt function of the ESS1 silencer.
12

Kumar, Vinod, Gopal Singh, A. K. Verma, and Sanjeev Agrawal. "In Silico Characterization of Histidine Acid Phytase Sequences." Enzyme Research 2012 (December 5, 2012): 1–8. http://dx.doi.org/10.1155/2012/845465.

Abstract:
Histidine acid phytases (HAPhy) are widely distributed enzymes among bacteria, fungi, plants, and some animal tissues. They have a significant role as an animal feed enzyme and in the solubilization of insoluble phosphates and minerals present in the form of phytic acid complex. A set of 50 reference protein sequences representing HAPhy were retrieved from NCBI protein database and characterized for various biochemical properties, multiple sequence alignment (MSA), homology search, phylogenetic analysis, motifs, and superfamily search. MSA using MEGA5 revealed the presence of conserved sequences at N-terminal “RHGXRXP” and C-terminal “HD.” Phylogenetic tree analysis indicates the presence of three clusters representing different HAPhy, that is, PhyA, PhyB, and AppA. Analysis of 10 commonly distributed motifs in the sequences indicates the presence of signature sequence for each class. Motif 1 “SPFCDLFTHEEWIQYDYLQSLGKYYGYGAGNPLGPAQGIGF” was present in 38 protein sequences representing clusters 1 (PhyA) and 2 (PhyB). Cluster 3 (AppA) contains motif 9 “KKGCPQSGQVAIIADVDERTRKTGEAFAAGLAPDCAITVHTQADTSSPDP” as a signature sequence. All sequences belong to histidine acid phosphatase family as resulted from superfamily search. No conserved sequence representing 3- or 6-phytase could be identified using multiple sequence alignment. This in silico analysis might contribute in the classification and future genetic engineering of this most diverse class of phytase.
13

Figge, James, and Temple F. Smith. "Cell-division sequence motif." Nature 334, no. 6178 (July 1988): 109. http://dx.doi.org/10.1038/334109a0.

14

Das, Rahul K., Yongqi Huang, Aaron H. Phillips, Richard W. Kriwacki, and Rohit V. Pappu. "Cryptic sequence features within the disordered protein p27Kip1 regulate cell cycle signaling." Proceedings of the National Academy of Sciences 113, no. 20 (May 2, 2016): 5616–21. http://dx.doi.org/10.1073/pnas.1516277113.

Abstract:
Peptide motifs embedded within intrinsically disordered regions (IDRs) of proteins are often the sites of posttranslational modifications that control cell-signaling pathways. How do IDR sequences modulate the functionalities of motifs? We answer this question using the polyampholytic C-terminal IDR of the cell cycle inhibitory protein p27Kip1 (p27). Phosphorylation of Thr-187 (T187) within the p27 IDR controls entry into S phase of the cell division cycle. Additionally, the conformational properties of polyampholytic sequences are predicted to be influenced by the linear patterning of oppositely charged residues. Therefore, we designed sequence variants of the p27 IDR to alter charge patterning outside the primary substrate motif containing T187. Computer simulations and biophysical measurements confirm predictions regarding the impact of charge patterning on the global dimensions of IDRs. Through functional studies, we uncover cryptic sequence features within the p27 IDR that influence the efficiency of T187 phosphorylation. Specifically, we find a positive correlation between T187 phosphorylation efficiency and the weighted net charge per residue of an auxiliary motif. We also find that accumulation of positive charges within the auxiliary motif can diminish the efficiency of T187 phosphorylation because this increases the likelihood of long-range intra-IDR interactions that involve both the primary and auxiliary motifs and inhibit their contributions to function. Importantly, our findings suggest that the cryptic sequence features of the WT p27 IDR negatively regulate T187 phosphorylation signaling. Our approaches provide a generalizable strategy for uncovering the influence of sequence contexts on the functionalities of primary motifs in other IDRs.
15

Hou, Benjun, Suping Feng, and Yaoting Wu. "Systemic Identification of Hevea brasiliensis EST-SSR Markers and Primer Screening." Journal of Nucleic Acids 2017 (2017): 1–9. http://dx.doi.org/10.1155/2017/6590902.

Abstract:
This research aimed to systematically identify and preliminarily validate the Hevea brasiliensis expressed sequence tag (EST) information using Simple Sequence Repeat (SSR) and provide evidence for further development of SSR molecular marker. The definition of general SSR features of Hevea EST splicing sequences and development of SSR primers founded the basis of diversity analysis and variety identification for Hevea tree resource. 1134 SSR loci were identified in the EST splicing sequence and distributed in 840 Unigene. The occurrence rate of SSR loci was 23.9%, and the average distribution distance of EST-SSR was 2.59 kb. The major repeat type was mononucleotide repeat motif, which accounted for 38.89%, while the corresponding value was 36.95% for dinucleotide repeat motif and 18.17% for trinucleotide repeat motif; the proportion of other motifs was only 5.99%. The superior repeat motifs for mononucleotide, dinucleotide, and trinucleotide were A/T, AG/CT, and AAG/CTT, respectively. 739 pairs of primers were designed for 1134 SSR loci. PCR amplification was performed on Hevea Reyan5-11, Reyan87-6-47, and PR107, and 180 pairs of primers were selected which were able to amplify polymorphic bands.
16

Liang, S., M. P. Samanta, and B. A. Biegel. "cWINNOWER Algorithm for Finding Fuzzy DNA Motifs." Journal of Bioinformatics and Computational Biology 02, no. 01 (March 2004): 47–60. http://dx.doi.org/10.1142/s0219720004000466.

Abstract:
The cWINNOWER algorithm detects fuzzy motifs in DNA sequences rich in protein-binding signals. A signal is defined as any short nucleotide pattern having up to d mutations differing from a motif of length l. The algorithm finds such motifs if a clique consisting of a sufficiently large number of mutated copies of the motif (i.e., the signals) is present in the DNA sequence. The cWINNOWER algorithm substantially improves the sensitivity of the winnower method of Pevzner and Sze by imposing a consensus constraint, enabling it to detect much weaker signals. We studied the minimum detectable clique size qc as a function of sequence length N for random sequences. We found that qc increases linearly with N for a fast version of the algorithm based on counting three-member sub-cliques. Imposing consensus constraints reduces qc by a factor of three in this case, which makes the algorithm dramatically more sensitive. Our most sensitive algorithm, which counts four-member sub-cliques, needs a minimum of only 13 signals to detect motifs in a sequence of length N=12,000 for (l,d)=(15,4).
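The signal definition in this abstract (a pattern with at most d mismatches from a motif of length l) can be illustrated with a Hamming-distance scan. This is only a sketch of the definition for a motif that is already known; it is not the cWINNOWER algorithm, which has to discover the motif from cliques of signals. The function names and the toy sequence are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))

def find_signals(sequence, motif, d):
    """Return (start, window) for every window within d mismatches of motif."""
    l = len(motif)
    return [
        (i, sequence[i:i + l])
        for i in range(len(sequence) - l + 1)
        if hamming(sequence[i:i + l], motif) <= d
    ]

# Toy example with (l, d) = (4, 1); the sequence is made up for illustration.
print(find_signals("ACGTACGAACTT", "ACGT", 1))
```

In the motif-discovery setting the motif argument is unknown, which is why clique-based approaches compare the windows with each other instead.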
17

Hatstat, A. Katherine, Michael D. Pupi, and Dewey G. McCafferty. "Predicting PY motif-mediated protein-protein interactions in the Nedd4 family of ubiquitin ligases." PLOS ONE 16, no. 10 (October 12, 2021): e0258315. http://dx.doi.org/10.1371/journal.pone.0258315.

Abstract:
The Nedd4 family contains several structurally related but functionally distinct HECT-type ubiquitin ligases. The members of the Nedd4 family are known to recognize substrates through their multiple WW domains, which recognize PY motifs (PPxY, LPxY) or phospho-threonine or phospho-serine residues. To better understand protein interactor recognition mechanisms across the Nedd4 family, we report the development and implementation of a python-based tool, PxYFinder, to identify PY motifs in the primary sequences of previously identified interactors of Nedd4 and related ligases. Using PxYFinder, we find that, on average, half of Nedd4 family interactions are likely PY-motif mediated. Further, we find that PPxY motifs are more prevalent than LPxY motifs and are more likely to occur in proline-rich regions and that PPxY regions are more disordered on average relative to LPxY-containing regions. Informed by consensus sequences for PY motifs across the Nedd4 interactome, we rationally designed a focused peptide library and employed a computational screen, revealing sequence- and biomolecular interaction-dependent determinants of WW-domain/PY-motif interactions. Cumulatively, our efforts provide a new bioinformatic tool and expand our understanding of sequence and structural factors that contribute to PY-motif mediated interactor recognition across the Nedd4 family.
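A minimal sketch of scanning a protein sequence for the two PY motif forms named in this abstract (PPxY and LPxY, with x standing for any residue). This is not the authors' PxYFinder, whose interface is not described here; the function name and the example sequence are hypothetical.

```python
import re

# P or L, then P, then any residue, then Y; the lookahead reports overlapping hits.
PY_PATTERN = re.compile(r"(?=([PL]P.Y))")

def find_py_motifs(protein):
    """Return (start, matched_text, motif_class) for each PPxY or LPxY occurrence."""
    hits = []
    for m in PY_PATTERN.finditer(protein.upper()):
        text = m.group(1)
        motif_class = "PPxY" if text[0] == "P" else "LPxY"
        hits.append((m.start(), text, motif_class))
    return hits

# Hypothetical sequence, made up for illustration.
print(find_py_motifs("MAPPPYSLPEYQ"))
```

Positions are 0-based. A sequence match alone does not establish a WW-domain interaction; the abstract reports further analysis of disorder and proline-rich context.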
18

Yu, Qiang, Xiang Zhao, and Hongwei Huo. "A new algorithm for DNA motif discovery using multiple sample sequence sets." Journal of Bioinformatics and Computational Biology 17, no. 04 (August 2019): 1950021. http://dx.doi.org/10.1142/s0219720019500215.

Abstract:
DNA motif discovery plays an important role in understanding the mechanisms of gene regulation. Most existing motif discovery algorithms can identify motifs in an efficient and effective manner when dealing with small datasets. However, large datasets generated by high-throughput sequencing technologies pose a huge challenge: it is too time-consuming to process the entire dataset, but if only a small sample sequence set is processed, it is difficult to identify infrequent motifs. In this paper, we propose a new DNA motif discovery algorithm: first divide the input dataset into multiple sample sequence sets, then refine initial motifs of each sample sequence set with the expectation maximization method, and finally combine all the results from each sample sequence set. Besides, we design a new initial motif generation method with the utilization of the entire dataset, which helps to identify infrequent motifs. The experimental results on the simulated data show that the proposed algorithm has better time performance for large datasets and better accuracy of identifying infrequent motifs than the compared algorithms. Also, we have verified the validity of the proposed algorithm on the real data.
19

Bredesen, Bjørn André, and Marc Rehmsmeier. "DNA sequence models of genome-wide Drosophila melanogaster Polycomb binding sites improve generalization to independent Polycomb Response Elements." Nucleic Acids Research 47, no. 15 (July 24, 2019): 7781–97. http://dx.doi.org/10.1093/nar/gkz617.

Abstract:
Polycomb Response Elements (PREs) are cis-regulatory DNA elements that maintain gene transcription states through DNA replication and mitosis. PREs have little sequence similarity, but are enriched in a number of sequence motifs. Previous methods for modelling Drosophila melanogaster PRE sequences (PREdictor and EpiPredictor) have used a set of 7 motifs and a training set of 12 PREs and 16-23 non-PREs. Advances in experimental methods for mapping chromatin binding factors and modifications have led to the publication of several genome-wide sets of Polycomb targets. In addition to the seven motifs previously used, PREs are enriched in the GTGT motif, recently associated with the sequence-specific DNA binding protein Combgap. We investigated whether models trained on genome-wide Polycomb sites generalize to independent PREs when trained with control sequences generated by naive PRE models and including the GTGT motif. We also developed a new PRE predictor: SVM-MOCCA. Training PRE predictors with genome-wide experimental data improves generalization to independent data, and SVM-MOCCA predicts the majority of PREs in three independent experimental sets. We present 2908 candidate PREs enriched in sequence and chromatin signatures. 2412 of these are also enriched in H3K4me1, a mark of Trithorax activated chromatin, suggesting that PREs/TREs have a common sequence code.
20

Wu, Cathy H., Hongzhan Huang, and Jerry McLarty. "Gene Family Identification Network Design for Protein Sequence Analysis." International Journal on Artificial Intelligence Tools 08, no. 04 (December 1999): 419–32. http://dx.doi.org/10.1142/s0218213099000282.

Abstract:
With the exponential accumulation of sequence data, continued progress in the Human Genome Project will depend increasingly on advanced computational tools to manage and analyze the data. Utilizing information embedded within families of homologous sequences, a gene family identification approach may facilitate the understanding of gene functions. We have developed a GeneFIND (Gene Family Identification Network Design) system for database searching against gene families. It provides rapid and accurate protein family identification by combining global and motif sequence similarities and incorporating ProClass family information. Multi-level filters are used, starting with the MOTIFIND neural networks and BLAST search, followed by SSEARCH alignment, motif pattern match, hidden Markov modeling of motifs and ClustalW motif alignment. GeneFIND has been implemented as a full-scale system for the classification of more than 1200 ProSite and 6000 PIR families. It has been used to identify thousands of new family members and is well suited for genomic sequence analysis. The system is available for on-line family identification from our WWW server ().
21

Zhang, S., M. J. Ruiz-Echevarria, Y. Quan, and S. W. Peltz. "Identification and characterization of a sequence motif involved in nonsense-mediated mRNA decay." Molecular and Cellular Biology 15, no. 4 (April 1995): 2231–44. http://dx.doi.org/10.1128/mcb.15.4.2231.

Abstract:
In both prokaryotes and eukaryotes, nonsense mutations in a gene can enhance the decay rate or reduce the abundance of the mRNA transcribed from that gene, and we call this process nonsense-mediated mRNA decay. We have been investigating the cis-acting sequences involved in this decay pathway. Previous experiments have demonstrated that, in addition to a nonsense codon, specific sequences 3' of a nonsense mutation, which have been defined as downstream elements, are required for mRNA destabilization. The results presented here identify a sequence motif (TGYYGATGYYYYY, where Y stands for either T or C) that can predict regions in genes that, when positioned 3' of a nonsense codon, promote rapid decay of its mRNA. Sequences harboring two copies of the motif from five regions in the PGK1, ADE3, and HIS4 genes were able to function as downstream elements. In addition, four copies of this motif can function as an independent downstream element. The sequences flanking the motif played a more significant role in modulating its activity when fewer copies of the sequence motif were present. Our results indicate the sequences 5' of the motif can modulate its activity by maintaining a certain distance between the sequence motif and the termination codon. We also suggest that the sequences 3' of the motif modulate the activity of the downstream element by forming RNA secondary structures. Consistent with this view, a stem-loop structure positioned 3' of the sequence motif can enhance the activity of the downstream element. This sequence motif is one of the few elements that have been identified that can predict regions in genes that can be involved in mRNA turnover. The role of these sequences in mRNA decay is discussed.
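As an illustration of how a degenerate motif such as TGYYGATGYYYYY can be located in a sequence, here is a minimal Python sketch. It is not the authors' method: the function names and the test sequence are hypothetical, and only the IUPAC codes A, C, G, T, R, Y, and N are handled.

```python
import re

# IUPAC degenerate nucleotide codes mapped to regex character classes.
# Only a subset is covered here; Y (T or C) is the code used in the abstract.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}

def motif_to_regex(motif):
    """Translate an IUPAC motif into a compiled regular expression."""
    body = "".join(IUPAC[base] for base in motif)
    # Wrapping the pattern in a lookahead reports overlapping matches too.
    return re.compile("(?=(" + body + "))")

def find_motif(sequence, motif="TGYYGATGYYYYY"):
    """Return the 0-based start position of every motif match."""
    pattern = motif_to_regex(motif)
    return [m.start() for m in pattern.finditer(sequence.upper())]

# Hypothetical test sequence, not taken from PGK1, ADE3, or HIS4.
example = "AAATGCTGATGTTCCTAAA"
print(find_motif(example))  # [3]
```

A match only says that the motif is present; whether a region acts as a downstream element also depends, per the abstract, on copy number and flanking sequence.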
22

Sahu, Santosh Kumar, Himadri Gourav Behuria, Sangam Gupta, and Babita Sahoo. "Sequence Analysis of a Subset of Plasma Membrane Raft Proteome Containing CXXC Metal Binding Motifs." International Journal of Knowledge Discovery in Bioinformatics 5, no. 2 (July 2015): 1–15. http://dx.doi.org/10.4018/ijkdb.2015070101.

Abstract:
In an attempt to identify the metal sensing proteins localized to the mammalian plasma membrane, the authors screened a list of 300 raft-associated proteins that are involved in cellular signaling mechanisms by searching for the presence of metal thionin (CXXC) motifs. 50 proteins were found to possess CXXC motifs and could act as potential metal sensing proteins. The authors determined the membrane topologies of these CXXC motif-containing proteins using TM-pred and analyzed the positions of their transmembrane (TM) domains using Bio-edit software. Based on the topology of the CXXC domains, the authors classified all the raft-associated metal sensing proteins into six categories: (i) exoplasmic tails with CXXC motif, (ii) exoplasmic loops with CXXC motif, (iii) cytosolic tails with CXXC motif, (iv) cytosolic loops with CXXC motif, (v) TM domains with CXXC motifs, and (vi) proteins with multiple topologies of CXXC motif. The authors' study will lead to an understanding of the raft-mediated mechanism of heavy metal sensing and signaling in mammalian cells.
23

Bostan, Hamed, Naomie Salim, Zeti Azura Hussein, Peter Klappa, and Mohd Shahir Shamsir. "CMD: A Database to Store the Bonding States of Cysteine Motifs with Secondary Structures." Advances in Bioinformatics 2012 (October 10, 2012): 1–5. http://dx.doi.org/10.1155/2012/849830.

Abstract:
Computational approaches to the disulphide bonding state and its connectivity pattern prediction are based on various descriptors. One descriptor is the amino acid sequence motifs flanking the cysteine residue motifs. Despite the existence of disulphide bonding information in many databases and applications, there is no complete reference and motif query available at the moment. Cysteine motif database (CMD) is the first online resource that stores all cysteine residues, their flanking motifs with their secondary structure, and propensity values assignment derived from the laboratory data. We extracted more than 3 million cysteine motifs from PDB and UniProt data, annotated with secondary structure assignment, propensity value assignment, and frequency of occurrence and coefficiency of their bonding status. Removal of redundancies generated 15875 unique flanking motifs that are always bonded and 41577 unique patterns that are always nonbonded. Queries are based on the protein ID, FASTA sequence, sequence motif, and secondary structure individually or in batch format using the provided APIs that allow remote users to query our database via third party software and/or high throughput screening/querying. The CMD offers extensive information about the bonded, free cysteine residues, and their motifs that allows in-depth characterization of the sequence motif composition.
24

Ralton, J. E., X. Lu, A. M. Hutcheson, and R. A. Quinlan. "Identification of two N-terminal non-alpha-helical domain motifs important in the assembly of glial fibrillary acidic protein." Journal of Cell Science 107, no. 7 (July 1, 1994): 1935–48. http://dx.doi.org/10.1242/jcs.107.7.1935.

Abstract:
The non-alpha-helical N-terminal domain of intermediate filament proteins plays a key role in filament assembly. Previous studies have identified a nonapeptide motif, SSYRRIFGG, in the non-alpha-helical N-terminal domain of vimentin that is required for assembly. This motif is also found in desmin, peripherin and the type IV intermediate filament proteins. GFAP is the only type III intermediate filament protein in which this motif is not readily identified. This study has identified two motifs in the non-alpha-helical N-terminal domain of mouse GFAP that play important roles in GFAP assembly. One motif is located at the very N terminus and has the consensus sequence, MERRRITS-ARRSY. It has some characteristics in common with the vimentin nonapeptide motif, SSYRRIFGG, including its location in the non-alpha-helical N-terminal domain and a concentration of arginine residues. Unlike the vimentin motif in which even conserved sequence changes affect filament assembly, the GFAP consensus sequence, MERRRITS-ARRSY, can be replaced by a completely unrelated sequence; namely, the heptapeptide, MVRANKR, derived from the lambda cII protein. When fused to GFAP sequences with sequential deletions of the N-terminal domain, the lambda cII heptapeptide was used to help identify a second motif, termed the RP-box, which is located just upstream of the GFAP alpha-helical rod domain. This RP-box affected the efficiency of filament assembly as well as protein-protein interactions in the filament, as shown by sedimentation assays and electron microscopy. These results are supported by previous data, which showed that the dramatic reorganization of GFAP within cells was due to phosphorylation-dephosphorylation of a site located in this RP-box. The results in this study suggest the RP-box motif to be a key modulator in the mechanism of GFAP assembly, and support a role for this motif in both the nucleation and elongation phases of filament assembly. 
The RP-box motif in GFAP has the consensus sequence, RLSL-RM-PP. Sequences similar to the GFAP RP-box motif are also to be found in vimentin, desmin and peripherin. Like GFAP, these include phosphorylation and proteolysis sites and are adjacent to the start of the central alpha-helical rod domain, suggesting that this motif is of general importance to type III intermediate filament protein assembly.
25

Li, Xiang, Linna Ma, Xinyue Mei, Yixiang Liu, and Huichuan Huang. "ggmotif: An R Package for the extraction and visualization of motifs from MEME software." PLOS ONE 17, no. 11 (November 3, 2022): e0276979. http://dx.doi.org/10.1371/journal.pone.0276979.

Abstract:
MEME (Multiple Em for Motif Elicitation) is the most commonly used tool to identify motifs within deoxyribonucleic acid (DNA) or protein sequences. However, the results generated by MEME are saved using the file formats .xml and .txt, which are difficult to read, visualize, or integrate with other widely used phylogenetic tree packages, such as ggtree. To overcome this problem, we developed the ggmotif R package, which provides two easy-to-use functions that can facilitate the extraction and visualization of motifs from the results files generated by MEME. ggmotif can extract the information of the location of motif(s) on the corresponding sequence(s) from the .xml format file and visualize it. Additionally, the data extracted by ggmotif can be easily integrated with the phylogenetic data. On the other hand, ggmotif can obtain the sequence of each motif from the .txt format file and draw the sequence logo with the function ggseqlogo from the ggseqlogo R package. The ggmotif R package is freely available (including examples and vignettes) from GitHub at https://github.com/lixiang117423/ggmotif or from CRAN at https://CRAN.R-project.org/package=ggmotif.
26

Blum, Christopher F., and Markus Kollmann. "Neural networks with circular filters enable data efficient inference of sequence motifs." Bioinformatics 35, no. 20 (March 27, 2019): 3937–43. http://dx.doi.org/10.1093/bioinformatics/btz194.

Abstract:
Motivation: Nucleic acids and proteins often have localized sequence motifs that enable highly specific interactions. Due to the biological relevance of sequence motifs, numerous inference methods have been developed. Recently, convolutional neural networks (CNNs) have achieved state of the art performance. These methods were able to learn transcription factor binding sites from ChIP-seq data, resulting in accurate predictions on test data. However, CNNs typically distribute learned motifs across multiple filters, making them difficult to interpret. Furthermore, networks trained on small datasets often do not generalize well to new sequences.
Results: Here we present circular filters, a novel convolutional architecture, that convolves sequences with circularly permutated variants of the same filter. We motivate circular filters by the observation that CNNs frequently learn filters that correspond to shifted and truncated variants of the true motif. Circular filters enable learning of full-length motifs and allow easy interpretation of the learned filters. We show that circular filters improve motif inference performance over a wide range of hyperparameters as well as sequence length. Furthermore, we show that CNNs with circular filters in most cases outperform conventional CNNs at inferring DNA binding sites from ChIP-seq data.
Availability and implementation: Code is available at https://github.com/christopherblum.
Supplementary information: Supplementary data are available at Bioinformatics online.
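To make the idea of scanning a sequence with circularly shifted copies of one filter concrete, the following pure-Python sketch scores a one-hot encoded sequence against every circular shift of a filter and keeps the best response. This is an illustration under stated assumptions, not the authors' architecture: the function names are hypothetical, there is no training, and taking the maximum over shifts is just one simple way to combine the variants.

```python
def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T)."""
    return [[1.0 if base == b else 0.0 for b in "ACGT"] for base in seq]

def circular_variants(filt):
    """All circular shifts of a filter along its position axis."""
    return [filt[k:] + filt[:k] for k in range(len(filt))]

def scan(encoded, filt):
    """Cross-correlate one filter with the sequence (valid positions only)."""
    width = len(filt)
    return [
        sum(encoded[i + j][c] * filt[j][c]
            for j in range(width) for c in range(4))
        for i in range(len(encoded) - width + 1)
    ]

def circular_filter_response(seq, filt):
    """Best score over all circular variants and all positions.

    Assumes the sequence is at least as long as the filter.
    """
    encoded = one_hot(seq)
    return max(max(scan(encoded, v)) for v in circular_variants(filt))

# A filter that encodes "TAGA", i.e. a circularly shifted copy of "GATA".
shifted = one_hot("TAGA")
print(max(scan(one_hot("CCGATACC"), shifted)))        # 2.0
print(circular_filter_response("CCGATACC", shifted))  # 4.0
```

In this toy case the shifted filter alone scores at most 2.0 against a sequence containing GATA, while one of its circular variants recovers the full-length match of 4.0.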
27

Hong, Jian, Ying C. Q. Zang, Maria V. Tejada-Simon, Milena Kozovska, Sufang Li, Rana A. K. Singh, Deye Yang, Victor M. Rivera, James K. Killian, and Jingwu Z. Zhang. "A Common TCR V-D-J Sequence in Vβ13.1 T Cells Recognizing an Immunodominant Peptide of Myelin Basic Protein in Multiple Sclerosis." Journal of Immunology 163, no. 6 (September 15, 1999): 3530–38. http://dx.doi.org/10.4049/jimmunol.163.6.3530.

Abstract:
T cell responses to the immunodominant peptide (residues 83–99) of myelin basic protein are potentially associated with multiple sclerosis (MS). This study was undertaken to examine whether a common sequence motif(s) exists within the TCR complementarity-determining region (CDR)-3 of T cells recognizing the MBP83–99 peptide. Twenty MBP83–99-reactive T cell clones derived from patients with MS were analyzed for CDR3 sequences, which revealed several shared motifs. Some Vβ13.1 T cell clones derived from different patients with MS were found to contain an identical CDR3 motif, Vβ13.1-LGRAGLTY. Oligonucleotides complementary to the shared CDR3 motifs were used as specific probes to detect identical target CDR3 sequences in a large panel of T cell lines reactive to MBP83–99 and unprimed PBMC. The results revealed that, in contrast to other CDR3 motifs examined, the LGRAGLTY motif was common to T cells recognizing the MBP83–99 peptide, as evident by its expression in the majority of MBP83–99-reactive T cell lines (36/44) and PBMC specimens (15/48) obtained from randomly selected MS patients. The motif was also detected in lower expression in some PBMC specimens from healthy individuals, suggesting the presence of low precursor frequency of T cells expressing this motif in healthy individuals. This study provides new evidence indicating that the identified LGRAGLTY motif is preferentially expressed in MBP83–99-reactive T cells. The findings have important implications in monitoring and targeting MBP83–99-reactive T cells in MS.
28

Wohlschlegel, James A., Brian T. Dwyer, David Y. Takeda, and Anindya Dutta. "Mutational Analysis of the Cy Motif from p21 Reveals Sequence Degeneracy and Specificity for Different Cyclin-Dependent Kinases." Molecular and Cellular Biology 21, no. 15 (August 1, 2001): 4868–74. http://dx.doi.org/10.1128/mcb.21.15.4868-4874.2001.

Abstract:
Inhibitors, activators, and substrates of cyclin-dependent kinases (cdks) utilize a cyclin-binding sequence, known as a Cy or RXL motif, to bind directly to the cyclin subunit. Alanine scanning mutagenesis of the Cy motif of the cdk inhibitor p21 revealed that the conserved arginine or leucine (constituting the conserved RXL sequence) was important for p21's ability to inhibit cyclin E-cdk2 activity. Further analysis of mutant Cy motifs showed, however, that RXL was neither necessary nor sufficient for a functional cyclin-binding motif. Replacement of either of these two residues with small hydrophobic residues such as valine preserved p21's inhibitory activity on cyclin E-cdk2, while mutations in either polar or charged residues dramatically impaired p21's inhibitory activity. Expressing p21N with non-RXL Cy sequences inhibited growth of mammalian cells, providing in vivo confirmation that RXL was not necessary for a functional Cy motif. We also show that the variant Cy motifs identified in this study can effectively target substrates to cyclin-cdk complexes for phosphorylation, providing additional evidence that these non-RXL motifs are functional. Finally, binding studies using p21 Cy mutants demonstrated that the Cy motif was essential for the association of p21 with cyclin E-cdk2 but not with cyclin A-cdk2. Taking advantage of this differential specificity toward cyclin E versus cyclin A, we demonstrate that cell growth inhibition was absolutely dependent on the ability of a p21 derivative to inhibit cyclin E-cdk2.
29

Magandhi, Mahat, Sobir, Yudiwanti W. E. Kusumo, Sudarmono, and Deden Derajat Matra. "Development and characterization of Simple Sequence Repeats (SSRs) markers in durian kura-kura (Durio testudinarius Becc.) using NGS data." IOP Conference Series: Earth and Environmental Science 948, no. 1 (December 1, 2021): 012082. http://dx.doi.org/10.1088/1755-1315/948/1/012082.

Abstract:
Durian Kura-kura (Durio testudinarius Becc.) belongs to the Malvaceae family and is an endemic species of Borneo. Recently, genomic-based next-generation sequencing (NGS) approaches have been carried out for germplasm conservation and plant breeding programs. The NGS technologies allow plant genomes to be sequenced quickly and inexpensively and enable the efficient development of SSR markers through in-silico approaches. This study aimed to develop and characterize simple sequence repeats (SSRs) from the assembled genome. The Ray assembler produced 1203929 scaffolds for the assembled genome. SSRs were identified and extracted using the MISA program, which yielded 4315 sequences containing SSRs. Six classes of SSR motif repeats were identified: 431 dinucleotide sequences (most common motif AT), 3257 trinucleotide sequences (most common motif TTA), 516 tetranucleotide sequences (most common motif AAAT), 89 pentanucleotide sequences (most common motif ATTTT), 18 hexanucleotide sequences, and four heptanucleotide sequences. The new SSR markers will be used in further genetic population studies of D. testudinarius and in plant breeding programs.
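The kind of scan that an SSR finder performs can be sketched in a few lines of Python. This is not MISA itself, and the minimum repeat counts below are hypothetical thresholds chosen for illustration; the abstract does not state the parameters that were used. Only perfect (uninterrupted) repeats of 2 to 7 nucleotide units are reported.

```python
import re

# Hypothetical thresholds: minimum number of repeats per unit length.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5, 7: 5}

def is_primitive(unit):
    """True if the unit is not itself a repetition of a shorter unit."""
    return not any(
        len(unit) % k == 0 and unit == unit[:k] * (len(unit) // k)
        for k in range(1, len(unit))
    )

def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return sorted (start, unit, repeat_count) tuples for perfect SSRs."""
    sequence = sequence.upper()
    hits = []
    for size, minimum in sorted(min_repeats.items()):
        # A unit of `size` bases followed by at least `minimum - 1` copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, minimum - 1))
        for m in pattern.finditer(sequence):
            unit = m.group(1)
            # Skip units such as "AA" or "ATAT" that belong to a shorter class.
            if is_primitive(unit):
                hits.append((m.start(), unit, len(m.group(0)) // size))
    return sorted(hits)

# Hypothetical sequence: (AT)7 and (TTA)5 embedded in short flanks.
example = "GGC" + "AT" * 7 + "CCG" + "TTA" * 5 + "G"
print(find_ssrs(example))  # [(3, 'AT', 7), (20, 'TTA', 5)]
```

Counting the hits per unit length gives the kind of per-class tally reported in the abstract (dinucleotide, trinucleotide, and so on).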
30

Liu, Jiao, Wen Rui Xia, Yan Ping Hu, Yuan Yao, Shao Ping Fu, Rui Jun Duan, Rui Mei Li, and Jian Chun Guo. "Cloning and Analysis of MeCWINV6 Promoter from Biofuel Plant Cassava (Manihot esculenta Crantz)." Advanced Materials Research 986-987 (July 2014): 25–29. http://dx.doi.org/10.4028/www.scientific.net/amr.986-987.25.

Abstract:
In order to gain insight into the specific function of the cassava cell wall invertase 6 (MeCWINV6), the promoter sequence of the MeCWINV6 gene was cloned using a PCR amplification approach. A 118 bp CDS sequence and a 1042 bp potential promoter sequence of the MeCWINV6 gene were obtained. In silico analysis of the putative cis-elements with PlantCARE revealed that these elements can be grouped into five classes: basic transcription elements (CAAT box and TATA box), light responsive elements (ACE, AE-box, ATCT-motif, AT1-motif, Box 4, GAG-motif, GT1-motif and Sp1), phytohormone responsive motifs (GARE-motif, TATC-box, TGACG-motif and TCA-element), defense and stress responsive elements (TC-rich repeats and HSE), and wounding and pathogen responsive elements (W-box and WUN-motif). These data suggest that the promoter might be involved in regulating cell wall invertase gene function in source-sink relations of cassava starch accumulation and in the response to internal and environmental stimuli.
31

Bielecka, Patrycja, Anna Dembska, and Bernard Juskowiak. "Monitoring of pH Using an i-Motif-Forming Sequence Containing a Fluorescent Cytosine Analogue, tC." Molecules 24, no. 5 (March 8, 2019): 952. http://dx.doi.org/10.3390/molecules24050952.

Abstract:
The i-motif is a four-stranded DNA structure formed from a cytosine (C)-rich ssDNA sequence, which is stabilized at slightly acidic pH. Additionally, labeling of a cytosine-rich sequence with a fluorescent molecule may constitute a way to construct a pH-sensitive biosensor. In this paper, we report tC-modified fluorescent probes that contain the RET-related sequence C4GC4GC4GC4A. Results of the UV absorption melting experiments, circular dichroism (CD) spectra, and steady-state fluorescence measurements of tC-modified i-motifs are presented and discussed here. Efficient fluorescence quenching of the tC fluorophore occurred upon lowering the pH from 8.0 to 5.5. Furthermore, we present and discuss fluorescence spectra of systems containing tC-modified i-motifs and complementary G-rich sequences in the ratios 1:1, 1:2, and 1:3 in response to pH changes. Fluorescence anisotropy was proposed for the study of conformational switching of the i-motif structure for tC-probes in the presence and absence of a complementary sequence. The possibility of using the sensor for monitoring pH changes was demonstrated.
32

Speckmann, Wayne, Aarthi Narayanan, Rebecca Terns, and Michael P. Terns. "Nuclear Retention Elements of U3 Small Nucleolar RNA." Molecular and Cellular Biology 19, no. 12 (December 1, 1999): 8412–21. http://dx.doi.org/10.1128/mcb.19.12.8412.

Abstract:
The processing and methylation of precursor rRNA is mediated by the box C/D small nucleolar RNAs (snoRNAs). These snoRNAs differ from most cellular RNAs in that they are not exported to the cytoplasm. Instead, these RNAs are actively retained in the nucleus where they assemble with proteins into mature small nucleolar ribonucleoprotein particles and are targeted to their intranuclear site of action, the nucleolus. In this study, we have identified the cis-acting sequences responsible for the nuclear retention of U3 box C/D snoRNA by analyzing the nucleocytoplasmic distributions of an extensive panel of U3 RNA variants after injection of the RNAs into Xenopus oocyte nuclei. Our data indicate the importance of two conserved sequence motifs in retaining U3 RNA in the nucleus. The first motif is comprised of the conserved box C′ and box D sequences that characterize the box C/D family. The second motif contains conserved box sequences B and C. Either motif is sufficient for nuclear retention, but disruption of both motifs leads to mislocalization of the RNAs to the cytoplasm. Variant RNAs that are not retained also lack 5′ cap hypermethylation and fail to associate with fibrillarin. Furthermore, our results indicate that nuclear retention of U3 RNA does not simply reflect its nucleolar localization. A fragment of U3 containing the box B/C motif is not localized to nucleoli but retained in coiled bodies. Thus, nuclear retention and nucleolar localization are distinct processes with differing sequence requirements.
33

Adams, Peter D., Xiaotong Li, William R. Sellers, Kayla B. Baker, Xiaohong Leng, J. Wade Harper, Yoichi Taya, and William G. Kaelin. "Retinoblastoma Protein Contains a C-terminal Motif That Targets It for Phosphorylation by Cyclin-cdk Complexes." Molecular and Cellular Biology 19, no. 2 (February 1, 1999): 1068–80. http://dx.doi.org/10.1128/mcb.19.2.1068.

Abstract:
Stable association of certain proteins, such as E2F1 and p21, with cyclin-cdk2 complexes is dependent upon a conserved cyclin-cdk2 binding motif that contains the core sequence ZRXL, where Z and X are usually basic. In vitro phosphorylation of the retinoblastoma tumor suppressor protein, pRB, by cyclin A-cdk2 and cyclin E-cdk2 was inhibited by a short peptide spanning the cyclin-cdk2 binding motif present in E2F1. Examination of the pRB C terminus revealed that it contained sequence elements related to ZRXL. Site-directed mutagenesis of one of these sequences, beginning at residue 870, impaired the phosphorylation of pRB in vitro. A synthetic peptide spanning this sequence also inhibited the phosphorylation of pRB in vitro. pRB C-terminal truncation mutants lacking this sequence were hypophosphorylated in vitro and in vivo despite the presence of intact cyclin-cdk phosphoacceptor sites. Phosphorylation of such mutants was restored by fusion to the ZRXL-like motif derived from pRB or to the ZRXL motifs from E2F1 or p21. Phospho-site-specific antibodies revealed that certain phosphoacceptor sites strictly required a C-terminal ZRXL motif whereas at least one site did not. Furthermore, this residual phosphorylation was sufficient to inactivate pRB in vivo, implying that there are additional mechanisms for directing cyclin-cdk complexes to pRB. Thus, the C terminus of pRB contains a cyclin-cdk interaction motif of the type found in E2F1 and p21 that enables it to be recognized and phosphorylated by cyclin-cdk complexes.
34

Su, J., R. A. Bapat, G. Visakan, and J. Moradian-Oldak. "An Evolutionarily Conserved Helix Mediates Ameloblastin-Cell Interaction." Journal of Dental Research 99, no. 9 (May 13, 2020): 1072–81. http://dx.doi.org/10.1177/0022034520918521.

Abstract:
Ameloblastin (Ambn) has the potential to regulate cell-matrix adhesion through familiar cell-binding domains, but the proposed sequence motifs are not highly conserved across species. Here, we report that Ambn binds to ameloblast-like cell membranes through a highly evolutionary conserved amphipathic helix-forming (AH) motif encoded by exon 5. We applied high-resolution confocal microscopy to show colocalization of Ambn with ameloblast membrane surfaces in developing mouse incisors. Using a series of Ambn-derived peptides and Ambn variants, we showed that Ambn binds to cell membranes through a motif within the sequence encoded by exon 5. Using peptides derived from the N- or C-termini of this sequence, as well as Ambn variants that lacked or had a disrupted AH motif, we demonstrated that the AH motif located at the N-terminus of the sequence is involved in cell-Ambn adhesion. Sequence analysis revealed that this highly conserved AH motif is absent from other enamel matrix proteins, including amelogenin, enamelin, and amelotin. Collectively, these data suggest that Ambn binds to the cell surface membrane via a helix-forming motif and provide insight into the molecular mechanism and function of Ambn in enamel cell-matrix interaction.
35

Andersson, Samuel A., and Jens Lagergren. "Motif Yggdrasil: Sampling Sequence Motifs from a Tree Mixture Model." Journal of Computational Biology 14, no. 5 (June 2007): 682–97. http://dx.doi.org/10.1089/cmb.2007.r010.

36

TAYLOR, WILLIAM R. "Motif-Biased Protein Sequence Alignment." Journal of Computational Biology 1, no. 4 (January 1994): 297–310. http://dx.doi.org/10.1089/cmb.1994.1.297.

37

Shaw, Gerry. "A neurofilament-specific sequence motif." Trends in Biochemical Sciences 17, no. 9 (September 1992): 345. http://dx.doi.org/10.1016/0968-0004(92)90309-w.

38

Colombo, Nicoló, and Nikos Vlassis. "FastMotif: spectral sequence motif discovery." Bioinformatics 31, no. 16 (April 16, 2015): 2623–31. http://dx.doi.org/10.1093/bioinformatics/btv208.

39

Martyanov, Viktor, and Robert H. Gross. "Transcriptional Regulation in the G1-S Cell Cycle Stage in Fungi: Insights through Computational Analysis." Open Bioinformatics Journal 6, no. 1 (September 7, 2012): 43–54. http://dx.doi.org/10.2174/1875036201206010043.

Abstract:
The transcription factor complexes Mlu1-box binding factor (MBF) and Swi4/6 cell cycle box binding factor (SBF) regulate the cell cycle in Saccharomyces cerevisiae. They activate hundreds of genes and are responsible for normal cell cycle progression from G1 to S phase. We investigated the conservation of MBF and SBF binding sites during fungal evolution. Orthologs of S. cerevisiae targets of these transcription factors were identified in 37 fungal species and their upstream regions were analyzed for putative transcription factor binding sites. Both groups displayed enrichment in specific putative regulatory DNA sequences in their upstream regions and showed different preferred upstream motif locations, variable patterns of evolutionary conservation of the motifs and enrichment in unique biological functions for the regulated genes. The results indicate that despite high sequence similarity of upstream DNA motifs putatively associated with G1-S transcriptional regulation by MBF and SBF transcription factors, there are important upstream sequence feature differences that may help differentiate the two seemingly similar regulatory modes. The incorporation of upstream motif sequence comparison, positional distribution and evolutionary variability of the motif can complement functional information about roles of the respective gene products and help elucidate transcriptional regulatory pathways and functions.
40

Weiner, Benjamin G., Andrew G. T. Pyo, Yigal Meir, and Ned S. Wingreen. "Motif-pattern dependence of biomolecular phase separation driven by specific interactions." PLOS Computational Biology 17, no. 12 (December 29, 2021): e1009748. http://dx.doi.org/10.1371/journal.pcbi.1009748.

Abstract:
Eukaryotic cells partition a wide variety of important materials and processes into biomolecular condensates—phase-separated droplets that lack a membrane. In addition to nonspecific electrostatic or hydrophobic interactions, phase separation also depends on specific binding motifs that link together constituent molecules. Nevertheless, few rules have been established for how these ubiquitous specific, saturating, motif-motif interactions drive phase separation. By integrating Monte Carlo simulations of lattice-polymers with mean-field theory, we show that the sequence of heterotypic binding motifs strongly affects a polymer’s ability to phase separate, influencing both phase boundaries and condensate properties (e.g. viscosity and polymer diffusion). We find that sequences with large blocks of single motifs typically form more inter-polymer bonds, which promotes phase separation. Notably, the sequence of binding motifs influences phase separation primarily by determining the conformational entropy of self-bonding by single polymers. This contrasts with systems where the molecular architecture primarily affects the energy of the dense phase, providing a new entropy-based mechanism for the biological control of phase separation.
41

Peng, He. "CFSP: a collaborative frequent sequence pattern discovery algorithm for nucleic acid sequence classification." PeerJ 8 (April 20, 2020): e8965. http://dx.doi.org/10.7717/peerj.8965.

Abstract:
Background: Conserved nucleic acid sequences play an essential role in transcriptional regulation. The motifs/templates derived from nucleic acid sequence datasets are usually used as biomarkers to predict biochemical properties such as protein binding sites or to identify specific non-coding RNAs. In many cases, template-based nucleic acid sequence classification performs better than some feature extraction methods, such as N-gram and k-spaced pairs classification. The availability of large-scale experimental data provides an unprecedented opportunity to improve motif extraction methods. The process for pattern extraction from large-scale data is crucial for the creation of predictive models.
Methods: In this article, a Teiresias-like feature extraction algorithm to discover frequent sub-sequences (CFSP) is proposed. Although gaps are allowed in some motif discovery algorithms, the distance and number of gaps are limited. The proposed algorithm can find frequent sequence pairs with a larger gap. The combinations of frequent sub-sequences in given protracted sequences capture the long-distance correlation, which implies a specific molecular biological property. Hence, the proposed algorithm intends to discover the combinations. A set of frequent sub-sequences derived from nucleic acid sequences with order is used as a base frequent sub-sequence array. The mutation information is attached to each sub-sequence array to implement fuzzy matching. Thus, a mutate records a single nucleotide variant or nucleotides insertion/deletion (indel) to encode a slight difference between frequent sequences and a matched subsequence of a sequence under investigation.
Conclusions: The proposed algorithm has been validated with several nucleic acid sequence prediction case studies. These data demonstrate better results than the recently available feature descriptors based methods based on experimental data sets such as miRNA, piRNA, and Sigma 54 promoters.
CFSP is implemented in C++ and shell script; the source code and related data are available at https://github.com/HePeng2016/CFSP.
42

Uchiumi, F., K. Semba, Y. Yamanashi, J. Fujisawa, M. Yoshida, K. Inoue, K. Toyoshima, and T. Yamamoto. "Characterization of the promoter region of the src family gene lyn and its trans activation by human T-cell leukemia virus type I-encoded p40tax." Molecular and Cellular Biology 12, no. 9 (September 1992): 3784–95. http://dx.doi.org/10.1128/mcb.12.9.3784-3795.1992.

Abstract:
The src family gene lyn is expressed preferentially in B lymphocytes but very little in normal T lymphocytes. Transcription of the lyn gene in T lymphocytes was shown to be induced by the p40tax protein encoded by human T-cell lymphotropic virus type I. For determination of the mechanism of p40tax-mediated trans activation, the transcriptional promoter region of the lyn gene was characterized. By endonuclease S1 mapping, the transcriptional initiation sites were identified within the 770-bp EcoRI-SacI fragment of the 5'-terminal portion of the human lyn gene. This fragment showed promoter activity when placed upstream of the bacterial chloramphenicol acetyltransferase gene and transfected into various cell lines. Nucleotide sequence analysis revealed that the lyn promoter region contained four GC box-like sequences but not a TATA or CCAAT box. In addition, it contained sequences characteristic of a cyclic AMP-responsive element, octamer-binding motif, PEA3-like motifs, and NF kappa B-binding motif-like sequence. Mutational analysis suggested that the octamer-binding motif sequence is of primary importance for the lyn promoter activity but that the other elements are not. Cotransfection of various chloramphenicol acetyltransferase constructs containing different length of the lyn promoter together with p40tax expression plasmids into Jurkat T cells showed that the sequence responsible for p40tax-induced transcription is present around the transcription initiation sites.
44

Mohanty, Satarupa, Prasant Kumar Pattnaik, Ahmed Abdulhakim Al-Absi, and Dae-Ki Kang. "A Review on Planted (l, d) Motif Discovery Algorithms for Medical Diagnose." Sensors 22, no. 3 (February 5, 2022): 1204. http://dx.doi.org/10.3390/s22031204.

Abstract:
Personalized diagnosis of chronic disease requires capturing the continual pattern across the biological sequence. In medical science, this repeating pattern is called a “motif”. Motifs are short, recurring patterns in biological sequences that are supposed to signify some health disorder. They identify the binding sites for transcription factors that modulate and synchronize gene expression. These motifs are important for the analysis and interpretation of various health issues, such as human disease, gene function, drug design, and patients’ conditions. Searching for these patterns is an important step in unraveling the mechanisms of gene expression in order to properly diagnose and treat chronic disease. Thus, motif identification has a vital role in healthcare studies and attracts many researchers. Numerous approaches have been characterized for the motif discovery process. This article attempts to review and analyze fifty-four of the most frequently found motif discovery processes/algorithms from different approaches and summarizes the discussion with their strengths and weaknesses.
45

Shen, Zeyang, Marten A. Hoeksema, Zhengyu Ouyang, Christopher Benner, and Christopher K. Glass. "MAGGIE: leveraging genetic variation to identify DNA sequence motifs mediating transcription factor binding and function." Bioinformatics 36, Supplement_1 (July 1, 2020): i84–i92. http://dx.doi.org/10.1093/bioinformatics/btaa476.

Abstract:
Motivation: Genetic variation in regulatory elements can alter transcription factor (TF) binding by mutating a TF binding motif, which in turn may affect the activity of the regulatory elements. However, it is unclear which motifs are prone to impact transcriptional regulation if mutated. Current motif analysis tools either prioritize TFs based on motif enrichment without linking to a function or are limited in their applications due to the assumption of linearity between motifs and their functional effects. Results: We present MAGGIE (Motif Alteration Genome-wide to Globally Investigate Elements), a novel method for identifying motifs mediating TF binding and function. By leveraging measurements from diverse genotypes, MAGGIE uses a statistical approach to link mutations of a motif to changes of an epigenomic feature without assuming a linear relationship. We benchmark MAGGIE across various applications using both simulated and biological datasets and demonstrate its improvement in sensitivity and specificity compared with the state-of-the-art motif analysis approaches. We use MAGGIE to gain novel insights into the divergent functions of distinct NF-κB factors in pro-inflammatory macrophages, revealing the association of p65–p50 co-binding with transcriptional activation and the association of p50 binding lacking p65 with transcriptional repression. Availability and implementation: The Python package for MAGGIE is freely available at https://github.com/zeyang-shen/maggie. The accession number for the NF-κB ChIP-seq data generated for this study is Gene Expression Omnibus: GSE144070. Supplementary information: Supplementary data are available at Bioinformatics online.
46

Kong, Qing, Perng-Kuang Chang, Chunjuan Li, Zhaorong Hu, Mei Zheng, Quanxi Sun, and Shihua Shan. "Identification of AflR Binding Sites in the Genome of Aspergillus flavus by ChIP-Seq." Journal of Fungi 6, no. 2 (April 21, 2020): 52. http://dx.doi.org/10.3390/jof6020052.

Abstract:
We report here the AflR binding motif of Aspergillus flavus for the first time with the aid of ChIP-seq analysis. Of the 540 peak sequences associated with AflR binding events, 66.8% were located within 2 kb upstream (promoter region) of translational start sites. The identified 18-bp binding motif was a perfect palindromic sequence, 5′-CSSGGGWTCGAWCCCSSG-3′, with S representing G or C and W representing A or T. On closer examination, we hypothesized that the 18-bp motif sequence identified contained two identical parts (here called motif A and motif B). Motif A was in positions 8–18 on the upper strand, while motif B was in positions 11–1 on the bottom strand. The inferred length and sequence of the putative motif identified in A. flavus were similar to previous findings in A. parasiticus and A. nidulans. Gene ontology analysis indicated that AflR bound to other genes outside the aflatoxin biosynthetic gene cluster.
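The palindrome claim in this abstract can be checked mechanically. The short Python sketch below uses the standard IUPAC nucleotide complement table (S and W are their own complements); the function names are illustrative and are not taken from the cited paper.

```python
# Standard IUPAC nucleotide complements; S (G/C) and W (A/T) map to themselves.
IUPAC_COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C",
    "S": "S", "W": "W", "R": "Y", "Y": "R",
    "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}


def reverse_complement(motif):
    """Return the reverse complement of an IUPAC-coded DNA motif."""
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(motif))


def is_palindromic(motif):
    """A DNA motif is palindromic if it equals its own reverse complement."""
    return motif == reverse_complement(motif)


MOTIF = "CSSGGGWTCGAWCCCSSG"  # 18-bp motif quoted in the abstract

if __name__ == "__main__":
    print(is_palindromic(MOTIF))
    # Motif A: positions 8-18 on the upper strand (1-based, inclusive).
    motif_a = MOTIF[7:18]
    # Motif B: positions 11-1 read 5'->3' on the bottom strand.
    motif_b = reverse_complement(MOTIF[:11])
    print(motif_a, motif_b)
```

Reading positions 11 down to 1 on the bottom strand is the same as taking the reverse complement of the first eleven bases of the upper strand, which is why the two 11-bp halves come out identical.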
47

Brylinski, Michał, Leszek Konieczny, Patryk Czerwonko, Wiktor Jurkowski, and Irena Roterman. "Early-Stage Folding in Proteins (In Silico) Sequence-to-Structure Relation." Journal of Biomedicine and Biotechnology 2005, no. 2 (2005): 65–79. http://dx.doi.org/10.1155/jbb.2005.65.

Abstract:
A sequence-to-structure library has been created based on the complete PDB database. The tetrapeptide was selected as a unit representing a well-defined structural motif. Seven structural forms were introduced for structure classification. The early-stage folding conformations were used as the objects for structure analysis and classification. The degree of determinability was estimated for the sequence-to-structure and structure-to-sequence relations. Probability calculus and informational entropy were applied for quantitative estimation of the mutual relation between them. The structural motifs representing different forms of loops and bends were found to favor particular sequences in structure-to-sequence analysis.
48

Jia, Hui, and Jinming Li. "Finding Transcription Factor Binding Motifs for Coregulated Genes by Combining Sequence Overrepresentation with Cross-Species Conservation." Journal of Probability and Statistics 2012 (2012): 1–18. http://dx.doi.org/10.1155/2012/830575.

Abstract:
Novel computational methods for finding transcription factor binding motifs have long been sought because identifying them experimentally is tedious. However, the current prevailing methods yield a large number of false positive predictions due to the short, variable nature of transcriptional factor binding sites (TFBSs). We propose here a method that combines sequence overrepresentation and cross-species sequence conservation to detect TFBSs in upstream regions of a given set of coregulated genes. We applied the method to 35 S. cerevisiae transcriptional factors with known DNA binding motifs (with the support of orthologous sequences from the genomes of S. mikatae, S. bayanus, and S. paradoxus), and the proposed method outperformed the single-genome-based motif finding methods MEME and AlignACE as well as the multiple-genome-based methods PHYME and Footprinter for the majority of these transcriptional factors. Compared with the prevailing motif finding software, our method has some advantages in finding transcriptional factor binding motifs for potential coregulated genes if the gene upstream sequences of multiple closely related species are available. Although we used yeast genomes to assess our method in this study, it might also be applied to other organisms if suitable related species are available and the upstream sequences of coregulated genes can be obtained for the multiple closely related species.
49

Sasso, E. H., K. Willems van Dijk, A. Bull, S. M. van der Maarel, and E. C. Milner. "VH genes in tandem array comprise a repeated germline motif." Journal of Immunology 149, no. 4 (August 15, 1992): 1230–36. http://dx.doi.org/10.4049/jimmunol.149.4.1230.

Abstract:
In a study of human VH gene heterogeneity, we have previously used sequence-specific oligonucleotide probes to demonstrate polymorphism of 56p1 and three highly homologous VH3 germline elements. We now extend these findings with VH nucleotide sequences obtained from a person who possesses restriction fragments corresponding to each of these four VH3 genes. From a lambda-phage library of genomic DNA, distinct phage clones containing putative 56p1, hv3005, 1.9III, and hv3019b9 genes were selected by screening with oligonucleotide probes. PCR amplification, subcloning, and sequencing from the respective clones 3d216, 3d24, 3d28, and 3d277 yielded exact 56p1, hv3005, 1.9III, and hv3019b9 nucleotide sequences. A panel of oligonucleotide probes was shown to hybridize to these cloned VH3 genes with exact specificity, demonstrating the ability of the probes to predict the sequence of detected target DNA. Based on their chromosomal organization and their previously determined distribution in the population, these VH3 genes represent at least three distinct loci. From each of the VH3-containing phage clones, a VH4 element was also identified and sequenced. Linked to 3d24 and 3d28, respectively, were VH4 sequences identical to hv4005 and 1.9II, corroborating previous reports. The VH4 elements linked to 3d216 and 3d277 were distinct from published VH4 sequences. Nucleotide sequence homology was 97 to 99% among the VH3 sequences, and 93 to 99% among the VH4. These findings indicate that the VH3-VH4 gene pairs we have identified are a repeated germline motif, apparently resulting from multiple duplications of tandemly arrayed VH genes.
50

Håland, Else Marie, Astrid Salte Wiig, Lars Magnus Hvattum, and Magnus Stålhane. "Evaluating the effectiveness of different network flow motifs in association football." Journal of Quantitative Analysis in Sports 16, no. 4 (November 18, 2020): 311–23. http://dx.doi.org/10.1515/jqas-2019-0097.

Abstract:
In association football, a network flow motif describes how distinct players from a team are involved in a passing sequence. The flow motif encodes whether the same players appear several times in a passing sequence, and in which order the players make passes. This information has previously been used to classify the passing style of different teams. In this work, flow motifs are analyzed in terms of their effectiveness in generating shots. Data from four seasons of the Norwegian top division are analyzed, using flow motifs representing subsequences of three passes. The analysis is performed with a generalized additive model (GAM), with a range of explanatory variables included. Findings include that motifs with fewer distinct players are less effective, and that motifs are more likely to lead to shots if the passes in the motif utilize a bigger area of the pitch.
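The encoding described in this abstract can be sketched in a few lines of Python. The letter-labelling convention (players relabelled A, B, C, ... in order of first appearance, so that three passes give a four-letter motif such as ABAC) is an assumption based on common usage in the flow-motif literature, not a detail stated in the abstract; the function names and the shirt numbers in the example are hypothetical.

```python
from collections import Counter


def flow_motif(players):
    """Relabel a sequence of players as A, B, C, ... in order of first
    appearance, keeping only the pattern of who passes to whom."""
    labels = {}
    motif = []
    for player in players:
        if player not in labels:
            labels[player] = chr(ord("A") + len(labels))
        motif.append(labels[player])
    return "".join(motif)


def motif_counts(possession, n_passes=3):
    """Count the flow motifs of every run of n_passes consecutive passes.
    A run of n passes involves n + 1 player slots."""
    window = n_passes + 1
    return Counter(
        flow_motif(possession[i:i + window])
        for i in range(len(possession) - window + 1)
    )


if __name__ == "__main__":
    # Hypothetical possession: shirt numbers of the players touching the ball.
    possession = [7, 10, 7, 9, 10]
    print(motif_counts(possession))
```

Because the labels depend only on the order of first appearance, the same motif is produced regardless of which players are involved, which is what allows motifs to be compared across teams.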