ModelArchive - Open Research Data (ORD) best practices for computational macromolecular models

People

 

Cavalli A.

(Project partner)

External participants

Schwede Torsten

(Responsible)

Abstract

Proteins, DNA, and RNA are essential for all biological processes, and their functions are closely tied to their 3D structures. Traditionally, structures have been determined experimentally, mainly by X-ray crystallography, NMR, and cryo-EM, but computational methods have recently made impressive progress in accurate 3D protein structure prediction. Indeed, the journal Nature Methods named protein structure prediction its “Method of the Year 2021”. The structural biology community has pioneered open research data principles, as exemplified by the Protein Data Bank (PDB), the global de facto standard archive of experimentally determined macromolecular structures. However, the PDB does not archive structures obtained through computational modelling. As a result, computational models are stored in undefined locations, in incompatible formats, and without essential metadata. Following recommendations from an international community workshop, we have developed an archive for computed macromolecular structures, ModelArchive (https://modelarchive.org), and an extension of the mmCIF data format to store model metadata. However, data standards and best practices are not yet established for complex computational models involving proteins, DNA, RNA and/or small molecules, for different conformational states of the same macromolecule, or for synthetic proteins constructed through design methodologies. With the technical infrastructure of ModelArchive now established, we are in a good position to further develop ORD practices in our community. This includes defining and promoting best practices for data and metadata standards, establishing deposition policies with publishers and funding agencies, improving the usefulness of protein models by linking them with accuracy estimates and accompanying metadata, and connecting to other ORD resources to make models easily findable, accessible and reusable.

Additional information

Acronym
ModelArchive
Start date
01.01.2023
End date
31.12.2024
Duration
24 Months
Funding sources
External partners
UniBS, UniL, EPFL, SIB
Status
Active
Category
swissuniversities / Open Research Data calls / Measure A1, Track B: Establish projects