Internet Journal of Chemistry, 2000, 3, 1

MolBank: First Fully Web-Based Publication of Chemical Reaction Data of Individual Molecules with Structure Search and Submission

Shu-Kun Lin1* and Luc Patiny2

1Molecular Diversity Preservation International (MDPI), Saengergasse 25, CH-4054 Basel, Switzerland;
2Institute of Organic Chemistry, University Lausanne, ICO-BCH, 1015 Lausanne, Switzerland.

Keywords: Chemical information, chemical journal, MolBank, online journal, electronic publication, structure search, substructure search, online submission

Publication date: Jan 11 2000 14:19:00 GMT

Abstract:

Experimental data from organic synthesis and compound analysis are very precious chemical information. The monthly chemistry journal Molecules (http://www.mdpi.org/molecules/) started to publish a new column, MolBank, in April 1997. The intention is to preserve the large amount of experimental data on organic synthesis and structural characterization of individual molecules that has conventionally been unpublishable. Short notes are published as one paper of one page for one structure. We report our experience with the MolBank section and its recent improvements. Since August 1999, all published structures can be searched by substructure, in addition to text search, at the http://www.molbank.org server. Structure drawing and display are fully web-browser-based and require no additional software.
 

Introduction

Electronic journals [1] and electronic conferences [2] have been launched in recent years. Some secondary and tertiary literature services (e.g., Beilstein) also provide online substructure-searchable services.

Since August 1999, the MolBank section of Molecules (http://www.mdpi.org/molecules/, ISSN 1420-3049) has been available at the http://www.molbank.org server as a substructure-searchable electronic journal in chemistry. All published compounds can be readily searched on the website www.molbank.org by 2D substructure.
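To illustrate what a 2D substructure search means, the following is a minimal sketch in Python. It is not the algorithm used on the MolBank server, which is not described here. It assumes a toy model in which a molecule is a list of element symbols plus a set of bonded atom pairs, ignores bond orders and hydrogens, and uses simple backtracking. The names Molecule, bonded and has_substructure are hypothetical.

```python
from typing import List, Set, Tuple

# Toy model (an assumption for illustration): element symbols plus
# undirected bonds given as pairs of 0-based atom indices.
Molecule = Tuple[List[str], Set[Tuple[int, int]]]


def bonded(bonds: Set[Tuple[int, int]], a: int, b: int) -> bool:
    """Return True if atoms a and b are connected, in either direction."""
    return (a, b) in bonds or (b, a) in bonds


def has_substructure(query: Molecule, target: Molecule) -> bool:
    """Return True if every atom and bond of query can be mapped into target."""
    q_atoms, q_bonds = query
    t_atoms, t_bonds = target

    def extend(mapping: List[int]) -> bool:
        i = len(mapping)  # index of the next query atom to place
        if i == len(q_atoms):
            return True  # all query atoms have been placed
        for cand in range(len(t_atoms)):
            # A target atom may be used only once and must match the element.
            if cand in mapping or t_atoms[cand] != q_atoms[i]:
                continue
            # Each query bond to an already placed atom must exist in the target.
            if all(
                not bonded(q_bonds, i, j) or bonded(t_bonds, cand, mapping[j])
                for j in range(i)
            ):
                if extend(mapping + [cand]):
                    return True
        return False  # no candidate worked, so backtrack

    return extend([])


# Heavy-atom skeletons only (hydrogens omitted).
acetic_acid: Molecule = (["C", "C", "O", "O"], {(0, 1), (1, 2), (1, 3)})
carboxyl: Molecule = (["O", "C", "O"], {(0, 1), (1, 2)})
peroxide: Molecule = (["O", "O"], {(0, 1)})

print(has_substructure(carboxyl, acetic_acid))
print(has_substructure(peroxide, acetic_acid))
```

In this sketch the O-C-O query is found in the acetic acid skeleton, while the O-O query is not, because the two oxygen atoms of the target are not bonded to each other. A production search would also compare bond orders, charges and ring membership.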

In 1997, at the very beginning, we launched the MolBank section of the monthly chemistry journal Molecules, in which small pieces of experimental work on the preparation and structural characterization of individual compounds can be published rapidly and easily [3-6]. More than 125 MolBank short notes had been published by November 1999. The section is now ready to handle a very large number of short notes, with fully electronic online submission, refereeing and editor-author communication, as well as final production, at its own http://www.molbank.org server.

Many small but very useful experimental findings never find their way into the formal literature. This is a tremendous loss of useful chemical information. We believe that publishing the MolBank column in Molecules is a significant step towards preserving the information aspect of molecular diversity, that is, precious experimental findings, and an interesting experiment in electronic publishing. We would like to discuss it here and hope that more journals will join us in publishing experimental findings electronically.
 

The Loss of Chemical Information

Unlike scientists in other fields, chemists contribute a large volume of knowledge (chemical information) based on experimental data [7-9].

Many chemists working in industrial research and development have accumulated a large number of synthetic and structural characterization results. However, many of these results have never been published. This is understandable: if the chemists have done a good job and are well paid, both sides, the chemists and the companies, are satisfied. They do not need to publish anything.

Their interest in publishing is further discouraged by the conventional standard for chemical papers: they do not want to spend a lot of time painstakingly preparing the "introduction" and "discussion" sections required by conventional chemistry journals. Preparing these two sections is tricky. Many very good experimental works have been rejected because the authors failed to write good paragraphs for these sections. Conversely, a good writer can publish "experimental work" with virtually no experimental data presented: it is not rare in some prestigious chemistry journals to find papers without any experimental data except a statement such as "all the compounds give satisfactory elemental analysis (C,N,H), IR and NMR". Examples can be found in Chemical Communications and Tetrahedron Letters [10].

For similar reasons, chemists in universities face the same problem and are reluctant to publish their experimental findings in full, particularly when these are scattered, unassembled experimental data on individual compounds.

Therefore, most (we estimate at least two thirds) of the experimental data for organic molecules have never been published.

On the other hand, chemistry is a typical experimental science. Chemical knowledge as written in textbooks is based on experimental findings. Recording all these experimental findings as fully as possible would in any case be very useful [11].

Can we do something to change this situation of chemical information loss?
 

MolBank Section

Starting with volume 2 of the journal Molecules [3], the nonprofit international organization MDPI will provide, in addition to the sample deposit services, a service for the deposit of information (experimental data, particularly synthesis and spectroscopic data). Preserving and exploiting molecular diversity for both information and samples will continue to be the goal of the journal Molecules and the organization MDPI.

Authors are not required to prepare "introduction" and "discussion" parts; they need not explain why they wanted to prepare these compounds.

MolBank papers are short notes on synthetic work and structural characterization data. A paper can cover a single molecule (one structure) and can be as short as one page. Such publications in Molecules serve as an experimental data deposit.

Figure 1 shows a typical paper published in the MolBank section.

Figure 1. A MolBank short note in HTML format.

This project is of interest because a large volume of very precious chemical information, particularly very diverse work on synthesis and structural characterization, has never been published. Either the submitted papers were rejected by editors as too trivial (normally because the work belonged to the classical scope of chemistry, i.e., pure synthesis or pure spectroscopic measurement), or the chemists themselves never planned to publish such work on individual compounds and isolated data because they thought it unpublishable in a traditional journal. We believe that synthesis and structural elucidation of individual compounds are still the essence of chemistry and the material foundation of other research.

The ready publication of all scattered, unassembled data for individual compounds in Molecules as short posters will provide a bank in which chemists can deposit all of their synthesis and structural characterization information, together with sample availability information. Work for which no compound samples are available will also be published. When a large number of structures (say 1 million) has been published in this way and organized as a retrievable databank, it will be a very useful treasure for all chemists and related scientists. If every synthetic chemist contributes 100 such posters, 10,000 contributors would suffice, and this number will easily be reached within several years.
 
 

New Development

Two new MolBank features have been added recently. First, as Figure 2 shows, a paper can be submitted simply by filling out an online form.

Figure 2. Online submission of a paper for consideration and publication in the MolBank section.

Second, a chemist began serving as the MolBank Section Editor of Molecules (http://www.mdpi.org/molecules/editors.htm#molbank) in August 1999.
 

Discussion

Many colleagues may think that it is neither interesting nor worthwhile to publish a large volume of experimental findings on their own. However, there are several strong reasons to do so. Firstly, to the benefit of the authors, the priority of the synthetic work and the characterization of the compounds will be recorded in the literature as work performed by the author. Secondly, these publications will be greatly appreciated by other experimentalists: the published experimental results will be very useful to others preparing the same or similar compounds [11]. Finally, it should be our obligation and duty to preserve all experimental findings as completely as possible. If they are formally published and stored in a searchable database, they will be very convenient for new generations of chemists to use.

Thus, Molecules publishes in the MolBank section (http://www.mdpi.org/molbank) very short notes recording experimental data for individual molecules. Scattered, unassembled experimental data for individual compounds, which are conventionally not publishable, are particularly welcome. They are published as one paper of one page for one structure and given special page numbers (M1, M2, etc.). The notes have been published in HTML format, with at least a formula of the target molecule. An MDL MOL file is also included with every MolBank short note. All papers submitted for consideration and publication in the MolBank column have been refereed, and the accepted papers have been edited (English corrected and format unified). The related chemical samples are in most cases available, and the availability information is also published.
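As an illustration of what the MOL file attached to each short note contains, the following minimal Python sketch reads the atom and bond blocks of a V2000 MOL file. It assumes the fixed-column V2000 layout (three header lines, then a counts line, then one line per atom and one per bond). The sample record is hand-written for this example and is not taken from MolBank, and the names SAMPLE_MOLFILE and parse_molfile are hypothetical.

```python
from typing import List, Tuple

# Hand-written heavy-atom record for ethanol (illustrative only).
SAMPLE_MOLFILE = "\n".join([
    "ethanol (hypothetical example record)",
    "  hand-written",
    "",
    "  3  2  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    1.5000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    2.2000    1.2000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0  0  0  0",
    "  2  3  1  0  0  0  0",
    "M  END",
])


def parse_molfile(text: str) -> Tuple[List[str], List[Tuple[int, int, int]]]:
    """Return element symbols and (atom1, atom2, bond_type) tuples (1-based)."""
    lines = text.splitlines()
    counts = lines[3]  # fourth line: atom and bond counts
    n_atoms = int(counts[0:3])
    n_bonds = int(counts[3:6])

    # Atom block: x, y, z occupy 10 columns each, then the element symbol.
    atom_lines = lines[4:4 + n_atoms]
    atoms = [line[31:34].strip() for line in atom_lines]

    # Bond block: three 3-column integers (first atom, second atom, bond type).
    bond_lines = lines[4 + n_atoms:4 + n_atoms + n_bonds]
    bonds = [
        (int(line[0:3]), int(line[3:6]), int(line[6:9]))
        for line in bond_lines
    ]
    return atoms, bonds


atoms, bonds = parse_molfile(SAMPLE_MOLFILE)
print(atoms)
print(bonds)
```

The atom list and bond list obtained this way form the connection table on which a structure or substructure search operates. The sketch skips property lines such as charges and isotopes and performs no validation.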

So far all papers published in the MolBank section have been indexed and abstracted by several leading indexing and abstracting services, including Chemical Abstracts; CAPLUS; Science Citation Index Expanded; SciSearch, Research Alert; Chemistry Citation Index; Current Contents/Physical, Chemical & Earth Sciences.

We would say that this is only an interesting experiment.

Acknowledgments

The authors would like to thank Drs. Radek Marek, Milata Viktor, Norbert Haider, Margaret A. Brimble, N. Peerzada and Jin-Cong Zhuo for their kind support and encouragement.

This work was originally presented at the Third International Electronic Conference on Synthetic Organic Chemistry (ECSOC-3), www.mdpi.org/ecsoc-3.htm, September 1-30, 1999, and at the 218th ACS National Meeting, August 22-26, 1999.
 

References

1 Online-only journals monitored by CAS are available at http://info.cas.org/EO/ejourn2.html.
2 Examples are Rzepa, H. R. (Organizer); ECTOC-1, June 12-July 7, 1995 (http://www.ch.ic.ac.uk/ectoc/ectoc-1.html) and the other three ECTOC meetings in the series at the website. Substructure searching, similar in style to what is described here, was utilized in ECTOC-2.
3 The MolBank website is http://www.mdpi.org/molbank/
4 The MolBank server http://www.molbank.org/ has the MolBank section papers in searchable database form. The web-based submission and reviewing systems are also installed there. 
5 S.-K. Lin and L. Patiny
MolBank: Preservation and Publication of Chemical Reaction Data, presented at the 218th ACS National Meeting August 22-26, 1999. 
6 S.-K. Lin
http://www.unibas.ch/mdpi/ecsoc/f0001.htm
MolBank: Rapid and Easy Publication of Short Notes of Individual Molecules, presented at the First International Electronic Conference on Synthetic Organic Chemistry (ECSOC-1), www.mdpi.org/ecsoc/, September 1-30, 1997
7 S.-K. Lin
"A Good Yield and a High Standard"
Molecules 1996, 1, 1-2
http://www.mdpi.org/molecules/edito96.htm
8 S.-K. Lin
"Preserving and Exploiting Molecular Diversity: Deposit and Exchange of Chemical Information and Chemical Samples"
Molecules 1997, 2, 1-2
http://www.mdpi.org/molecules/edito97.htm
9 S.-K. Lin
"Chemical information"
Chemical and Engineering News May 26 1997, 4
10 The "high yield" standard is another questionable standard for many publishable synthetic works [see Reference 7]. As reasonably skillful synthetic organic experimentalists, we have found quite a number of papers claiming yields that could never be reproduced. The claimed yields are normally about 20% higher than what we find when we repeat the synthesis. Therefore, the chemistry journal Molecules does not set "high yield" as a standard for organic synthesis [see Reference 7], and authors are encouraged to be honest and to report reproducible yields.
11 As synthetic organic chemists, we may read carefully only the experimental section, sometimes making a Xerox copy of only the specific experimental paragraphs of a paper or a patent.