Collecting, Retrieving, and Estimating Reliable Thermochemical Data
ThermInfo is an information system that gives users easy access to structural data and critically evaluated experimental thermochemical data for organic compounds. Unfortunately, not all accurate data are readily available in the literature, and there is not enough time, money, equipment, expertise, or people to measure every property in the laboratory or to collect and critically evaluate the results. The number of known chemical compounds is growing exponentially, so analyzing them and predicting their properties automatically demands ever more computing power, because measuring every property in an analytical laboratory is often not feasible. Prediction methods are therefore essential for estimating data for compounds that have not yet been studied experimentally. Using “chemically intelligent” software, ThermInfo lets users obtain the value of a thermochemical property, such as the gas-phase standard enthalpy of formation, by entering, for example, the molecular structure or the name of a compound. These properties are critical to the manufacture of any compound and have considerable economic importance. A variety of empirical methods for estimating new values will be implemented. These prediction methods will be selected on the basis of their reliability and will cover a wide range of organic, inorganic, and organometallic molecules, both long-lived and transient, in the gas and condensed phases. Although these methods are considered accurate, their use is still limited because they require parameters that have not yet been estimated or measured experimentally.
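As a rough illustration of how an additive empirical method works, the sketch below sums per-bond contributions to estimate an enthalpy of formation. It is not the ThermInfo implementation: the function name, the bond labels, and above all the numeric contributions are hypothetical placeholders invented for this example and are not published parameters. It also shows the limitation noted above, since the estimate fails as soon as one required parameter is missing.

```python
# Placeholder contributions in kJ/mol, invented for illustration only.
# They are NOT published parameters of any real additivity scheme.
HYPOTHETICAL_PARAMS = {
    "C-C": 20.0,
    "C-H_primary": -15.0,
    "C-H_secondary": -14.0,
}


def estimate_enthalpy(bond_counts, params):
    """Sum count * contribution over all bond types in the molecule.

    Raises KeyError when a bond type has no parameter, which mirrors the
    practical limit of additive schemes: no parameter, no estimate.
    """
    missing = sorted(k for k in bond_counts if k not in params)
    if missing:
        raise KeyError(f"no parameter available for: {missing}")
    return sum(n * params[k] for k, n in bond_counts.items())


# Propane (CH3-CH2-CH3): 2 C-C bonds, 6 primary C-H, 2 secondary C-H.
propane = {"C-C": 2, "C-H_primary": 6, "C-H_secondary": 2}
print(estimate_enthalpy(propane, HYPOTHETICAL_PARAMS))
```

With real, critically evaluated parameters in place of the placeholders, the same summation would yield an actual estimate; the structure of the calculation is the only thing this sketch is meant to convey.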
This work aims to improve the prediction of thermochemical properties by applying data mining and machine learning methods to complex, non-homogeneous data (chemical structures) in large information repositories. The knowledge obtained during the project will be published through a user-friendly interface, making it a valuable resource for chemical engineers and researchers.
ThermInfo is developing an Information System for collecting and retrieving thermochemical properties obtained from critically evaluated experimental data and several estimation methods.
- Chemical Information System
- Thermochemical Properties
- Thermochemical Properties Prediction
Available at: http://therminfo.lasige.di.fc.ul.pt
- Quick Search: allows searching compounds by name (IUPAC name or synonyms), molecular formula, molecular ID, CASRN or SMILES.
- Advanced Search: provides multiple search fields that allow limiting the search results to specific compound characteristics.
- Structural Search: allows searching for compounds that are identical or similar to the drawn chemical structure. The degree of similarity (based on the Tanimoto coefficient) can be selected.
- Substructure Search: allows searching for compounds that contain the drawn chemical structure.
- 2D Structure/SMILES to Image: allows converting a given SMILES or a drawn chemical structure to a chemical structure image (PNG or PDF format).
- Properties Prediction (based on the ELBA method): allows predicting some thermochemical properties for a given SMILES or a drawn chemical structure. Currently, it is only available for hydrocarbons.
- Register an account/Login System
- Insert Data: allows inserting new organic compounds into the database. Users are encouraged to use this feature and help enlarge the database.
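The similarity threshold used by the Structural Search can be illustrated with a minimal sketch of the Tanimoto coefficient, computed as the number of shared features divided by the number of features present in either compound. The fingerprints here are plain sets of integer feature IDs, and the function names, the compound labels, and the threshold are assumptions made for this example; the source does not say how ThermInfo builds its fingerprints.

```python
def tanimoto(a, b):
    """Tanimoto coefficient |A & B| / |A | B| for two feature sets."""
    if not a and not b:
        # Convention chosen for this sketch: two empty fingerprints
        # are treated as identical.
        return 1.0
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def similar_compounds(query, library, threshold=0.7):
    """Return (name, score) pairs with score >= threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical fingerprints: each integer stands for one structural feature.
query = {1, 2, 3, 4}
library = {
    "compound_A": {1, 2, 3, 4},  # identical feature set
    "compound_B": {1, 2, 3, 5},  # shares 3 of 5 distinct features
    "compound_C": {7, 8},        # nothing in common
}

for name, score in similar_compounds(query, library, threshold=0.5):
    print(name, round(score, 2))
```

Raising the threshold towards 1.0 narrows the results to near-identical structures, while lowering it admits more distant analogues.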
- André Falcão
- Francisco Couto
- Ana Teixeira
- José A. Martinho Simões
- João P. Leal
- Rui C. Santos
- Rony Reis
- Period: 2010 to 2014
- SFRH/BD/64487/2009, Doctoral research scholarship for Ana Teixeira
- João P. Leal, Additive methods for prediction of thermochemical properties. The Laidler method revisited. 1. Hydrocarbons. J. Phys. Chem. Ref. Data 2006, 35, 55-76. (doi: 10.1063/1.1996609)
- Rui C. Santos, João P. Leal, José A. Martinho Simões, Additivity methods for prediction of thermochemical properties. The Laidler method revisited. 2. Hydrocarbons including substituted cyclic compounds. J. Chem. Thermodyn. 2009, 41, 1356-1373. (doi: 10.1016/j.jct.2009.06.013)
Ana L. Teixeira, Andre O. Falcao: A non-contiguous atom matching structural similarity function. Journal of Chemical Information and Modeling.
Ana L. Teixeira, João P. Leal, Andre O Falcao, Improving QSPR models for predicting standard enthalpy of formation with a hybrid approach for feature selection (as a poster). CQB - Day 2013, p. 70, Faculdade de Ciências da Universidade de Lisboa, Portugal, July 2013.
Ana L. Teixeira, João P Leal, Andre O Falcao 2013: Random forests for feature selection in QSPR Models - an application for predicting standard enthalpy of formation of hydrocarbons. Journal of Cheminformatics.
Ana L. Teixeira, Rui C. Santos, João P. Leal, José A. Martinho Simões, Andre O. Falcao, ThermInfo: Collecting, Retrieving, and Estimating Reliable Thermochemical Data. Technical Report, LaSIGE, Department of Informatics, Faculty of Sciences, University of Lisbon, January 2013.
Rony Reis, ThermInfo 2.0 - Estruturação e concretização de um sistema de informação para propriedades químicas. Master Thesis, Faculty of Sciences of the University of Lisbon, December 2012.
R.C. Santos, Ana L. Teixeira, J.P. Leal, Andre O. Falcao, J.A. Martinho Simões, ThermInfo: Collecting, Retrieving, and Estimating Reliable Thermochemical Data. Poster presented at XXII Encontro Nacional da Sociedade Portuguesa de Química, Braga, July 2011.
Ana L. Teixeira, ThermInfo: Sistema de Informação para Coligir e Apresentar Propriedades Termoquímicas. Master Thesis, Faculty of Sciences of the University of Lisbon, July 2009.
Ana L. Teixeira, Rui C Santos, Francisco Couto, ThermInfo: Collecting and Presenting Thermochemical Properties. INForum 2009.