2006-01-03

Back to What I Do Best

Today I put personal, philosophical, and aesthetic/literary speculation aside for a moment to concentrate on my preferred activity, which is invention. And by that I mean invention in the Thomas Edison sense. My friend Billy once told me that he could see me making a living as an old-school, bowtie-wearing "inventor," and many of my lifescripts involve coming up with some clever new product or process and starting up a company to exploit it commercially. Today I have two ideas that came to me while reviewing my sophomore organic chemistry text in preparation for the Spring qualifying exam in the UT O-chem division.

The first is a computerized reaction-predicting expert system incorporating a large neural-net architecture and trained using the CAS reaction database. One of the foremost marketable skills of an accomplished chemist is his or her ability to make better guesses than most folk about what will happen chemically when particular substances are combined under particular conditions. This ability accrues from long years of experience performing and studying chemical reactions and from the judicious application of analogical reasoning. A neural net is a computer system which imitates in a data structure the connectivity of animal neurons in a brain, and it has proven useful--just like a human brain--in many complex pattern-recognition problems.

At UT, for example, departmental chemists are developing an artificial chemical analysis system that imitates the human sense of taste, mostly in that it uses a neural net and must be trained, like a real brain, to recognize certain chemical species by their "flavor." Basically, the responses of a large number of colorimetric chemical probes are combined into a single raster image, with each pixel representing the colorimetric response of a particular probe. The neural net "looks" at the complex picture that results and, during the training process, learns to associate particular patterns with particular analytes; subsequently it is able to identify solutions containing the same or similar analytes. Research is ongoing to refine the system into a man-portable "electronic tongue" that could be used to qualitatively identify all kinds of chemical mixtures in real-world applications. An interesting result of the neural-net pattern recognition process is that IT DOES NOT MATTER EXACTLY WHAT EACH CHEMICAL PROBE IS RESPONDING TO, only that there are a lot of them and that they respond in different ways. Thus the designers, builders, and operators never need to know if the color changes are happening as a result of pH or hydrophobic interactions or enzymatic complexing or any other conceivable chemical process--as long as there are a sufficient number of independently-responding probe channels the resulting patterns can still be diagnostic of particular analytes.
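
To make the probe-array idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration, not the UT group's actual method: the three analyte names, the three hidden "properties," the 24 probes with random sensitivities, the noise level, and the use of a single-layer softmax network as the simplest possible neural net. The point it tries to show is the one in capitals above: the classifier only ever sees the probe colors, never what each probe is chemically responding to.

```python
import math
import random

# Hypothetical analytes, each described by hidden "chemical properties".
# The classifier never sees these numbers, only the probe colors.
ANALYTES = {
    "analyte_A": (0.9, 0.1, 0.2),
    "analyte_B": (0.2, 0.8, 0.3),
    "analyte_C": (0.3, 0.2, 0.9),
}
LABELS = sorted(ANALYTES)
N_PROBES = 24


def make_probes(rng):
    """Give each probe an arbitrary, unknown sensitivity to each property."""
    return [[rng.uniform(-3.0, 3.0) for _ in range(3)] for _ in range(N_PROBES)]


def read_array(probes, analyte, rng, noise=0.02):
    """Return one 'image': a noisy color value per probe, roughly in [0, 1]."""
    props = ANALYTES[analyte]
    image = []
    for weights in probes:
        drive = sum(w * p for w, p in zip(weights, props))
        image.append(1.0 / (1.0 + math.exp(-drive)) + rng.gauss(0.0, noise))
    return image


def scores(net, image):
    """Linear score per label; the last weight in each row is a bias."""
    return [sum(w * x for w, x in zip(row, image)) + row[-1] for row in net]


def train(samples, epochs=200, rate=0.1):
    """Fit a single-layer softmax network by stochastic gradient descent."""
    net = [[0.0] * (N_PROBES + 1) for _ in LABELS]
    for _ in range(epochs):
        for image, label in samples:
            s = scores(net, image)
            top = max(s)
            exps = [math.exp(v - top) for v in s]
            total = sum(exps)
            for k, row in enumerate(net):
                # Gradient of cross-entropy: predicted probability minus target.
                err = exps[k] / total - (1.0 if LABELS[k] == label else 0.0)
                for j, x in enumerate(image):
                    row[j] -= rate * err * x
                row[-1] -= rate * err
    return net


def identify(net, image):
    """Name the analyte whose score is highest for this image."""
    s = scores(net, image)
    return LABELS[s.index(max(s))]


def run_demo(seed=0):
    """Train on labeled readings, then return accuracy on fresh readings."""
    rng = random.Random(seed)
    probes = make_probes(rng)
    training = [(read_array(probes, a, rng), a) for a in LABELS for _ in range(20)]
    rng.shuffle(training)
    net = train(training)
    fresh = [(read_array(probes, a, rng), a) for a in LABELS for _ in range(10)]
    hits = sum(identify(net, image) == a for image, a in fresh)
    return hits / len(fresh)
```

Calling `run_demo()` returns the fraction of fresh readings identified correctly. Note that `train` and `identify` never touch `ANALYTES` or the probe weights; swapping in a completely different set of random probes changes the images but not the procedure.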

I propose to use the same technology to predict what will happen in a chemical system containing particular substances under particular conditions. The user inputs the chemical species present and the reaction conditions--including pressure and temperature ramps--and the system makes qualitative and quantitative predictions as to the outcome. It does this not by simulation or by theory-based calculations, but by pure neural-net pattern recognition, drawing on extensive training from a database of known reactions. Since the introduction of computerized information storage and retrieval in chemistry, the Chemical Abstracts Service (CAS) has been assembling a large electronic database of experimentally-proven reactions; today this database contains tens of millions of known reactions including products, conditions, and yields, all already stored in an electronic format designed to be machine-parsable. So the software I propose would simply build an enormous virtual neural-net on a computer's hard disk (as large and complex a net as can be reasonably constructed given the present state of the computational art), and then would automatically parse the entire CAS reaction database and use it to train the neural net. Subsequently the system's predictions would be tested against the outcome of real chemical reactions which were not part of the training set. Whether initially successful or not, the system could be designed to automatically familiarize itself with new reactions as the CAS reaction database was updated.

Sooner or later in the course of technological history, depending on the rate of development of computational power and on the rate of accumulation of chemical knowledge in the CAS database, the system *will* begin to make practically useful predictions. My own intuition is that both contributing factors are already sufficiently advanced to allow useful predictions to be made given the present-day condition of technology, but of course only actual development and testing of the system will tell for certain. In fact, I would be surprised if such a system is not already in development/operation. If anyone who reads this knows of such an effort, I would love to hear about it.
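
The train-then-test-then-update loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields (`id`, `reactants`, `conditions`, `product`) and the `rxn-` IDs are invented here and are not the CAS format, the 10% held-out fraction is arbitrary, and a plain lookup table (`LookupModel`) stands in for the neural net so the bookkeeping stays visible. The one design choice worth showing is that each reaction is assigned to the training or held-out set by a stable hash of its ID, so a reaction reserved for testing stays reserved no matter how many times the database is updated.

```python
import hashlib

# Hypothetical miniature records. The field names and IDs are invented for
# this sketch and are not the format of any real reaction database.
RECORDS = [
    {"id": "rxn-0001", "reactants": ["ethanol", "acetic acid"],
     "conditions": ["H2SO4", "heat"], "product": "ethyl acetate"},
    {"id": "rxn-0002", "reactants": ["cyclohexene", "H2"],
     "conditions": ["Pd/C"], "product": "cyclohexane"},
    {"id": "rxn-0003", "reactants": ["cyclohexene", "Br2"],
     "conditions": [], "product": "trans-1,2-dibromocyclohexane"},
    {"id": "rxn-0004", "reactants": ["acetone", "NaBH4"],
     "conditions": ["methanol"], "product": "2-propanol"},
]


def is_held_out(record_id, test_percent=10):
    """Decide from a stable hash of the ID whether a record is test-only."""
    digest = hashlib.sha256(record_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100 < test_percent


def split(records):
    """Partition records into (training, held-out) lists."""
    train, test = [], []
    for rec in records:
        (test if is_held_out(rec["id"]) else train).append(rec)
    return train, test


def input_key(reactants, conditions):
    """Order-independent key for a set of reactants and conditions."""
    return (frozenset(reactants), frozenset(conditions))


class LookupModel:
    """Stand-in for the neural net: remembers the product seen per input."""

    def __init__(self):
        self.table = {}

    def update(self, records):
        """Learn from new records, skipping any that are held out."""
        train, _ = split(records)
        for rec in train:
            key = input_key(rec["reactants"], rec["conditions"])
            self.table[key] = rec["product"]

    def predict(self, reactants, conditions):
        """Return the remembered product, or None for an unseen input."""
        return self.table.get(input_key(reactants, conditions))


def evaluate(model, records):
    """Fraction of held-out reactions predicted exactly, or None if none."""
    _, test = split(records)
    if not test:
        return None
    hits = sum(
        model.predict(r["reactants"], r["conditions"]) == r["product"]
        for r in test
    )
    return hits / len(test)
```

In use, `model.update(RECORDS)` would run once for the initial training pass and again with each new batch of records, with `evaluate(model, all_records)` called after each pass. A pure lookup like this cannot say anything about an input it has never seen, so it will usually score near zero on the held-out reactions; closing that gap by generalizing from similar reactions is exactly the job the neural net is supposed to do.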
