Python for Chemists

Author(s): Kiyoto Aramis Tanemura, Diego Sierra-Costa, Kenneth M. Merz
Publication Date: August 23, 2022
Copyright © 2022 American Chemical Society
eISBN: 9780841299252
DOI: 10.1021/acsinfocus.7e5030
Read Time: six to seven hours
Collection: 1
Publisher: American Chemical Society

Programming in Python empowers chemists to apply their domain knowledge at scales unreachable by manual effort. Learning Python is easy, but framing chemical problems in Python is not always straightforward.

Readers of this primer develop the skill to identify problems in their research where code can automate operations and scale them to large volumes of data or calculation. In addition, the authors shorten the time from “learning” to “using” Python through meaningful problem sets in Chapter One.

Detailed Table of Contents
About the Series
Preface
Chapter 1. Begin Coding in Base Python
1.1 Introduction
1.2 Numerical Operations
1.2.1 Practice Problems
1.3 String Operations
1.3.1 Indexing
1.3.2 Primer Design for Polymerase Chain Reaction
1.3.3 Practice Problems
1.4 Functions
1.4.1 Practice Problems
1.5 Conditional Statements
1.5.1 Boolean Variables
1.5.2 Logic Gates
1.5.3 Practice Problems
1.6 Loops
1.6.1 For-Loop
1.6.1.1 List Comprehension
1.6.1.2 Iterables
1.6.2 While-Loop
1.6.3 Continue, Break, and Pass
1.6.4 Practice Problems
1.7 That’s a Wrap
1.8 Read These Next
Chapter 2. Data Analysis in Python
2.1 Introduction
2.2 Scientific Computing with NumPy
2.2.1 Reshaping
2.2.2 Indexing
2.2.3 Algebra
2.2.4 Application
2.3 Pandas for Data Analysis
2.3.1 Loading the Data
2.3.2 Extraction from Raw Data
2.3.3 Exploratory Data Analysis
2.3.4 Data Manipulation
2.3.4.1 Subsetting
2.3.4.2 Sorting
2.3.4.3 Merging
2.4 Seaborn for Visualization
2.5 That’s a Wrap
2.6 Read These Next
Chapter 3. Cheminformatics
3.1 Introduction
3.2 The SMILES and SMARTS Languages
3.2.1 SMILES
3.2.2 SMARTS
3.3 RDKit
3.4 Atoms and Bonds
3.5 Reactions
3.6 Inspecting a Database
3.7 Finding Substructures
3.8 Fingerprints
3.9 Molecular Similarity
3.10 That’s a Wrap
3.11 Read These Next
Chapter 4. Machine Learning on Chemical Data
4.1 Introduction
4.2 Background
4.2.1 Human β-Secretase 1
4.2.2 pIC50
4.3 Supervised Learning
4.3.1 Data Preparation
4.3.1.1 Load Data Set
4.3.1.2 Randomizing the Order of the Instances
4.3.1.3 Data Partitioning
4.3.1.4 Standardizing the Features
4.3.2 Regression of pIC50
4.3.2.1 Training the Model
4.3.2.2 Model Performance
4.3.2.3 Random Forest Regressor
4.3.3 Classification of BACE-1 Inhibitor/Noninhibitor
4.3.3.1 Logistic Regression Classifier
4.3.3.2 Random Forest Classifier
4.3.4 Further Discussion Items for Supervised Learning
4.3.4.1 k-fold Cross Validation
4.3.4.2 Hyperparameter Selection
4.3.4.3 Saving Your Work
4.3.4.4 Understanding Your Work
4.4 Unsupervised Learning
4.4.1 Dimensionality Reduction
4.4.2 Clustering
4.4.3 Anomaly (Outlier) Detection
4.5 That’s a Wrap
4.6 Read These Next
Chapter 5. Modeling Chemical Systems
5.1 Introduction
5.2 File Formats
5.3 Dynamic Modeling in SciPy
5.4 Atomic Simulation Environment for Standard Interface
5.4.1 The Atoms Object
5.4.2 Calculators
5.4.3 Geometry Optimization
5.5 Protein Structures with Biopython
5.5.1 File I/O
5.5.2 Navigating Protein Structure
5.5.3 Application
5.6 That’s a Wrap
5.7 Read These Next
Appendix A. Solutions to Practice Problems
Bibliography
Glossary
Index
Reviewer quotes
Alathea Davies, PhD student, Cornell University
Python for Chemists does a very good job of easing the reader into basic Python, data analysis techniques, and then into more advanced techniques with cheminformatics, machine learning, and modeling. In particular, the chapter Machine Learning on Chemical Data was incredibly helpful in explaining some of the basic concepts of machine learning and showing examples of the code itself. Similarly, the chapter Cheminformatics was beneficial for me as I will need to encode chemical descriptors of materials to perform machine learning, and I was otherwise unaware of some of the options for encoding this information.
Paul A. Craig, Professor of Biochemistry & Bioinformatics, Rochester Institute of Technology
This is a good collection of topics for introducing chemists to computing with Python: Introduction, Data Analysis, Cheminformatics, Machine Learning, and Modeling Chemical Systems. It is a good mixture of using numerical and string data types, as well as the application to several areas of chemistry.
Author Info
Kiyoto Aramis Tanemura
Kiyoto Aramis Tanemura is a Ph.D. student working in the research group of Prof. Kenneth M. Merz in the Department of Chemistry, Michigan State University. At the interface of computational chemistry and artificial intelligence, his research aims to develop methodologies to predict spectral properties of small organic molecules for high-throughput identification. He completed his B.A. at Kalamazoo College in Chemistry and Mathematics, with concentrations in Biological Chemistry and Molecular Biology as well as Biological Physics. He uses Python every day in all aspects of his research.
Diego Sierra-Costa
Diego Sierra-Costa is a doctoral candidate in the Department of Chemistry at Michigan State University under the supervision of Prof. Kenneth M. Merz. His research in mathematical artificial intelligence and chemistry focuses on developing new representations of small molecules for the prediction and calculation of physicochemical properties. Diego received his B.Sc. in Physics from the National Autonomous University of Mexico, where he focused on quantum optics and cold atoms. Photo credit: Delilah Pacheco
Kenneth M. Merz
Kenneth M. Merz, Jr. is the Joseph Zichis Chair in Chemistry and a University Distinguished Professor at Michigan State University. He is also the Editor-in-Chief of the ACS Journal of Chemical Information and Modeling. His research interests lie in the development of theoretical and computational tools and their application to biological problems, including structure- and ligand-based drug design, mechanistic enzymology, and methodological verification and validation. He has received several honors, including election as an ACS Fellow, the 2010 ACS Award for Computers in Chemical and Pharmaceutical Research, election as a fellow of the American Association for the Advancement of Science, and a John Simon Guggenheim Fellowship.