Using Machine Learning for Discovery on Earth, Mars and Beyond
“We are living in one of the most exciting and hyped eras of AI,” says Fei-Fei Li, Stanford University Professor and Chief Scientist of Artificial Intelligence and Machine Learning at Google Cloud. This statement certainly applies to machine learning as well. A search for the term “machine learning” on Google News reveals a daily onslaught of articles in the popular press. Consumers are just awakening to the fact that their Netflix queues and email spam filters use machine learning, and that critical tasks such as medical diagnosis and cybersecurity will inevitably rely on it as well. Though much of the world has only just discovered machine learning, researchers of all stripes have been using it for decades to enable scientific discovery. In this talk, I will give a brief introduction to machine learning and discuss its necessity for scientific exploration, providing examples from my own research of how machine learning solves problems on Earth, Mars and beyond. Specifically, I will discuss how machine learning can detect earthquake damage in high-resolution Earth observation imagery, find scientifically interesting anomalies in images taken by Mars rovers, and discover distant supernovae in images from ground-based optical telescopes.
Technical Talk: Robust Machine Learning Systems for Astronomical Discovery
Researchers have been using machine learning for decades to enable scientific discovery. Recent advances have produced dramatic leaps in performance on benchmark datasets, particularly in computer vision, and have generated a surge of interest. As a new generation of researchers embraces data-driven approaches and the latest techniques, it is important to remember that machine learning is still subject to certain fundamental assumptions, one of which is that a model performs well only when the data it sees is similar to the data on which it was trained. In this talk, we discuss machine learning systems in operation at the intermediate Palomar Transient Factory (iPTF), an astronomical sky survey that completed operations in February 2017. iPTF was a fully automated, wide-field survey for the systematic exploration of the optical transient sky. Approximately one million objects were detected each night, of which only a tiny fraction were true astronomical sources of scientific interest. Because manual vetting at iPTF data rates was impossible, iPTF relied on machine learning to automate the rejection of bogus imaging artifacts and to prioritize real astronomical sources for review by its science team. This talk describes three machine learning systems developed by JPL that helped enable science operations at iPTF. In developing and maintaining these systems, we encountered many challenges that degraded system performance, such as inaccurate annotations of the data, as well as survey changes and software upgrades that altered data characteristics. We discuss these challenges and the use of advanced techniques that can make future systems robust over a survey’s lifetime by monitoring the conditions that cause sub-optimal performance.
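To make the train/test similarity assumption concrete, below is a minimal sketch in Python (not the actual iPTF pipeline) of the kind of workflow the abstract describes: a real/bogus classifier ranks nightly candidates for human review, and a simple per-feature statistical test flags when incoming data no longer resembles the training data. All data, features, and thresholds here are illustrative assumptions, not details of the JPL systems.

    # Minimal sketch, not the iPTF pipeline: train a real/bogus classifier on
    # candidate-detection features, then monitor each night's batch for drift
    # that would violate the train/test similarity assumption discussed above.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from scipy.stats import ks_2samp

    rng = np.random.default_rng(0)

    # Hypothetical training set: rows are candidate detections, columns are
    # summary features; labels are 1 = real source, 0 = bogus imaging artifact.
    X_train = rng.normal(size=(5000, 8))
    y_train = (X_train[:, 0] + 0.5 * X_train[:, 1]
               + rng.normal(scale=0.5, size=5000) > 0).astype(int)

    clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

    def vet_nightly_batch(X_night, shift_pvalue=1e-3):
        """Score a night's candidates and flag features whose distribution drifted.

        A per-feature Kolmogorov-Smirnov test against the training data is a
        simple stand-in for the monitoring the talk describes: a tiny p-value
        suggests a survey change or software upgrade has altered the data and
        the model may now be operating outside its training distribution.
        """
        scores = clf.predict_proba(X_night)[:, 1]  # probability each candidate is real
        drifted = [j for j in range(X_train.shape[1])
                   if ks_2samp(X_train[:, j], X_night[:, j]).pvalue < shift_pvalue]
        return scores, drifted

    # Simulate a night whose first feature has shifted (e.g. after a camera upgrade).
    X_night = rng.normal(size=(1000, 8))
    X_night[:, 0] += 1.5
    scores, drifted = vet_nightly_batch(X_night)
    print("top candidates for review:", np.argsort(-scores)[:5])
    print("features flagged for drift:", drifted)

In practice the ranking step keeps human vetting tractable at a million detections per night, while the drift check signals when the model should be retrained rather than trusted blindly.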
Bio:
Dr. Umaa Rebbapragada is a Data Scientist in the Machine Learning and Instrument Autonomy Group at NASA’s Jet Propulsion Laboratory. Her work centers on infusing state-of-the-art techniques from the academic machine learning community into large-scale data science systems, with a current focus on astronomical and Earth-observing imaging systems.