Cognitive Class Uses Machine Learning to Help SETI Find Little Green Men
Posted on June 27, 2017 by Joseph Santarcangelo
This month the team at CognitiveClass.ai was at Galvanize San Francisco with Adam Cox and Patrick Titzler, running a code challenge that will help SETI (the Search for Extraterrestrial Intelligence) look for aliens.
The goal of the event was to help the SETI Institute develop new signal classification algorithms and models that can aid in the radio signal detection efforts at the SETI Institute’s Allen Telescope Array (ATA) in Northern California.
Our chief data scientist, Saeed Aghabozorgi, developed several Jupyter notebooks, including one that transforms the signals into spectrograms using a Spark cluster. In addition, Saeed provided several TensorFlow notebooks, one of which used a convolutional neural network to classify the spectrograms. Check out the GitHub page to see all the scripts from Saeed Aghabozorgi, Adam Cox, and Patrick Titzler.
Our developer Daniel Rudnitski developed a scoreboard that evaluates everyone’s algorithms. The scoreboard works by comparing the predicted results with the true labels of a holdout set whose labels the participants did not know (shown in Figure 1). I gave a tutorial on neural networks and TensorFlow, helped the participants debug their code, and enjoyed the free food.
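The idea behind the scoreboard, comparing submitted predictions with hidden holdout labels, can be sketched in a few lines of Python. This is an illustration only: the post does not say which metric the scoreboard used, so plain classification accuracy is assumed here, and the function name, signal IDs, and class names are all hypothetical.

```python
def score_submission(predictions, true_labels):
    """Return the fraction of holdout signals whose predicted class is correct.

    Both arguments map a signal ID to a class name. Accuracy is an assumed
    metric; the actual scoreboard may have used a different one.
    """
    if not true_labels:
        raise ValueError("holdout set is empty")
    # A signal missing from the submission counts as a wrong answer.
    correct = sum(
        1 for signal_id, label in true_labels.items()
        if predictions.get(signal_id) == label
    )
    return correct / len(true_labels)


# Hypothetical holdout labels (hidden from participants) and one submission.
holdout = {"sig-001": "narrowband", "sig-002": "noise",
           "sig-003": "narrowband", "sig-004": "noise"}
submission = {"sig-001": "narrowband", "sig-002": "narrowband",
              "sig-003": "narrowband"}

accuracy = score_submission(submission, holdout)
print(f"accuracy = {accuracy:.2f}")
```

Because the labels stay on the scoring side, participants only ever see the resulting score, which keeps them from tuning their models directly against the holdout answers.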
Figure 1: Cognitive Class’ leaderboard used to assess results of Hackathons
SETI searches for E.T. by scanning star systems with known exoplanets. The idea is that nature does not produce sine waves, so the system looks for narrow-band carrier waves such as sine waves. The detection system sometimes triggers on signals that are not narrow-band. The goal of the event was to classify these signals accurately in real time, allowing the signal detection system to make better-informed observational decisions.
We transformed the observed time-series radio signals into spectrograms. A spectrogram is a 2-dimensional chart that represents how the power of the signal is distributed over time and frequency. An example is shown in Figure 2. The top chart is a spectrogram in which bright green represents high intensity values and blue represents low intensity values. The bottom chart contains two amplitude-modulated signals labeled A and B. The two brightly colored patches in the spectrogram directly above the signals represent the distribution of the signal energy in time and frequency. The horizontal axis represents time, while the vertical axis represents frequency. Signal A oscillates at a much lower rate than signal B, meaning that it has a much lower frequency. This shows up as a much lower position of its energy on the vertical axis of the spectrogram.
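To make the idea concrete, here is a minimal sketch of a spectrogram computed with a short-time Fourier transform. It is not the notebook code used at the event: the signal parameters, window size, and function names are assumptions chosen for illustration, and it uses only the Python standard library (in practice an FFT from a numerical library would replace the direct DFT).

```python
import cmath
import math


def gaussian_pulse(n_samples, center, width, freq):
    """Cosine at `freq` (cycles per sample) with a Gaussian amplitude envelope."""
    return [
        math.exp(-0.5 * ((n - center) / width) ** 2) * math.cos(2 * math.pi * freq * n)
        for n in range(n_samples)
    ]


def spectrogram(signal, window_size=64, hop=32):
    """Short-time power spectrum: rows are time frames, columns are frequency bins."""
    # A Hann window tapers each frame to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / window_size)
              for n in range(window_size)]
    n_bins = window_size // 2 + 1  # non-negative frequencies only (real input)
    frames = []
    for start in range(0, len(signal) - window_size + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(window_size)]
        power = []
        for k in range(n_bins):
            # Direct DFT of the frame at bin k.
            coeff = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / window_size)
                for n in range(window_size)
            )
            power.append(abs(coeff) ** 2)
        frames.append(power)
    return frames


# Signal A: low frequency, early in time. Signal B: higher frequency, later.
n_samples = 512
signal_a = gaussian_pulse(n_samples, center=128, width=30, freq=4 / 64)
signal_b = gaussian_pulse(n_samples, center=384, width=30, freq=16 / 64)
combined = [a + b for a, b in zip(signal_a, signal_b)]

spec = spectrogram(combined)

# Find the strongest frequency bin in the frame centred on each pulse.
frame_a = 128 // 32 - 1  # frame whose window is centred on sample 128
frame_b = 384 // 32 - 1  # frame whose window is centred on sample 384
peak_a = max(range(len(spec[frame_a])), key=lambda k: spec[frame_a][k])
peak_b = max(range(len(spec[frame_b])), key=lambda k: spec[frame_b][k])
print("peak bin for A:", peak_a, " peak bin for B:", peak_b)
```

The energy of the slower signal A lands in a lower frequency bin than that of signal B, and each pulse appears only in the time frames around its centre, which is the two-patch picture described above.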
Fig 2: Spectrogram (top) of two amplitude modulated Gaussian signals (bottom)
The 2D representation provided by the spectrogram turns the problem into a visual recognition problem, which allows us to apply methods such as convolutional neural networks. Individuals without expertise in designing and implementing deep neural networks could focus on the signal processing problem and let the IBM Watson Visual Recognition tool handle the complex problem of image classification. The process is demonstrated in Figure 3 with a chirp signal (a signal whose frequency increases or decreases over time). After the spectrogram is computed, several convolutional layers extract features from the image; the output is then flattened and fed as input to a fully connected neural network. To learn more about deep learning, check out our Deep Learning 101 and Deep Learning with TensorFlow courses.
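The pipeline of convolution, flattening, and a fully connected layer can be sketched as a single forward pass. This is a toy illustration, not the architecture used at the event: the 8x8 "chirp" image, the two hand-picked kernels, the two output classes, and the random untrained weights are all assumptions, and a real model would learn its filters and weights in a framework such as TensorFlow.

```python
import math
import random


def conv2d(image, kernel):
    """Valid 2-D cross-correlation of one image with one kernel (no padding, stride 1)."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k_rows)
                for j in range(k_cols)
            )
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


def relu(feature_map):
    """Element-wise rectifier: negative responses are set to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def flatten(feature_maps):
    """Concatenate all feature maps into one flat feature vector."""
    return [v for fmap in feature_maps for row in fmap for v in row]


def dense_softmax(inputs, weights, biases):
    """Fully connected layer followed by softmax over the classes."""
    logits = [sum(w * x for w, x in zip(row, inputs)) + b
              for row, b in zip(weights, biases)]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 8x8 "spectrogram" of a chirp: energy moves one frequency bin per time step.
size = 8
image = [[1.0 if row == col else 0.0 for col in range(size)] for row in range(size)]

# Two hand-picked 3x3 kernels standing in for learned convolutional filters.
kernels = [
    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],  # responds to this chirp direction
    [[0, 0, 1], [0, 1, 0], [1, 0, 0]],  # responds to the opposite direction
]

feature_maps = [relu(conv2d(image, k)) for k in kernels]
features = flatten(feature_maps)

# Untrained, randomly initialised dense layer with two hypothetical classes.
rng = random.Random(0)
n_classes = 2
weights = [[rng.uniform(-0.1, 0.1) for _ in features] for _ in range(n_classes)]
biases = [0.0] * n_classes

probabilities = dense_softmax(features, weights, biases)
print("features:", len(features), "class probabilities:", probabilities)
```

The kernel aligned with the chirp produces a much stronger response than the one aligned with the opposite direction, which is the kind of feature a trained convolutional layer would pass on to the fully connected part of the network.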
Figure 3: Example architecture used in the event. (Source: Wikipedia)
To speed up the development and testing of these neural networks, participants were given access to GPUs on IBM PowerAI Deep Learning. Built on IBM’s Power Systems, PowerAI is a scalable software platform that uses GPUs to accelerate deep learning and AI for individual users or enterprises. The PowerAI platform supports popular machine learning libraries and was provided through the public cloud provider NIMBIX. Participants used libraries such as Caffe, Theano, Torch, and TensorFlow. In addition, given the vast amount of data to process, participants were also given access to an IBM Apache Spark Enterprise cluster. For example, the spectrograms were calculated on several nodes, as shown in Figure 4.
Figure 4: Example architecture used in the event.
The top team was Magic AI. This team used a wide neural net, a network that has fewer layers than a deep network but more neurons per layer. According to Jerry Zhang, a Graduate Researcher at UC Berkeley Radio Astronomy Lab, the spectrograms exhibited less complex shapes than standard images like those in the Modified National Institute of Standards and Technology database (MNIST); as a result, fewer convolutional layers were required to encode features like edges. We can see this in Figure 5: the left image shows 5 spectrograms and the right image shows 5 images from MNIST. The spectrograms are colored using the standard gray scale, where white represents the largest values and black represents the lowest values. The edges in the spectrograms are predominantly vertical and straight, while the digits exhibit horizontal lines, parallel lines, arches and circles.
Figure 5: Five spectrograms (left) and five images from MNIST (right)
The Best Signal Processing Solution was by the Benders. They applied a method for detecting earthquakes to improve signal processing. Arun Kumar Ramamoorthy, one of the members, also made an interesting discovery while plotting out some of the data points. Check out their blog post here.
The prize for Best Non Neural Network/Watson solution went to team Explorers, and Most Interesting went to team Signy McSigFace. The trophies are shown in Figure 6.
Figure 6: Custom trophies designed for winners of this hackathon.
The weekend was quite interesting, with talks from Dr. Jill Tarter, Dr. Gerry Harp, and Jon Richards about SETI, the radio data processing and operations. They were also available to answer questions from participants. Kyle Buckingham gave a talk about the radio telescope he built in his backyard! Everyone who participated is shown in the image below.
Figure 7: SETI Hackathon participants
Check out the event GitHub page: https://github.com/setiQuest/ML4SETI/
For more information on SETI, please check out: https://www.seti.org/
To donate to SETI: https://www.seti.org/donate/astrobiology-sb
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. “ImageNet classification with deep convolutional neural networks.” Advances in Neural Information Processing Systems. 2012.
 Cohen, Leon. “Time-frequency distributions-a review.” Proceedings of the IEEE 77.7 (1989): 941-981.