According to Science – Ars Technica
Machine learning has returned with a vengeance. I still remember the dark days of the late ’80s and ’90s, when it was pretty clear that the machine-learning algorithms of that generation didn’t seem to actually learn much of anything. Then big data arrived: computers became chess geniuses, conquered Go (twice), and started recommending sentences to judges. In most of these cases, the computer had sucked up vast reams of data and created models based on the correlations in the data.
But this won’t work when there aren’t vast amounts of data available. It seems that quantum machine learning might provide an advantage here, as a recent paper on searching for Higgs bosons in particle physics data seems to hint.
Learning from big data
In the case of chess, and the first edition of the Go-conquering algorithm, the computer wasn’t just presented with the rules of the game. Instead, it was given the rules and all the data that the researchers could find. I’ll annoy every expert in the field by saying that the computer essentially correlated board arrangements and moves with future success. Of course, it isn’t nearly that simple, but the key was in having a lot of examples to build a model and a decision tree that would let the computer decide on a move.
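To make the “correlate board arrangements with future success” idea concrete, here is a toy sketch of the simplest possible version: pick the single position feature that best predicts winning. This is nothing like the actual chess or Go systems; the feature names and game records below are invented purely for illustration.

```python
# Toy illustration of correlating position features with outcomes.
# Feature names and game data are invented for this sketch.

def best_stump(examples):
    """Pick the yes/no feature whose presence best predicts winning.

    examples: list of (features: dict[str, bool], won: bool)
    Returns (feature_name, accuracy_of_that_split).
    """
    feature_names = examples[0][0].keys()
    best = None
    for name in feature_names:
        # Predict "win" whenever the feature is present, then count
        # how often that prediction matches the recorded outcome.
        correct = sum(feats[name] == won for feats, won in examples)
        acc = correct / len(examples)
        if best is None or acc > best[1]:
            best = (name, acc)
    return best

# Invented game records: does controlling the center track with winning?
games = [
    ({"controls_center": True,  "early_capture": False}, True),
    ({"controls_center": True,  "early_capture": True},  True),
    ({"controls_center": False, "early_capture": True},  False),
    ({"controls_center": False, "early_capture": False}, False),
    ({"controls_center": True,  "early_capture": False}, True),
    ({"controls_center": False, "early_capture": True},  False),
]

feature, accuracy = best_stump(games)
print(feature, accuracy)  # → controls_center 1.0
```

A real system stacks many such splits into a deep tree (or a neural network) and needs millions of examples before the correlations become reliable, which is exactly why the data volume matters.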
In the most recent edition of the Go algorithm, this was still true. In that case, though, the computer had to build its own vast database, which it did by playing itself. I’m not saying this to disrespect machine learning but to point out that computers use their ability to gather and search for correlations in truly vast amounts of data to become experts—the machine played 5 million games against itself before it was unleashed on an unsuspecting digital opponent. A human player would have to complete a game every 18 seconds for 70 years to gather a similar data set.
Sometimes, however, you have a situation that would be perfect for this sort of big-data machine learning, except that the data is actually pretty small. This is the case for evaluating Higgs boson observations. The LHC generates data at inconceivable rates, even after lots of pre-processing to remove most of the uninteresting stuff. But even in the filtered data set, collisions that generate a Higgs boson are pretty rare. And those particle showers that look like they might have a Higgs? Well, there is a large background that obscures the signal.
In other words, this is a situation where a few events must be found inside a very large data set, and the signal looks remarkably similar to the noise. That makes it quite difficult to apply machine learning, let alone train the algorithm in the first place.
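The difficulty here can be shown with a quick sketch of why rare-signal problems are hard to even score, let alone learn. All numbers below are made up for illustration: with a signal rate of roughly one event in a thousand, a "classifier" that labels everything as background scores superbly on raw accuracy while finding zero signal.

```python
# Sketch of the rare-signal problem: accuracy is a misleading target
# when signal events are vanishingly rare. All numbers are invented.
import random

random.seed(0)  # fixed seed so the sketch is reproducible

N = 10_000
SIGNAL_RATE = 0.001  # roughly 10 signal-like events per 10,000 collisions

# Label 1 = signal, 0 = background.
labels = [1 if random.random() < SIGNAL_RATE else 0 for _ in range(N)]

# A degenerate classifier that always answers "background"...
always_background = [0] * N
accuracy = sum(p == y for p, y in zip(always_background, labels)) / N

# ...scores near-perfect accuracy, yet recovers none of the signal.
true_signal = sum(labels)
found_signal = sum(p == 1 and y == 1
                   for p, y in zip(always_background, labels))
recall = found_signal / true_signal if true_signal else 0.0

print(f"accuracy={accuracy:.4f}, signal recall={recall}")
```

This is why work in this area leans on metrics like signal recall or significance rather than accuracy, and why a learning method that can extract structure from only a handful of genuine signal events would be valuable.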
This article and images were originally posted on [Science – Ars Technica] October 25, 2017 at 11:11AM