  • AI Transforms the Way We Do Science
     
    As we explained in Ride the Wave, the true magic of the coming phase of the Fifth Techno-Economic Revolution lies in mankind's growing ability to qualitatively and quantitatively enhance almost every aspect of life and business using digital technology. Nowhere is this more true than in scientific research, where better, faster, and cheaper ways to conduct every aspect of research are already creating a self-reinforcing virtuous cycle. More productive research leads quickly and cheaply to important new discoveries, which increase wealth, enabling society to devote more resources to research.


    Even better, digital research is enabling us to do things that could never have been done in the past, regardless of the amount of traditional resources allocated. And, at the same time, it's enabling us to invent solutions that benefit the wealthiest among us today and, within just a few years, even the poorest.


    Never before has discovery moved so rapidly or been so accessible on a global basis. Because of its pace, scientists themselves are frequently aware of the revolution only within their own research specialty. And very few managers or policymakers appreciate the enormous scope of the digital revolution in science, which is still in its infancy.
    ________________________________________
     
    To understand this revolution, it's necessary to understand the five steps of the scientific discovery cycle and consider how digitization can enhance the process at each step.


    Step One: Explore the scientific literature. Here the never-ending task is to identify the relevant scientific papers in a sea of millions, while tracking new topics as they emerge.


    Step Two: Design experiments. Here the challenge is to formulate hypotheses and determine how they can be tested. Like business strategy, experimental design determines the execution, investment, and metrics guiding the rest of the study. The key is to find the right trade-off between exploration of new ground and exploitation of well-understood phenomena (a trade-off sketched in code after Step Five).


    Step Three: Run experiments. Here the task is to keep track of millions of data points and their relationships. In the case of the life sciences, for instance, thousands of tiny tubes containing experiments on various molecules and cells must be meticulously monitored over precisely determined time periods, while avoiding contamination. Errors at this stage can lead to career-ending consequences.


    Step Four: Interpret the Data. This involves making sense of the flood of raw data coming from the experiments. In the life sciences, for example, this could involve many terabytes of genetic and biochemical information. The goal is to transform the experimental results into scientific findings. Here the researcher determines whether the hypothesis is quantifiably confirmed or rejected; or perhaps another, equally interesting hypothesis is formulated and confirmed. And,


    Step Five: Write a New Scientific Paper. This is where the cycle ends and a new one begins. The researchers make sure they cite every relevant precedent, regardless of whether it was identified in step one. Then, once peer-reviewed, the results are added to the body of scientific literature to be cited by other researchers. In the ideal case, the findings translate not only into a frequently cited research paper but also into the basis for a valuable patent and perhaps even a whole new enterprise.
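
    To make Step Two's exploration-versus-exploitation trade-off concrete, here is a minimal epsilon-greedy sketch in Python. The candidate experimental directions and their payoff estimates are invented for illustration and do not come from the briefing itself.

```python
import random

def choose_experiment(estimated_payoffs, epsilon=0.1):
    """Epsilon-greedy selection: usually exploit the best-known direction,
    but explore a random one a fraction epsilon of the time."""
    if random.random() < epsilon:
        return random.randrange(len(estimated_payoffs))   # explore new ground
    return max(range(len(estimated_payoffs)),
               key=lambda i: estimated_payoffs[i])        # exploit what works

# Hypothetical expected payoffs for three candidate experimental directions.
payoffs = [0.30, 0.55, 0.20]
picks = [choose_experiment(payoffs) for _ in range(1000)]
print({i: picks.count(i) for i in range(len(payoffs))})   # mostly direction 1
```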
    ________________________________________


    From the dawn of civilization until the 1980s, every step in the cycle was painstakingly manual. That's when scientific literature began to be stored on computers, statistical analysis of large data sets became widely available on mainframes and minicomputers, and experimenters increasingly used digital instrumentation to build data sets. Then, over the next 35 years or so, those conventional digital solutions became better, cheaper, and faster.


    However, it's only now that artificial intelligence, big data methods, and robotics are reaching the point where they enable a quantum leap when applied to research. Going forward, the primary goal is harnessing these technologies to augment, or even replace, humans in the scientific process. The second and bigger objective is to make research that was "formerly impossible" routine.


    To do this, researchers are already unleashing artificial intelligence, often in the form of artificial neural networks, on the data torrents. Unlike earlier attempts at AI, these don't need to be programmed with a human expert's knowledge. Instead, they learn on their own, often from large sets of "training data," until they can see patterns and spot anomalies in data sets that are far larger and messier than human beings can cope with.
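
    As a minimal illustration of learning patterns and spotting anomalies directly from data, the sketch below uses scikit-learn's IsolationForest (an ensemble method standing in for the neural networks described above) on a synthetic data set; every number is invented.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic data set: mostly routine measurements plus a handful of outliers.
rng = np.random.default_rng(0)
routine = rng.normal(0.0, 1.0, size=(1000, 5))
oddballs = rng.normal(6.0, 1.0, size=(10, 5))
data = np.vstack([routine, oddballs])

# The detector learns the shape of "normal" from the data itself, with no
# human-written rules, and then flags deviations (-1 marks an anomaly).
detector = IsolationForest(random_state=0).fit(data)
flags = detector.predict(data)
print("anomalies flagged:", int((flags == -1).sum()))
```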
    ________________________________________


    Consider just a few examples from a range of scientific disciplines:


    Social media, with billions of users and trillions of cumulative posts, has brought big data to the social sciences. It has also opened an unprecedented opportunity to use artificial intelligence to glean meaning from this mass of human communications. For instance, researchers at the University of Pennsylvania's Positive Psychology Center are already using machine learning and natural language processing to sift through petabytes of data to gauge the public's emotional and physical health. That's traditionally been done with surveys. But social media data are "unobtrusive, very inexpensive, and the sample sizes are orders of magnitude greater." In one recent study, the team predicted county-level heart disease mortality rates by analyzing 148 million tweets; risk factors turned out to include words related to anger and negative relationships. The predictions from the AI-based social media study matched actual mortality rates more closely than did predictions based on the 10 leading risk factors, such as smoking and diabetes. The same researchers have also used social media to predict personality, income, and political ideology, and to study hospital care, mystical experiences, and stereotypes. It's all part of a revolution going on in the analysis of language and its links to psychology.
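
    A toy sketch of the general technique: turn aggregated text into word features and regress a health outcome on them. The "county" posts and mortality figures below are invented, and nothing here reproduces the Penn team's actual models, which are trained on millions of tweets.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge

# One blob of aggregated posts per "county", paired with a made-up
# heart-disease mortality rate (deaths per 100k). All values invented.
county_posts = [
    "hate traffic angry fight argue stressed",
    "grateful friends walk sunshine dinner family",
    "furious rage awful enemies bitter lonely",
    "happy community volunteering park kids",
]
mortality = [210.0, 140.0, 230.0, 130.0]

# Turn language into word-weight features, then fit a linear model.
features = TfidfVectorizer().fit_transform(county_posts)
model = Ridge(alpha=1.0).fit(features, mortality)
print(model.predict(features))  # in-sample predictions for the toy counties
```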


    Meanwhile, particle physicists strive to understand the inner workings of the universe by smashing subatomic particles together with enormous energies to blast out exotic new bits of matter. At CERN, a Higgs boson emerges from roughly one out of every 1 billion proton collisions, and within a billionth of a picosecond it decays into other particles, such as a pair of photons or a quartet of muons. To "reconstruct" the Higgs, physicists must spot all those more-common particles and see whether they fit together in a way that's consistent with them coming from the same parent.


    Neural networks excel at sifting signal from background. Today, these algorithms help distinguish the pairs of photons that originate from a Higgs decay from random pairs. In 2024, researchers plan to upgrade the LHC to increase its collision rate by a factor of 10. At that point, machine learning may be the only way to keep up with the torrent of data.
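
    A toy version of that signal-versus-background separation: train a small neural network on a simulated diphoton invariant-mass feature, where Higgs decays cluster near 125 GeV and random pairs do not. The distributions below are fabricated stand-ins, not CERN data.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Simulated diphoton feature: Higgs decays cluster near a 125 GeV
# invariant mass; random photon pairs spread across the whole range.
rng = np.random.default_rng(1)
signal = rng.normal(125.0, 2.0, size=(2000, 1))
background = rng.uniform(100.0, 160.0, size=(2000, 1))
X = np.vstack([signal, background])
y = np.array([1] * 2000 + [0] * 2000)   # 1 = Higgs decay, 0 = random pair

# A small neural network learns the separating band from examples alone.
net = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=1000, random_state=1)
net.fit(X, y)
print("fraction of true signal kept:", net.predict(signal).mean())
```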


    An area of research with immediate commercial implications is a system that quickly calculates the most effective molecular recipes for chemical synthesis. Instead of programming hard-and-fast rules for chemical reactions, a team of researchers designed a deep neural network program that learns on its own how reactions proceed, based on millions of examples. The more data it is fed, the better it gets. Over time the network learns to predict the best reaction for a desired step in a synthesis. And, eventually, it comes up with its own recipes for making molecules from scratch. The researchers tested the program on 40 different molecular targets, comparing it with a conventional rule-based molecular design program. Whereas the conventional program came up with a solution for synthesizing target molecules 22.5% of the time in a 2-hour computing window, the neural network figured it out 95% of the time. And it's dramatically faster than a human chemist trying to perform the same task.
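
    One minimal way to frame reaction prediction as learning-from-examples: encode molecules as character n-grams of (toy) SMILES strings and train a small neural network to map them to reaction classes. The molecules and labels below are fabricated; the published system was trained on millions of real reactions.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neural_network import MLPClassifier

# Fabricated training set: toy SMILES strings labeled with invented
# reaction classes, standing in for a corpus of published reactions.
reactants = ["CCO", "CC(=O)O", "c1ccccc1Br", "CCN", "CC=O"]
labels = ["oxidation", "esterification", "coupling", "acylation", "reduction"]

# Encode each molecule as character n-gram counts, then learn the mapping.
vec = CountVectorizer(analyzer="char", ngram_range=(1, 2), lowercase=False)
X = vec.fit_transform(reactants)
net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
net.fit(X, labels)
print(net.predict(vec.transform(["CCCO"])))  # predicted class for a new molecule
```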


    Given this trend, we offer the following forecasts for your consideration.


    First, as soon as 2020, using labor-saving AI-based tools to explore the scientific literature will become standard operating procedure in almost every discipline.


    Today, over 75 million cumulative scientific papers have been published, and approximately 2.5 million new scientific papers are published each year. While only a tiny fraction are relevant to the work of any given scientist, the load is still overwhelming, especially when you are looking for new findings that hint at a possible research hypothesis. Fortunately, emerging AI-based tools like Science Surveyor, Semantic Scholar, and Iris AI can help. The goal of Science Surveyor is to take the text of an academic paper and search academic databases for other studies using similar terms. It then presents related articles that show how scientific thinking is changing over time by analyzing how language is used across all the selected articles. Semantic Scholar is an AI-based scientific search engine that returns results based on graphics and "influential citations," as well as keywords. Iris AI is a scientific browsing tool that searches scientific databases based on the "crucial concepts" in related papers. Notably, just as AltaVista and the first release of Google presaged today's state-of-the-art search engines, today's exploration tools are just the beginning.
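
    A minimal sketch of the "find related papers by similar terms" idea that underlies such tools, using TF-IDF cosine similarity over a tiny invented abstract collection; none of this is the tools' actual code.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Tiny invented abstract collection plus a query paper.
abstracts = [
    "deep learning for protein structure prediction",
    "crispr gene editing in crop plants",
    "neural networks for particle physics event selection",
    "survey of reinforcement learning algorithms",
]
query = "machine learning methods in high energy physics"

# Rank the collection by term similarity to the query.
vec = TfidfVectorizer().fit(abstracts + [query])
scores = cosine_similarity(vec.transform([query]), vec.transform(abstracts))[0]
for score, title in sorted(zip(scores, abstracts), reverse=True):
    print(f"{score:.2f}  {title}")
```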


    Second, AI-based experiment design will make spotty, but economically important, progress over the next decade.


    A great deal of study is going into reducing the expensive human effort required to go from an understanding of the scientific literature to testing a meaningful hypothesis. Outside of a few limited domains, existing AI has failed this test miserably. The Trends editors believe this will remain true for at least the next decade. However, within the domains where AI-based experiment design pays off, companies will reap big returns. Consider the case of Zymergen, a biotechnology company that "tunes up" industrial microbes that produce ingredients for biofuels, plastics, or drugs. Seeking to boost production, companies send their workhorse strains to Zymergen, whose robots run as many as 1,000 experiments per week. Robots only follow orders, so giving them the right orders, that is, experiment design, is the real bottleneck. How does AI help Zymergen overcome this bottleneck? Management says, "You've got the original microbe here with about 5,000 genes. Let's say there are 10 ways you could change a given gene. So that's 50,000 things you could be doing. Maybe 25 strains will produce slightly more of the target chemical. But if you just insert all 25 mutations that yielded small improvements into a single microbe, they don't add up to a big gain. Instead, the microbe becomes far less fit than the original strain. So, choosing the right path, including detours into promising valleys, requires a mental map showing all the effects of all the mutations at once; this map does not have just three dimensions, but thousands.

     
    "Machine learning keeps the whole process goal-oriented and consistent, achieving on average a 10% increase in bacteria productivity for Zymergen clients." However, when the robots finally discover the genetic changes that optimize chemical output, the system doesn't have a clue about the biochemistry behind their effects. Not surprisingly, this is fine for a company where only results matter, but not very useful where new understanding is the goal.
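
    The search problem Zymergen's management describes can be sketched as a surrogate model scoring combinations of mutations. Everything below (20 candidate mutations instead of 50,000, the hidden yield function, a random-forest surrogate) is invented for illustration and is not Zymergen's actual pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n_mutations = 20   # toy scale; the article's example space is ~50,000

def true_yield(genome):
    """Hidden ground truth: per-mutation effects plus an interaction term,
    so single-mutation gains do not simply add up."""
    effects = np.linspace(-0.5, 0.5, n_mutations)
    return genome @ effects - 0.8 * genome[0] * genome[1] + rng.normal(0, 0.05)

# "Run" 200 random strains on the robots, then fit a surrogate model.
strains = rng.integers(0, 2, size=(200, n_mutations))
yields = np.array([true_yield(s) for s in strains])
surrogate = RandomForestRegressor(random_state=0).fit(strains, yields)

# Score 1,000 untested mutation combinations and pick the most promising.
candidates = rng.integers(0, 2, size=(1000, n_mutations))
best = candidates[surrogate.predict(candidates).argmax()]
print("mutations to apply next:", np.flatnonzero(best))
```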
     
    Third, going forward, fully automated remote research labs will dramatically cut the time and cost of running experiments and improve their quality, while eliminating human error.


    In the life sciences, cloud-based remote laboratories will deliver enormous benefits that dramatically improve the speed, cost, quality, and accessibility of state-of-the-art experimentation. Companies like Emerald Cloud Lab and Transcriptic sell time in their robotic laboratories. Rather than invest a million dollars or more to build and operate a sterile, fully automated laboratory, any start-up or tech company can buy access to these facilities on an "as-needed basis." There, robots flawlessly execute the researchers' experimental plan and deliver data files along with the frozen end-products of the experiments. Just as the cloud gives nearly every business access to supercomputing power, these labs give nearly every biotech researcher access to a laboratory. Suddenly, a startup with angel or VC funding can compete with major company research centers.
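
    To illustrate the "lab as a service" workflow, here is a purely hypothetical sketch of submitting an experimental protocol to a remote lab over HTTP. The endpoint, field names, and operations are all invented and do not correspond to Emerald Cloud Lab's or Transcriptic's real APIs.

```python
import json
import urllib.request

# Invented protocol document describing the desired experiment.
protocol = {
    "workflow": [
        {"op": "dispense", "reagent": "buffer_A", "volume_ul": 50},
        {"op": "incubate", "temp_c": 37, "minutes": 60},
        {"op": "read_absorbance", "wavelength_nm": 600},
    ],
    "deliver": ["data_csv", "frozen_samples"],
}

# Placeholder endpoint; a real service would authenticate and return a run ID.
request = urllib.request.Request(
    "https://cloudlab.example.com/v1/runs",
    data=json.dumps(protocol).encode(),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(request)  # would submit the run to an actual service
print(json.dumps(protocol, indent=2))
```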


    Fourth, because of its speed and power, AI-based data interpretation will deliver tens of billions of dollars in economic value over the next decade, and trillions in the years that follow.


    Consider Eureqa, from Nutonian, a division of DataRobot Inc. Eureqa ingests very large data sets of the kind we see on social media, in genomics, and from climate studies. Then, it creates easy-to-interpret predictive models in minutes rather than weeks or months. That opens the door to better analysis of new data, as well as new discoveries in enormous existing datasets that have never been exhaustively mined. And,
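
    Eureqa's approach is symbolic regression: searching a space of candidate formulas for one that explains the data. The sketch below is a deliberately crude random search over three hypothetical formula templates; Eureqa's real evolutionary search is far more sophisticated, and all data here is synthetic.

```python
import numpy as np

# Data generated by a hidden "law" the search should rediscover.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=200)
y = 2.0 * x**2 + 1.0 + rng.normal(0, 0.1, size=200)

# Three hypothetical formula templates with free coefficients a, b.
candidates = [
    ("a*x + b",      lambda x, a, b: a * x + b),
    ("a*x**2 + b",   lambda x, a, b: a * x**2 + b),
    ("a*sin(x) + b", lambda x, a, b: a * np.sin(x) + b),
]

# Crude random search over coefficients for each template.
best = None
for name, formula in candidates:
    for _ in range(500):
        a, b = rng.uniform(-5, 5, size=2)
        err = float(np.mean((formula(x, a, b) - y) ** 2))
        if best is None or err < best[0]:
            best = (err, name, a, b)
print(f"best model: {best[1]} with a={best[2]:.2f}, b={best[3]:.2f}")
```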


    Fifth, by 2030, AI will be used widely to help scientists publish their research, as well as to file patents more quickly.


    To date, one of the most useful AI tools is a free online resource called Citeomatic. It has been trained on several million papers and the citations made in them. Then, using the relationships it has learned, it takes the author's preliminary paper, with its preliminary set of citations, and identifies any other citations that may be relevant. The result is a far better paper submitted for peer review. Looking ahead, it's obvious that when Citeomatic and similar tools are trained on the cumulative contents of the world's patent offices, patent applications will become easier to write and will be produced faster.
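
    In the same spirit as Citeomatic (which actually uses neural text embeddings trained on millions of papers), a minimal sketch of citation suggestion can be done with plain TF-IDF similarity; the corpus, draft, and citation keys below are invented.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Invented mini-corpus of candidate papers and a draft to check.
corpus = {
    "A": "neural network methods for citation recommendation",
    "B": "the ecology of freshwater lakes",
    "C": "learning document embeddings for scholarly search",
}
draft = "we train document embeddings to recommend missing citations"
already_cited = {"A"}

# Rank uncited papers by similarity to the draft's text.
vec = TfidfVectorizer().fit(list(corpus.values()) + [draft])
scores = cosine_similarity(vec.transform([draft]),
                           vec.transform(list(corpus.values())))[0]
suggestions = [(float(s), key) for s, key in zip(scores, corpus)
               if key not in already_cited]
print(sorted(suggestions, reverse=True))   # paper "C" should rank first
```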

