Research in machine learning and AI, now a key technology in practically every industry and company, is far too voluminous for anyone to read in its entirety. This column aims to collect some of the most relevant recent discoveries and papers, particularly but not limited to those in artificial intelligence, and explain why they matter.
This week, AI applications turned up in several unexpected niches thanks to their ability to sort through large amounts of data, or alternatively to make sensible predictions from limited evidence.
We’ve seen machine learning models draw on big data sets in biotech and finance, but researchers at ETH Zurich and LMU Munich are applying similar techniques to data generated by international development aid projects such as disaster relief and housing. The team trained its model on millions of projects (amounting to $2.8 trillion in funding) from the past 20 years, a dataset too large and complex to analyze manually in detail.
“You can think of the process as trying to read an entire library and sort similar books onto shelves by specific subjects. Our algorithm takes into account 200 different dimensions to determine how similar these 3.2 million projects are to each other, an impossible workload for a human being,” said study author Malte Toetzke.
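The study’s actual pipeline isn’t described here, but the shelf-sorting analogy maps onto standard clustering: represent each project as a high-dimensional vector, then group nearby vectors onto the same “shelf.” A minimal sketch using plain k-means over toy 200-dimensional vectors (all data here are synthetic stand-ins, not the study’s embeddings):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the study's 200-dimensional project representations
# (real projects would be embedded from descriptions; these are random).
n_projects, n_dims, n_shelves = 300, 200, 4
projects = rng.normal(size=(n_projects, n_dims))

def kmeans(X, k, iters=20, seed=0):
    """Plain k-means: assign each row to its nearest centroid, then
    recompute each centroid as the mean of its assigned rows."""
    r = np.random.default_rng(seed)
    centroids = X[r.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # distance of every point to every centroid: shape (n, k)
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return labels, centroids

labels, centroids = kmeans(projects, n_shelves)
print(labels.shape, centroids.shape)  # (300,) (4, 200)
```

At the study’s scale (3.2 million projects), one would swap in a scalable variant such as mini-batch clustering, but the grouping idea is the same.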
Very high-level trends suggest that inclusion and diversity spending has increased, while climate spending has surprisingly decreased in recent years. You can examine the data set and the trends they analyzed here.
Another area few people think about is the sheer number of machine parts and components that various industries churn out at an enormous rate. Some can be reused, some recycled, and some should be disposed of responsibly, but there are too many for human specialists to sort through. The German research organization Fraunhofer has developed a machine learning model to identify parts so they can be reused instead of heading to the scrapyard.
The system relies on more than ordinary camera views, since parts can look similar yet be very different, or be mechanically identical yet differ visually due to rust or wear. Therefore, each part is also weighed and scanned with 3D cameras, and metadata such as the part’s origin is included. The model then suggests what it thinks the part is, so the human inspecting it doesn’t have to start from scratch. This AI-assisted identification method is expected to soon save tens of thousands of parts and accelerate the processing of millions more.
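Fraunhofer’s implementation isn’t detailed in this account, but the described workflow, fusing weight, 3D-scan features, and metadata into one signature and then suggesting the closest known parts, can be sketched as a nearest-neighbor lookup. The part names, feature dimensions, and catalog below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical catalog: each known part type has a reference feature
# vector combining weight, 3D-scan shape descriptors, and encoded
# origin metadata (all values here are synthetic).
part_names = ["gear_A", "gear_B", "bracket", "shaft", "flange"]
catalog = rng.normal(size=(len(part_names), 16))

def suggest_parts(features, catalog, names, k=3):
    """Rank catalog entries by distance to the scanned part's feature
    vector, so the inspector starts from a short list, not from scratch."""
    d = np.linalg.norm(catalog - features, axis=1)
    order = np.argsort(d)[:k]
    return [names[i] for i in order]

# A noisy "scan" of the bracket: small measurement noise on its signature.
scanned = catalog[2] + rng.normal(scale=0.05, size=16)
print(suggest_parts(scanned, catalog, part_names))  # 'bracket' ranks first
```

Returning a ranked short list rather than a single hard classification matches the article’s point: the model assists the human inspector instead of replacing them.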
Physicists have found an interesting way to apply the qualities of ML to a centuries-old problem. Essentially, researchers are always looking for ways to show that the equations governing fluid dynamics (some of which, like Euler’s, date back to the 18th century) are incomplete, i.e., that they break down at certain extreme values. This is difficult, though not impossible, to do with traditional computational techniques. But researchers at Caltech and Hang Seng University in Hong Kong are proposing a new deep learning method to isolate likely instances of fluid dynamics singularities, while others are applying the technique to the field in other ways. This Quanta article explains the development quite well.
Another century-old concept that gets an ML layer is kirigami, the art of paper cutting that many will be familiar with in the context of creating paper snowflakes. The technique dates back centuries to Japan and China in particular, and can produce remarkably complex and flexible structures. Researchers at Argonne National Labs were inspired by the concept to theorize a 2D material that can retain microscopic-scale electronics but also flex easily.
The team had manually performed tens of thousands of experiments with one to six cuts and used that data to train the model. They then used a Department of Energy supercomputer to run simulations down to the molecular level. Within seconds, the model produced a 10-cut variation with 40 percent stretchability, far beyond anything the team had expected or even attempted on its own.
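The actual system pairs a trained model with molecular-dynamics simulation; the toy sketch below only illustrates the propose-and-score loop that lets a search wander beyond the manually tried one-to-six-cut range. The scoring function is purely illustrative, not physics:

```python
import random

random.seed(42)

def toy_stretchability(cuts):
    """Stand-in for the molecular-level simulation: rewards having many
    cuts, penalizes cuts placed too close together (illustrative only)."""
    if not cuts:
        return 0.0
    close_pairs = sum(abs(a - b) < 0.05 for a in cuts for b in cuts if a < b)
    return len(cuts) * 4.0 - close_pairs * 2.0

def search_cut_patterns(n_candidates=500, max_cuts=10):
    """Propose random cut patterns (positions in [0, 1]) and keep the one
    the surrogate scores highest; a trained model would propose smarter."""
    best, best_score = None, float("-inf")
    for _ in range(n_candidates):
        cuts = sorted(random.random() for _ in range(random.randint(1, max_cuts)))
        score = toy_stretchability(cuts)
        if score > best_score:
            best, best_score = cuts, score
    return best, best_score

best, score = search_cut_patterns()
print(len(best), round(score, 1))
```

The interesting part of the real result is that the learned model proposed a configuration (10 cuts) outside the range of the training experiments, which is exactly the kind of extrapolation a brute-force search over a surrogate can surface.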
“It figured out things we never told it to figure out. It learned something the way a human learns, and used its knowledge to do something different,” said project leader Pankaj Rajak. The success has prompted the team to increase the complexity and scope of the simulation.
Another interesting extrapolation by a specially trained AI comes from a computer vision model that reconstructs color data from infrared inputs. Normally, a camera capturing IR would know nothing about an object’s color in the visible spectrum. But this experiment found correlations between certain IR bands and visible ones, and built a model to convert images of human faces captured in IR into ones that approximate the visible spectrum.
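The researchers used a deep network; to show just the core idea, that a mapping between IR bands and visible color can be learned from paired pixels, here is a deliberately simple per-pixel linear fit on synthetic paired data (the data and the true mapping are constructed for the example):

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic paired data standing in for registered IR/visible images:
# each pixel has 3 IR-band intensities, and its "true" RGB colour is,
# by construction, a linear function of those bands plus noise.
n_pixels = 5000
ir = rng.uniform(size=(n_pixels, 3))
true_map = np.array([[0.8, 0.1, 0.1],
                     [0.2, 0.7, 0.1],
                     [0.1, 0.2, 0.7]])
rgb = ir @ true_map + rng.normal(scale=0.01, size=(n_pixels, 3))

# Least-squares fit of the IR -> RGB mapping. The paper used a deep
# network; a linear model is enough to show the correlation-learning idea.
learned_map, *_ = np.linalg.lstsq(ir, rgb, rcond=None)

# Evaluate on held-out pixels: predictions should closely match the
# colours the true mapping would produce.
test_ir = rng.uniform(size=(10, 3))
pred = test_ir @ learned_map
err = float(np.abs(pred - test_ir @ true_map).max())
print(round(err, 3))
```

Real faces are not a linear function of IR intensities, which is why the actual work needs a learned nonlinear model and why its outputs are approximations rather than true color.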
It’s still just a proof of concept, but such spectrum flexibility could be a useful tool in science and photography.
Meanwhile, a new study co-authored by Google AI lead Jeff Dean pushes back against the notion that AI is an environmentally costly endeavor due to its high computational requirements. While some research has found that training a large model like OpenAI’s GPT-3 can produce carbon dioxide emissions equivalent to those of a small neighborhood, the Google-affiliated study argues that “following best practices” can reduce machine learning’s carbon emissions by up to 1,000 times.
The practices in question concern the types of models used, the machines used to train them, “mechanization” (e.g., cloud computing versus local computers), and “map” (choosing data center locations with the cleanest energy). According to the co-authors, selecting “efficient” models alone can reduce computation by a factor of 5 to 10, while using machine learning-optimized processors, such as GPUs, can improve performance-per-watt by a factor of 2 to 5.
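It’s worth sanity-checking how such factors compound toward the headline figure. The first two ranges below come from the text; the “mechanization” and “map” ranges are illustrative assumptions, not figures from the study. Multiplying them shows how “up to 1,000x” can arise:

```python
# Back-of-envelope compounding of per-practice reduction factors.
# The first two ranges are quoted in the article; the last two are
# assumed here purely to illustrate how the factors multiply.
factors = {
    "model (efficient architecture)": (5, 10),     # from the article
    "machine (ML-optimized processors)": (2, 5),   # from the article
    "mechanization (cloud datacenters)": (1.4, 2), # assumed range
    "map (low-carbon locations)": (5, 10),         # assumed range
}

low = high = 1.0
for name, (lo, hi) in factors.items():
    low *= lo
    high *= hi

print(f"combined reduction: {low:.0f}x to {high:.0f}x")
```

The point of the exercise is only that independent multiplicative savings compound quickly; whether each factor is independently achievable in practice is exactly what critics of the study question.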
Any strand of research suggesting that AI’s environmental impact can be reduced is indeed cause for celebration. But it should be noted that Google is not a neutral party: many of the company’s products, from Google Maps to Google Search, rely on models that require large amounts of energy to develop and run.
Mike Cook, a member of the Knives and Paintbrushes research collective, points out that even if the study’s estimates are accurate, there is simply no good reason for a company not to scale up in an energy-inefficient way if it benefits them. While academic groups may pay attention to metrics like carbon impact, companies are not incentivized in the same way, at least currently.
“The main reason we’re having this conversation to begin with is that companies like Google and OpenAI had effectively infinite funds and chose to leverage them to build models like GPT-3 and BERT at whatever cost, because they knew it gave them an advantage,” Cook told TechCrunch via email. “Overall, I think the paper says some nice things and it’s great if we’re thinking about efficiency, but in my opinion the problem isn’t technical: we know for a fact that these companies will scale up when they need to, they won’t hold back, so saying this is now settled forever feels like a hollow line.”
The last topic this week is not exactly about machine learning, but about what might be a way forward in simulating the brain more directly. EPFL bioinformatics researchers developed a mathematical model for generating large numbers of unique yet accurate simulated neurons that could eventually be used to build digital twins of neuroanatomy.
“The findings are already enabling Blue Brain to build biologically detailed reconstructions and simulations of the mouse brain, by computationally reconstructing brain regions for simulations that replicate the anatomical properties of neuronal morphologies and include region-specific anatomy,” said researcher Lida Kanari.
Don’t expect sim-brains to produce better AI anytime soon; this work is very much aimed at breakthroughs in neuroscience. But insights from simulated neuronal networks may eventually lead to a fundamentally better understanding of the processes AI seeks to mimic digitally.