With $151,629 in grant funding from the Defense Advanced Research Projects Agency, researchers at the Benjamin M. Statler College of Engineering and Mineral Resources have developed a promising new method of utilizing dataless neural networks to respond to a wide range of optimization problems.
Lane Department of Computer Science and Electrical Engineering Professor K. Subramani and Research Fellow Sangram Jena, along with DARPA program manager Alvaro Velasquez, have presented the project research at the 2023 International Conference on Combinatorial Optimization and Applications as well as the 2024 International Symposium on Artificial Intelligence and Mathematics.
“Our [approach] is completely dataless, so the creativity lies in changing — or rather in converting — a problem in the traditional computing domain to one in the neural network domain,” Subramani said.
Computers learn how to perform functions and tasks through a series of commands and by analyzing examples. Modeled after the human brain, neural networks describe a family of machine learning methods. Most modern neural networks are organized into layers of thousands, or sometimes millions, of processing nodes. Some networks are “feedforward,” meaning information flows in only one direction, from input nodes to output nodes, while others are recurrent and involve bi-directional information flow. Each node is connected to nodes in the layer below, from which it receives data, and to nodes in the layer above, to which it passes data, and so on.
Nodes assign a number, or “weight,” to each incoming connection. In an active network, a node receives data items from all of its connections, multiplies each item by its corresponding weight and sums the results to produce a single number. If that number meets a predetermined threshold value, the node passes the sum on to the next layer. Before a neural network is trained, all the weights and thresholds are set to random values. For example, if a user wants to train a neural network for object recognition to label all the photos of cats in a photo library, the user has to teach the network how to identify what a cat is. The image is the input and the label is the output. After showing the network many images of cats, the user trains the weights so the network responds with the correct label (output): cat.
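The weighted-sum-and-threshold step described above can be sketched in a few lines of code. The weights, inputs and threshold here are made-up illustrative values, not anything from the research:

```python
def node_output(inputs, weights, threshold):
    """Multiply each incoming data item by its connection's weight,
    sum the results, and pass the sum along only if it meets the
    node's threshold; otherwise the node stays quiet (outputs 0)."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return total if total >= threshold else 0.0

# A node with three incoming connections and example weights:
print(node_output([1.0, 0.5, 2.0], [0.4, 0.3, 0.1], threshold=0.5))
```

Training amounts to nudging those weights, over many labeled examples, until the network's output matches the correct label.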
The network is fed data and taught by humans how to respond to and categorize it, with weights adjusted for each input. This is known as supervised learning. Applications like ChatGPT and AlphaGo are built on models refined through reinforcement learning and human feedback, a traditional approach that Subramani says is costly and requires extensive training.
Large language models have to be able to respond to a huge volume of human-written text, which requires a lot of “brain” power. In the real world, this translates to massive facilities whose enormous energy consumption generates a lot of heat and demands a lot of water to keep things cool. From 2021 to 2022, Microsoft’s global water consumption spiked 34% to 1.7 billion gallons, while Google reported a 20% increase over the same period, growth that outside researchers have attributed to the explosion of the companies’ generative AI products.
WVU researchers are taking a theoretical approach to neural networks in an effort to skip the learning phase altogether, avoiding the need for warehouses of servers and their excessive energy consumption. So how do dataless neural networks respond to large problems?
“This is important because in computer science, there are a class of problems which are considered computationally difficult and the traditional approaches to solve them have not been very successful,” Subramani said. “The hope is that with this approach, we will be able to get better results.”
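One way to picture the conversion Subramani describes: instead of training on examples, a hard combinatorial problem is encoded directly as a continuous function whose minimum corresponds to a solution, and that function is minimized by gradient descent with no data at all. The toy below, finding a maximum independent set in a four-node graph with a hand-derived gradient and a penalty weight of 2.0, is our own illustration of the general principle, not the WVU team's actual construction:

```python
def solve_max_independent_set(n, edges, steps=500, lr=0.1, penalty=2.0):
    """Minimize f(x) = -sum(x) + penalty * sum over edges of x_i * x_j,
    keeping each x_i in [0, 1]; rounding the minimizer picks the set.
    The -sum(x) term rewards including nodes; the penalty term pushes
    neighboring nodes apart, so minima correspond to independent sets."""
    x = [0.6, 0.4] + [0.5] * (n - 2)   # slightly asymmetric start
    for _ in range(steps):
        grad = [-1.0] * n
        for i, j in edges:
            grad[i] += penalty * x[j]
            grad[j] += penalty * x[i]
        # Gradient step, clipped back into [0, 1]:
        x = [min(1.0, max(0.0, xi - lr * gi)) for xi, gi in zip(x, grad)]
    return {i for i, xi in enumerate(x) if xi > 0.5}

# Triangle 0-1-2 plus an edge 2-3; a largest independent set has size 2.
print(solve_max_independent_set(4, [(0, 1), (1, 2), (0, 2), (2, 3)]))
```

No training data appears anywhere: the graph itself defines the function, which is the sense in which the approach is “completely dataless.”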
Subramani says one problem the team has focused on extensively is using a dataless neural network to model cycles in kidney donor exchange programs.
“So there are a bunch of donors, and there are a bunch of recipients,” he explained. “Some of the donors are altruistic, which is that they just want to donate. Some of them want a kidney in return, so they are willing to part with their kidney, but they want their relative or somebody else to get a kidney. So the question is, how do we achieve and one of the goals is to have all these operations done simultaneously so you don't have to keep the kidney out of the human body?”
These exchanges are called cycles. In its mathematical form, the kidney exchange problem is the problem of finding a set of cycles with the maximum number of transplants, given a set of recipient-donor pairs, a graph showing their compatibility and a limit on the number of pairs involved in any cycle.
“So you want to have all these things scheduled at around the same time so that there's a quick motion that the kidney goes to the right recipient and so on. There could be chains and incomplete cycles, but we want to find a solution in which you succeed in satisfying everybody's demand. You also want cycles to be short,” he said. “You don't want to have one guy here, one guy in California, the next guy in Alaska. So if you can have short cycles, that's definitely preferred, and that's the most exciting part of this research that we hope will have a concrete impact in society.”
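The model described above can be made concrete with a small sketch: pairs are nodes, an edge i -> j means pair i's donor is compatible with pair j's recipient, and the goal is a set of vertex-disjoint cycles of bounded length covering as many pairs (transplants) as possible. The brute-force search and the five-pair compatibility graph below are purely illustrative; real exchange programs use far more sophisticated solvers, and this is not the team's dataless formulation:

```python
from itertools import combinations

def find_cycles(edges, n, max_len):
    """Enumerate simple cycles of length <= max_len, each returned once
    as a frozenset of the pairs it involves."""
    adj = {i: [j for a, j in edges if a == i] for i in range(n)}
    cycles = set()
    def walk(start, node, path):
        for nxt in adj[node]:
            if nxt == start and len(path) >= 2:
                cycles.add(frozenset(path))
            elif nxt not in path and nxt > start and len(path) < max_len:
                walk(start, nxt, path + [nxt])
    for s in range(n):
        walk(s, s, [s])
    return list(cycles)

def max_transplants(edges, n, max_len=3):
    """Pick vertex-disjoint cycles covering the most pairs (brute force)."""
    cycles = find_cycles(edges, n, max_len)
    best = 0
    for r in range(len(cycles) + 1):
        for combo in combinations(cycles, r):
            union = set().union(*combo) if combo else set()
            if sum(len(c) for c in combo) == len(union):  # disjoint?
                best = max(best, len(union))
    return best

# Five pairs: a 2-cycle (0 <-> 1) and a 3-cycle (2 -> 3 -> 4 -> 2)
# are compatible, so all five recipients can get a kidney.
print(max_transplants([(0, 1), (1, 0), (2, 3), (3, 4), (4, 2)], 5))
```

The `max_len` limit plays the role of the short-cycle preference Subramani mentions: every surgery in a cycle must happen at roughly the same time, so shorter cycles are easier to schedule.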
Subramani and his team have seen promising results, with the potential for the neural network to solve for several cycles at the same time. The ability of these networks to tackle complex, unstructured problems has exciting potential for other applications in health care, empowering organizations and professionals to make data-driven decisions that have a broad impact.