Artificial intelligence (AI), machine learning (ML), convolutional neural networks (CNNs): all too often these are a technological crutch for those who don't know what they are doing, or even outright snake oil. When used as a crutch, AI is merely a state-of-the-art form of bloatware.
Just one example: there was a signal recognition program I was working on about 3 years ago, before I retired. One company was hammering the AI/ML/CNN button hard. Their system worked, but it required GPU-level performance and thus GPU-level size, weight, power, and cost (SWaP-C). It took them three months and a shit-ton of sample data to train the thing on each new signal to be recognized. A competing company had a couple of PhDs working out recognition algorithms. They could bang out a new algorithm for a new signal in three weeks, required a lot less data to test (not train), and it ran on an embedded processor in a single FPGA. The customer down-selected to the PhD-led effort.
So when you don't know what you are doing but you want software to figure out a method for you, AI is The Way. What you'll get will probably work, even work very well, but you won't know what it is doing and it'll require substantial resources to implement. Sometimes this is the best way, particularly if there is no good prior knowledge about the problem space.
On the other hand, if you *do* know what you are doing, AI is almost certainly the wrong approach.
All speech-recovery noise reduction requires the algorithm to model and estimate the noise spectral content and the speech spectral content; one is then subtracted from the other. Such estimates and models can be very simple or very complex. For instance, most of the NR built into the Japanese radios uses a fairly simple, broadband least mean squares (LMS) algorithm. Other approaches use a more complex spectral estimation model. For instance, Warren Pratt's NR2 algorithm converts audio from the time domain to the spectral domain via an FFT, makes a separate estimate for each individual FFT bin, then uses an IFFT to convert it back.
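To make the per-bin idea concrete, here is a minimal sketch of generic spectral subtraction. This is not Pratt's actual NR2 code; the frame size, the running-average noise tracker, and the smoothing constant `alpha` are all arbitrary assumptions chosen for illustration.

```python
import numpy as np

def spectral_subtract(frames, alpha=0.98):
    """Per-bin spectral subtraction over a sequence of audio frames.

    frames: 2-D array, one time-domain frame per row.
    alpha:  smoothing constant for the running noise-magnitude estimate
            (an arbitrary choice for this sketch).
    """
    noise_mag = None
    out = np.empty(frames.shape, dtype=float)
    for i, frame in enumerate(frames):
        spec = np.fft.rfft(frame)                # time domain -> spectral domain
        mag, phase = np.abs(spec), np.angle(spec)
        # Track a slowly varying noise-floor estimate, separately per FFT bin.
        if noise_mag is None:
            noise_mag = mag.copy()
        else:
            noise_mag = alpha * noise_mag + (1 - alpha) * mag
        # Subtract the noise estimate from each bin, clamping at zero.
        clean_mag = np.maximum(mag - noise_mag, 0.0)
        # Recombine with the original phase and go back to the time domain.
        out[i] = np.fft.irfft(clean_mag * np.exp(1j * phase), n=len(frame))
    return out
```

A real implementation would add overlap-add windowing and a smarter noise tracker (e.g. minimum statistics), but the FFT / per-bin estimate / IFFT skeleton is the same.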
The RM Noise approach is undoubtedly doing the same thing, albeit one might argue that it has developed better models and estimates. Certainly the ability to train it on a sample of noise is an advantage over NR2. However, NR2 "just works" without any training, without any separate software and, most notably, without the requirement for an internet connection to an "AI server" and tons of latency.
The movie and audio industries have developed many of the same tools over the last few decades, but they remain relatively obscure outside those industries. With the advent of inexpensive or even free digital audio workstation (DAW) software, they have become a lot more accessible. When RM Noise first came out I was playing around with some of these tools to see which were better or worse, and I found "Supertone Clear" to be a very, very good competitor to both RM Noise and NR2.