如何使波形渲染更有趣?

时间:2014-10-19 15:23:52

标签: algorithm audio graphics rendering waveform

我写了一个波形渲染器,它接收一个音频文件并创建如下内容:

enter image description here

逻辑非常简单。我计算每个像素所需的音频样本数,读取这些样本,平均它们并根据结果值绘制一列像素。

通常情况下,我会在大约600-800像素上渲染一首完整的歌曲,因此波浪非常紧凑。不幸的是,这通常会导致视觉效果不佳,因为几乎整首歌都是以几乎相同的高度呈现的。没有变化。

有趣的是,如果你看SoundCloud上的波形,几乎没有一个像我的结果一样无聊。他们都有一些变化。这可能是什么诀窍?我不认为他们只是添加随机噪音。

1 个答案:

答案 0 :(得分:21)

我不认为SoundCloud正在做一些特别特别的事情。我在他们的头版上看到很多很平常的歌曲。它更多地与细节的感知方式以及歌曲的整体动态有关。主要区别在于SoundCloud正在绘制绝对值。 (图像的负面只是一面镜子。)

为了演示,这是一个带有直线的基本白噪声图:

regular plot

现在,通常使用填充来使整体轮廓更容易看到。这对外观已经做了很多:

fill

较大的波形("缩小"特别是)通常使用镜面效果,因为动态变得更加明显:

wrap

条形图是另一种可视化的方式,可以给人一种细节的假象:

step

典型波形图形的伪例程(abs和镜像的平均值)可能如下所示:

for (each pixel in width of image) {
    var sum = 0

    for (each sample in subset contained within pixel) {
        sum = sum + abs(sample)
    }

    var avg = sum / length of subset

    draw line(avg to -avg)
}

这实际上就像将时间轴压缩为窗口的RMS一样。 (也可以使用RMS,但它们几乎相同。)现在波形显示整体动态。

这与你正在做的事情没什么不同,只是abs,镜子和填充。对于像SoundCloud一样使用的方框,您将绘制矩形。

作为奖励,这里是一个用Java编写的MCVE,用于生成带有框的波形。 (抱歉,如果Java不是您的语言。)实际绘图代码接近顶部。该程序也标准化,即波形被拉伸"达到图像的高度。

这个简单的输出与上面的伪例程相同:

normal output

带框的输出与SoundCloud非常相似:

box waveform

import javax.swing.*;
import java.awt.*;
import java.awt.event.*;
import java.awt.image.*;
import java.io.*;
import javax.sound.sampled.*;

public class BoxWaveform {
    static int boxWidth = 4;
    static Dimension size = new Dimension(boxWidth == 1 ? 512 : 513, 97);

    static BufferedImage img;
    static JPanel view;

    // draw the image
    static void drawImage(float[] samples) {
        Graphics2D g2d = img.createGraphics();

        int numSubsets = size.width / boxWidth;
        int subsetLength = samples.length / numSubsets;

        float[] subsets = new float[numSubsets];

        // find average(abs) of each box subset
        int s = 0;
        for(int i = 0; i < subsets.length; i++) {

            double sum = 0;
            for(int k = 0; k < subsetLength; k++) {
                sum += Math.abs(samples[s++]);
            }

            subsets[i] = (float)(sum / subsetLength);
        }

        // find the peak so the waveform can be normalized
        // to the height of the image
        float normal = 0;
        for(float sample : subsets) {
            if(sample > normal)
                normal = sample;
        }

        // normalize and scale
        normal = 32768.0f / normal;
        for(int i = 0; i < subsets.length; i++) {
            subsets[i] *= normal;
            subsets[i] = (subsets[i] / 32768.0f) * (size.height / 2);
        }

        g2d.setColor(Color.GRAY);

        // convert to image coords and do actual drawing
        for(int i = 0; i < subsets.length; i++) {
            int sample = (int)subsets[i];

            int posY = (size.height / 2) - sample;
            int negY = (size.height / 2) + sample;

            int x = i * boxWidth;

            if(boxWidth == 1) {
                g2d.drawLine(x, posY, x, negY);
            } else {
                g2d.setColor(Color.GRAY);
                g2d.fillRect(x + 1, posY + 1, boxWidth - 1, negY - posY - 1);
                g2d.setColor(Color.DARK_GRAY);
                g2d.drawRect(x, posY, boxWidth, negY - posY);
            }
        }

        g2d.dispose();
        view.repaint();
        view.requestFocus();
    }

    // handle most WAV and AIFF files
    static void loadImage() {
        JFileChooser chooser = new JFileChooser();
        int val = chooser.showOpenDialog(null);
        if(val != JFileChooser.APPROVE_OPTION) {
            return;
        }

        File file = chooser.getSelectedFile();
        float[] samples;

        try {
            AudioInputStream in = AudioSystem.getAudioInputStream(file);
            AudioFormat fmt = in.getFormat();

            if(fmt.getEncoding() != AudioFormat.Encoding.PCM_SIGNED) {
                throw new UnsupportedAudioFileException("unsigned");
            }

            boolean big = fmt.isBigEndian();
            int chans = fmt.getChannels();
            int bits = fmt.getSampleSizeInBits();
            int bytes = bits + 7 >> 3;

            int frameLength = (int)in.getFrameLength();
            int bufferLength = chans * bytes * 1024;

            samples = new float[frameLength];
            byte[] buf = new byte[bufferLength];

            int i = 0;
            int bRead;
            while((bRead = in.read(buf)) > -1) {

                for(int b = 0; b < bRead;) {
                    double sum = 0;

                    // (sums to mono if multiple channels)
                    for(int c = 0; c < chans; c++) {
                        if(bytes == 1) {
                            sum += buf[b++] << 8;

                        } else {
                            int sample = 0;

                            // (quantizes to 16-bit)
                            if(big) {
                                sample |= (buf[b++] & 0xFF) << 8;
                                sample |= (buf[b++] & 0xFF);
                                b += bytes - 2;
                            } else {
                                b += bytes - 2;
                                sample |= (buf[b++] & 0xFF);
                                sample |= (buf[b++] & 0xFF) << 8;
                            }

                            final int sign = 1 << 15;
                            final int mask = -1 << 16;
                            if((sample & sign) == sign) {
                                sample |= mask;
                            }

                            sum += sample;
                        }
                    }

                    samples[i++] = (float)(sum / chans);
                }
            }

        } catch(Exception e) {
            problem(e);
            return;
        }

        if(img == null) {
            img = new BufferedImage(size.width, size.height, BufferedImage.TYPE_INT_ARGB);
        }

        drawImage(samples);
    }

    static void problem(Object msg) {
        JOptionPane.showMessageDialog(null, String.valueOf(msg));
    }

    public static void main(String[] args) {
        SwingUtilities.invokeLater(new Runnable() {
            @Override
            public void run() {
                JFrame frame = new JFrame("Box Waveform");
                JPanel content = new JPanel(new BorderLayout());
                frame.setContentPane(content);

                JButton load = new JButton("Load");
                load.addActionListener(new ActionListener() {
                    @Override
                    public void actionPerformed(ActionEvent ae) {
                        loadImage();
                    }
                });

                view = new JPanel() {
                    @Override
                    protected void paintComponent(Graphics g) {
                        super.paintComponent(g);

                        if(img != null) {
                            g.drawImage(img, 1, 1, img.getWidth(), img.getHeight(), null);
                        }
                    }
                };

                view.setBackground(Color.WHITE);
                view.setPreferredSize(new Dimension(size.width + 2, size.height + 2));

                content.add(view, BorderLayout.CENTER);
                content.add(load, BorderLayout.SOUTH);

                frame.pack();
                frame.setResizable(false);
                frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
                frame.setLocationRelativeTo(null);
                frame.setVisible(true);
            }
        });
    }
}

注意:为简单起见,此程序将整个音频文件加载到内存中。某些JVM可能会抛出OutOfMemoryError。要更正此问题,请使用增加的堆大小as described here运行。