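Java - Convert a human-readable size to bytes

Time: 2013-06-06 17:36:38

Tags: java byte long-integer human-readable

I've found plenty of information about converting raw byte counts into a human-readable format, but I need to do the opposite, i.e. convert the string "1.6 GB" into the long value 1717990000. Is there a built-in or well-defined way to do this, or do I pretty much have to roll my own?

[Edit]: Here is my first stab at it...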

static class ByteFormat extends NumberFormat {
    @Override
    public StringBuffer format(double arg0, StringBuffer arg1, FieldPosition arg2) {
        // TODO Auto-generated method stub
        return null;
    }

    @Override
    public StringBuffer format(long arg0, StringBuffer arg1, FieldPosition arg2) {
        // TODO Auto-generated method stub
        return null;
    }

    @Override
    public Number parse(String arg0, ParsePosition arg1) {
        return parse (arg0);
    }

    @Override
    public Number parse(String arg0) {
        int spaceNdx = arg0.indexOf(" ");
        double ret = Double.parseDouble(arg0.substring(0, spaceNdx));
        String unit = arg0.substring(spaceNdx + 1);
        int factor = 0;
        if (unit.equals("GB")) {
            factor = 1073741824;
        }
        else if (unit.equals("MB")) {
            factor = 1048576;
        }
        else if (unit.equals("KB")) {
            factor = 1024;
        }

        return ret * factor;
    }
}

8 answers:

Answer 0 (score: 4)

A revised version of Andremoniy's answer that properly distinguishes between kilo and kibi, etc.

private final static long KB_FACTOR = 1000;
private final static long KIB_FACTOR = 1024;
private final static long MB_FACTOR = 1000 * KB_FACTOR;
private final static long MIB_FACTOR = 1024 * KIB_FACTOR;
private final static long GB_FACTOR = 1000 * MB_FACTOR;
private final static long GIB_FACTOR = 1024 * MIB_FACTOR;

public static double parse(String arg0) {
    int spaceNdx = arg0.indexOf(" ");
    double ret = Double.parseDouble(arg0.substring(0, spaceNdx));
    switch (arg0.substring(spaceNdx + 1)) {
        case "GB":
            return ret * GB_FACTOR;
        case "GiB":
            return ret * GIB_FACTOR;
        case "MB":
            return ret * MB_FACTOR;
        case "MiB":
            return ret * MIB_FACTOR;
        case "KB":
            return ret * KB_FACTOR;
        case "KiB":
            return ret * KIB_FACTOR;
    }
    return -1;
}
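To make the difference between the two unit families concrete, here is a small self-contained sketch. The class name UnitComparison and the helper toBytes are illustrative and not part of the answer above. It uses BigDecimal so that the decimal string "1.6" is not affected by binary floating-point rounding, and it drops any fractional byte.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class UnitComparison {

    // Hypothetical helper: multiplies a decimal amount by a unit factor
    // and truncates any fractional byte.
    static long toBytes(String amount, long factor) {
        return new BigDecimal(amount)
                .multiply(BigDecimal.valueOf(factor))
                .setScale(0, RoundingMode.DOWN)
                .longValueExact();
    }

    public static void main(String[] args) {
        long gb = 1000L * 1000L * 1000L;   // decimal (SI) gigabyte
        long gib = 1024L * 1024L * 1024L;  // binary (IEC) gibibyte

        System.out.println(toBytes("1.6", gb));   // 1600000000
        System.out.println(toBytes("1.6", gib));  // 1717986918
    }
}
```

The second value is the binary interpretation, which is what the 1717990000 in the question approximates.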

Answer 1 (score: 4)

A one-stop answer that parses to a long:

public class SizeUtil {

    public static String units = "BKMGTPEZY";

    public static long parse(String arg0) {
        int spaceNdx = arg0.indexOf(" ");    
        double ret = Double.parseDouble(arg0.substring(0, spaceNdx));
        String unitString = arg0.substring(spaceNdx+1);
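        int unitChar = unitString.charAt(0);
        int power = units.indexOf(unitChar);
        // An 'i' marks a binary (IEC) unit such as GiB; without it the unit is decimal (SI)
        boolean isBinary = unitString.indexOf('i')!=-1;
        int factor = isBinary ? 1024 : 1000;

        // The cast saturates at Long.MAX_VALUE when the result does not fit in a long
        return (long) (ret * Math.pow(factor, power));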
    }

    public static void main(String[] args) {
        System.out.println(parse("300.00 GiB")); // requires a space
        System.out.println(parse("300.00 GB"));
        System.out.println(parse("300.00 B"));
        System.out.println(parse("300 EB"));
    }
}
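The unit string above goes up to Z and Y, and the sample calls include 300 EB, but neither fits in a long: Long.MAX_VALUE is roughly 9.22 × 10^18, a little under 8 EiB. A double-to-long conversion does not fail in that case; it quietly saturates at Long.MAX_VALUE. The following sketch reports the overflow instead. The class and method names are illustrative, and it assumes that an 'i' in the unit means powers of 1024 and its absence means powers of 1000.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class ExactSizeParser {

    private static final String UNITS = "BKMGTPEZY";

    // Sketch: same "<number> <unit>" input shape as above, but overflow is reported.
    static long parseExact(String text) {
        int space = text.indexOf(' ');
        BigDecimal amount = new BigDecimal(text.substring(0, space));
        String unit = text.substring(space + 1);
        int power = UNITS.indexOf(unit.charAt(0));
        if (power < 0) {
            throw new IllegalArgumentException("Unknown unit: " + unit);
        }
        // Assumption: 'i' (as in GiB) selects base 1024, otherwise base 1000.
        int base = unit.indexOf('i') != -1 ? 1024 : 1000;
        return amount.multiply(BigDecimal.valueOf(base).pow(power))
                .setScale(0, RoundingMode.DOWN)  // drop fractional bytes
                .longValueExact();               // throws ArithmeticException on overflow
    }

    public static void main(String[] args) {
        System.out.println(parseExact("300.00 GiB")); // 322122547200
        System.out.println(parseExact("7 EiB"));      // 8070450532247928832
        try {
            parseExact("300 EB");
        } catch (ArithmeticException e) {
            System.out.println("300 EB does not fit in a long");
        }
    }
}
```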

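Answer 2 (score: 2)

I have never heard of a well-known library that implements this kind of text-parsing utility method. Your solution does look close to a correct implementation, though.

The only two things I would change in your code are:

  1. Make the method Number parse(String arg0) static, since it is a utility method.

  2. Define the factor for each size unit as a final static field.

That is, it would look like this: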

    private final static long KB_FACTOR = 1024;
    private final static long MB_FACTOR = 1024 * KB_FACTOR;
    private final static long GB_FACTOR = 1024 * MB_FACTOR;
    
    public static double parse(String arg0) {
        int spaceNdx = arg0.indexOf(" ");
        double ret = Double.parseDouble(arg0.substring(0, spaceNdx));
        switch (arg0.substring(spaceNdx + 1)) {
            case "GB":
                return ret * GB_FACTOR;
            case "MB":
                return ret * MB_FACTOR;
            case "KB":
                return ret * KB_FACTOR;
        }
        return -1;
    }
    

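Answer 3 (score: 2)

I know this is much later, but I was looking for a similar function that also takes the SI prefix into account. So I wrote one myself, and I thought it might be useful to others.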

public static String units = "KMGTPE";

/**
 * Converts from human readable to byte format
 * @param number The number value of the amount to convert
 * @param unit The unit: B, KB, MB, GB, TB, PB, EB
 * @param si Si prefix
 * @return byte value
 */
public static double parse(double number, String unit, boolean si)
{
    String identifier = unit.substring(0, 1);
    int index = units.indexOf(identifier);
    //not already in bytes
    if (index!=-1)
    {
        for (int i = 0; i <= index; i++)
            number = number * (si ? 1000 : 1024);
    }
    return number;
}

I'm sure this could also be done with recursion. It was too simple to bother...
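For completeness, the recursive variant mentioned above might look like the following sketch. The class name, the method name parseRecursive, and the private helper scale are illustrative additions; the behaviour is meant to mirror the loop version, including treating an unrecognised first letter as plain bytes.

```java
class RecursiveSizeParser {

    private static final String UNITS = "KMGTPE";

    // Multiplies by the base once per recursion step.
    private static double scale(double number, int steps, int base) {
        return steps == 0 ? number : scale(number * base, steps - 1, base);
    }

    static double parseRecursive(double number, String unit, boolean si) {
        int index = UNITS.indexOf(unit.charAt(0));
        // index is -1 for "B" (or any unrecognised letter): zero multiplications
        return scale(number, index + 1, si ? 1000 : 1024);
    }

    public static void main(String[] args) {
        System.out.println(parseRecursive(2, "KB", false));  // 2048.0
        System.out.println(parseRecursive(1.5, "MB", true)); // 1500000.0
        System.out.println(parseRecursive(300, "B", true));  // 300.0
    }
}
```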

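Answer 4 (score: 2)

Spring Framework added a DataSize class in version 5.1 that can parse human-readable data sizes into bytes and also format them back into human-readable form. It lives in the org.springframework.util.unit package.

If you use Spring Framework, you can upgrade to >= 5.1 and use this class. Otherwise, you can copy and paste it along with its related classes (while complying with the license).

You can then use it like this: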

DataSize dataSize = DataSize.parse("16GB");
System.out.println(dataSize.toBytes());

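This prints:

17179869184

However, the pattern used to parse the input

  • does not support decimals (so you can use 1GB, 2GB, or 1638MB, but not 1.6GB)
  • does not support spaces (so you can use 1GB, but not 1 GB)

I would recommend sticking to that convention for compatibility and ease of maintenance. If it does not meet your needs, however, you will have to copy and edit the file; it is a good starting point.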
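If copying the Spring class is not an option, a small standalone parser can cover both gaps. The following sketch is not Spring code; the class LenientDataSize and its toBytes method are hypothetical. It assumes the same convention as the output above, where GB means 1024^3 bytes, and additionally accepts a decimal fraction and optional whitespace before the unit.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LenientDataSize {

    // A number with an optional fraction, optional whitespace, and an optional unit.
    private static final Pattern SIZE =
            Pattern.compile("^\\s*(\\d+(?:\\.\\d+)?)\\s*(B|KB|MB|GB|TB)?\\s*$");

    static long toBytes(String text) {
        Matcher m = SIZE.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a data size: " + text);
        }
        String unit = m.group(2) == null ? "B" : m.group(2);
        // Position in this string gives the power of 1024: B=0, K=1, M=2, G=3, T=4.
        int power = "BKMGT".indexOf(unit.charAt(0));
        return new BigDecimal(m.group(1))
                .multiply(BigDecimal.valueOf(1024).pow(power))
                .setScale(0, RoundingMode.DOWN) // drop fractional bytes
                .longValueExact();
    }

    public static void main(String[] args) {
        System.out.println(toBytes("16GB"));   // 17179869184
        System.out.println(toBytes("1.6 GB")); // 1717986918
        System.out.println(toBytes("1638MB")); // 1717567488
    }
}
```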

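Answer 5 (score: 1)

Another option, based on @gilbertpilz's code. In this case a regular expression is used to extract the value and the factor. It is also case-insensitive.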

    private final static long KB_FACTOR = 1000;
    private final static long KIB_FACTOR = 1024;
    private final static long MB_FACTOR = 1000 * KB_FACTOR;
    private final static long MIB_FACTOR = 1024 * KIB_FACTOR;
    private final static long GB_FACTOR = 1000 * MB_FACTOR;
    private final static long GIB_FACTOR = 1024 * MIB_FACTOR;

    private long parse(String arg0) throws ParseException {
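        // CASE_INSENSITIVE is needed so that inputs such as "10gb" or "10GiB" match
        Pattern pattern = Pattern.compile("([0-9]+)(([KMG])I?B)", Pattern.CASE_INSENSITIVE);
        Matcher match = pattern.matcher(arg0);

        if( !match.matches() || match.groupCount()!=3)
            throw new ParseException("Wrong format", 0);

        // group(0) is the whole match; the digits are in group(1)
        long ret = Long.parseLong(match.group(1));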
        switch (match.group(2).toUpperCase()) {
            case "GB":
                return ret * GB_FACTOR;
            case "GIB":
                return ret * GIB_FACTOR;
            case "MB":
                return ret * MB_FACTOR;
            case "MIB":
                return ret * MIB_FACTOR;
            case "KB":
                return ret * KB_FACTOR;
            case "KIB":
                return ret * KIB_FACTOR;
        }

        throw new ParseException("Wrong format", 0);
    }
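Group numbering is easy to get wrong with nested groups: group 0 is always the whole match, and the capturing groups are numbered by the position of their opening parenthesis. The sketch below lists the groups for the expression used above. The class GroupDemo and its groups method are illustrative, and the expression is compiled here with Pattern.CASE_INSENSITIVE so that lower-case input matches.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {

    // Same expression as above, compiled with CASE_INSENSITIVE so "gib" matches too.
    static final Pattern SIZE =
            Pattern.compile("([0-9]+)(([KMG])I?B)", Pattern.CASE_INSENSITIVE);

    // Returns groups 0..3 for a matching input, or null if the input does not match.
    static String[] groups(String input) {
        Matcher m = SIZE.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(0), m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] g = groups("512gib");
        System.out.println(g[0]); // 512gib (whole match)
        System.out.println(g[1]); // 512    (the number)
        System.out.println(g[2]); // gib    (the unit)
        System.out.println(g[3]); // g      (the prefix letter)
    }
}
```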

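Answer 6 (score: 0)

The approach below can also be used to make the parsing generic, so that it does not depend on a space character.

Thanks to @RobAu for the hint above. I added a new method that finds the index of the first letter in the string, and changed the parsing to use that index. I kept the original parse method and added a new parseAny method so the results can be compared. Hope it helps someone.

Thanks also to this answer for the indexOf method - https://stackoverflow.com/a/11214786/6385674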

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ConversionUtil {

    public static String units = "BKMGTPEZY";

    public static long parse(String arg0) {
        int spaceNdx = arg0.indexOf(" ");    
        double ret = Double.parseDouble(arg0.substring(0, spaceNdx));
        String unitString = arg0.substring(spaceNdx+1);
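        int unitChar = unitString.charAt(0);
        int power = units.indexOf(unitChar);
        // An 'i' marks a binary (IEC) unit such as GiB; without it the unit is decimal (SI)
        boolean isBinary = unitString.indexOf('i')!=-1;
        int factor = isBinary ? 1024 : 1000;

        // The cast saturates at Long.MAX_VALUE when the result does not fit in a long
        return (long) (ret * Math.pow(factor, power));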
    }
    /** @return index of pattern in s or -1, if not found */
    public static int indexOf(Pattern pattern, String s) {
        Matcher matcher = pattern.matcher(s);
        return matcher.find() ? matcher.start() : -1;
    }    
    public static long parseAny(String arg0)
    {
        int index = indexOf(Pattern.compile("[A-Za-z]"), arg0);
        double ret = Double.parseDouble(arg0.substring(0, index));
        String unitString = arg0.substring(index);
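        int unitChar = unitString.charAt(0);
        int power = units.indexOf(unitChar);
        // An 'i' marks a binary (IEC) unit such as GiB; without it the unit is decimal (SI)
        boolean isBinary = unitString.indexOf('i')!=-1;
        int factor = isBinary ? 1024 : 1000;

        // The cast saturates at Long.MAX_VALUE when the result does not fit in a long
        return (long) (ret * Math.pow(factor, power));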

    }
    public static void main(String[] args) {
        System.out.println(parse("300.00 GiB")); // requires a space
        System.out.println(parse("300.00 GB"));
        System.out.println(parse("300.00 B"));        
        System.out.println(parse("300 EB"));
        System.out.println(parseAny("300.00 GiB"));
        System.out.println(parseAny("300M"));
    }
}
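One edge case is worth noting: if the input contains no letter at all (for example "300"), indexOf returns -1 and substring(0, -1) throws a StringIndexOutOfBoundsException. A possible guard is sketched below, under the assumption that a bare number should be read as bytes. The class UnitSplitter and its split method are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnitSplitter {

    private static final Pattern FIRST_LETTER = Pattern.compile("[A-Za-z]");

    // Splits "300M" or "300.00 GiB" into {number, unit}.
    // Assumption: when no letter is present, the unit defaults to "B".
    static String[] split(String text) {
        Matcher m = FIRST_LETTER.matcher(text);
        if (!m.find()) {
            return new String[] { text.trim(), "B" };
        }
        String number = text.substring(0, m.start()).trim();
        String unit = text.substring(m.start()).trim();
        return new String[] { number, unit };
    }

    public static void main(String[] args) {
        for (String input : new String[] { "300", "300M", "300.00 GiB" }) {
            String[] parts = split(input);
            System.out.println(parts[0] + " | " + parts[1]);
        }
    }
}
```

The two parts can then be handed to the same power and factor logic as in parseAny.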

Answer 7 (score: 0)

I wrote a human-readable file size utility as an enum class; hope it helps!

import java.text.DecimalFormat;
import java.text.ParseException;
import java.util.Arrays;
import java.util.regex.Pattern;

/**
 * Human-readable file size utility class that converts
 * between human-readable sizes and byte counts.
 * 
 * A similar question on Stack Overflow:
 *  https://stackoverflow.com/questions/3758606/how-to-convert-byte-size-into-human-readable-format-in-java?r=SearchResults
 * 
 * Apache Commons IO also provides a similar function
 * @see org.apache.commons.io.FileUtils#byteCountToDisplaySize(long)
 * 
 * @author Ponfee
 */
public enum HumanReadables {

    SI    (1000, "B", "KB",  "MB",  "GB",  "TB",  "PB",  "EB" /*, "ZB",  "YB" */), // 

    BINARY(1024, "B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"/*, "ZiB", "YiB"*/), // 
    ;

    private static final String FORMAT = "#,##0.##";
    private static final Pattern PATTERN = Pattern.compile(".*[0-9]+.*");

    private final int      base;
    private final String[] units;
    private final long[]   sizes;

    HumanReadables(int base, String... units) {
        this.base  = base;
        this.units = units;
        this.sizes = new long[this.units.length];

        this.sizes[0] = 1;
        for (int i = 1; i < this.sizes.length; i++) {
            this.sizes[i] = this.sizes[i - 1] * this.base; // Maths.pow(this.base, i);
        }
    }

    /**
     * Returns a string of bytes count human readable size
     * 
     * @param size the size
     * @return human readable size
     */
    public strictfp String human(long size) {
        if (size == 0) {
            return "0" + this.units[0];
        }

        String signed = "";
        if (size < 0) {
            signed = "-";
            size = size == Long.MIN_VALUE ? Long.MAX_VALUE : -size;
        }

        /*int unit = (int) Maths.log(size, this.base);
        return signed + format(size / Math.pow(this.base, unit)) + " " + this.units[unit];*/

        int unit = find(size);
        return new StringBuilder(13) // 13 max length like as "-1,023.45 GiB"
            .append(signed)
            .append(formatter().format(size / (double) this.sizes[unit]))
            .append(" ")
            .append(this.units[unit])
            .toString();
    }

    public strictfp long parse(String size) {
        return parse(size, false);
    }

    /**
     * Parse the readable byte count, allowed suffix units: "1", "1B", "1MB", "1MiB", "1M"
     * 
     * @param size   the size
     * @param strict the strict, if BINARY then verify whether contains "i"
     * @return a long value bytes count
     */
    public strictfp long parse(String size, boolean strict) {
        if (size == null || size.isEmpty()) {
            return 0L;
        }
        if (!PATTERN.matcher(size).matches()) {
            throw new IllegalArgumentException("Invalid format [" + size + "]");
        }

        String str = size = size.trim();
        long factor = this.sizes[0];
        switch (str.charAt(0)) {
            case '+': str = str.substring(1);               break;
            case '-': str = str.substring(1); factor = -1L; break;
        }

        int end = 0, lastPos = str.length() - 1;
        // last character isn't a digit
        char c = str.charAt(lastPos - end);
        if (c == 'i') {
            // last pos cannot end with "i"
            throw new IllegalArgumentException("Invalid format [" + size + "], cannot end with \"i\".");
        }

        if (c == 'B') {
            end++;
            c = str.charAt(lastPos - end);

            boolean flag = isBlank(c);
            while (isBlank(c) && end < lastPos) {
                end++;
                c = str.charAt(lastPos - end);
            }
            // if "B" head has space char, then the first head non space char must be a digit
            if (flag && !Character.isDigit(c)) {
                throw new IllegalArgumentException("Invalid format [" + size + "]: \"" + c + "\".");
            }
        }

        if (!Character.isDigit(c)) {
            // if not a digit character, then assume is a unit character
            if (c == 'i') {
                if (this == SI) {
                    // SI cannot contains "i"
                    throw new IllegalArgumentException("Invalid SI format [" + size + "], cannot contains \"i\".");
                }
                end++;
                c = str.charAt(lastPos - end);
            } else {
                if (this == BINARY && strict) {
                    // if strict, then BINARY must contains "i"
                    throw new IllegalArgumentException("Invalid BINARY format [" + size + "], miss character \"i\".");
                }
            }

            switch (c) {
                case 'K': factor *= this.sizes[1]; break;
                case 'M': factor *= this.sizes[2]; break;
                case 'G': factor *= this.sizes[3]; break;
                case 'T': factor *= this.sizes[4]; break;
                case 'P': factor *= this.sizes[5]; break;
                case 'E': factor *= this.sizes[6]; break;
                /*
                case 'Z': factor *= this.bytes[7]; break;
                case 'Y': factor *= this.bytes[8]; break;
                */
                default: throw new IllegalArgumentException("Invalid format [" + size + "]: \"" + c + "\".");
            }

            do {
                end++;
                c = str.charAt(lastPos - end);
            } while (isBlank(c) && end < lastPos);
        }

        str = str.substring(0, str.length() - end);
        try {
            return (long) (factor * formatter().parse(str).doubleValue());
        } catch (NumberFormatException | ParseException e) {
            throw new IllegalArgumentException("Failed to parse [" + size + "]: \"" + str + "\".");
        }
    }

    public int base() {
        return this.base;
    }

    public String[] units() {
        return Arrays.copyOf(this.units, this.units.length);
    }

    public long[] sizes() {
        return Arrays.copyOf(this.sizes, this.sizes.length);
    }

    private int find(long bytes) {
        int n = this.sizes.length;
        for (int i = 1; i < n; i++) {
            if (bytes < this.sizes[i]) {
                return i - 1;
            }
        }
        return n - 1;
    }

    private DecimalFormat formatter() {
        return new DecimalFormat(FORMAT);
    }

    private boolean isBlank(char c) {
        return c == ' ' || c == '\t';
    }

}