如何计算“完美”列宽

时间:2014-06-24 19:31:55

标签: php formula date-arithmetic

所以,这是计划:我正在使用TCPDF生成包含表格的PDF文档。我正在PHP中生成一个html表,我将其传递给TCPDF。但是,TCPDF使每列的宽度相等,这是一个问题,因为每列中的内容长度是完全不同的。解决方案是在表的width上设置<td>属性。但我不能完美地锻炼这样做的完美方式。这就是我目前正在做的事情:

  1. 我生成一个名为$maxColumnSizes的数组,其中我存储每列最大个字母数。
  2. 我生成一个名为$averageSizes的数组,其中我存储每列平均个字母数。
  3. 因此,下面您将看到一个示例计算。第0列平均有8个字母,最多26个字母,第4列平均有10个字母,最多有209个字母: enter image description here

    所以,问题在于:我无法想到将这些信息结合起来以获得“完美”列宽的“正确”方法。如果我忽略$maxColumnSizes数组并根据$averageSizes设置列宽,则该表看起来非常好。 第4列有209个字符的一行除外。由于第4列非常小,所以有209个字符的行具有疯狂的高度,以适应209个字符。

    总结一下:如何计算“完美”表格列宽(给定表格数据)?

    注意:

    • “完美的宽度”对我来说意味着整个桌子的高度尽可能小。
    • 我目前不考虑字母宽度(我不区分iw的宽度)
    • 由于我可以访问所有数据,因此我还可以进行任何其他计算。我上面提到的两个数组我只用在我的第一次尝试中。

    修改

    根据评论,我添加了另一个计算$ maxColumnSize / $ averageColumnSize的计算: enter image description here

2 个答案:

答案 0 :(得分:1)

这是相当主观的,但要采用算法:

// Following two functions taken from this answer:
// http://stackoverflow.com/a/5434698/697370

// Function to calculate square of value - mean
function sd_square($x, $mean) { return pow($x - $mean,2); }

// Function to calculate standard deviation (uses sd_square)    
function sd($array) {
    // square root of sum of squares devided by N-1
    return sqrt(array_sum(array_map("sd_square", $array, array_fill(0,count($array), (array_sum($array) / count($array)) ) ) ) / (count($array)-1) );
}

// For any column...
$colMaxSize = /** from your table **/;
$colAvgSize = /** from your table **/;

$stdDeviation = sd(/** array of lengths for your column**/);
$coefficientVariation = $stdDeviation / $colAvgSize;

if($coefficientVariation > 0.5 && $coefficientVariation < 1.5) {
    // The average width of the column is close to the standard deviation
    // In this case I would just make the width of the column equal to the 
    // average.
} else {
    // There is a large variance in your dataset (really small values and 
    // really large values in the same set).
    // What to do here? I would base the width off of the max size, perhaps 
    // using (int)($colMaxSize / 2) or (int)($colMaxSize / 3) to fix long entries 
    // to a given number of lines.
}

有一个PECL扩展为您提供stats_standard_deviation功能,但默认情况下它不与PHP捆绑在一起。你也可以使用上面的0.5和1.5值,直到你得到看起来“恰到好处”的东西。

答案 1 :(得分:1)

根据@ watcher的回答,我想出了以下代码。它在我的测试用例中效果很好。我还制作了一个GitHub repository with my code,因为它比StackOverflow上的可读性要好得多。

<?php
/**
 * A simple class to auto-calculate the "perfect" column widths of a table.
 * Copyright (C) 2014 Christian Flach
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License along
 * with this program; if not, write to the Free Software Foundation, Inc.,
 * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * This is based on my question at StackOverflow:
 * http://stackoverflow.com/questions/24394787/how-to-calculate-the-perfect-column-widths
 *
 * Thank you "watcher" (http://stackoverflow.com/users/697370/watcher) for the initial idea!
 */

namespace Cmfcmf;

class ColumnWidthCalculator
{
    /**
     * @var array
     */
    private $rows;

    /**
     * @var bool
     */
    private $html;

    /**
     * @var bool
     */
    private $stripTags;

    /**
     * @var int
     */
    private $minPercentage;

    /**
     * @var Callable|null
     */
    private $customColumnFunction;

    /**
     * @param array $rows An array of rows, where each row is an array of cells containing the cell content.
     * @param bool  $html Whether or not the rows contain html content. This will call html_entity_decode.
     * @param bool  $stripTags Whether or not to strip tags (only if $html is true).
     * @param int   $minPercentage The minimum percentage each row must be wide.
     * @param null  $customColumnFunction A custom function to transform a cell's value before it's length is measured.
     */
    public function __construct(array $rows, $html = false, $stripTags = false, $minPercentage = 3, $customColumnFunction = null)
    {
        $this->rows = $rows;
        $this->html = $html;
        $this->stripTags = $stripTags;
        $this->minPercentage = $minPercentage;
        $this->customColumnFunction = $customColumnFunction;
    }

    /**
     * Calculate the column widths.
     *
     * @return array
     *
     * Explanation of return array:
     * - $columnSizes[$colNumber]['percentage'] The calculated column width in percents.
     * - $columnSizes[$colNumber]['calc'] The calculated column width in letters.
     *
     * - $columnSizes[$colNumber]['max'] The maximum column width in letters.
     * - $columnSizes[$colNumber]['avg'] The average column width in letters.
     * - $columnSizes[$colNumber]['raw'] An array of all the column widths of this column in letters.
     * - $columnSizes[$colNumber]['stdd'] The calculated standard deviation in letters.
     *
     * INTERNAL
     * - $columnSizes[$colNumber]['cv'] The calculated standard deviation / the average column width in letters.
     * - $columnSizes[$colNumber]['stdd/max'] The calculated standard deviation / the maximum column width in letters.
     */
    public function calculateWidths()
    {
        $columnSizes = array();

        foreach ($this->rows as $row) {
            foreach ($row as $key => $column) {
                if (isset($this->customColumnFunction)) {
                    $column = call_user_func_array($this->customColumnFunction, array($column));
                }
                $length = $this->strWidth($this->html ? html_entity_decode($this->stripTags ? strip_tags($column) : $column) : $column);

                $columnSizes[$key]['max'] = !isset($columnSizes[$key]['max']) ? $length : ($columnSizes[$key]['max'] < $length ? $length : $columnSizes[$key]['max']);

                // Sum up the lengths in `avg` for now. See below where it is converted to the actual average.
                $columnSizes[$key]['avg'] = !isset($columnSizes[$key]['avg']) ? $length : $columnSizes[$key]['avg'] + $length;
                $columnSizes[$key]['raw'][] = $length;
            }
        }

        // Calculate the actual averages.
        $columnSizes = array_map(function ($columnSize) {
            $columnSize['avg'] = $columnSize['avg'] / count ($columnSize['raw']);

            return $columnSize;
        }, $columnSizes);

        foreach ($columnSizes as $key => $columnSize) {
            $colMaxSize = $columnSize['max'];
            $colAvgSize = $columnSize['avg'];

            $stdDeviation = $this->sd($columnSize['raw']);
            $coefficientVariation = $stdDeviation / $colAvgSize;

            $columnSizes[$key]['cv'] = $coefficientVariation;
            $columnSizes[$key]['stdd'] = $stdDeviation;
            $columnSizes[$key]['stdd/max'] = $stdDeviation / $colMaxSize;

            // $columnSizes[$key]['stdd/max'] < 0.3 is here for no mathematical reason, it's been found by trying stuff
            if(($columnSizes[$key]['stdd/max'] < 0.3 || $coefficientVariation == 1) && ($coefficientVariation == 0 || ($coefficientVariation > 0.6 && $coefficientVariation < 1.5))) {
                // The average width of the column is close to the standard deviation
                // In this case I would just make the width of the column equal to the
                // average.
                $columnSizes[$key]['calc'] = $colAvgSize;
            } else {
                // There is a large variance in the dataset (really small values and
                // really large values in the same set).
                // Do some magic! (There is no mathematical rule behind that line, it's been created by trying different combinations.)
                if ($coefficientVariation > 1 && $columnSizes[$key]['stdd'] > 4.5 && $columnSizes[$key]['stdd/max'] > 0.2) {
                    $tmp = ($colMaxSize - $colAvgSize) / 2;
                } else {
                    $tmp = 0;
                }

                $columnSizes[$key]['calc'] = $colAvgSize + ($colMaxSize / $colAvgSize) * 2 / abs(1 - $coefficientVariation);
                $columnSizes[$key]['calc'] = $columnSizes[$key]['calc'] > $colMaxSize ? $colMaxSize - $tmp : $columnSizes[$key]['calc'];
            }
        }

        $totalCalculatedSize = 0;
        foreach ($columnSizes as $columnSize) {
            $totalCalculatedSize += $columnSize['calc'];
        }

        // Convert calculated sizes to percentages.
        foreach ($columnSizes as $key => $columnSize) {
            $columnSizes[$key]['percentage'] = 100 / ($totalCalculatedSize / $columnSize['calc']);
        }

        // Make sure everything is at least 3 percent wide.
        if ($this->minPercentage > 0) {
            foreach ($columnSizes as $key => $columnSize) {
                if ($columnSize['percentage'] < $this->minPercentage) {
                    // That's how many percent we need to steal.
                    $neededPercents = ($this->minPercentage - $columnSize['percentage']);

                    // Steal some percents from the column with the $coefficientVariation nearest to one and being big enough.
                    $lowestDistance = 9999999;
                    $stealKey = null;
                    foreach ($columnSizes as $k => $val) {
                        // This is the distance from the actual $coefficientVariation to 1.
                        $distance = abs(1 - $val['cv']);
                        if ($distance < $lowestDistance
                            && $val['calc'] - $neededPercents > $val['avg'] /* This line is here due to whatever reason :/ */
                            && $val['percentage'] - $this->minPercentage >= $neededPercents /* Make sure the column we steal from would still be wider than $this->minPercentage percent after stealing. */
                        ) {
                            $stealKey = $k;
                            $lowestDistance = $distance;
                        }
                    }
                    if (!isset($stealKey)) {
                        // Dang it! We could not get something reliable here. Fallback to stealing from the largest column.
                        $max = -1;
                        foreach ($columnSizes as $k => $val) {
                            if ($val['percentage'] > $max) {
                                $stealKey = $k;
                                $max = $val['percentage'];
                            }
                        }
                    }
                    $columnSizes[$stealKey]['percentage'] = $columnSizes[$stealKey]['percentage'] - $neededPercents;

                    $columnSizes[$key]['percentage'] = $this->minPercentage;
                }
            }
        }

        return $columnSizes;
    }

    /**
     * Function to calculate standard deviation.
     * http://stackoverflow.com/a/5434698/697370
     *
     * @param $array
     *
     * @return float
     */
    protected function sd($array)
    {
        if (count($array) == 1) {
            // Return 1 if we only have one value.
            return 1.0;
        }
        // Function to calculate square of value - mean
        $sd_square = function ($x, $mean) { return pow($x - $mean,2); };

        // square root of sum of squares devided by N-1
        return sqrt(array_sum(array_map($sd_square, $array, array_fill(0,count($array), (array_sum($array) / count($array)) ) ) ) / (count($array)-1) );
    }


    /**
     * Helper function to get the (approximate) width of a string. A normal character counts as 1, short characters
     * count as 0.4 and long characters count as 1.3.
     * The minimum width returned is 1.
     *
     * @param $text
     *
     * @return float
     */
    protected function strWidth($text)
    {
        $smallCharacters = array('!', 'i', 'f', 'j', 'l', ',', ';', '.', ':', '-', '|',
            ' ', /* normal whitespace */
            "\xC2", /* non breaking whitespace */
            "\xA0", /* non breaking whitespace */
            "\n",
            "\r",
            "\t",
            "\0",
            "\x0B" /* vertical tab */
        );
        $bigCharacters = array('w', 'm', '—', 'G', 'ß', '@');

        $width = strlen($text);
        foreach (count_chars($text, 1) as $i => $val) {
            if (in_array(chr($i), $smallCharacters)) {
                $width -= (0.6 * $val);
            }
            if (in_array(chr($i), $bigCharacters)) {
                $width += (0.3 * $val);
            }
        }
        if ($width < 1) {
            $width = 1;
        }

        return (float)$width;
    }
}

那就是它! $columnSizes[$colNumber]['percentage']现在为每列包含一个合适的(&#34;完美&#34;)宽度。