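PHP CSV to XML: how to handle a pipe-delimited string

Time: 2015-05-22 17:37:47

Tags: php xml excel csv

First of all, I know this is a rather long and detailed post. If you only want the gist of my question, skip to the TL;DR at the bottom. Thanks in advance to everyone who comments.

I have been building a feature for my client's website. They have an old version of Microsoft Excel on a Mac that does not support .XML, which is the format their shop system uses.

So I need to write a CSV-to-XML converter, and the XML has to conform to the structure the shop component requires. I have already written an XML-to-CSV function, and it works fine.

Here is the XML output of the shop system (I removed the values to protect the client's customers):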

<orders>
  <order>
    <order_id>38</order_id>
    <order_number>000015</order_number>
    <order_status>Authorized</order_status>
    <order_date>0000-00-00 00:00:00</order_date>
    <customer_email>test@someemail.ca</customer_email>
    <order_amount>order total</order_amount>
    <base_order_amount>pre shipping order total</base_order_amount>
    <shipping_type>Basic Shipping</shipping_type>
    <shipping_price> $0.00</shipping_price>
    <billing_first_name>Name</billing_first_name>
    <billing_last_name>B</billing_last_name>
    <billing_address1>PO / Add</billing_address1>
    <billing_address2></billing_address2>
    <billing_city>Town</billing_city>
    <billing_state_province>province</billing_state_province>
    <billing_country>Canada</billing_country>
    <billing_postal_code>postal code</billing_postal_code>
    <billing_phone></billing_phone>
    <emt_quest>test</emt_quest>
    <emt_answ>test</emt_answ>
    <emt_answ_conf>test</emt_answ_conf>
    <shipping_first_name>Name</shipping_first_name>
    <shipping_last_name>B</shipping_last_name>
    <shipping_address1>PO / Add</shipping_address1>
    <shipping_address2></shipping_address2>
    <shipping_city>Town</shipping_city>
    <shipping_state_province>province</shipping_state_province>
    <shipping_country>Canada</shipping_country>
    <shipping_postal_code>postal code</shipping_postal_code>
    <shipping_phone></shipping_phone>
    <items>
      <item>
        <item_name>Sample Item</item_name>
        <item_price>$8.00</item_price>
        <item_quantity>12</item_quantity>
      </item>
      <item>
        <item_name>Sample Item 2</item_name>
        <item_price>$12.00</item_price>
        <item_quantity>12</item_quantity>
      </item>
    </items>
  </order>

Here is the code for my XML-to-CSV function:

<?php

function xml2csv($xmlFile, $xPath) {
    $csvData = "";

    // Load the XML file
    $xml = simplexml_load_file($xmlFile);

    // Nodes to export (hard-coded; the $xPath argument is not used)
    $path = $xml->order;

    // Get headers from the first <order> (must match the path above)
    $headers = get_object_vars($xml->order[0]);

    // Loop through the first row to get headers
    foreach ($headers as $key => $value) {
        $csvData .= $key . ',';
    }

    // Trim off the extra comma
    $csvData = trim($csvData, ',');

    // Add an LF
    $csvData .= "\n";

    foreach ($path as $item) {

        // Loop through the elements of the current <order>
        foreach ($item as $key => $value) {

            // Check for second-generation children of the first-generation child
            if ($key == "items") {
                $itemString = "";
                // Loop through each first-generation child of the order
                foreach ($item->children() as $child) {
                    // Loop through each second-generation child (<item>)
                    foreach ($child as $value) {
                        // Get the value of each third-generation child as $out
                        foreach ($value->children() as $out) {
                            // Combine each value into $itemString for export to .csv
                            $itemString .= $out . "|";
                        }
                    }
                }
                // Place the item string in $csvData and remove the extra pipe
                $csvData .= trim($itemString, "|");
            }
            // Otherwise put the value of the first-generation child in the .csv
            else {
                $csvData .= trim($value) . ',';
            }
        }

        // Trim off the extra comma
        $csvData = trim($csvData, ',');

        // Add an LF
        $csvData .= "\n";
    }

    // Return the CSV data
    return $csvData;
}

When called with the .XML file from the shop system, it outputs the following .CSV file (I used dummy values; the literal 'item price' is intentional):

order_id,order_number,order_status,order_date,customer_email,order_amount,base_order_amount,shipping_type,shipping_price,billing_first_name,billing_last_name,billing_address1,billing_address2,billing_city,billing_state_province,billing_country,billing_postal_code,billing_phone,emt_quest,emt_answ,emt_answ_conf,medicinal_use,shipping_first_name,shipping_last_name,shipping_address1,shipping_address2,shipping_city,shipping_state_province,shipping_country,shipping_postal_code,shipping_phone,items
00,000000,Authorized,0000-00-00 00:00:00,i@me.ca,$00.00,$00.00,Basic Shipping,$0.00,Me,Initial,123 Some Person Street,,Personville,Prov/State,Country,postal,,test,test,test,test,test,test,test,,test,test,test,test,,item name|item price|item quantity
01,000000,Authorized,0000-00-00 00:00:00,i@me.ca,$00.00,$00.00,Basic Shipping,$0.00,Me,Initial,123 Some Person Street,,Personville,Prov/State,Country,postal,,test,test,test,test,test,test,test,,test,test,test,test,,item name|item price|item quantity
02,000000,Authorized,0000-00-00 00:00:00,i@me.ca,$00.00,$00.00,Basic Shipping,$0.00,Me,Initial,123 Some Person Street,,Personville,Prov/State,Country,postal,,test,test,test,test,test,test,test,,test,test,test,test,,item name|item price|item quantity
03,000000,Authorized,0000-00-00 00:00:00,i@me.ca,$00.00,$00.00,Basic Shipping,$0.00,Me,Initial,123 Some Person Street,,Personville,Prov/State,Country,postal,,test,test,test,test,test,test,test,,test,test,test,test,,item name|item price|item quantity
04,000000,Authorized,0000-00-00 00:00:00,i@me.ca,$00.00,$00.00,Basic Shipping,$0.00,Me,Initial,123 Some Person Street,,Personville,Prov/State,Country,postal,,test,test,test,test,test,test,test,,test,test,test,test,,item name|item price|item quantity|item name|item price|item quantity

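The goal is that my client can download a .CSV directly from the shop system (instead of the default .XML), work on it in Excel as they need to in order to process their orders, and then upload the .CSV back to the shop, where it is automatically converted into XML structured as shown above.

Since .CSV is a flat format, I flatten the XML items into a single .CSV string in which each value is separated by a |, a character that is not used in any text on our site. Like this: item name|item price|item quantity

Here is my code that attempts this. I am close, but the output behaves erratically. It throws an undefined offset error on the commented line $itemvalue = $doc->createTextNode($irow[$g]); (as if the loop ran too many times) and it does not produce the expected output either.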

function contains($substring, $string) {
        $pos = strpos($string, $substring);

        if($pos === false) {
                // string needle NOT found in haystack
                return false;
        }
        else {
                // string needle found in haystack
                return true;
        }

}

function csv2xml($csvData) {
    $outputFilename = 'test.xml';

    // Open csv to read
    $input = fopen($csvData, 'rt');

    // Get the headers of the file
    $headers = fgetcsv($input);

    // Create a new dom document with pretty formatting
    $doc = new DomDocument();
    $doc->formatOutput = true;

    // Add a root node to the document
    $root = $doc->createElement('orders');
    $root = $doc->appendChild($root);

    while (($row = fgetcsv($input)) !== FALSE) {

        $container = $doc->createElement('order');

        foreach ($headers as $i => $header) {
            //set temp file name here
            $tempFile = "temp.csv";

            //prepare mockCSV
            $mockCSV = "";
            $mockCSV .= "item_name,item_price,item_quantity";
            $mockCSV .= "\n";

            //check if current property has items data with |
            if (contains("|", $row[$i])) {
                //if it does create array of data
                $item_arr = explode("|", $row[$i]);

                //create header for 'items' node
                $child = $doc->createElement($header);
                $child = $container->appendChild($child);

                //count for items
                $count = 0;
                foreach ($item_arr as $k => $item) {
                    $mockCSV .= trim($item) . ",";
                    if ($count == 2) {
                        // Trim off the extra comma
                        $mockCSV = trim($mockCSV, ',');

                        // Add an LF
                        $mockCSV .= "\n";
                    }
                    $count++;
                }
                // Trim off the extra comma
                $mockCSV = trim($mockCSV, ',');

                // Add an LF
                $mockCSV .= "\n";

                //put mock CSV data in temp file
                $f = fopen($tempFile, "w");
                fwrite($f, $mockCSV);
                fclose($f);

                //get data from temp file
                $iteminput = fopen($tempFile, 'rt');
                //get headers from temp file
                $itemheaders = fgetcsv($iteminput);

                while (($irow = fgetcsv($iteminput)) !== FALSE) {
                    $itemchild = $doc->createElement('item');
                    foreach ($itemheaders as $g => $itemheader) {
                        $subchild = $doc->createElement($itemheader);
                        $subchild = $itemchild->appendChild($subchild);
                        $itemvalue = $doc->createTextNode($irow[$g]);  /* OFFSET HAPPENS HERE */
                        $itemvalue = $subchild->appendChild($itemvalue);
                    }
                }
                $itemchild = $child->appendChild($itemchild);

            }
            else {
                $child = $doc->createElement($header);
                $child = $container->appendChild($child);
                $value = $doc->createTextNode($row[$i]);
                $value = $child->appendChild($value);
            }
        }

        $root->appendChild($container);
    }

    $strxml = $doc->saveXML();
    $handle = fopen($outputFilename, "w");
    fwrite($handle, $strxml);
    fclose($handle);

}

echo csv2xml("test.csv");

?>

The expected output should have the same structure as the XML I posted above, but instead it produces this:

<orders>
  <order>
    <order_id>38</order_id>
    <order_number>000015</order_number>
    <order_status>Authorized</order_status>
    <order_date>0000-00-00 00:00:00</order_date>
    <customer_email>test@someemail.ca</customer_email>
    <order_amount>$96.00</order_amount>
    <base_order_amount>$96.00</base_order_amount>
    <shipping_type>Basic Shipping</shipping_type>
    <shipping_price> $0.00</shipping_price>
    <billing_first_name>Name</billing_first_name>
    <billing_last_name>B</billing_last_name>
    <billing_address1>PO / Add</billing_address1>
    <billing_address2></billing_address2>
    <billing_city>Town</billing_city>
    <billing_state_province>province</billing_state_province>
    <billing_country>Canada</billing_country>
    <billing_postal_code>postal code</billing_postal_code>
    <billing_phone></billing_phone>
    <emt_quest>test</emt_quest>
    <emt_answ>test</emt_answ>
    <emt_answ_conf>test</emt_answ_conf>
    <shipping_first_name>Name</shipping_first_name>
    <shipping_last_name>B</shipping_last_name>
    <shipping_address1>PO / Add</shipping_address1>
    <shipping_address2></shipping_address2>
    <shipping_city>Town</shipping_city>
    <shipping_state_province>province</shipping_state_province>
    <shipping_country>Canada</shipping_country>
    <shipping_postal_code>postal code</shipping_postal_code>
    <shipping_phone></shipping_phone>
    <items>
      <item>
        <item_name></item_name>
        <item_price></item_price>
        <item_quantity></item_quantity>
      </item>
    </items>
  </order>

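and some fields are left without values. It also does not reproduce the double product entry whose source .CSV field looks like item name|item price|item quantity|item name|item price|item quantity

This is my problem: I can't seem to handle the pipe-delimited field correctly, and it does not output as expected. In an earlier version of the code I got all the data, but it did not create separate 'item' nodes.

Any help is much appreciated. At this point I feel it is something simple and I just need another pair of eyes.

More importantly, I feel the code I am using here is quite clumsy, and I am out of practice with PHP. I suspect there is a logic problem in my approach: my way may work, but there must be a more streamlined method. If someone can tell me what that is, that is the answer I am really looking for.

TL;DR starts here: I am trying to convert .CSV data into structured .XML data, using pipe delimiters for the second- and third-generation XML children.

Only one field in my source .CSV file, 'items', contains this kind of information; every other field is a single key with a single value. The data looks like this: item name|item price|item quantity|item name|item price|item quantity

So what I do is check for a | inside the .CSV string currently being processed in the loop. If one is detected, I use explode() to create an array of the values it contains.

I then tried to build a mock CSV file, write it to a temporary file to hold this information, and run the basic CSV-to-XML routine on it, which works elsewhere in my program to put data into the XML DOM document.

Expected output: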

<items>
  <item>
    <item_name>Sample Item</item_name>
    <item_price>$8.00</item_price>
    <item_quantity>12</item_quantity>
  </item>
  <item>
    <item_name>Sample Item 2</item_name>
    <item_price>$8.00</item_price>
    <item_quantity>12</item_quantity>
  </item>
</items>

The output I get:

<items>
  <item>
    <item_name></item_name>
    <item_price></item_price>
    <item_quantity></item_quantity>
  </item>
</items>

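I had to provide a lot of information to illustrate the problem properly, but my question is simple: how do I get the output I want?

1 Answer:

Answer 0: (score: 1)

Let me back up first and provide a routine for CSV to XML, then deal with the pipe elements.

Some remarks:

  • I prefer SimpleXML over DOM because it is easy to use, so I will use it in the example. Of course, the same can be done with DOM.
  • I will use str_getcsv() instead of fgetcsv() so that I can build a working example online.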
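
If the CSV sits in a file on disk, as in the question, fgetcsv() can take the place of str_getcsv(). A minimal sketch, assuming made-up sample data and a php://temp stream as a stand-in for the real file:

```php
<?php
// Sketch only: the sample rows are invented, and php://temp replaces the
// real uploaded file (with a real file you would fopen() its path instead).
$handle = fopen('php://temp', 'r+');
fwrite($handle, "order_id,order_status\n38,Authorized\n39,Shipped\n");
rewind($handle);

$xml = simplexml_load_string('<orders/>');

// The first row holds the field names
$names = fgetcsv($handle, 0, ',', '"', '\\');

while (($fields = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
    // fgetcsv() returns array(null) for a blank line; skip those
    if ($fields === array(null)) {
        continue;
    }
    $order = $xml->addChild('order');
    foreach (array_combine($names, $fields) as $key => $value) {
        $order->addChild($key, $value);
    }
}
fclose($handle);

echo $xml->asXML();
```

The blank-line check is there because an empty line would otherwise reach array_combine() with the wrong number of fields.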

Basic CSV to XML

// XML: set up object
$xml = simplexml_load_string("<orders/>");

// CSV: assume CSV in $c, get it as a whole
$csv = str_getcsv($c, "\n");

// CSV: separate 1st row with field names from the following rows
$names = str_getcsv(array_shift($csv));

// CSV: parse row by row
foreach ($csv as $row) {

    // CSV: combine names as keys => data as values
    $row = array_combine($names, str_getcsv($row));

    // XML: create new <order>
    $xml_order = $xml->addChild("order");

    // CSV: parse a single row
    foreach ($row as $key => $value) {

        // *****
        // XML: create field as child of <order>
        $xml_order->addChild($key, $value);
        // *****
    }
}
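
One caveat with this routine, as far as I know: addChild() does not escape a literal & in the value it is given, so a field such as Salt & Pepper triggers an "unterminated entity reference" warning. A hedged sketch of a small helper that escapes the value first (add_text_child is a made-up name, not part of SimpleXML):

```php
<?php
// Hypothetical helper, not part of the original answer: escape the value
// before handing it to addChild(), so that "&", "<" and ">" survive.
function add_text_child(SimpleXMLElement $parent, $name, $value)
{
    $escaped = htmlspecialchars($value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    return $parent->addChild($name, $escaped);
}

$xml   = simplexml_load_string('<orders/>');
$order = $xml->addChild('order');
add_text_child($order, 'item_name', 'Salt & Pepper <large>');

echo $xml->asXML();
```

If the shop data never contains such characters, the plain addChild($key, $value) call is enough.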

Handling the pipe elements

The following code replaces the lines between the // ***** markers above:
// CSV: check for pipes, attention use strict comparison ===
if (strpos($value, "|") === false) {

    // XML: no pipe, create node as a child of <order>
    $xml_order->addChild($key, $value);

} else {

    // CSV: pipe present, split up data
    $csv_items = str_getcsv($value,"|");

    // XML: create <items> node     
    $xml_items = $xml_order->addChild($key);

    // CSV: iterate over $csv_items, each 3 elements = 1 row
    // chop row after row 
    while (!empty($csv_items)) {

        // XML: create <item> node as child of <items>
        $xml_item = $xml_items->addChild("item");

        // XML: create children of <item> node
        $xml_item->addChild("item_name", array_shift($csv_items));
        $xml_item->addChild("item_price", array_shift($csv_items));
        $xml_item->addChild("item_quantity", array_shift($csv_items));
    }
}
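
An alternative to shifting three values at a time is to group them with array_chunk(). This is only a sketch under my own assumptions: the sample cell value is invented, and an incomplete trailing group (for example after a stray pipe) is simply skipped.

```php
<?php
// Sketch: split the cell on "|" and group the values into rows of three.
$value       = 'Sample Item|$8.00|12|Sample Item 2|$12.00|12';
$item_fields = array('item_name', 'item_price', 'item_quantity');

$xml_order = simplexml_load_string('<order/>');
$xml_items = $xml_order->addChild('items');

foreach (array_chunk(explode('|', $value), count($item_fields)) as $chunk) {
    // Skip an incomplete group instead of emitting empty elements
    if (count($chunk) !== count($item_fields)) {
        continue;
    }
    $xml_item = $xml_items->addChild('item');
    foreach (array_combine($item_fields, $chunk) as $name => $text) {
        $xml_item->addChild($name, $text);
    }
}

echo $xml_order->asXML();
```

Keeping the field names in one array means a fourth item field only has to be added in one place.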

Combined code, without comments

$xml = simplexml_load_string("<orders/>");
$csv = str_getcsv($c, "\n"); // assume CSV in $c
$names = str_getcsv(array_shift($csv));

foreach ($csv as $row) {
    $row = array_combine($names, str_getcsv($row));
    $xml_order = $xml->addChild("order");

    foreach ($row as $key => $value) {

        if (strpos($value, "|") === false) 
            $xml_order->addChild($key, $value);
        else {
            $csv_items = str_getcsv($value,"|");
            $xml_items = $xml_order->addChild($key);

            while (!empty($csv_items)) {
                $xml_item = $xml_items->addChild("item");
                $xml_item->addChild("item_name", array_shift($csv_items));
                $xml_item->addChild("item_price", array_shift($csv_items));
                $xml_item->addChild("item_quantity", array_shift($csv_items));
            }
        }
    }
}

See it working: https://eval.in/368945
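
For completeness, here is a hedged sketch of the combined code wrapped in a function, with indented output produced through DOM. The function name csv_to_orders_xml and the sample data are made up; dropping blank lines and requiring three values per item are my own additions.

```php
<?php
// Sketch only: function name and sample data are invented for illustration.
function csv_to_orders_xml($c)
{
    $xml = simplexml_load_string('<orders/>');

    // Split into lines and drop blank ones (e.g. from a trailing newline).
    // Caveat: this breaks quoted fields that contain line breaks.
    $lines = array_values(array_filter(preg_split('/\R/', $c), 'strlen'));
    $names = str_getcsv(array_shift($lines), ',', '"', '\\');

    foreach ($lines as $line) {
        $row = array_combine($names, str_getcsv($line, ',', '"', '\\'));
        $xml_order = $xml->addChild('order');

        foreach ($row as $key => $value) {
            if (strpos($value, '|') === false) {
                $xml_order->addChild($key, $value);
                continue;
            }
            $csv_items = explode('|', $value);
            $xml_items = $xml_order->addChild($key);

            // Each complete group of 3 values becomes one <item>
            while (count($csv_items) >= 3) {
                $xml_item = $xml_items->addChild('item');
                $xml_item->addChild('item_name', array_shift($csv_items));
                $xml_item->addChild('item_price', array_shift($csv_items));
                $xml_item->addChild('item_quantity', array_shift($csv_items));
            }
        }
    }

    // Hand the tree to DOM only to get indented output
    $dom = dom_import_simplexml($xml)->ownerDocument;
    $dom->formatOutput = true;
    return $dom->saveXML();
}

$c = "order_id,customer_email,items\n"
   . "38,test@someemail.ca,Sample Item|\$8.00|12|Sample Item 2|\$12.00|12\n";

$out = csv_to_orders_xml($c);
echo $out;
```

To write the result to disk as in the question, something like file_put_contents('test.xml', $out) should do.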