I am working with a large data set (~130,000 records), and I have managed to transform it into the format I want (CSV).
Here is a simplified example of what the List looks like:
"Surname1, Name1;Address1;State1;YES;Group1"
"Surname2, Name2;Address2;State2;YES;Group2"
"Surname2, Name2;Address2;State2;YES;Group1"
"Surname3, Name3;Address3;State3;NO;Group1"
"Surname1, Name1;Address2;State1;YES;Group1"
Now I want to merge records whenever column 1, column 2 and column 3 all match, like this:
Output:
"Surname1, Name1;Address1;State1;YES;Group1"
"Surname2, Name2;Address2;State2;YES;Group2 Group1"
"Surname3, Name3;Address3;State3;NO;Group1"
"Surname1, Name1;Address2;State1;YES;Group1"
This is what I have so far:
output.GroupBy(x => new { c1 = x.Split(';')[0], c2 = x.Split(';')[1], c3 = x.Split(';')[2] }).Select(//have no idea what should go here);
Answer 0 (score: 2)
First, project the columns you need into an anonymous type:
var query = from r in output
            let columns = r.Split(';')
            select new { c1 = columns[0], c2 = columns[1], c3 = columns[2], c5 = columns[4] };
Then create the groups, this time using the anonymous objects defined in the previous query:
var result = query.GroupBy(e => new { e.c1, e.c2, e.c3 })
                  .Select(g => new { SurName = g.Key.c1,
                                     Name = g.Key.c2,
                                     Address = g.Key.c3,
                                     // c5 is the group column projected in the first query
                                     Groups = String.Join(",", g.Select(e => e.c5)) });
I know I left out some columns, but I think you get the idea.
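To make the idea concrete, here is a minimal sketch that combines both steps into one query and rebuilds the full line the question asks for. The class name `RecordMerger` and the method `Merge` are hypothetical names, not part of the answer above. It assumes every record has at least five ';'-separated columns, that the fourth (YES/NO) column can be taken from the first record of each group, and that group names should be joined with a space as in the question's expected output.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RecordMerger
{
    // Merges records whose first three ';'-separated columns match.
    // The first four columns come from the first record of each group;
    // the fifth column of every member is appended, separated by spaces.
    public static List<string> Merge(IEnumerable<string> records)
    {
        var query = from r in records
                    let columns = r.Split(';')
                    group columns by new { c1 = columns[0], c2 = columns[1], c3 = columns[2] } into g
                    let first = g.First()
                    select string.Join(";", first.Take(4)) + ";" +
                           string.Join(" ", g.Select(c => c[4]));

        // ToList() forces evaluation of the deferred query.
        return query.ToList();
    }
}
```

Calling `RecordMerger.Merge(output)` on the five sample records from the question should yield the four merged lines shown there, since LINQ to Objects keeps groups in the order their keys first appear.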
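PS: I split the logic into two queries only for readability. You can combine them into a single query, and that will not change performance, because LINQ uses deferred evaluation.
Answer 1 (score: 0)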
This is how I would do it:
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        List<string> input = new List<string> {
            "Surname1, Name1;Address1;State1;YES;Group1",
            "Surname2, Name2;Address2;State2;YES;Group2",
            "Surname2, Name2;Address2;State2;YES;Group1",
            "Surname3, Name3;Address3;State3;NO;Group1",
            "Surname1, Name1;Address2;State1;YES;Group1",
        };
        var transformed = input.Select(s => s.Split(';'))
            // Key: the first 4 columns (the comparer only looks at the first 3).
            .GroupBy(s => new string[] { s[0], s[1], s[2], s[3] },
                     // Result: key columns joined by ';' plus the last column of each member.
                     (key, elements) => string.Join(";", key) + ";" + string.Join(" ", elements.Select(e => e.Last())),
                     new MyEqualityComparer())
            .ToList();
    }
}

internal class MyEqualityComparer : IEqualityComparer<string[]>
{
    // Two keys are equal when their first three columns match.
    public bool Equals(string[] x, string[] y)
    {
        return x[0] == y[0] && x[1] == y[1] && x[2] == y[2];
    }

    // The hash must be built from the same three columns that Equals compares.
    public int GetHashCode(string[] obj)
    {
        int hashCode = obj[0].GetHashCode();
        hashCode = hashCode ^ obj[1].GetHashCode();
        hashCode = hashCode ^ obj[2].GetHashCode();
        return hashCode;
    }
}
Treat the first 4 columns as the grouping key, but use only the first 3 columns for the comparison (hence the custom MyEqualityComparer).
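Then, once you have the (key, elements) groups, transform each one by joining the elements of the key with ';' (remember, the key consists of the first 4 columns) and appending the last element of every member of the group, joined with spaces.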
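As a rough illustration of why a custom comparer is needed at all: arrays in .NET are compared by reference by default, so grouping on a freshly created string[] key without a comparer puts every record in its own group. The sketch below counts the groups both ways. `ArrayKeyDemo` and `FirstThreeColumnsComparer` are hypothetical names introduced only for this example, and the comparer simply mirrors the idea of comparing the first three columns.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ArrayKeyDemo
{
    // Hypothetical comparer: two keys are equal when their first three columns match.
    private sealed class FirstThreeColumnsComparer : IEqualityComparer<string[]>
    {
        public bool Equals(string[] x, string[] y) =>
            x[0] == y[0] && x[1] == y[1] && x[2] == y[2];

        public int GetHashCode(string[] obj) =>
            obj[0].GetHashCode() ^ obj[1].GetHashCode() ^ obj[2].GetHashCode();
    }

    // Default equality for arrays is reference equality,
    // so each newly allocated key array forms its own group.
    public static int CountGroupsWithoutComparer(IEnumerable<string> records) =>
        records.Select(r => r.Split(';'))
               .GroupBy(c => new[] { c[0], c[1], c[2], c[3] })
               .Count();

    // With the comparer, keys are matched on the contents of the first three columns.
    public static int CountGroupsWithComparer(IEnumerable<string> records) =>
        records.Select(r => r.Split(';'))
               .GroupBy(c => new[] { c[0], c[1], c[2], c[3] }, new FirstThreeColumnsComparer())
               .Count();
}
```

For the five sample records, the version without a comparer produces five groups (nothing is merged), while the version with the comparer produces four.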