Efficient way to process the lines of a file

Time: 2017-12-08 08:53:43

Tags: go concurrency

I am trying to learn a few things, so I am writing a script that reads from a CSV file, does some processing, and produces some results.

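I am following the pipeline pattern: one goroutine reads the file line by line with a scanner and sends each line to a channel, and other goroutines consume the channel contents.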

An example of what I want to do: https://gist.github.com/pkulak/93336af9bb9c7207d592

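My problem is this: the CSV file contains a lot of records, and I want to do some math between consecutive lines. Let's say record 1 is r1, record 2 is r2, and so on.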

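When I read the file, I have r1. On the next iteration of the scanner loop, I have r2. I want to know whether r1 and r2 are valid numbers for me. If they are, I do some math between them, then check the following records in the same way. If a pair is not valid, I disregard it and move on to the next records, and so on.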

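Should I handle this while reading the file, or after the lines have been put into the channel, when the channel contents are processed?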

Any suggestions that do not break concurrency?

1 answer:

Answer 0 (score: 1)

I think you should decide whether "r1-r2 are valid numbers for you" inside the "Read the lines into the work queue" function.

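So you should read the current line and, as long as you do not yet have a pair of valid numbers, keep reading the following lines one by one. Once you have a pair, send it into the workQueue channel and start looking for the next pair.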

Here is your code, with the changes:

package main

import (
    "bufio"
    "errors"
    "log"
    "os"
)

var concurrency = 100

type Pair struct {
    line1 string
    line2 string
}

func main() {

    // It will be better to receive file-path from somewhere (like args or something like this)
    filePath := "/path/to/file.csv"

    // This channel has no buffer, so it only accepts input when something is ready
    // to take it out. This keeps the reading from getting ahead of the writers.
    workQueue := make(chan Pair)

    // We need to know when everyone is done so we can exit.
    complete := make(chan bool)

    // Read the lines into the work queue.
    go func() {

        file, e := os.Open(filePath)
        if e != nil {
            log.Fatal(e)
        }
        // Close when the function returns
        defer file.Close()

        scanner := bufio.NewScanner(file)

        // Get pairs and send them into "workQueue" channel
        for {
            line1, e := getNextCorrectLine(scanner)
            if e != nil {
                break
            }
            line2, e := getNextCorrectLine(scanner)
            if e != nil {
                break
            }
            workQueue <- Pair{line1, line2}
        }

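        // Scan stops silently on a read error, so report it if there was one.
        if err := scanner.Err(); err != nil {
            log.Println(err)
        }

        // Close the channel so everyone reading from it knows we're done.
        close(workQueue)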
    }()

    // Now read them all off, concurrently.
    for i := 0; i < concurrency; i++ {
        go startWorking(workQueue, complete)
    }

    // Wait for everyone to finish.
    for i := 0; i < concurrency; i++ {
        <-complete
    }
}

func getNextCorrectLine(scanner *bufio.Scanner) (string, error) {
    var line string
    for scanner.Scan() {
        line = scanner.Text()
        if isCorrect(line) {
            return line, nil
        }
    }
    return "", errors.New("no more lines")
}

func isCorrect(str string) bool {
    // Make your validation here
    return true
}

func startWorking(pairs <-chan Pair, complete chan<- bool) {
    for pair := range pairs {
        doTheWork(pair)
    }

    // Let the main process know we're done.
    complete <- true
}

func doTheWork(pair Pair) {
    // Do the work with the pair
}
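
The two stubs above, isCorrect and doTheWork, are left for you to fill in. As an illustration only, here is one way they might look, assuming each line is a CSV record whose first column holds a number and that the "math" is the difference between the two numbers. The column layout, the helper `firstField`, and the function `difference` are assumptions of mine and do not come from the question.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

type Pair struct {
	line1 string
	line2 string
}

// firstField parses the first comma-separated field of a line as a float.
// The "number in the first column" layout is assumed for illustration.
func firstField(line string) (float64, error) {
	field := strings.SplitN(line, ",", 2)[0]
	return strconv.ParseFloat(strings.TrimSpace(field), 64)
}

// isCorrect reports whether the line starts with a parseable number.
func isCorrect(line string) bool {
	_, err := firstField(line)
	return err == nil
}

// difference is a hypothetical body for doTheWork: second value minus first.
func difference(pair Pair) (float64, error) {
	a, err := firstField(pair.line1)
	if err != nil {
		return 0, err
	}
	b, err := firstField(pair.line2)
	if err != nil {
		return 0, err
	}
	return b - a, nil
}

func main() {
	fmt.Println(isCorrect("10.5,foo")) // prints true
	fmt.Println(isCorrect("n/a,foo"))  // prints false

	d, err := difference(Pair{"10.5,foo", "12,bar"})
	fmt.Println(d, err) // prints 1.5 <nil>
}
```

Because the reader goroutine only sends pairs that already passed isCorrect, the workers never see an invalid line, and the pairing logic stays in a single goroutine where line order is still known.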