如何让不可变的F#更高效?

时间:2011-04-14 18:22:39

标签: f#

我想使用不可变的F#编写大量的C#代码。它是一个设备监视器,当前的实现通过不断从串行端口获取数据并根据新数据更新成员变量来工作。我想将它转移到F#并获得不可变记录的好处,但我在概念验证实现中的第一次拍摄非常慢。

open System
open System.Diagnostics

type DeviceStatus = { RPM         : int;
                      Pressure    : int;
                      Temperature : int }

// I'm assuming my actual implementation, using serial data, would be something like 
// "let rec UpdateStatusWithSerialReadings (status:DeviceStatus) (serialInput:string[])".
// where serialInput is whatever the device streamed out since the previous check: something like
// ["RPM=90","Pres=50","Temp=85","RPM=40","Pres=23", etc.]
// The device streams out different parameters at different intervals, so I can't just wait for them all to arrive and aggregate them all at once.
// I'm just doing a POC here, so want to eliminate noise from parsing etc.
// So this just updates the status's RPM i times and returns the result.
let rec UpdateStatusITimes (status:DeviceStatus) (i:int) = 
    match i with
    | 0 -> status
    | _ -> UpdateStatusITimes {status with RPM = 90} (i - 1)

let initStatus = { RPM = 80 ; Pressure = 100 ; Temperature = 70 }
let stopwatch = new Stopwatch()

stopwatch.Start()
let endStatus = UpdateStatusITimes initStatus 100000000
stopwatch.Stop()

printfn "endStatus.RPM = %A" endStatus.RPM
printfn "stopwatch.ElapsedMilliseconds = %A" stopwatch.ElapsedMilliseconds
Console.ReadLine() |> ignore

这在我的机器上运行大约1400毫秒,而等效的C#代码(具有可变成员变量)在大约310毫秒运行。有没有办法在不失去不变性的情况下加快速度?我希望F#编译器会注意到initStatus和所有中间状态变量从未被重用,因此只是改变场景后面的那些记录,但我猜不是。

4 个答案:

答案 0 :(得分:12)

在F#社区中,只要不是公共接口的一部分,命令式代码和可变数据就不会被所束缚。即,只要您封装它并将其与其余代码隔离,使用可变数据就可以了。为此,我建议如下:

type DeviceStatus =
  { RPM         : int
    Pressure    : int
    Temperature : int }

// one of the rare scenarios in which I prefer explicit classes,
// to avoid writing out all the get/set properties for each field
[<Sealed>]
type private DeviceStatusFacade =
    val mutable RPM         : int
    val mutable Pressure    : int
    val mutable Temperature : int
    new(s) =
        { RPM = s.RPM; Pressure = s.Pressure; Temperature = s.Temperature }
    member x.ToDeviceStatus () =
        { RPM = x.RPM; Pressure = x.Pressure; Temperature = x.Temperature }

let UpdateStatusITimes status i =
    let facade = DeviceStatusFacade(status)
    let rec impl i =
        if i > 0 then
            facade.RPM <- 90
            impl (i - 1)
    impl i
    facade.ToDeviceStatus ()

let initStatus = { RPM = 80; Pressure = 100; Temperature = 70 }
let stopwatch = System.Diagnostics.Stopwatch.StartNew ()
let endStatus = UpdateStatusITimes initStatus 100000000
stopwatch.Stop ()

printfn "endStatus.RPM = %d" endStatus.RPM
printfn "stopwatch.ElapsedMilliseconds = %d" stopwatch.ElapsedMilliseconds
stdin.ReadLine () |> ignore

这样,公共接口不受影响 - UpdateStatusITimes仍然需要并返回一个本质上不可变的DeviceStatus - 但内部UpdateStatusITimes使用可变类来消除分配开销。

编辑:(回复评论)这是我通常喜欢的课程风格,使用主要构造函数和let s +属性而不是val s :

[<Sealed>]
type private DeviceStatusFacade(status) =
    let mutable rpm      = status.RPM
    let mutable pressure = status.Pressure
    let mutable temp     = status.Temperature
    member x.RPM         with get () = rpm      and set n = rpm      <- n
    member x.Pressure    with get () = pressure and set n = pressure <- n
    member x.Temperature with get () = temp     and set n = temp     <- n
    member x.ToDeviceStatus () =
        { RPM = rpm; Pressure = pressure; Temperature = temp }

但是对于简单的门面类,每个属性都是盲目的getter / setter,我觉得这有点乏味。

F#3+允许以下内容,但我仍然没有发现它是个人的改进(除非有一个教条地避免字段):

[<Sealed>]
type private DeviceStatusFacade(status) =
    member val RPM         = status.RPM with get, set
    member val Pressure    = status.Pressure with get, set
    member val Temperature = status.Temperature with get, set
    member x.ToDeviceStatus () =
        { RPM = x.RPM; Pressure = x.Pressure; Temperature = x.Temperature }

答案 1 :(得分:7)

这不会回答你的问题,但它可能值得退一步并考虑大局:

  1. 您认为这个用例的不可变数据结构的优势是什么? F#也支持可变数据结构。
  2. 你声称F#“真的很慢” - 但它只比C#代码慢4.5倍,并且每秒更新超过7000万次......这对你的实际应用来说可能是不可接受的性能吗?您是否有特定的性能目标?有理由相信这种类型的代码会成为您应用程序的瓶颈吗?
  3. 设计始终是权衡。您可能会发现,为了在短时间内记录许多更改,根据您的需要,不可变数据结构会产生令人无法接受的性能损失。另一方面,如果您有一些要求,例如同时跟踪数据结构的多个旧版本,那么不可变数据结构的好处可能会使它们具有吸引力,尽管性能会受到影响。

答案 2 :(得分:7)

我怀疑你所看到的性能问题是由于在循环的每次迭代中克隆记录时所涉及的块内存归零(加上分配它的时间可忽略不计并随后进行垃圾收集)。您可以使用结构重写您的示例:

[<Struct>]
type DeviceStatus =
    val RPM : int
    val Pressure : int
    val Temperature : int
    new(rpm:int, pres:int, temp:int) = { RPM = rpm; Pressure = pres; Temperature = temp }

let rec UpdateStatusITimes (status:DeviceStatus) (i:int) = 
    match i with
    | 0 -> status
    | _ -> UpdateStatusITimes (DeviceStatus(90, status.Pressure, status.Temperature)) (i - 1)

let initStatus = DeviceStatus(80, 100, 70)

现在,性能将接近使用全局可变变量或将UpdateStatusITimes status i重新定义为UpdateStatusITimes rpm pres temp i的性能。这只有在你的结构长度不超过16个字节时才有效,否则它将以与记录相同的缓慢方式被复制。

如果您在评论中暗示过,您打算将其用作共享内存多线程设计的一部分,那么您将需要在某些时候进行可变性。您的选择是a)每个参数的共享可变变量b)一个包含结构的共享可变变量或c)包含可变字段的共享外观对象(如ildjarn的答案)。我会选择最后一个,因为它很好地封装并扩展到超过四个int字段。

答案 3 :(得分:4)

使用如下的元组比原始解决方案快15倍:

type DeviceStatus = int * int * int

let rec UpdateStatusITimes (rpm, pressure, temp) (i:int) = 
    match i with
    | 0 -> rpm, pressure, temp
    | _ -> UpdateStatusITimes (90,pressure,temp) (i - 1)

while true do
  let initStatus = 80, 100, 70
  let stopwatch = new Stopwatch()

  stopwatch.Start()
  let rpm,_,_ as endStatus = UpdateStatusITimes initStatus 100000000
  stopwatch.Stop()

  printfn "endStatus.RPM = %A" rpm
  printfn "Took %fs" stopwatch.Elapsed.TotalSeconds

顺便说一下,你应该在计时时使用stopwatch.Elapsed.TotalSeconds