Calculating the standard deviation from a text file

Time: 2019-01-13 23:39:07

Tags: python

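I am trying to calculate the standard deviation of all the data in the "ClosePrices" column; see the pastebin: https://pastebin.com/JtGr672m

We need to calculate a single standard deviation from all 1029 floats.

This is my code: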

ins1 = open("bijlage.txt", "r")
for line in ins1:

        numbers = [(n) for n in number_strings] 
        i = i + 1
        ClosePriceSD = []
        ClosePrice = float(data[0][5].replace(',', '.'))
        ClosePriceSD.append(ClosePrice)

def sd_calc(data):
    n = 1029

    if n <= 1:
        return 0.0

    mean, sd = avg_calc(data), 0.0

    # calculate stan. dev.
    for el in data:
        sd += (float(el) - mean)**2
    sd = math.sqrt(sd / float(n-1))

    return sd

def avg_calc(ls):
    n, mean = len(ls), 0.0

    if n <= 1:
        return ls[0]

    # calculate average
    for el in ls:
        mean = mean + float(el)
    mean = mean / float(n)

    return mean
print("Standard Deviation:")
print(sd_calc(ClosePriceSD))
print()

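So what I want to calculate is the standard deviation of all the floats in the "ClosePrices" column.

Well, I have this line, "ClosePrice = float(data[0][5].replace(',', '.'))", which should compute the standard deviation from all the floats under ClosePrice, but it only uses the value at data[0][5]. I want it to compute a single standard deviation from all 1029 floats under ClosePrice.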

2 Answers:

Answer 0: (score: 0)

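You haven't really specified what the problem or error is. It may not help if this is a school project, but you could install scipy, which has a standard deviation function. In that case, you just pass your array in as the argument. Can you elaborate on the problem you are running into? Does the current code produce an error?

Edit: Looking at the data, you need the 6th element (ClosePrice) of each row. If your functions work correctly and all you need is an array of ClosePrice values, this is what I would suggest.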

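import math

data = []

# Read the file and skip the first line, which is a header
with open("bijlage.txt", "r") as ins1:
    lines = [line.rstrip('\n') for line in ins1][1:]

for line in lines:
    fields = line.split(';')  # columns are separated by semicolons
    # ClosePrice is the 6th column (index 5) and uses a decimal comma
    data.append(float(fields[5].replace(',', '.')))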

def sd_calc(data):
    n = len(data)  # number of values (1029 in the question's file)

    if n <= 1:
        return 0.0

    mean, sd = avg_calc(data), 0.0

    # calculate stan. dev.
    for el in data:
        sd += (float(el) - mean)**2
    sd = math.sqrt(sd / float(n-1))

    return sd

def avg_calc(ls):
    n, mean = len(ls), 0.0

    if n <= 1:
        return ls[0]

    # calculate average
    for el in ls:
        mean = mean + float(el)
    mean = mean / float(n)

    return mean
print("Standard Deviation:")
print(sd_calc(data))
print()
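
A quick way to check a hand-written function like this is to compare it with statistics.stdev from the Python standard library, which also divides by n - 1. The sketch below is only an illustration: sample_sd is a hypothetical condensed version of sd_calc, and the prices are made-up values, not taken from bijlage.txt.

```python
import math
import statistics

def sample_sd(values):
    """Sample standard deviation (n - 1 denominator), as in sd_calc above."""
    n = len(values)
    if n <= 1:
        return 0.0
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

# Made-up prices, for illustration only
prices = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

manual = sample_sd(prices)
library = statistics.stdev(prices)
print(manual, library)
```

If the two numbers agree, the manual loop is doing the same thing as the library call, and any remaining problem is in how the file is read.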

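Answer 1: (score: 0)

I think your error is at the very start, in the for loop. You have for line in ins1, but you never use line inside the loop. Inside the loop you also use number_strings and data, which were not defined beforehand.

Here is how to extract the data from the txt file.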

with open("bijlage.txt", "r") as ff:
    ll = ff.readlines() #extract a list, each element is a line of the file

data = []
for line in ll[1:]: # skip the first line, which is a header
    d = line.split(';')[5] # split the line on semicolons and keep the element at index 5
    data.append(float(d.replace(',', '.'))) # replace the decimal comma with a dot and convert to a float

print(data) # data is a list with all the numbers you want

You should be able to compute the mean and standard deviation from here.
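
One possible way to finish the job, as a sketch: once data is a list of floats, the standard library's statistics module can compute both values. The example assumes the file layout described in this answer (a header line, semicolon-separated fields, decimal commas in the column at index 5). The io.StringIO object stands in for bijlage.txt, and its column names (other than ClosePrices) and values are invented for illustration.

```python
import io
import statistics

# Hypothetical stand-in for bijlage.txt: a header line, then semicolon-separated
# rows with a decimal comma in the column at index 5. All values are made up.
sample = io.StringIO(
    "Date;Open;High;Low;Volume;ClosePrices\n"
    "2019-01-02;1;2;3;4;10,5\n"
    "2019-01-03;1;2;3;4;11,5\n"
    "2019-01-04;1;2;3;4;12,5\n"
)

lines = sample.readlines()
# Skip the header, take the field at index 5, and convert "10,5" to 10.5
data = [float(line.split(';')[5].replace(',', '.')) for line in lines[1:]]

mean = statistics.mean(data)
sd = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
print("Mean:", mean)
print("Standard Deviation:", sd)
```

With the real file, replacing the StringIO object with open("bijlage.txt", "r") should give one standard deviation over every ClosePrices value. statistics.pstdev is the alternative if the population formula (dividing by n) is wanted instead.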
