Create a folder structure from an .xlsx file

Date: 2018-05-19 01:35:51

Tags: python-3.x powershell

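I have an .xlsx file with a specific structure (please see the picture). I want to turn the information it contains into a folder structure using mkdir. The only thing I have is the .xlsx, and each folder name is the string built from the first 3 columns of the .xlsx (maybe I have to use VBA?):

At the end of the process, there should be 3 new folders, each containing the pictures and a .txt file.

This is how it should look in the end -> workflow and folder structure

The .png files at the URLs should be saved into the created folder, along with a .txt file containing the contents of the two columns ProdDec (Production Description) and Collection.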

I have Python 3.6.x installed, and PowerShell is also available on my Win7 64-bit machine.

Thanks a lot

Rainer Zufall

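1 Answer:

Answer 0: (score: 0)

You can use openpyxl to read the Excel file, collect the relevant information, and then create a directory for each product. The images can be downloaded with the requests library. Here is a tested solution for your problem: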

import os

import openpyxl
import requests

wb = openpyxl.load_workbook('A1.xlsx')

sheet = wb.active

rows = [tuple(cell.value for cell in row if cell.value is not None) for row in sheet] # read each row's cell values, dropping empty cells

dirnames = list()
images = list()
text = list()

for row in rows[1:]: # [1:], ignore column headers for better looping
    if row: # skip rows that contained only empty cells (empty tuple)
        dirnames.append('_'.join(str(cell) for cell in row[:3])) # joins the Brand, Family, and Ref columns; str() also handles numeric cells
        images.append(row[3:-2]) # stores the uris in a tuple
        text.append('\r\n'.join(str(cell) for cell in row[-2:])) # joins the last two columns

for i in range(len(dirnames)):
    if not os.path.exists(dirnames[i]):
        os.mkdir(dirnames[i]) # create the dir
    os.chdir(dirnames[i]) # enter the dir
    print('creating', dirnames[i])
    for j, image in enumerate(images[i]): # for all of the images
        imagereq = requests.get(image, timeout=30) # timeout so a stalled download cannot hang the script
        imagename = 'Img{}.png'.format(j + 1)
        if imagereq.status_code == 200: # prevents filewriting errors for bad requests
            with open(imagename, 'wb') as fp:
                fp.write(imagereq.content)
            print(' ' * 4 + 'image write successful for', imagename)
        else:
            print(' ' * 4 + 'could not download image {}, error'.format(imagename), imagereq.status_code, imagereq.reason)
    with open('ProdDesc_and_Collection.txt', 'wb') as fp:
        fp.write(text[i].encode('utf8'))

    os.chdir('..') # back out of the dir

Update: the code now works with multiple pictures per product and ignores empty cells.
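
The row handling can also be pulled out into small functions, which makes it easier to check without a spreadsheet or a network connection. This is only a rough sketch, assuming the layout the code above implies: three name columns first, then any number of image URLs, then the two text columns. The names `split_row` and `write_product_text` are hypothetical helpers made up for illustration, and building the path with pathlib instead of calling os.chdir is a swapped-in alternative, not what the script above does.

```python
from pathlib import Path


def split_row(row):
    """Split one spreadsheet row into (dirname, image_urls, text).

    Assumed layout: three name columns, then zero or more image URLs,
    then the two text columns. Empty cells (None) are dropped first.
    """
    cells = [str(c) for c in row if c is not None]
    if len(cells) < 5:  # too short to hold 3 name cells plus 2 text cells
        return None
    dirname = '_'.join(cells[:3])      # e.g. Brand_Family_Ref
    image_urls = cells[3:-2]           # everything between names and text
    text = '\r\n'.join(cells[-2:])     # the last two columns
    return dirname, image_urls, text


def write_product_text(base_dir, dirname, text):
    """Create base_dir/dirname if needed and write the .txt file into it."""
    target = Path(base_dir) / dirname
    target.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    out = target / 'ProdDesc_and_Collection.txt'
    out.write_bytes(text.encode('utf8'))
    return out
```

Because the target path is built explicitly, the working directory never changes, so an exception in the middle of a product cannot leave the script stuck inside a subfolder.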