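I have an .xlsx file with a specific structure (please see the picture). I want to process the information it contains into a certain folder structure via mkdir.
The only thing I have is the .xlsx file, and the folder names are built from the strings in its first 3 columns (maybe I have to use VBA?):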
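At the end of the process there should be 3 new folders, each containing the pictures and a .txt file.
This is how it should look in the end -> workflow and folder structure
The .png files behind the URLs should be downloaded into the created folders, along with a .txt file containing the contents of the two columns ProdDec (Production Description) and Collection.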
I have Python 3.6.x installed, and PowerShell is also available on my Win7 64-bit machine.
Thanks a lot
Rainer Zufall
Answer 0 (score: 0)
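You can use openpyxl to read the Excel file, collect the relevant information, and then create a directory for each product. The images can be downloaded with the requests library. Here is a tested solution to your problem: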
import os

import openpyxl
import requests

wb = openpyxl.load_workbook('A1.xlsx')
sheet = wb.active
# read every row as a tuple of its non-empty cell values
rows = [tuple(cell.value for cell in row if cell.value is not None) for row in sheet]

dirnames = []
images = []
text = []
for row in rows[1:]:  # [1:] skips the column headers
    if row:  # skip rows that are completely empty
        dirnames.append('_'.join(str(v) for v in row[:3]))  # joins the Brand, Family, and Ref columns
        images.append(row[3:-2])  # stores the URLs in a tuple
        text.append('\r\n'.join(str(v) for v in row[-2:]))  # joins the last two columns

for i in range(len(dirnames)):
    if not os.path.exists(dirnames[i]):
        os.mkdir(dirnames[i])  # create the dir
    os.chdir(dirnames[i])  # enter the dir
    print('creating', dirnames[i])
    for j, image in enumerate(images[i]):  # for all of the images
        imagereq = requests.get(image)
        imagename = 'Img{}.png'.format(j + 1)
        if imagereq.status_code == 200:  # prevents file-writing errors for bad requests
            with open(imagename, 'wb') as fp:
                fp.write(imagereq.content)
            print(' ' * 4 + 'image write successful for', imagename)
        else:
            print(' ' * 4 + 'could not download image {}, error'.format(imagename), imagereq.status_code, imagereq.reason)
    with open('ProdDesc_and_Collection.txt', 'wb') as fp:
        fp.write(text[i].encode('utf8'))
    os.chdir('..')  # back out of the dir
Update: the code now works with multiple pictures per product and ignores empty cells.
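To make the row slicing easier to check without an Excel file or a network connection, here is a minimal sketch of the same idea as two small functions. The names parse_row and write_text_file and the sample row are hypothetical and not part of the answer above. The sketch assumes the same layout: three naming columns first, the two text columns last, and any number of image URLs in between. It also builds paths with pathlib instead of calling os.chdir, which is a swapped-in design choice rather than what the answer does.

```python
from pathlib import Path
import tempfile


def parse_row(row):
    """Split one worksheet row into (dirname, image_urls, text).

    Hypothetical helper: assumes three naming columns first, two text
    columns last, and any number of image URLs in between.
    """
    cells = [str(v) for v in row if v is not None]  # drop empty cells
    if len(cells) < 5:
        return None  # too few values to be a product row
    dirname = '_'.join(cells[:3])
    image_urls = cells[3:-2]
    text = '\r\n'.join(cells[-2:])
    return dirname, image_urls, text


def write_text_file(base_dir, dirname, text):
    """Create base_dir/dirname and write the text file there, without os.chdir."""
    target = Path(base_dir) / dirname
    target.mkdir(parents=True, exist_ok=True)
    out = target / 'ProdDesc_and_Collection.txt'
    out.write_bytes(text.encode('utf8'))
    return out


# Made-up sample row for illustration; None stands for an empty cell.
sample = ('BrandA', 'FamilyB', 'Ref01',
          'http://example.com/1.png', None, 'http://example.com/2.png',
          'A description', 'Spring')

with tempfile.TemporaryDirectory() as tmp:
    dirname, image_urls, text = parse_row(sample)
    path = write_text_file(tmp, dirname, text)
    print(dirname, len(image_urls), path.name)
```

One limitation of dropping empty cells, shared with the code above: if one of the last two text columns is empty, the slices shift and an image URL ends up in the .txt file, so this only holds when both text columns are always filled.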