Python pptx (PowerPoint) find and replace text (Ctrl + H)

Time: 2016-06-20 14:15:10

Tags: python python-2.7 powerpoint python-pptx

Short question: how can I use the find-and-replace option (Ctrl + H) with the python-pptx module?

Example code:

from pptx import Presentation

nameOfFile = "NewPowerPoint.pptx" #Replace this with: path name on your computer + name of the new file.

def open_PowerPoint_Presentation(oldFileName, newFileName):
    prs = Presentation(oldFileName)

    prs.save(newFileName)
open_PowerPoint_Presentation('Template.pptx', nameOfFile)

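I have a PowerPoint document named "Template.pptx". With my Python program I add some slides and put some pictures in them. Once all the pictures are in the document, it is saved as another PowerPoint presentation.

The problem is that this "Template.pptx" contains all the old week numbers, such as "Week 20". I want Python to find all of these word combinations and replace them with "Week 25" (for example).

7 answers:

Answer 0 (score: 2)

For those of you who just want some code to copy and paste into your program that finds and replaces text in a PowerPoint while preserving formatting (just like I did), here you go: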

def search_and_replace(search_str, repl_str, input_path, output_path):
    """Search and replace text in PowerPoint while preserving formatting."""
    # Useful links ;)
    # https://stackoverflow.com/questions/37924808/python-pptx-power-point-find-and-replace-text-ctrl-h
    # https://stackoverflow.com/questions/45247042/how-to-keep-original-text-formatting-of-text-with-python-powerpoint
    from pptx import Presentation
    prs = Presentation(input_path)
    for slide in prs.slides:
        for shape in slide.shapes:
            if not shape.has_text_frame:
                continue
            # Replace run by run so that each run keeps its own formatting.
            # A match that is split across several runs is not found this way.
            for paragraph in shape.text_frame.paragraphs:
                for run in paragraph.runs:
                    if str(search_str) in run.text:
                        run.text = run.text.replace(str(search_str), str(repl_str))
    prs.save(output_path)

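The above is a combination of many answers, but it gets the job done. It simply replaces search_str with repl_str wherever search_str occurs.

In the context of this question, you would call: search_and_replace('Week 20', 'Week 25', "Template.pptx", "NewPowerPoint.pptx")

Answer 1 (score: 1)

You have to visit every shape on every slide and look for a match using the available text features. It may not be pretty, because PowerPoint has a habit of splitting runs into what may seem like odd chunks. It does this to support features such as spell checking, but its behavior is unpredictable.

So finding the occurrences with something like Shape.text will probably be the easy part. Replacing them without losing any font formatting may be more difficult, depending on your situation.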
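To illustrate why run splitting makes replacement harder than finding, here is a minimal sketch in plain Python. It works on a list of run texts rather than on a real presentation; the function name runs_spanned and the sample split are assumptions made for illustration, not part of python-pptx.

```python
def runs_spanned(run_texts, needle):
    """Return the indices of the runs touched by the first occurrence of needle.

    Returns an empty list when needle is empty or does not occur in the
    concatenated run texts.
    """
    if not needle:
        return []
    joined = "".join(run_texts)
    start = joined.find(needle)
    if start == -1:
        return []
    end = start + len(needle)
    spanned = []
    offset = 0
    for index, text in enumerate(run_texts):
        run_start, run_end = offset, offset + len(text)
        # A run is touched when its character range overlaps [start, end).
        if run_start < end and run_end > start:
            spanned.append(index)
        offset = run_end
    return spanned


# Hypothetical split of "Summary for Week 20" into three runs.
run_texts = ["Summary for We", "ek 2", "0"]
in_single_run = any("Week 20" in text for text in run_texts)
print(in_single_run)                       # is the match inside one run?
print(runs_spanned(run_texts, "Week 20"))  # which runs does it touch?
```

When the match touches more than one run, a replacement done run by run will miss it, and a replacement done on the joined text has to decide which run's formatting the new text inherits.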

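Answer 2 (score: 1)

I know this question is old, but I have just finished a project that uses Python to update a PowerPoint daily. Basically, a Python script runs every morning, pulls the data for that day from a database, places the data in the PowerPoint, and then launches PowerPoint Viewer to play the presentation.

To answer your question, you have to loop through all the shapes on the slide and check whether the string you are searching for is in shape.text. You can check whether a shape has text by checking whether shape.has_text_frame is true. This avoids errors.

This is where things get tricky. If you simply replace the string in shape.text with the text you want to insert, you will probably lose formatting. shape.text is actually a concatenation of all the text in the shape. That text may be split into many "runs", and all of those runs may have different formatting, which is lost if you write to shape.text or replace part of the string.

On a slide you have shapes, shapes can have a text_frame, text_frames have paragraphs (at least one, always, even if it is blank), and paragraphs can have runs. Any level can carry formatting, and you cannot know in advance how many runs your string is split across.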
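The hierarchy described above (slide, shape, text_frame, paragraph, run) can be walked with a small generator. The following is a minimal sketch, not code from this answer: iter_runs and make_shape are hypothetical names, and the SimpleNamespace objects are stand-ins that only mimic the attribute names python-pptx exposes (shapes, has_text_frame, text_frame.paragraphs, runs, text), so the example runs without a .pptx file.

```python
from types import SimpleNamespace


def iter_runs(slides):
    """Yield (slide_index, shape, paragraph, run) for every text run."""
    for slide_index, slide in enumerate(slides):
        for shape in slide.shapes:
            if not shape.has_text_frame:
                continue  # e.g. pictures carry no text frame
            for paragraph in shape.text_frame.paragraphs:
                for run in paragraph.runs:
                    yield slide_index, shape, paragraph, run


def make_shape(paragraphs):
    """Build a stand-in text shape from a list of paragraphs (lists of run texts)."""
    paras = [
        SimpleNamespace(runs=[SimpleNamespace(text=text) for text in texts])
        for texts in paragraphs
    ]
    return SimpleNamespace(
        has_text_frame=True,
        text_frame=SimpleNamespace(paragraphs=paras),
    )


# Stand-in data: one slide with a two-paragraph text shape and a picture.
picture = SimpleNamespace(has_text_frame=False)
slides = [SimpleNamespace(shapes=[make_shape([["We", "ek 20"], ["Notes"]]), picture])]

for slide_index, shape, paragraph, run in iter_runs(slides):
    print(slide_index, repr(run.text))
```

With a real file you would pass prs.slides from Presentation(...) instead of the stand-in list. Text inside tables or grouped shapes is not reached by this walk and would need extra handling.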

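In my case, I made sure that any string to be replaced was in a shape of its own. You still have to drill all the way down to the runs and set the text there so that all formatting is preserved. Also, the string you matched in shape.text may actually be spread across multiple runs, so when setting the text in the first run, I also set the text in all the other runs of that paragraph to blank.

Random code snippet: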

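from pptx import Presentation

testString = '{{thingToReplace}}'
replaceString = 'this will be inserted'
ppt = Presentation('somepptxfile.pptx')

def replaceText(shape, string, replaceString):
    # this is the hard part
    # you know the string is in there, but it may be across many runs
    # Sketch of the approach described above: write the replaced text into
    # the first run and blank the other runs of that paragraph.
    for paragraph in shape.text_frame.paragraphs:
        runs = paragraph.runs
        whole_text = ''.join(run.text for run in runs)
        if string not in whole_text:
            continue
        runs[0].text = whole_text.replace(string, replaceString)
        for run in runs[1:]:
            run.text = ''


for slide in ppt.slides:
    for shape in slide.shapes:
        if shape.has_text_frame:
            if shape.text.find(testString) != -1:
                replaceText(shape, testString, replaceString)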

Sorry if there are any typos. I'm at work.....

Answer 3 (score: 0)

The following code may help. I found it here

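from pptx import Presentation

search_str = '{{{old text}}}'
repl_str = 'changed Text'
ppt = Presentation('Presentation1.pptx')
for slide in ppt.slides:
    for shape in slide.shapes:
        # Only rewrite shapes that contain the match: assigning shape.text
        # replaces the whole text, so run-level formatting may be lost.
        if shape.has_text_frame and search_str in shape.text:
            shape.text = shape.text.replace(search_str, repl_str)
ppt.save('Presentation1.pptx')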

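Answer 4 (score: 0)

Posting code from my own project, because none of the other answers quite hit the mark for strings in complex text with multiple paragraphs without losing formatting: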


Answer 5 (score: 0)

A merge of the responses above and others, in a way that worked well for me (Python 3). All the original formatting was kept:

from pptx import Presentation

def replace_text(replacements, shapes):
    """Takes dict of {match: replacement, ... } and replaces all matches.
    Currently not implemented for charts or graphics.
    """
    for shape in shapes:
        if not shape.has_text_frame:
            continue
        for match, replacement in replacements.items():
            if str(match) not in shape.text:
                continue
            for paragraph in shape.text_frame.paragraphs:
                runs = paragraph.runs
                whole_text = "".join(run.text for run in runs)
                if not runs or str(match) not in whole_text:
                    continue  # leave paragraphs without a match untouched
                # Collapse the paragraph into its first run: remove the
                # other runs, then write the replaced text into run 0.
                for run in runs[1:]:
                    paragraph._p.remove(run._r)
                runs[0].text = whole_text.replace(str(match), str(replacement))

if __name__ == '__main__':

    prs = Presentation('input.pptx')
    # Collect the shapes from every slide
    shapes = [shape for slide in prs.slides for shape in slide.shapes]

    replaces = {
        '{{var1}}': 'text 1',
        '{{var2}}': 'text 2',
        '{{var3}}': 'text 3',
    }
    replace_text(replaces, shapes)
    prs.save('output.pptx')

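Answer 6 (score: 0)

I ran into a similar problem, where a formatted placeholder was spread over multiple run objects. I wanted to keep the formatting, so I could not do the replacement at the paragraph level. Finally, I figured out a way to replace the placeholder.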

import re

# NOTE: `data` (a mapping from variable name to replacement value) and
# `replace_variable_with` are used below but are not defined in this answer;
# they are assumed to exist elsewhere in the author's project.
variable_pattern = re.compile(r"\{\{(\w+)\}\}")

def process_shape_with_text(shape, variable_pattern):
    if not shape.has_text_frame:
        return

    whole_paragraph = shape.text
    matches = variable_pattern.findall(whole_paragraph)
    if len(matches) == 0:
        return

    # First pass: placeholders that sit entirely inside a single run.
    is_found = False
    for paragraph in shape.text_frame.paragraphs:
        for run in paragraph.runs:
            matches = variable_pattern.findall(run.text)
            if len(matches) == 0:
                continue
            replace_variable_with(run, data, matches)
            is_found = True

    if not is_found:
        # Second pass: the placeholder is split across several runs.
        print("Not found the matched variables in the run segment but in the paragraph, target -> %s" % whole_paragraph)

        matches = variable_pattern.finditer(whole_paragraph)
        space_prefix = re.match(r"^\s+", whole_paragraph)

        match_container = list(matches)
        # (run, new_text) pairs, applied after the scan; a list avoids
        # relying on run objects being hashable.
        need_modification = []
        for i in range(len(match_container)):
            m = match_container[i]
            # re.match returns None when there is no leading whitespace
            path_recorder = space_prefix.group(0) if space_prefix else ""

            (start_0, end_0) = m.span(0)
            (start_1, end_1) = m.span(1)

            if (i + 1) > len(match_container) - 1:
                right = end_0 + 1
            else:
                right = match_container[i + 1].start(0)

            for paragraph in shape.text_frame.paragraphs:
                for run in paragraph.runs:
                    segment = run.text
                    path_recorder += segment

                    if len(path_recorder) >= start_0 + 1 and len(path_recorder) <= right:
                        print("find it")

                        if len(path_recorder) <= start_1:
                            need_modification.append((run, run.text.replace('{', '')))

                        elif len(path_recorder) <= end_1:
                            need_modification.append((run, data[m.group(1)]))

                        elif len(path_recorder) <= right:
                            need_modification.append((run, run.text.replace('}', '')))

        for run, value in need_modification:
            run.text = value
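For comparison, here is a minimal sketch of a simpler way to replace a placeholder that is spread over several runs while leaving the untouched runs alone. It is a different technique from the code in this answer, not a rewrite of it: the replacement is written into the run where the match starts, and the matched characters are cut out of any later runs. The name replace_across_runs is hypothetical, and the SimpleNamespace objects are stand-ins for run objects that only need a writable text attribute.

```python
from types import SimpleNamespace


def replace_across_runs(runs, search, repl):
    """Replace every occurrence of search in the concatenated run texts.

    The replacement goes into the run where the match starts; the matched
    characters are removed from any later runs the match spills into.
    Returns the number of replacements made.
    """
    if not search:
        return 0
    count = 0
    search_from = 0
    while True:
        texts = [run.text for run in runs]
        start = "".join(texts).find(search, search_from)
        if start == -1:
            return count
        end = start + len(search)
        offset = 0
        inserted = False
        for run, text in zip(runs, texts):
            run_start, run_end = offset, offset + len(text)
            offset = run_end
            if run_end <= start or run_start >= end:
                continue  # this run lies outside the match
            # Part of this run covered by the match, in run-local indices.
            local_start = max(start, run_start) - run_start
            local_end = min(end, run_end) - run_start
            middle = "" if inserted else repl
            inserted = True
            run.text = text[:local_start] + middle + text[local_end:]
        count += 1
        # Resume after the inserted text so the replacement is not re-scanned.
        search_from = start + len(repl)


# Stand-in runs: the placeholder "{{week}}" is split over three runs.
runs = [SimpleNamespace(text=t) for t in ["Report for {{", "week", "}} (draft)"]]
replaced = replace_across_runs(runs, "{{week}}", "Week 25")
print(replaced, [run.text for run in runs])
```

With python-pptx you would presumably call this once per paragraph, passing paragraph.runs. The new text inherits the formatting of the run in which the placeholder starts, which is a design choice rather than the only reasonable one.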