How to use Pandoc's Python filters to convert md with tikz to html on Windows 8.1

Time: 2015-06-07 18:07:51

Tags: markdown pandoc

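I am trying to use a Pandoc filter to convert a markdown file containing a tikz picture to html. I am on Win 8.1 and have all the dependencies: pdflatex, Python 2.7, ImageMagick, and the pandocfilters Python package. I am using the tikz.py script that John MacFarlane provides on github.

I found a similar question on the Pandoc Google Group, where John MacFarlane suggests wrapping the filter in a Windows batch script (the filter must be an executable). Here is my command line input (the file contents are provided below).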

pandoc -o temp.html --filter .\tikz.bat -s temp.md

But I keep getting the following error.

pandoc: Failed reading: satisfyElem

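The script creates the "tikz-images" subfolder, but it is empty, and so is the resulting output file temp.html.

How can I get this to work? FWIW, the bigger goal is for the input file to be R Markdown, but I want to understand the Pandoc Markdown to HTML process first.

Here are the file contents.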

tikz.bat

python tikz.py %*

temp.md

\begin{tikzpicture}

\draw [<->](-3,0)--(3,0);
\draw (-2,-.2)--(-2,.2);
\draw (-1,-.2)--(-1,.2);
\draw(0,-.2)--(0,.2);
\draw (1,-.2)--(1,.2);
\draw (2,-.2)--(2,.2);
\node[align=left,below] at (-4.5,-0.2) {Cash flow};
\node[align=left,above] at (-4.5,0.2) {Time period};
\node[align=left,above] at (-2,0.2) {-2};
\node[align=left,above] at (-1,0.2) {-1};
\node[align=left,above] at (0,0.2) {0};
\node[align=left,above] at (1,0.2) {+1};
\node[align=left,above] at (2,0.2) {+2};
\node[align=left,below] at (1,-0.2) {\$100};
\node[align=left,below] at (2,-0.2) {\$100};

\end{tikzpicture}

Can this work?

tikz.py

#!/usr/bin/env python

"""
Pandoc filter to process raw latex tikz environments into images.
Assumes that pdflatex is in the path, and that the standalone
package is available.  Also assumes that ImageMagick's convert
is in the path. Images are put in the tikz-images directory.
"""

import hashlib
import re
import os
import sys
import shutil
from pandocfilters import toJSONFilter, Para, Image
from subprocess import Popen, PIPE, call
from tempfile import mkdtemp

imagedir = "tikz-images"


def sha1(x):
    return hashlib.sha1(x.encode(sys.getfilesystemencoding())).hexdigest()


def tikz2image(tikz, filetype, outfile):
    tmpdir = mkdtemp()
    olddir = os.getcwd()
    os.chdir(tmpdir)
    f = open('tikz.tex', 'w')
    f.write("""\\documentclass{standalone}
             \\usepackage{tikz}
             \\begin{document}
             """)
    f.write(tikz)
    f.write("\n\\end{document}\n")
    f.close()
    p = call(["pdflatex", 'tikz.tex'], stdout=sys.stderr)
    os.chdir(olddir)
    if filetype == 'pdf':
        shutil.copyfile(tmpdir + '/tikz.pdf', outfile + '.pdf')
    else:
        call(["convert", tmpdir + '/tikz.pdf', outfile + '.' + filetype])
    shutil.rmtree(tmpdir)


def tikz(key, value, format, meta):
    if key == 'RawBlock':
        [fmt, code] = value
        if fmt == "latex" and re.match("\\\\begin{tikzpicture}", code):
            outfile = imagedir + '/' + sha1(code)
            if format == "html":
                filetype = "png"
            elif format == "latex":
                filetype = "pdf"
            else:
                filetype = "png"
            src = outfile + '.' + filetype
            if not os.path.isfile(src):
                try:
                    os.mkdir(imagedir)
                    sys.stderr.write('Created directory ' + imagedir + '\n')
                except OSError:
                    pass
                tikz2image(code, filetype, outfile)
                sys.stderr.write('Created image ' + src + '\n')
            return Para([Image([], [src, ""])])

if __name__ == "__main__":
    toJSONFilter(tikz)

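Update: As I mentioned in the comments, the caps.py filter shows the same symptoms. Perhaps I should also add the symptoms from python caps.py temp.md, which calls the filter outside of pandoc. My understanding is that this should print the input file to the screen in all caps.

However, when I run python caps.py temp.md from the Windows command prompt, it hangs. I killed the command with CTRL-C, and then I got the following.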

C:\Users\Richard\Desktop\temp>python caps.py temp.md
Traceback (most recent call last):
  File "caps.py", line 15, in <module>
    toJSONFilter(caps)

The same goes for python tikz.py temp.md. It hangs, and then:

C:\Users\Richard\Desktop\temp>python tikz.py temp.md
Traceback (most recent call last):
  File "tikz.py", line 70, in <module>
    toJSONFilter(tikz)

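Update 2: I tried to run the Windows debugger on the command prompt, but I am not sure it worked. Sometimes the command prompt hangs, and the debugger seems to hang as well. Here is the debugger output.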

*** wait with pending attach
Symbol search path is: *** Invalid ***
****************************************************************************
* Symbol loading may be unreliable without a symbol search path.           *
* Use .symfix to have the debugger choose a symbol path.                   *
* After setting your symbol path, use .reload to refresh symbol locations. *
****************************************************************************
Executable search path is: 
ModLoad: 00007ff7`0d920000 00007ff7`0d97d000   C:\windows\system32\cmd.exe
ModLoad: 00007fff`b7c20000 00007fff`b7dcc000   C:\windows\SYSTEM32\ntdll.dll
ModLoad: 00007fff`b5c90000 00007fff`b5dce000   C:\windows\system32\KERNEL32.DLL
ModLoad: 00007fff`b4e40000 00007fff`b4f55000   C:\windows\system32\KERNELBASE.dll
ModLoad: 00007fff`b7b70000 00007fff`b7c1a000   C:\windows\system32\msvcrt.dll
ModLoad: 00007fff`b3070000 00007fff`b307e000   C:\windows\SYSTEM32\winbrand.dll
(1c7c.29a0): Break instruction exception - code 80000003 (first chance)
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\windows\SYSTEM32\ntdll.dll - 
ntdll!DbgBreakPoint:
00007fff`b7cb2cf0 cc              int     3

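Update 3: Here are the files in a Dropbox folder. The folder contains the same files I pasted above, plus the caps.py file taken directly from the Pandoc filters github repo.

1 answer:

Answer 0 (score: 0)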

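Use the -t option followed by a format name rather than a file name with an extension. For example, pandoc -f json -t markdown outputs markdown, -t html outputs html, and so on. By default the output goes to the console; to capture it, use the redirection operator > file.some_extension. So the correct syntax is simply pandoc -f json -t markdown.

Also see the pandoc documentation. If you run into problems, try changing your line: pandoc -o temp.html --filter .\tikz.bat -s temp.md ==> pandoc -t json | ./caps.py latex | pandoc -f json -t html.

Here is how it works.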

                 source format = input_file.html
                      ↓
                   (pandoc) = pandoc -t json input_file.html
                      ↓
              JSON-formatted AST 
                      ↓
                   (filter)    = python $HOME/Downloads/pandocfilters-1.2.4/examples/caps.py
                      ↓
              JSON-formatted AST
                      ↓
                   (pandoc)    =  pandoc -f json -t markdown
                      ↓
                target format = output_file.md
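The (filter) stage in the diagram can be sketched in plain Python. This is only an illustration, assuming Python 3 and the old list-shaped JSON AST shown in this answer; the helper names walk, caps and run_filter are made up here and are not the pandocfilters API. It shows the contract a filter has to satisfy: read a JSON AST from an input stream, transform it, and write JSON to an output stream. A real filter uses stdin and stdout for those streams, which is why a filter started without piped input sits waiting for data.

```python
import io
import json


def walk(node, action):
    """Recursively visit the AST; `action` may replace any {"t", "c"} element."""
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        if "t" in node and "c" in node:
            replaced = action(node["t"], node["c"])
            if replaced is not None:
                return replaced
        return {key: walk(val, action) for key, val in node.items()}
    return node


def caps(key, value):
    """Upper-case every Str element; leave everything else untouched."""
    if key == "Str":
        return {"t": "Str", "c": value.upper()}
    return None


def run_filter(instream, outstream, action):
    """Read a JSON AST from instream, transform it, write JSON to outstream.

    A real filter would pass sys.stdin and sys.stdout here.
    """
    doc = json.load(instream)
    json.dump(walk(doc, action), outstream)


# Demonstration with in-memory streams instead of stdin/stdout.
sample = ('[{"unMeta":{}},[{"t":"Para","c":[{"t":"Str","c":"Hello"},'
          '{"t":"Space","c":[]},{"t":"Str","c":"world!"}]}]]')
out = io.StringIO()
run_filter(io.StringIO(sample), out, caps)
print(out.getvalue())
```

Using in-memory streams keeps the sketch runnable without pandoc installed; swapping in sys.stdin and sys.stdout turns it into something that could sit between the two pandoc calls.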

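Run the commands separately to inspect the output of each stage, and use the pipe | to pass the output from one command to the next: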

 pandoc -t json ~/testing/testing.html | python examples/caps.py | pandoc -f json -t markdown > output_file.md
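If shell pipes are awkward (for example in a Windows command prompt), the same three-stage chain can be driven from Python. The following is a rough sketch, not a tested recipe for this setup: build_pipeline and run_pipeline are hypothetical helper names, and it assumes pandoc is on the PATH and that the filter accepts the target format as its first argument, as in the ./caps.py latex call above.

```python
import subprocess
import sys


def build_pipeline(input_path, filter_script, target_format):
    """Return the three command lines: pandoc -> filter -> pandoc."""
    return [
        ["pandoc", "-t", "json", input_path],
        [sys.executable, filter_script, target_format],
        ["pandoc", "-f", "json", "-t", target_format],
    ]


def run_pipeline(stages):
    """Run each stage, feeding the previous stage's stdout to its stdin."""
    data = None
    for argv in stages:
        proc = subprocess.Popen(
            argv,
            stdin=subprocess.PIPE if data is not None else None,
            stdout=subprocess.PIPE,
        )
        data, _ = proc.communicate(data)
        if proc.returncode != 0:
            raise RuntimeError("stage failed: " + " ".join(argv))
    return data.decode("utf-8")


# Hypothetical usage, with file names taken from the question:
# html = run_pipeline(build_pipeline("temp.md", "caps.py", "html"))
```

Keeping the stages as a plain list makes it easy to run only the first one or two and print the intermediate JSON when something goes wrong.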

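There is no need to install pandocfilters: download the tar file, extract it with tar -xvf file.xyz (or any other tool of your choice), and call the example directly with python dir/to/script.py. Then pipe its output to pandoc again and redirect the result to a file in the desired format. Here it is step by step: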

 $pandoc -t json ~/testing/testing.html
[{"unMeta":{"viewport":{"t":"MetaInlines","c":[{"t":"Str","c":"width=device-width,"},{"t":"Space","c":[]},{"t":"Str","c":"initial-scale=1"}]},"title":{"t":"MetaInlines","c":[]},"description":{"t":"MetaInlines","c":[]}}},[{"t":"Para","c":[{"t":"Str","c":"Hello"},{"t":"Space","c":[]},{"t":"Str","c":"world!"},{"t":"Space","c":[]},{"t":"Str","c":"This"},{"t":"Space","c":[]},{"t":"Str","c":"is"},{"t":"Space","c":[]},{"t":"Str","c":"HTML5"},{"t":"Space","c":[]},{"t":"Str","c":"Boilerplate."}]},{"t":"Para","c":[{"t":"Str","c":"l"}]}]]

Then:

$pandoc -t json ~/testing/testing.html | python examples/caps.py 
[{"unMeta": {"description": {"c": [], "t": "MetaInlines"}, "viewport": {"c": [{"c": "WIDTH=DEVICE-WIDTH,", "t": "Str"}, {"c": [], "t": "Space"}, {"c": "INITIAL-SCALE=1", "t": "Str"}], "t": "MetaInlines"}, "title": {"c": [], "t": "MetaInlines"}}}, [{"c": [{"c": "HELLO", "t": "Str"}, {"c": [], "t": "Space"}, {"c": "WORLD!", "t": "Str"}, {"c": [], "t": "Space"}, {"c": "THIS", "t": "Str"}, {"c": [], "t": "Space"}, {"c": "IS", "t": "Str"}, {"c": [], "t": "Space"}, {"c": "HTML5", "t": "Str"}, {"c": [], "t": "Space"}, {"c": "BOILERPLATE.", "t": "Str"}], "t": "Para"}, {"c": [{"c": "L", "t": "Str"}], "t": "Para"}]]

Finally:

pandoc -t json ~/testing/testing.html | python examples/caps.py | pandoc -f json -t markdown
HELLO WORLD! THIS IS HTML5 BOILERPLATE.

Comments:

diff -y pandoc_json.txt caps_json.txt
[{"unMeta":{"viewport":{"t":"MetaInlines","c":[{"t":"Str","c" / [{"unMeta": {"description": {"c": [], "t": "MetaInlines"}, "v