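Python: logging module for print statements: duplicate log entries

Time: 2013-08-13 16:54:50

Tags: python logging module system-calls tee

UPDATE: Scroll down to EDIT SECTION (4) for an almost fully working version. I also deleted EDIT SECTION (1) because this post was getting too long, and it was probably the least useful section anyway. The link that was originally in EDIT SECTION (1) is below.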

How do I duplicate sys.stdout to a log file in python?

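Longtime searcher here, but first time asking a question.

Explanation:

I need to redirect print to the log because I am incorporating C code through system calls, and that code logs its messages with print statements. A pile of old Python code written by my colleagues also gets called, and it uses print statements for debugging as well.

Ultimately, I want to be able to use logging.info('message') in my updated code, while also redirecting print statements through the built-in logging module for code that I either cannot change or have not gotten around to updating.

Below is some sample code I came up with to demonstrate my problems concisely.

Questions:

  1. I use the setup below for my logging, but every time I print, I get duplicate entries (and blank lines) in my log. Can anyone explain why this happens?
  2. It would be nice to figure out a better logging setup so that the formatted statement includes the correct module name when I redirect a print command through logging.
  3. My use of this Tee(object) class seems to break things occasionally. See the supporting information section below.
  4. My code:

    Edit: Originally, setuplog.Tee.__init__ contained if os.path.exists(LOGNAME): os.remove(LOGNAME). This has been removed and placed in base.py.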

    setuplog.py:

    #!/usr/bin/python
    import sys
    import os
    import logging
    import logging.config
    
    LOGNAME = 'log.txt'
    
    CONFIG = {
        'version': 1,
        'disable_existing_loggers': True,
        'formatters': {
            'simple': {
                'format': '%(module)s:%(thread)d: %(message)s'
                },
            },
        'handlers': {
            'console': {
                'level': 'NOTSET',
                'class': 'logging.StreamHandler',
                'formatter': 'simple'
                },
            'file': {
                'level': 'NOTSET',
                'class': 'logging.FileHandler',
                'formatter': 'simple',
                'filename': LOGNAME
                },
            },
        'root': {
            'level': 'NOTSET',
            'handlers': ['console', 'file']
            },
        }
    
    
    class Tee(object):
        def __init__(self):
            logging.config.dictConfig(CONFIG)
            self.logger = logging.getLogger(__name__)
            self.stdout = sys.stdout
            sys.stdout = self
        def __del__(self):
            sys.stdout = self.stdout
        def write(self, data):
            self.logger.info(data)
    

    base.py:

    #!/usr/bin/python
    import sys
    import os
    import logging
    # My modules.
    import setuplog
    #import aux2
    
    
    LOGNAME = 'log.txt'
    if os.path.exists(LOGNAME):
        os.remove(LOGNAME)
    
    not_sure_what_to_call_this = setuplog.Tee()
    
    print '1 base'
    logging.info('2 base')
    print '3 base'
    os.system('./aux1.py')
    logging.info('4 base')
    #aux2.aux2Function()
    #logging.info('5 base')
    

    aux1.py:

    #!/usr/bin/python
    import sys
    import os
    import logging
    import setuplog
    
    not_sure_what_to_call_this = setuplog.Tee()
    
    def main():
        print '1 aux1'
        logging.info('2 aux1')
        print '3 aux1'
        logging.info('4 aux1')
    
    
    if __name__ == '__main__':
        main()
    

    aux2.py:

    #!/usr/bin/python
    import sys
    import os
    import logging
    import setuplog
    
    not_sure_what_to_call_this = setuplog.Tee()
    
    def aux2Function():
        print '1 aux2'
        logging.info('2 aux2')
        print '3 aux2'
    

    I then run "./base.py" from the shell, which produces the following output (both in the console and in log.txt):

    setuplog:139833296844608: 1 aux1
    setuplog:139833296844608:
    
    aux1:139833296844608: 2 aux1
    setuplog:139833296844608: 3 aux1
    setuplog:139833296844608:
    
    aux1:139833296844608: 4 aux1
    

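    As you can see, the entries generated with print are duplicated (question 1). I also need to come up with a better convention for displaying the module name (question 2).

    Supporting information for question 3:

    In base.py, if I uncomment "import aux2", "aux2.aux2Function()", and "logging.info('5 base')", here is the new output (straight from my console, since that is the only place the Python error shows up):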

    base:140425995155264: 2 base
    setuplog:140360687101760: 1 aux1
    setuplog:140360687101760:
    
    aux1:140360687101760: 2 aux1
    setuplog:140360687101760: 3 aux1
    setuplog:140360687101760:
    
    aux1:140360687101760: 4 aux1
    base:140425995155264: 4 base
    aux2:140425995155264: 2 aux2
    base:140425995155264: 5 base
    Exception AttributeError: "'NoneType' object has no attribute 'stdout'" in <bound method Tee.__del__ of <setuplog.Tee object at 0x7fb772f58f10>> ignored
    

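    EDIT SECTION (2):

    I have been playing around, and this kind of works. Below are (again) updated versions of the sample code.

    The reasons it only "kind of" works are:

    1. I think exceptions should be avoided at all costs, and this makes use of one.
    2. The log output is now somewhat manual. Let me explain. %(name)s comes out as expected, but I have to set it manually. I would rather have some sort of descriptor pick up the file name automatically, or something similar (picking up the function as a bonus?). %(module)s always shows "setuplog" for print statements (which is correct), even though I would like the reported module to be the one where the print statement originated, not the module containing my class that routes print statements to the logging module.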

      setuplog.py:

      #!/usr/bin/python
      import sys
      import os
      import logging
      import logging.config
      
      def startLog(name):
          logname = 'log.txt'
          config = {
              'version': 1,
              'disable_existing_loggers': True,
              'formatters': {
                  'simple': {
                      'format': '%(name)s:%(module)s:%(thread)s: %(message)s'
                      },
                  },
              'handlers': {
                  'console': {
                      'level': 'NOTSET',
                      'class': 'logging.StreamHandler',
                      'formatter': 'simple'
                      },
                  'file': {
                      'level': 'NOTSET',
                      'class': 'logging.FileHandler',
                      'formatter': 'simple',
                      'filename': logname
                      },
                  },
              'root': {
                  'level': 'NOTSET',
                  'handlers': ['console', 'file'],
                  },
              }
          logging.config.dictConfig(config)
          return logging.getLogger(name)
      
      class Tee():
          def __init__(self, logger):
              self.stdout = sys.stdout
              self.data = ''
              self.logger = logger
              sys.stdout = self
          def __del__(self):
              try:
                  sys.stdout = self.stdout
              except AttributeError:
                  pass
          def write(self, data):
              self.data += data
              self.data = str(self.data)
              if '\x0a' in self.data or '\x0d' in self.data:
                  self.data = self.data.rstrip('\x0a\x0d')
                  self.logger.info(self.data)
                  self.data = ''
      

      base.py:

      #!/usr/bin/python
      import sys
      import os
      import logging
      # My modules.
      import setuplog
      import aux2
      
      LOGNAME = 'log.txt'
      if os.path.exists(LOGNAME):
          os.remove(LOGNAME)
      logger = setuplog.startLog('base')
      setuplog.Tee(logger)
      
      print '1 base'
      logger.info('2 base')
      print '3 base'
      os.system('./aux1.py')
      logger.info('4 base')
      aux2.aux2Function()
      logger.info('5 base')
      

      aux1.py:

      #!/usr/bin/python
      import sys
      import os
      import logging
      import setuplog
      
      
      def main():
          logger = setuplog.startLog('aux1')
          setuplog.Tee(logger)
          print '1 aux1'
          logger.info('2 aux1')
          print '3 aux1'
          logger.info('4 aux1')
      
      
      if __name__ == '__main__':
          main()
      

      aux2.py:

      #!/usr/bin/python
      import sys
      import os
      import logging
      import setuplog
      
      
      def aux2Function():
          logger = setuplog.startLog('aux2')
          setuplog.Tee(logger)
          print '1 aux2'
          logger.info('2 aux2')
          print '3 aux2'
      

      Output:

      base:setuplog:139712687740736: 1 base
      base:base:139712687740736: 2 base
      base:setuplog:139712687740736: 3 base
      aux1:setuplog:140408798721856: 1 aux1
      aux1:aux1:140408798721856: 2 aux1
      aux1:setuplog:140408798721856: 3 aux1
      aux1:aux1:140408798721856: 4 aux1
      base:base:139712687740736: 4 base
      aux2:setuplog:139712687740736: 1 aux2
      aux2:aux2:139712687740736: 2 aux2
      aux2:setuplog:139712687740736: 3 aux2
      

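      EDIT SECTION (3):

      Thanks to a great reply on reddit (http://www.reddit.com/r/learnpython/comments/1kaduo/python_logging_module_for_print_statements/cbn2lef), I was able to develop a workaround for the AttributeError. Instead of using an exception, I turned the class into a singleton.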
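
The error message suggests that the module-level name sys had already been set to None by the time __del__ ran at interpreter shutdown. An alternative that sidesteps __del__ entirely is to restore sys.stdout explicitly instead of relying on object finalization. This is a different technique from the singleton described here, not the poster's method: the sketch below is written for Python 3 and uses contextlib.redirect_stdout from the standard library, and LoggerWriter is a hypothetical name chosen for illustration.

```python
import contextlib
import logging
import logging.handlers


class LoggerWriter(object):
    """Minimal file-like object that logs each non-empty line written."""

    def __init__(self, logger):
        self.logger = logger
        self._buf = ''

    def write(self, data):
        # Buffer partial writes and log only complete, non-blank lines.
        self._buf += data
        while '\n' in self._buf:
            line, _, self._buf = self._buf.partition('\n')
            if line.strip():
                self.logger.info(line)

    def flush(self):
        # Nothing to flush; present because callers may expect it.
        pass


stdout_logger = logging.getLogger('stdout_demo')
stdout_logger.setLevel(logging.INFO)
stdout_logger.propagate = False
# Keep records in memory so the example writes no files.
memory = logging.handlers.BufferingHandler(capacity=100)
stdout_logger.addHandler(memory)

# sys.stdout is swapped only inside the block and restored on exit,
# even if the body raises, so no __del__ hook is needed.
with contextlib.redirect_stdout(LoggerWriter(stdout_logger)):
    print('1 base')
    print('3 base')

captured = [record.getMessage() for record in memory.buffer]
```

The trade-off is that the redirection has to wrap the code that prints, which suits a single entry-point script such as base.py better than a module that is merely imported.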

      Here is the code for the updated Tee class:

      class Tee(object):
          _instance = None
          def __init__(self, logger):
              self.stdout = sys.stdout
              self.data = ''
              self.logger = logger
              sys.stdout = self
          def __new__(cls, *args, **kwargs):
              if not cls._instance:
                  cls._instance = super(Tee, cls).__new__(cls, *args, **kwargs)
              return cls._instance
          def __del__(self):
              sys.stdout = self.stdout
          def write(self, data):
              self.data += data
              self.data = str(self.data)
              if '\x0a' in self.data or '\x0d' in self.data:
                  self.data = self.data.rstrip('\x0a\x0d')
                  self.logger.info(self.data)
                  self.data = ''
      

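      EDIT SECTION (4):

      This almost fully works! It works well enough for me to implement it. The only issue now is making the format of the output more helpful. For example, %(filename)s is setuplog.py for all rerouted print statements. It would be much more useful if %(filename)s were the file in which the print statement originated. Any ideas?

      Also, I had to abandon the dictionary method. The only way I was able to get everything working was by setting up the logger through Python code.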
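
One possible contributor to the trouble with the dictionary approach, stated here as an assumption because the post does not diagnose it: each call to logging.config.dictConfig with 'disable_existing_loggers': True disables loggers that already exist and are not named in the configuration, so configuring more than once can silence a logger that was handed out earlier. The missing "5 base" line in the EDIT SECTION (2) output would be consistent with that. A small Python 3 check of the behavior:

```python
import logging
import logging.config

# A logger that exists before any configuration is applied.
early = logging.getLogger('created_before_config')

# Same flag value as the CONFIG dictionaries in the question.
logging.config.dictConfig({
    'version': 1,
    'disable_existing_loggers': True,
})
disabled_after_true = early.disabled

# Reconfiguring with the flag off re-enables the logger.
logging.config.dictConfig({
    'version': 1,
    'disable_existing_loggers': False,
})
disabled_after_false = early.disabled

print(disabled_after_true, disabled_after_false)
```

If this is the cause, calling dictConfig once at program start, or setting the flag to False, might be enough to keep the dictionary form.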

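      One last point: take a look at aux3.py. If os.system is used instead of subprocess, the order of the logging gets messed up. Does anyone know a way to keep using os.system and still get the correct order (so that I don't have to change every last os.system to subprocess.Popen)?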
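
A likely explanation, offered as an assumption rather than something verified against the original scripts, is that a child started with os.system writes straight to the process's underlying standard output, so its text never passes through the Python-level sys.stdout replacement and reaches the console on its own schedule. Capturing the child's output and handing it to the logger keeps everything in one stream. The helper below is a Python 3 sketch; run_and_log is a hypothetical name, and the child command is a stand-in for a script such as aux1.py.

```python
import logging
import logging.handlers
import subprocess
import sys


def run_and_log(args, logger):
    """Run a command and forward each non-empty output line to the logger."""
    proc = subprocess.run(args, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT,
                          universal_newlines=True)
    for line in proc.stdout.splitlines():
        if line.strip():
            logger.info(line)
    return proc.returncode


child_logger = logging.getLogger('child_demo')
child_logger.setLevel(logging.INFO)
child_logger.propagate = False
# Keep records in memory so the example writes no files.
child_memory = logging.handlers.BufferingHandler(capacity=100)
child_logger.addHandler(child_memory)

# Stand-in for an unmodifiable script that only prints.
child_code = "print('1 aux1'); print('3 aux1')"
status = run_and_log([sys.executable, '-c', child_code], child_logger)
forwarded = [record.getMessage() for record in child_memory.buffer]
```

Wrapping the call in one helper like this would at least reduce the migration from os.system to a one-line change per call site.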

      setuplog.py (you can ignore the functions startDictLog and startFileLog, since they do not work. However, startCodeLog does!):

      #!/usr/bin/python
      import sys
      import os
      import logging
      import logging.config
      
      def startLog(name, propagate):
          '''The only point of this function was to enable me to quickly
          and easily switch how I wanted to configure the logging. So far,
          I have only been able to get the last way working
          (startCodeLog).'''
          #return startDictLog(name)
          #return startFileLog(name)
          return startCodeLog(name, propagate)
      
      def startDictLog(name):
          '''Configure logging using a dictionary.'''
          LOGNAME = 'loop.log'
          DEBUGNAME = 'debug.log'
          config = {
              'version': 1,
              'disable_existing_loggers': True,
              'formatters': {
                  'bare': {
                      # Added the BARE to distinguish between normal prints
                      # and those that get rerouted. In use, I would format it
                      # such that only the message is printed.
                      'format': 'BARE: %(message)s'
                      },
                  'simple': {
                      'format': '%(module)s-%(name)s: %(message)s'
                      },
                  'time': {
                      'format': '%(asctime)s-%(filename)s-%(module)s-%(name)s: %(message)s',
                      'datefmt': '%H:%M:%S'
                      },
                  },
              'handlers': {
                  'console': {
                      'level': 'NOTSET',
                      'class': 'logging.StreamHandler',
                      'formatter': 'bare'
                      },
                  'normal': {
                      'level': 'INFO',
                      'class': 'logging.FileHandler',
                      'formatter': 'simple',
                      'filename': LOGNAME
                      },
                  'debug': {
                      'level': 'NOTSET',
                      'class': 'logging.FileHandler',
                      'formatter': 'time',
                      'filename': DEBUGNAME
                      },
                  },
              'root': {
                  'level': 'NOTSET',
                  'handlers': ['console', 'normal', 'debug'],
                  },
              }
          logging.config.dictConfig(config)
          return logging.getLogger(name)
      
      def startFileLog(name):
          '''Configure logging using a configuration file.'''
          CONFIGFILE = 'logging.conf'
          logging.config.fileConfig(CONFIGFILE)
          return logging.getLogger(name)
      
      def startCodeLog(name, propagate):
          '''Configure logging using this code.'''
          LOGFILE = 'loop.log'
          DEBUGFILE = 'debug.log'
          _logger = logging.getLogger(name)
          _logger.setLevel(logging.NOTSET)
          if propagate in [False, 'n', 'no', 0]:
              _logger.propagate = False
          _console = logging.StreamHandler()
          _normal = logging.FileHandler(LOGFILE)
          _debug = logging.FileHandler(DEBUGFILE)
      
          _bare = logging.Formatter('BARE: %(message)s')
          _simple = logging.Formatter('%(module)s-%(name)s: %(message)s')
          _time = logging.Formatter('%(asctime)s-%(module)s-%(name)s: %(message)s',
                                             datefmt = '%H:%M:%S')
          # I added _complex only here, to the working way of setting up the configuration,
          # in hopes that this data may help someone figure out how to get the printout
          # to be more useful. For example, it's not helpful that the filename is
          # setuplog.py for every print statement. It would be more beneficial to somehow
          # get the filename of where the print statement originated.
          _complex = logging.Formatter('%(filename)s-%(funcName)s-%(name)s: %(message)s')
      
          # Normally this is set to _bare to imitate the output of the old version of the
          # scripts I am updating, but for our purposes, _complex is more convenient.
          _console.setLevel(logging.NOTSET)
          _console.setFormatter(_complex)
      
          # This imitates the format of the logs from versions before I applied this update.
          _normal.setLevel(logging.INFO)
          _normal.setFormatter(_bare)
      
          # This is a new log file I created to help me debug other aspects of the scripts
          # I am updating.
          _debug.setLevel(logging.DEBUG)
          _debug.setFormatter(_time)
      
          _logger.addHandler(_console)
          _logger.addHandler(_normal)
          _logger.addHandler(_debug)
          return _logger
      
      class Tee(object):
          '''Creates a singleton class that tees print statements to the
          handlers above.'''
          _instance = None
          def __init__(self, logger):
              self.stdout = sys.stdout
              self.logger = logger
              sys.stdout = self
              self._buf = ''
              # Part of old method in the write function.
              #self.data = ''
          def __new__(cls, *args, **kwargs):
              '''This is the singleton implementation. This avoids errors with
              multiple instances trying to access the same standard output.'''
              if not cls._instance:
                  cls._instance = super(Tee, cls).__new__(cls, *args, **kwargs)
              return cls._instance
          def write(self, data):
              # This method doesn't work with how I had to implement subprocess.
              #self.data = data.rstrip('\r\n')
              #if self.data:
              #    self.logger.info(self.data)
      
              # Also doesn't seem to work with my subprocess implementation.
              #self.data += data
              #self.data = str(self.data)
              #if '\x0a' in self.data or '\x0d' in self.data:
              #    self.data = self.data.rstrip('\x0a\x0d')
              #    self.logger.info(self.data)
              #    self.data = ''
      
              # Only way I could get it working with my subprocess implementation.
              self._buf = ''.join([self._buf, data])
              while '\n' in self._buf:
                  line, _, tail = self._buf.partition('\n')
                  if line.strip():
                      self.logger.info(line)
                  self._buf = tail
      

      base.py:

      #!/usr/bin/python
      import sys
      import os
      import logging
      import subprocess
      # My modules below.
      import setuplog
      import aux2
      
      # It is assumed that this script will be executed via the command line,
      # and each time we run it, we want new log files.
      LOGNAME = 'loop.log'
      DEBUGNAME ='debug.log'
      if os.path.exists(LOGNAME):
          os.remove(LOGNAME)
      if os.path.exists(DEBUGNAME):
          os.remove(DEBUGNAME)
      
      # It seems more convenient to store the logging configuration elsewhere,
      # hence my module setuplog.
      logger = setuplog.startLog('', True)
      logger = setuplog.startLog('base', False)
      # This initializes sys.stdout being redirected through logging. Can anyone
      # explain how calling this class achieves this? I am a bit fuzzy on my
      # understanding here.
      setuplog.Tee(logger)
      
      # Test to see how it works in this module.
      print '1 base'
      logger.info('2 base')
      print '3 base'
      
      # Below shows how to get logging to work with scripts that can
      # not be modified. In my case, I have C code that I don't want to modify,
      # but I still need to log its output.
      
      # I will have to go through the old code and change all os.system calls
      # to instead utilize subprocess. Too bad because this will take some time
      # and be "busy work", but at least it works. What can ya do, subprocess is
      # the recommended replacement anyway. The positive side is that only the root
      # application needs to change its system calls to use subprocess. The scripts that it
      # calls upon can remain untouched.
      
      aux1_py_path = '"%s"' % os.path.join(os.path.dirname(__file__), 'aux1.py')
      #os.system(aux1_py_path) # Example to show how os.system doesn't work.
      print 'aux1_py_path:', aux1_py_path
      sys_call = subprocess.Popen(aux1_py_path, shell=True, stdout=subprocess.PIPE,
                                  stderr=subprocess.PIPE)
      sys_stdout, sys_stderr = sys_call.communicate()
      print sys_stdout
      print sys_stderr
      
      # This is to ensure that the order of the logging events is correct. With an
      # old method I used (not with logging, simply redirecting stdout into an
      # opened file), the order would often get messed up.
      logger.info('4 base')
      # This example is to show how to get logging to work with Python scripts I am
      # willing to modify, at least partially. See aux2.py. It is simply a print
      # statement, followed by a logging statement, followed by another print
      # statement. They all output properly with this method.
      aux2.aux2Function()
      # Again to ensure the order of events is correct.
      logger.info('5 base')
      

      aux1.py:

      #!/usr/bin/python
      import logging
      import subprocess
      import os
      
      def main():
          '''We expect the print statements to go through, as they are
          being sent to logging. However, these logging statements
          do nothing as no logger has been instantiated. This is the
          behavior we should expect, as this script mimics a script
          that we would not modify, so it would not have logging calls
          anyway.'''
          print '1 aux1'
          logging.info('2 aux1')
          print '3 aux1'
          logging.info('4 aux1')
      
          # Here, neither option works unless the root of all these calls
          # was made with subprocess and in the script that called the
          # tee class. If both conditions are met, subprocess works as it
          # should. However, os.system returns the print out from the call
          # out of order.
      
          aux3_py_path = '"%s"' % os.path.join(os.path.dirname(__file__), 'aux3.py')
          #os.system(aux3_py_path)
          sys_call = subprocess.Popen(aux3_py_path, shell=True,
                                      stdout=subprocess.PIPE,
                                      stderr=subprocess.PIPE)
          sys_stdout, sys_stderr = sys_call.communicate()
          print sys_stdout
          print sys_stderr
      
      if __name__ == '__main__':
          main()
      

      aux2.py:

      #!/usr/bin/python
      import sys
      import os
      import logging
      import setuplog
      
      def aux2Function():
          # The following two lines are not necessary for most functionality.
          # However, with them in place, %(name)s in the format descriptor of
          # logging will correctly be 'aux2' rather than 'base'. Both lines are
          # required for this to work.
          logger = setuplog.startLog('aux2', False)
          setuplog.Tee(logger)
      
          print '1 aux2'
          logging.info('2 aux2')
          print '3 aux2'
      

      aux3.py:

      #!/usr/bin/python
      import logging
      
      def main():
          '''See __doc__ for aux1.py. Again, we don't expect the logging.info
          to work, but that's okay because theoretically, this is some script
          we can't modify that simply generates output with print or print
          like functions.'''
          print '1 aux3'
          logging.info('2 aux3')
          print '3 aux3'
      
      if __name__ == '__main__':
          main()
      

      logging.conf (does not work):

      [loggers]
      keys=root
      
      [handlers]
      keys=console,normal,debug
      
      [formatters]
      keys=bare,simple,time
      
      [logger_root]
      level=NOTSET
      handlers=console,normal,debug
      
      [handler_console]
      level=NOTSET
      class=StreamHandler
      formatter=bare
      args=(sys.stdout,)
      
      [handler_normal]
      level=INFO
      class=FileHandler
      formatter=simple
      args=('loop.log',)
      
      [handler_debug]
      level=DEBUG
      class=FileHandler
      formatter=time
      args=('debug.log',)
      
      [formatter_bare]
      format=%(message)s
      
      [formatter_simple]
      format=%(module)s-%(name)s: %(message)s
      
      [formatter_time]
      format=%(asctime)s-%(filename)s-%(module)s-%(name)s: %(message)s
      datefmt=%H:%M:%S
      

1 Answer:

Answer 0: (score: 3)

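This is at least a partial answer to your first question. You get those blank lines on every print because a print statement in Python 2.x may call stdout.write() twice: once with the data from the expressions evaluated in the statement, and again if the optional newline has not been suppressed by a trailing comma.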
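
The double call is easy to observe with a stand-in object that records every write(). The sketch below is written for Python 3, where print is a function but CPython still issues the text and the line ending as separate writes; WriteRecorder is a hypothetical name used only for illustration.

```python
class WriteRecorder(object):
    """Stand-in for sys.stdout that remembers each write() call."""

    def __init__(self):
        self.calls = []

    def write(self, data):
        self.calls.append(data)

    def flush(self):
        # File-like objects are expected to offer flush(); nothing to do.
        pass


recorder = WriteRecorder()
print('1 base', file=recorder)

# One call carries the text and a second call carries only the newline.
# Logging each call separately is what produces the blank entries.
for call in recorder.calls:
    print(repr(call))
```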

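Also, if an INFO-level message is sent to a logger set at level 'NOTSET', the message is echoed on sys.stderr by default, according to the logger documentation, which is why you see console output even while Tee is in effect. I do not see duplicate log entries.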
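
The stderr part can be checked directly: a StreamHandler created without a stream argument, which is how the 'console' handler in the question is declared, attaches to sys.stderr. A minimal check, again in Python 3:

```python
import logging
import sys

# No stream argument, mirroring the 'console' handler in CONFIG.
console = logging.StreamHandler()

# The handler targets stderr, so replacing sys.stdout with a Tee does not
# silence it, and the handler does not feed its output back into the Tee.
uses_stderr = console.stream is sys.stderr
print(uses_stderr)
```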

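To prevent the blank lines, try using this Tee class definition. Note the modification to the write() method (updated to be a singleton to match "EDIT SECTION (3)"):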

class Tee(object):
    _instance = None
    def __init__(self, logger):
        self.stdout = sys.stdout
        self.logger = logger
        sys.stdout = self
    def __new__(cls, *args, **kwargs):
        if not cls._instance:
            cls._instance = super(Tee, cls).__new__(cls, *args, **kwargs)
        return cls._instance
    def __del__(self):
        sys.stdout = self.stdout
    def write(self, data):
        data = data.rstrip('\r\n')
        if data: # anything left?
            self.logger.info(data)
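
For the remaining formatting question (having rerouted prints report where they came from rather than setuplog), one possible direction is to look up the caller's frame inside write() and attach it to the record through the extra argument of the logging call. The sketch below is Python 3 and is not taken from the question: OriginTee, origin_file, and origin_func are hypothetical names, and sys._getframe is a CPython implementation detail.

```python
import logging
import logging.handlers
import os
import sys


class OriginTee(object):
    """File-like object that logs complete lines along with their origin."""

    def __init__(self, logger):
        self.logger = logger
        self._buf = ''

    def write(self, data):
        # print() is implemented in C, so the frame directly above write()
        # belongs to the Python code that called print().
        caller = sys._getframe(1)
        origin = {
            'origin_file': os.path.basename(caller.f_code.co_filename),
            'origin_func': caller.f_code.co_name,
        }
        self._buf += data
        while '\n' in self._buf:
            line, _, self._buf = self._buf.partition('\n')
            if line.strip():
                self.logger.info(line, extra=origin)

    def flush(self):
        pass


demo_logger = logging.getLogger('origin_demo')
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False
# Keep records in memory so the example writes no files.
demo_memory = logging.handlers.BufferingHandler(capacity=100)
demo_logger.addHandler(demo_memory)

tee = OriginTee(demo_logger)


def noisy():
    # Assigning sys.stdout = tee would route every print the same way.
    print('hello from noisy', file=tee)


noisy()
record = demo_memory.buffer[0]
```

A formatter such as '%(origin_file)s-%(origin_func)s-%(name)s: %(message)s' could then display these fields. Such a formatter expects every record on that handler to carry them, so direct logger.info calls would need to pass the same extra keys.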

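With this, along with all your other updates, it seems to work for me (including the uncommented aux2 stuff). Are there still problems from your point of view? If so, your initially long question has become even longer and should be cleaned up completely, leaving only the latest code and focusing on any remaining issues.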