Logger from paddleocr prints twice when imported from external code #13876

Closed
3 tasks done
cwmore opened this issue Sep 16, 2024 · 1 comment

cwmore commented Sep 16, 2024

🔎 Search before asking

  • I have searched the PaddleOCR Docs and found no similar bug report.
  • I have searched the PaddleOCR Issues and found no similar bug report.
  • I have searched the PaddleOCR Discussions and found no similar bug report.

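🐛 Bug description

The code below should print the message only once, but once the module is imported this way the message is printed twice: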

import paddleocr.ppocr.utils.logging as utilLogging
logger = utilLogging.get_logger()
logger.info("test!")

[Screenshot: selected region, 2024-09-16 08:42:40]

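I want to get hold of paddleocr's logger so that I can configure it dynamically. Importing it with from paddleocr.ppocr.utils.logging import get_logger does not work either. After reading the code, I found that the global variable logger_initialized is not consistent after the import. Logically, once a logger has been initialized, get_logger should simply return it. At import time, files such as ppocr/utils/predict_rec.py initialize the default 'ppocr' logger, but as soon as third-party code imports the module, logger_initialized is empty again.
(Others have previously suggested working around this with the logreset approach, but this looks like a bug to me. Passing a different name to get_logger avoids the problem, but then you still cannot get the logger that was initialized by default.)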
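
One plausible explanation for logger_initialized appearing empty again is that the same file gets loaded under two different module names (for example paddleocr.ppocr.utils.logging and ppocr.utils.logging). That is an assumption here, not something confirmed in this thread. The sketch below reproduces the effect with a hypothetical stand-in module, not the real PaddleOCR code: each load gets its own copy of the module-level dict, but both copies share the same named logger, so the handler is attached twice.

```python
import importlib.util
import logging
import os
import sys
import tempfile

# Hypothetical stand-in for a logging helper that keeps module-level state.
MODULE_SOURCE = '''
import logging
import sys

logger_initialized = {}

def get_logger(name="demo_dup"):
    logger = logging.getLogger(name)
    if name in logger_initialized:
        return logger
    logger.addHandler(logging.StreamHandler(sys.stdout))
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger_initialized[name] = True
    return logger
'''


def load_as(module_name, path):
    """Load the file at `path` as a fresh module registered under `module_name`."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_logging.py")
    with open(path, "w") as f:
        f.write(MODULE_SOURCE)
    # Same file, two module names: two independent copies of the globals.
    first = load_as("demo_logging_pkg", path)
    second = load_as("demo_logging_top", path)

first.get_logger()
second.get_logger()  # its own logger_initialized is empty, so it adds a handler again

handler_count = len(logging.getLogger("demo_dup").handlers)
same_state = first.logger_initialized is second.logger_initialized
print(handler_count, same_state)
```

If this is what happens in your environment, every handler on the shared logger emits the record once, which would match the doubled output.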

I am really not sure how this should be fixed and would appreciate suggestions from the community.

Similar issues:
#5743
PaddlePaddle/Paddle#57165

🏃‍♂️ Environment

python 3.10.14
paddle2onnx 1.2.8
paddleclas 2.5.2
paddlefsl 1.1.0
paddlenlp 2.8.1
paddleocr 2.8.1
paddlepaddle 3.0.0b1

🌰 Minimal Reproducible Example

import paddleocr.ppocr.utils.logging as utilLogging
logger = utilLogging.get_logger()
logger.info("test!")
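
To check whether the doubled output comes from duplicate handlers, it can help to list what is attached to the logger. The helper below is a minimal sketch using only the standard logging module; describe_handlers and the "handler_demo" logger are hypothetical names for illustration. Against a real installation you would pass "ppocr", the default logger name mentioned above.

```python
import logging
import sys


def describe_handlers(name="ppocr"):
    """Return one descriptive string per handler attached to the named logger."""
    logger = logging.getLogger(name)
    return [
        f"{type(h).__name__} -> {getattr(getattr(h, 'stream', None), 'name', None)}"
        for h in logger.handlers
    ]


# Stand-in setup that mimics a logger configured twice.
demo = logging.getLogger("handler_demo")
demo.addHandler(logging.StreamHandler(sys.stdout))
demo.addHandler(logging.StreamHandler(sys.stdout))

report = describe_handlers("handler_demo")
print(report)
```

Two entries pointing at the same stream would suggest the logger was set up twice, since each handler writes the record once.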

jingsongliujing (Collaborator) commented

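The logger is configured to output to both the console and a file. When you call logger.info("test!"), it writes the message to the file and also shows the same message on the console, so it is printed twice. Change the logging code to: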

import os
import sys
import logging
import functools
import paddle.distributed as dist

logger_initialized = {}

def _get_rank():
    # Simplified check for distributed environment setup
    return dist.get_rank() if dist.is_initialized() else 0

@functools.lru_cache()
def get_logger(name="ppocr", log_file=None, log_level=logging.DEBUG):
    """Initialize and get a logger by name.
    """
    if name in logger_initialized:
        return logging.getLogger(name)

    logger = logging.getLogger(name)
    if len(logger.handlers) > 0:
        # The logger already has handlers configured, return it directly.
        return logger

    formatter = logging.Formatter(
        "[%(asctime)s] %(name)s %(levelname)s: %(message)s", datefmt="%Y/%m/%d %H:%M:%S"
    )

    stream_handler = logging.StreamHandler(stream=sys.stdout)
    stream_handler.setFormatter(formatter)
    logger.addHandler(stream_handler)

    if log_file is not None and _get_rank() == 0:
        log_file_folder = os.path.split(log_file)[0]
        # os.makedirs("") raises, so only create the folder when the path has one.
        if log_file_folder:
            os.makedirs(log_file_folder, exist_ok=True)
        file_handler = logging.FileHandler(log_file, "a")
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)

    logger.setLevel(log_level if _get_rank() == 0 else logging.ERROR)
    logger_initialized[name] = True
    logger.propagate = False
    return logger

# Usage example
# import paddleocr.ppocr.utils.logging as utilLogging
logger = get_logger()

# Emit a test log message
logger.info("test!")
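
As a possible workaround that does not require editing the library, note that the standard logging module keeps a single registry of named loggers per process, so logging.getLogger("ppocr") returns the same logger object regardless of which import path created it. The sketch below assumes the doubled output comes from duplicate handlers. keep_one_handler_per_target and the "dedupe_demo" logger are hypothetical names, and the stand-in logger is used so the example runs without paddleocr installed.

```python
import logging
import sys


def keep_one_handler_per_target(logger):
    """Remove handlers that repeat an earlier handler's class and output target."""
    seen = set()
    for handler in list(logger.handlers):
        # File handlers are identified by path, stream handlers by stream object.
        # Handlers with neither (e.g. NullHandler) are deduplicated by class only.
        target = getattr(handler, "baseFilename", None) or id(
            getattr(handler, "stream", None)
        )
        key = (type(handler), target)
        if key in seen:
            logger.removeHandler(handler)
        else:
            seen.add(key)
    return logger


# Stand-in for a logger that was configured twice.
demo = logging.getLogger("dedupe_demo")
demo.propagate = False
demo.addHandler(logging.StreamHandler(sys.stdout))
demo.addHandler(logging.StreamHandler(sys.stdout))

keep_one_handler_per_target(demo)
demo.setLevel(logging.WARNING)  # dynamic configuration on the shared logger
remaining = len(demo.handlers)
print(remaining)
```

With a real installation, the same call on logging.getLogger("ppocr") after importing paddleocr would be the equivalent step, provided the default logger name is unchanged.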

@GreatV GreatV closed this as completed Sep 25, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 11, 2024