I have some large files that radon struggles to analyse. I created an example to demonstrate the problem: https://gist.githubusercontent.com/Sam152/50e8ef27cceb899084b42a069237a7b8/raw/bb21870395df86a0062c22353b532b45d31bd3f5/sample.py (~800 lines).
In my case, running `radon raw big-package` takes 28.38s. In reality, the module I'm trying to analyse has ~5000 lines with a similar amount of AST per line.
If I double my 800-line example, the script takes roughly 115.50s to run, so my feeling is that there might be something that scales worse than O(n) per AST node.
Any pointers on whether something can be optimised here, or is the nature of the analysis such that speeding this process up is simply not possible?
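One way to firm up the "worse than O(n)" hunch is to time the same analysis at two input sizes and estimate the scaling exponent from the ratio. The sketch below uses a deliberately quadratic stand-in workload rather than radon itself (the `workload` function and sizes are illustrative assumptions, not radon's code); solving t = c·nᵏ at two sizes gives k:

```python
import math
import time

def workload(source: str) -> int:
    # Stand-in for an analysis pass with quadratic behaviour:
    # for each line, scan all lines again (O(n^2) in line count).
    lines = source.splitlines()
    return sum(1 for a in lines for b in lines if a == b)

def scaling_exponent(make_input, small: int, large: int) -> float:
    # Time the workload at two sizes and solve t = c * n**k for k.
    times = []
    for n in (small, large):
        src = make_input(n)
        start = time.perf_counter()
        workload(src)
        times.append(time.perf_counter() - start)
    return math.log(times[1] / times[0]) / math.log(large / small)

if __name__ == "__main__":
    make = lambda n: "\n".join(f"x{i} = {i}" for i in range(n))
    k = scaling_exponent(make, 800, 1600)
    print(f"estimated exponent: {k:.2f}")  # roughly 2 for a quadratic pass
```

Applied to the real command at 800 vs 1600 lines, 28.38s → 115.50s gives an exponent of about log(115.50/28.38)/log(2) ≈ 2.0, which is consistent with a quadratic pass somewhere.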
Thanks in advance to anyone who can share their experience.
Cheers,
Sam
On a side note, while researching this issue, I found radon cited in an academic paper, which I thought was interesting and worth sharing (https://arxiv.org/pdf/2007.08978.pdf).
Hi Sam, thanks for sharing the example. Indeed, it's quite surprising to see such a long run time for such a simple file.
The raw command is definitely the slowest, and that's because it does not use the ast module to parse the file; instead, it uses tokenize. The latter is written in pure Python rather than C, which is already a slowing factor. Moreover, when parsing the AST we can use efficient techniques like the visitor pattern, which are not available with the tokenize module.
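To make the contrast concrete, here is a minimal sketch of the visitor pattern over a parsed AST (a toy function counter, not radon's analysis): the C-implemented parser builds the tree once, and `ast.NodeVisitor` dispatches to typed `visit_*` methods while walking it.

```python
import ast

class FunctionCounter(ast.NodeVisitor):
    """Counts function definitions via the visitor pattern."""

    def __init__(self):
        self.count = 0

    def visit_FunctionDef(self, node):
        self.count += 1
        self.generic_visit(node)  # keep walking into nested defs

source = "def f():\n    def g():\n        pass\n"
counter = FunctionCounter()
counter.visit(ast.parse(source))
print(counter.count)  # → 2
```

With tokenize there is no tree to walk, only a flat token stream, so each analysis has to reconstruct structural context (bracket depth, line boundaries) itself.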
However, the superlinear complexity is definitely in Radon's code. It performs some complicated operations to count logical lines, and I suspect that's where the slowest code is. I think your example highlights one of the inefficiencies particularly well.
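For reference, counting logical lines does not have to be superlinear. This is not radon's implementation, but a sketch of a single linear pass over the token stream: tokenize emits NEWLINE when a logical line ends and NL for purely physical breaks (blank lines, continuations inside brackets), so one O(tokens) scan suffices.

```python
import io
import tokenize

def count_logical_lines(source: str) -> int:
    # NEWLINE marks the end of a logical line; NL marks a physical
    # break that does not end one (blank line, bracket continuation).
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return sum(1 for tok in tokens if tok.type == tokenize.NEWLINE)

sample = (
    "x = 1\n"
    "y = [1,\n"
    "     2]\n"  # continuation: still one logical line
    "\n"
    "z = 3\n"
)
print(count_logical_lines(sample))  # → 3
```

If radon's counting re-scans earlier tokens or lines for each new logical line, that alone would explain the quadratic behaviour.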
The next step would be to profile the code. A flamegraph should already give some very useful hints. I'll try to investigate this when I've got time.
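Even before reaching for a flamegraph, the stdlib profiler can usually point at the hot function. The sketch below profiles a stand-in function rather than radon itself (since radon isn't assumed installed here); the same `cProfile`/`pstats` approach applied to the real `radon raw` entry point would show which internal call dominates cumulative time.

```python
import cProfile
import io
import pstats

def hot_function(n: int) -> int:
    # Stand-in for the suspected expensive pass.
    return sum(i * i for i in range(n))

def analyse(n: int) -> int:
    return hot_function(n)

profiler = cProfile.Profile()
profiler.enable()
analyse(200_000)
profiler.disable()

out = io.StringIO()
stats = pstats.Stats(profiler, stream=out)
stats.sort_stats("cumulative").print_stats(5)  # top 5 by cumulative time
print("hot_function" in out.getvalue())  # → True
```

Sorting by cumulative time surfaces the call that owns the superlinear cost even when the time is spread across many small calls beneath it.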