Filebeat doesn't remove deleted files if a new file with the same name is written #3789
I can see the issue that the state is not removed. How does your rsync exactly work? Is the file first removed and the new one added later, or does that happen in one go? Do you have a predefined rsync cycle? If yes, you could use …. The part that is not clear to me is why Filebeat should continue to read the file. It loads states based on ….
The rsync synchronises a complete log folder with the -a option active (so that all timestamps are preserved) every 10 minutes. After the next log rotation and sync phase, Linux will give the newly created log (log.1) the inode number of the deleted log.9 (through inode reuse), and Filebeat will think it is a rename and not a create. To your questions explicitly:
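To make the inode-reuse effect concrete, here is a minimal Go sketch (not Filebeat code; the type and function names are invented, and the syscall cast is Linux-specific) of an identity check based only on inode and device. Once the freed inode of log.9 is handed to the new log.1, such a check can no longer tell the two files apart, which is why the situation looks like a rename:

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// fileID mirrors the "FileStateOS" part of a registry entry.
type fileID struct {
	Inode  uint64
	Device uint64
}

// idOf returns the inode/device identity of the file at path.
func idOf(path string) (fileID, error) {
	fi, err := os.Stat(path)
	if err != nil {
		return fileID{}, err
	}
	st := fi.Sys().(*syscall.Stat_t) // Linux-specific
	return fileID{Inode: st.Ino, Device: uint64(st.Dev)}, nil
}

func main() {
	// If log.9 is deleted and the next rsync writes a new log.1 that the
	// filesystem places on the freed inode, idOf("log.1") now equals the
	// identity stored for the old log.9 state, so a matcher that only looks
	// at inode/device treats this as a rename instead of a new file.
	id, err := idOf("log.1")
	if err != nil {
		fmt.Println("stat failed:", err)
		return
	}
	fmt.Printf("inode=%d device=%d\n", id.Inode, id.Device)
}
```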
It is currently by design that in case of "doubt" states are not removed, as normally it is worse to remove one state too many than one too few. But in your inode-reuse case this leads to issues. I'm trying to think through what the negative side effects could be if we compare not only the path but also the inode/device, and remove the state if it does not match. If a file is renamed after the reading has finished and clean_removed comes into play, the state will be removed even though the file is still around. This should happen rarely, as the prospector also keeps track of renamed files, and it should not matter, as this file already falls under …. Other cases?
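A hedged sketch of that check, with invented names and a reduced state struct (Linux-only stat handling assumed): a state is dropped by clean_removed either when nothing exists under the source path any more, or when the file that does exist there is not the same file as the one recorded in the registry:

```go
package sketch

import (
	"os"
	"syscall"
)

// registryState is a reduced stand-in for a Filebeat registry entry.
type registryState struct {
	Source string // file path the state was recorded for
	Inode  uint64
	Device uint64
}

// shouldRemoveState reports whether clean_removed should drop the state.
func shouldRemoveState(s registryState) bool {
	fi, err := os.Stat(s.Source)
	if err != nil {
		// Nothing exists under this path any more: remove the state.
		return true
	}
	st := fi.Sys().(*syscall.Stat_t) // Linux-specific
	// The new part of the check: a file exists under the same name, but it
	// is a different file (different inode/device), so the state is stale.
	return st.Ino != s.Inode || uint64(st.Dev) != s.Device
}
```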
I created #3827 to run some tests on this change. It affects some tests with heavy file rotation. It can definitely be discussed whether the new outcome is the one expected. The most important thing I found is that our docs state:
That would be the ideal case for me, but as we only compare the source and can't ask the disk to check if an inode/device is still around, I think the above statement is not fully correct :-( Also, the changed behaviour is not perfect in my opinion, as it still relies on the file path. The only place where we know that a file handler was removed is in the harvester. Currently the harvester does not report back why it was closed. We could potentially change that in the future so that the harvester reports back that a file was deleted and updates the state directly.
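As a purely hypothetical illustration of that last idea (none of these names exist in Beats), a harvester could, at close time, check whether its file is gone and push that reason to whoever owns the registry, so the state could be dropped without waiting for a prospector scan:

```go
package sketch

import "os"

type closeReason int

const (
	closeEOF closeReason = iota
	closeRemoved
)

// stateUpdate is what a harvester would send back when it stops.
type stateUpdate struct {
	Source string
	Reason closeReason
}

// onHarvesterClose picks a reason and reports it; this only helps while a
// harvester is (or was just) running for the file in question.
func onHarvesterClose(source string, updates chan<- stateUpdate) {
	reason := closeEOF
	if _, err := os.Stat(source); os.IsNotExist(err) {
		reason = closeRemoved
	}
	updates <- stateUpdate{Source: source, Reason: reason}
}
```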
Yes, but the back reporting would only work if the harvester is still running and not already closed, wouldn't it? On the possible side effects: I can't think of any side effect this would have that would be contrary to the intended function of this feature.
The prospector also reports on file changes like renaming; it happens on each scan. There are some side effects visible in the failing tests above. But TBH it's not a "real" use case, as we use ….
Previously, if a file could be found under the same name as a state, the state was not removed. But this could also have been another file. With this change the file itself is also compared, and if it is not the same file, the state will be removed. This has the effect that if a file is renamed after monitoring of the file has finished, the state will be removed. In most cases this should not have any side effect. The positive effect of this change is that there will be fewer leftover states in the registry. Closes elastic#3789
@Shaoranlaos A change of behaviour was just merged into master: #3827 It will be available in the snapshot builds in the next few hours if you are interested in testing it: https://beats-nightlies.s3.amazonaws.com/index.html?prefix=filebeat/
@ruflin After a short test this seems to work as intended. However, I noticed another problem: …. This is an error specific to my very special use case, and for the moment I have a working workaround of syncing simultaneously only the rotating files that belong together, so there is no need to work on a (in my opinion) complicated solution for this.
@Shaoranlaos Thanks for the testing, appreciate it. Agreed that the new issue is more an rsync issue than a Filebeat issue.
Version: 5.2.2
OS: CentOS 7
Due to an old OS on another machine, under which Filebeat doesn't run, I rsync the logs to my Elastic server, on which a Filebeat instance reads them.
Through this setup I found a bug in the clean_removed routine which, in combination with inode reuse under Linux, creates a situation in which there are two entries in the registry file for the same source path with different inodes.
{ "source":"/root/copied_logs/<path to the same log>", "offset":5243941, "FileStateOS":{ "inode":1433391, "device":2049 }, "timestamp":"2017-03-22T12:59:31.413988399+01:00", "ttl":-1 }, { "source":"/root/copied_logs/<path to the same log>", "offset":5246116, "FileStateOS":{ "inode":1433400, "device":2049 }, "timestamp":"2017-03-22T13:02:32.749620477+01:00", "ttl":-1 }
The second entry is the one whose file doesn't exist anymore; when the new log file is rsynced, it gets the inode from this entry and Filebeat tries to continue reading it.
As a solution for this bug:
The cleanup method should not only check whether it can get an os.Stat for the source, but also whether this stat matches the one in the registry entry being checked. Otherwise the state should be deleted.
beats/filebeat/prospector/prospector_log.go, line 94 (commit 4527162)