Converting Audit bson file to JSON #27
Comments
Can you describe the issue you're facing? What does the "audit bson file" look like? Does it contain many small documents, many large documents, or a single large document? Have you tried using bson.decode_file_iter() from pymongo? This method decodes a BSON stream file without needing to read the entire file at once:

    import bson

    with open(auditFile, 'rb') as file:
        for doc in bson.decode_file_iter(file):  # Iterate over all the documents in the file
            print(doc)
Shane, thanks for your response. Our audit BSON file is huge, around 2 GB. I was wondering whether there is a way to use bsonjs to iterate over the BSON documents while converting them to JSON, as we can with bson.decode_file_iter. Thanks, Anil
There is no decode_file_iter equivalent in bsonjs yet. We could add one, or you could implement it yourself with some reading of the BSON format (see http://bsonspec.org/spec.html). Check out the decode_file_iter source from pymongo: https://github.com/mongodb/mongo-python-driver/blob/3.11.3/bson/__init__.py#L1135-L1161 Or you could try using bson.decode_file_iter() from pymongo instead of using bsonjs.
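For anyone who wants to try the do-it-yourself route, here is a minimal sketch of the idea, not an official bsonjs API. It relies only on the fact (from the BSON spec) that every document begins with a 4-byte little-endian int32 giving the document's total length, including those 4 bytes. The function name iter_raw_bson_documents is hypothetical.

```python
import struct


def iter_raw_bson_documents(stream):
    """Yield each BSON document in a binary stream as raw bytes.

    Hypothetical helper: only one document is held in memory at a time.
    """
    while True:
        size_bytes = stream.read(4)
        if not size_bytes:
            return  # clean end of stream
        if len(size_bytes) != 4:
            raise ValueError("truncated BSON length prefix")
        # The length prefix is a little-endian int32 that counts itself.
        (size,) = struct.unpack("<i", size_bytes)
        if size < 5:
            # The smallest valid document is 4 length bytes plus a 0x00 terminator.
            raise ValueError(f"invalid BSON document size: {size}")
        rest = stream.read(size - 4)
        if len(rest) != size - 4:
            raise ValueError("truncated BSON document")
        yield size_bytes + rest
```

Each yielded value is the raw bytes of one document, so it should be possible to pass it straight to bsonjs.dumps() and write the resulting JSON string out one line at a time, rather than decoding the whole 2 GB file up front.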
Hi Shane,
We have a requirement to process all audit BSON files of a MongoDB database and store them at a centralized location for reporting. In our current process, we scan all the audit BSON files, convert them to JSON, and send the JSON to be persisted at the centralized location via a REST API call. Following is a snippet of the code:
    from bson.json_util import loads, dumps, DEFAULT_JSON_OPTIONS
    from bson import decode_all

    if not self.util.isFileExists(auditFile):
        return self.util.buildResponse(self.Globals.unsuccess, f"file {auditFile} is missing ")

    myAuditFileSize = self.util.getFileSizeBytes(auditFile) / (1024*1024)
    if myAuditFileSize > self.BSON_FILE_SIZE_LIMIT_MB:
        print(f"Audit bson file '{auditFile}' size is larger than {self.BSON_FILE_SIZE_LIMIT_MB}")
        return self.util.buildResponse(self.Globals.unsuccess, f"Audit bson file '{auditFile}' size is larger than {self.BSON_FILE_SIZE_LIMIT_MB}MB ")

    # 3. processing - converting bson to json
    try:
        if self.util.getFileExtn(auditFile).lower() == "json":
            myMongoAuditData = self.util.readJsonFile(auditFile)
        else:
            with open(auditFile, 'rb') as file:
                myMongoAuditData = decode_all(file.read())
        return myMongoAuditData
We are facing issues processing larger BSON files, so we currently restrict the size of the audit BSON files that will be processed. I need your help using the "bsonjs" module to process the audit BSON file and generate the JSON file (it would be better to generate smaller JSON files).
Please assist.
Thanks,
Anil Kumar
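To make the "smaller JSON files" part concrete, here is a rough sketch of one way the output could be split. It is not part of bsonjs or pymongo: the function name write_json_lines_in_chunks, the "audit" file prefix, and the default of 10000 documents per file are all assumptions chosen for illustration. It takes any iterable of JSON strings and writes them to numbered files, one document per line.

```python
import itertools
import os


def write_json_lines_in_chunks(json_docs, out_dir, docs_per_file=10000, prefix="audit"):
    """Write JSON strings to numbered files, one document per line.

    Hypothetical helper: at most `docs_per_file` documents are held in
    memory at once. Returns the list of file paths written.
    """
    if docs_per_file < 1:
        raise ValueError("docs_per_file must be at least 1")
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    docs = iter(json_docs)
    for index in itertools.count():
        # Pull the next batch lazily so the source is never fully materialized.
        chunk = list(itertools.islice(docs, docs_per_file))
        if not chunk:
            break
        path = os.path.join(out_dir, f"{prefix}_{index:05d}.json")
        with open(path, "w", encoding="utf-8") as out:
            for doc in chunk:
                out.write(doc + "\n")
        paths.append(path)
    return paths
```

The json_docs argument could be a generator such as (dumps(doc) for doc in bson.decode_file_iter(file)), using dumps from bson.json_util, so that the large audit file is streamed instead of being loaded with decode_all(file.read()). Each resulting file could then be sent to the REST API separately.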