-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ROOT-5073: Explore changing the on-file byte format to little endian #136
Conversation
1. Fix tag out of range issue, this is due to TKey calling ReadObjWithBuffer which does not set as Little Endian 2. Fix TTree ReadVersion error, this is because in TBufferFile::ReadVersion we changed the cnt format and it will cause the startoffset of reading version goes back to the cnt offset
… kByteCountMask, IsBufBigEndian());
1. In roottest mathcore, some tests call cl->GetMethod which finds the interface of creator of TBufferFile. Since I changed the interface and add two Bool_t arguments, so as should we change the cl->GetMethod here. 2. If we change TBufferFile to LittleEndian, there is a situation could be inappropriately characterized as Having ByteCount and here is the situation happens: For example, if there is no byte count and the first two bytes stores version and the next four bytes store fBufferSize like following: UShort_t + UInt_t ( Version_t + fBufferSize) But in TBufferFile.cxx, it takes a look at (v.cnt & kByteCountMask)(v.cnt is the first four bytes). If it is 0, ROOT determine there is no byte count. Otherwise, the next four bytes should be byte count. If TBufferFile is BigEndian: 00 02 00 00 7d 00 and in this example, version = 2 and fBufferSize = 32000. So v.cnt is 00 00 02 00. However, if we change TBufferFile to LittleEndian, the memory layout becomes: 02 00 00 7d 00 00 and in this case, v.cnt is 7d 00 00 02 and (v.cnt & kByteCountMask != 0) so ROOT thinks the next four bytes are byte count.
@pcanal @bbockelm The lines of code in following link is to determine if there is byte count TBuffer I discussed with Brian this morning. I left some comments above it. |
{ | ||
#ifdef R__BYTESWAP | ||
if(buffBigEndian) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
coding convention: there should be a space between if and parenthesis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the 'cost' of this added if statement (it is in the inner most part of the processing)?
It is unclear how the code when reading an arbitrary file decides whether the buffer is big endian or little endian. What is the plan there (and/or did I miss some code where it is already implemented)? |
1. By convention, there must be a space between if and left parentheses. 2. TKeyXML and TKeySQL derive from TKey, therefore ReadObjWithBuffer() must add two more argument. Additionally, we only add argument type without variable name to avoid compiling error of "arguments are not used".
@pcanal I have not implemented it yet. I think I could add one more bit in TFile and indicate the endianness of the file. Does that sound like a plan? Otherwise, I might also evaluate the fBufBigEndian bit in TBuffer and determine each buffer is big endian or little endian. |
I agree it will probably be sufficient to restrict a full file to be one or the other. We have to find a place in the header on the beginning of the file to store this information. |
@zzxuanyuan @pcanal I understand this is still work in progress? |
Can one of the admins verify this patch? |
I think this branch is lack of performance test among different alternatives as discussed in https://sft.its.cern.ch/jira/browse/ROOT-5073. Do we still want to work on that? |
Hi, I think we can close for now. Thanks! Brian |
Hi, @pcanal @bbockelm
This branch implements little endian in TBuffer.
The code is not ready to be merged and I hope it would be more convenient to discuss it on github. There is still a design issue. This is the link: https://sft.its.cern.ch/jira/browse/ROOT-5073.
Let's take an example of writing to a TFile, we need to update header (TFile::WriteHeader), streamer info (TFile::WriteStreamerInfo) and free segments (TFile::WriteFree).
The problem is that all of three information are read by TFile::ReadBuffer or TFile::ReadKeyBuffer without converting the endianesss. header and free segments can be processed without any problems. But streamer info is read in reversed endianess.
To address this issue, one way is adding a fBit in TFile class and change all meta data (header, free segments and streamer info) to little endian. Another way is modifying the read function for streamer info and convert its endianess before read it from buffer.
Zhe