Error: could not create file mapping: The operation completed successfully #424
Can you share your system specs? OS? RAM? etc. On my system (macOS, Julia 1.1, 16 GB RAM) the file reads; it takes about 365 seconds. Uncompressed, the file is 12.5 GB.
Windows 10 Pro. 64 GB RAM. Julia 1.1. CSV.jl 0.5.
Hmmm, this might be a Windows issue then; I'll have to dig out my Windows box tonight and test some things. For reference, data.table …
Ok, I can reproduce the error on my Windows machine w/ 16 GB RAM. Playing around w/ just mmap, I can't seem to do an anonymous mmap over about 10.5 GB, and if I allocate several, I can't even get that much. This Stack Overflow post sounds a lot like what we're seeing, but doesn't give an immediate answer. Do we try to create a file full of null bytes of the size we need and then mmap that? That's annoying because we're actually taking up disk space w/ that temporary file, and hopefully we can manage to delete it when we're done. Maybe there's some way we can allocate smaller chunks up to what we need and concat them together somehow? That seems annoying, but more possible. I'll keep researching this.
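The chunked idea could be sketched roughly like this (a hypothetical workaround, not anything in CSV.jl — `ChunkedBuffer` and the chunk size are made up for illustration): several smaller anonymous mmaps back one logical byte buffer, with index arithmetic routing reads to the right chunk.

```julia
using Mmap

# Hypothetical sketch: back one logical byte buffer with several
# smaller anonymous mmap chunks instead of a single huge mapping.
struct ChunkedBuffer
    chunks::Vector{Vector{UInt8}}
    chunksize::Int
end

function ChunkedBuffer(total::Integer, chunksize::Integer)
    nchunks = cld(total, chunksize)
    chunks = Vector{Vector{UInt8}}(undef, nchunks)
    remaining = total
    for i in 1:nchunks
        sz = min(chunksize, remaining)
        chunks[i] = Mmap.mmap(Vector{UInt8}, sz)  # anonymous, zero-filled
        remaining -= sz
    end
    ChunkedBuffer(chunks, Int(chunksize))
end

Base.length(b::ChunkedBuffer) = sum(length, b.chunks)

# Route a 1-based linear index to (chunk, offset).
function Base.getindex(b::ChunkedBuffer, i::Integer)
    c, off = divrem(i - 1, b.chunksize)
    b.chunks[c + 1][off + 1]
end
```

Whether reads spanning chunk boundaries can be made cheap enough for the parser is the open question; a real implementation would also need `setindex!` and range views.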
* If the source isn't an IO, let's mmap to a new mmapped buffer; it's a lot faster for large files than going through normal IO. Helps a little for #424
* Only define our grisu method for our own Floats. This was inadvertently working on Dec64, which isn't valid
I merged #426, which might help a little but probably isn't a full solution to this issue. I wonder if a 64 GB RAM system would work now, though; if anyone wants to try the #master branch, that'd be great. Otherwise, I'll keep noodling on the possibility of a chunked-array approach.
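For context, the gist of the mmap-instead-of-IO approach can be sketched as follows (a simplified sketch, not the actual #426 code; `mmap_bytes` is a made-up name):

```julia
using Mmap

# Sketch: pull a file's bytes in via mmap rather than read(io, ...);
# the OS pages the data in lazily, which tends to be much faster for
# large files than buffered IO.
function mmap_bytes(path::AbstractString)
    open(path) do io
        # copy forces the pages in and lets the mapping be released
        copy(Mmap.mmap(io, Vector{UInt8}, filesize(path)))
    end
end
```

Keeping a huge mapping itself alive, rather than copying out of it, is what runs into the Windows limits discussed above.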
It works now.
Oh good! I can still generate scenarios where a memory-constrained Windows box will fail, but I'm glad that #426 at least helps when the machine has sufficient RAM.
The problem is now back. The first time I run it, it works fine; the second time it errors:

```julia
julia> @time a = CSV.read(path; header=0, limit=3)
ERROR: could not create file mapping: The operation completed successfully.
Stacktrace:
 [1] error(::String) at .\error.jl:33
 [2] #mmap#1(::Bool, ::Bool, ::Function, ::Mmap.Anonymous, ::Type{Array{UInt8,1}}, ::Tuple{Int64}, ::Int64) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.1\Mmap\src\Mmap.jl:218
 [3] #mmap#14 at .\none:0 [inlined]
 [4] mmap at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.1\Mmap\src\Mmap.jl:251 [inlined]
 [5] getsource(::String, ::Bool) at C:\Users\RTX2080\.julia\packages\CSV\xJZKC\src\utils.jl:159
 [6] file(::String, ::Int64, ::Bool, ::Int64, ::Nothing, ::Int64, ::Int64, ::Bool, ::Nothing, ::Bool, ::Bool, ::Array{String,1}, ::String, ::Nothing, ::Bool, ::Char, ::Nothing, ::Nothing, ::Char, ::Nothing, ::UInt8, ::Nothing, ::Nothing, ::Nothing, ::Nothing, ::Dict{Int8,Int8}, ::Bool, ::Float64, ::Bool, ::Bool, ::Bool, ::Bool, ::Nothing) at C:\Users\RTX2080\.julia\packages\CSV\xJZKC\src\CSV.jl:237
 [7] #File#21 at C:\Users\RTX2080\.julia\packages\CSV\xJZKC\src\CSV.jl:160 [inlined]
 [8] (::getfield(Core, Symbol("#kw#Type")))(::NamedTuple{(:header, :limit),Tuple{Int64,Int64}}, ::Type{CSV.File}, ::String) at .\none:0
 [9] #read#61 at C:\Users\RTX2080\.julia\packages\CSV\xJZKC\src\CSV.jl:645 [inlined]
 [10] (::getfield(CSV, Symbol("#kw##read")))(::NamedTuple{(:header, :limit),Tuple{Int64,Int64}}, ::typeof(CSV.read), ::String) at .\none:0
 [11] top-level scope at util.jl:156
```
@xiaodaigh, what CSV.jl release are you on? 0.5.5? 0.5.6? Does it work on a specific version (0.5.4?) but not on a subsequent one? I'm trying to figure out whether something actually regressed, or if the issue is just finicky to reproduce (which I suspect).
Sorry, noob error here. I am running 0.5.6, the latest. I will try older versions to help you narrow it down.
I can reproduce it every time on my desktop, which runs Windows 10 Pro with a RAID0 configuration of 2 SSDs.
I supposed it was possible to read a large compressed CSV using CSV.Rows, but it is failing with the same error:
So, is it possible to read a very large compressed CSV in Julia, iterating over the rows? Note: I'm on Windows.
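For what it's worth, the row-streaming pattern itself looks like this (a minimal sketch assuming an uncompressed file on disk; whether the compressed-source path on Windows avoids the failing mmap is exactly what's in question here):

```julia
using CSV

# Sketch: iterate rows lazily instead of materializing the whole
# file. Each `row` supports property access by column name.
function count_rows(path::AbstractString)
    n = 0
    for row in CSV.Rows(path)
        n += 1   # replace with real per-row processing
    end
    n
end
```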
I'm going to close this issue in favor of #432, so we can consolidate discussion around large-file memory use. #510 is meant to improve memory efficiency in the general case and to allow drastically better memory use when column types are provided explicitly by the user. I imagine there are still further improvements we can make beyond what's in #510.
I tried to read this example file
I get the following error