EDDB will soon cease operations #110
Well that's not helpful. |
EDDB has now shut down. Are there any plans to update TD to use something else, like Inara for example? Edit: Looks like Inara doesn't have an API for exporting data |
Working on it. |
For now TD is working, but it uses stations and systems from the day EDDB died. That said, the first phase of server work for this change is now functionally complete. We are producing our own listings.csv and continue to produce listings-live.csv as normal. The next thing is dealing with new systems and stations, as those need entirely new code so they are imported correctly. |
I don't think any of them have a means of obtaining bulk data. I know I looked at this back when the end of EDDB was first announced, and things didn't pan out. IIRC, one of the places I looked at, I think it was Inara, did have an API, but it was for single queries only, as in "give me the data for this station", so that wouldn't work. I would love to be wrong about this, because figuring out how to do it ourselves sucks. |
I'm not an expert in the slightest, so I don't know if this is even feasible, but can @Tromador read and aggregate EDDN's commodities feed for pricing purposes? Start with what we have now and update hourly/daily from an EDDN feed. I'm pretty sure Inara does this. Perhaps EDSM can be approached for system information. Or we can ask @spansh (I believe that's Gareth) if we can download his data dumps for systems. What is the major problem, not having an authoritative source for ships, modules, components? |
This is what we have now. Tromador's server runs a Python script that does exactly that; it's how listings-live.csv was generated before the end of EDDB, and since the "first phase" Tromador mentioned, it's also how the listings.csv file is generated. For details: https://github.com/eyeonus/EDDBlink-listener
Yes. As far as market data is concerned, we've got that covered. However, we have no means of updating TD with anything new: new commodities, new stations, any of it. (Actually, I think I did make it so new commodities get added to the DB when they show up, but I'm not certain, and I'm too lazy to look right now.) For some things, like rare items and ship modules, that's not a big problem, because it isn't very often that new ones get added to the game, so adding them manually isn't a huge deal, even if it would be nice to have it all done automagically. Basically, right now we can get all the information contained in an EDDN commodity message and process it for inclusion in the DB, but some information TD needs isn't contained in that message, so we need to start processing other EDDN message types in the script as well. For example, if we want to know what star system a station is in, we need to process the docking event of a Journal message. |
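A minimal sketch of the kind of extra processing described above, assuming the public EDDN relay endpoint and the commodity-v3 / journal-v1 schemas. This is not the EDDBlink-listener code, and the handler functions are illustrative placeholders:

```python
# Sketch only: subscribe to the EDDN relay and split out the two message
# types discussed above. handle_market/handle_docked are hypothetical stubs.
import json
import zlib

import zmq

EDDN_RELAY = "tcp://eddn.edcd.io:9500"
COMMODITY_SCHEMA = "https://eddn.edcd.io/schemas/commodity/3"
JOURNAL_SCHEMA = "https://eddn.edcd.io/schemas/journal/1"


def handle_market(system, station, commodities):
    # placeholder: update commodity listings for this station in the database
    print(f"market: {system}/{station} ({len(commodities)} items)")


def handle_docked(system, station):
    # placeholder: record which system a (possibly new) station belongs to
    print(f"docked: {station} is in {system}")


def listen():
    ctx = zmq.Context()
    sub = ctx.socket(zmq.SUB)
    sub.setsockopt_string(zmq.SUBSCRIBE, "")
    sub.connect(EDDN_RELAY)
    while True:
        # EDDN messages are zlib-compressed JSON envelopes
        envelope = json.loads(zlib.decompress(sub.recv()))
        schema = envelope.get("$schemaRef", "")
        body = envelope.get("message", {})
        if schema == COMMODITY_SCHEMA:
            handle_market(body["systemName"], body["stationName"], body["commodities"])
        elif schema == JOURNAL_SCHEMA and body.get("event") == "Docked":
            handle_docked(body["StarSystem"], body["StationName"])


if __name__ == "__main__":
    listen()
```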
I'm not going to lie, my life is in a bit of an upheaval right now, so I haven't had time to work on this very much. If anyone who reads this wants to take a crack at it please feel free. |
Is there any interest in bringing this back? @eyeonus @Tromador in particular. I've got the most egregious problems with eddblink_listener hammered out and running on my machine, and I'm working on a mechanism to replay the archived eddn streams to load data from between when EDDB went out and now. That should get TD up and going with old (EDDB era) star systems. After that, I don't think it would be a very big lift to get a star system feed out of the EDDN journals. I haven't actually looked at the guts of TD to see what else it might need. |
Hi,

The last position we had (bearing in mind my memory is not what it once was) was that there were some issues causing threads to hang up. The database was still having issues in that it liked to grow larger, though eyeonus had done a lot of work on that area and it was miles better than it once was. There was some testing to be done to try and pin down the cause of some problems - I can't even remember now what - but my health took a downturn, my cat became diabetic, eyeonus was injured in a road accident, and as nobody was asking for TD to be mended it really seemed like it was very low on the list of priorities, if indeed it was on the list at all.

Assuming good data goes into the database, I think TD should basically work - I mean, why not? Its problem may be that it was written at a time when we were still in beta, with a lot fewer stations, and perhaps it hasn't scaled well. It doesn't seem to matter how much memory or CPU it gets, or how fast the storage is, some queries are disgustingly slow. That said, I still maintain my belief that TD's user-side query syntax is streets ahead of anything else in the questions it can answer.

I am more than willing to host a working version of the software with associated datasets for download. What I may not have the energy for is a lot of convoluted testing if weird and wonderful bugs crop up.

Cheers

Trom
|
You should absolutely feel encouraged to submit a PR to either/both repositories, I would love to have some help with this stuff.
That said, to the best of my knowledge Tromador's server is still running, and I long ago patched the listener to work without EDDB, so assuming that's true, TD is still up to date, at least regarding market data for the systems that existed when EDDB went down.
I look forward to seeing your fixes.
|
I note that @spansh (https://github.com/spansh), of neutron plotter fame, now has market data. Here is an example. He also has system data dumps. Should we reach out to him to see if there is anything TradeDangerous can leverage? |
We could potentially use the dumps (https://spansh.co.uk/dumps). Also, whatever happened to @EyeMWing? I expected to see a PR at some point from that one. |
I don't have the know-how to help with this in any way, but I'm so pleased that TD has not been completely forgotten. I could probably help with testing as a user, though. |
I have been trying to update TDHelper every time I play Elite Dangerous Odyssey, but it seems to have stopped updating about 7 months ago. Hopefully something will soon happen. I would also be happy to help with testing of a new and improved version. |
TDHelper is run by somebody else, it's not something I have anything to do with. |
I'm more than happy to help populate data. We have the new system dumps at https://spansh.co.uk/dumps which are purely system data (no bodies). However, if you also want station data you can grab the full galaxy file, though that's probably a little large for players to download.

If you only care about station data you can get galaxy_stations.json.gz. That contains all known data about every system which contains a station, including player fleet carriers. I'm parsing that for my new trade router and it takes 3-5 minutes to parse using RapidJSON. I'm not as familiar with Python, but there are fast JSON parsers available, and if you're worried about memory usage and don't have access to a SAX/streaming parser, I've made some concessions to make it relatively easy to create a streaming parser for those files manually.

If you'd like more help with this you can catch me on the EDCD Discord.
|
Thanks for the offer of support. Big files don't really scare me. Potentially we can have the server grab it and output something smaller for clients. I always used to have the server configured to grab and hold more data than the average user would download, at least by default (they could still grab it via options if they really wanted it). I too was hoping for this PR from @EyeMWing. That said, with @spansh willing to help with a reliable data source, I am willing to run up the ED server on current code, on the assumption we can start looking again at some of the long standing issues - I mean it does work, but there were some niggles. Assuming we do that, I would ask for patience (especially from @eyeonus 🙂) it's been a very long time since I looked at this and the brain fog from my illness and associated meds will likely have me going over old ground previously discussed as though it never happened. I know this can be a little frustrating at times, it certainly annoys me when I know my cranium isn't firing on all cylinders. |
I'm still here, just got pulled away from ED for a little bit by some priority work. I was actually right in the middle of trying to work out a solution for new star systems - looks like we've got a solution for that now. I've got some time this evening; I'll pull down the dump and see about getting it parsed. Shouldn't be too bad.
|
@EyeMWing You posted over a month ago that you had some time "this evening". Please can you have a think and honestly decide if you have the time and inclination to do this work. If you don't, that's fine, everything here is voluntary. We'll decide if/how we want to proceed without you and that's ok. |
I had a bit of free time today, so I put together a quick parser for @spansh's dump files. I did some (very cursory) research into fast JSON parsers and settled on cysimdjson (https://pypi.org/project/cysimdjson/). It can ingest the 8.8GB (uncompressed size) galaxy_stations.json in about 23 seconds on my M1 Pro MacBook (without doing anything with the data, that's just load time). It does process the input line by line to avoid needing insane amounts of memory, which means it makes some assumptions about the format of the galaxy dumps, namely that each system is on a single line, and the first and last lines of the JSON are the opening and closing square brackets.

Here it is as a proof of concept:

```python
import cysimdjson
from collections import namedtuple

DEFAULT_INPUT = 'galaxy_stations.json'

Commodity = namedtuple('Commodity', 'name,sell,buy,demand,supply,ts')


def ingest(filename):
    parser = cysimdjson.JSONParser()
    with open(filename, 'r') as f:
        f.readline()  # skip over initial open bracket
        for line in f:
            line = line.rstrip().rstrip(',')
            if line == ']':
                # end of dump
                break
            system_data = parser.loads(line)
            yield from _ingest_system_data(system_data)


def _ingest_system_data(system_data):
    for station_name, update_time, commodities in _find_markets_in_system(system_data):
        yield f'{system_data["name"]}/{station_name}', _ingest_commodities(commodities, update_time)


def _ingest_commodities(commodities, update_time):
    for category, category_commodities in commodities.items():
        yield category, _ingest_category_commodities(category_commodities, update_time)


def _ingest_category_commodities(commodities, update_time):
    for commodity, market_data in commodities.items():
        yield Commodity(
            name=commodity,
            sell=market_data["sellPrice"],
            buy=market_data["buyPrice"],
            demand=market_data["demand"],
            supply=market_data["supply"],
            ts=update_time,
        )


def _find_markets_in_system(system_data):
    for station in system_data['stations']:
        if 'Market' not in station.get('services', []):
            continue
        if not station.get('market', {}).get('commodities', []):
            continue
        yield (
            station['name'],
            station['market'].get('updateTime', None),
            _categorize_commodities(station['market']['commodities']),
        )


def _categorize_commodities(commodities):
    commodities_by_category = {}
    for commodity in commodities:
        commodities_by_category.setdefault(commodity['category'], {})[commodity['name']] = commodity
    return commodities_by_category


if __name__ == '__main__':
    print('# {name:35s} {sell:>7s} {buy:>7s} {demand:>10s} {supply:>10s} {ts}'.format(
        name='Item Name',
        sell='SellCr',
        buy='BuyCr',
        demand='Demand',
        supply='Supply',
        ts='Timestamp',
    ))
    print()
    for station_name, market in ingest(DEFAULT_INPUT):
        print(f'@ {station_name}')
        for category, commodities in market:
            print(f' + {category}')
            for commodity in commodities:
                print(' {name:35s} {sell:7d} {buy:7d} {demand:10d} {supply:10d} {ts}'.format(
                    name=commodity.name,
                    sell=commodity.sell,
                    buy=commodity.buy,
                    demand=commodity.demand,
                    supply=commodity.supply,
                    ts=commodity.ts,
                ))
        print()
```

That POC prints out the result in Trade Dangerous' prices format, but it is intended to be used to provide the data in a programmatically convenient way, so it doesn't necessarily need to pass through a conversion step; Trade Dangerous could potentially just load the prices directly from the galaxy dumps. |
You can trust that assumption about the file format; I specifically formatted it that way so that people without streaming JSON parsers can roll their own easily. One option for parsing would be pysimdjson, which is a hook into purportedly the fastest JSON parser there is currently.
|
I've fixed a bug - it wasn't picking up surface stations - so now ingestion times have jumped to the 50-70 second range.

```python
import cysimdjson
import simdjson
import time
from collections import namedtuple

DEFAULT_INPUT = 'galaxy_stations.json'
DEFAULT_PARSER = cysimdjson.JSONParser().loads
ALT_PARSER = lambda line: simdjson.Parser().parse(line)

Commodity = namedtuple('Commodity', 'name,sell,buy,demand,supply,ts')


def ingest(filename, parser):
    """Ingest a spansh-style galaxy dump and emit a generator cascade yielding the market data."""
    with open(filename, 'r') as f:
        f.readline()  # skip over initial open bracket
        for line in f:
            line = line.rstrip().rstrip(',')
            if line == ']':
                # end of dump
                break
            system_data = parser(line)
            yield from _ingest_system_data(system_data)


def _ingest_system_data(system_data):
    for station_name, update_time, commodities in _find_markets_in_system(system_data):
        yield f'{system_data["name"].upper()}/{station_name}', _ingest_commodities(commodities, update_time)


def _ingest_commodities(commodities, update_time):
    for category, category_commodities in commodities.items():
        yield category, _ingest_category_commodities(category_commodities, update_time)


def _ingest_category_commodities(commodities, update_time):
    for commodity, market_data in commodities.items():
        yield Commodity(
            name=commodity,
            sell=market_data["sellPrice"],
            buy=market_data["buyPrice"],
            demand=market_data["demand"],
            supply=market_data["supply"],
            ts=update_time,
        )


def _find_markets_in_system(system_data):
    # look for stations in the system and on all bodies
    targets = [system_data, *system_data.get('bodies', [])]
    for target in targets:
        for station in target['stations']:
            if 'Market' not in station.get('services', []):
                continue
            if not station.get('market', {}).get('commodities', []):
                continue
            yield (
                station['name'],
                station['market'].get('updateTime', None),
                _categorize_commodities(station['market']['commodities']),
            )


def _categorize_commodities(commodities):
    commodities_by_category = {}
    for commodity in commodities:
        commodities_by_category.setdefault(commodity['category'], {})[commodity['name']] = commodity
    return commodities_by_category


def benchmark(filename, parser, parser_name=None, iterations=5):
    """Benchmark a JSON parser.

    Prints timing for consuming the entire stream, without doing anything with the data.
    """
    times = []
    for _ in range(iterations):
        start_ts = time.perf_counter()
        stream = ingest(filename, parser)
        for _, market in stream:
            for _, commodities in market:
                for _ in commodities:
                    pass
        end_ts = time.perf_counter()
        elapsed = end_ts - start_ts
        times.append(elapsed)
    min_time = min(times)
    avg_time = sum(times) / len(times)
    max_time = max(times)
    if parser_name is None:
        parser_name = repr(parser)
    print(f'{min_time:6.2f} {avg_time:6.2f} {max_time:6.2f} {parser_name}')


def benchmark_parsers(filename=DEFAULT_INPUT, **parsers):
    """Benchmark all parsers passed in as keyword arguments."""
    for name, parser in parsers.items():
        benchmark(filename, parser, parser_name=name)


def convert(filename, parser=DEFAULT_PARSER):
    """Convert a spansh-style galaxy dump into TradeDangerous-style prices."""
    print('# {name:35s} {sell:>7s} {buy:>7s} {demand:>10s} {supply:>10s} {ts}'.format(
        name='Item Name',
        sell='SellCr',
        buy='BuyCr',
        demand='Demand',
        supply='Supply',
        ts='Timestamp',
    ))
    print()
    for station_name, market in ingest(filename, parser=parser):
        print(f'@ {station_name}')
        for category, commodities in market:
            print(f' + {category}')
            for commodity in commodities:
                print(' {name:35s} {sell:7d} {buy:7d} {demand:10d} {supply:10d} {ts}'.format(
                    name=commodity.name,
                    sell=commodity.sell,
                    buy=commodity.buy,
                    demand=commodity.demand,
                    supply=commodity.supply,
                    ts=commodity.ts,
                ))
        print()


if __name__ == '__main__':
    benchmark_parsers(
        cysimdjson=DEFAULT_PARSER,
        pysimdjson=ALT_PARSER,
    )
```

I've benchmarked them and
|
Very nice. Do me a favour and submit a PR for this, formatted as an import plugin. |
Yeah, that's WIP, I was just focusing on getting the parsing logic right first 👍 |
You don't need the category in the price file (saves some bytes), see: |
@lanzz Probably a stupid question but I rather ask and not need to: I presume that carriers count as "stations in the system"? |
I did not even see that. Thanks! |
It's like the four doctors in here :) 👋 @bgol :) |
Yeah, hi Oliver, nice to hear from you. Now, where is madavo? :) |
Sorry I haven't been following up the developments in this thread. Yes, the reason I implemented it to generate .prices file was because that seemed like what the plugin system itself was expecting. |
Chiming in for the 50+ crew 👴 |
Lol, only if I'm Tennant. Also bowties are not cool.
No worries. I've changed it to go directly into the database myself, we all kind of assumed that's the reason why you did that. The RAM usage was just too much for the server doing it the way you had it. Returning False means that the plugin handled everything itself. There's no expectation either way, it's just to give the plugin author the ability to go either way. Your work is sincerely appreciated, after all, you did the hard part, all I did was some refactoring to make it less RAM intensive. |
@lanzz The simdjson usage seemed to run into a problem I hit at Super Evil recently with our Python-based asset pipeline and recent CPython optimizations that make it garbage collect less often. As you'd spotted, you have to allocate a new simdjson parser each loop or else it complains you still have references.
Also, the TD code is full of make-a-string-with-joins, because that was the optimal way to join two short strings back in those versions of Python. Now it seems to be bad because, with the aforementioned garbage reduction, the likelihood Python will actually have to
That got me to looking at
(Please note: I live in a near perpetual state of "wtf python, really, why?" these days -- if any of that seems to be pointed at anything but Python and/or myself, I derped at typing.) |
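To illustrate the join-versus-f-string point above, here is a rough micro-benchmark. It is not TD code; the strings and iteration count are arbitrary, and relative timings vary by Python version:

```python
# Rough illustration: on recent CPython an f-string is typically at least as
# fast as str.join for gluing two short strings together.
import timeit

system, station = "SOL", "Abraham Lincoln"

join_time = timeit.timeit(lambda: "/".join((system, station)), number=1_000_000)
fstr_time = timeit.timeit(lambda: f"{system}/{station}", number=1_000_000)

print(f"join: {join_time:.3f}s   f-string: {fstr_time:.3f}s   (1M iterations)")
```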
If you let me know what the missing/extra data is I can point you to the fields if they're available in the dump and/or where I normally source that data from when/if I put them into my search index if they're not. |
@spansh
It'd also be nice to have a way to automatically add new rares to the RareItem table:
|
Does that make me the Tardis, as I'm hosting?
--
Omnia dicta fortiora si dicta Latina!
|
Is there a reason to prefer the human-readable/strptime datetimes? Having them as UTC timestamps, either int or float, would make parsing and import much, much faster. |
Historical inertia (that's how it was setup when I took over TD maintenance), and potentially backwards compatibility. Regarding the former, I've no problems with changing it. Regarding the latter, as long as it doesn't break anything, I've no problems changing it. |
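To put a rough number on the parsing-cost difference being discussed, a small standalone comparison. This is not TD code; the sample values and datetime format are made up for illustration:

```python
# Compare parsing a human-readable datetime via strptime against an integer
# epoch timestamp; exact numbers vary by machine and Python version.
import time
from datetime import datetime, timezone

SAMPLES = 100_000
human = "2024-04-22 17:13:44"
epoch = "1713806024"

start = time.perf_counter()
for _ in range(SAMPLES):
    datetime.strptime(human, "%Y-%m-%d %H:%M:%S")
strptime_secs = time.perf_counter() - start

start = time.perf_counter()
for _ in range(SAMPLES):
    datetime.fromtimestamp(int(epoch), tz=timezone.utc)
epoch_secs = time.perf_counter() - start

print(f"strptime: {strptime_secs:.2f}s   epoch: {epoch_secs:.2f}s   ({SAMPLES} rows)")
```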
"Your fault, you daft old fart" is perfectly fine by me :) |
@eyeonus Seeing this at the moment with latest - is this my fault?
that seems like something that shouldn't be missing at the end of an import? |
Oh, that's not the same as regenerating prices file. Duh. I used to figure that eventually the cost of generating the .prices file and stability of TD would mean we didn't have to keep generating the thing, is that all this is? |
Nope, it's fine. It's a warning by tdb.reloadCache(), and is expected since it's a clean run. |
Also since you did solo, it didn't download or import the listings, there's nothing to export to a prices file, and so you won't have a prices file after the command finishes, either. |
No, it will regenerate prices IFF listings are imported, but not otherwise. |
While I'm refinding my feet, I've made a number of QoL changes - at least, if you're using an IDE like PyCharm/VSCode. I tend to configure tox so that my IDEs pick up the settings from it and I get in-IDE guidance and refactoring options. I've also introduced a little bit of flair to make watching it do its import thing a little less tedious, but I'm trying to stagger how I do it so that there's always an easy way to dump the new presentation. This is what happens when I've been watching Sebastian Lague videos lately (https://www.youtube.com/watch?v=SO83KQuuZvg)... but it's probably also going to be nice for end-users too.
Recording.2024-04-22.171344.mp4
These are currently in my kfsone/cleanup-pass branch. |
I'm doing some tuning of the tox config; it doesn't seem like we were actually running a "just flake8" run, or we weren't really using it? I got it enabled in my test branch. It should be fast (demo from PyCharm):
Recording.2024-04-22.221802.mp4
Also, both the eddblink and spansh plugins use the existence of that file to determine if the database needs to be built. |
I was checking a few of the 1MR posts/videos about how they tackle it in Python. We don't have 1 billion but it's not that dissimilar to what we do. Discovering that just using "rb" and doing byte-related operations was a bit of a stunner, but it's annoying trying to switch large tracts of code from utf8-to-bytes. However, it can provide a 4-8x speed up. For instance, we count the number of lines in some of our files so we can do a progress bar, right? If the file is 86mb that takes ~250ms. Just using "rb" gets that down to 50ms and a little use of fixed-size buffering gets it down to 41ms. https://gist.github.com/kfsone/dcb0d7811570e40e73136a14c23bf128 |
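A minimal sketch of the buffered binary line-counting trick described above (the linked gist is the author's version; the buffer size and filename here are arbitrary choices):

```python
# Count lines by scanning fixed-size binary chunks for b"\n" instead of
# decoding the whole file as UTF-8; this is the cheap pre-pass used to size
# a progress bar before the real import starts.
def count_lines(path, bufsize=1024 * 1024):
    total = 0
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(bufsize)
            if not chunk:
                break
            total += chunk.count(b"\n")
    return total


if __name__ == "__main__":
    print(count_lines("listings.csv"))
```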
Faster is good. I like faster. |
ooorrrrr.... https://github.com/KingFisherSoftware/traderusty/ :) I'm thinking I should have called it "tradedangersy" since rusty projects like to end with "rs" and python with "y" :) |
Don't read too much into that - it was an excuse to try a Rust-Python extension in anger (see https://github.com/kfsone/rumao3) and see how much pain setting up PyPI and everything was (it wasn't). And I'm not sure eyeonus is likely to want a second language added to the problem :) |
@Tromador is listings.csv guaranteed to be in station,item order? I think I can optimize by doing a lock-step walk thru the database and listings (you create two generators, one with database entries the other with listing entries, and you keep advancing the one that is "behind"; if the listings one runs out, you stop; if the database one runs out you just don't need to compare) |
Yes, both listings.csv and listings-live.csv are guaranteed to be sorted by station, then item. |
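Since both inputs are sorted the same way, the lock-step walk EyeMWing describes could look roughly like this. It is a sketch only; the (station_id, item_id, ...) row shape is assumed for illustration and is not TD's actual schema:

```python
# Lock-step walk over two streams sorted by (station_id, item_id): advance
# whichever side is behind, pair up matching keys, and stop when the listings
# run out; leftover database rows never need to be compared.
def merge_walk(db_rows, listing_rows, key=lambda row: (row[0], row[1])):
    db_iter = iter(db_rows)
    db_row = next(db_iter, None)
    for listing in listing_rows:
        lkey = key(listing)
        # advance the database side until it catches up with the listing key
        while db_row is not None and key(db_row) < lkey:
            db_row = next(db_iter, None)
        if db_row is not None and key(db_row) == lkey:
            yield db_row, listing   # present in both: candidate for an update check
        else:
            yield None, listing     # not in the DB yet: candidate for an insert


# tiny usage example with made-up rows
if __name__ == "__main__":
    db = [(1, 1, "old"), (1, 3, "old"), (2, 1, "old")]
    listings = [(1, 1, "new"), (1, 2, "new"), (2, 1, "new"), (3, 1, "new")]
    for pair in merge_walk(db, listings):
        print(pair)
```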
In case you didn't notice:
https://forums.frontier.co.uk/threads/eddb-a-site-about-systems-stations-commodities-and-trade-routes-in-elite-dangerous.97059/page-37#post-10114765