Store all torrent fields in the database #284
Comments
Should the |
I think that it should be the original, since it is closest to the real date. However, does the GUI show this date? |
No, it doesn't. It only shows the upload date and the canonical info-hash. |
The problem is that there is a many-to-one relationship between original creation dates and our torrent in the DB... 😮‍💨 |
Maybe we should keep the original creation date if the info-hash does not change, and update the creation date to the upload date if the canonical info-hash is different. As I see it, we are creating a new torrent, because the torrent identity is defined by its info-hash. |
So the torrent creation date is canonical. Then that is easy: it is a one-to-one relationship. 😄 |
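The rule discussed in the comments above (keep the original creation date when the info-hash is unchanged, otherwise use the upload date) could be sketched roughly as follows. This is only an illustration: the function name, its parameters, and the use of hex strings for info-hashes are assumptions, not code from the project.

```rust
/// Hypothetical helper: picks the creation date to store for an uploaded torrent.
///
/// `original_info_hash` is the info-hash of the uploaded file and
/// `canonical_info_hash` is the info-hash of the torrent as stored by the index.
/// Both are plain hex strings here for simplicity (an assumption).
pub fn stored_creation_date(
    original_info_hash: &str,
    canonical_info_hash: &str,
    original_creation_date: Option<i64>,
    upload_date: i64,
) -> Option<i64> {
    if original_info_hash.eq_ignore_ascii_case(canonical_info_hash) {
        // Same identity: keep whatever the uploaded file said (it may be absent).
        original_creation_date
    } else {
        // The info-hash changed, so this is treated as a new torrent created on upload.
        Some(upload_date)
    }
}
```

With this rule each stored torrent has exactly one creation date, which matches the one-to-one relationship mentioned above.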
Hi @da2ce7, maybe some fields were not included intentionally, because they are described in BEPs we do not support or because they are not officially defined in any BEP. Anyway, I would include them.
Original unofficial specification
BEP: There's no specific BEP for "creation_date". Description: This field is an optional key that contains the creation time of the torrent, in standard UNIX epoch format (seconds since 1-Jan-1970 00:00:00 UTC).
BEP: Again, there's no specific BEP for "comment". It's part of the original unofficial specification. Description: An optional field that contains free-form comments for the torrent. It's essentially a text field where the creator of the torrent can put any desired information.
BEP: Like "comment" and "creation_date", there's no specific BEP for "created_by". It's part of the original unofficial specification. Description: Another optional field that identifies the software used to create the .torrent file.
BEP: There's no specific BEP that defines the "encoding" field. It's part of the original unofficial specification. Description: It specifies the character encoding used for various strings within the torrent meta-info, such as the "comment" or "created by" fields. The most common encoding is UTF-8.
BEP 5: DHT Protocol
Description: This BEP defines the Distributed Hash Table (DHT) protocol that BitTorrent clients use to find peers without using a central tracker. The "nodes" field in a .torrent file provides an initial list of nodes for bootstrapping into the DHT.
BEP 32: IPv6 extension for DHT
BEP 17: HTTP Seeding (Hoffman-style)
In the main area of the metadata file and not part of the "info" section, will be a new key, "httpseeds". This key will refer to a list of URLs, and will contain a list of web addresses where torrent data can be retrieved. This key may be safely ignored if the client is not capable of using it.
BEP 19: HTTP/FTP Seeding (GetRight-style)
Using HTTP or FTP servers as seeds for BitTorrent downloads. |
Hello @josecelano, I like the idea of including these unofficial fields. However, we should be careful: uploaded torrents may contain invalid or even malicious data in these fields. |
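As an illustration of the kind of care meant here, a minimal sketch of cleaning untrusted optional fields before storing them might look like this. The function names, the length limit, and the specific checks are assumptions made for the example, not rules from any BEP or from the project.

```rust
/// Hypothetical upper bound for free-form text fields; not a value from any BEP.
const MAX_TEXT_LEN: usize = 1024;

/// Returns a cleaned copy of an optional free-form field (for example
/// `comment` or `created by`), or `None` if nothing usable is left.
pub fn sanitize_text_field(value: Option<&str>) -> Option<String> {
    let cleaned: String = value?
        .chars()
        .filter(|c| !c.is_control()) // drop control characters such as NUL or ESC
        .take(MAX_TEXT_LEN) // truncate by characters, never mid-codepoint
        .collect();
    let trimmed = cleaned.trim();
    if trimmed.is_empty() {
        None
    } else {
        Some(trimmed.to_string())
    }
}

/// Accepts a creation date only if it is a non-negative UNIX timestamp
/// that is not in the future relative to `now` (seconds since the epoch).
pub fn sanitize_creation_date(value: Option<i64>, now: i64) -> Option<i64> {
    value.filter(|&secs| secs >= 0 && secs <= now)
}
```

Dropping a suspicious value (returning `None`) rather than rejecting the whole upload is one possible design choice; because these fields are optional, losing them does not affect the info-hash.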
Hi @da2ce7 Maybe we should draft a new BEP collecting and describing all the unofficial fields in order to make them official. I think that would be a really great contribution. |
@josecelano I would love to do that once I find time. I would use our https://github.com/torrust/teps repo for the draft; then, once we are happy with it, we can submit it to https://www.bittorrent.org/ |
OK, then I think I can start implementing this issue so we have more info for the TEP/BEP. Besides, I have the database with the 2400 torrents from academictorrents, so we can check what values those fields have. |
Just for the record, and to be clear about the purpose of this issue: this package, https://github.com/ttlajus/lava_torrent/blob/master/src/torrent/v1/mod.rs#L58-L88, uses a different strategy. It models the struct after the specifications, and unknown fields are added separately.
/// Everything found in a *.torrent* file.
///
/// Modeled after the specifications
/// in [BEP 3](http://bittorrent.org/beps/bep_0003.html) and
/// [BEP 12](http://bittorrent.org/beps/bep_0012.html). Unknown/extension
/// fields will be placed in `extra_fields` (if the unknown
/// fields are found in the `info` dictionary then they are placed in
/// `extra_info_fields`). If you need any of those extra fields you would
/// have to parse it yourself.
#[derive(Clone, Debug, Eq, PartialEq)]
pub struct Torrent {
    /// URL of the torrent's tracker.
    pub announce: Option<String>,
    /// Announce list as defined in [BEP 12](http://bittorrent.org/beps/bep_0012.html).
    pub announce_list: Option<AnnounceList>,
    /// Total torrent size in bytes (i.e. sum of all files' sizes).
    pub length: Integer,
    /// If the torrent contains only 1 file then `files` is `None`.
    pub files: Option<Vec<File>>,
    /// If the torrent contains only 1 file then `name` is the file name.
    /// Otherwise it's the suggested root directory's name.
    pub name: String,
    /// Block size in bytes.
    pub piece_length: Integer,
    /// SHA1 hashes of each block.
    pub pieces: Vec<Piece>,
    /// Top-level fields not defined in [BEP 3](http://bittorrent.org/beps/bep_0003.html).
    pub extra_fields: Option<Dictionary>,
    /// Fields in `info` not defined in [BEP 3](http://bittorrent.org/beps/bep_0003.html).
    pub extra_info_fields: Option<Dictionary>,
}
In our case, we are using explicit optional fields:
pub nodes: Option<Vec<TorrentNode>>,
pub httpseeds: Option<Vec<String>>,
What we want to do is persist them in the database, since the struct and the persisted data don't match. |
I've created a console command that helps create torrent files with the fields you want to add for testing purposes. See #511. |
All files in the
We should also persist all extra fields as described in my previous comment.
pub struct Torrent {
    pub info: TorrentInfoDictionary,
    #[serde(default)]
    pub announce: Option<String>,
    #[serde(default)]
    pub nodes: Option<Vec<(String, i64)>>,
    #[serde(default)]
    pub encoding: Option<String>,
    #[serde(default)]
    pub httpseeds: Option<Vec<String>>,
    #[serde(default)]
    #[serde(rename = "announce-list")]
    pub announce_list: Option<Vec<Vec<String>>>,
    #[serde(default)]
    #[serde(rename = "creation date")]
    pub creation_date: Option<i64>,
    #[serde(default)]
    pub comment: Option<String>,
    #[serde(default)]
    #[serde(rename = "created by")]
    pub created_by: Option<String>,
} |
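To make the persistence idea more concrete, here is a rough sketch of how the list-valued fields `nodes` and `httpseeds` from the struct above could be flattened into rows before being written to the database. The `Torrent` type below is a trimmed stand-in for the real struct, and `NodeRow`, `HttpSeedRow`, and both functions are hypothetical names invented for this example; the actual schema may differ.

```rust
/// Trimmed-down stand-in for the `Torrent` struct quoted above
/// (only the fields this sketch needs; serde attributes omitted).
pub struct Torrent {
    pub nodes: Option<Vec<(String, i64)>>,
    pub httpseeds: Option<Vec<String>>,
}

/// Hypothetical row for a table of DHT bootstrap nodes (BEP 5).
#[derive(Debug, PartialEq)]
pub struct NodeRow {
    pub torrent_id: i64,
    pub node_ip: String,
    pub node_port: i64,
}

/// Hypothetical row for a table of HTTP seeds (BEP 17).
#[derive(Debug, PartialEq)]
pub struct HttpSeedRow {
    pub torrent_id: i64,
    pub seed_url: String,
}

/// One row per `(host, port)` pair; an absent `nodes` key yields no rows.
pub fn node_rows(torrent_id: i64, torrent: &Torrent) -> Vec<NodeRow> {
    torrent
        .nodes
        .iter()
        .flatten()
        .map(|(ip, port)| NodeRow { torrent_id, node_ip: ip.clone(), node_port: *port })
        .collect()
}

/// One row per URL; an absent `httpseeds` key yields no rows.
pub fn http_seed_rows(torrent_id: i64, torrent: &Torrent) -> Vec<HttpSeedRow> {
    torrent
        .httpseeds
        .iter()
        .flatten()
        .map(|url| HttpSeedRow { torrent_id, seed_url: url.clone() })
        .collect()
}
```

Scalar fields such as `comment`, `created by`, `creation date`, and `encoding` would not need this step, since each could map to a single nullable column.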
For the record, fields related to files (
CREATE TABLE "torrust_torrent_files" (
    "file_id" INTEGER NOT NULL,
    "torrent_id" INTEGER NOT NULL,
    "md5sum" TEXT DEFAULT NULL,
    "length" BIGINT NOT NULL,
    "path" TEXT DEFAULT NULL,
    FOREIGN KEY("torrent_id") REFERENCES "torrust_torrents"("torrent_id") ON DELETE CASCADE,
    PRIMARY KEY("file_id" AUTOINCREMENT)
);
Single file torrent:
{
  "created by": "qBittorrent v4.5.4",
  "creation date": 1691149572,
  "info": {
    "length": 11,
    "name": "sample.txt",
    "piece length": 16384,
    "pieces": "<hex>D4 91 58 7F 1C 42 DF F0 CB 0F F5 C2 B8 CE FE 22 B3 AD 31 0A</hex>"
  }
}
Multiple file torrent:
{
  "created by": "qBittorrent v4.5.4",
  "creation date": 1691151958,
  "info": {
    "files": [
      {
        "length": 11,
        "path": [
          "sample.txt"
        ]
      }
    ],
    "name": "sample",
    "piece length": 16384,
    "pieces": "<hex>D4 91 58 7F 1C 42 DF F0 CB 0F F5 C2 B8 CE FE 22 B3 AD 31 0A</hex>"
  }
}
https://wiki.theory.org/BitTorrentSpecification#Info_in_Single_File_Mode |
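The difference between the two modes shown above can be illustrated with a small sketch that turns either kind of `info` dictionary into rows shaped like the `torrust_torrent_files` table. Everything here is an assumption made for illustration: the type and function names are hypothetical, the single-file row uses a NULL `path` (suggested only by the `DEFAULT NULL` in the table definition), and multi-file paths are joined with `/`, which may not be how the project stores them.

```rust
/// Simplified stand-in for the `info` dictionary (only the keys used here).
pub struct Info {
    pub name: String,
    /// Present in single-file mode.
    pub length: Option<i64>,
    /// Present in multi-file mode.
    pub files: Option<Vec<InfoFile>>,
}

/// One entry of the `files` list in multi-file mode.
pub struct InfoFile {
    pub length: i64,
    pub path: Vec<String>,
    pub md5sum: Option<String>,
}

/// Hypothetical in-memory form of one `torrust_torrent_files` row
/// (without `file_id` and `torrent_id`, which the database would assign).
#[derive(Debug, PartialEq)]
pub struct FileRow {
    pub md5sum: Option<String>,
    pub length: i64,
    pub path: Option<String>,
}

pub fn file_rows(info: &Info) -> Vec<FileRow> {
    match (&info.files, info.length) {
        // Multi-file mode: one row per entry in `files`.
        (Some(files), _) => files
            .iter()
            .map(|f| FileRow {
                md5sum: f.md5sum.clone(),
                length: f.length,
                path: Some(f.path.join("/")),
            })
            .collect(),
        // Single-file mode: one row; the file name lives in `info.name`,
        // so `path` is left as NULL in this sketch.
        (None, Some(length)) => vec![FileRow { md5sum: None, length, path: None }],
        // Neither key present: not a valid torrent, so no rows.
        (None, None) => Vec::new(),
    }
}
```

For the two sample torrents above, this sketch would produce one row each: `length = 11` with a NULL path for the single-file torrent, and `length = 11` with path `sample.txt` for the multi-file one.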
Relates to: #285
Currently, we have these fields when we parse/decode the torrent file:
Some are not persisted in the database. We could persist all fields in the Torrent struct. Some fields are missing in the database, like:
nodes
encoding
httpseeds
creation_date
comment
created_by
Subtasks
creation_date, comment, created_by, encoding #296
httpseeds #438. BEP 17.
nodes #437. BEP 5.