Add WebSocket support with API for connecting to external devices #744
Comments
Thanks @TheLastProject for the very thorough explanation and of course thanks for volunteering for this. If WebSocket servers indeed can't be used in the browser, we might use the Kontalk server as an encrypted and secure support channel to pass data through. It will open a whole new can of worms though (and it's a totally different idea anyway - just throwing stuff out right now). EDIT: I did some more research, and there seems to be some effort to standardize peer-to-peer communication within the browser with HTML 5 and WebSockets, but it's still rough and not very precise; it does not support raw TCP/IP as far as I can see and it doesn't support encryption either.
You indeed raise a very valid point. The way I see it, the best way is probably to use a connection broker like Peer.js, which allows two clients to connect to each other through a central server. After the connection is set up, the server is no longer needed. I tested this with 2 Peer.js clients and the Peer.js connection broker server, which I then killed, and the connection stayed stable. Assuming the Peer.js connection broker also works with other WebRTC clients, this is probably a great solution. Using a connection broker like that does remove the nice authentication system we have in steps 4 and 5, but a possible workaround would be to have the web client generate a random number and put this in the QR code together with its Peer.js ID (making sure the number is only transmitted over the QR code). Then, the web client and Android client could both show this number and the Android client could ask the user to confirm it. This would make it extremely unlikely for the broker server to send a connection to the wrong device without it being noticed, as the "bad device" would also have to generate the exact same number.
Nice. We could host a PeerJS instance on the Kontalk server.
Wouldn't it be enough to just insert the random number inside the QR code like you said, and avoid the manual number typing step? Did you mean it as an additional confirmation step for security reasons?
Inserting a random number is indeed meant as an additional (and possibly paranoid, but I prefer to err on the side of caution) confirmation step, in case the PeerServer would, for whatever reason, connect someone to the wrong system (which you would notice pretty quickly, but this extra confirmation would allow the user to notice it before the other system is allowed to request any information). I did however mean showing the number both on the web client and in the Android client and letting the user simply press "Continue" or "Cancel" on the Android client instead of typing the number manually. Sorry for being unclear there.
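To make the confirmation idea a bit more concrete, here is a minimal sketch of what the web client could put in the QR code and what the Android client would read back. Everything in it is an assumption for illustration: the JSON layout, the field names `peer_id` and `code`, the six-digit length and the function names are not part of any agreed spec.

```python
import json
import secrets
from typing import Tuple


def make_pairing_payload(peer_id: str, digits: int = 6) -> Tuple[str, str]:
    """Web client side: build the QR text and the code to display.

    The layout {"peer_id": ..., "code": ...} is hypothetical.
    """
    # Cryptographically strong random number, zero-padded to a fixed width.
    code = f"{secrets.randbelow(10 ** digits):0{digits}d}"
    payload = json.dumps({"peer_id": peer_id, "code": code})
    return payload, code


def read_pairing_payload(payload: str) -> Tuple[str, str]:
    """Android side: extract the peer ID and the code from the scanned QR text."""
    data = json.loads(payload)
    return data["peer_id"], data["code"]


# The web client shows `code` next to the QR code; the Android client shows
# the code it scanned and waits for the user to press "Continue" or "Cancel".
payload, code = make_pairing_payload("web-1234")
peer_id, scanned_code = read_pairing_payload(payload)
print(peer_id, scanned_code)
```

The number only travels inside the QR code, so the broker never sees it; the user comparing the two displayed numbers is the actual check.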
No problem, ok I got it now. If we use a broker, does it mean over-the-Internet connections are possible too? Security concerns aside for a moment.
Ok, let's say we use PeerJS or ICE or whatever method to handshake the connection. It doesn't matter where the two devices are. We can allow over-the-Internet connections (with proper encryption and security measures). Anyway, I like the idea so let's proceed with further research. If you'd like, you may draft some specs for the protocol (APIs) that we should use; we'll discuss it together when the time comes, of course after the proper research has been done. Use the wiki in this repository to share your research if you want. I'll set this to 4.0.0 for now and we'll see how the group chat development efforts will drive the overall implementation of this feature.
Personally, I'd like to avoid having to use a broker when possible, but from my research it seems impossible to do it without one. I would love to be proven wrong, though! I have played with PeerJS before and it indeed would allow us to set up connections between different networks, which I believe is a convenience feature WhatsApp lacks. If we can do this without a broker, though, it would be (slightly) more secure. I'd happily write up a first version of what I'd like the API to do and put it on the wiki. How should I name the wiki page for this feature? Does Being able to join the team would be great, but I'm very willing to develop the web client in a separate repository and later move it to the GitHub organisation when it gets accepted. On the note of setting the milestone to 4.0.0, I'd like to urge you not to do so. While I agree it would be a cool feature, group chats are already a huge new thing with the potential for some small edge-case bugs after release. Having a start for this planned in the same period will probably cause a load of unexpected work after 4.0.0, and emergency bugfix releases are never fun.
Sure, whatever you believe is the right title.
No, you're right. Since this will be highly WIP stuff, please use the specs wiki for now (I've granted you commit access so you can modify the wiki) for anything you'd like to note about the APIs, any research and/or insights you might find useful for the development. Think of it as a brainstorming notebook if you want. When it reaches a more defined state, we'll convert the wiki pages into .md documents in the specs repository itself.
And I've set it to 4.1.0 for now (absolutely not a requirement, I'm constantly changing the milestones; let's say it will be some time "after group chat").
Thanks for moving the milestone. Group chats definitely matter more. For what it's worth, I've set up a first revision at https://github.com/kontalk/specs/wiki/[WIP]-API-for-app-to-app-communication. It still needs changes, including documenting how to start a new session, but at least it shows my general idea for the API. Feedback and/or questions from you (or other Kontalk community members) would be very welcome. The API definitely looks a lot more complicated than it is, due to the ability to request a specific field instead of all data all the time, but it should cover everything needed to manage the user, contacts and conversations, including sending and receiving messages.
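As a rough illustration of the "request a specific field" idea, the responding side could filter a record down to the fields the other side asked for. This is only a sketch: the contact record, its field names and the `select_fields` helper are made up here and are not taken from the wiki page.

```python
from typing import Any, Dict, List, Optional


def select_fields(record: Dict[str, Any],
                  fields: Optional[List[str]] = None) -> Dict[str, Any]:
    """Return only the requested fields, or the whole record if none are given.

    Unknown field names are silently skipped in this sketch; a real API
    might prefer to report an error instead.
    """
    if fields is None:
        return dict(record)
    return {name: record[name] for name in fields if name in record}


# Hypothetical contact record on the Android side.
contact = {"id": 7, "name": "Alice", "status": "Hi there", "last_seen": 1420070400}

print(select_fields(contact, ["name", "status"]))
print(select_fields(contact))
```

Requesting only `name` and `status` keeps responses small when the web client just needs to refresh a contact list.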
I noticed that my WIP is very similar to JSON-RPC yet somewhat different. I'll probably go through it again sooner or later to rework it to use JSON-RPC, making it easier to parse and hopefully to generate with existing libraries.
I updated the wiki page, converting the whole system to JSON-RPC 2.0, which means it should become much easier to generate all this with existing libraries in all major languages, improving interoperability.
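For readers unfamiliar with JSON-RPC 2.0, a minimal sketch of a request and a dispatcher follows. The envelope (`jsonrpc`, `method`, `params`, `id`, `result`, `error`) and the error codes come from the JSON-RPC 2.0 specification; the method name `contacts.get` and its parameter are hypothetical and not taken from the wiki page.

```python
import json
from typing import Any, Callable, Dict


def make_request(method: str, params: Dict[str, Any], request_id: int) -> str:
    """Serialize a JSON-RPC 2.0 request with named parameters."""
    return json.dumps(
        {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    )


def error_response(code: int, message: str, request_id: Any) -> str:
    return json.dumps(
        {"jsonrpc": "2.0", "error": {"code": code, "message": message}, "id": request_id}
    )


def handle_request(raw: str, handlers: Dict[str, Callable[..., Any]]) -> str:
    """Dispatch one request to a handler and serialize the response."""
    try:
        request = json.loads(raw)
    except json.JSONDecodeError:
        return error_response(-32700, "Parse error", None)
    if not isinstance(request, dict) or request.get("jsonrpc") != "2.0":
        return error_response(-32600, "Invalid Request", None)
    handler = handlers.get(request.get("method"))
    if handler is None:
        return error_response(-32601, "Method not found", request.get("id"))
    result = handler(**request.get("params", {}))
    return json.dumps({"jsonrpc": "2.0", "result": result, "id": request.get("id")})


# Hypothetical method exposed by the Android client.
handlers = {"contacts.get": lambda contact_id: {"id": contact_id, "name": "Alice"}}

raw = make_request("contacts.get", {"contact_id": 7}, request_id=1)
print(handle_request(raw, handlers))
```

Batch requests and notifications (requests without an `id`) are left out to keep the sketch short.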
Reasoning
WhatsApp has a "web client" which allows the user to run a browser application that communicates with WhatsApp on their smartphone. These "web clients" tremendously help usability, because it is much easier to type larger amounts of text on a computer than on a smartphone. This system also protects less tech-savvy Kontalk users from leaking their private key to untrustworthy or improperly secured devices.
The downside of this kind of client is that it is not usable without another device already running Kontalk. However, I feel this is a limited issue, seeing how Kontalk already supports basic federation with XMPP and will improve this in the future. Therefore, users who want to use a web client to chat with Kontalk users can create an XMPP account on another server and use one of the many existing web-based XMPP clients.
Note: In this document, I use the term web client to tell Android and non-Android apart easily, but there is no reason why the client could not be a desktop client, smartwatch client or anything else. The parts that describe the behaviour of the web client are only mentioned for clarity's sake, and are obviously not the responsibility of the Android client.
Necessary additional functionality in the Kontalk Android client
Intended workflow
Note: All IP addresses and port numbers are examples; the actual values will vary depending on the client.
In this system, the QR code is used as authentication, making it very simple to authenticate a system and prevent unwanted connections. For security reasons, the Android client will NEVER run a server and will ONLY connect as a client after specifically being instructed to do so, disabling all related functionality AS SOON AS the connection dies.
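The connection rules above could be modelled roughly as follows. This is a sketch of the intended behaviour only, with no real networking: the `DeviceLink` class, its method names and the example endpoint are assumptions made for illustration.

```python
from typing import Optional


class DeviceLink:
    """Models the rule: connect only after a QR scan, disable on disconnect."""

    def __init__(self) -> None:
        self.endpoint: Optional[str] = None  # no link until a QR code is scanned

    @property
    def active(self) -> bool:
        return self.endpoint is not None

    def connect_from_qr(self, scanned_endpoint: str) -> None:
        # The Android client only ever acts as a client, and only towards
        # the endpoint the user explicitly scanned.
        self.endpoint = scanned_endpoint

    def on_connection_lost(self) -> None:
        # Disable all related functionality as soon as the connection dies.
        self.endpoint = None

    def handle(self, request: str) -> str:
        if not self.active:
            raise PermissionError("no active link; scan a QR code first")
        return f"handled {request!r} for {self.endpoint}"


link = DeviceLink()
link.connect_from_qr("ws://192.0.2.10:8080")  # example address only
print(link.handle("ping"))
link.on_connection_lost()
print(link.active)
```

Keeping the "is a link active" check in one place means every API handler refuses requests by default once the connection is gone.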
Reference material
These are just the first things I found; there may be better references and libraries.
Android WebSocket library
Android WebSocket example
Android JSONObject
Android JsonReader
Notes