The library cannot parse JSON generated by Chrome DevTools Protocol #3903

Closed
2 tasks
goodguysoft opened this issue Dec 29, 2022 · 3 comments
Labels
kind: bug solution: wontfix the issue will not be fixed (either it is impossible or deemed out of scope)

Comments

@goodguysoft

Description

The library throws an exception when I try to parse a JSON message produced by the Chrome browser via the Chrome DevTools Protocol. I do not know whether Chrome violates any standard here, but Chrome is a de facto standard, so a flag that makes the library accept Chrome's output would be useful. Visual Studio, Notepad++, and some other tools display the same JSON without complaint, so other widely used tools accept this format.

Reproduction steps

I attach a code sample (a Visual Studio 2022 native unit test). It simply calls json::parse on a JSON file generated by the Chrome browser.

Expected vs. actual results

Since Chrome, Visual Studio, and Notepad++ can parse the JSON, I expect nlohmann::json to provide a way to do so as well. The file containing the JSON in question can be downloaded here: data.json

Minimal code example

#include "pch.h"
#include "CppUnitTest.h"

#include <format>
#include <fstream>
#include <string>

#include <nlohmann/json.hpp>

using namespace Microsoft::VisualStudio::CppUnitTestFramework;

namespace JsonTest
{
	TEST_CLASS(JsonTest)
	{
	public:
		
		TEST_METHOD(Nlohmann)
		{
			using namespace nlohmann;
			using namespace std;
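			try
			{
				// Open the file captured from Chrome DevTools for reading only.
				ifstream json_reader(R"__(c:\data.json)__");
				// Throws json::parse_error when the input is not valid JSON.
				json json_data = json::parse(json_reader);
			}
			catch (const exception& error)
			{
				// Report the parser's message in the test output.
				Logger::WriteMessage(format("Can not parse JSON file. {}", error.what()).c_str());
			}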
		}

	};
}

Error messages

Can not parse JSON file. [json.exception.parse_error.101] parse error at line 1, column 7198: syntax error while parsing value - invalid string: surrogate U+D800..U+DBFF must be followed by U+DC00..U+DFFF; last read: '"\u001f\b\u0000\u0000\u0000\u0000\u0000\u0000\u0003s~�EJvoCJ\n$A\u0011I)Y~2Ye4�\u017cs';�;O;yHT\u06f4m\u00f2\r{tk\r\u000ea�97 8]\u0013\u000e�!<32p]9\u0004Yw\u000e�ZN'w&FkwQL\u0005fx#k4\u00188=:\u001fvXc'-x,\b\\sl\u0006\r\u00181L\u000f${\u00159K\u0005ND\"QS\u0001$\u0007^\u428d-\"8,HBz\u0001\u001ai\u001f&\u001ag/xE\ucaae\u0006 9\"9X*-\u03d0x~MvLgy\u0225V-zz-d~2cI\u00a3\u001fiv~dDD\u0011~\u00073\u0539da1\"\u06c6\u000eT\u0019N3?c\u000b9-\u0006\u001b\u000f}r;0\u0016\u0329\u0007a16G`\u001c\fLn\u0006\u001b#\u00160gFccn\u001a~\u000f,%p]{i\u001f5-hK\u069c\u04b1\rGm'N\u0014\u0001\u0011i=6\u0002ao\u00028)<\u000ft\u0013sJ32g\u04f0#\u59dd\u03d4\u001dMFwyl|&2\u001a`A\u0006\u0010=\u0341\u0003Xp;>Fg\u00d1e;dd\u001fgp\rO@&L yw\u0007�\u0002/\u053dtyr9\n[$�:1r`\u0307\u001e\f\u0546<tg+\u0019\u0002h4e!,Y+e\u0002B~\u001d7L@P\u001dnGI\u001b\u0001j^\u001e\\\u000b~\u0003#\u0007op\rfr\u05a8gI0w\u0019J\u001e\u0751/61kR$4pesK}\u0016IH\"\u030d(\u04c8I}S?\u02dd:\u0003\\\"\n\u0015l\u0732PW \u001351, ?\u000b3_,_@\u001c\u0010r\u0010�9(eK\u0013\u0002)8K\u000fI]a\u0006|1?sjU%d+meYFQu\u02fb\"!5\u000fv�\u000e.\u001aN\u05c4\u0012\u0010\u0014Rx\u03627`m\u0010=\u001f!zr}Y\u0004>-z\u00039O.\u0335|\u001f{K?=Z@\\N\nf54&D}0co1f\u00160b M\u0013v`,%!X_\u000bB=<\u001aB\u0014\f/]3\u0010e/\u0002\u001eda\u0011D\u0011E5\u0233D\u001e[$\u001aH$:\u001bF#\b\u0014\rB\u03c5\",A\u001e%T\u0002\u0004C\u000e}84sqB5\u000e\u01e9\u0013\u0012y\u001e[$HdA$C\u0003Y\u01e2\b\u0016?7�\fY\n8gB\u001eZ\"jg=W{M`\"\u03f1;O\"aQ$\nr\u001f^|\u0006g_m8\"\u001bQ`5_> 
y\u000eI\u000b\u001e>o\u0004\u0002)J<isA1sRBL#\u001esKM'Su\u0010\u001a3q\u001f+\u012ex\u001eOx\u0016<\u001eA\u0166@\\$Qp\u001anT\u0001?F\u001cL?<FCnW+&WR~n\u0004\b)\u0011y\u0018ae\u0018LkCPB\u0013bL\u0017CB9\u0003?,1<Af\u001b^K&\u0002\u0018aL3\u0393G0q?<\u0000\u0006N\u0011/B\fe\u02db\"YX\u000fn\u000f\u0000\u00135cTt\u0018\u0018<$\u000fJ'\u0002w\u0267\u0120?j7\u0006T\u001d\u0007s\u001c\u001c\u0004Q\u0016b\u00060\b_:=--\u0019Y\u0018\u0000W\u0015C\u06547D\u0012zYsH\u0014L\u0011~^\u0015\u001etF<\b33\u0011mhT\u020d\u0000> a\u000e\u001di|2[ \u0013\u001a\u0011i`#\u000b1:\u0000\f$\u05bdxsirs\u0005W9\fL*\u07d82o\u001bb9\fh&r`\u28c8e\u0012+\f \u078b\u001a\"YA{XXJ\rDItSX2\u000by\u0006\u0014\u0012l\u0015\u00142@`p]4CE3\u0012  Dwq7o33\u02c4/[TYy\u0016n\u0014J\u0018#Bg4\u01fa.s\u0016S\bKaZ0\u0010y)h\bxh\b,\u0018$=\u0000\"yR\u0016\u000f\"\b`qlZ:Q\r\u0016\u000f)\u00042Y&'b\u001a\u0002Z\t2\u0012\u0214\u0000\u0011+F\u0000`6\\f,\u0006DQ@R\u00195_24|M#3#\u0003VK\u030aays\u00018\u0002uw!\u04cf[B8),MKL\u000bE\u0006CQ&\u001f\u0552\u001c\u0014\u001f6Rz\u0004K3J3TBz\u0015\u001a6\u0018\tVt\u001eOWd\\R71!\"X\\N^\u0014O\u000f\u0019\"\u0003\u0013syCq0<o\u0002\u0013\u0003X4(Y\u000e\u0013J\u0002\u0011nHA\u03a2\"N\u000f\u0013X>x&\u000bVe\u0014X\u000fY\u0002:< `fr\n\u0017v\u0000WQ\u001b?\u0001b\u00109E\u0000\u0006^\\7\b~+cj3g\u0014Rq\b~X7\u00163XlV~\u0003<\u000e\"CV7tSX\u0013\u0017\u0018'\u001c\fR\u0004sj\u0004\u001ea\u001aeF?\u0011zmKq0+^Z8\u000b]\u0017bNDVN?\u0019H\u0014s\u0005\u001bE\bbq#3\u00069\u0411/<\bb\u0017=R\u0006i.Ae�$\u0003K,\u0005\u000f6\u001a,SnW\u001cpi\u0007{+\\#\u0016!_\u001a,|\u0011(tw\u0011wA\u0006qNEE9+\u071aPDE\u0756\u0720\rb\u0018pjC\u0019[Ck`)b=Ig\u0016 
�\u0152g\u0000_\u0014:\u0018,Y`T0z\u0398[qd\u000e\u0006NB%\u000f\u0000X\u0012,\u00ed@'I,\t3\bQ\u0013~\u001aJ\u0016X!7t\u0012%\u001am*S\u001fj\f\u0674N3Qiij\u0015\"\"\r\u0011q\u001c\u001cm3\b\u0006*\u0012J:B+f\u0010V\u0007a\u0005HS_.!\u0011mbH![\u001b\u0142gi955UE*\\%baq3t~!c,t5Ex2hV$\u0014S*`\u001bwr\b\u02e0Y\u0010\u0005\u0018\u0015sV_k^<\u00166M\u0002^cyK\u000f\u0017\u001emg\u0004\u0016F~\na\u06cbP\u00152<}\u0012x\tx\u015f\tX9\u000fS\u001d<5X<t!LD\u000e{\u0004:4A\u0293\u001bE1\u01dc5K.{=\u0000,X|pdXh\bvY\u0006S(@p\u0002\u02f6\u0001\u00032\u0016eHX\u0016E\u0440!\u0540+i9_50I\u0012&\u0088D\u0010#\u017c-1_rJ\u0018\u0018Q\u0019uSn19O0\bB:\f20+o\u0018qO\b.H'\u0011a\u0014\\\u0007Z\u0002#\t\u0735\u0007md\u0006M,%\u0011`R\u000f\u0003HHD4\u07390\u001dXc,\u01a9&$*D8$1X\\\u05f0\"@\u0016h4a3\u00d0~\u0001G\b\u0004eWJ~IXPLG+\u0001L[!\fh\u0000\u6ee2R\u0014\u0011\u0000\u607f{`\u026aYHN\u0005E\u0002s\n3RJ\u0014Ad+\u000e\u0011\u0017l\u0011`gIl7\u0015\u00129c\u0018ha\u062er9701\u0002O\u001dE\r@l1*me\u001c6\u0006I4B`\u0004Q)F\u001a\u0003\u00199\u0019\bFY\u000bF\u03ee\u0014G\u0001@\"\u0004k\u0000BG\u001d\u0004+S&T$1\u0014u\u0015P,\f9A\u000eQ\u0001\u0007~&V\r\"Z\u0004\u0015J4C]V \u000e&S\u0739\u0631-u!mM5O\n\u001bZE!\u031eW\u0241UD:` J\u000b\t8\"?%rM\u0004\u0002\u0001C<\u0005\u001c\u0198eA\u001d08\u0010\u0011\u0000\u000bJ8 \u001a*TRj;Wv\bD\u001a`\u04e9[`\u0013`?R\u7498&\nVVGs;sC\u00043\u001a\u0012\u0017,b\u0018Qn,eJ+=i<6v%{!^o\u0620F\t[J.\u0013~P\u001f\u000bJ,\u0002\u05f2]h\u000brkWnu\u0000\u001dF\u07c4\u001bjkTd\u0005lzsj\u000b\nW<D(\nDM\u0018cx\u000f\u0003-+tgp\u0001Y\u001eD).\u0772v/9UT\u0016|\u0013'\u0011.CA\u0017eb0#4\u0015\u001d\\\u0015ki\u042eMSa<j\u0002b)7Kg\u0001\u03da\u0016n.\u001e njq]>KXS=mg\bjs\u0616\u0014sj7\u0005?h\u4008]\u0015^\u0017@\"\u000eJ<p_gh�\u04ba\u042d\u06b1!>Vr\u000bh\u5f0f\\o\"+g\u001f)!\u001a\u0018\u0015Qm\u06f4\rp;\u000fmO\u0269 p h\u00027V\feU\u0016\u00012\u0615lve+\u0010\u0018nVX\u0016\u04e36m}|'1J6\u03a9A 
1J\u0165\u0016k*H8C\u0005*\u0003�\u0012[Sf0}GuJpq\u00efQRI\u001cvT\u0330Y\u0006\u072a`\fZ5\u000eLVNY\u001eGj\u0007\u0018t\u0000Ll1;F[xam9@\u0014*i]YW\u0019L6\u0018t\u0003JAO:.\f}*Um6/\u0003Yn\u0003\\@<^\u0013;YJ\u0014lR8+=\u0011<6b\u0018Ta$:\f%MAE\u0469`.\"5\nw~$-\u001ej<Ss\nr�\u0727\u0004%{y.i\fzG.Y\u0013m*\u00044euj\b7S'1\fUA\u0189'8p\u0002\u0018]:P\u19a8_z'ks\u000e�X\u001f\u001a\u001eG\u06b85)\u032dnMtJgYm&R>`K\u01673Bm%\u0016\u0016\u001dLbc;\u0010nY-\u001a!\u0003\u0017x96q@\u04e8vd\nW\u0011$.:\u0006a\u06a5w\u5c3c[SljK}``X4#\u0006C,\u0002 g\u0272<\"4\u0690\nI6~#\u0010\u0010\u0541\u0004pVP;j\u00154=O\uf8ca\f\u0019SmbKc\u0000(\u079d\u000b|\b\u0001!NE\"\u00158[:C\u000b\u0000H\t\u056fqkp<r<u3MzS\u0466XjD\u0017[YG\u001f\u0010\u05de\\p3\\G9\u001d~HVwx8eYB]jhH/ \r`=IL<K\uc9b9\t\u416d/W[\u001cy\bZ*]d(`\u0013X\u0001YVDy\\\u0000\u0016\\5q7U\u0002^N\u0019z\u0016\u0000\u0005@\rDA2LSE\u0007\b\u0018\r\u0001P4\"7@L!l&\u0017i\"Q\u0012P`qT\u0000C`=9SlH\u00150m.:y.)(\u000fBm\ud81d\"'

Compiler and operating system

Visual Studio 2022, Windows 11 Pro.

Library version

3.11.2

Validation

@nlohmann
Owner

The JSON structure is correct, but the payload in postData is invalid Unicode (hence the error message). The JSON specification (RFC 8259) assumes correct Unicode to ensure interoperability. The specification allows implementations to reject inputs otherwise. The library currently has no switch to ignore such an error.
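
For readers unfamiliar with the rule quoted in the error message, here is a minimal sketch of how a `\uXXXX` surrogate pair maps to a single code point. It is an illustration of the UTF-16 rule only, not the library's implementation; the function names `is_high_surrogate`, `is_low_surrogate`, and `combine_surrogates` are made up for this example.

```cpp
#include <cstdint>
#include <optional>

// True for the first half of a surrogate pair (U+D800..U+DBFF).
constexpr bool is_high_surrogate(std::uint32_t unit)
{
    return unit >= 0xD800 && unit <= 0xDBFF;
}

// True for the second half of a surrogate pair (U+DC00..U+DFFF).
constexpr bool is_low_surrogate(std::uint32_t unit)
{
    return unit >= 0xDC00 && unit <= 0xDFFF;
}

// Combine two 16-bit units into one code point.
// Returns std::nullopt when the units do not form a valid high/low pair,
// which is the situation the parse error above complains about.
std::optional<std::uint32_t> combine_surrogates(std::uint32_t high, std::uint32_t low)
{
    if (!is_high_surrogate(high) || !is_low_surrogate(low))
    {
        return std::nullopt;
    }
    return 0x10000u + ((high - 0xD800u) << 10) + (low - 0xDC00u);
}
```

In the dump above, the last escape read is `\ud81d`, a high surrogate, and what follows it is `\"` rather than a second `\uDC00`..`\uDFFF` escape, so no valid pair can be formed.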

@nlohmann
Owner

(Are all your inputs ill-formed like this, or is this an exception?)

@goodguysoft
Author

goodguysoft commented Dec 29, 2022

postData is the actual POST data sent by the website and may be binary. For whatever reason, the Chrome DevTools Protocol encodes it this way; base64 might be a better choice, but I suppose Google decided to simply escape the binary data. So malformed data may appear from time to time, depending on the website I am debugging with Chrome DevTools. The only fix I see for now is to filter out the postData value and parse it manually later. An "incorrect string" callback that receives the offending string (the postData value in this example) and lets custom code handle it might be a good idea. It would be similar to the existing parser_callback_t callback, which, as far as I can see, does not allow handling invalid data.
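
One possible shape for the "filter it out first" workaround is to preprocess the raw text before it reaches the parser. The sketch below is only an assumption about how that could look, not something the library provides. It replaces every unpaired `\uD800`..`\uDFFF` escape with `\uFFFD` (the Unicode replacement character) and leaves everything else untouched. The helper name `replace_lone_surrogates` is hypothetical, and the substitution is lossy: the binary content of postData cannot be recovered from the result.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

namespace {

// Value of one hex digit, or -1 if c is not a hex digit.
int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// If text at pos starts with "\uXXXX", return the 16-bit value, else -1.
long read_u_escape(std::string_view text, std::size_t pos)
{
    if (pos + 6 > text.size() || text[pos] != '\\' || text[pos + 1] != 'u') return -1;
    long value = 0;
    for (std::size_t i = pos + 2; i < pos + 6; ++i)
    {
        const int digit = hex_value(text[i]);
        if (digit < 0) return -1;
        value = value * 16 + digit;
    }
    return value;
}

} // namespace

// Hypothetical helper (not part of nlohmann/json): rewrite unpaired
// surrogate escapes in raw JSON text as \uFFFD so a strict parser accepts it.
std::string replace_lone_surrogates(std::string_view text)
{
    std::string out;
    out.reserve(text.size());
    std::size_t i = 0;
    while (i < text.size())
    {
        if (text[i] != '\\')
        {
            out += text[i++];
            continue;
        }
        const long unit = read_u_escape(text, i);
        if (unit < 0)
        {
            // Another escape such as \" or \\: copy both characters so an
            // escaped backslash is never mistaken for the start of \u.
            out += text[i++];
            if (i < text.size()) out += text[i++];
            continue;
        }
        const bool high = unit >= 0xD800 && unit <= 0xDBFF;
        const bool low = unit >= 0xDC00 && unit <= 0xDFFF;
        if (high)
        {
            const long next = read_u_escape(text, i + 6);
            if (next >= 0xDC00 && next <= 0xDFFF)
            {
                out.append(text.substr(i, 12)); // valid pair, keep as is
                i += 12;
                continue;
            }
        }
        if (high || low) out += "\\uFFFD";   // unpaired surrogate
        else out.append(text.substr(i, 6));  // ordinary \uXXXX escape
        i += 6;
    }
    return out;
}
```

The returned string could then be handed to `json::parse` in place of the original file contents. If the binary payload itself matters, the postData value would have to be cut out of the raw text and decoded separately before this step.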

@nlohmann nlohmann added the solution: wontfix the issue will not be fixed (either it is impossible or deemed out of scope) label Sep 23, 2023

2 participants