-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: base64 encoding of source maps with emojis #14607
Conversation
cli/emit.rs
Outdated
// emojis. To alleviate this problem, we're gonna manually edit | ||
// emitted files and append a source map that we'll encode manually. | ||
// This is only happening for the emits for "deno" subcommand, the | ||
// runtime API (`Deno.emit()`) is excluded from this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we identify if this behaviour happens with tsc and if so open a bug in their repo before merging this PR? Then we could open an issue in our repo to eventually remove this code though it already would be once we switch fully to swc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created a sample TSC project with following config:
export function hello(): string {
return "📣❓";
}
{
"compilerOptions": {
"allowJs": true,
"allowSyntheticDefaultImports": true,
"emitDecoratorMetadata": false,
"experimentalDecorators": true,
"importsNotUsedAsValues": "remove",
"incremental": true,
"inlineSourceMap": true,
"inlineSources": true,
"isolatedModules": true,
"jsx": "react",
"module": "esnext",
"removeComments": true,
"sourceMap": false,
"strict": true,
"target": "esnext",
"useDefineForClassFields": true,
"useUnknownInCatchVariables": false,
}
}
The output is as follows:
export function hello() {
return "📣❓";
}
//# sourceMappingURL=data:application/json;base64,eyJ2ZXJzaW9uIjozLCJmaWxlIjoiZW1vamkuanMiLCJzb3VyY2VSb290IjoiIiwic291cmNlcyI6WyJlbW9qaS50cyJdLCJuYW1lcyI6W10sIm1hcHBpbmdzIjoiQUFBQSxNQUFNLFVBQVUsS0FBSztJQUNuQixPQUFPLEtBQUssQ0FBQztBQUNmLENBQUMiLCJzb3VyY2VzQ29udGVudCI6WyJleHBvcnQgZnVuY3Rpb24gaGVsbG8oKTogc3RyaW5nIHtcbiAgcmV0dXJuIFwi8J+To+Kdk1wiO1xufVxuIl19
It does contain a proper source map that can be decoded. It looks like this is somehow caused by our implementation of TS host. I wonder if tsc
available for node uses some kind of custom base64
encoding function that is different than the function in default compiler host implementation (that we use).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably. Deno is using the following, but Node uses Buffer.from(...).toString("base64")
/**
* Converts a string to a base-64 encoded ASCII string.
*/
function convertToBase64(input) {
var result = "";
var charCodes = getExpandedCharCodes(input);
var i = 0;
var length = charCodes.length;
var byte1, byte2, byte3, byte4;
while (i < length) {
// Convert every 6-bits in the input 3 character points
// into a base64 digit
byte1 = charCodes[i] >> 2;
byte2 = (charCodes[i] & 3) << 4 | charCodes[i + 1] >> 4;
byte3 = (charCodes[i + 1] & 15) << 2 | charCodes[i + 2] >> 6;
byte4 = charCodes[i + 2] & 63;
// We are out of characters in the input, set the extra
// digits to 64 (padding character).
if (i + 1 >= length) {
byte3 = byte4 = 64;
}
else if (i + 2 >= length) {
byte4 = 64;
}
// Write to the output
result += base64Digits.charAt(byte1) + base64Digits.charAt(byte2) + base64Digits.charAt(byte3) + base64Digits.charAt(byte4);
i += 3;
}
return result;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, good to know, I guess we could just provide our own JS version to fix this problem? Do you happen to now how it's "injected" into the compiler host?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could assign a new function to ts.base64encode
. Basically, copy this, but use something else than convertToBase64
:
function base64encode(host, input) {
if (host && host.base64encode) {
return host.base64encode(input);
}
return convertToBase64(input);
}
ts.base64encode = base64encode;
```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's it:
> convertToBase64("📣❓")
'7aC97bOj4p2T'
> Buffer.from("📣❓").toString("base64")
'8J+To+Kdkw=='
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened microsoft/TypeScript#49150
We can just overwrite ts.convertToBase64
I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
This commit fixes source maps for files that contain emojis. This is done by updating "deno_ast" to "0.14.1" for the case of "--no-check" flag (ie using SWC emit) and by overriding TSC's default base64 encoder (which turned out to be buggy) for the type checking case.
Fixes #10936