-
Notifications
You must be signed in to change notification settings - Fork 258
feat!(NODE-4802): Refactor BSON to work with cross platform JS APIs #518
New issue
Have a question about this project? No Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “No Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? No Sign in to your account
Conversation
82f2ccf
to
cfc8f00
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments for everything but the byte utils test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test review
docs/upgrade-to-v5.md
Outdated
|
||
> **TL;DR**: Web environments return Uint8Array; Node.js environments return Buffer | ||
|
||
For those that use the BSON library on Node.js, there is no change the BSON APIs will still return and accept instances of Node.js Buffer. Since we no longer depend on the Buffer web shim for compatibility with browsers, in non-Node.js environments a Uint8Array will be returned instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we mention which APIs this actually affects? and can we say in the summary instead of what? (i.e. <what API is affected> in web environments return Uint8Array instead of <...>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://jira.mongodb.org/browse/NODE-4794
That list would be looking at all the public APIs that changed in this PR and listing them out, in the interest of moving this along we will be cleaning up this guide when we reach the end of the EPIC and the diff of this PR will still be available to us then for reference. So let's defer to NODE-4794, I've added a note about creating a list of APIs changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this isn't just about making sure we have a well documented list, it's about knowing what these specific code changes actually affect, so I don't think it's out of scope of this PR to identify what will be changing. This work will still need to be done, and it is easier to do it piecemeal with the corresponding code changes than in one big block, where we might miss something that was affected due to the sheer volume of changes.
@@ -15,19 +15,28 @@ for nodejs and web platforms. | |||
|
|||
> **TL;DR**: Web environments return Uint8Array; Node.js environments return Buffer | |||
|
|||
For those that use the BSON library on Node.js, there is no change the BSON APIs will still return and accept instances of Node.js Buffer. Since we no longer depend on the Buffer web shim for compatibility with browsers, in non-Node.js environments a Uint8Array will be returned instead. | |||
For those that use the BSON library on Node.js, there is no change - the BSON APIs will still return and accept instances of Node.js Buffer. Since we no longer depend on the Buffer web shim for compatibility with browsers, in non-Node.js environments a Uint8Array will be returned instead. | |||
|
|||
This allows the BSON library to be better at platform independence while keeping its behavior consistent cross platform. The Buffer shim served the library well but brought in more than was necessary for the concerns of the code here. | |||
|
|||
### `ObjectId.toString` / `UUID.toString` / `Binary.toString` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
### `ObjectId.toString` / `UUID.toString` / `Binary.toString` | |
### Restrict supported encodings in `ObjectId.toString` / `UUID.toString` / `Binary.toString` |
@@ -15,19 +15,28 @@ for nodejs and web platforms. | |||
|
|||
> **TL;DR**: Web environments return Uint8Array; Node.js environments return Buffer | |||
|
|||
For those that use the BSON library on Node.js, there is no change the BSON APIs will still return and accept instances of Node.js Buffer. Since we no longer depend on the Buffer web shim for compatibility with browsers, in non-Node.js environments a Uint8Array will be returned instead. | |||
For those that use the BSON library on Node.js, there is no change - the BSON APIs will still return and accept instances of Node.js Buffer. Since we no longer depend on the Buffer web shim for compatibility with browsers, in non-Node.js environments a Uint8Array will be returned instead. | |||
|
|||
This allows the BSON library to be better at platform independence while keeping its behavior consistent cross platform. The Buffer shim served the library well but brought in more than was necessary for the concerns of the code here. | |||
|
|||
### `ObjectId.toString` / `UUID.toString` / `Binary.toString` | |||
|
|||
> **TL;DR**: These `toString` methods only support the following encodings: 'hex', 'base64', 'utf8' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the 'utf8' encoding isn't supported on ObjectId.toString
or UUID.toString
, then this summary is incorrect
I actually think that what you currently have as a "tl;dr" can be simply the sub-heading for the corresponding set of changes. I think it would also make the table of contents easier to generate. In the case of this TL;DR, it can be removed entirely if you update L22 the way I am suggesting; then the paragraph underneath can elaborate on the specifics
Then the TL;DR of the section above (L16) can be promoted to a subheading, with both of these sections being under the title "Remove reliance on Node.js buffer", which would be the grouping heading.
So in this guide, the grouping headings will indicate the themes of the changes, and the subheadings in the groups will be the actual breaking change from the user's perspective. (What I mean is, "remove reliance on nodejs buffer" isn't the breaking change, it's the cause of the breaking changes whose summaries follow)
As for the 3rd example, I think we can group it under "Other" for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope this is not off-topic, but as a third party maintainer of the deno_mongo project (also web_bson), this is awesome!
Description
NODE-3555 / NODE-4802
What is changing?
Refactors BSON to rely on common JS globals and runs tests for both node and non node global scenarios.
ByteUtils
A new collection of functions that change their implementation based on the environment the BSON library is imported into. If Buffer is available the ByteUtils wraps the same Buffer functions we have been using in BSON thus far. If Buffer is not available ByteUtils uses a combination of existing JS TypedArray APIs as well as some original implementations to get the needed format translations BSON requires.
Using VM context
In order to test how the BSON library behaves when Buffer exists or not, we use
vm.runInContext
and read in the umd bundle as a string. We can then "run" the library with our own definition of a global object. All BSON tests are run both with and without Buffer being defined. The only globals required are:TextEncoder
,TextDecoder
,atob
,btoa
.buffer.copy
Buffer.copy calls have been translated to
dst.set(src.subarray(), offset)
calls. As seen here: https://github.com/nodejs/node/blob/main/lib/buffer.js?rgh-link-date=2022-10-26T15%3A06%3A09Z#L247-L262 this is the same approach taken internally by node we just have inputs that we can trust are the correct type.isArray
flagThe
isArray
flag internal to the the serializer has been removed, it didn't provide functionality other than switching the key encoding to ASCII, but that doesn't actually change the output for a stringified number which is always "ascii". Additionally, we were mistakenly setting this flag with the wrong variableserializeFunctions
, which would cause functions with names beyond the ascii range to be serialized with broken keys. It has to be fixed in this PR b/c of typescript, but I've filed NODE-4771 so we can make a separate entry in the changelog about this.chai
.throws
When running the BSON library from another context, instanceof checks on errors will fail so I've enhanced the throws assertion to permit errors whose
.name
property match the expected error name.ObjectId.toString
/UUID.toString
/Binary.toString
The above methods all supported passing an encoding through to the Buffer.toString implementation, but the various encodings supported are unlikely to be useful and would be out of scope to maintain for this library. The underlying buffer can be fetched and passed to the Node.js Buffer .toString implementation if so desired. However, we will continue to support a common subset of the encodings that are likely to be useful, and the TS asserts that the encodings are only the ones we do accept now. (base64/utf8/hex)
What is the motivation for this change?
Maintainability
Double check the following
npm run lint
script<type>(NODE-xxxx)<!>: <description>