How to import in nodejs server? #32

Closed
djaffer opened this issue Apr 23, 2023 · 9 comments

Comments

djaffer commented Apr 23, 2023

/node_modules/@dqbd/tiktoken/lite/tiktoken_bg.cjs:375
throw new Error(getStringFromWasm0(arg0, arg1));
^

Error: null pointer passed to rust
at module.exports.__wbindgen_throw (/node_modules/@dqbd/tiktoken/lite/tiktoken_bg.cjs:375:11)
at wasm:https://wasm/0030beca:wasm-function[788]:0x70f59
at wasm:https://wasm/0030beca:wasm-function[786]:0x70f3f
at wasm:https://wasm/0030beca:wasm-function[654]:0x6bdba
at wasm:https://wasm/0030beca:wasm-function[147]:0x477e2
at Tiktoken.encode (/node_modules/@dqbd/tiktoken/lite/tiktoken_bg.cjs:223:18)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)

Node.js v18.16.0

Author

djaffer commented Apr 23, 2023

Never mind, I am using the tiktoken-node package instead. You should clearly document that this will not work with Node.js without module support, which makes it pointless for backend use.

djaffer closed this as completed Apr 23, 2023

Owner

dqbd commented Apr 23, 2023

Hi @djaffer,

I'm not sure where the issue is, as Node.js is well supported by both the lite and the full-fledged variants. Could you please share a reproducible example?


Full-fledged example (tested on v18.16.0)

const { get_encoding } = require("@dqbd/tiktoken");
const encoding = get_encoding("gpt2");
const tokens = encoding.encode("noone is there");
encoding.free();

Lite example (tested on v18.16.0)

const { Tiktoken } = require("@dqbd/tiktoken/lite");
const cl100k_base = require("@dqbd/tiktoken/encoders/cl100k_base.json");

const encoding = new Tiktoken(
  cl100k_base.bpe_ranks,
  cl100k_base.special_tokens,
  cl100k_base.pat_str
);
const tokens = encoding.encode("hello world");
encoding.free();
console.log({ tokens });


Author

djaffer commented Apr 24, 2023

It gave a Rust null pointer error: Uncaught (in promise) Error: null pointer passed to rust

Owner

dqbd commented Apr 24, 2023

it gave rust null pointer error Uncaught (in promise) Error: null pointer passed to rust

Could you please share a reproducible example? Thanks

@bosunolanrewaju

I encountered the same error when I tried to encode a string after the encoding had been freed.

const { get_encoding } = require("@dqbd/tiktoken");
const encoding = get_encoding("gpt2");
const tokens = encoding.encode("noone is there");
encoding.free();
const otherTokens = encoding.encode("second encoding"); // <-- this throws Error: null pointer passed to rust

@KitsonBroadhurst

I had the same issue. Moving encoding.free(); to the end, after all of the calls to encoding.encode, appeared to solve it!
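
A minimal sketch of that ordering, in case it helps: wrap the work in try/finally so free() runs exactly once, after every encode call. The withEncoder helper and the createFakeEncoder stand-in are hypothetical names, not part of @dqbd/tiktoken. The stand-in only mimics the "throws after free" behaviour reported above so the snippet runs without the package; in real code the factory would presumably be something like () => get_encoding("gpt2").

```javascript
// Hypothetical helper: create an encoder, run all the work, then free once.
function withEncoder(createEncoder, work) {
  const encoding = createEncoder();
  try {
    return work(encoding);
  } finally {
    // Runs even if `work` throws, and only after every encode call is done.
    encoding.free();
  }
}

// Stand-in encoder (an assumption for illustration, not the real library).
// It reproduces only the use-after-free error described in this thread.
function createFakeEncoder() {
  let freed = false;
  return {
    encode(text) {
      if (freed) throw new Error("null pointer passed to rust");
      // Fake "tokens": one code point per token.
      return Array.from(text, (ch) => ch.codePointAt(0));
    },
    free() {
      freed = true;
    },
  };
}

// Both strings are encoded before the encoder is released.
const counts = withEncoder(createFakeEncoder, (encoding) =>
  ["noone is there", "second encoding"].map((s) => encoding.encode(s).length)
);
console.log(counts);
```

The encoder should not escape the callback: anything that keeps a reference and calls encode later would hit the same error.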

Author

djaffer commented May 23, 2023

So the issue is that we have to reinitialize the encoding after freeing it. Is that good practice, or can we reuse it?

This works

function getTokens() {
  const encoding = new Tiktoken(
    cl100k_base.bpe_ranks,
    cl100k_base.special_tokens,
    cl100k_base.pat_str
  );
  const tokens = encoding.encode("hello world");
  encoding.free();
  console.log({ tokens });
  return tokens;
}

The example below does not work. I thought initializing multiple times was bad practice.

const encoding = new Tiktoken(
  cl100k_base.bpe_ranks,
  cl100k_base.special_tokens,
  cl100k_base.pat_str
);
function getTokens() {
  const tokens = encoding.encode("hello world");
  // This frees the shared encoder, so the next getTokens() call throws.
  encoding.free();
  console.log({ tokens });
  return tokens;
}

djaffer reopened this May 23, 2023

Owner

dqbd commented May 23, 2023

Multiple initialisation is definitely fine, as seen in your first example. However, in the second example, the encoder is freed and then accessed again afterwards, which is invalid.
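
If the goal is to reuse one encoder across many calls, one possible shape is to keep it alive and free it only at shutdown. This is a rough sketch, not the library's API: SharedEncoder, countTokens, dispose and createFakeEncoder are all hypothetical names. The stand-in factory exists only so the snippet runs without the package; in real code it would presumably construct a Tiktoken instance as in the examples above.

```javascript
// Hypothetical wrapper: owns one encoder, reuses it across calls,
// and frees it only when dispose() is called (e.g. at server shutdown).
class SharedEncoder {
  constructor(createEncoder) {
    this.createEncoder = createEncoder;
    this.encoding = null;
  }

  countTokens(text) {
    // Create lazily, and re-create if dispose() was called earlier.
    if (this.encoding === null) this.encoding = this.createEncoder();
    return this.encoding.encode(text).length;
  }

  dispose() {
    if (this.encoding !== null) {
      this.encoding.free();
      // Drop the reference so the freed encoder can never be used again.
      this.encoding = null;
    }
  }
}

// Stand-in factory (an assumption for illustration, not the real library).
let created = 0;
function createFakeEncoder() {
  created += 1;
  let freed = false;
  return {
    encode(text) {
      if (freed) throw new Error("null pointer passed to rust");
      return text.split(" "); // fake "tokens": one per word
    },
    free() {
      freed = true;
    },
  };
}

const shared = new SharedEncoder(createFakeEncoder);
const first = shared.countTokens("hello world");
const second = shared.countTokens("hello again world"); // same encoder reused
shared.dispose();
const third = shared.countTokens("after dispose"); // transparently re-created
shared.dispose();
console.log({ first, second, third, created });
```

The point of setting the reference to null in dispose() is that a later call re-creates the encoder instead of touching freed memory.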

Author

djaffer commented May 24, 2023

Thanks. Why isn't free automated, for example through a refactor?
