-
Notifications
You must be signed in to change notification settings - Fork 649
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed parsing of tags #382
Conversation
.map(m => m[2].trim()) | ||
.filter(tag => !isNumeric(tag)) | ||
Array.from(text.matchAll(HASHTAG_REGEX), m => m[2].trim()).filter( | ||
tag => !isNumeric(tag) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the regex could be more restrictive as to avoid this check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should work const HASHTAG_REGEX = /(^|[ ])#([a-zA-Z][\w_-]+\b)/gm;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that more regex magic could do the trick, but after spending so much time on it I got pragmatic (tired) and settled here.
This is not the best tag parsing code out there, I have seen something by twitter https://github.com/twitter/twitter-text/ but it's much more (I have also pulled out their tag code, it just feels a bit heavier and doesn't quite match the tags cases we want).
I have not been able to find a simple tag extraction library/gist/file out there, would love pointers if people are aware of some.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The addition I made is simple enough to work in "every" case, you just force the first character to be a letter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thing is, it can start with a number, it just can't be all numbers:
62 | expect(
63 | extractHashtags('this #123 tag should be ignore, but not #123four')
> 64 | ).toEqual(new Set(['123four']));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was merged, have you tried with the latest version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, using v0.77. The code is a bit different know.. But i don't know if i'm doing something wrong...
export const extractHashtags = (text: string): Set<string> => {
return isSome(text)
? new Set(Array.from(text.matchAll(HASHTAG_REGEX), m => m[2].trim()))
: new Set();
};
Vscode highlight it as a tag but it's not recognized as one in tag explorer. Maybe the problem is with tag explorer itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@riccardoferretti do you have any idea what this might be?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the meantime, please submit an issue @bserrao
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interesting.. I reproduced it as well, that's odd. yes let's start by filing an issue
fixes problem referred to in #302 (see when reopened)