fixed wrong word limiter for answer warnings #2540

Open
AlexanderSchicktanz wants to merge 10 commits into e-valuation:main from AlexanderSchicktanz:main
@AlexanderSchicktanz
Collaborator

fix #2516

hansegucker
hansegucker previously approved these changes Oct 27, 2025
Comment on lines +11 to +15
let j;
for (j = 0; j < sub.length; j++) {
    if (arr[i + j] !== sub[j]) break;
}
if (j == sub.length) return true;
Member

I think this could be slightly nicer if you create another helper function arrayEquals(a: string[], b: string[]): boolean :)

Bonus points: Can you remove the string type and make a generic function instead? Take a look at our utils.ts file for examples, or ask if you need some input :)
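A minimal sketch of what such a generic helper could look like. The names arraysEqual and containsSubArray and their signatures are assumptions for illustration; they are not taken from the PR or from utils.ts.

```typescript
// Hypothetical helpers, not the PR's actual code.

// True if both arrays have the same length and pairwise identical (===) elements.
function arraysEqual<T>(a: readonly T[], b: readonly T[]): boolean {
    return a.length === b.length && a.every((value, index) => value === b[index]);
}

// True if `sub` occurs as a contiguous run inside `arr`.
// Note: an empty `sub` is treated as contained in every array.
function containsSubArray<T>(arr: readonly T[], sub: readonly T[]): boolean {
    for (let i = 0; i + sub.length <= arr.length; i++) {
        if (arraysEqual(arr.slice(i, i + sub.length), sub)) return true;
    }
    return false;
}
```

With the comparison pulled out, the outer loop no longer needs the shared index variable j, and the same code works for any element type that can be compared with ===.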

Member

I still think that this change would make a nice improvement to the current implementation

janno42
janno42 previously approved these changes Oct 27, 2025
Member

@janno42 janno42 left a comment

✅ functionality tested

Comment on lines +21 to +22
const words = text.split(" ");
const triggerWords = triggerString.split(" ");
Member

@niklasmohrin niklasmohrin Oct 30, 2025

Splitting on a single space character means that we don't match if users accidentally type two spaces. We should split on any run of whitespace (as Python's str.split does when called without arguments).
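To illustrate the difference, here is a small sketch. The helper name splitOnWhitespace is an assumption and does not appear in the PR.

```typescript
// Hypothetical helper, not the PR's actual code.
// Splits on any run of whitespace and never yields empty strings,
// similar in spirit to Python's str.split() without arguments.
function splitOnWhitespace(text: string): string[] {
    const trimmed = text.trim();
    return trimmed === "" ? [] : trimmed.split(/\s+/);
}

// Splitting on a single space keeps an empty string between two spaces:
const naive = "see  above".split(" "); // ["see", "", "above"]

// Splitting on whitespace runs does not:
const robust = splitOnWhitespace("see  above"); // ["see", "above"]
```

The empty-string check matters because "".split(/\s+/) returns [""] rather than an empty array.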

function matchesTriggerString(text: string, triggerString: string): boolean {
    const words = text.split(" ");
    const triggerWords = triggerString.split(" ");
    return containsSubArray(words, triggerWords);
Member

This probably means that if users write something like "see above." at the end of a sentence, we will not show a warning, because the dot is included in the last word.

Doing this completely correctly seems tough, unless we at some point change the UI so that staff users can enter arbitrary regexes as trigger words. What do you think we should aim for in this PR, @janno42?

Member

We could first preprocess (before splitting) by regex-replacing all non-word characters with a delimiter, then splitting on the delimiter. Then, punctuation, arbitrarily weird (repeated) whitespace and other non-printable characters would not be an issue.
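A rough sketch of that preprocessing idea. The function name extractWords and the choice of delimiter are assumptions. It also uses Unicode property escapes instead of \W, because \W in JavaScript only knows ASCII word characters and would treat umlauts as separators.

```typescript
// Hypothetical helper, not the PR's actual code.
// Built via the RegExp constructor so the pattern does not depend on the compiler target.
// Matches every run of characters that are neither letters nor digits.
const NON_WORD_RUN = new RegExp("[^\\p{L}\\p{N}]+", "gu");

// Replaces every non-word run with a single space, then splits on that space.
function extractWords(text: string): string[] {
    return text
        .replace(NON_WORD_RUN, " ")
        .trim()
        .split(" ")
        .filter((word) => word.length > 0);
}
```

Under this approach a trigger such as "s.o." would itself be broken into "s" and "o", so trigger strings would presumably need to pass through the same function before comparison.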

Collaborator Author

It now matches strings that include dots, question marks and exclamation marks: it tries to match the start of the string and then checks whether the rest consists only of those characters.
This way, trigger words that themselves contain these delimiters (e.g. "s.o.") can also be matched.
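One possible reading of that description, as a sketch only. The function name wordMatchesTrigger is an assumption, and the PR's actual implementation may differ.

```typescript
// Hypothetical helper illustrating the described approach; not the PR's actual code.
// A word matches if it starts with the trigger word and whatever follows
// consists only of '.', '?' or '!' characters (possibly nothing at all).
function wordMatchesTrigger(word: string, triggerWord: string): boolean {
    if (!word.startsWith(triggerWord)) return false;
    const rest = word.slice(triggerWord.length);
    return /^[.?!]*$/.test(rest);
}
```

Because the trigger is compared verbatim against the start of the word, a trigger like "s.o." keeps its own dots, while "aboveboard" does not match the trigger "above".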

Member

But there is still the problem that if there is more than one space character in a row, we get an empty string as one of the words.

@richardebeling richardebeling removed their request for review November 3, 2025 19:24
return text.length > 0 && ["", "ka", "na", "none", "keine", "keines", "keiner"].includes(text.replace(/\W/g, ""));
}

function containsPhrase(arr: string[], sub: string[]): boolean {
Member

This function is very difficult to read because it does too many things at once. We should probably have something like

function matchesTriggerString(text: string, triggerString: string): boolean {
    const words = extractWords(text);
    const triggerWords = triggerString.split(" ");
    return isSubArray(triggerWords, words);
}

Then isSubArray should also not be this complicated. It should use another function, areArraysEqual, as I previously suggested.

Member

Note that this also means that all of our custom "how do we handle weird text things" logic is separated from the entire "do the arrays match" logic.

Development

Successfully merging this pull request may close these issues.

Regard word limits in text answer warnings

6 participants