Issue with umlaut characters

Completed

Comments

10 comments

  • Fletcher Penney

    Can you send a screencast of exactly what you're doing and what happens?  (You'll need to do this as a support ticket, not as a community post)

     

    I've tested with some basic French and German, including umlauts, etc., as well as copy-pasting some Chinese, Japanese, and Korean without problems.  

     

    In my tests, indexing, search, and using the search bar all work properly for me.

  • juha.ranta

    OK, I'll send a screenshot, but here's some further info about the issue. If I have a note starting with "Tässä" and try to write something starting with "ta" in the search field, then it selects the note starting with "Tässä" and removes the "a". Effectively I can't write anything starting with "ta" in the search field with this configuration. I think there's some kind of 'ä'/'a' and also 'ö'/'o' conversion there that messes it up. It also happens with the Swedish 'å'/'a' and German 'ü'/'u'.

    BTW while these umlaut characters ('a' vs 'ä') are often entirely different characters in the corresponding language, it does make sense that searching for "munchen" will return "münchen" as well. However, with this issue it's not possible to write "munchen", or any word starting with "mu" for that matter, in the search bar if there's a note name starting with "münchen".

    Also for instance, in Spanish an accented letter may work similarly to the unaccented one, but in Finnish the 'a' is as close to 'ä' as it is to 'e'. In Finnish "väärä" means "wrong", while "vaara" means "danger", "fjell" among other things.

    I also noticed that when I create a new note with these characters in the title/filename, then they appear twice on the notes list! 

  • Fletcher Penney

    Ok -- this additional information allowed me to reproduce.

     

    First -- yes, I agree that `u` and `ü` are not the same, and that for search purposes it is better to standardize UTF-8 encoding and strip diacritics.  For example, when I search for `crepe`, I expect to also find results for `crêpe`.  
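As an illustration of the diacritic stripping described above, here is a minimal Python sketch. It is not nvUltra's actual code; the helper names `fold` and `matches` are hypothetical. It decomposes the text (NFD), drops the combining marks, and compares case-insensitively, so `crepe` finds `crêpe`.

```python
import unicodedata


def fold(text: str) -> str:
    """Hypothetical helper: strip diacritics and ignore case."""
    # NFD splits "ê" into "e" plus a combining circumflex (U+0302).
    decomposed = unicodedata.normalize("NFD", text)
    # Drop every combining mark, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def matches(query: str, text: str) -> bool:
    """True if the folded query occurs anywhere in the folded text."""
    return fold(query) in fold(text)


if __name__ == "__main__":
    print(matches("crepe", "Crêpe recipe"))
    print(matches("munchen", "München"))
```

Folding like this is convenient for search, but it also makes `väärä` and `vaara` indistinguishable, which is the trade-off raised earlier in the thread.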

     

    The issue here is that the search bar is doing two things -- searching and finding the first matching file name.  Search (to my knowledge) is working properly.  It's the autocomplete that's a problem.

     

    With two files `Tässä` and `Tags`, `t` matches both and is fine.  When you type `a`, however, it matches both in SQLite when searching for file names (I suspect it is stripping the `ä` and just searching by `t`).  If `Tags` is the first match, then we're fine.  If `Tässä` is the first match returned (e.g. if it is edited most recently and you sort by date) we get a problem.

    The problem is that the "shared" text between `Ta` and `Tässä` is `T`.  So the `ässä` is treated as autocompletion.  Typing `a` again simply gets us right back where we started.  (As an aside, typing `Tä` works just fine).
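The loop described above can be reproduced with a small Python model. This is only a sketch under assumptions: the file lookup is assumed to compare diacritic-stripped names (standing in for the SQLite query), while the completion is computed from the literal shared prefix. The names `fold`, `shared_prefix_len`, and `autocomplete` are hypothetical.

```python
import unicodedata

# Precomposed form of the note title, so that "ä" is one code point.
TASSA = unicodedata.normalize("NFC", "Tässä")


def fold(text):
    """Strip diacritics and case (assumed behaviour of the file lookup)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()


def shared_prefix_len(typed, name):
    """Count leading characters that are literally equal, ignoring case only."""
    count = 0
    for a, b in zip(typed.casefold(), name.casefold()):
        if a != b:
            break
        count += 1
    return count


def autocomplete(typed, filenames):
    """Return (kept_text, selected_completion) for the first folded match."""
    for name in filenames:
        if fold(name).startswith(fold(typed)):
            keep = shared_prefix_len(typed, name)
            return name[:keep], name[keep:]
    return typed, ""


if __name__ == "__main__":
    # "Tässä" sorted first: the typed "a" is swallowed by the completion.
    print(autocomplete("Ta", [TASSA, "Tags"]))
    # "Tags" sorted first: the same input behaves as expected.
    print(autocomplete("Ta", ["Tags", TASSA]))
```

In this model, typing `a` replaces the selected completion, the field becomes `Ta` again, and the lookup returns the same result, which is why the field never gets past `T`.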

     

    I need to figure out the actual solution for this -- I need to get SQLite matching behavior in sync with search and autocomplete behavior. 

     

    In the meantime, you can work around this by pasting into the search bar, or typing the end characters, and inserting leading characters after -- e.g. `ag` => `Tag` works, as does simply pasting `Tag` all at once.

  • Fletcher Penney

    PS>  I think when this first part is fixed, the duplicated notes will fix themselves, as they are related issues.

  • Fletcher Penney

    PPS>  For others following this thread -- there is another comment by juha.ranta that looks to be awaiting approval after an edit???  My comments may not make perfect sense without being able to read that comment, but the overall gist is clear.

     

  • juha.ranta

    "PPS>  For others following this thread -- there is another comment by juha.ranta that looks to be awaiting approval after an edit???  My comments may not make perfect sense without being able to read that comment, but the overall gist is clear."

     

    Thanks for the response, it makes sense and I appreciate the technical description of the issue. I work as a software engineer, so I can follow it. There's nothing really new about this issue in the modified comment; I just tested it with Swedish/German characters and added some thoughts and comments about language. The board perhaps asks for approval after some edits. :)

  • Fletcher Penney

    (My apologies -- Zendesk flagged your comments as possible spam (first time that has happened), and I did not know that or know how to unflag them.  They should be visible now)

     

    I figured understanding the issue would at least help you work around it until I have a true fix.

     

    Thanks!

  • Fletcher Penney

    Ok -- this should be fixed for next release.

     

    Searching still strips diacritics, etc.

     

    But filename autocompletion normalizes the UTF-8 encoding and *should* now allow you to match filenames with both `a` and `ä`, etc.  It works for me, but let me know if you find a situation where it doesn't work.

     

    Note, this can lead to some seemingly bizarre results, but they do still follow the rules mentioned here.  For example, typing `täg` could match your  `Tags.txt` file, but would disable autocompletion since there is no file starting with `Täg`.
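The rules described above can be sketched in Python as follows. This is a guess at the behaviour, not the shipped implementation: file matching is assumed to stay diacritic-insensitive, and a completion is assumed to be offered only when the NFC-normalized filename literally starts with the typed text. The names `fold`, `nfc`, and `lookup` are hypothetical.

```python
import unicodedata


def fold(text):
    """Strip diacritics and case (used for matching only)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()


def nfc(text):
    """Normalize to precomposed form so equal-looking strings compare equal."""
    return unicodedata.normalize("NFC", text)


def lookup(typed, filenames):
    """Return (matched_file, completion); completion is None when disabled."""
    for name in filenames:
        if fold(name).startswith(fold(typed)):
            n_name, n_typed = nfc(name), nfc(typed)
            # Complete only when the typed text is a literal prefix of the name.
            if n_name.casefold().startswith(n_typed.casefold()):
                return name, n_name[len(n_typed):]
            return name, None
    return None, None


if __name__ == "__main__":
    print(lookup("ta", ["Tags.txt"]))   # match with completion
    print(lookup("täg", ["Tags.txt"]))  # match, completion disabled
```

With this rule the typed text is never rewritten, so `Ta` can be entered even when `Tässä` is the first match; the cost is that the completion simply does not appear in that case.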

  • juha.ranta

    It seems to work fairly well now in beta 31. One thing I noticed is that when I delete a note with these characters in the name, it gets deleted from the disk but remains in the nvUltra sidebar. However, upon restart it's gone.

  • Fletcher Penney

    Fixed.  It was trying to remove "Tässä" from the index instead of "Tässä".

     

    (If you're confused by that example, then you understand where the difficulty lies....  There are situations where those two strings are the same, and situations where they are not the same....   Go figure.  ;)
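The two strings in that example look identical because Unicode can encode `ä` either as one precomposed code point (U+00E4) or as `a` followed by a combining diaeresis (U+0308). The Python sketch below shows how the two forms compare; the thread does not say which form nvUltra stored, so the index handling here is an assumption and `index_key` is a hypothetical helper.

```python
import unicodedata

composed = "T\u00e4ss\u00e4"        # "ä" as a single code point (NFC)
decomposed = "Ta\u0308ssa\u0308"    # "a" + combining diaeresis (NFD)


def index_key(name):
    """Hypothetical index key: always store and look up the NFC form."""
    return unicodedata.normalize("NFC", name)


if __name__ == "__main__":
    # Both render as the same word, but a raw comparison says they differ.
    print(composed == decomposed)
    print(len(composed), len(decomposed))
    # After normalization they are the same string.
    print(index_key(composed) == index_key(decomposed))

    # Removing by raw name would miss the entry; removing by key finds it.
    index = {index_key(composed): "note contents"}
    index.pop(index_key(decomposed), None)
    print(index)
```

Normalizing at every boundary where a filename enters the program (disk, user input, database) is one common way to keep such lookups consistent.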
