Sudden occasional problems with HOTP hardware tokens

After years of having no problems at all with Feitian C100 HOTP tokens, we suddenly have about 5 to 10% of our users unable to log in with their token.
My only guess is that the tokens got out of sync, but using the synchronisation function in privacyIDEA doesn’t solve the problem.
What does help is setting the synchronisation window to a ridiculously high number, like 100000, for a few attempts; our default is 1000. Once the token works again, it seems to keep working.
The tokens all have around 1500 - 2000 authentication attempts. So even a 1000 window seems ridiculously high in this context.
There are a few settings on the HOTP token settings page, which I don’t understand and can’t find in the documentation or the RFC-document for HOTP, like “aantal” and “aantalvenster” (my server is in Dutch - setting 6 and 7 in the list)

Our server runs privacyIDEA 3.7.1 and updates automatically.

Did anyone come across this weird behaviour? Anyone have a clue what could cause this?

I’m sorry that this is a very vague question - I don’t have much more information.
Kind regards

If you set such a high window, I would check the counter of the problematic tokens.

My guess is that, if the tokens are really old, the battery could be running weak and something strange happens when the counter is stored in the token hardware.
Although I would expect the display to die before the counter memory gets corrupted.

Thank you for your reply, Cornelius.
We had a couple with a dying display or dying in general, showing bars going round in circles instead of numbers, showing the same number more than once, things like that. But those are fine, since they start working again after syncing.
They are not really old - 3 years or so.

The counter:

Type hotp
Actief actief
Max mislukt 10
Foutenteller 0
OTP-lengte 6
Aantal 344
Aantalvenster 10
Synchronisatievenster 1000
Beschrijving imported

count_auth: 442
count_auth_success: 301
dueDate: 1648648263
hashlib: sha1
last_auth: 2022-06-07 13:28:56.015934+02:...
otp1c: 374
tokenkind: hardware

Is the counter “Aantal” in this example, or otp1c, or count_auth? There are quite a few counters there. It is “Aantal” and “Aantalvenster” that I really don’t understand. Since I maintain the translation :-o I will try to come up with something clearer for others once I understand what they do …

The “count” is the actual counter that is used to calculate OTP values. It is what matters for authentication, cryptography and verifying the OTP value.
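
To make the role of the counter concrete, here is a minimal sketch of how an HOTP value is derived from a secret and a counter, following RFC 4226. It is not privacyIDEA's own code; the function name `hotp` is just chosen for illustration, and the secret is the RFC test secret, not a real token key.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Compute an HOTP value as described in RFC 4226 (HMAC-SHA1)."""
    # The counter is encoded as an 8-byte big-endian integer.
    msg = struct.pack(">Q", counter)
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select an offset.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    # Keep the last `digits` decimal digits, left-padded with zeros.
    return str(code % 10 ** digits).zfill(digits)


# Test secret from RFC 4226, Appendix D; a real token has its own key.
secret = b"12345678901234567890"
print(hotp(secret, 0))  # 755224
print(hotp(secret, 1))  # 287082
```

Token and server each keep their own copy of the counter. If the server's copy is far behind the token's (for example because it was reset to 0), the values the server computes no longer match what the token displays.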

“count_auth” is simply a counter of how often someone tried to authenticate with the token.
“count_auth_success” is similar. “otp1c” is the last counter that was tried for autoresync.
So it looks like you have activated autoresync.

see HOTP Token — privacyIDEA 3.6.2 documentation

see 5.3. System Config — privacyIDEA 3.6.2 documentation

So it could be that this is somehow a mixup of your autoresync. Not sure.
I would suggest that you monitor the “count” of a problematic token.

Thanks a lot for this information Cornelius.
I changed some of the Dutch translation accordingly. Especially the link to the developer documentation, describing the class and its parameters was really useful.

I’m still not sure about the difference between Count Window and Sync Window. They seem to me to be the same thing, namely how far the server looks ahead to check whether a valid OTP value exists in the sequence of OTP values.

No, they are different, but they work together.

The countWindow is relevant for authentication, the syncWindow is relevant for synchronization.
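
A simplified model of the two windows, as a sketch only: this is not privacyIDEA's actual implementation, and the function names `authenticate` and `resync` are hypothetical. It assumes that authentication looks ahead by the count window for a single OTP value, while resynchronization looks ahead by the (larger) sync window for two consecutive OTP values. The windows are tiny here so the example is easy to follow.

```python
import hashlib
import hmac
import struct
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HOTP value per RFC 4226 (HMAC-SHA1, dynamic truncation)."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def authenticate(secret: bytes, server_count: int, otp: str,
                 count_window: int) -> Optional[int]:
    """Check one OTP within count_window; return the new server counter or None."""
    for c in range(server_count, server_count + count_window):
        if hotp(secret, c) == otp:
            return c + 1  # the next expected counter
    return None


def resync(secret: bytes, server_count: int, otp1: str, otp2: str,
           sync_window: int) -> Optional[int]:
    """Find two consecutive OTPs within sync_window; return the new counter or None."""
    for c in range(server_count, server_count + sync_window):
        if hotp(secret, c) == otp1 and hotp(secret, c + 1) == otp2:
            return c + 2
    return None


secret = b"12345678901234567890"  # RFC 4226 test secret, not a real key
server_count = 0                  # server counter, e.g. after a reset
# The token itself is already at counter 5, beyond a count window of 3.
otp_5, otp_6, otp_7 = hotp(secret, 5), hotp(secret, 6), hotp(secret, 7)

print(authenticate(secret, server_count, otp_5, count_window=3))  # None
server_count = resync(secret, server_count, otp_5, otp_6, sync_window=10)
print(server_count)                                               # 7
print(authenticate(secret, server_count, otp_7, count_window=3))  # 8
```

In this model a token whose counter is further ahead than the sync window cannot be recovered by a resync either, which would match the observation that only a much larger window brought the tokens back.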

Ah, so the syncWindow is only used when a sync method is used.
As you suggested, I checked the counter and learned a lot.
A new non-working token came in like this:

Type hotp
Actief actief
Max mislukt 10
Foutenteller 2
OTP-lengte 6
Aantal 0
Aantalvenster 10
Synchronisatievenster 100000
Beschrijving imported

count_auth: 1343
count_auth_success: 1274
dueDate: 1655113795
hashlib: sha1
last_auth: 2022-06-08 16:52:08.691174+02:...
otp1c: 1556
tokenkind: hardware

Which is very weird. The counter is at 0. How can that be?
I managed to get it working again by making the counter window and the sync window larger than the otp1c value and then doing a sync.
Anyway, I seem to have a working method to revive the tokens. It’s just weird that the counter on the server went to 0. I’ll investigate that a little further and if I find something, I’ll post it here as a reference for others.

Thanks a lot for your clarifications!

I am not sure whether the counter is reset when you reassign the token.

If you re-import the hardware token, the counter could also be reset.

I had another notification today, so I could trace exactly how things went.
counter = 0
otp1c = 1652

So I changed the counter window to 1700 and asked the user to try to log in, which worked. After his login, the counter went to 1770.

The token was not reassigned. I’m puzzled as to why the counter went to 0. This user’s first authentication with this hardware token was on 2020-08-28T18:59:40.

So I searched the database for other tokens with 0 as counter and found 19 never-used ones (fair enough) and 6 others: 2 assigned to people who recently left or retired, 2 unassigned but used (probably by people who are no longer in the db), 1 belonging to an active person who hasn’t used the token for a long time, and 1 belonging to another person who I expect will report the problem any time soon :-)
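
For others who want to run the same check, a rough sketch of the filter: a token whose counter is 0 although it has successful authentications is a candidate for a server-side reset. The records and serials below are invented; the field names mirror the token details shown earlier in this thread, and how you export them from your own installation is up to you.

```python
from typing import Dict, List


def find_reset_candidates(tokens: List[Dict]) -> List[str]:
    """Serials of tokens with counter 0 that nevertheless authenticated successfully."""
    return [t["serial"] for t in tokens
            if t["count"] == 0 and t["count_auth_success"] > 0]


# Hypothetical sample records, not real data.
tokens = [
    {"serial": "HYPO0001", "count": 0, "count_auth_success": 1274},  # suspicious
    {"serial": "HYPO0002", "count": 0, "count_auth_success": 0},     # never used
    {"serial": "HYPO0003", "count": 344, "count_auth_success": 301}, # looks normal
]

print(find_reset_candidates(tokens))  # ['HYPO0001']
```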

I don’t think the counter gets reset when I reassign the token. I recently did it a few times and never had a problem doing this.

It is worrying that tokens that worked well for years suddenly have their counter reset server-side. I’m sorry that I haven’t found a reason for it yet; I don’t see a single common factor among the tokens with the problem. We have 86 hardware tokens in the db and have only had something like 10 problems. It can’t have anything to do with the 3.7 or 3.7.1 upgrade, because that was on 2022-05-12.
The first e-mail report I had was on 9/6/22.

The counter can get reset in case of

  • some event handlers…???
  • if you (re)-import the token with a counter=0

This does not happen out of the blue, so I think you have some mechanism in place that you are not aware of.

Also, are you really using the autoresync of the tokens?