DR Test: app & DB seem to work, but all tokens fail: Error 500, TypeError: Non-hexadecimal digit found

dashrb · March 6, 2021, 12:52am

Hi, first post!

I’m having a problem after a Disaster Recovery test where I create a new machine using backups from the working machine: the app appears to show all of my users, and they can log in (LDAP / AD user backend), but no existing HOTP tokens work. All tokens, when tested, return a server error 500, with a stack trace in the log file ending with:

  File "/opt/privacyidea/lib/python2.7/site-packages/privacyidea/lib/security/default.py", line 499, in decrypt
    raise e
TypeError: Non-hexadecimal digit found

The line of code in question is this:
data = binascii.unhexlify(output)

My setup: PI v3.3.3 on a RHEL 7.9 system in AWS with an AWS RDS database, and a /data volume (filesystem) where the PrivacyIDEA user’s home directory exists, the application is installed, and the /etc files exist.
Specifically, these softlinks exist in the root filesystem:
/opt/privacyidea → /data/privacyidea/home
/etc/privacyidea → /data/privacyidea/etc

This /data/ volume and the RDS database are snapshotted regularly.

During a Disaster Recovery restore, the database is recreated from the snapshot.
A brand new machine is built (new ec2 instance), and it mounts /data from the snapshotted filesystem.
The OS itself is configured with ansible in a way that installs libraries and things but leaves the /data contents alone as they are in good condition.

I can log into the web app, and see that I have a token. My iPhone app displays a token value, but when I go to “Test Token”, I see the Internal Server Error 500, and the aforementioned stack trace in privacyidea.log. If I enroll a new HOTP token, I am able to use its values successfully as expected.

So there’s something about the restore process that invalidates my existing tokens.

I added some debug output around that line so it looks like this:
    try:
        data = binascii.unhexlify(output)
    except TypeError as e:
        print('============= DASHRB TypeError found in output: {}'.format(output))
        raise e
    print('============= DASHRB ALL GOOD output: {}'.format(output))

such that it prints the “output” variable both on success and when the exception is raised. I then tested both a working (new) HOTP token and my existing HOTP token. For the new/working token it says:
[Sat Mar 06 00:49:05.744554 2021] [:error] [pid 4851] ============= DASHRB ALL GOOD output: 33346363626463643062323162613836613636646435343933346435373031396265316262346531

and for the existing/broken token it says:
[Sat Mar 06 00:35:46.822475 2021] [:error] [pid 4851] ============= DASHRB TypeError found in output: \xce\xcb\xae\x1c>\x90U\xcb\xa3:\x171|\x99\xb5\xe5s\xc9r\x11/\xdb\x10\xe3\x04\xd3w\xd3\xd6\x0cz\xdd\xb7F\xaf\xa7%a\bLN\x0c\x12\xf2&\xaa\x8a\xf24\x9ce;\xc2\x13\x9d\x9b\x82Pt\xbadT\x0f\xdcA\x02\xb7\x93\xa4.\v9\xc4\xf0J\xf9\x9b\x9e\xe5\xb4\x18\x82\xe5\xa2
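
The difference between the two logged values can be reproduced in isolation. The snippet below is only a sketch: `is_hex_payload` is a hypothetical helper, not part of privacyIDEA, and `good` and `bad` are short prefixes of the two values logged above. It catches both `TypeError` (raised by Python 2.7, as in the stack trace) and `ValueError` (Python 3 raises `binascii.Error`, a `ValueError` subclass).

```python
import binascii


def is_hex_payload(data):
    """Return True if data consists only of hex digits that unhexlify accepts."""
    try:
        binascii.unhexlify(data)
        return True
    except (TypeError, ValueError):
        # Python 2.7 raises TypeError, Python 3 raises binascii.Error.
        return False


# Prefix of the value logged for the working token: ASCII hex digits.
good = b"3334636362"
# Prefix of the value logged for the broken token: arbitrary binary bytes.
bad = b"\xce\xcb\xae\x1c>\x90"

print(is_hex_payload(good))
print(is_hex_payload(bad))
```

The working value decodes cleanly, while the broken value is raw binary, which is what a decryption result typically looks like when something went wrong before the hex decoding step.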

Do you have any idea what could be causing HOTP tokens to fail in this way, and therefore how I can fix this so that I can restore to an operational condition without forcing all my users to destroy and create new tokens?

Thank you in advance!

cornelinux · March 6, 2021, 5:22am

Hello @dashrb
thank you for using privacyIDEA.

There are several indications of what is wrong:

“Non-hexadecimal digit found” occurs after the otpkey in the database is decrypted. That means the result of the decryption cannot be used. → You might have the wrong encryption key.
You are linking directories. This is a rather unusual privacyIDEA setup. → You might be getting confused with the configuration files and the enckey file.
You are saying that newly enrolled tokens work. The otpkey of newly enrolled tokens will be encrypted with the current enckey. → Issue with your enckey.
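
The reasoning in these three points can be illustrated with a small sketch. Note the assumptions: `toy_crypt` is a toy XOR keystream cipher used purely as a stand-in so the example runs with the standard library only, it is not how privacyIDEA encrypts data, and the key names and the seed value are made up. It shows only the general effect: data encrypted under one key decrypts to non-hex garbage under another, while data encrypted under the current key decodes fine.

```python
import binascii
import hashlib


def keystream(key, length):
    """Derive `length` pseudo-random bytes from `key` (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + str(counter).encode("ascii")).digest()
        counter += 1
    return out[:length]


def toy_crypt(key, data):
    """XOR data with the keystream; the same call encrypts and decrypts."""
    ks = bytearray(keystream(key, len(data)))
    return bytes(bytearray(a ^ b for a, b in zip(bytearray(data), ks)))


def try_decode(key, stored):
    """Decrypt, then hex-decode; return None if the result is not hex."""
    try:
        return binascii.unhexlify(toy_crypt(key, stored))
    except (TypeError, ValueError):
        return None


old_key = b"key-from-the-original-machine"   # hypothetical
new_key = b"key-generated-on-the-new-box"    # hypothetical

# An existing token: seed hex-encoded, then encrypted with the old key.
stored = toy_crypt(old_key, binascii.hexlify(b"hypothetical-otp-seed"))

restored = try_decode(old_key, stored)  # correct key: seed comes back
broken = try_decode(new_key, stored)    # wrong key: garbage, hex decode fails
```

A token enrolled after the restore would be stored under `new_key` and so would decode without error on the new machine, which matches the observation that only pre-existing tokens fail.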

So my assumption is that during recovery you failed to restore the correct encryption key.
(However, this would also mean, that e.g. your LDAP resolvers did not work directly, but you had to re-configure the password of your ldap service account! Did you?)

How can you fix this? Get your old encryption key! But then your newly enrolled tokens will no longer decrypt. (This split encryption needs to be fixed manually.)

dashrb · March 10, 2021, 7:11pm

Hi @cornelinux. Thank you for the quick reply, and for your continued investment in answering everyone else’s questions on this forum. I have read many responses in the past when setting up PrivacyIDEA initially and they were very helpful. So thank you for all of that historical effort too.

Your assumption turned out to be correct. This was a DR test as I was restoring to a test area (the split encryption tokens were discarded along with the entire test area). Your advice to look at my encryption key, stored in the filesystem, was spot-on.

The whole thing is fully automated, no manual steps. My DR recovery goes into a new environment with a new AD/LDAP, and part of the ansible automation reads the new environment’s LDAP password from AWS Secrets Manager and injects it during startup. So in a way, yes, I reconfigured the LDAP service account, but it was automated and thus not evident to me.

However…

There was a flaw in my DR logic which caused the filesystem to be recreated from scratch (i.e. a brand new encryption key was being created, rather than restored from the backup snapshot). I addressed this flaw, and confirmed by destroying and recreating the new test environment, that my “old” MFA tokens are still working after the restore.
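
One way to catch this kind of flaw early is to record a checksum of the key file alongside the backups and compare it after a restore, before the application starts. The following is a rough sketch of that idea, not what my automation actually does: `file_sha256` and `key_matches` are hypothetical helpers, and the demo uses throwaway temp files with arbitrary contents in place of the real key file.

```python
import hashlib
import os
import tempfile


def file_sha256(path):
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def key_matches(path, expected_digest):
    """True if the file exists and hashes to the recorded digest."""
    return os.path.isfile(path) and file_sha256(path) == expected_digest


# Demo: throwaway files standing in for the real key file.
workdir = tempfile.mkdtemp()
backup_key = os.path.join(workdir, "enckey.backup")
fresh_key = os.path.join(workdir, "enckey.fresh")
with open(backup_key, "wb") as handle:
    handle.write(b"A" * 96)  # arbitrary placeholder content
with open(fresh_key, "wb") as handle:
    handle.write(b"B" * 96)  # simulates a newly generated key

recorded = file_sha256(backup_key)                 # taken at backup time
restored_ok = key_matches(backup_key, recorded)    # key restored from backup
regenerated_ok = key_matches(fresh_key, recorded)  # key recreated from scratch
```

In a real restore, the path would be wherever the installation keeps its key file, and a mismatch would be a reason to stop the restore rather than let the application start with a new key.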

So your insight that the key was to blame has enabled me to identify and fix my restore problem. All working now, and totally happy. Thank you!

-rb