Thoughts on the Equifax Data Breach

In 2017, a large amount of data was stolen from Equifax, a US-based company specializing in online creditworthiness checks. In September 2018, a report on the issue from the US Government Accountability Office (GAO) was finally released.

One good news article on the subject is from The Register. It is typical in that the emphasis is placed on two issues:

  • The failure of a system-monitoring tool to detect unusual behaviour within the company network, due to an expired SSL cert.
  • The failure of internal processes intended to detect software packages with known security holes (Struts in this case).

However, in my opinion, there are more significant issues.

Data Warehousing

The Bitwarden Password Manager

Managing passwords is one of the less pleasant parts of modern computing. I recently discovered the Bitwarden password manager, which has a very nice feature-set and good security design. The most interesting features are:

  • data encrypted on client; server never has access to passwords or the URLs they are associated with
  • passwords can be shared between accounts
  • all code is open-source

You can use the official hosted service (free for individual accounts, very reasonable pricing for teams), or host your own server for free.

I have written more about Bitwarden here.

Yubikey, FIDO2 and Backups

As I wrote in my recent look at the Yubikey, it seemed to me that the rather primitive approach to backups taken by the Yubikey so far was just not going to be sufficient for the FIDO2/WebAuthn world, where the number of distinct credentials is going to be far larger.

Now that the Yubikey-5 is out, with support for FIDO2/WebAuthn, I checked their documentation again - but couldn’t find any updated recommendations. As there is no apparent customer forum, I filed a support ticket asking about this. Unfortunately, the response was disappointing.

More databases - MemSQL and RocksDB

It seems like new databases (or at least database-like storage approaches) have been springing up like weeds in recent years. I recently encountered mentions of two that I wasn’t really familiar with, and so did a little reading. The following two articles provide a brief intro to:

  • MemSQL - a proprietary distributed relational-like database which keeps row-oriented tables completely in memory but supports column-oriented tables on disk
  • RocksDB - an open-source high-performance key-value store often used as backend storage for more complex projects (e.g. Kafka Streams, Samza, MySQL)

Yubikey Concepts, Configuration and Use

Logging in to internet sites (and private servers) with just a password is really not acceptable these days, at least for someone (like me) claiming to be interested in IT security. I therefore recently bought a Yubikey-4 authentication token.

Sadly, the documentation available from the manufacturer, and the internet in general, was not very helpful. I have therefore created some extensive notes on the Yubikey-4 which may be useful if you are also considering buying one (or have already done so).

UPDATE: The Yubikey-5 is available (since late September 2018). Note that the above article also covers Yubikey-5 features.

On a similar topic, I have very brief notes on the pass commandline password-manager for Linux and totp commandline tools for Linux. All feedback is very welcome!
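For readers who have not looked inside TOTP before, here is a minimal sketch of what such tools compute, following RFC 6238 (time-based codes) on top of RFC 4226 (HOTP). It is not taken from any particular commandline tool; the function names `hotp` and `totp` are my own, and it assumes the common defaults of HMAC-SHA1 and a 30-second time step. Real tools usually accept the secret base32-encoded, so it would need decoding (e.g. with `base64.b32decode`) first.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Counter-based one-time password (RFC 4226) using HMAC-SHA1."""
    # The counter is hashed as an 8-byte big-endian integer.
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select an offset.
    offset = mac[-1] & 0x0F
    # Take 4 bytes at that offset and clear the top bit.
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    # Keep the requested number of decimal digits, left-padded with zeros.
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, now: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over the time step index."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now) // step, digits)


if __name__ == "__main__":
    # Hypothetical shared secret, for illustration only.
    print(totp(b"12345678901234567890"))
```

The server and the token (or app) share the secret and both derive the code from the current time, which is why clock drift between the two can cause valid codes to be rejected.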

The Snowflake Data Warehouse

Storage Space Efficiency in Avro and HBase

I recently had a customer who suggested (for various reasons) storing large amounts of write-once data in HBase, using an (implicit) schema with long and complicated column names. I had immediate concerns about efficient use of disk storage with this approach (these were quite large amounts of data). Various sites warn about long column-names with HBase, but I could not find any actual statistics on it. A colleague and I therefore measured the efficiency of HBase with various column name lengths, and compared it to Avro.
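To see why column-name length matters at all, a back-of-envelope estimate helps. The sketch below is not the measurement described above; it only adds up the bytes of a single cell in the classic HBase KeyValue layout, which repeats the row key, column family and column qualifier for every stored value. It assumes no compression, no data block encoding and no cell tags, and it ignores HFile block and index overhead, so real on-disk numbers will differ. The function name and the example sizes are my own.

```python
# Fixed bytes per cell in the classic KeyValue layout:
# key length (4) + value length (4) + row length (2)
# + family length (1) + timestamp (8) + key type (1)
FIXED_CELL_BYTES = 4 + 4 + 2 + 1 + 8 + 1


def hbase_cell_bytes(row_len: int, family_len: int,
                     qualifier_len: int, value_len: int) -> int:
    """Rough uncompressed size in bytes of one HBase cell."""
    return FIXED_CELL_BYTES + row_len + family_len + qualifier_len + value_len


if __name__ == "__main__":
    # Illustrative sizes only: 16-byte row key, 1-byte family, 8-byte value.
    for qualifier_len in (2, 40):
        total = hbase_cell_bytes(16, 1, qualifier_len, 8)
        share = 8 / total
        print(f"qualifier={qualifier_len:2d} bytes -> cell={total} bytes, "
              f"payload share={share:.0%}")
```

With these example sizes, an 8-byte value costs 47 bytes with a 2-byte column name and 85 bytes with a 40-byte one, before any compression or encoding. An Avro container file, by contrast, writes the schema (including field names) once in the file header, and each record carries only the encoded values.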