As I wrote in my recent look at the Yubikey, it seemed to me that the rather primitive approach to backups taken by the Yubikey so far was just not going to be sufficient for the FIDO2/WebAuthn world, where the number of distinct credentials is going to be far larger.
Now that the Yubikey-5 is out, with support for FIDO2/WebAuthn, I checked their documentation again - but couldn’t find any updated recommendations. As there is no apparent customer forum, I filed a support ticket asking about this. Unfortunately, the response was disappointing.
Seems like new databases (or at least database-like storage approaches) have been springing up like weeds in recent years. I recently encountered mentions of two I wasn’t familiar with, and so did a little reading. The following two articles provide a brief intro to:
MemSQL - a proprietary distributed relational-like database which keeps row-oriented tables completely in memory but supports column-oriented tables on disk
RocksDB - an open-source high-performance key-value store, often used as backing storage for more complex projects (eg Kafka Streams, Samza, MySQL)
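For anyone who hasn’t used a key-value store before: RocksDB’s core API boils down to put/get/delete over byte strings. The sketch below illustrates that programming model only - it uses Python’s stdlib dbm module, not RocksDB itself, so it says nothing about RocksDB’s actual performance or on-disk format:

```python
import dbm, os, tempfile

# A key-value store maps byte keys to byte values - no schema, no query
# language. dbm gives the same basic put/get/delete shape as RocksDB.
path = os.path.join(tempfile.mkdtemp(), "kv-demo")
db = dbm.open(path, "c")        # "c" = create the database if missing

db[b"user:42"] = b"alice"       # put
assert db[b"user:42"] == b"alice"   # get
del db[b"user:42"]              # delete
assert b"user:42" not in db

db.close()
```

Projects like Kafka Streams use exactly this kind of interface to persist per-key state locally, layering their own semantics (windows, changelogs) on top.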
Logging in to internet sites (and private servers) with just a password is really not acceptable these days, at least for someone (like me) claiming to be interested in IT security. I therefore recently bought a Yubikey-4 authentication token.
Sadly, the documentation available from the manufacturer, and the internet in general, was not very helpful. I have therefore created some extensive notes on the Yubikey-4 which may be useful if you are also considering buying one (or have already done so).
UPDATE: The Yubikey-5 is available (since late September 2018). Note that the above article also covers Yubikey-5 features.
I recently had a customer who suggested (for various reasons) storing large amounts of write-once data in HBase, using an (implicit) schema with long and complicated column names. I had immediate concerns about efficient use of disk storage with this approach. Various sites warn about long column names in HBase, but I could not find any actual statistics. A colleague and I therefore measured HBase’s storage efficiency with various column-name lengths, and compared it to Avro.
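To make the concern concrete: HBase repeats the row key, column family and column qualifier in every stored cell, so qualifier (column-name) length multiplies directly into on-disk size. Here is a rough back-of-the-envelope sketch of the uncompressed KeyValue cell layout - the exact framing varies by HBase version and block encoding, and these numbers are illustrative only, not our measured results:

```python
def hbase_cell_bytes(row_len, family_len, qualifier_len, value_len):
    """Approximate uncompressed size of one HBase cell.

    Each cell's key repeats: row-length (2B) + row + family-length (1B)
    + family + qualifier + timestamp (8B) + key-type (1B), and the whole
    cell is prefixed by key-length (4B) and value-length (4B) fields.
    """
    key_len = 2 + row_len + 1 + family_len + qualifier_len + 8 + 1
    return 4 + 4 + key_len + value_len

# In Avro, by contrast, field names live once in the schema, so a record
# field costs roughly just its encoded value.
for qlen in (2, 20, 60):
    print(qlen, hbase_cell_bytes(row_len=16, family_len=1,
                                 qualifier_len=qlen, value_len=8))
```

With a 16-byte row key and an 8-byte value, going from a 2-character to a 60-character qualifier roughly doubles the per-cell footprint - which is why the usual advice is to keep families and qualifiers to one or two characters.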
I was part of a project that tried to do streaming processing with Spark a year or so ago. That didn’t go at all well; we had limited resources and time, and (IMO) Spark-streaming was simply not mature enough for production.
One of the nasty problems we had was that landing data into Hive created large numbers of small files; Walmart Labs solved that by using KairosDB as the target storage instead. KairosDB is a layer on Cassandra, ie HBase-like.
Another serious problem with Spark-streaming is session-detection; it is possible, but only with significant complexity. If I understand correctly, they solved that via the lambda architecture - rough session detection in streaming, and better detection in the batch pass.
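To illustrate why session detection is the hard part: in a batch pass it is just a gap-based grouping over sorted events, roughly as below (the 30-minute timeout is an assumed example, not a value from the project). A streaming job must instead hold this per-user state indefinitely and decide when a session can safely be emitted despite late or out-of-order events - that is where the complexity comes from.

```python
from itertools import groupby

GAP = 30 * 60  # assumed session timeout: 30 minutes of inactivity

def sessionize(events):
    """events: iterable of (user, epoch_seconds) pairs.

    Returns (user, [timestamps]) sessions, starting a new session
    whenever the gap between consecutive events exceeds GAP.
    Trivial in batch: sort everything first, then group.
    """
    sessions = []
    for user, group in groupby(sorted(events), key=lambda e: e[0]):
        times = [t for _, t in group]
        current = [times[0]]
        for t in times[1:]:
            if t - current[-1] > GAP:
                sessions.append((user, current))
                current = []
            current.append(t)
        sessions.append((user, current))
    return sessions
```

The global sort is exactly what a streaming engine cannot do; it has to approximate it with windowing, watermarks and timers, which is why the lambda-architecture compromise (rough sessions in streaming, exact ones in batch) is attractive.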
They still apparently had to fiddle with lots of Spark-streaming parameters though (batch duration, memory.fraction, locality.wait, executor/core ratios), and write custom monitoring code. And they were running on a dedicated Spark cluster, not on YARN. My conclusion from this is: yes, Spark-streaming can work for production use-cases, but it is hard.
After my experiences, and some confirmation from this article, a solution based on Flink, Kafka Streams, or maybe Apache Beam seems simpler to me. Those are all robust enough to process data fully in streaming mode, ie the kappa architecture.
Java bytecode (production) - includes Java, Scala, Groovy, Kotlin
LLVM bitcode, ie apps compiled from C, C++, Rust and other languages via the LLVM compiler (experimental)
Python, Ruby, and R (experimental)
Code in these languages can call into other code running within Graal, regardless of the language it was written in! Arranging for additional libraries (including the language standard libraries) to be available requires some steps, but is possible.
Not only does this allow running apps in a “standalone” environment, it means that any larger software package which embeds the Graal VM and allows user code to run in that VM can support any language that Graal supports. Examples include database servers which embed the VM for stored procedure logic.
With Oracle, it is important to look at the licensing terms-and-conditions. This does initially seem to be OK; the code is completely licensed under the GPL2-with-classpath-exception, like OpenJDK. Oracle does warn that there is “no support” for the open-source code (aka “community edition”) and recommends that a support licence be bought for the “enterprise edition” instead - but OpenJDK is reliable enough, and so the Graal “community edition” will hopefully be so too.