Building a Diskless Cluster

As my internship comes to an end, I would like to thank Intel for hosting the dissertation for my master's degree. During my stay here in Shannon, I found many innovative directions that could be explored, and I had the chance to extend and demonstrate my skills. Inspired by the field of computer security, with obvious influences from cryptography, I discovered a personal paradise at Intel's lab.

How it started

In a nutshell, I built a multi-node diskless cluster that is controlled from a main board and is able to schedule jobs in parallel. The interesting part is that the entire approach is diskless: the nodes share a single storage point, located at the main board. In practice, this means that all processing on the nodes is done on the fly over the network, and the operating system itself is also loaded over the network, using the Preboot eXecution Environment (PXE) and TFTP (tftpboot).

How it finished

The idea behind the cluster was to explore the processing power that we could obtain for general hashing operations. There are many open source tools for High Performance Computing and I was more than glad to explore them. Intel Shannon contributed significantly to the entire project, since it provided me with all the necessary hardware toys and a high-tech lab to set up my system. I would like to personally thank the following people: Benne de Weger, Padraic Garvey, Michael Hennessey, Pierre Laurent, Sergio Borghese and Joseph Gasparakis.

Document:    “A Diskless Cluster Architecture for Hashing Operations”
Presentation:     “A Diskless Cluster Architecture for Hashing Operations (Pres)”

The Undisputed Truth: SSL/TLS

Recently, I found myself working on the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) standards, mostly to optimize and accelerate the expensive cryptographic operations. In particular, I examined the OpenSSL toolkit and tried to figure out what is actually used in the real world (web browsers, web servers, open source SSL/TLS stacks). The landscape is quite fuzzy and sometimes insecure. The lack of a unified direction and the unavoidable need for backward compatibility with all versions of the standards create an environment with many possible configurations. Unfortunately, this leaves significant room for security flaws (e.g. man-in-the-middle attacks), so proper attention is needed whenever SSL/TLS is deployed. Ivan Ristic conducted excellent research for SSL Labs. This research gives a great overview of how SSL/TLS is used in real life and presents an in-depth analysis of all the cryptographic features. The results of this investigation will be presented at Black Hat USA 2010.

The presentation can be found here.

A Scalable Cluster for Cryptography

Cryptographic computations are usually a heavy bottleneck for almost any general-purpose processor. To overcome this drawback, new dedicated processors have been developed. By aiming mainly to accelerate the cryptographic operations and off-load the cryptographic workload from the main functionality of a processor, these processors achieve much better performance at low power consumption. By embedding dedicated acceleration mechanisms into a common processor chip, hardware once again proves to be incomparably faster than any software approach. At the moment, there are a number of motherboards on the market that use processors and co-processors with integrated cryptographic acceleration. The idea is to build a small diskless cluster based on this kind of motherboard, in order to scale up the existing acceleration even further. The concept of a diskless cluster has existed for several years now, and there are many open source tools that could be used to build such a cluster. In our case, DRBL, TORQUE and MAUI were used for building a scalable crypto-cluster.

Of course, several important questions remain to be answered:

  1. How much will the performance increase?
  2. Is it worth implementing such an architecture for cryptography?
  3. At which point will we be blocked by memory and Ethernet limitations?
  4. ….

Even though this approach may hide limitations, there are applications for which this infrastructure could be useful. The need for better performance at low cost is always present, and especially for security areas such as forensics, cryptanalysis and popular encryption protocols, this kind of architecture might be promising.
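A first step towards answering the performance question is to measure a single-core baseline. The sketch below is only an illustration, not the benchmark used in the project: the function names `sha1_throughput` and `ideal_cluster_rate`, the 64-byte message size and the short measurement window are all assumptions. It times software SHA-1 with Python's standard `hashlib` and computes an upper bound that assumes perfect linear scaling, which memory and Ethernet limits would reduce in practice.

```python
import hashlib
import time


def sha1_throughput(message_size=64, duration=0.2):
    """Return SHA-1 digests per second on one core (software only)."""
    message = bytes(message_size)  # zero-filled dummy input (assumption)
    count = 0
    start = time.perf_counter()
    deadline = start + duration
    while time.perf_counter() < deadline:
        # Hash in batches so the clock is not read on every digest.
        for _ in range(1000):
            hashlib.sha1(message).digest()
        count += 1000
    elapsed = time.perf_counter() - start
    return count / elapsed


def ideal_cluster_rate(per_node_rate, nodes):
    """Upper bound on cluster throughput, assuming perfect linear scaling."""
    return per_node_rate * nodes


if __name__ == "__main__":
    rate = sha1_throughput()
    print(f"single core: {rate:,.0f} SHA-1 digests/s")
    print(f"ideal 8-node bound: {ideal_cluster_rate(rate, 8):,.0f} digests/s")
```

Comparing the measured cluster rate against this ideal bound gives a rough idea of how much is lost to scheduling, memory and network overhead.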

Proof-of-concept: Download

WPA-PSK – Cracking Approaches

Wi-Fi Protected Access (WPA) is one of the most popular security protocols for wireless data encryption. As is already known, WPA-PSK, the personal mode of WPA, has been shown to be vulnerable to dictionary attacks. The initial 4-way handshake between the AP and the client is transmitted unencrypted, which gives the attacker all the information needed to start an off-line attack. So far, several groups and companies have investigated this kind of attack [ 1, 2, 3, 4 ] and tried to optimize the entire process. The use of hardware with additional computational power, such as low-level Field Programmable Gate Arrays (FPGAs) or the parallel processing that Graphics Processing Units (GPUs) offer, revealed interesting approaches that could be followed for pre-computation and searching. The idea of pre-computed lookup tables may be an old one, but it remains a great technique. More precisely, in the case of WPA-PSK such tables spare us computational time and power, since the 4096 HMAC-SHA1 iterations that are needed for each candidate pass-phrase are performed beforehand. Of course, the level of security that WPA-PSK offers is still relatively high for today's hardware capabilities. The population of all possible keys is salted with the SSID of the wireless network, which makes the already difficult task of applying dictionary attacks even harder. If someone wanted to try every single pass-phrase, a simple glance reveals that the computations needed cannot be performed in a reasonable time frame. There are 95 printable ASCII characters and the pass-phrase can be 8 to 63 characters long, and that covers only the case of one SSID. That leaves us with something on the order of 95^63 possible keys for a single SSID. This topic could be explored further by using additional cryptographic acceleration.
What if we used dedicated hardware, such as the Intel EP80579 Processor (aka Tolapai), which is meant to accelerate cryptographic computations at a significantly lower power consumption? This task could involve a number of interesting issues for further experiments, such as:

  1. Time-Memory trade-offs.
  2. Optimizations of Data Transferring.
  3. Generation of new Lookup Tables on the fly.
  4. Creation of sophisticated pass-phrases and SSID dictionaries.
  5. Building a small Diskless Cluster
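The key derivation and the key-space estimate discussed above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the helper names (`derive_pmk`, `build_lookup_table`, `keyspace_size`), the SSID "ExampleNet" and the candidate pass-phrases are hypothetical. WPA-PSK derives its 256-bit key with PBKDF2 over HMAC-SHA1, using the SSID as salt and 4096 iterations, which is why a pre-computed table is only valid for one SSID.

```python
import hashlib
import math

PBKDF2_ITERATIONS = 4096  # iteration count used by WPA-PSK
PMK_LENGTH = 32           # 256-bit Pairwise Master Key


def derive_pmk(passphrase, ssid):
    """Derive the WPA-PSK master key: PBKDF2-HMAC-SHA1 salted with the SSID."""
    return hashlib.pbkdf2_hmac(
        "sha1",
        passphrase.encode("utf-8"),
        ssid.encode("utf-8"),
        PBKDF2_ITERATIONS,
        PMK_LENGTH,
    )


def build_lookup_table(ssid, candidates):
    """Pre-compute key -> pass-phrase for one SSID (useless for any other SSID)."""
    return {derive_pmk(p, ssid): p for p in candidates}


def keyspace_size(alphabet=95, min_len=8, max_len=63):
    """Count every pass-phrase of min_len..max_len printable ASCII characters."""
    return sum(alphabet ** n for n in range(min_len, max_len + 1))


# Hypothetical SSID and dictionary, for illustration only.
table = build_lookup_table("ExampleNet", ["password", "letmein123", "correct horse"])
total = keyspace_size()

if __name__ == "__main__":
    print(f"table entries: {len(table)}")
    print(f"key space per SSID: about 2^{math.log2(total):.1f} pass-phrases")
```

The sum is dominated by its last term, 95^63, which is roughly 2^414. That is why exhaustive search is out of reach and practical attacks rely on dictionaries and pre-computation.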

Additional details: Download
