
Saturday, November 6, 2010

Passing data from a jQuery script to a controller method in Rails using Ajax

First of all, you have to write the script that gets the data and stores it in variables. This code should go either in public/javascripts/application.js or in a new file in the same directory, which is then included in the HTML like so:

<%= javascript_include_tag 'name_of_file' %>
or

<%= javascript_include_tag :defaults %>

to include the application.js file.

Keep in mind, though, that these includes must come after jquery.js, and that jQuery can affect other scripts that use Prototype. Therefore, if you are using any code that needs Prototype (or simply as good practice), the first line of your jQuery script should be jQuery.noConflict();.

This being said, the way you pass your data to the controller is through an Ajax post, which is something like this:

$.post(path,{ "string1": variable1,"string2": array1},null,"script");

Check the jQuery documentation for more details.

The path variable should be in the form controller/action, so that the message is sent to the desired action.

The next thing to do is to create the action in the controller that will receive the message. This action can respond to different kinds of requests in different ways: if it is a regular HTTP request it behaves one way, and if it is our Ajax request it behaves another.

This separation of behaviours is achieved with respond_to like this:

respond_to do |format|
format.html { redirect_to :action=>"list" }
format.js
end


For an HTML request, it redirects to the action named list of the same controller and renders it. What happens when it receives an Ajax request may be less obvious: the code that will be run is written in a file called action_name.js.erb that has to be in the views/controller directory.

This file can be something as simple as this:

$("#justForTest").html('<%= params[:actions] %>');


One last thing for this to work: the post request in our client-side script must ask for a JavaScript response. This can be done by appending .js to each path or by adding this setup code to the top of your script file:


jQuery.ajaxSetup({
    'beforeSend': function(xhr) { xhr.setRequestHeader('Accept', 'text/javascript'); }
});


This should make your request work as intended. Have fun with jQuery and Ruby on Rails, which are very powerful tools for any web developer.

Saturday, October 16, 2010

Cassandra, the Data Model

UPDATE: Sorry for the images being down for so long. I've finally had the time to re-upload them, and while I was at it, I rewrote the whole post.

For my master's thesis I'm going to be working with Cassandra, an open source distributed database management system, and therefore I'll probably write a lot about it throughout the next year. To get started let's take a look at one of the biggest differences from this kind of DBMS to the classical relational systems, the data model.

Cassandra was created at Facebook, first arrived at Apache as an incubation project in January 2009, and is based on Dynamo and BigTable. The system can be defined as an open source, distributed, decentralized, elastically scalable, highly available, fault-tolerant, tuneably consistent, column-oriented database.

Cassandra is distributed, which means that it is capable of running on multiple machines while appearing to users as a single system. More than that, Cassandra is built and optimized to run on more than one machine, so much so that you cannot take full advantage of all its features without doing so. In Cassandra all nodes are identical: there is no node responsible for special organizing operations, as in BigTable or HBase. Instead, Cassandra features a peer-to-peer protocol and uses gossip to maintain and keep in sync the list of nodes that are alive or dead.

Being decentralized means that there is no single point of failure, because all the servers are symmetrical. The main advantages of decentralization are that it is easier to use than master/slave and it helps to avoid suspension in service, thus supporting high availability.

Scalability is the ability to have little degradation in performance when facing a greater number of requests. It can be of two types:

  • Vertical - Adding hardware capacity and/or memory
  • Horizontal - Adding more machines with all or some of the data so that all of it is replicated at least in two machines. The software must keep all the machines in sync. 

Elastic scalability refers to the capability of a cluster to seamlessly accept new nodes, or remove existing ones, without any need to change queries, rebalance data manually or restart the system.

Cassandra is highly available in the sense that if a node fails it can be replaced with no downtime and the data can be replicated through data centers to prevent that same downtime in the case of one of them experiencing a catastrophe, such as an earthquake or flood. 

Eric Brewer's CAP theorem states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:
  • Consistency
  • Availability
  • Partition Tolerance
The next figure provides a visual explanation of the theorem, with a focus on the two guarantees given by Cassandra.



Consistency essentially means that a read always returns the most recently written value, which is guaranteed to happen when the state of a write is consistent among all nodes that have that data (the updates have a global order). Most NoSQL implementations, including Cassandra, focus on availability and partition tolerance, relaxing the consistency guarantee and providing eventual consistency instead.

Eventual consistency is seen by many as impracticable for sensitive data, data that cannot be lost. The reality is not so black and white: the binary opposition between consistent and not-consistent is not truly reflected in practice; there are instead degrees of consistency, such as serializability and causal consistency. In the particular case of Cassandra, consistency can be considered tuneable in the sense that the number of replicas that will block on an update can be configured per operation, by setting the consistency level in combination with the replication factor.


Having said that, let's take a closer look at Cassandra's data model.

Usually, NoSQL implementations are key-value stores with nearly no structure in their data model beyond what can be perceived as an associative array. Cassandra, on the other hand, has a rather complex data model. It is frequently referred to as column-oriented, and this is not wrong in the sense that it is not relational; but data in Cassandra is actually stored in rows indexed by a unique key, and each row does not need to have the same columns (in number or type) as the other rows in its column family.

The basic building block of Cassandra is the column: a tuple with three elements, a name, a value and a timestamp. The name of a column is commonly a string but, unlike its relational counterpart, it can also be a long integer, a UUID or any other kind of byte array.



Sets of columns are organized in rows that are referenced by a unique key, the row key, as demonstrated in the following figure. A row can have any number of columns, whichever are relevant; there is no schema binding it to a predefined structure. Rows have a very important feature: every operation under a single row key is atomic per replica, regardless of the number of columns affected. This is the only concurrency control mechanism provided by Cassandra.




The next level of complexity is the column family, which "glues" this whole system together. It is a structure that has a name and can hold a practically infinite (limited only by physical storage space) number of rows, as a map of keys to rows, as shown here:



Cassandra also provides another dimension to columns, the SuperColumns. These are also tuples, but with only two elements, a name and a value; the value has the particularity of being a map of keys to columns (each key has to be the same as the corresponding column's name).




There is a variation of the ColumnFamily, the SuperColumnFamily. The only difference is that where a ColumnFamily holds collections of name/value pairs, a SuperColumnFamily holds subcolumns (named groups of columns).

Multiple column families can coexist in an outer container called a keyspace. The system allows for multiple keyspaces, but most deployments have only one.

This is pretty much it. Now, it all depends on the way you use these constructs.

Be aware of one thing when using Cassandra: the timestamp values can be anything, but they must be consistent throughout the cluster, since this value is what allows Cassandra to decide which updates are new and which are outdated (an update can be an insert, a delete or an actual update of a record).

Sunday, October 3, 2010

Bash 101: Variables and Conditions

First off, I would like to make a little note on the use of quotes. In the shell, words are separated by whitespace; if you want whitespace characters to belong to a variable's value, you'll have to quote them.

There are 3 types of quotes, double, single and the backslash, that have the following results:
  • Double - Accepts whitespaces and expands other variables
  • Single -   Accepts whitespaces and doesn't expand other variables
  • Backslash - Escapes the following character (e.g. gives you a literal $)
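The three behaviours are easy to check in a terminal; a quick sketch:

```shell
name="world"
echo "Hello $name"    # double quotes expand the variable: Hello world
echo 'Hello $name'    # single quotes do not: Hello $name
echo "Hello \$name"   # the backslash escapes the $: Hello $name
```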
In my previous post I talked about normal variables; here I'll cover the other two types, environment and parameter variables.

Environment Variables


At the start of any shell script some variables are initialized with values defined in the environment (you can change them with export or in the .bash_profile file). For convenience these variables are all uppercase, as opposed to user-defined variables, which should be lowercase. Here's a list of the main ones with a brief description.
  • $HOME - Home directory of the current user
  • $PATH - The list of directories to search commands
  • $PS1 - The definition of the command prompt (eg: \h:\W \u\$)
  • $PS2 - The secondary prompt, usually >
  • $IFS - Input Field Separator. List of characters used to separate words when reading input
  • $0 - The name of the shell script
  • $# - The number of parameters passed
  • $$ - The PID of the shell script (normally used to create temporary files)
If you want to check all your environment variables just type printenv in the shell.
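As a small sketch, a script that prints a few of these variables (the file name vars.sh is just an example):

```shell
#!/bin/bash
# Hypothetical demo: save as vars.sh, make it executable
# and run as ./vars.sh a b c

echo "Script name: $0"
echo "Number of parameters: $#"
echo "PID of this script: $$"

exit 0
```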

IBM has a pretty good hands on post to understand the setting and unsetting of environment variables, here, and refer to this other post for a more extensive overview of variables.

Parameter Variables


If your script is invoked with parameters, some additional variables are defined (you can check whether any parameters were passed by testing whether $# is greater than 0). These are the parameter variables:
  • $1, $2, ... - The parameters given to the script, in order
  • $* - A list of all parameters in a single variable, separated by the first character in IFS
  • $@ - A variation of $* that always uses a space to separate the parameters
I've written a small script that should make the difference between $* and $@ clear:

#!/bin/bash

export IFS=*
echo "IFS = $IFS"
echo "With IFS - $*"
echo "Without IFS - $@"

exit 0

Run it as ./script param1 param2 param3 ...

Note: The export command sets the variable for the script and all its subordinates.

Conditions


One of the fundamental features of any programming language is the ability to test conditions. A shell script can test the exit code of any command it invokes, including scripts written by you. That is why it is very important to end all your scripts with an exit command and a value (0 if all went well).

The commands used to test conditions are two synonyms, test and [. Obviously, if you use [ it must have a matching ], and because it makes your code much easier to read, this is the most used construct.

In a shell script, a test should look something like

if [ -f file ]
then
    echo "File exists"
else
    echo "File does not exist"
fi

The exit code of either of these commands is what determines the truth of the statement (again, 0 for true, nonzero for false). A little thing to remember is that [ is a command, so you must put spaces between it and the condition, or else it won't work.

There are 3 types of conditions that can be used with these commands:

String Comparison

  • string1 = string2 - True if strings are equal
  • string1 != string2 - True if strings are not equal
  • -n string - True if string is not null
  • -z string - True if string is null (empty)

Arithmetic Comparison

  • exp1 -eq exp2 - True if the expressions are equal
  • exp1 -ne exp2 - True if the expressions are not equal
  • exp1 -gt exp2 - True if exp1 is greater than exp2
  • exp1 -ge exp2 - True if exp1 is greater than or equal to exp2
  • exp1 -lt exp2 - True if exp1 is less than exp2
  • exp1 -le exp2 - True if exp1 is less than or equal to exp2
  • ! exp - True if the expression is false and vice versa

File Conditional

  • -d file - True if the file is a directory
  • -e file - True if the file exists (-f is usually used instead)
  • -f file - True if the file is a regular file
  • -g file - True if set-group-id is set on file
  • -u file - True if the set-user-id is set on file
  • -s file - True if the file has nonzero size
  • -r file - True if the file is readable
  • -w file - True if the file is writable
  • -x file - True if the file is executable
These are the most commonly used options; for a complete list, type help test in your shell.
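To tie the three kinds together, here's a minimal sketch combining string, arithmetic and file tests; the file name is made up:

```shell
#!/bin/bash
# Sketch combining file, arithmetic and string tests.
# notes.txt is only an example file name.

file="notes.txt"
echo "first line" > "$file"

if [ -f "$file" ] && [ -r "$file" ]
then
    lines=$(wc -l < "$file")
    if [ "$lines" -gt 0 ]
    then
        echo "$file is readable and has $lines line(s)"
    fi
else
    echo "$file is missing or unreadable"
fi

rm "$file"
exit 0
```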

Tuesday, September 14, 2010

Bash 101: The Shell as a Programming Language

There are two different ways of writing shell programs: interactively (type a sequence of commands in the shell and let it execute them), or by storing the commands in a file that can be invoked as a program. We'll focus on the latter, keeping in mind that the two are much alike.

The first little "trick" you should be aware of is wildcard expansion (or globbing); here are some of the most used patterns:
  • * - Matches any string of characters
  • ? - Matches any character
  • [...] - Matches the defined characters
  • [^...] - Negates the previous one (everything but what matches)
  • {...} - Expands to each of the listed strings (strictly speaking this is brace expansion, not globbing)
Here's an example of the last one:
ls my_{file,doc}s

Which will list the files my_files and my_docs.

There is another thing you should know before starting to write bash scripts, and that's the $(...) operation. It represents the output of the program you invoke inside the parentheses. Let's see it with an example:
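For instance, $(...) can capture the output of a single command or of a whole pipeline (the commands here are arbitrary):

```shell
today=$(date)        # the variable gets the whole output of date
count=$(ls | wc -l)  # the output of a pipeline works too
echo "It is $today and there are $count entries here"
```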

Note:
  1. If the string contains spaces it must be delimited by quote marks;
  2. There can't be any spaces before or after the equal sign.
You can also assign user input to a variable, using read. It waits for the user to write something and press Enter. At that moment the variable has been assigned what the user wrote:

$ read variable
abcde (Enter)
$ echo $variable
abcde


At this time you're ready for your first program. In order to do that just open your favorite text editor and write the commands.

Comments in bash are introduced by the character #, and continue to the end of the line. The only exception is the first line, which should start with #! followed by the path to the program used to run the file (usually /bin/bash). Also, by convention, the last line of the file should be exit 0 (for now, just know that 0 represents success in shell programming).

The actual script could be something like this:

#!/bin/bash

echo "Name of file to cat:"
read file
cat $file

greeting="\nHello World"
echo -e $greeting

exit 0


This script waits for the name of a file, "cats" it and then writes Hello World to the output (without worrying about errors such as the file not existing).

The final step is to make the file executable, with chmod:
$ chmod +x filename
and run it!
$ ./filename

Monday, August 30, 2010

Accessor Methods

Following my last post, I'm going to talk a little bit about the accessor methods in Objective-C.

Accessor methods, as you may know, are those methods used to get or set the value of an object's variable without actually "seeing" it. As you might imagine, these methods are used constantly in most, if not all, object-oriented programming languages. Objective-C 2.0 provides a very elegant way to declare them, saving a lot of lines of code.

It uses the keywords @property and @synthesize. In general, a property declaration looks like this:

@property (attributes) type name;

The attributes can include readwrite (default) or readonly (doesn't get a setter method). To describe how the setter works it can also include assign, retain or copy.
  • assign (default) - simple assignment, does not retain the new value. If it's an object type and you're not using the GC, don't use this.
  • retain - releases the old value and retains the new one. With GC it is the same as assign.
  • copy - makes a copy of the new value and assigns the variable to the copy. (often used for strings).
So, a property declaration in the header file should look something like this:


@interface ClassName : NSObject {
    int foo;
}
@property (readwrite,assign) int foo;
@end


And then, in the implementation file you just need to write @synthesize foo; and your accessor methods are defined.

NOTE: There are two ways of using these methods. The normal one is by sending a
message to the object:

[object setValue:newValue];

and there is the other way, that's called the dot syntax and is a lot like what you do in Java:

object.value = newValue;

Although you can use the dot syntax, I do not recommend it. Read this post for the reasons why.

Sunday, August 29, 2010

An Objective-C/Cocoa character counter program

I've been learning Objective-C and Cocoa since yesterday (yes, I'm a newbie, so don't judge... :P), but I'm really loving it.

For those who have no idea what this is, Objective-C is an object-oriented extension of C, and Cocoa is, according to Wikipedia, one of Apple Inc.'s native object-oriented APIs for the Mac OS X operating system. Together they provide a great way to create programs for Mac OS X.

I'm not going to try to explain Objective-C or Cocoa in detail, there are really good books for that (the idea for this example was taken from Cocoa® Programming for Mac® OS X (3rd Edition)), nor will I explain how to use Xcode as an IDE. I'll just write the code and a little explanation of what it does.

So, let's get to what really matters, the code! :D

Every class in Obj-C is composed of two files: a header file (.h) and a source file (.m). The first holds the instance variable and method declarations, the second the actual code (remember that Objective-C is an extension of C).

Our header file will be something like this:

#import <Cocoa/Cocoa.h>

@interface Counter : NSObject {
    IBOutlet NSTextField *line;
    IBOutlet NSTextField *output;
}

-(IBAction)count:(id)sender;
@end


The first line imports the declaration of NSObject which Counter inherits from, similar to Java's Object. All the Objective-C keywords start with @, to minimize conflicts with C code, as @interface.

Both instance variables are of type pointer to NSTextField, which can be either a text field or a label. IBOutlet is a macro that evaluates to nothing; it's a hint for the Interface Builder.

Finally, there's the method declaration. The method has the name count, returns IBAction (the same as void, also a hint for the IB) and has one argument named sender and of type id (a pointer to any type of object).

The .m file has to implement the declared methods (like Java's public methods), but it can also define new methods (akin to Java's private methods) or override inherited ones. Ours will look like this:


#import "Counter.h"

@implementation Counter

-(void)awakeFromNib
{
    [output setStringValue:@"???"];
}
   
-(IBAction)count:(id)sender
{
    NSString *theLine = [line stringValue];
    int noChars = [theLine length];
    NSLog(@"Counted %d chars",noChars);
    [output setStringValue:[NSString stringWithFormat:@"'%@' has %d characters", theLine, noChars]];
}   

@end


The nib file is a collection of objects that have been archived. When the program is launched, the objects are brought back to life before the application handles any events from the user. After being brought to life but before any events are handled, all objects are automatically sent the message awakeFromNib. This means that the label will have the string value "???" when the program starts, it should look like the following picture:


In the Interface Builder you should connect the text field and label to the corresponding outlets and the action of the button to the count method. For this you will have to add an object of the class Counter to the Doc Window.


Now, you've made sure the count method will be called when the button is pressed. So, what does the count method do?

It's actually a pretty simple method: it gets the string the user has typed by sending the stringValue message to the text field (this method is an accessor, the equivalent of Java's get). It then sends the length message to the string to get the number of characters. We now have all the information we need.

After that it writes the character number to the console, and finally it sets the label's string.

The final result should be something like this:



In this code I assume that you have a version of Mac OS above 10.4 and that you have the garbage collector on. I'll cover the usage of retain counts, the alternative solution to the GC in the future.

Saturday, August 28, 2010

How to create a man page

A very useful thing, as everyone who has ever used a shell knows, is the man page. In this post I'll explain how to create a man page for your programs, from scratch.

First you have to create a file and write the actual text of the man page, using some special macros; the most important ones can be found here. This is the main step in creating the man page, and where you should spend most of your time.
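As a sketch, here is a minimal man page source for a hypothetical command mycmd, written straight to its file with a HERE document (the macros shown, .TH, .SH and .B, are among the most common ones):

```shell
# Minimal man page skeleton for a made-up command "mycmd".
cat > mycmd.1 <<'EOF'
.TH MYCMD 1 "November 2010" "1.0" "User Commands"
.SH NAME
mycmd \- short description of mycmd
.SH SYNOPSIS
.B mycmd
[\fIOPTION\fR] \fIFILE\fR
.SH DESCRIPTION
A longer description of what mycmd does.
EOF
```

You can then preview the result with the nroff command shown next.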

So that you can check how your man page will look while you're writing it and/or after you've written it, there is a very useful program, nroff, used the following way:

nroff -e -mandoc yourFile | less -s

Once you feel your page is as you want it, you'll have to rename your file to XXX.1 (or any other number from 1 to 8, according to this standard) and then compress it using gzip or bzip2. Your man page is now ready!

If you try the command man yourFile, you'll notice it doesn't work. That's because your man page isn't in the MANPATH yet. To fix this, first move your file to a directory named manX, where X is the section number you chose from the 8 possible (it is normal to have a directory called man with all the manX directories as its children). So, let's do that:

mkdir path/to/program/man
mkdir path/to/program/man/man1
mv yourFile.1.gz path/to/program/man/man1


The last thing to be done is actually adding your man folder to the MANPATH, if it isn't already there. First, check whether you need to add it:

echo $MANPATH (Just to check if the folder is in the current MANPATH)
export MANPATH=path/to/program/man:$MANPATH


Instead of doing this, if you have root privileges, you can create a directory in one of the directories already pointed by the MANPATH, like /usr/share/man.

And that's it! Go try it out!

Thursday, August 26, 2010

Writing to NTFS with Mac OS

I recently had the need to write to an NTFS-formatted disk and found out that I only had read permissions. I couldn't live with that, so after some Internet searching I found out how to fix it. This "hack" is a really easy one, but really useful nonetheless.

Native file handling capabilities of Mac OS can be extended using Google's MacFUSE, as can be read at the project's home page. At the same page you can download the .dmg and install it.

You're halfway through. Now you just need the NTFS part, which can be downloaded here; install it, restart, and you're good to go!

The Tuxera product needs a license after 15 days; if you find one that works just as well and is free, please let me know.

*UPDATE*: See the comments for a free product.

Sunday, August 15, 2010

Bash 101 - I/O Redirection and Pipes

Bash scripting is a subject I'm really fond of, but before starting to write scripts one needs to understand how the bash commands work and how they can be connected. That's the purpose of this post, to provide some background before getting into the actual writing of scripts.

I/O Redirection


Many commands such as ls print their output on the display. We can change this using operators under the I/O redirection category, to redirect the output of those commands to files, devices and also to the input of other commands.

There are 3 file descriptors: stdin (standard input), stdout (standard output) and stderr (standard error). Basically, you can use redirection in 7 different ways:
  1. stdout to file
  2. stderr to file
  3. stdout to stderr
  4. stderr to stdout
  5. stderr and stdout to file
  6. stderr and stdout to stdout
  7. stderr and stdout to stderr
These file descriptors are represented by the numbers 0 (stdin), 1 (stdout) and 2 (stderr).
To make this clearer, let's give examples for each of the redirections.
  1. ls -l > output.txt
  2. grep something * 2> errors.txt
  3. ls -l 1>&2
  4. grep something * 2>&1
  5. rm -f $(find / -name core) &> /dev/null (redirects both stdout and stderr to /dev/null, which discards everything written to it)

Other cases:

> filename - this truncates the file to zero length and, if the file does not exist, creates it (the same effect as touch)

>> filename - this operator can be used as normal redirection to a file, but instead of writing from the start it appends to the end of that file

command < filename - accepts input from a file

3<> filename - open file for reading and writing and assigns file descriptor 3 to it

n <&- - closes file descriptor n (this is useful with pipes, because child processes inherit open file descriptors)
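The last two operators can be sketched together; scratch.txt is a made-up file name:

```shell
# Open scratch.txt for reading and writing on descriptor 3,
# read the existing line, append another, then close the descriptor.
echo "first line" > scratch.txt

exec 3<> scratch.txt
read line <&3              # reads "first line" through fd 3
echo "second line" >&3     # writes after the part already read
exec 3<&-                  # close fd 3

cat scratch.txt
rm scratch.txt
```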


Another kind of redirection is the HERE document, which I've mentioned before. With it you can redirect a dynamically built file to the input of a program that expects a normal file. An example of this can be seen in my earlier post.

Pipes


Very simply put, pipes let you use the output of one program as the input of another. To do this, a process is created for each command. That's why it is important to be careful about which file descriptors are open when each of those processes is created (before the output is sent as input).

The basic functionality of pipes can be understood with a very simple example that has the same effect as ls -l *.sh:

ls -l  |  grep "\.sh$"

Here, the output of ls -l is sent to the grep program, which, in turn, prints the lines that match the regex "\.sh$".
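Pipes chain into longer pipelines just as easily. A classic sketch counts how often each line occurs:

```shell
# Each stage feeds the next: sort groups equal lines together,
# uniq -c counts each group, sort -rn puts the most frequent first.
printf 'apple\nbanana\napple\n' | sort | uniq -c | sort -rn
```

Here "apple" ends up on top with a count of 2.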

Saturday, July 24, 2010

Change the password on someone else's Mac OS

Every modern OS has a mode that grants you superuser permissions, without the need for a password. This mode is supposed to be used to run tasks that require exclusive access to shared resources or if you lost your password, for example.

The only reason this mode exists is because these OSs are seen as multiuser systems and the environment in which the computer sits is assumed to be a friendly one. Nevertheless this mode, the single user mode, can be used for some very nasty things if you know your Bash (or whatever shell is present) basics. Here we're going to change the root password.

First, you have to turn the computer off, because this mode is entered at boot time. While the system is booting keep cmd+S pressed, until a console appears. When it does, you're good to go. Just mount the file system with mount -wu /.

Now for the final step, type passwd and return. You should be asked to enter the new root password twice. Once that's done, you have successfully gained root rights to the OS. Reboot it and try it out!

If you want to be "protected" from this, you can install SecureIt, which asks for a password before showing the console at boot time.

Notes:
  1. You cannot enter single user or verbose mode if the computer owner or administrator has enabled Open Firmware Password Protection.
  2. When in single-user mode, the keyboard layout is US English.
  3. In Linux systems you have to change the boot loader configuration, which can easily be done
  4. In Windows there's the Safe Mode

Friday, July 23, 2010

Java static blocks

During the course of the year I found out about a very interesting thing in Java, the static blocks. Java has a special word for variables that are initialized once per class, instead of once per instance of the class. That word is static.

static int class_variable = 0;

This way, all the instances of the class share this variable (this also means you have to be careful about concurrency control). Now, what if you want to do this with a Collection, instead of a primitive type?

That's where static blocks come in handy. Imagine we have this:

static HashMap<Integer,String> sharedHash = new HashMap<Integer,String>();

What we want now is to populate the HashMap, either from hardcoded values or a function that does the job. The solution is simple, a static block!

static
{
    sharedHash.put(1, "one");
    populateHash(sharedHash); // must be a static method; "this" is not available in a static block
}

This populates the HashMap only once, when the class is first loaded and initialized (before any instance is even created), saving a lot of unnecessary processing and memory usage.

Using HERE documents to do repetitive programming

In recent work I had to write Factories for a lot of classes in different packages. These factories all had the same general format, and because it would be really boring to code them by hand, I wrote a little script that takes advantage of HERE documents. There were two types of files, the interfaces and the actual implementations, the latter recognized by the Impl suffix.

The first thing is to get the paths of the interfaces; we do this with an inverted grep:

for path in $(find . -type f | grep -v Impl)

Then we get the path to directory, excluding the name of the actual file:

factoryPath=${path%/*.java}

After that we get the file name:

javaFile=${path##*/}

and the class name (the name of the file without the .java)

className=${javaFile%.java}
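These expansions are easy to sanity-check in a terminal; the path below is made up for illustration:

```shell
path=./tsl/model/Scheme.java     # hypothetical interface path

factoryPath=${path%/*.java}      # strip shortest "/*.java" suffix -> ./tsl/model
javaFile=${path##*/}             # strip longest "*/" prefix      -> Scheme.java
className=${javaFile%.java}      # strip ".java" suffix           -> Scheme

echo "$factoryPath $javaFile $className"
```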

We now have all the basic blocks needed to build our script. From now on it's just a matter of how to combine them. From the class name we get the factory file name:

factoryFile=${className}Factory.java

The package name for the interfaces is the factoryPath variable, but changing the / for .

packageName=$(echo ${factoryPath##./} | tr '/' '.')

and the one for the actual implementation is:

packageImpl=${packageName}.impl

The next step is to create the file we are going to write into:

touch ${factoryPath}/impl/${factoryFile}

And finally let's write the file using the HERE document:

cat >${factoryPath}/impl/${factoryFile} <<HERE
package ${packageImpl};

import ${packageName}.${className};
import tsl.jaxb.${className}XMLImpl;
import tsl.jaxb.${className}DERImpl;

public class ${className}Factory {
    public static ${className} getInstance(String instance)
    {
        if(instance.equals("XML"))
            return new ${className}XMLImpl();
        if(instance.equals("DER"))
            return new ${className}DERImpl();

        return null;
    }
}
HERE

DER Signing using Java and Bouncy Castle

Following my last post about DER coding in Java, I will now explain how to sign an object, using java.security.PrivateKey, java.security.cert.X509Certificate and the TSL standard as pointed in that post.

The TSL signature is an enveloped signature: the CMS SignedData structure wraps the content being signed.


First let's assume that we have some previously initialized variables:
  • tslDER - Unsigned TSL object (instance of DEREncodable)
  • privateKey - Private key being used to sign
  • cert - PKI Certificate of the signer (this one is optional, but recommended)
Now let's define the algorithms used for signing and message digesting:

AlgorithmIdentifier sigAlg = new AlgorithmIdentifier(ObjectIdentifiers.getAlgorithmOID(sigAlgorithm));
ASN1Set sigAlgs = new DERSet(sigAlg);
AlgorithmIdentifier digestAlg = new AlgorithmIdentifier(ObjectIdentifiers.getAlgorithmOID(digestAlgorithm));
ASN1Set digestAlgs = new DERSet(digestAlg);


getAlgorithmOID is a project method that, given a String, returns the matching Object Identifier or throws an Exception.
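Since getAlgorithmOID is project code and not part of Bouncy Castle, a minimal sketch of it could be a plain lookup table (the class name is made up; the OID strings below are the standard dotted-decimal identifiers for these algorithms):

```java
import java.util.HashMap;
import java.util.Map;

public class ObjectIdentifiersSketch {
    private static final Map<String, String> OIDS = new HashMap<String, String>();
    static {
        OIDS.put("SHA1withRSA", "1.2.840.113549.1.1.5");
        OIDS.put("SHA256withRSA", "1.2.840.113549.1.1.11");
        OIDS.put("SHA-1", "1.3.14.3.2.26");
        OIDS.put("SHA-256", "2.16.840.1.101.3.4.2.1");
    }

    // Return the dotted-decimal OID for a JCA algorithm name,
    // or throw if the algorithm is unknown.
    public static String getAlgorithmOID(String algorithm) {
        String oid = OIDS.get(algorithm);
        if (oid == null)
            throw new IllegalArgumentException("Unknown algorithm: " + algorithm);
        return oid;
    }
}
```

The real method would wrap the string in a DERObjectIdentifier before handing it to AlgorithmIdentifier.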

After that, let's define the identification of the signer:

SignerIdentifier sid = null;
String[] names = cert.getIssuerDN().getName().split(",");

ASN1EncodableVector vec = new ASN1EncodableVector();
for (String name : names)
{
    ASN1EncodableVector nameVec = new ASN1EncodableVector();
    String[] vals = name.split("=");
    nameVec.add((DEREncodable) X509Name.DefaultLookUp.get(vals[0].trim().toLowerCase()));
    nameVec.add(new DERPrintableString(vals[1].trim()));
    vec.add(new DERSequence(nameVec));
}

ASN1Set attr = new DERSet(vec);
X509Name issuer = new X509Name(new DERSequence(attr));
sid = new SignerIdentifier(new IssuerAndSerialNumber(issuer, cert.getSerialNumber()));
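The DN-splitting loop above can be exercised on its own; here is a stdlib-only sketch of the same split (the DN value and class name are made up, and no Bouncy Castle types are involved):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DnSplit {
    // Split an RFC-2253-style DN such as "CN=Alice, O=Example, C=PT"
    // into attribute/value pairs, the same way the signing code does:
    // split on commas, then on '=', trimming and lower-casing the keys.
    static Map<String, String> split(String dn) {
        Map<String, String> rdns = new LinkedHashMap<String, String>();
        for (String name : dn.split(",")) {
            String[] vals = name.split("=");
            rdns.put(vals[0].trim().toLowerCase(), vals[1].trim());
        }
        return rdns;
    }

    public static void main(String[] args) {
        System.out.println(split("CN=Alice, O=Example, C=PT"));
    }
}
```

This naive split breaks on DNs containing escaped commas or '=' inside values, which is worth keeping in mind with real certificates.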


Now the encapsulated content info:

ContentInfo encapContent = new ContentInfo(ObjectIdentifiers.id_eContentType_signedTSL, tslDER);

The next step is building the byte array to be signed:


MessageDigest md = MessageDigest.getInstance(digestAlgorithm);
md.reset();
md.update(tslDER.getDERObject().getEncoded());

ASN1EncodableVector attrs = new ASN1EncodableVector();

ASN1EncodableVector contentTypeAttr = new ASN1EncodableVector();
contentTypeAttr.add(ObjectIdentifiers.id_contentType);
contentTypeAttr.add(ObjectIdentifiers.id_eContentType_signedTSL);
attrs.add(new DERSequence(contentTypeAttr));

ASN1EncodableVector messageDigestAttr = new ASN1EncodableVector();
messageDigestAttr.add(ObjectIdentifiers.id_messageDigest);
messageDigestAttr.add(new DEROctetString(md.digest()));
attrs.add(new DERSequence(messageDigestAttr));

ASN1Set signedAttr = new DERSet(attrs);

md.reset();
md.update(tslDER.getDERObject().getDEREncoded());
md.update(signedAttr.getDEREncoded());
byte[] data = md.digest();


And to finalize the signature:


Signature sig = Signature.getInstance(sigAlgorithm);
sig.initSign(privateKey);
sig.update(data);
byte[] signed = sig.sign();

SignerInfo signerInfo = new SignerInfo(sid, digestAlg, signedAttr, sigAlg, new DEROctetString(signed), null);
ASN1Set signerInfos = new DERSet(signerInfo);

ASN1Set certs = new DERSet(new DEROctetString(cert.getEncoded()));

SignedData signedData = new SignedData(digestAlgs, encapContent, certs, null, signerInfos);
ContentInfo content = new ContentInfo(CMSObjectIdentifiers.signedData, signedData);
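The signing step itself needs only the standard java.security API. Here is a self-contained sketch of the digest-then-sign round trip, with a throwaway RSA key standing in for privateKey (as in the code above, the already-digested bytes are what gets passed to the Signature object):

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.Signature;

public class SignDemo {
    // Digest some content, sign the digest, then verify the signature.
    static boolean signAndVerify() throws Exception {
        // Throwaway key pair standing in for the signer's real privateKey.
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        KeyPair kp = gen.generateKeyPair();

        // Digest the content, as the TSL code does before signing.
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] data = md.digest("some DER-encoded content".getBytes("UTF-8"));

        // Sign the digest bytes.
        Signature sig = Signature.getInstance("SHA256withRSA");
        sig.initSign(kp.getPrivate());
        sig.update(data);
        byte[] signed = sig.sign();

        // Verify with the matching public key.
        Signature ver = Signature.getInstance("SHA256withRSA");
        ver.initVerify(kp.getPublic());
        ver.update(data);
        return ver.verify(signed);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(signAndVerify());
    }
}
```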

DER using Java and Bouncy Castle

In the last few months I have been working with the Distinguished Encoding Rules (DER) of ASN.1, an encoding standard (ITU-T X.690) used by X.509 and other cryptographic standards. I needed to code this in Java, for which I chose to use the Bouncy Castle ASN.1 library.

DER has some primitives that let you define the various structures and types needed for your endeavors. Here are some:
  • Sequence
  • Choice
  • IA5String
  • PrintableString
  • Integer
  • Object Identifier
  • GeneralizedTime
  • Boolean
  • OctetString
Any of these types can be marked as OPTIONAL and can carry a context-specific tag, written in the form [n].

As an example I shall use the Trust-service Status List ETSI TS 102 231 V3.1.2 standard specification.

A sequence:

LangPointer ::= SEQUENCE {
  languageTag LanguageTag,
  uRI         NonEmptyURI
  } 


Encoding:

ASN1EncodableVector vec = new ASN1EncodableVector();
vec.add(LanguageTag);
vec.add(NonEmptyURI);
DERSequence seq = new DERSequence(vec);


Decoding:


DEREncodable obj = ...;
DERSequence tslscheme = (DERSequence) obj;
Enumeration e = tslscheme.getObjects();
LanguageTag lt = (LanguageTag) e.nextElement();


A choice:

TSLpolicy ::= CHOICE {
 pointer [0] MultiLangPointer,
 text [1] MultiLangString
}


Encoding (using a sequence):


if (multiLangPointer != null)
    vec.add(new DERTaggedObject(0, multiLangPointer));
else if (multiLangString != null)
    vec.add(new DERTaggedObject(1, multiLangString));


Decoding:


while(e.hasMoreElements())
{
    DERTaggedObject tagged = (DERTaggedObject) e.nextElement();
    switch (tagged.getTagNo())
    {
        case 0: multiLangPointer = tagged.getObject(); break;
        case 1: multiLangString = tagged.getObject(); break;
    }
}


The primitive types are all encoded and decoded in a similar way; the only thing that changes is the type name, so I will give just one example:

NonEmptyURI ::= IA5String (SIZE (1..MAX))

Encoding:

DERIA5String neuri = new DERIA5String(String);

Decoding (in a sequence):

DERIA5String neuri = (DERIA5String) e.nextElement();
String str = neuri.getString();


A little bit trickier is converting a DERGeneralizedTime to a GregorianCalendar, but here's how you do it:

DERGeneralizedTime issueTime = (DERGeneralizedTime) e.nextElement();
SimpleDateFormat f = new SimpleDateFormat("yyyyMMddHHmmss");
GregorianCalendar greg = new GregorianCalendar();
greg.setTime(f.parse(issueTime.getTimeString()));


The date format can be changed as you see fit, to do what you need.
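One detail worth watching is the trailing 'Z' that getTimeString() may return for UTC times; it is not covered by the format pattern, so a small stdlib-only helper (class name made up) can strip it and set the time zone explicitly:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

public class GeneralizedTimeParse {
    // Parse a GeneralizedTime string such as "20101106143000Z" into a Date.
    // The trailing 'Z' marks UTC and is not part of the pattern, so it is
    // stripped and the formatter's time zone set to UTC instead.
    static Date parse(String time) throws ParseException {
        SimpleDateFormat f = new SimpleDateFormat("yyyyMMddHHmmss");
        if (time.endsWith("Z")) {
            f.setTimeZone(TimeZone.getTimeZone("UTC"));
            time = time.substring(0, time.length() - 1);
        }
        return f.parse(time);
    }

    public static void main(String[] args) throws ParseException {
        GregorianCalendar greg = new GregorianCalendar();
        greg.setTime(parse("20101106143000Z"));
        System.out.println(greg.getTime());
    }
}
```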

The last thing is the OPTIONAL fields, which are very much like a CHOICE, except that more than one can be present at the same time. Even so, encoding and decoding are done in the same way, using DERTaggedObjects.

To write and read from DER encoded files, you can use ASN1Streams, initialized like this:

ASN1InputStream ain = new ASN1InputStream(new BufferedInputStream(new FileInputStream(file)));

First post

As the blog's title suggests, I will be writing about the things that I find interesting. May they come to my awareness through work, fun or chance; either way, I hope this will be as interesting to read as it surely will be to write.

I leave you now with two quotes that I relate to, distributed systems and cryptography being my main areas of interest.

We reject kings, presidents and voting. We believe in rough consensus and running code.

IETF Credo. David Clark (MIT).


Encryption...is a powerful defensive weapon for free people. It offers a technical guarantee of privacy, regardless of who is running the government... It's hard to think of a more powerful, less dangerous tool for liberty.

Esther Dyson, EFF.


This being said, I'll try to write something of substance tomorrow.