2009-06-09

Moose's MOP: Schematize

Last time, we met Moose::Meta::Attribute. This meta-level class tracks information about Moose attributes. As Moose users know, attributes can have a lot of information associated with them, such as accessors, type constraint, predicate, delegation, default value, and more. Today we will explore such information to generate a simplistic SQLite schema for a Moose class.

Since Moose adds the concept of attributes to Perl, plain classes do not contain enough useful information to generate a schema. We have nothing to inspect since all of the attribute metadata is locked up in opaque constructors and accessors. If you are playing along at home, you will need to use a real Moose class. You have plenty handy, right?

As always, please follow along on GitHub and try it for yourself! Experimentation is a great way to learn a system.

Prelude

use strict;
use warnings;
use Moose ();
use DBI;

All Perl code should, of course, enforce strictures and warnings. We need Moose to inspect classes. Finally, we need DBI, Perl's database abstraction layer, to safely quote strings.

my $class = shift or die "usage: $0 Class::Name\n";
Class::MOP::load_class($class);

my $meta = Moose::Meta::Class->initialize($class);

The user passes the name of the class she wants a schema for as a command-line argument. We load it up, then get the class metaobject. The class metaobject is the gateway into meta-land. From the class metaobject we can access all of the attribute metaobjects, as we will do next.

print header($meta);

print join ",\n",
    "id INTEGER PRIMARY KEY AUTOINCREMENT",
    map { attribute($_) } $meta->get_all_attributes;

print footer();

The header function begins the CREATE TABLE statement. footer ends it.

The middle bit is where it starts to get interesting. We hardcode an integer id primary key column for the table. This is not great, but it serves us for the toy example. If you are interested in something high-tech, see Sam Vilain's comment on the previous article, pointing to his design of marrying Perl 6's metaobject protocol to a database.

After the id column, we loop over all of the class's attributes and serialize them with the attribute function. get_all_attributes includes superclass attributes (if you only want local attributes, get_attribute_list returns their names, which you can pass to get_attribute). I am not going to design object-oriented tables; if I wanted an object-oriented database I would use KiokuDB!
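To make the difference between inherited and local attributes concrete, here is a minimal sketch. The Animal and Dog classes are invented for this illustration and are not part of the schema generator.

```perl
use strict;
use warnings;

# Two throwaway classes, invented for this illustration.
package Animal;
use Moose;
has 'name' => (is => 'ro', isa => 'Str');

package Dog;
use Moose;
extends 'Animal';
has 'breed' => (is => 'ro', isa => 'Str');

package main;

my $meta = Dog->meta;

# Inherited and local attributes, returned as attribute metaobjects.
my @all = sort map { $_->name } $meta->get_all_attributes;

# Only the attributes declared directly on Dog, returned as names.
my @local = sort $meta->get_attribute_list;

print "all:   @all\n";
print "local: @local\n";
```

Since the schema generator flattens everything into one table per class, the inherited name attribute would become a column of the dog table as well.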

Create Table

sub class_to_table {
    my $class = shift;
    my $table = lc $class;
    $table =~ s/::/_/g;
    return $table;
}

class_to_table is a helper function for transforming class name Foo::Bar to table name foo_bar. Though whether table names should be singular or plural is apparently a hotly-debated topic, we're going to sidestep the issue by just using the class name.
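A quick way to see what the helper does is to run it over a few names. The class names below are just samples; the function body is the one shown above.

```perl
use strict;
use warnings;

# Same helper as in the article.
sub class_to_table {
    my $class = shift;
    my $table = lc $class;
    $table =~ s/::/_/g;
    return $table;
}

# A few sample class names, chosen for illustration only.
for my $class (qw(Foo::Bar Business::CardInfo DateTime)) {
    printf "%-20s => %s\n", $class, class_to_table($class);
}
```

Note that only the :: separators become underscores. CamelCase words inside a name component are lowercased but not split, so Business::CardInfo becomes business_cardinfo rather than business_card_info.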

sub header {
    my $meta = shift;

    my $name = class_to_table($meta->name);

    return "CREATE TABLE $name (\n";
}

The header function simply starts the CREATE TABLE statement. Recall that $meta->name is the name of the class that $meta represents. We could have just passed in the class name to this function, but we may want to declare table constraints in the future, such as compound primary keys.

sub footer { "\n);\n" }

footer closes the CREATE TABLE that header opened. Now on to the good stuff: attributes → columns.

Columns

sub attribute {
    my $attribute = shift;
    my @constraints;

    push @constraints, type_of($attribute);
    push @constraints, 'NOT NULL' if $attribute->is_required;
    push @constraints, default_of($attribute);
    push @constraints, foreign_key_of($attribute);

    return join ' ', $attribute->name, @constraints;
}

The attribute function takes an attribute metaobject and returns a string suitable for creating a column in the schema. We generate a list of constraints (type, requiredness, default, foreign key) from the values in the attribute. These correspond to the Moose attribute options that we can usefully translate to a relational database.

If the attribute requires a value in the Perl OO side ($attribute->is_required), then it should require a value in the SQLite relational side. We can enforce this by declaring the column to be NOT NULL.

Once we have all the constraints, we return them along with the column name as a plain string.
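If you want to poke at the attribute metaobject methods that attribute and its helpers rely on, something like the following sketch works. The Point class and the describe function are hypothetical, made up for this illustration.

```perl
use strict;
use warnings;

# A throwaway class, invented for this illustration.
package Point;
use Moose;
has 'x'     => (is => 'ro', isa => 'Int', required => 1);
has 'label' => (is => 'ro', isa => 'Str', default  => 'origin');

package main;

# Summarize the attribute properties that the schema generator reads.
sub describe {
    my $attribute = shift;
    return sprintf '%s: required=%s default=%s',
        $attribute->name,
        $attribute->is_required ? 'yes' : 'no',
        $attribute->has_default ? $attribute->default : '(none)';
}

# get_all_attributes makes no ordering promise, so sort for stable output.
print describe($_), "\n"
    for sort { $a->name cmp $b->name } Point->meta->get_all_attributes;
```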

sub type_of {
    my $attribute = shift;

    return if !$attribute->has_type_constraint;
    my $tc = $attribute->type_constraint;

    my @sqlite_types = (
        [Int => 'INTEGER'],
        [Num => 'REAL'],
        [Str => 'TEXT'],
    );

    for (@sqlite_types) {
        my ($moose_type, $sqlite_type) = @$_;
        return $sqlite_type
            if $tc->is_a_type_of($moose_type);
    }

    return;
}

type_of maps a Moose type constraint to a SQLite data type. If there is no type constraint, then we do not need to specify one in the schema.

SQLite has a dearth of data types, so there are only three mappings. If we had swapped the order of the Int and Num checks, then Moose Ints would map to REALs in the database. Moose's types are hierarchical, and Int is a type of Num. We need to check most-specific type constraints first.
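The ordering issue is easy to check by asking the type constraints directly. This is a small sketch, not part of the generator; sqlite_type_for is a hypothetical helper that looks a type up by name instead of taking it from an attribute.

```perl
use strict;
use warnings;
use Moose::Util::TypeConstraints ();

# Same ordered table as in type_of: most specific type first.
my @sqlite_types = (
    [Int => 'INTEGER'],
    [Num => 'REAL'],
    [Str => 'TEXT'],
);

# Hypothetical helper: map a Moose type name to a SQLite type, or undef.
sub sqlite_type_for {
    my $name = shift;
    my $tc = Moose::Util::TypeConstraints::find_type_constraint($name)
        or return undef;

    for (@sqlite_types) {
        my ($moose_type, $sqlite_type) = @$_;
        return $sqlite_type if $tc->is_a_type_of($moose_type);
    }

    return undef;
}

for my $name (qw(Int Num Str ArrayRef)) {
    my $type = sqlite_type_for($name);
    printf "%-8s => %s\n", $name, defined $type ? $type : '(no column type)';
}

# Int is also a subtype of Str in Moose's built-in hierarchy,
# so checking Str first would turn every Int column into TEXT.
my $int = Moose::Util::TypeConstraints::find_type_constraint('Int');
print "Int is a Num\n" if $int->is_a_type_of('Num');
print "Int is a Str\n" if $int->is_a_type_of('Str');
```

In Moose's built-in hierarchy Num is itself a subtype of Str, so the Str check has to come last for the same reason the Num check has to follow Int.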

In a few weeks, we will see how we can get more power out of the type constraint system by extending it to provide arbitrary check constraints.

sub default_of {
    my $attribute = shift;

    return unless $attribute->has_default
               && !$attribute->is_default_a_coderef;

    my $default = $attribute->default;

    if ($default =~ /\A[0-9]+\z/) {  # \z, unlike $, rejects a trailing newline
        return ('DEFAULT', $default);
    }

    return ('DEFAULT', DBD::_::db->quote($default));
}

default_of returns a suitable default value for the table column.

Default values in Moose are interesting because there are two very different kinds. There are plain defaults, which are numbers and strings. Those are interesting to us.

There are also subroutine (coderef) defaults. Moose calls the provided subroutine to generate a default value for each instance of a class. Since each instance of the class can get a different default, there is no sane default for the database. Subroutines in Perl are (effectively) opaque. One of the great benefits of Moose is that it adds transparency to classes by making them declarative. As I lamented in the beginning of this post, Perl's default OO is almost completely opaque, which harshly limits its second-order utility. Second-order utility happens to be the theme of this series of articles!

Anyway, if the default value is an integer, we pass it through to the CREATE TABLE without escaping it. This could be relaxed a little, perhaps to allow a decimal point. Or scientific notation. Down that path lies Scalar::Util's looks_like_number. I doubt Perl's notion of numerality exactly matches SQLite's, so I remain conservative. SQLite does support what it calls manifest typing, so there is little need to get the distinction exactly right. Just like Perl.
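To get a feel for how far apart the two notions of "number" are, here is a small comparison. It is only a sketch: is_plain_integer is a hypothetical name for an integer-only check in the spirit of the one in default_of, and the sample values are made up.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: the conservative, digits-only test.
sub is_plain_integer {
    my $value = shift;
    return $value =~ /\A[0-9]+\z/ ? 1 : 0;
}

# Sample defaults, chosen for illustration only.
for my $default ('42', '3.14', '1e3', '-7', '0x10', 'Inf', 'UK') {
    printf "%-5s integer-only: %-3s looks_like_number: %s\n",
        $default,
        is_plain_integer($default)  ? 'yes' : 'no',
        looks_like_number($default) ? 'yes' : 'no';
}
```

looks_like_number accepts decimals, exponents, and negative numbers, which would be a reasonable relaxation. It also accepts Inf, which is the kind of value I would rather see quoted than passed through bare into a CREATE TABLE statement.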

The DBD::_::db->quote method quotes string default values for the database. I am not so concerned about SQL injection in this code as I am about syntax errors. I apologize for the expression's vulgarity, but Perl decided to elide the builtin mysql_real_escape_string. Usually when interacting with DBI, you have a database handle for calling quote. Or better yet, placeholders. I did not want to have to set up a database just to generate a correct schema, but when you are writing an application you will probably have a handle handy.
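The syntax errors I am worried about come from single quotes inside the default value. The sketch below shows the standard SQL rule of doubling embedded single quotes. quote_sql_string is a hypothetical stand-in written in plain Perl for illustration; in an application, prefer the quote method on your real database handle, or placeholders.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $dbh->quote: wrap the value in single quotes
# and double any embedded single quote, as SQL string literals require.
sub quote_sql_string {
    my $value = shift;
    return 'NULL' unless defined $value;
    (my $escaped = $value) =~ s/'/''/g;
    return "'$escaped'";
}

# Sample defaults, chosen for illustration only.
for my $default ('UK', "O'Reilly") {
    print 'DEFAULT ', quote_sql_string($default), "\n";
}
```

Without the doubling, the second default would end the string literal early and leave a dangling Reilly' in the middle of the CREATE TABLE statement.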

sub foreign_key_of {
    my $attribute = shift;

    return if !$attribute->has_type_constraint;
    my $tc = $attribute->type_constraint;

    return if !$tc->isa('Moose::Meta::TypeConstraint::Class');
    my $table = class_to_table($tc->class);

    return ('REFERENCES', $table, '(id)');
}

foreign_key_of is used to generate foreign key constraints when an attribute contains an object. Though foreign_key_of inspects the attribute's type constraint, it is a different kind of constraint in the database, so it exists here as a separate function.

Moose::Meta::TypeConstraint::Class is a type constraint that accepts objects of a particular class. This type constraint metaobject has a class accessor for determining what the class is.
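Here is a minimal sketch of how the two kinds of type constraint look from the outside. The Author and Book classes are invented for this illustration.

```perl
use strict;
use warnings;

# Two throwaway classes, invented for this illustration.
package Author;
use Moose;
has 'name' => (is => 'ro', isa => 'Str');

package Book;
use Moose;
has 'title'  => (is => 'ro', isa => 'Str');
has 'author' => (is => 'ro', isa => 'Author');

package main;

for my $name (qw(title author)) {
    my $tc = Book->meta->get_attribute($name)->type_constraint;

    if ($tc->isa('Moose::Meta::TypeConstraint::Class')) {
        # A class type constraint knows which class it accepts.
        print "$name holds an object of class ", $tc->class, "\n";
    }
    else {
        print "$name has the plain type ", $tc->name, "\n";
    }
}
```

For the author attribute, foreign_key_of would emit a REFERENCES clause pointing at the author table, while title gets no foreign key at all.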

Obviously we are optimistic in assuming that any delegate object's class will exist in the database. If you were building up a MOP-aware ORM, you might inspect the delegate to see if it does a particular role, one that requests the delegate be serialized as a string in the database. DateTime is an obvious choice here, serializing to a timestamp. Another option (beyond us for a few weeks, but sit tight!) would be to tag the attribute, indicating that the attribute should not exist as a column in the database.

Example

As an example, I'll generate a schema of Wallace Reis's Business::CardInfo, version 0.02.

package Business::CardInfo;
use Moose;
use Moose::Util::TypeConstraints;

subtype 'CardNumber'
  => as 'Int'
   => where { validate($_) };

coerce 'CardNumber'
  => from 'Str'
   => via {
    my $cc = shift;
    $cc =~ s/\s//g;
    return $cc;
   };

no Moose::Util::TypeConstraints;

has 'country' => (
  isa => 'Str',
  is  => 'rw',
  default => 'UK'
  );

has 'number' => (
  isa => 'CardNumber',
  is  => 'rw',
  required => 1,
  coerce => 1,
  trigger => sub { shift->clear_type }
);

has 'type' => (
  isa => 'Str',
  is  => 'rw',
  lazy_build => 1,
);

sub _build_type { ... }

sub _search { ... }

sub validate { ... }

This class generates the following schema.

CREATE TABLE business_cardinfo (
id INTEGER PRIMARY KEY AUTOINCREMENT,
number INTEGER NOT NULL,
country TEXT DEFAULT 'UK',
type TEXT
);

One note of interest is that the number attribute was a user-defined subtype of Int, but we still sussed out the correct data type.

Conclusion

Though there were several instances where I deferred features for later articles, I think we built a usable, if simplistic, system. Its output could serve as a first pass for writing a real, production schema. With any luck, it may inspire someone to write the mythical Moose-based-ORM-for-SQL-haters. Moose is already the basis for an ORM for SQL lovers.

A particularly nice feature of this system is that it generates a schema without touching the class's code. The class is just an ordinary Moose affair that uses no extensions and no special semantics.

One way to extend this system using still more vanilla Moose inspection would be to transform ArrayRef[Class::Name] type constraints into many-to-many relationships. Some of the interesting MOP questions include:

  • What is the class of an ArrayRef[Class::Name] type constraint metaobject?
  • How do you extract Class::Name from that type constraint metaobject? You probably need to dive into Moose's code. One incorrect answer is: with a regular expression.
  • How do you load a type constraint metaobject by name?

Next time we will see that we can do much more than inspect metaobjects with a MOP!
